On 6/21/2018 1:38 PM, Junio C Hamano wrote:
Derrick Stolee <sto...@gmail.com> writes:

On 6/7/2018 2:26 PM, Duy Nguyen wrote:
On Thu, Jun 7, 2018 at 4:03 PM, Derrick Stolee <sto...@gmail.com> wrote:
@@ -74,6 +80,31 @@ struct midxed_git *load_midxed_git(const char *object_dir)
          m->num_chunks = *(m->data + 6);
          m->num_packs = get_be32(m->data + 8);

+       for (i = 0; i < m->num_chunks; i++) {
+               uint32_t chunk_id = get_be32(m->data + 12 + 
MIDX_CHUNKLOOKUP_WIDTH * i);
+               uint64_t chunk_offset = get_be64(m->data + 16 + 
MIDX_CHUNKLOOKUP_WIDTH * i);
Would be good to reduce magic numbers like 12 and 16, I think you have
some header length constants for those already.

+               switch (chunk_id) {
+                       case MIDX_CHUNKID_PACKNAMES:
+                               m->chunk_pack_names = m->data + chunk_offset;
+                               break;
(style: aren't these case arms indented one level too deep)?

+                       case 0:
+                               die("terminating MIDX chunk id appears earlier than 
expected");
_()
This die() and others like it are not marked for translation on
purpose, as they should never be seen by an end-user.
Should never be seen because it indicates a software bug, in which
case this should be BUG() instead of die()?

Or did we just find a file corruption on the filesystem?  If so,
then the error is end-user facing and should tell the user something
that hints what is going on in the language the user understands, I
would guess.

+                       default:
+                               /*
+                                * Do nothing on unrecognized chunks, allowing 
future
+                                * extensions to add optional chunks.
+                                */
I wrote about the chunk term reminding me of PNG format then deleted
it. But it may help to do similar to PNG here. The first letter can
let us know if the chunk is optional and can be safely ignored. E.g.
uppercase first letter cannot be ignored, lowercase go wild.
That's an interesting way to think about it. That way you could add a
new "required" chunk and earlier versions could die() realizing they
don't know how to parse that required chunk.
That is how the index extension sections work and may be a good
example to follow.

The index extension documentation doesn't appear to be clear about which extensions are optional or required, but it seems the split-index is the only "required" one and uses lowercase for its extension id.

Since the multi-pack-index has similar structure to the commit-graph file, and that file includes an optional chunk with no special casing of the chunk id, I think we should stick with the existing model: chunks that are added later are optional and if Git _must_ understand it, then we increment the version number. Hence, for each version number there is a fixed list of required chunks, but an extendible list of optional chunks.

Thanks,
-Stolee

Reply via email to