On Tue, Apr 29, 2014 at 4:18 AM, Shawn Pearce <spea...@spearce.org> wrote:
> On Mon, Apr 28, 2014 at 3:55 AM, Nguyễn Thái Ngọc Duy <pclo...@gmail.com> wrote:
>> I hinted about it earlier. It now passes the test suite and has a
>> design that I'm happy with (thanks to Junio for a suggestion about the
>> rename problem).
>> From the user's point of view, this shrinks the part of the index that
>> must be rewritten to roughly the number of updated files. For example,
>> my webkit index v4 is 14MB. With a fresh split, I only have to update
>> an index of 200KB. Every file I touch adds about 80 bytes to that. As
>> long as I don't touch every single tracked file in my worktree, I
>> should not pay the penalty of writing a 14MB index file on every
>> operation.
> This is a very welcome type of improvement.
> I am however concerned about the complexity of the format employed.
> Why do we need two EWAH bitmaps in the new index? Why isn't this just
> a pair of sorted files that are merge-joined at read, with records in
> $GIT_DIR/index taking priority over same-named records in
> $GIT_DIR/sharedindex.$SHA1? Deletes could be marked with a bit or an
> "all zero" metadata record.
With the bitmaps, I know the exact position at which to replace or
delete an entry. A merge join works, but I would need to walk through
all entries in both indexes, comparing entry name and stage, which is a
bit costly in my opinion. Also, if you look at the format description in
patch 0017, I store the replaced entries without their names to save a
bit more space. "EWAH" is just an implementation detail; a
straightforward bitmap should work fine (25KB for 200k entries seems
reasonable).
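
To make the positional idea concrete, here is a rough sketch in Python rather than the C in the patches. It is not the on-disk format from patch 0017; Entry, apply_split and bitmap_bytes are hypothetical names, the bitmaps are plain lists of 0/1 instead of EWAH, entries are compared by name only (stage is ignored), and the delete and replace bits are assumed never to be set for the same position.

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    name: Optional[str]  # None for a replacement stored without its name
    data: str            # stand-in for stat info and object id


def bitmap_bytes(n_entries: int) -> int:
    """Size of a plain, uncompressed bitmap with one bit per entry."""
    return (n_entries + 7) // 8


def apply_split(shared: List[Entry],
                delete_bits: List[int],
                replace_bits: List[int],
                replacements: List[Entry],
                additions: List[Entry]) -> List[Entry]:
    """Overlay a small split index on top of a large shared index.

    Bit i of each bitmap refers to shared[i], so no name comparison is
    needed to find the entry to replace or delete.
    """
    assert len(delete_bits) == len(replace_bits) == len(shared)
    repl = iter(replacements)  # the n-th set replace bit takes the n-th item
    kept = []
    for pos, entry in enumerate(shared):
        if delete_bits[pos]:
            continue  # entry was removed since the split
        if replace_bits[pos]:
            # The replacement carries no name; inherit it from the
            # shared entry at the same position.
            entry = Entry(entry.name, next(repl).data)
        kept.append(entry)
    # Only brand-new entries need a name-based merge into the result.
    return list(heapq.merge(kept, additions, key=lambda e: e.name))


shared = [Entry("a.c", "v1"), Entry("b.c", "v1"), Entry("d.c", "v1")]
merged = apply_split(
    shared,
    delete_bits=[0, 1, 0],            # b.c was deleted
    replace_bits=[1, 0, 0],           # a.c was modified
    replacements=[Entry(None, "v2")],
    additions=[Entry("c.c", "v1")],   # c.c is new
)
print([e.name for e in merged])       # ['a.c', 'c.c', 'd.c']
print(bitmap_bytes(200_000))          # 25000
```

The last line is where the 25KB figure comes from: 200,000 entries at one bit each is 25,000 bytes per uncompressed bitmap.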