On Thu, Jul 11, 2013 at 7:26 PM, Thomas Gummerer <t.gumme...@gmail.com> wrote:
> Duy Nguyen <pclo...@gmail.com> writes:
>> On Thu, Jul 11, 2013 at 6:39 PM, Thomas Gummerer <t.gumme...@gmail.com> wrote:
>>>> Question about the possibility of updating index file directly. If git
>>>> updates a few fields of an entry (but not entrycrc yet) and crashes,
>>>> the entry would become corrupt because its entrycrc does not match the
>>>> content. What do we do? Do we need to save a copy of the entry
>>>> somewhere in the index file (maybe in the conflict data section), so
>>>> that the reader can recover the index? Losing the index because of
>>>> bugs is a big deal in my opinion. Pre-v5 never faces this because we
>>>> keep the original copy until the end.
>>>> Maybe entrycrc should not cover stat fields and statcrc. It would make
>>>> refreshing safer. If the above happens during refresh, only statcrc is
>>>> corrupt and we can just refresh the entry. entrycrc still says the
>>>> other fields are good (and they are).
>>> The original idea was to change the lock-file for partial writing to
>>> make it work for this case. The exact structure of the file still has
>>> to be defined, but generally it would be done in the following steps:
>>> 1. Write the changed entry to the lock-file
>>> 2. Change the entry in the index
>>> 3. If we succeed delete the lock-file (commit the transaction)
>>> If git crashes, and leaves the index corrupted, we can recover the
>>> information from the lock-file and write the new information to the
>>> index file and then delete the lock-file.
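The three steps quoted above amount to a small write-ahead journal. Below is a minimal Python sketch of that idea, not the actual index-v5 code: the fixed 16-byte entry layout, the lock-file record format, and the names pack_entry, entry_ok, update_entry and recover are all assumptions made for illustration, since the exact structure of the lock-file is still undefined.

```python
import os
import struct
import zlib

ENTRY_SIZE = 16  # hypothetical layout: 12 payload bytes + 4-byte CRC32


def pack_entry(payload: bytes) -> bytes:
    """Append a CRC32 of the payload (stand-in for entrycrc)."""
    assert len(payload) == ENTRY_SIZE - 4
    return payload + struct.pack(">I", zlib.crc32(payload))


def entry_ok(raw: bytes) -> bool:
    """True if the stored CRC matches the payload."""
    (stored,) = struct.unpack(">I", raw[-4:])
    return zlib.crc32(raw[:-4]) == stored


def update_entry(index_path, lock_path, pos, payload, crash_midway=False):
    record = struct.pack(">I", pos) + pack_entry(payload)
    # Step 1: write the changed entry to the lock-file and force it to disk.
    with open(lock_path, "wb") as lock:
        lock.write(record)
        lock.flush()
        os.fsync(lock.fileno())
    # Step 2: change the entry in the index in place.
    with open(index_path, "r+b") as index:
        index.seek(pos * ENTRY_SIZE)
        if crash_midway:
            # Simulate a crash: only part of the entry reaches the index.
            index.write(record[4:7])
            return
        index.write(record[4:])
        index.flush()
        os.fsync(index.fileno())
    # Step 3: success, delete the lock-file (commit the transaction).
    os.remove(lock_path)


def recover(index_path, lock_path) -> bool:
    """Replay a leftover lock-file. Returns True if an entry was restored."""
    if not os.path.exists(lock_path):
        return False
    with open(lock_path, "rb") as lock:
        record = lock.read()
    if len(record) != 4 + ENTRY_SIZE or not entry_ok(record[4:]):
        # The crash happened during step 1, so the index was never touched.
        os.remove(lock_path)
        return False
    (pos,) = struct.unpack(">I", record[:4])
    with open(index_path, "r+b") as index:
        index.seek(pos * ENTRY_SIZE)
        index.write(record[4:])
        index.flush()
        os.fsync(index.fileno())
    os.remove(lock_path)
    return True
```

In this sketch the cost of step 1 grows with the number of changed entries, which is the concern raised about refresh.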
>> Ah makes sense. Still concerned about refreshing though. Updated files
>> are usually few while refreshed files could be a lot more, increasing
>> the cost at #1.
> Any idea how common refreshing a big part of the cache is?
No, probably not common. Anyone who does "find|xargs touch" deserves
to be punished. Files can be edited, then reverted by an editor, but
there should not be many of those. The only sensible case is "git
checkout <path>" with lots of modified files. But that can't happen
> If it's not too common, I'd prefer to leave the stat data and stat crc in the
> entrycrc, as we can inform the user if something is wrong with the
> index, be it from git failing, or from disk corruption.
> On the other hand if refresh_cache is relatively common and usually
> changes a big part of the index we should leave them out, as git can
> still run correctly with incorrect stat data, but takes a little longer,
> because it may have to check the file contents. That will be the
> trade-off to make here.
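To make the second option concrete, here is a minimal Python sketch of an entry whose entrycrc does not cover the stat data, so that a torn stat write only invalidates statcrc. The dict layout and the names crc, make_entry and classify are hypothetical and exist only for illustration; only the field names entrycrc and statcrc come from the discussion.

```python
import struct
import zlib


def crc(data: bytes) -> bytes:
    """CRC32 as 4 big-endian bytes."""
    return struct.pack(">I", zlib.crc32(data))


def make_entry(stat: bytes, other: bytes) -> dict:
    # Assumed layout: statcrc covers only the stat data,
    # entrycrc covers only the remaining fields.
    return {
        "stat": stat,
        "statcrc": crc(stat),
        "other": other,
        "entrycrc": crc(other),
    }


def classify(entry: dict) -> str:
    """Decide what a reader could do with a possibly damaged entry."""
    if crc(entry["other"]) != entry["entrycrc"]:
        return "corrupt"        # name/sha1/flags cannot be trusted
    if crc(entry["stat"]) != entry["statcrc"]:
        return "needs-refresh"  # re-stat the file, other fields are good
    return "ok"


# A refresh that crashed after writing stat data but before statcrc:
entry = make_entry(struct.pack(">II", 100, 5), b"sha1+flags")
entry["stat"] = struct.pack(">II", 200, 5)
print(classify(entry))
```

With the first option (stat data and statcrc inside entrycrc), the same crash would have to be reported as a corrupt entry, which is what allows git to warn the user about disk corruption as well.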