Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-31 Thread Junio C Hamano
Junio C Hamano  writes:

> Yes, but you need to realize that "it is better not to bother users
> with a report of failure to touch in read-only repository" and "we
> ignore all failures".

Sorry about an unfinished sentence here.  "need to realize that
... and ... are different things."

> ... It is very similar
> to a situation where you ... run "status".
> The command first runs the equivalent of "update-index --refresh"
> only in-core, and it attempts to write the updated index because
> (1) it paid the cost of doing the refreshing already, and (2) if it
> can write into the index, it will help future operations in the
> repository.  But it does not report a failure to write the index
> exactly because it is merely doing an opportunistic write.
>
> And in the "we read from the split index, and we attempt to touch
> the underlying shared one to update its timestamp" case, it is OK
> not to report if we failed to touch.
> ...
> ... On the
> other hand, if you added new information, i.e. wrote the split index
> based on it, it is a good indication that the  index
> pair has information that is more valuable.  We must warn in that
> case.

This reminds us of a third case.  What should happen if we are doing
the "opportunistic writing of the index" in "git status", managed to
write the split index, but failed to touch the shared one?

In the ideal world, I think we should do the following sequence:

 - "status" tries to write cache to the file.

 - we try to write and on any error, we return error to the caller,
   who is already prepared to ignore it and stay silent.

- as the first step of writing the index, we first try to touch
  the shared one.  If it fails, we return an error here without
  writing the split one out.

- then we try to write the split one out.  If this fails, we
  also return an error.

- otherwise, both touching of the shared one and writing of the
  split one are successful.  

 - "status" finishes the opportunistic refreshing of the index, by
   either ignoring an error silently (if either touching of shared
   one or writing of split one fails) or writing the refreshed index
   out successfully.

It is OK to swap the order of touching the shared one and writing of
the split one in the above sequence, as long as an error in either
step signals a failure to the opportunistic caller.

I do not offhand know if the split-index code is structured in such
a way to allow the above sequence easily, or it needs refactoring.

If such a restructuring is required, it might not be within the
scope of the series and I am OK if you just left the NEEDSWORK
comment that describes the above (i.e. what we should be doing) as
an in-code comment so that we can pick it up later.  The whole point
of the step 14/21 on the other hand is to make sure that a shared
index that is still in active use will not go stale, and from that
point of view, such a "punting" may not be a good idea---it
deliberately finishes the series knowing that it does not adequately
do what it promises to do.  

So, ... I dunno.



Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-31 Thread Junio C Hamano
Christian Couder  writes:

>> You are listing only the irrelevant cases.  The shared one may be
>> used immediately, and the user can keep using it for a while without
>> "touching".
>
> Now you are talking about a case where the shared index file can be
> used immediately and the user can keep using it.
> This is a different case than the case I was talking about (which is
> the case when the shared index file does not exist anymore).

Yes, as I said, you are listing irrelevant and uninteresting one.
If the shared index is already gone, reporting failure to touch it
is of no use---an attempt to read and use the split index that
depends on the shared index that is gone will fail anyway.

> In a previous email you wrote that if the file system is read only, it
> is a bad thing if we warn.

Yes, but you need to realize that "it is better not to bother users
with a report of failure to touch in read-only repository" and "we
ignore all failures".

IIUC, you attempt to touch the shared index even when you are
only reading the index, because you want to mark the fact that the
shared index is still being depended upon.  And I tend to agree that
it is OK not to report a failure for that case.  It is very similar
to a situation where you are asked to peek into your coworker's
repository, which you do not have write access to, and run "status".
The command first runs the equivalent of "update-index --refresh"
only in-core, and it attempts to write the updated index because
(1) it paid the cost of doing the refreshing already, and (2) if it
can write into the index, it will help future operations in the
repository.  But it does not report a failure to write the index
exactly because it is merely doing an opportunistic write.

And in the "we read from the split index, and we attempt to touch
the underlying shared one to update its timestamp" case, it is OK
not to report if we failed to touch.

But you also attempt to touch the shared index when you are actually
writing out a new split index file based on it, no?  The "you
created a ticking bomb" situation is where you fail to touch the
shared index for whatever reason, even when you managed to write the
new split index file.

We agreed that read-only operation should not nag, so it won't
trigger when you are peeking somebody else's repository to help him.
As I said, it is an uninteresting and irrelevant case---when your
read-only peeking did not add new information to be preserved, it is
less severe problem that you fail to reset the expiration.  On the
other hand, if you added new information, i.e. wrote the split index
based on it, it is a good indication that the  index
pair has information that is more valuable.  We must warn in that
case.

> Now if you want to talk about the case when the shared index file has
> not been removed, but just chowned to "root", then I wonder how is it
> different from the case when the file system is read only.

The difference is that your code has enough information to notice
the case where you know your touch failed, you know that you wanted
to write (and the write succeeded), and yet you do NOT know why your
touch failed.  That is the ticking bomb we need to warn the user
about.



Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-31 Thread Christian Couder
On Wed, Jan 25, 2017 at 9:52 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> Well, when we cannot freshen a loose file (with
>> freshen_loose_object()), we don't warn or die, we just write the loose
>> file. But here we cannot write the shared index file.
>
> I think that is an excellent point.  Let me make sure I got you
> right.  For loose object files, we may attempt to freshen and when
> it fails we stay silent and do not talk about the failure.  Instead,
> we write the same file again.  That will have two potential outcomes:
>
>  1. write fails and we fail the whole thing.  It is very clear to
> the user that there is something wrong going on.
>
>  2. write succeeds, and because we just wrote it, we know that the
> file is fresh and is protected from gc.
>
> So the "freshen and if fails just write" is sufficient.  It is
> absolutely the right thing to do for loose object files.
>
> When we are forking off a new split index file based on an old
> shared index file, we may attempt to "touch" the old shared one, but
> we cannot write the old shared one again, because other split index
> may be based on that, and we do not have enough information to
> recreate the old one [*1*].  The fallback safety is not available.

Yeah I agree. But note that I was talking about the case where the
shared index file does not exist anymore.
I was just saying that when the shared index file has been removed
behind us, it is equivalent as when
loose files have been removed behind us, except that we cannot write
the removed file to disk.

>> And this doesn't lead to a catastrophic failure right away.
>
> Exactly.
>
>> There
>> could be a catastrophic failure if the shared index file is needed
>> again later, but we are not sure that it's going to be needed later.
>> In fact it may have just been removed because it won't be needed
>> later.
>
> You are listing only the irrelevant cases.  The shared one may be
> used immediately, and the user can keep using it for a while without
> "touching".

Now you are talking about a case where the shared index file can be
used immediately and the user can keep using it.
This is a different case than the case I was talking about (which is
the case when the shared index file does not exist anymore).

> Perhaps the shared one was chowned to "root" while the
> user is looking the other way, and its timestamp is now frozen to
> the time of chown.  It is a ticking time-bomb that will go off when
> your expiry logic kicks in.

In the case I was talking about, the shared index file doesn't exist
anymore. So there is no time ticking bomb, because what could happen,
if we cannot freshen the shared index file, is that the shared index
file could be removed, which wouldn't make things worse as anyway it
does not exist anymore.

So the case when the shared index file doesn't exist is different than
other cases.

Now if you want to talk about the case when the shared index file has
not been removed, but just chowned to "root", then I wonder how is it
different from the case when the file system is read only.
Perhaps the admin just chowned everything to make sure that the repo
could not be changed anymore by the usual processes and users.

>> So I am not sure it's a good idea to anticipate a catastrophic failure
>> that may not happen. Perhaps we could just warn, but I am not sure it
>> will help the user. If the catastrophic failure doesn't happen because
>> the shared index file is not needed, I can't see how the warning could
>> help. And if there is a catastrophic failure, the following will be
>> displayed:
>>
>> fatal: .git/sharedindex.cbfe41352a4623d4d3e9757861fed53f3094e0ac:
>> index file open failed: No such file or directory
>
> That "fatal" is given _ONLY_ after time passes and our failure to
> "touch" the file that is still-in-use left it subject to "gc".  Of
> course, when "fatal" is given, it is too late to warn about ticking
> time bomb.
>
> At the time we notice a ticking time bomb is the only sensible time
> to warn.  Or better yet take a corrective action.

In a previous email you wrote that if the file system is read only, it
is a bad thing if we warn.
So I wonder why it would be a good thing to warn or try to do
something if the admin chowned everything to prevent usual processes
and users to change anything.

> As I expect (and you seem to agree) that a failure to "touch" would
> be a very rare event (like, sysadmin chowning it to "root" by
> mistake),

Yeah, I agree that a failure to "touch" would be a rare event. But the
thing is we often don't know if admins or users do things by mistake
or not.
If they rm -rf everything forcefully, for example, they might not like
it, if we start to rewrite stuff to try to "correct" things.

> I do not mind if the "corrective action" were "immediately
> unsplit the index, so that at least the current and the latest index
> contents will be written out safely to a new single 

Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-25 Thread Junio C Hamano
Christian Couder  writes:

> Well, when we cannot freshen a loose file (with
> freshen_loose_object()), we don't warn or die, we just write the loose
> file. But here we cannot write the shared index file.

I think that is an excellent point.  Let me make sure I got you
right.  For loose object files, we may attempt to freshen and when
it fails we stay silent and do not talk about the failure.  Instead,
we write the same file again.  That will have two potential outcomes:

 1. write fails and we fail the whole thing.  It is very clear to
the user that there is something wrong going on.

 2. write succeeds, and because we just wrote it, we know that the
file is fresh and is protected from gc.

So the "freshen and if fails just write" is sufficient.  It is
absolutely the right thing to do for loose object files.

When we are forking off a new split index file based on an old
shared index file, we may attempt to "touch" the old shared one, but
we cannot write the old shared one again, because other split index
may be based on that, and we do not have enough information to
recreate the old one [*1*].  The fallback safety is not available.

> And this doesn't lead to a catastrophic failure right away. 

Exactly.

> There
> could be a catastrophic failure if the shared index file is needed
> again later, but we are not sure that it's going to be needed later.
> In fact it may have just been removed because it won't be needed
> later.

You are listing only the irrelevant cases.  The shared one may be
used immediately, and the user can keep using it for a while without
"touching".  Perhaps the shared one was chowned to "root" while the
user is looking the other way, and its timestamp is now frozen to
the time of chown.  It is a ticking time-bomb that will go off when
your expiry logic kicks in.

> So I am not sure it's a good idea to anticipate a catastrophic failure
> that may not happen. Perhaps we could just warn, but I am not sure it
> will help the user. If the catastrophic failure doesn't happen because
> the shared index file is not needed, I can't see how the warning could
> help. And if there is a catastrophic failure, the following will be
> displayed:
>
> fatal: .git/sharedindex.cbfe41352a4623d4d3e9757861fed53f3094e0ac:
> index file open failed: No such file or directory

That "fatal" is given _ONLY_ after time passes and our failure to
"touch" the file that is still-in-use left it subject to "gc".  Of
course, when "fatal" is given, it is too late to warn about ticking
time bomb.

At the time we notice a ticking time bomb is the only sensible time
to warn.  Or better yet take a corrective action.

As I expect (and you seem to agree) that a failure to "touch" would
be a very rare event (like, sysadmin chowning it to "root" by
mistake), I do not mind if the "corrective action" were "immediately
unsplit the index, so that at least the current and the latest index
contents will be written out safely to a new single unsplit index
file".  That won't help _other_ split index files that are based on
the same "untouchable" shared index, but I do not think that is a
problem we need to solve---if they are still in use, the code that
use them will notice it, and otherwise they are not in use and can
be aged away safely.


[Footnote]

*1* My understanding is that we lose information on stale entries in
the shared file that are covered by the split index overlay
after read_index() returns, so we _might_ be able to write the
"old" one that is sufficient for _our_ split index, but we do
not have good enough information to recreate "old" one usable by
other split index files.


Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-25 Thread Christian Couder
On Mon, Jan 23, 2017 at 7:53 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> Also in general the split-index mode is useful when you often write
>> new indexes, and in this case shared index files that are used will
>> often be freshened, so the risk of deleting interesting shared index
>> files should be low.
>> ...
>>> Not that I think freshening would actually fail in a repository
>>> where you can actually write into to update the index or its refs to
>>> make a difference (iow, even if we make it die() loudly when shared
>>> index cannot be "touched" because we are paranoid, no real life
>>> usage will trigger that die(), and if a repository does trigger the
>>> die(), I think you would really want to know about it).
>>
>> As I wrote above, I think if we can actually write the shared index
>> file but its freshening fails, it probably means that the shared index
>> file has been removed behind us, and this case is equivalent as when
>> loose files have been removed behind us.
>
> OK, so it is unlikely to happen, and when it happens it leads to a
> catastrophic failure---do we just ignore or do we report an error?

Well, when we cannot freshen a loose file (with
freshen_loose_object()), we don't warn or die, we just write the loose
file. But here we cannot write the shared index file.

And this doesn't lead to a catastrophic failure right away. There
could be a catastrophic failure if the shared index file is needed
again later, but we are not sure that it's going to be needed later.
In fact it may have just been removed because it won't be needed
later.

So I am not sure it's a good idea to anticipate a catastrophic failure
that may not happen. Perhaps we could just warn, but I am not sure it
will help the user. If the catastrophic failure doesn't happen because
the shared index file is not needed, I can't see how the warning could
help. And if there is a catastrophic failure, the following will be
displayed:

fatal: .git/sharedindex.cbfe41352a4623d4d3e9757861fed53f3094e0ac:
index file open failed: No such file or directory

and I don't see how the warning could help on top of that. It could at
most repeat the same thing.


Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-23 Thread Junio C Hamano
Christian Couder  writes:

> Also in general the split-index mode is useful when you often write
> new indexes, and in this case shared index files that are used will
> often be freshened, so the risk of deleting interesting shared index
> files should be low.
> ...
>> Not that I think freshening would actually fail in a repository
>> where you can actually write into to update the index or its refs to
>> make a difference (iow, even if we make it die() loudly when shared
>> index cannot be "touched" because we are paranoid, no real life
>> usage will trigger that die(), and if a repository does trigger the
>> die(), I think you would really want to know about it).
>
> As I wrote above, I think if we can actually write the shared index
> file but its freshening fails, it probably means that the shared index
> file has been removed behind us, and this case is equivalent as when
> loose files have been removed behind us.

OK, so it is unlikely to happen, and when it happens it leads to a
catastrophic failure---do we just ignore or do we report an error?


Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-23 Thread Christian Couder
On Thu, Jan 19, 2017 at 8:00 PM, Junio C Hamano  wrote:
> Duy Nguyen  writes:
>
>> On Mon, Jan 9, 2017 at 9:34 PM, Junio C Hamano  wrote:
>>> Duy Nguyen  writes:
>>>
 On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> So what should we do if freshen_file() returns 0 which means that the
>> freshening failed?
>
> You tell me ;-)  as you are the one who is proposing this feature.

Before the above question I already had given my opinion about what we
should do.

There are the following cases:

- Freshening failed because the shared index file does not exist
anymore. In this case it could have been removed for a good reason
(for example maybe the user wants to remove all the shared index
files), or it could be a bug somewhere else. Anyway we cannot know and
the user will get an error if the shared index file that disappeared
is read from afterwards, so I don't think we need to warn or do
anything.

- Freshening failed because the mtime of the shared index cannot be
changed. You already in a previous email wrote that we shoudn't warn
if the file system is read only, and I agree with that, as anyway if
the file system is read only, the shared index file cannot be deleted,
so there is no risk from the current user. Also the split index mode
is useful to speed up index writing at the cost of making index
reading a little slower, so its use in a read only mode should not be
the primary way it is used.

So my opinion is that there are good reasons not to do anything if
freshening fails.

 My answer is, we are not worse than freshening loose objects case
 (especially since I took the idea from there).
>>>
>>> I do not think so, unfortunately.  Loose object files with stale
>>> timestamps are not removed as long as objects are still reachable.

As the current index is read, which freshens its shared index file,
before a new index is created, the number of shared index files
doesn't go below 2. This can be seen in the tests changed in patch
19/21. So the risk of deleting interesting shared index files is quite
low in my opinion.

Also in general the split-index mode is useful when you often write
new indexes, and in this case shared index files that are used will
often be freshened, so the risk of deleting interesting shared index
files should be low.

>> But there are plenty of unreachable loose objects, added in index,
>> then got replaced with new versions. cache-tree can create loose trees
>> too and it's been run more often, behind user's back, to take
>> advantage of the shortcut in unpack-trees.
>
> I am not sure if I follow.  Aren't objects reachable from the
> cache-tree in the index protected from gc?
>
> Not that I think freshening would actually fail in a repository
> where you can actually write into to update the index or its refs to
> make a difference (iow, even if we make it die() loudly when shared
> index cannot be "touched" because we are paranoid, no real life
> usage will trigger that die(), and if a repository does trigger the
> die(), I think you would really want to know about it).

As I wrote above, I think if we can actually write the shared index
file but its freshening fails, it probably means that the shared index
file has been removed behind us, and this case is equivalent as when
loose files have been removed behind us.


Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-20 Thread Duy Nguyen
On Fri, Jan 20, 2017 at 2:00 AM, Junio C Hamano  wrote:
> Duy Nguyen  writes:
>
>> On Mon, Jan 9, 2017 at 9:34 PM, Junio C Hamano  wrote:
>>> Duy Nguyen  writes:
>>>
 On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> So what should we do if freshen_file() returns 0 which means that the
>> freshening failed?
>
> You tell me ;-)  as you are the one who is proposing this feature.

 My answer is, we are not worse than freshening loose objects case
 (especially since I took the idea from there).
>>>
>>> I do not think so, unfortunately.  Loose object files with stale
>>> timestamps are not removed as long as objects are still reachable.
>>
>> But there are plenty of unreachable loose objects, added in index,
>> then got replaced with new versions. cache-tree can create loose trees
>> too and it's been run more often, behind user's back, to take
>> advantage of the shortcut in unpack-trees.
>
> I am not sure if I follow.  Aren't objects reachable from the
> cache-tree in the index protected from gc?

I think the problem is loose objects created then gc run just before
they are referenced (e.g. index written down). But I think I may be
following a wrong road. If mtime is in fact to deal with race
conditions, applying the same idea here is wrong because we have a
different problem here.

> Not that I think freshening would actually fail in a repository
> where you can actually write into to update the index or its refs to
> make a difference (iow, even if we make it die() loudly when shared
> index cannot be "touched" because we are paranoid, no real life
> usage will trigger that die(), and if a repository does trigger the
> die(), I think you would really want to know about it).
-- 
Duy


Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-19 Thread Junio C Hamano
Duy Nguyen  writes:

> On Mon, Jan 9, 2017 at 9:34 PM, Junio C Hamano  wrote:
>> Duy Nguyen  writes:
>>
>>> On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano  wrote:
 Christian Couder  writes:

> So what should we do if freshen_file() returns 0 which means that the
> freshening failed?

 You tell me ;-)  as you are the one who is proposing this feature.
>>>
>>> My answer is, we are not worse than freshening loose objects case
>>> (especially since I took the idea from there).
>>
>> I do not think so, unfortunately.  Loose object files with stale
>> timestamps are not removed as long as objects are still reachable.
>
> But there are plenty of unreachable loose objects, added in index,
> then got replaced with new versions. cache-tree can create loose trees
> too and it's been run more often, behind user's back, to take
> advantage of the shortcut in unpack-trees.

I am not sure if I follow.  Aren't objects reachable from the
cache-tree in the index protected from gc?

Not that I think freshening would actually fail in a repository
where you can actually write into to update the index or its refs to
make a difference (iow, even if we make it die() loudly when shared
index cannot be "touched" because we are paranoid, no real life
usage will trigger that die(), and if a repository does trigger the
die(), I think you would really want to know about it).


Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-19 Thread Duy Nguyen
On Mon, Jan 9, 2017 at 9:34 PM, Junio C Hamano  wrote:
> Duy Nguyen  writes:
>
>> On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano  wrote:
>>> Christian Couder  writes:
>>>
 So what should we do if freshen_file() returns 0 which means that the
 freshening failed?
>>>
>>> You tell me ;-)  as you are the one who is proposing this feature.
>>
>> My answer is, we are not worse than freshening loose objects case
>> (especially since I took the idea from there).
>
> I do not think so, unfortunately.  Loose object files with stale
> timestamps are not removed as long as objects are still reachable.

But there are plenty of unreachable loose objects, added in index,
then got replaced with new versions. cache-tree can create loose trees
too and it's been run more often, behind user's back, to take
advantage of the shortcut in unpack-trees.

> For the base/shared index file, the timestamp is the only thing that
> protects them from pruning, unless it is serving as the base file
> for the currently active $GIT_DIR/index that is split.
-- 
Duy


Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-09 Thread Junio C Hamano
Duy Nguyen  writes:

> On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano  wrote:
>> Christian Couder  writes:
>>
>>> So what should we do if freshen_file() returns 0 which means that the
>>> freshening failed?
>>
>> You tell me ;-)  as you are the one who is proposing this feature.
>
> My answer is, we are not worse than freshening loose objects case
> (especially since I took the idea from there). 

I do not think so, unfortunately.  Loose object files with stale
timestamps are not removed as long as objects are still reachable.
For the base/shared index file, the timestamp is the only thing that
protects them from pruning, unless it is serving as the base file
for the currently active $GIT_DIR/index that is split.

>> What is the failure mode after such a premature GC happens?  What
>> does the end-user see?  Can you try to (1) split the index (2)
>> modify bunch of entries (3) remove the base/shared index with /bin/rm
>> and then see how various Git commands fail?  Do they fail gracefully?
>>
>> I am trying to gauge the seriousness of ignoring such an error here.
>
> If we fail to refresh it and the file is old enough and gc happens,
> any index file referenced to it are broken. Any commands that read the
> index will die(). The best you could do is delete $GIT_DIR/index and
> read-tree HEAD.


Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-09 Thread Duy Nguyen
On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> So what should we do if freshen_file() returns 0 which means that the
>> freshening failed?
>
> You tell me ;-)  as you are the one who is proposing this feature.

My answer is, we are not worse than freshening loose objects case
(especially since I took the idea from there). In both cases we
silently ignore utime() error (sha1_file.c does retry, but will do
nothing else if the retry fails). And errors in odb are much more
serious than index files. If we are to improve it, I think we should
do it inside check_and_freshen_file(), maybe with an optional flag to
silence it.

> Isn't a failure to freshen it a grave error?  We are letting a
> base/shared index file that is known to be in-use go stale and
> eventually subject for garbage collection, and the user should be
> notified in some way before the actual GC happens that renders the
> index file unusable?
>
> What is the failure mode after such a premature GC happens?  What
> does the end-user see?  Can you try to (1) split the index (2)
> modify bunch of entries (3) remove the base/shared index with /bin/rm
> and then see how various Git commands fail?  Do they fail gracefully?
>
> I am trying to gauge the seriousness of ignoring such an error here.

If we fail to refresh it and the file is old enough and gc happens,
any index file referenced to it are broken. Any commands that read the
index will die(). The best you could do is delete $GIT_DIR/index and
read-tree HEAD.
-- 
Duy


Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-07 Thread Junio C Hamano
Christian Couder  writes:

> So what should we do if freshen_file() returns 0 which means that the
> freshening failed?

You tell me ;-)  as you are the one who is proposing this feature.

Isn't a failure to freshen it a grave error?  We are letting a
base/shared index file that is known to be in-use go stale and
eventually subject for garbage collection, and the user should be
notified in some way before the actual GC happens that renders the
index file unusable?

What is the failure mode after such a premature GC happens?  What
does the end-user see?  Can you try to (1) split the index (2)
modify bunch of entries (3) remove the base/shared index with /bin/rm
and then see how various Git commands fail?  Do they fail gracefully?

I am trying to gauge the seriousness of ignoring such an error here.


Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2017-01-02 Thread Christian Couder
On Tue, Dec 27, 2016 at 8:10 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> +/*
>> + * Signal that the shared index is used by updating its mtime.
>> + *
>> + * This way, shared index can be removed if they have not been used
>> + * for some time. It's ok to fail to update the mtime if we are on a
>> + * read only file system.
>> + */
>> +void freshen_shared_index(char *base_sha1_hex)
>> +{
>> + const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
>> + check_and_freshen_file(shared_index, 1);
>
> What happens when this call fails?  The function returns 0 if the
> file did not even exist.  It also returns 0 if you cannot update its
> timestamp.

Yeah and I don't think it's a problem in either case.
If we cannot update its timestamp, it's not a problem, as we could be
on a read-only file system, and you said in a previous iteration that
we should not even warn in this case.
If the file does not exist, it could be because it has just been
deleted for a good reason, and anyway, if it is a problem that the
shared index file has been deleted, it is better addressed when we
will actually need the shared index file to read something into from
it, rather than just to update its mtime.

> Shouldn't the series be exposing freshen_file() instead _and_ taking
> its error return value seriously?

So what should we do if freshen_file() returns 0 which means that the
freshening failed?

>> +}


Re: [PATCH v3 14/21] read-cache: touch shared index files when used

2016-12-27 Thread Junio C Hamano
Christian Couder  writes:

> +/*
> + * Signal that the shared index is used by updating its mtime.
> + *
> + * This way, shared index can be removed if they have not been used
> + * for some time. It's ok to fail to update the mtime if we are on a
> + * read only file system.
> + */
> +void freshen_shared_index(char *base_sha1_hex)
> +{
> + const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
> + check_and_freshen_file(shared_index, 1);

What happens when this call fails?  The function returns 0 if the
file did not even exist.  It also returns 0 if you cannot update its
timestamp.

Shouldn't the series be exposing freshen_file() instead _and_ taking
its error return value seriously?

> +}
> +
>  int read_index_from(struct index_state *istate, const char *path)
>  {
>   struct split_index *split_index;
> @@ -2273,6 +2286,8 @@ int write_locked_index(struct index_state *istate, 
> struct lock_file *lock,
>   int ret = write_shared_index(istate, lock, flags);
>   if (ret)
>   return ret;
> + } else {
> + freshen_shared_index(sha1_to_hex(si->base_sha1));
>   }
>  
>   return write_split_index(istate, lock, flags);


[PATCH v3 14/21] read-cache: touch shared index files when used

2016-12-26 Thread Christian Couder
When a split-index file is created, let's update the mtime of the
shared index file that the split-index file is referencing.

In a following commit we will make shared index file expire
depending on their mtime, so updating the mtime makes sure that
the shared index file will not be deleted soon.

Signed-off-by: Christian Couder 
---
 read-cache.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index a1aaec5135..9fbad2044b 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1682,6 +1682,19 @@ int do_read_index(struct index_state *istate, const char 
*path, int must_exist)
die("index file corrupt");
 }
 
+/*
+ * Signal that the shared index is used by updating its mtime.
+ *
+ * This way, shared index can be removed if they have not been used
+ * for some time. It's ok to fail to update the mtime if we are on a
+ * read only file system.
+ */
+void freshen_shared_index(char *base_sha1_hex)
+{
+   const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
+   check_and_freshen_file(shared_index, 1);
+}
+
 int read_index_from(struct index_state *istate, const char *path)
 {
struct split_index *split_index;
@@ -2273,6 +2286,8 @@ int write_locked_index(struct index_state *istate, struct 
lock_file *lock,
int ret = write_shared_index(istate, lock, flags);
if (ret)
return ret;
+   } else {
+   freshen_shared_index(sha1_to_hex(si->base_sha1));
}
 
return write_split_index(istate, lock, flags);
-- 
2.11.0.209.gda91e66374.dirty