[PATCH] open_sha1_file: report most interesting errno

2014-05-15 Thread Jeff King
When we try to open a loose object file, we first attempt to
open in the local object database, and then try any
alternates. This means that the errno value when we return
will be from the last place we looked (and due to the way
the code is structured, simply ENOENT if we do not have
any alternates).

This can cause confusing error messages, as read_sha1_file
checks for ENOENT when reporting a missing object. If errno
is something else, we report that. If it is ENOENT, but
has_loose_object reports that we have it, then we claim the
object is corrupted. For example:

$ chmod 0 .git/objects/??/*
$ git rev-list --all
fatal: loose object b2d6fab18b92d49eac46dc3c5a0bcafabda20131 (stored in 
.git/objects/b2/d6fab18b92d49eac46dc3c5a0bcafabda20131) is corrupt

This patch instead keeps track of the most interesting
errno we receive during our search. We consider ENOENT to be
the least interesting of all, and otherwise report the first
error found (so problems in the object database take
precedence over ones in alternates). Here it is with this
patch:

$ git rev-list --all
fatal: failed to read object b2d6fab18b92d49eac46dc3c5a0bcafabda20131: 
Permission denied

Signed-off-by: Jeff King <p...@peff.net>
---
 sha1_file.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index 3e9f55f..34d527f 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1437,19 +1437,23 @@ static int open_sha1_file(const unsigned char *sha1)
 {
 	int fd;
 	struct alternate_object_database *alt;
+	int most_interesting_errno;
 
 	fd = git_open_noatime(sha1_file_name(sha1));
 	if (fd >= 0)
 		return fd;
+	most_interesting_errno = errno;
 
 	prepare_alt_odb();
-	errno = ENOENT;
 	for (alt = alt_odb_list; alt; alt = alt->next) {
 		fill_sha1_path(alt->name, sha1);
 		fd = git_open_noatime(alt->base);
 		if (fd >= 0)
 			return fd;
+		if (most_interesting_errno == ENOENT)
+			most_interesting_errno = errno;
 	}
+	errno = most_interesting_errno;
 	return -1;
 }
 
-- 
2.0.0.rc1.436.g03cb729
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] open_sha1_file: report most interesting errno

2014-05-15 Thread Junio C Hamano
Jeff King <p...@peff.net> writes:

> When we try to open a loose object file, we first attempt to
> open in the local object database, and then try any
> alternates. This means that the errno value when we return
> will be from the last place we looked (and due to the way
> the code is structured, simply ENOENT if we do not have
> any alternates).
>
> This can cause confusing error messages, as read_sha1_file
> checks for ENOENT when reporting a missing object. If errno
> is something else, we report that. If it is ENOENT, but
> has_loose_object reports that we have it, then we claim the
> object is corrupted. For example:
>
> $ chmod 0 .git/objects/??/*
> $ git rev-list --all
> fatal: loose object b2d6fab18b92d49eac46dc3c5a0bcafabda20131 (stored in
> .git/objects/b2/d6fab18b92d49eac46dc3c5a0bcafabda20131) is corrupt

H.  So we keep track of a more interesting errno we get from
some other place than what we get for this local loose object, and
we no longer give this message pointing at the local loose
object---is that the idea?

What I am wondering is whether this report we give in the new code

> $ git rev-list --all
> fatal: failed to read object b2d6fab18b92d49eac46dc3c5a0bcafabda20131:
> Permission denied

should say in which of the various possible places we saw this
"most interesting" errno. I think that was the original motivation
behind e8b15e61 (sha1_file: Show the the type and path to corrupt
objects, 2010-06-10), which added "(stored in ...)".

But that may involve a larger surgery, and I definitely do not want
to add unnecessary logic in the common-case codepath to keep track
of pieces of information that are only used in the error codepath,
so it smells like this is the best fix to the issue the commit
message describes.

Thanks.




Re: [PATCH] open_sha1_file: report most interesting errno

2014-05-15 Thread Jeff King
On Thu, May 15, 2014 at 10:02:06AM -0700, Junio C Hamano wrote:

> > $ chmod 0 .git/objects/??/*
> > $ git rev-list --all
> > fatal: loose object b2d6fab18b92d49eac46dc3c5a0bcafabda20131 (stored in
> > .git/objects/b2/d6fab18b92d49eac46dc3c5a0bcafabda20131) is corrupt
>
> H.  So we keep track of a more interesting errno we get from
> some other place than what we get for this local loose object, and
> we no longer give this message pointing at the local loose
> object---is that the idea?

Yes, though my main goal was to stop saying "corrupt" when that is not
the problem at all. Not pointing to the wrong object was a secondary
consideration. :)

I would also be happy to just show the error for the local object, even
if it exists somewhere else.  The main thing I am changing here is
that we currently _never_ show the errno from the main odb. We either
show the errno from the last alternate we looked at, or we show ENOENT
(because we explicitly set ENOENT right before looking at the
alternates).
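
To make the selection rule concrete, here is a minimal standalone sketch
in C. It is not git code: open_first_available() and the flat array of
paths are assumptions made for illustration, standing in for the main
odb followed by its alternates.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical helper (not git's API): try each candidate path in order
 * and return the first descriptor that opens. On failure, return -1 with
 * errno set to the first error that is not ENOENT, or to ENOENT if every
 * candidate was simply missing.
 */
static int open_first_available(const char *const *paths, size_t nr)
{
	int most_interesting_errno = ENOENT;
	size_t i;

	assert(paths || !nr);
	for (i = 0; i < nr; i++) {
		int fd = open(paths[i], O_RDONLY);
		if (fd >= 0)
			return fd;
		/* Only replace the "boring" ENOENT; keep the first real error. */
		if (most_interesting_errno == ENOENT)
			most_interesting_errno = errno;
	}
	errno = most_interesting_errno;
	return -1;
}
```

Because the earliest location comes first in the array, an error there
wins over one in a later location, and an empty list still reports
ENOENT.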

I think it's a separate problem that the "stored in..." is sometimes
wrong. That comes when we get ENOENT, and we check has_loose_object().
IOW, we guess "we couldn't find it, but we claim to have it, so it must
have been corrupt". But that does not say _where_ we found it, and our
call to sha1_file_name is a guess that may be wrong.

I'm actually not sure if we can even trigger that code path now. It
depended on returning ENOENT from read_object, which we used to
frequently do erroneously. Now we will only do it when the object truly
does not exist, which means has_loose_object should generally not return
true.

I'm also a bit surprised that errno actually survives here. That clearly
was the intent, so I don't think my patch is making anything worse. But
it's possible that we would prepare_packed_git or open/mmap packfiles
between the call to open_sha1_file and when read_sha1_file_extended
looks at errno.
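
The usual defence against that kind of clobbering is to copy errno into
a local right after the failing call. A rough sketch, where
open_or_describe() and unrelated_work() are made-up names and not
anything in sha1_file.c:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for work that happens between the failed open
 * and the error report. It fails on purpose so that errno is overwritten.
 */
static void unrelated_work(void)
{
	close(-1);	/* fails with EBADF */
}

/*
 * Hypothetical helper: try to open `path`. On failure, write a message
 * to `msg` that describes the failure of *this* open, no matter what
 * later calls do to errno. Returns the descriptor, or -1 on error.
 */
static int open_or_describe(const char *path, char *msg, size_t len)
{
	int fd, saved_errno;

	assert(msg && len);
	fd = open(path, O_RDONLY);
	if (fd >= 0)
		return fd;
	saved_errno = errno;	/* capture before anything else runs */

	unrelated_work();
	snprintf(msg, len, "failed to read %s: %s", path,
		 strerror(saved_errno));
	return -1;
}
```

Passing the saved value along explicitly, rather than relying on the
global, keeps the report tied to the call that actually failed.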

> What I am wondering is whether this report we give in the new code
>
> > $ git rev-list --all
> > fatal: failed to read object b2d6fab18b92d49eac46dc3c5a0bcafabda20131:
> > Permission denied
>
> should say in which of the various possible places we saw this
> "most interesting" errno. I think that was the original motivation
> behind e8b15e61 (sha1_file: Show the the type and path to corrupt
> objects, 2010-06-10), which added "(stored in ...)".
>
> But that may involve a larger surgery, and I definitely do not want
> to add unnecessary logic in the common-case codepath to keep track
> of pieces of information that are only used in the error codepath,
> so it smells like this is the best fix to the issue the commit
> message describes.

Yes, I think doing this right would involve a lot more surgery, and I
don't know if it is worth the effort. But in addition to the problems
above, I note that we simply open the first object we can find, and do
not loop if mmap or checksums fail. So unlike packed objects, which are
resilient to corruption, we would fail immediately.

So I think the right thing to do would be:

  1. Don't loop across alternates in open_sha1_file. Loop in read_object
 (which means looping in _other_ calls to map_sha1_file, like in
 sha1_object_info).

  2. Fail quickly, since the common case is that we will find the object
 elsewhere. But when we do have an error, take time to go back and
 actually find the location of the object and the real error (i.e.,
 have a diagnose_object or something).
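
The second point could look roughly like the sketch below: a fast path
that keeps no bookkeeping, plus a slow path that is only run after a
failure. Everything here is hypothetical (struct lookup_failure,
open_any(), diagnose_lookup(), and the flat path array) and only
illustrates the split, not how git would have to structure it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Where a lookup went wrong; filled in only on the error path. */
struct lookup_failure {
	const char *path;	/* first location with a real error, or NULL */
	int err;		/* errno seen there, or ENOENT if all were missing */
};

/* Fast path: no bookkeeping, just the first descriptor that opens. */
static int open_any(const char *const *paths, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		int fd = open(paths[i], O_RDONLY);
		if (fd >= 0)
			return fd;
	}
	return -1;
}

/* Slow path: walk the locations again to find out what actually failed. */
static void diagnose_lookup(const char *const *paths, size_t nr,
			    struct lookup_failure *out)
{
	size_t i;

	assert(out);
	out->path = NULL;
	out->err = ENOENT;
	for (i = 0; i < nr; i++) {
		int fd = open(paths[i], O_RDONLY);
		if (fd >= 0) {
			close(fd);	/* appeared since the fast path ran */
			continue;
		}
		if (errno != ENOENT) {
			out->path = paths[i];
			out->err = errno;
			return;
		}
	}
}
```

A caller would run open_any() first and call diagnose_lookup() only
when it returns -1, so the common case pays nothing for the extra
detail.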

Neither is a particularly high priority to me, though, so I don't plan
on working on them anytime soon. The only reason I went this far was
that I saw the "loose object is corrupt" / EPERM confusion in the real
world.

-Peff