Re: [PUB]corrupt repos does not return error with `git fsck`
Faheem Mitha fah...@faheem.info writes:

> I was going by the answer (by CodeWizard) in
> http://stackoverflow.com/q/30348615/350713

OK, so the hash you got comes from a superproject which references it. My guess is that someone working in the superproject made a private commit in a submodule, recorded that commit in the superproject, and forgot to push the submodule.

If so, it's a user error (that could arguably have been avoided with a better command-line interface, so Git is partly guilty), but not a repository corruption.

> If I just give a random hash to `git show` in that repos, I get
>
>     fatal: ambiguous argument '...': unknown revision or path not in the working tree.

Not a random hash, but a random abbreviated hash. Look:

Changing the last digit:

$ git show 280c12ab49223c64c6f914944287a7d049cf4d23
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4d23
$ git show 280c12ab49223c64c6f914944287a7d049cf4d24
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4d24
$ git show 280c12ab49223c64c6f914944287a7d049cf4d25
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4d25
$ git show 280c12ab49223c64c6f914944287a7d049cf4d26
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4d26

Removing the last digit:

$ git show 280c12ab49223c64c6f914944287a7d049cf4d2
fatal: ambiguous argument '280c12ab49223c64c6f914944287a7d049cf4d2': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

--
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
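The situation guessed at above can be reproduced in a scratch repository. What follows is only a sketch under stated assumptions: the 40-hex name and the path "sub" are made up for illustration, and the gitlink is written straight into the index instead of going through `git submodule`. It shows that a superproject can record a submodule commit name that is not an object in the superproject itself.

```shell
# Scratch repository so nothing real is touched.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

# Made-up commit name standing in for an unpushed submodule commit.
sub=1234abcdef1234abcdef1234abcdef1234abcdef

# Record it as a gitlink (mode 160000) at the hypothetical path "sub",
# which is how a superproject records a submodule commit.
git update-index --add --cacheinfo 160000 "$sub" sub
tree=$(git write-tree)

# The superproject's tree names the hash ...
entry=$(git ls-tree "$tree" sub)
echo "$entry"

# ... but the superproject's object store does not contain it.
if git cat-file -e "$sub" 2>/dev/null; then
    present=yes
else
    present=no
fi
echo "object present in superproject: $present"
```

Since the commit belongs to the submodule's history, its absence from the superproject is expected and is not corruption, which fits the observation that `git fsck` reports no error.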
Re: [PUB]corrupt repos does not return error with `git fsck`
Faheem Mitha fah...@faheem.info writes:

> Hi,
>
> Clone the repos https://github.com/fmitha/SICL.
>
> Then
>
>     git show 280c12ab49223c64c6f914944287a7d049cf4dd0
>
> gives
>
>     fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0

It seems 280c12ab49223c64c6f914944287a7d049cf4dd0 is not an object in your repository. The good news is: I don't think you have a corrupt repository.

What makes you think you have an object with identifier 280c12ab49223c64c6f914944287a7d049cf4dd0?

--
Matthieu Moy
http://www-verimag.imag.fr/~moy/
Re: [PUB]corrupt repos does not return error with `git fsck`
$ git clone https://github.com/fmitha/SICL
$ cd SICL
$ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
$ git show 12323213123 # just to be sure to have a different error message for non-existing objects.
fatal: ambiguous argument '12323213123': unknown revision or path not in the working tree.
$ mv .git/objects/pack/pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.pack .
$ rm .git/objects/pack/pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.idx
$ git unpack-objects < pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.pack
$ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
$ ls .git/objects/28/0*
.git/objects/28/01fef08b1dccf9725dde919a7373748a046cb7
.git/objects/28/03d8c1cb3275979ff2d8408450844f6a78a70d
.git/objects/28/0663a93d702a7fcb0dd36f461397f6b50ba01e
.git/objects/28/068e2656dd4bac61050e870712578032af9144
.git/objects/28/074e890d6ff2bb61eb7796bc500b6d8e344ad2
.git/objects/28/08596ac465cf8a819a9b13ad2f855e9a8a3235
.git/objects/28/098184d1ba97453227c18628cdf13087b6bce2
.git/objects/28/0ba19c68b26ee7c799ef8ca09d540a5ad7a5b2
.git/objects/28/0d66213173f0ae7aaae8684f3efcb1f8790792
.git/objects/28/0da35374c32303cbd726bef9847f18d7428d5e

There is no file 28/0c... however.
Re: [PUB]corrupt repos does not return error with `git fsck`
On Wed, May 20, 2015 at 11:02:14AM -0700, Stefan Beller wrote:

> $ git clone https://github.com/fmitha/SICL
> $ cd SICL
> $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> $ git show 12323213123 # just to be sure to have a different error message for non-existing objects.
> fatal: ambiguous argument '12323213123': unknown revision or path not in the working tree.

Yeah, this is well-known. If you give a partial hash, the error comes from get_sha1(), which says "hey, this doesn't look like anything I know about". If you feed a whole hash, we skip all that and say "well, you _definitely_ meant this sha1", and then later code complains when it cannot be read.

We could add a has_sha1_file() check in get_sha1 for this case. I can't think offhand of any reason it would need to be called with a non-existent object, but there may be some lurking corner case (e.g., "cat-file -e" or something).

-Peff
Re: [PUB]corrupt repos does not return error with `git fsck`
Jeff King p...@peff.net writes:

> I should have looked before replying. It would indeed break "cat-file -e" horribly. So the right answer may be to just improve the "bad object" message (probably by checking has_sha1_file there and diagnosing it either as "missing" or "corrupted").

I should have looked before replying, too ;-)

Yeah, "bad object" sounds as if we tried to parse something that exists and it was corrupt. So classifying "a file or a pack index entry exists where a valid object with that name should reside" as "bad object" and "there is no such file or pack index entry that would house the named object" as "missing object" _might_ make things better.

But let's think about it a bit more. Would it have prevented the original confusion if we said "missing object"? I have a feeling that it wouldn't have.

Faheem was so convinced that the object named with the 40-hex *must* exist in the cloned repository, and if we told "missing object" to such a person, it would just reinforce the (mis)conception that the repository is somehow corrupt, when it is not.

So...
Re: [PUB]corrupt repos does not return error with `git fsck`
sbel...@google.com writes:

> $ git clone https://github.com/fmitha/SICL
> $ cd SICL
> $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> $ git show 12323213123 # just to be sure to have a different error message for non-existing objects.

I did the same, but the error message differs depending on whether you provide an abbreviated sha1 or a full 40-character sha1. Any full sha1 I tried gave the same error message.

--
Matthieu Moy
http://www-verimag.imag.fr/~moy/
Re: [PUB]corrupt repos does not return error with `git fsck`
On Wed, May 20, 2015 at 1:39 PM, Junio C Hamano gits...@pobox.com wrote:

> So...

maybe we need a command:

    Given this SHA1, tell me anything you know about it. Is it a {blob,tree,commit,tag}? Is it referenced from anywhere else in this repository and if so, which type? And if it is not referenced, nor an object, tell me so explicitly.

This would have helped a lot with this confusion:

$ git frotz 280c12...
No object exists with such a substring (either as prefix, postfix or in between)
No other object is referencing any object containing this substring as pre/post-fix

and this issue would have been resolved in a heartbeat.

The verbosity in particular contradicts the terse Unix style, though, and this command is tailored to this issue, so I don't know if it would be useful outside this specific problem.
Re: [PUB]corrupt repos does not return error with `git fsck`
Jeff King p...@peff.net writes:

> We could add a has_sha1_file() check in get_sha1 for this case.

Please don't. get_sha1() is merely "I have this string, which may be a 40-hex or an extended SHA-1 expression. Turn it into a 20-byte binary" and does not require you to have any such object.
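The separation described here, where turning a name into an object ID is one step and requiring the object to exist is another, can be observed from the command line. This is a minimal sketch in a scratch repository; the 40-hex name is made up, and `git rev-parse` and `git cat-file -e` are used only as convenient front ends, not as a statement about which internal functions they call.

```shell
# Scratch repository so nothing real is touched.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

# Made-up 40-hex name; no such object exists in the fresh repository.
missing=1234abcdef1234abcdef1234abcdef1234abcdef

# Name resolution alone succeeds: the full hex string is echoed back.
resolved=$(git rev-parse "$missing")
echo "rev-parse says: $resolved"

# Asking whether the object exists is a separate question, and here it fails.
if git cat-file -e "$missing" 2>/dev/null; then
    exists=yes
else
    exists=no
fi
echo "object exists: $exists"
```

A script that needs the object to be present can run the `git cat-file -e` check itself before going further.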
Re: [PUB]corrupt repos does not return error with `git fsck`
Stefan Beller sbel...@google.com writes:

> On Wed, May 20, 2015 at 1:39 PM, Junio C Hamano gits...@pobox.com wrote:
>
> > So...
>
> maybe we need a command:
>
>     Given this SHA1, tell me anything you know about it. Is it a {blob,tree,commit,tag}? Is it referenced from anywhere else in this repository and if so, which type? And if it is not referenced, nor an object, tell me so explicitly.

Let me add another to that list ;-)

    I have this prefix; please enumerate all known objects that share it.

I do not know the value of the first two in your list. If it is a known object, then you throw it at "git show", "git cat-file -t" and dig from there. If it is not known, there is nothing more to do.

I do not know if "need" is the right word, but I hope that you realize the last two among the four you listed need the equivalent of "git fsck". It is an expensive operation.
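The cheap half of this wish list can be approximated with existing plumbing. The sketch below is not the proposed command; it assumes a Git recent enough to have `git rev-parse --disambiguate=<prefix>`, and it runs in a scratch repository holding a single blob so the result is predictable.

```shell
# Scratch repository so nothing real is touched.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

# Store one blob so there is something to find.
blob=$(echo content | git hash-object -w --stdin)
prefix=$(printf '%s' "$blob" | cut -c1-8)

# Enumerate every known object whose name starts with the prefix,
# then ask for the type of each one.
matches=$(git rev-parse --disambiguate="$prefix")
for obj in $matches; do
    echo "$obj $(git cat-file -t "$obj")"
done

# A prefix that no object shares gives an empty list.
none=$(git rev-parse --disambiguate=1234abcd)
echo "matches for 1234abcd: ${none:-none}"
```

The "who references this object" questions are not covered here; as noted above, answering them means walking the object graph much as `git fsck` does.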
Re: [PUB]corrupt repos does not return error with `git fsck`
On Wed, May 20, 2015 at 2:06 PM, Junio C Hamano gits...@pobox.com wrote:

> Stefan Beller sbel...@google.com writes:
>
> > On Wed, May 20, 2015 at 1:39 PM, Junio C Hamano gits...@pobox.com wrote:
> >
> > > So...
> >
> > maybe we need a command:
> >
> >     Given this SHA1, tell me anything you know about it. Is it a {blob,tree,commit,tag}? Is it referenced from anywhere else in this repository and if so, which type? And if it is not referenced, nor an object, tell me so explicitly.
>
> Let me add another to that list ;-)
>
>     I have this prefix; please enumerate all known objects that share it.
>
> I do not know the value of the first two in your list. If it is a known object, then you throw it at "git show", "git cat-file -t" and dig from there. If it is not known, there is nothing more to do.

Right, I just tried to think of all the questions which are relevant to answer in such a case, so probably this can be outside of

> I do not know if "need" is the right word, but I hope that you realize the last two among the four you listed need the equivalent of "git fsck". It is an expensive operation.

Yes, I do realize that. The way I interpreted Faheem's original message was: "git fsck tells me everything is alright, but I don't trust fsck. So now I want to find a way to ask Git about everything it knows about this $SHA1 and print it for me so I can manually look at each entry and verify by hand."

So that's why I included the parts easily done with cat-file and show.
Re: [PUB]corrupt repos does not return error with `git fsck`
On Wed, May 20, 2015 at 01:39:36PM -0700, Junio C Hamano wrote:

> Yeah, "bad object" sounds as if we tried to parse something that exists and it was corrupt. So classifying "a file or a pack index entry exists where a valid object with that name should reside in" as "bad object" and "there is no such file or a pack index entry that would house the named object" as "missing object" _might_ make things better.
>
> But let's think about it a bit more. Would it have prevented the original confusion if we said "missing object"? I have a feeling that it wouldn't have.
>
> Faheem was so convinced that the object named with the 40-hex *must* exist in the cloned repository, and if we told "missing object" to such a person, it will just enforce the (mis)conception that the repository is somehow corrupt, when it is not.
>
> So...

I dunno. If it were phrased not as "missing object" but as "there is no such object in the repository", then it seems more clear to me that the error is in the request, not in the repository (and hopefully the user would examine their assumption that it should be).

But "bad object" is just a horrible error message. It actively implies corruption. And I think if we do have corruption, then parse_object() already reports it.

For example:

    # helpers
    objfile() {
        # map a sha1 to its loose object path: xx/ followed by the other 38 digits
        printf '.git/objects/%s' $(echo $1 | sed 's,..,&/,')
    }
    blob=$(echo content | git hash-object -w --stdin)

    # object with a sha1 mismatch
    mismatch=1234567890123456789012345678901234567890
    mkdir .git/objects/12
    cp $(objfile $blob) $(objfile $mismatch)

    # plain old missing object
    missing=1234abcdef1234abcdef1234abcdef1234abcdef

    # object with data corruption
    corrupt=$blob
    chmod +w $(objfile $corrupt)
    dd if=/dev/zero of=$(objfile $corrupt) bs=1 count=1 conv=notrunc seek=10

    # now show each
    for bad in corrupt mismatch missing; do
        echo "== $bad"
        git --no-pager show $(eval echo \$$bad)
    done

produces:

    == corrupt
    error: inflate: data stream error (invalid distance too far back)
    error: unable to unpack d95f3ad14dee633a758d2e331151e950dd13e4ed header
    error: inflate: data stream error (invalid distance too far back)
    fatal: loose object d95f3ad14dee633a758d2e331151e950dd13e4ed (stored in .git/objects/d9/5f3ad14dee633a758d2e331151e950dd13e4ed) is corrupt
    == mismatch
    error: sha1 mismatch 1234567890123456789012345678901234567890
    fatal: bad object 1234567890123456789012345678901234567890
    == missing
    fatal: bad object 1234abcdef1234abcdef1234abcdef1234abcdef

Note that the missing case is the only one that _doesn't_ give further clarification, and it is likely to be the most common (however, just changing "bad object" to "no such object" would be a bad idea, as it makes the second case harder to understand).

-Peff
Re: [PUB]corrupt repos does not return error with `git fsck`
On Wed, 20 May 2015, Stefan Beller wrote:

> On Wed, May 20, 2015 at 11:24 AM, Faheem Mitha fah...@faheem.info wrote:
>
> > So, is the repos corrupt or not? Also, I don't understand why you say
> >
> >     There is no file 28/0c... however.
> >
> > Why would you expect there to be? I don't see it mentioned in that list.
>
> Each object is stored at .git/objects/xz/tail with xz being the first 2 characters of the sha1 and the tail the remaining 38 characters of the sha1.
>
> I did not draw a conclusion yet, as I needed to run for a meeting.
>
> So the object you're looking for is not there (stating this as a fact). But why would you expect it to be there?
>
> At the time of sending the previous email I tried to do a reverse search "Give me all objects, which reference objectX" but did not succeed yet.

Ok. See my reply to Matthieu Moy for context. I may have been taking too much for granted before posting to this list. Maybe I should have asked here first. As I wrote to him, I can reconstruct the original setup if anyone thinks it is worthwhile trying to investigate further.

Regards, Faheem Mitha
Re: [PUB]corrupt repos does not return error with `git fsck`
On Wed, 20 May 2015, Matthieu Moy wrote:

> Faheem Mitha fah...@faheem.info writes:
>
> > Hi,
> >
> > Clone the repos https://github.com/fmitha/SICL.
> >
> > Then
> >
> >     git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> >
> > gives
> >
> >     fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
>
> It seems 280c12ab49223c64c6f914944287a7d049cf4dd0 is not an object in your repository. The good news is: I don't think you have a corrupt repository.
>
> What makes you think you have an object with identifier 280c12ab49223c64c6f914944287a7d049cf4dd0?

I was going by the answer (by CodeWizard) in http://stackoverflow.com/q/30348615/350713

The question there also gives the context of this question. The repos I referenced in my post to the git mailing list just now is just a clone of https://github.com/drmeister/SICL.

If I just give a random hash to `git show` in that repos, I get

    fatal: ambiguous argument '...': unknown revision or path not in the working tree.

It seemed reasonable to assume (based on what little knowledge I had) that the 280c12ab49223c64c6f914944287a7d049cf4dd0 commit was the problem.

However, this repos is a fork of another repos, namely https://github.com/robert-strandh/SICL

That repos contains more recent commits than the fork does. If I take any of the more recent commits from that repos and try the hash with `git show`, i.e.

    git show hash

in the fork, I get the same error, which makes me think something else must be going on.

Chris (drmeister) has modified the path the submodule is obtained from, so the instructions in the SO question won't work as a reproduction recipe any more, but if you want to take a look I could clone his repos and set it up the same way it was. Let me know.

Regards, Faheem Mitha
Re: [PUB]corrupt repos does not return error with `git fsck`
Hi Stefan,

Thank you for the reply, but I don't follow what conclusion you are drawing, if any.

On Wed, 20 May 2015, Stefan Beller wrote:

> $ git clone https://github.com/fmitha/SICL
> $ cd SICL
> $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> $ git show 12323213123 # just to be sure to have a different error message for non-existing objects.
> fatal: ambiguous argument '12323213123': unknown revision or path not in the working tree.
> $ mv .git/objects/pack/pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.pack .
> $ rm .git/objects/pack/pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.idx
> $ git unpack-objects < pack-d56da8c18f5aa915d7fe230efae7315a0101dc19.pack
> $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> $ ls .git/objects/28/0*
> .git/objects/28/01fef08b1dccf9725dde919a7373748a046cb7
> .git/objects/28/03d8c1cb3275979ff2d8408450844f6a78a70d
> .git/objects/28/0663a93d702a7fcb0dd36f461397f6b50ba01e
> .git/objects/28/068e2656dd4bac61050e870712578032af9144
> .git/objects/28/074e890d6ff2bb61eb7796bc500b6d8e344ad2
> .git/objects/28/08596ac465cf8a819a9b13ad2f855e9a8a3235
> .git/objects/28/098184d1ba97453227c18628cdf13087b6bce2
> .git/objects/28/0ba19c68b26ee7c799ef8ca09d540a5ad7a5b2
> .git/objects/28/0d66213173f0ae7aaae8684f3efcb1f8790792
> .git/objects/28/0da35374c32303cbd726bef9847f18d7428d5e
>
> There is no file 28/0c... however.

So, is the repos corrupt or not? Also, I don't understand why you say

    There is no file 28/0c... however.

Why would you expect there to be? I don't see it mentioned in that list.

I apologise for my ignorance. I don't really know anything about git. I just happened to encounter this error.

Regards, Faheem Mitha
Re: [PUB]corrupt repos does not return error with `git fsck`
On Wed, May 20, 2015 at 02:22:19PM -0400, Jeff King wrote:

> On Wed, May 20, 2015 at 11:02:14AM -0700, Stefan Beller wrote:
>
> > $ git clone https://github.com/fmitha/SICL
> > $ cd SICL
> > $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> > fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> > $ git show 12323213123 # just to be sure to have a different error message for non-existing objects.
> > fatal: ambiguous argument '12323213123': unknown revision or path not in the working tree.
>
> Yeah, this is well-known. If you give a partial hash, the error comes from get_sha1(), which says "hey, this doesn't look like anything I know about". If you feed a whole hash, we skip all that and say "well, you _definitely_ meant this sha1", and then later code complains when it cannot be read.
>
> We could add a has_sha1_file() check in get_sha1 for this case. I can't think offhand of any reason it would need to be called with a non-existent object, but there may be some lurking corner case (e.g., "cat-file -e" or something).

I should have looked before replying. It would indeed break "cat-file -e" horribly. So the right answer may be to just improve the "bad object" message (probably by checking has_sha1_file there and diagnosing it either as "missing" or "corrupted").

-Peff
Re: [PUB]corrupt repos does not return error with `git fsck`
Hi,

On 2015-05-20 19:19, Matthieu Moy wrote:

> Faheem Mitha fah...@faheem.info writes:
>
> > Clone the repos https://github.com/fmitha/SICL.
> >
> > Then
> >
> >     git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> >
> > gives
> >
> >     fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
>
> It seems 280c12ab49223c64c6f914944287a7d049cf4dd0 is not an object in your repository. The good news is: I don't think you have a corrupt repository.
>
> What makes you think you have an object with identifier 280c12ab49223c64c6f914944287a7d049cf4dd0?

I had a similar problem some time ago and tracked it down to a graft that was active while pushing to the public repository. Maybe it's the same problem here?

Ciao, Johannes
Re: [PUB]corrupt repos does not return error with `git fsck`
On Wed, May 20, 2015 at 11:24 AM, Faheem Mitha fah...@faheem.info wrote:

> So, is the repos corrupt or not? Also, I don't understand why you say
>
>     There is no file 28/0c... however.
>
> Why would you expect there to be? I don't see it mentioned in that list.

Each object is stored at .git/objects/xz/tail with xz being the first 2 characters of the sha1 and the tail the remaining 38 characters of the sha1.

I did not draw a conclusion yet, as I needed to run for a meeting.

So the object you're looking for is not there (stating this as a fact). But why would you expect it to be there?

At the time of sending the previous email I tried to do a reverse search "Give me all objects, which reference objectX" but did not succeed yet.

Stefan
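The path rule described in this message can be written as a small helper. This is only a sketch: `loose_object_path` is a hypothetical name, not a git command, and the rule applies to loose objects, which is why the pack was unpacked earlier in the thread before listing the directory.

```shell
# Hypothetical helper (not part of git): map a full 40-hex object name to
# the path a loose object with that name would have.
loose_object_path() {
    sha=$1
    dir=$(printf '%s' "$sha" | cut -c1-2)    # first 2 hex digits: directory
    file=$(printf '%s' "$sha" | cut -c3-)    # remaining 38 digits: file name
    printf '.git/objects/%s/%s\n' "$dir" "$file"
}

# The object asked about in this thread would live here if it existed:
loose_object_path 280c12ab49223c64c6f914944287a7d049cf4dd0
```

For the hash in question the helper prints `.git/objects/28/0c12ab49223c64c6f914944287a7d049cf4dd0`, which is why the absence of any `28/0c...` file in the listing means the object is not there.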
Re: [PUB]corrupt repos does not return error with `git fsck`
On Wed, May 20, 2015 at 11:02:14AM -0700, Stefan Beller wrote:

> $ git clone https://github.com/fmitha/SICL
> $ cd SICL
> $ git show 280c12ab49223c64c6f914944287a7d049cf4dd0
> fatal: bad object 280c12ab49223c64c6f914944287a7d049cf4dd0
> $ git show 12323213123 # just to be sure to have a different error message for non-existing objects.
> fatal: ambiguous argument '12323213123': unknown revision or path not in the working tree.

I think 40 hex characters is special-cased. Using CGit as a repository with a submodule so I can easily get an unrelated SHA1 and short name:

cgit $ git show $(git -C git rev-parse @)
fatal: bad object bb8577532add843833ebf8b5324f94f84cb71ca0
cgit $ git show $(git -C git rev-parse --short @)
fatal: ambiguous argument 'bb85775': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
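The special-casing guessed at here depends only on the shape of the argument, which a short shell check can imitate. This is a deliberately simplified sketch: `classify_object_name` is a hypothetical helper, and Git's real name parsing handles refs, suffixes such as `^{tree}`, and more.

```shell
# Hypothetical helper: report which of the two error paths seen above an
# argument would take, judging only by whether it is exactly 40 hex digits.
classify_object_name() {
    if printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{40}$'; then
        echo "full 40-hex name: accepted as-is, existence checked later"
    else
        echo "not a full name: must be resolved against known objects and refs"
    fi
}

classify_object_name bb8577532add843833ebf8b5324f94f84cb71ca0
classify_object_name bb85775
```

The first call corresponds to the "bad object" message (the name is accepted and only fails when the object is read), the second to the "ambiguous argument" message (the name cannot be resolved at all).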