Re: [PATCH 16/18] fsck: support demoting errors to warnings
Hi Michael Junio, On 2015-01-22 18:17, Johannes Schindelin wrote: [...] we need to avoid confusing settings such as ``` [receive.fsck] warn = missing-tagger-entry error = missing-tagger-entry ``` I *think* I found a solution. Please let me recapitulate quickly the problem Michael brought up: if we support `receive.fsck.warn` to override `receive.fsck.error` and vice versa, with comma-separated lists, then it can be quite confusing to the user, and actually quite difficult to figure out on the command-line which setting is in effect (because it really depends on the *order* of the receive.fsck.* lines, *plus* the fact that the values are comma-separated lists). On the other hand, Junio pointed out two shortcomings with my original implementation (i.e. to support `receive.fsck.id = (error|warn|ignore)`), however: it is tedious to set multiple severity levels, and it violates the config file convention that the config variable names are CamelCased (the message IDs are dashed-lowercase instead). The solution I just implemented (and will send out shortly in v4 of the patch series) is the following: the config variable is called receive.fsck.severity and it accepts comma-separated settings. Example: ``` [receive fsck] severity = multiple-authors=ignore,missing-tagger=error ``` Now, it is *still* not the easiest to figure out the setting from the command-line: ```sh $ git config --get-all receive.fsck.severity | tr , \n | grep ^multiple-authors= | tail -n 1 ``` But I hope this is good enough, Michael? Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
On 12/23/2014 06:14 PM, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: On Tue, 23 Dec 2014, Junio C Hamano wrote: I suspect that it would be much better if the configuration variables were organized the other way around, e.g. $ git config fsck.warn missingTagger,someOtherKindOfError I had something similar in an earlier version of my patch series, but it was shot down rightfully: if you want to allow inheriting defaults from $HOME/.gitconfig, you have to configure the severity levels individually. Hmmm. What's wrong with fsck.warn -missingTagger that overrides the earlier one, or even fsck.info missingTagger after having fsck.warn other,missingTagger,yetanother, with the usual last one wins rule? Whoever shot it down rightfully is wrong here, I would think. Sorry I didn't notice this earlier; Johannes, please CC me on these series, especially the ones that I commented on earlier. I might have been the one who shot down the severity=name style of configuration [1]. I don't feel strongly enough to make a big deal about this, especially considering that the other alternative has already been implemented. But for the record, let me explain why I prefer the name=severity style of configuration. First, it is a truer representation of the data structure within the software, which is basically one severity value for each error type. This is not a decisive argument, but it often means that there is less impedance mismatch between the style of configuration and the concepts that it is configuring. For example, $ git config receive.fsck.warn A,B,C $ git config receive.fsck.error C,D,E seems to be configuring two sets, but it is not. It is mysteriously setting C to be an error, in seeming contradiction of the first line [2]. Second, it is not correct to say that this is just an application of the last setting wins rule. The last setting wins rule has heretofore, as far as I know, only covered *single* settings that take a single value. If we applied that rule to the following: $ git config receive.fsck.warn A,B,C $ git config receive.fsck.warn B,F then the net result would be B,F. But that is not your proposal at all; your proposal is for these two settings to be interpreted the same as $ git config receive.fsck.warn A,B,C,F Similarly, the traditional last setting rule, applied to the first example above, wouldn't cause the value of fsck.warn to be reduced to A,B, as you propose. This is not the last setting rule that we are familiar with--it operates *across and within* values and across *multiple* names rather than just across the values for a single name. Third, the severity=name style is hard to inquire via the command line, and probably also incompatible with the simplified internal config API in git (and probably libgit2, JGit, etc). The problem is that determining a *single* setting requires *three* configuration variables be inquired, and that the settings for those three variables need to be processed in the correct order, including the correct order of interleavings. For example, how would you inquire about the configured severity level of missingTaggerEntry using the shell? It would be a mess that would necessarily have to involve git config --get-regexp and error-prone parsing of comma-separated values. It would be so much easier to type $ git config receive.fsck.missingtaggerentry Fourth, the severity=name style would cause config files to get cluttered up with unused values. Suppose you have earlier run $ git config receive.fsck.warn A,B,C $ git config receive.fsck.ignore D,E and now you want to demote B to ignore. You can do $ git config --add receive.fsck.ignore B (don't forget --add or you've silently erased other, unrelated settings!) This gives the behavior that you want. But now your config file looks like [receive fsck] warn = A,B,C ignore = D,E ignore = B The B on the first line is now just being carried along for no reason, but it would be quite awkward to clean it up programmatically. Effectively, these settings can only be added to but never removed because of the way multiple properties are mashed into a single setting. I believe that one of the main arguments for the severity=name style of configuration is that it carries over more easily into convenient command-line options. But I think it will be unusual to want to configure these options by hand on the command line, let alone adjust many settings at the same time. The idea isn't to make it easy to work with repositories that have a level of breakage that fluctuates over time. It is to make it possible to work with *specific* repositories that have known breakage in their history. For such a repo you would configure one or two ignore options one time and then never adjust them again. (And it will also allow us to make our checks stricter in the future without breaking existing
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Hi Michael, On 2015-01-22 16:49, Michael Haggerty wrote: On 12/23/2014 06:14 PM, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: On Tue, 23 Dec 2014, Junio C Hamano wrote: I suspect that it would be much better if the configuration variables were organized the other way around, e.g. $ git config fsck.warn missingTagger,someOtherKindOfError I had something similar in an earlier version of my patch series, but it was shot down rightfully: if you want to allow inheriting defaults from $HOME/.gitconfig, you have to configure the severity levels individually. Hmmm. What's wrong with fsck.warn -missingTagger that overrides the earlier one, or even fsck.info missingTagger after having fsck.warn other,missingTagger,yetanother, with the usual last one wins rule? Whoever shot it down rightfully is wrong here, I would think. Sorry I didn't notice this earlier; Johannes, please CC me on these series, especially the ones that I commented on earlier. Very sorry, this is my fault. It can only be explained by my switching around some tools for other tools to work with email-based patch submission (which I had not done in a long time). But still, my mistake. [1] I prefer to think that I just offered a little gentle discussion that informed Johannes's independent decision :-) You did convince me back then. I just did not want to put up a fight against Junio because I was more interested in getting this feature merged before the holidays (it does feel awkward for me to leave work unwrapped-up before leaving for an extended amount of time, but I guess I am getting more used to that). So now I cannot avoid discussing this issue properly... In essence, I agreed with Junio from the point of view of an elegant implementation. But then, Michael is correct that it does not really matter as much how complicated the code is, but that it is much more important that the feature is elegant to use. Now let's step back a bit and think about the users which is supposed to be supported by this patch series: Git repository hosters -- such as GitHub -- need to ensure a certain cleanliness of the repositories they host (for a range of reasons, including the prevention of malicious attacks, or helping users publish their code in a correct form). And the scenario in which the feature needs to be used is most likely started by some Git user pushing some commits, and `git receive-pack` triggering an error. Then the user files a trouble ticket and GitHubber needs to inspect the error and the respective object. Now, in the vast number of cases I imagine that the objects *are* faulty. However, on occasion the problem should not prevent the push, e.g. when somebody crafted a commit object with two authors, forgetting that the tools usually cannot handle such commits. Then the GitHubber has to decide on a case by case basis whether to demote that error to a warning and allow the object to be pushed *into that specific repository*. I do see the need for this feature to be simple and robust, from the users' point of view. In other words, I agree with Michael that we need to avoid confusing settings such as ``` [receive.fsck] warn = missing-tagger-entry error = missing-tagger-entry ``` This feature will be used rarely enough that the poor soul stuck with interpreting the above config section won't remember that a very specific version of last setting wins is in effect. If I remember correctly, Peff suggested that there needs to be a way to handle these settings in the /etc/gitconfig $HOME/.gitconfig $XDG../gitconfig .git/config cascade, but now I am puzzled whether it is even desirable to demote fsck errors globally, i.e. whether we really need to pay attention to that config cascade. And finally, in the course of preparing this patch series, we came up with an alternative solution to the problem: the receive.fsck.skiplist (i.e. a file that contains a sorted list of SHA-1s of objects that should be skipped from fsck'ing). I am more and more convinced that this is the most convenient tool for the scenario described above: manual inspection of individual objects will tell whether it is safe to allow them onto the server or not. However, others might disagree and prefer the explicit approach, e.g. when some source generates a consistent stream of objects triggering fsck errors. Summary: I have no preference how to specify the severity levels of fsck messages, but I will gladly change my code to whatever you (meaning Junio and Michael in particular) want to see implemented. Thanks for helping me with this feature, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Hi Junio, On Mon, 22 Dec 2014, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: On Mon, 22 Dec 2014, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Of course you can say that! ;-) The problem these ugly messages try to solve is to give the user a hint which setting to change if they want to override the default behavior, though... Ahh, OK, then dashed form would not work as a configuration variable names, so missingTaggerEntry would be the only usable option. Except that cunning me has made it so that both missing-tagger-entry *and* missingTaggerEntry work... Then the missing-tagger-entry side needs to be dropped. The naming does not conform to the way how we name our officially supported configuration variables. But it does conform with the way we do our command-line parameters, and it would actually cause *more* work (and more complicated code) to have two separate parsers, or even to force the parser to accept only one way to specify settings... Should I really introduce more complexity just to disallow non-camelCased config variables? Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: Of course you can say that! ;-) The problem these ugly messages try to solve is to give the user a hint which setting to change if they want to override the default behavior, though... Ahh, OK, then dashed form would not work as a configuration variable names, so missingTaggerEntry would be the only usable option. Except that cunning me has made it so that both missing-tagger-entry *and* missingTaggerEntry work... Then the missing-tagger-entry side needs to be dropped. The naming does not conform to the way how we name our officially supported configuration variables. But it does conform with the way we do our command-line parameters, Hmmm What is the expected user interaction? The way I read the series was ($MISSING_TAGGER stands for the token we choose to show): (1) The user runs fsck without customization, and may see a warning (or error): $ git fsck error in tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER (2) The user demotes it to warning and runs fsck again: $ git config fsck.$MISSING_TAGGER warn $ git fsck warning: tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER I suspect that it would be much better if the configuration variables were organized the other way around, e.g. $ git config fsck.warn missingTagger,someOtherKindOfError Then a one-shot override would make sense and easier to give as command line option, e.g. $ git fsck --warn=missingTagger,someOtherKindOfError But the proposed organization to use one variable per questionable event type (as opposed to one variable per severity level) would lead to a one-shot override of this form, e.g. $ git fsck --missing-tagger=warn --some-other-kind-of-error=warn which I think is insane to require us to support unbound number of dashed options. Or are you saying that we allow git config core.file-mode true from the command line to set core.fileMode configuration? Puzzled... -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Hi Junio, On Tue, 23 Dec 2014, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Of course you can say that! ;-) The problem these ugly messages try to solve is to give the user a hint which setting to change if they want to override the default behavior, though... Ahh, OK, then dashed form would not work as a configuration variable names, so missingTaggerEntry would be the only usable option. Except that cunning me has made it so that both missing-tagger-entry *and* missingTaggerEntry work... Then the missing-tagger-entry side needs to be dropped. The naming does not conform to the way how we name our officially supported configuration variables. But it does conform with the way we do our command-line parameters, Hmmm What is the expected user interaction? The way I read the series was ($MISSING_TAGGER stands for the token we choose to show): (1) The user runs fsck without customization, and may see a warning (or error): $ git fsck error in tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER (2) The user demotes it to warning and runs fsck again: $ git config fsck.$MISSING_TAGGER warn $ git fsck warning: tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER The intended use case is actually when receive.fsckObjects = true and you call `git push`, seeing 'remote: error: $MULTIPLE_AUTHORS: ...'. Now, the $MULTIPLE_AUTHORS *config* setting is parsed by `git receive-pack`, but that is not the command that needs to customize the fsck call: it is either `git index-pack` or `git unpack-objects`. So what `git receive-pack` does is to pass the config options as command-line options to the called command. For consistency with the rest of Git, the command-line options were *not* camel-cased, but lower-case, dash-separated. The parser I wrote actually accepts both versions, allowing me to skip the tedious step to convert the camelCased config setting into a lower-case-dashed version to pass to `index-pack` or `unpack-objects`, only to be parsed by the same parser as `fsck` would use directly. So I am rather happy with the fact that the parser handles both camelCased and lower-case-dashed versions. I suspect that it would be much better if the configuration variables were organized the other way around, e.g. $ git config fsck.warn missingTagger,someOtherKindOfError I had something similar in an earlier version of my patch series, but it was shot down rightfully: if you want to allow inheriting defaults from $HOME/.gitconfig, you have to configure the severity levels individually. (The current solution also sidesteps the problematic situation when both fsck.warn *and* fsck.error contain, say, missingTagger.) Then a one-shot override would make sense and easier to give as command line option, e.g. $ git fsck --warn=missingTagger,someOtherKindOfError Yep, my first implementation actually used `--strict=missing-tagger,-some-demoted-error`. But as I mentioned above, that approach is not as flexible as the current one. But the proposed organization to use one variable per questionable event type (as opposed to one variable per severity level) would lead to a one-shot override of this form, e.g. $ git fsck --missing-tagger=warn --some-other-kind-of-error=warn which I think is insane to require us to support unbound number of dashed options. The intended use case is actually *not* the command-line, but the config file, in particular allowing /etc/gitconfig, $HOME/.gitconfig *and* .git/config to customize the settings. Or are you saying that we allow git config core.file-mode true from the command line to set core.fileMode configuration? I do not understand this reference. I did not suggest to change `git config`, did I? If I did, I apologize; it was definitely *not* my intention to change long-standing customs. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Junio C Hamano gits...@pobox.com writes: I suspect that it would be much better if the configuration variables were organized the other way around, e.g. $ git config fsck.warn missingTagger,someOtherKindOfError By the way, I think I like this organization is much better than the other way around, i.e. fsck.missingTagger=warn, as we do not want the namespace under fsck.* for variables that control the behaviour of fsck that are *NOT* kinds of questionable conditions fsck can find. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: The parser I wrote actually accepts both versions, allowing me to skip the tedious step to convert the camelCased config setting into a lower-case-dashed version to pass to `index-pack` or `unpack-objects`, only to be parsed by the same parser as `fsck` would use directly. So I am rather happy with the fact that the parser handles both camelCased and lower-case-dashed versions. That is myopic view of the world that ignores maintainability and teachability, doing disservice to our user base. What message does it send to an unsuspecting new user that fsck.random-error is silently accepted (because we will never document it) as an alias for fsck.randomError, while most of the configuration variables will not take such an alias? I suspect that it would be much better if the configuration variables were organized the other way around, e.g. $ git config fsck.warn missingTagger,someOtherKindOfError I had something similar in an earlier version of my patch series, but it was shot down rightfully: if you want to allow inheriting defaults from $HOME/.gitconfig, you have to configure the severity levels individually. Hmmm. What's wrong with fsck.warn -missingTagger that overrides the earlier one, or even fsck.info missingTagger after having fsck.warn other,missingTagger,yetanother, with the usual last one wins rule? Whoever shot it down rightfully is wrong here, I would think. But the proposed organization to use one variable per questionable event type (as opposed to one variable per severity level) would lead to a one-shot override of this form, e.g. $ git fsck --missing-tagger=warn --some-other-kind-of-error=warn which I think is insane to require us to support unbound number of dashed options. The intended use case is actually *not* the command-line, but the config file, in particular allowing /etc/gitconfig, $HOME/.gitconfig *and* .git/config to customize the settings. But we do need to worry about one-shot override from the command line. A configuration that sticks without a way to override is a no-no. Or are you saying that we allow git config core.file-mode true from the command line to set core.fileMode configuration? I do not understand this reference. I was puzzled by your command line and wondering if you meant from the command line, aVariable can be spelled a-variable. I did not suggest to change `git config`, did I? If I did, I apologize; it was definitely *not* my intention to change long-standing customs. Then fsck.missing-tagger is definitely out. Long standing customs is that a multi-word token at the first and the last level is not dashed-multi-word. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Hi Junio, On Tue, 23 Dec 2014, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: The parser I wrote actually accepts both versions, allowing me to skip the tedious step to convert the camelCased config setting into a lower-case-dashed version to pass to `index-pack` or `unpack-objects`, only to be parsed by the same parser as `fsck` would use directly. So I am rather happy with the fact that the parser handles both camelCased and lower-case-dashed versions. That is myopic view of the world that ignores maintainability and teachability, doing disservice to our user base. Okay, so just to clarify: you want me to - split the parser into - a parser that accepts only camelCased variable names when they come from the config (for use in fsck and receive-pack), and - another parser that rejects camelCased variable names and only accepts lower-case-dashed, intended for command-line parsing in fsck, index-pack and unpack-objects, and - consequently have a converter from the camelCased variable names we receive from the config in receive-pack so we can pass lower-case-dashed settings to index-pack and unpack-objects. If you want it this way, I will do it this way. What message does it send to an unsuspecting new user that fsck.random-error is silently accepted (because we will never document it) as an alias for fsck.randomError, while most of the configuration variables will not take such an alias? I will not participate in a discussion about consistency again. There is nothing that can be done about it; what matters is what you will accept and what not. I will make the code stricter (and consequently more complex) if that is what you want. I suspect that it would be much better if the configuration variables were organized the other way around, e.g. $ git config fsck.warn missingTagger,someOtherKindOfError I had something similar in an earlier version of my patch series, but it was shot down rightfully: if you want to allow inheriting defaults from $HOME/.gitconfig, you have to configure the severity levels individually. Hmmm. What's wrong with fsck.warn -missingTagger that overrides the earlier one, or even fsck.info missingTagger after having fsck.warn other,missingTagger,yetanother, with the usual last one wins rule? I will change the code (next year...). But the proposed organization to use one variable per questionable event type (as opposed to one variable per severity level) would lead to a one-shot override of this form, e.g. $ git fsck --missing-tagger=warn --some-other-kind-of-error=warn which I think is insane to require us to support unbound number of dashed options. The intended use case is actually *not* the command-line, but the config file, in particular allowing /etc/gitconfig, $HOME/.gitconfig *and* .git/config to customize the settings. But we do need to worry about one-shot override from the command line. A configuration that sticks without a way to override is a no-no. And of course you can, by specifying the config setting via the -c command-line option. The only inconsistency here is that all other command-line options are lower-case-dashed, while the config settings are camelCased. Or are you saying that we allow git config core.file-mode true from the command line to set core.fileMode configuration? I do not understand this reference. I was puzzled by your command line and wondering if you meant from the command line, aVariable can be spelled a-variable. Well, of course, if you call `git -c aVariable command --option=a-variable` you have a nice accumulation of styles right there ;-) I did not suggest to change `git config`, did I? If I did, I apologize; it was definitely *not* my intention to change long-standing customs. Then fsck.missing-tagger is definitely out. Long standing customs is that a multi-word token at the first and the last level is not dashed-multi-word. But I already changed all of the patches to fsck.missingTagger. The only thing I did not do yet is to split the parser into two, one accepting only camelCased, one accepting only lower-case-dashed options, and a translator to convert from camelCase to lower-case-dashed versions (because it is a lot of work and additional complexity, as well as opportunity for bugs to hide because we'll have three code paths). I asked you above whether you want that, and I will do it if you say that you do. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: Okay, so just to clarify: you want me to - split the parser into - a parser that accepts only camelCased variable names when they come from the config (for use in fsck and receive-pack), and OK. - another parser that rejects camelCased variable names and only accepts lower-case-dashed, intended for command-line parsing in fsck, index-pack and unpack-objects, and - consequently have a converter from the camelCased variable names we receive from the config in receive-pack so we can pass lower-case-dashed settings to index-pack and unpack-objects. I am not sure about the latter two. This needs a design discussion what the command line options should be. I think the command line should be like this: git cmd --warn=missingTags,missingAuthor in the first place, i.e. we may invent tokens to denote new kinds of errors as we improve fsck, not with we may add options for new kinds of errors, i.e. the command line should not look like this: git cmd --missing-tags=warn --missing-author=warn And from that point of view, I see no reason to support the dashed variant anywhere in the code, neither in the config parser or in the command line parser. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Hi Junio, On Tue, 23 Dec 2014, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Okay, so just to clarify: you want me to - split the parser into - a parser that accepts only camelCased variable names when they come from the config (for use in fsck and receive-pack), and OK. - another parser that rejects camelCased variable names and only accepts lower-case-dashed, intended for command-line parsing in fsck, index-pack and unpack-objects, and - consequently have a converter from the camelCased variable names we receive from the config in receive-pack so we can pass lower-case-dashed settings to index-pack and unpack-objects. I am not sure about the latter two. This needs a design discussion what the command line options should be. I think the command line should be like this: git cmd --warn=missingTags,missingAuthor Okay. This contradicts the convention where Git uses lower-case-dashed command-line option values (e.g. on-demand, error-all, etc) and no camelCased options were present so far. But your wish is my command. Ciao, Dscho in the first place, i.e. we may invent tokens to denote new kinds of errors as we improve fsck, not with we may add options for new kinds of errors, i.e. the command line should not look like this: git cmd --missing-tags=warn --missing-author=warn And from that point of view, I see no reason to support the dashed variant anywhere in the code, neither in the config parser or in the command line parser. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Junio C Hamano gits...@pobox.com writes: Johannes Schindelin johannes.schinde...@gmx.de writes: Okay, so just to clarify: you want me to - split the parser into - a parser that accepts only camelCased variable names when they come from the config (for use in fsck and receive-pack), and OK. - another parser that rejects camelCased variable names and only accepts lower-case-dashed, intended for command-line parsing in fsck, index-pack and unpack-objects, and - consequently have a converter from the camelCased variable names we receive from the config in receive-pack so we can pass lower-case-dashed settings to index-pack and unpack-objects. I am not sure about the latter two. This needs a design discussion what the command line options should be. I think the command line should be like this: git cmd --warn=missingTags,missingAuthor in the first place, i.e. we may invent tokens to denote new kinds of errors as we improve fsck, not with we may add options for new kinds of errors, i.e. the command line should not look like this: git cmd --missing-tags=warn --missing-author=warn And from that point of view, I see no reason to support the dashed variant anywhere in the code, neither in the config parser or in the command line parser. Having said that, I think missingTags etc. should not be configuration variable names (instead, they should be values). Because of that, I do not think we need consistency between the way these tokens that denote kinds of errors fsck denotes are spelled and the way configuration variable names are spelled. In other words, I do not think there is nothing that comes from how our configuration variable names are spelled that gives preference to one over the other between the two styles: (1) Tokens are camelCased [fsck] warn = missingTagger,missingAuthor error = zeroPadTreeEntry $ git cmd --warn=missingTagger,missingAuthor (2) Tokens are dashed-multi-words [fsck] warn = missing-tagger,missing-author error = zero-pad-tree-entry $ git cmd --warn=missing-tagger,missing-author Do I have a strong preference between these two? Not really. My gut reaction is that (2) may be easier to read, but I can be persuaded either way. Who else has/had opinions on this? Earlier you said that the configuration the other way, i.e. [fsck] warn = missingTag, was shot down---who did shoot it? Perhaps that person may be able to point out where in my thinking above I am going in the wrong direction. Thanks. [Footnote] In either case, I'd recommend that we take [ ,]+ as inter-token separator to ease the use on the command line and config file, to allow these: [fsck] warn = missingTagger missingAuthor warn = missingTagger,missingAuthor $ git cmd --warn missingTagger,missingAuthor $ git cmd --warn missingTagger missingAuthor -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Hi Junio, On Tue, 23 Dec 2014, Junio C Hamano wrote: Having said that, I think missingTags etc. should not be configuration variable names (instead, they should be values). Because of that, I do not think we need consistency between the way these tokens that denote kinds of errors fsck denotes are spelled and the way configuration variable names are spelled. Okay. That makes more sense. Now I can remove the complexity introduced by teaching the parser to accept camelCased values, and we're golden. In either case, I'd recommend that we take [ ,]+ as inter-token separator to ease the use on the command line and config file And this is indeed the case: +void fsck_strict_mode(struct fsck_options *options, const char *mode) +... + while (*mode) { + int len = strcspn(mode, ,|), equal, msg_id; +... In other words, I even allowed the pipe symbol as separator. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: Hi Junio, On Tue, 23 Dec 2014, Junio C Hamano wrote: Having said that, I think missingTags etc. should not be configuration variable names (instead, they should be values). Because of that, I do not think we need consistency between the way these tokens that denote kinds of errors fsck denotes are spelled and the way configuration variable names are spelled. Okay. That makes more sense. I am sorry that I didn't step back and think about it earlier to notice that we shouldn't be talking about configuration variable name syntax. I could have saved us time going back and forth if I did so earlier. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Hi Junio, On Tue, 23 Dec 2014, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: On Tue, 23 Dec 2014, Junio C Hamano wrote: Having said that, I think missingTags etc. should not be configuration variable names (instead, they should be values). Because of that, I do not think we need consistency between the way these tokens that denote kinds of errors fsck denotes are spelled and the way configuration variable names are spelled. Okay. That makes more sense. I am sorry that I didn't step back and think about it earlier to notice that we shouldn't be talking about configuration variable name syntax. I could have saved us time going back and forth if I did so earlier. Do not worry. You were just trying to make this software better, same as I tried. Unfortunately, I will not be able to submit v2 of this patch series this year, but I will do so in the second week of January (including the change to the global array with the default severity levels because I do want to see this feature integrated). Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: Unfortunately, I will not be able to submit v2 of this patch series this year, but I will do so in the second week of January (including the change to the global array with the default severity levels because I do want to see this feature integrated). Heh, we are not in a hurry. Enjoy your holidays. A happy new year to you in advance ;-) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Hi Junio, On Wed, 10 Dec 2014, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: We already have support in `git receive-pack` to deal with some legacy repositories which have non-fatal issues. Let's make `git fsck` itself useful with such repositories, too, by allowing users to ignore known issues, or at least demote those issues to mere warnings. Example: `git -c fsck.missing-email=ignore fsck` would hide problems with missing emails in author, committer and tagger lines. Hopefully I do not have to repeat myself, but please do not have punctuations in the first and the last level of configuration variable names, i.e. fsck.missingEmail, not mising-email. I vetted the complete patch series and think I caught them all. Or do you want the error messages to also use camelCased IDs, i.e. warning in tag $tag: missingTaggerEntry: invalid format ... instead of warning in tag $tag: missing-tagger-entry: invalid format ... ? It is easy to do, but looks slightly uglier to this developer's eyes... Should these be tied to receive-pack ones in any way? E.g. if you set fsck.missingEmail to ignore, you do not have to do the same for receive and accept a push with the specific error turned off? Not a rhetorical question. I can see it argued both ways. The justification to defend the position of not tying these two I would have is so that I can be more strict to newer breakages (i.e. not accepting a push that introduce a new breakage by not ignoring with receive.fsck.*) while allowing breakages that are already present. The justification for the opposite position is to make it more convenient to write a consistent policy. Whichever way is chosen, we would want to see the reason left in the log message so that people do not have to wonder what the original motivation was when they decide if it is a good idea to change this part of the code. Hmm. I really tried very hard to separate the fsck.* from the receive.* settings because the two code paths already behave differently: many warnings reported (and ignored) by fsck are fatal errors when pushing with receive.fsckObjects=true. My understanding was that these differences are deliberate, and my interpretation was that the fsck and the receive settings were considered to be fundamentally different, even if the same fsck machinery performs the validation. If you agree, I would rephrase this line of argument and add it to the commit message. Do you agree? Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: Or do you want the error messages to also use camelCased IDs, i.e. warning in tag $tag: missingTaggerEntry: invalid format ... instead of warning in tag $tag: missing-tagger-entry: invalid format ... ? It is easy to do, but looks slightly uglier to this developer's eyes... Do you really need to know what I think? Can I say The latter is probably slightly better, but both look ugly to me? If we want a human readable message warning: tag object lacks tagger field '$tag' would be my preference. But I personally think it may not be necessary to give a pretty message that later can go through l10n here. If we take that position, it is just a machine-readble error token, so I'd say whichever way you find more readable is OK. Should these be tied to receive-pack ones in any way? E.g. if you set fsck.missingEmail to ignore, you do not have to do the same for receive and accept a push with the specific error turned off? Not a rhetorical question. I can see it argued both ways. The justification to defend the position of not tying these two I would have is so that I can be more strict to newer breakages (i.e. not accepting a push that introduce a new breakage by not ignoring with receive.fsck.*) while allowing breakages that are already present. The justification for the opposite position is to make it more convenient to write a consistent policy. Whichever way is chosen, we would want to see the reason left in the log message so that people do not have to wonder what the original motivation was when they decide if it is a good idea to change this part of the code. Hmm. I really tried very hard to separate the fsck.* from the receive.* settings because the two code paths already behave differently:... If you agree, I would rephrase this line of argument and add it to the commit message. Do you agree? Yeah, that reasoning sounds sensible. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Hi Junio, On Mon, 22 Dec 2014, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Or do you want the error messages to also use camelCased IDs, i.e. warning in tag $tag: missingTaggerEntry: invalid format ... instead of warning in tag $tag: missing-tagger-entry: invalid format ... ? It is easy to do, but looks slightly uglier to this developer's eyes... Do you really need to know what I think? Well, but yes ;-) Can I say The latter is probably slightly better, but both look ugly to me? Of course you can say that! ;-) The problem these ugly messages try to solve is to give the user a hint which setting to change if they want to override the default behavior, though... If we want a human readable message warning: tag object lacks tagger field '$tag' would be my preference. But I personally think it may not be necessary to give a pretty message that later can go through l10n here. If we take that position, it is just a machine-readble error token, so I'd say whichever way you find more readable is OK. Okay, I will leave it as-is, then. Should these be tied to receive-pack ones in any way? E.g. if you set fsck.missingEmail to ignore, you do not have to do the same for receive and accept a push with the specific error turned off? Not a rhetorical question. I can see it argued both ways. The justification to defend the position of not tying these two I would have is so that I can be more strict to newer breakages (i.e. not accepting a push that introduce a new breakage by not ignoring with receive.fsck.*) while allowing breakages that are already present. The justification for the opposite position is to make it more convenient to write a consistent policy. Whichever way is chosen, we would want to see the reason left in the log message so that people do not have to wonder what the original motivation was when they decide if it is a good idea to change this part of the code. Hmm. I really tried very hard to separate the fsck.* from the receive.* settings because the two code paths already behave differently:... If you agree, I would rephrase this line of argument and add it to the commit message. Do you agree? Yeah, that reasoning sounds sensible. I added this paragraph: In the same spirit that `git receive-pack`'s usage of the fsck machinery differs from `git fsck`'s – some of the non-fatal warnings in `git fsck` are fatal with `git receive-pack` when receive.fsckObjects = true, for example – we strictly separate the fsck.* from the receive.fsck.* settings. Ciao, Dscho
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: Of course you can say that! ;-) The problem these ugly messages try to solve is to give the user a hint which setting to change if they want to override the default behavior, though... Ahh, OK, then dashed form would not work as a configuration variable names, so missingTaggerEntry would be the only usable option. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Hi Junio, On Mon, 22 Dec 2014, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Of course you can say that! ;-) The problem these ugly messages try to solve is to give the user a hint which setting to change if they want to override the default behavior, though... Ahh, OK, then dashed form would not work as a configuration variable names, so missingTaggerEntry would be the only usable option. Except that cunning me has made it so that both missing-tagger-entry *and* missingTaggerEntry work... Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: Hi Junio, On Mon, 22 Dec 2014, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Of course you can say that! ;-) The problem these ugly messages try to solve is to give the user a hint which setting to change if they want to override the default behavior, though... Ahh, OK, then dashed form would not work as a configuration variable names, so missingTaggerEntry would be the only usable option. Except that cunning me has made it so that both missing-tagger-entry *and* missingTaggerEntry work... Then the missing-tagger-entry side needs to be dropped. The naming does not conform to the way how we name our officially supported configuration variables. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 16/18] fsck: support demoting errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: We already have support in `git receive-pack` to deal with some legacy repositories which have non-fatal issues. Let's make `git fsck` itself useful with such repositories, too, by allowing users to ignore known issues, or at least demote those issues to mere warnings. Example: `git -c fsck.missing-email=ignore fsck` would hide problems with missing emails in author, committer and tagger lines. Hopefully I do not have to repeat myself, but please do not have punctuations in the first and the last level of configuration variable names, i.e. fsck.missingEmail, not mising-email. Should these be tied to receive-pack ones in any way? E.g. if you set fsck.missingEmail to ignore, you do not have to do the same for receive and accept a push with the specific error turned off? Not a rhetorical question. I can see it argued both ways. The justification to defend the position of not tying these two I would have is so that I can be more strict to newer breakages (i.e. not accepting a push that introduce a new breakage by not ignoring with receive.fsck.*) while allowing breakages that are already present. The justification for the opposite position is to make it more convenient to write a consistent policy. Whichever way is chosen, we would want to see the reason left in the log message so that people do not have to wonder what the original motivation was when they decide if it is a good idea to change this part of the code. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 16/18] fsck: support demoting errors to warnings
We already have support in `git receive-pack` to deal with some legacy repositories which have non-fatal issues. Let's make `git fsck` itself useful with such repositories, too, by allowing users to ignore known issues, or at least demote those issues to mere warnings. Example: `git -c fsck.missing-email=ignore fsck` would hide problems with missing emails in author, committer and tagger lines. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- Documentation/config.txt | 13 + builtin/fsck.c | 15 +++ t/t1450-fsck.sh | 11 +++ 3 files changed, 39 insertions(+) diff --git a/Documentation/config.txt b/Documentation/config.txt index b3276ee..fa58c26 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1192,6 +1192,19 @@ filter.driver.smudge:: object to a worktree file upon checkout. See linkgit:gitattributes[5] for details. +fsck.*:: + With these options, fsck errors can be switched to warnings and + vice versa by setting e.g. `fsck.bad-name` to `warn` or `error` + (or `ignore` to hide those errors completely). For convenience, + fsck prefixes the error/warning with the name of the option, e.g. + missing-email: invalid author/committer line - missing email + means that setting `fsck.missing-email` to `ignore` will hide that + issue. For convenience, camelCased options are accepted, too (e.g. + `fsck.missingEmail`). ++ +This feature is intended to support working with legacy repositories +which cannot be repaired without disruptive changes. + gc.aggressiveDepth:: The depth parameter used in the delta compression algorithm used by 'git gc --aggressive'. This defaults diff --git a/builtin/fsck.c b/builtin/fsck.c index 99d4538..2b8faa4 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -46,6 +46,19 @@ static int show_dangling = 1; #define DIRENT_SORT_HINT(de) ((de)-d_ino) #endif +static int fsck_config(const char *var, const char *value, void *cb) +{ + if (starts_with(var, fsck.)) { + struct strbuf sb = STRBUF_INIT; + strbuf_addf(sb, %s=%s, var + 5, value ? value : error); + fsck_strict_mode(fsck_obj_options, sb.buf); + strbuf_release(sb); + return 0; + } + + return git_default_config(var, value, cb); +} + static void objreport(struct object *obj, const char *severity, const char *err) { @@ -638,6 +651,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) include_reflogs = 0; } + git_config(fsck_config, NULL); + fsck_head_link(); fsck_object_dir(get_object_directory()); diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index 019fddd..d74df19 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -283,6 +283,17 @@ test_expect_success 'rev-list --verify-objects with bad sha1' ' grep -q error: sha1 mismatch 63ff out ' +test_expect_success 'force fsck to ignore double author' ' + git cat-file commit HEAD basis + sed s/^author .*/,/ basis | tr , \\n multiple-authors + new=$(git hash-object -t commit -w --stdin multiple-authors) + test_when_finished remove_object $new + git update-ref refs/heads/bogus $new + test_when_finished git update-ref -d refs/heads/bogus + test_must_fail git fsck + git -c fsck.multiple-authors=ignore fsck +' + _bz='\0' _bz5=$_bz$_bz$_bz$_bz$_bz _bz20=$_bz5$_bz5$_bz5$_bz5 -- 2.0.0.rc3.9669.g840d1f9 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html