Date: Sat, 9 Aug 2025 12:56:41 +0200
From: Denys Vlasenko <dvlas...@redhat.com>
Message-ID: <e9f5e8d1-55c9-7913-4bc7-65f74ff55...@redhat.com>

| I don't demand that.

When you send to a bug list with a bug report claiming that existing behaviour is incorrect, that's what you're doing, even if not intentionally.

| What I "demand" (advocate for, or trying to facilitate) is to have
| consistent behavior across tools.

That's difficult to achieve, particularly between shells and other tools, as shells have quoting, in addition to \ escaping, and the difference wasn't always recognised - other tools that use glob matching generally don't have that, all they have is \ escaping, and that makes a difference to how things get interpreted.

And of course, regular expressions, used by other tools, are a completely different object, even if their uses are kind of similar, and there are some common syntax elements - but comparing glob and REs is no more rational than comparing C and C++ - different things, different rules.

Wrt glob matching, and shells: in sh, if a word to be expanded is \** there's no question that it means a literal asterisk, followed by anything. But that isn't an escaped asterisk at the start, it is a quoted asterisk, using the shell's backslash quoting mechanism. On the other hand, if you do X='\**' and then expand (unquoted) $X, there the 3 chars remain, nothing quoted, and the glob expansion does see an escaped asterisk rather than a quoted one. The difference is subtle, but it exists, and it affects just how shells perform glob processing.

| I look at it from the POV of the users. It's a particular source
| of PITA when supposedly compatible tools have these irritating
| corner-case differences.

It can be, though this kind of thing very rarely ever affects real life applications - there have been some notoriously broken behaviours in some applications that have remained for years, as in practice, nothing ever ventures far enough into the wild corners to encounter the brokenness.

You also have to look at things from the POV of the users of a particular tool.
They have become used to the way that tool works, and do not want to be told "we're going to change because we were different than this other tool, and if that breaks things for you, then sorry...". For that reason (to avoid that), no-one likes to change historic behaviour, even in corner cases, as the implementors just can't be sure what user code will be broken by any change. That makes it very difficult to ever reconcile the different cases, and is why the standards make some things unspecified - that's a warning to the users "Don't do that, the results might not be what you expect".

| What we (collectively in different tools) aren't doing well is:
| * we don't always communicate well

No, most people who are implementing something new just do what they believe is correct, and often that isn't what is actually done by wherever they're copying from. Then of course, different tools have different needs, and different contexts.

| * when we fix a bug, we don't always add a testsuite item to track it.
| The end result is that when we fix a bug in an obscure case,
| we often break another, previously fixed behavior in another obscure
| case.
|
| For example, dash does not seem to have a testsuite. It's not in
| the git tree, at least.

We (NetBSD) do, the shell tests (including for glob matching) are fairly extensive - I run them against other shells from time to time, but while running them is simple, interpreting the results is not. Some of our tests are testing our behaviour in what are technically unspecified cases; the tests are there to make sure that (as you said) a fix (or just a change) to something doesn't inadvertently alter some other case. So when our tests claim that dash, or bash, or ... is "broken" because those don't produce the same result we do, it needs careful examination to determine if there is a real bug, or just a different version of what is unspecified.

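To make the quoted-versus-escaped asterisk distinction from earlier concrete, here is a minimal sh sketch of the kind of check involved. It is only an illustration, not taken from the NetBSD tests or any other test suite; the helper name `matches` and the variable names are made up here. The first two results follow directly from shell quoting. The last two depend on how a given shell treats a backslash that arrives via an unquoted expansion, which is exactly what should be checked per shell rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): report whether
# word $1 matches pattern text $2. $2 is expanded unquoted on purpose,
# so its characters reach the pattern matcher with nothing quoted.
matches() {
    case $1 in
        $2) echo yes ;;
        *)  echo no ;;
    esac
}

# Case 1: \** written directly in the script. The backslash here is
# shell quoting, so the pattern is a literal asterisk, then anything.
case '*abc' in \**) direct_star=yes ;; *) direct_star=no ;; esac
case 'abc'  in \**) direct_plain=yes ;; *) direct_plain=no ;; esac

# Case 2: the same three characters stored in a variable. Nothing is
# quoted after expansion; the matcher itself sees the backslash.
X='\**'
len=${#X}
via_var_star=$(matches '*abc' "$X")
via_var_plain=$(matches 'abc' "$X")

printf 'direct  \\**  vs *abc: %s\n' "$direct_star"
printf 'direct  \\**  vs abc:  %s\n' "$direct_plain"
printf 'X holds %s chars\n' "$len"
printf 'via $X       vs *abc: %s\n' "$via_var_star"
printf 'via $X       vs abc:  %s\n' "$via_var_plain"
```

Running the same file under several shells and comparing the last two lines is the simple part; deciding whether a difference is a bug or just an unspecified case is the part that needs the careful examination.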
Beyond that, I am not really a fan of adding tests in general as a response to fixing bugs - rather the tests should find the bugs in the first place. If the bug exposed some unconsidered corner case, which hadn't been tested for previously, then that's a bug in the tests as well, which should be fixed. But if the bug being fixed is just a coding error, which won't magically just reappear, then adding a test for something already fixed is just wasting everyone's time.

| I don't care what the behavior would be deemed "correct"
| in this case, but it'll be better to have the same behavior
| across all Bourne shells.

It would be nice, but backwards compat means that just isn't likely to happen - whose version of what is correct would you follow? The most popular? (by what measure?) The most "correct"? Toss a coin?

For example, in one of your tests, I noticed that almost all shells gave the same result (not one you were advocating), but the NetBSD shell, and yash, gave a different one (also not one ...). (Yes, it was one of the unspecified cases - I've forgotten which at the minute). If we were to:

| Lets just choose one behavior, and add it to our testsuites
| to "solidify" it, so that it doesn't changes unnoticed
| in the future.

Which one should be picked? Whose users are we going to disappoint when their shell alters its behaviour ... and notice that almost none of them (users) ever actually read the standard, they just use whatever works (and perhaps is documented) for the shell they're using - though if you read the bash list long enough, you'll encounter users demanding that clearly broken (buggy) code (never documented as working) not be changed, because they had decided it was a feature, and adopted it.

And since I mentioned yash (with which I have nothing to do whatever, except it is one I build and test against) I would suggest that you use that for many of your tests, it is perhaps the most standards conforming of all of the shells that claim to be Bourne Shell variants. If it does something in a particular way, there is probably a reason. (Which isn't to say it has no bugs, it has a few).

It is certainly more conforming than the NetBSD shell (not particularly wrt pattern matching, but in other areas) - there are some requirements in the standard that I won't even consider implementing (the newest is "declaration utilities" which is one of the stupidest additions ever made - completely unnecessary (for standard syntax), and breaks things.)

kre