On Apr 14, 2017, at 5:09 PM, Ross Berteig <r...@cheshireeng.com> wrote:
> I've checked it in on the glob-docs branch until it has been read by at least
> one more pair of eyes.

How about two pair? (Because foureyes. Ahahah.)

Go put your Nomex underwear on; I’m a brutal copy editor.

Complaints, comments, and considerations, mostly in order of the current presentation:

1. It doesn’t tell you that globs and regexes are not the same thing. I see this confusion occasionally, so I think it’s worth a warning. We are, after all, targeting this at least partly at people who don’t already know what “glob” means. (Example: https://unix.stackexchange.com/q/279661)

2. I’d move that first parenthetical to a second sentence. It’s hard to read as-is. Consider: “Glob patterns are also accepted as options to certain commands as well as query parameters to certain pages.”

3. GLOB is all-caps in Fossil help output because it’s a variable parameter, but it should only be written that way in documentation when referring to syntax examples in Fossil command output or the corresponding docs on fossil-scm.org. GLOB is not an acronym. The correct term is “a glob pattern,” or more idiomatically, “a glob”:

    https://en.wikipedia.org/wiki/Glob_(programming)

    Link that Wikipedia article somewhere near the top of your article, too.

4. Para 2, sentence 1: make it two sentences. The second half doesn’t follow from the first. It’s an independent statement.

5. Para 2, sentence 2: nix the comma; the second part is not a complete sentence.

6. Nix para 4: we already know that most documentation exists to avoid the need to RTFS. It needn’t be stated here. :)

7. Move “any” definition to a sentence after the table: “Any other character matches that character exactly.” or similar.

8. “…additional features:” (Colon, not period.)

9. Ranges: How does that work with Unicode? That is, does [a-d] match ä in any collating order supported by Fossil? Does it depend on whether Fossil is linked to libicu? Answers to both should be given here. Let’s not be needlessly Anglocentric.

    (Qualifier given because I don’t mean to suggest that you must translate this whole document to other human languages.)

10. Matching hyphen: can you also put it first in a bracket expression, after an optional ^, as specified by POSIX?

    http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13

    If so, document it, and if not, file a ticket, because it’s a bug. (Another good link to put into the document.)

11. [^]] example: I’d prefer a different character here to avoid confusion. [^a] would be fine.

12. “...must match the entire name to be considered a match.” Emphasize this somehow. It’s a commonly-misunderstood aspect of Fossil’s glob matching rules.

13. “may have one GLOB per line.” -> “may be given as one glob per line.”

14. First para of “File names to match” section needs a rewrite:

    “The canonical name of a file has all directory separators changed to `/`, redundant slashes are removed, all `.` path components are removed, and all `..` path components are resolved. (There are additional details we won’t go into here.)”

    In that rewrite, I changed the treatment of slashes in part because I doubt “../foo” is left unchanged. By your rules, the .. component would have to have the leading slash to be affected. Double-check with the source, but I’m pretty sure I’m right.

    My rewrite also adds the redundant slash removal bit, as is proper according to POSIX, but may not be the way Fossil works; if it doesn’t work that way, that needs to be documented instead, since it will confuse those of us who know POSIX requires this. That is, /bin/ls and /bin////ls are the same thing on a POSIX box. Whether Fossil follows suit or not, it needs to be documented.

    (There’s an obscure POSIX rule that says two leading slashes must be left untouched, but I wouldn’t expect Fossil to obey this, and I certainly wouldn’t expect most readers to know about the rule and therefore expect the exception.)

15. There is no item #15.

16. “This has some consequences.” That seems to want to introduce a list, not stand alone as a complete paragraph. Make the following paragraph a bulleted list, and change . to :.

17. “Recall that…” I don’t think it’s clear from the earlier paragraph that \ becomes / even on Windows. It could be read as simply a bit of Unix-centrism, with some Windows-using readers disregarding it, thinking, “Yeah, yeah, I know what you really mean here.” That second-guesser would be wrong in this case, so say instead, “Fossil glob patterns always use forward slashes as path separators, even on Windows.”

18. The “Where are they used” and “Platform quirks” headers should be bigger. I think you have your # character counts mixed up in the Markdown source. (Or ---- where you mean ====, if you do your headings that way; I didn’t bother looking.)

19. /timeline -> `/timeline`

20. “It also can use” -> “It can also use”

21. “GLOB, LIKE, or REGEXP” I don’t think you want to talk about the SQLite operator/function GLOB here, as it’s confusing with respect to the simple treatment of globs in the rest of this document. Unless you’re going to give examples of GLOB-the-SQLite-function here, drop it from this list. (I think you can safely ignore that detail.)

22. Either give brief LIKE and REGEXP examples here in this document, add new documents for each and link to them, or drop mention of these details. As it stands, I’m left wondering why this doesn’t work:

    https://www.fossil-scm.org/index.html/timeline?chng=%25MakeLists.txt&ms=LIKE

    All three of my options prevent that confusion.

23. Does EXACT also work? This suggests that it should:

    https://sqlite.org/lang_expr.html

    (Another link opportunity. You can tell that I pepper my Markdown docs with links, can’t you?)

    If not, explain why not, lest someone else make the same leap. (And maybe file a feature request. It could be useful to bypass certain circles in the Quoting Inferno.)

24. “These settings are all lists of GLOBs.” Split the para here with the list between the two parts; end the first para with a colon.

25. “…or file in the repository’s…” -> “…or put a file in the repository’s”

26. If you’re going to cover `.fossil-settings` here at all, make it clear that this must be at the top level of the checkout directory. Adding `some/path/.fossil-settings/ignore-glob` to the repository won’t let you avoid prefixing globs with “some/path/”. This is a perfectly reasonable thing to try, particularly for Subversion and Git transplants, which allow .svnignore and .gitignore files anywhere in the tree, with matches based at the file’s location. (This Subversion transplant did it early in his Fossil career, and was annoyed when it didn’t work.)

27. Add a section or new document on transitioning from .fooignore files.

28. The “Commands that refer to globs” section should make clear that it is talking about things like the --clean and --ignore options to `fossil add`, and that it is not talking about the file lists some of these commands take. It should refer the reader to my next item’s rewrite, which covers that.

29. The Platform Quirks section needs a total rewrite. Sorry, but it left me confused, and I know what’s going on. How about this:

————————————————————

# Platform Quirks

Fossil glob patterns are based on the glob pattern feature of POSIX shells. Fossil glob patterns also have a quoting mechanism, discussed above. Because other parts of your operating system may interpret glob patterns and quotes separately from Fossil, it is often difficult to give glob patterns correctly to Fossil on the command line. Quotes and special characters in glob patterns are likely to be interpreted when given as part of a `fossil` command, causing unexpected behavior.

These problems do not affect [versioned settings files](/doc/trunk/www/settings.wiki) or Admin → Settings in Fossil UI.
Consequently, it is better to set long-term `*-glob` settings via these methods than to use `fossil settings` commands.

That advice doesn’t help you when you are giving one-off glob patterns in `fossil` commands. The remainder of this section gives remedies and workarounds for these problems.

## POSIX Systems

If you are using Fossil on a system with a POSIX-compatible shell — Linux, macOS, the BSDs, Unix, Cygwin, WSL, etc. — the shell may expand the glob patterns before passing the result to the `fossil` executable.

Sometimes this is exactly what you want. Consider this command for example:

    $ fossil add RE*

If you give that command in a directory containing `README.txt` and `RELEASE-NOTES.txt`, the shell will expand the command to:

    $ fossil add README.txt RELEASE-NOTES.txt

…which is compatible with the `fossil add` command’s argument list, which allows multiple files. Fossil doesn’t see the glob pattern at all, but since the command does what you almost certainly wanted anyway, it’s fine.

Now consider what happens instead if you say:

    $ fossil add --ignore RE* src/*.c

This *doesn’t* do what you want because the shell will expand both `RE*` and `src/*.c`, causing one of the two files matching the `RE*` glob pattern to be ignored and the other to be added to the repository. You need to say this in that case:

    $ fossil add --ignore 'RE*' src/*.c

The single quotes force a POSIX shell to pass the `RE*` glob pattern through to Fossil untouched, which will do its own glob pattern matching. There are other methods of quoting a glob pattern or escaping its special characters; see your shell’s manual.

POSIX shells also interpret the same quotation marks Fossil uses to handle things like spaces in file names, as discussed above.
For example, if you needed to add all files matching `RE*` to the repository except for a file called `REALLY SECRET STUFF.txt`, you could use nested quotes:

    $ fossil add --ignore "'REALLY SECRET STUFF.txt'" RE*

You could instead escape a second set of double quotation marks:

    $ fossil add --ignore "\"REALLY SECRET STUFF.txt\"" RE*

It bears repeating that the two glob patterns here are not interpreted the same way when running this command from a *subdirectory* of the top checkout directory as when running it at the top of the checkout tree. If these files were in a subdirectory of the checkout tree called `doc` and that was your current working directory, the command would have to be:

    $ fossil add --ignore "'doc/REALLY SECRET STUFF.txt'" RE*

instead. The Fossil glob pattern still needs the `doc/` prefix because Fossil always interprets glob patterns from the base of the checkout directory, not from the current working directory as POSIX shells do.

## Windows

Neither standard Windows command shell — `cmd.exe` or PowerShell — expands glob patterns the way POSIX shells do. Windows command shells rely on the command itself to do the glob pattern expansion. The way this works depends on several factors:

* the version of Windows you’re using
* which OS upgrades have been applied to it
* the compiler that built your Fossil executable
* whether you’re running the command interactively
* whether the command is built against a runtime system that does this at all
* whether the Fossil command is being run from a file named `*.BAT` vs being named `*.CMD`
* the phase of the moon and whether this is an odd-numbered Thursday. (No, not really, but the other caveats are all true. Yay, Windows!)

These factors also affect how a program like `fossil.exe` interprets quotation marks on its command line.
The fifth item above doesn’t apply to `fossil.exe` when built with typical tool chains, but we’ll see an example below where the exception applies in a way that affects how Fossil interprets the glob pattern.

The most common problem is figuring out how to get a glob pattern passed on the command line into `fossil.exe` without it being expanded by the C runtime library that your particular Fossil executable is linked to, which tries to act like the POSIX systems described above. Windows is not strongly governed by POSIX, so it has not historically hewed closely to its strictures.

(This section does not cover the [Microsoft POSIX subsystem](https://en.wikipedia.org/wiki/Microsoft_POSIX_subsystem), Windows’ obsolete [Services for Unix 3.*x*](https://en.wikipedia.org/wiki/Windows_Services_for_UNIX) feature, or the [Windows Subsystem for Linux](https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux). (The last is sometimes incorrectly called “Bash on Windows” or “Ubuntu on Windows.”) See the POSIX Systems section above for those cases.)

For example, consider how you would set `crlf-glob` to `*`. The naïve approach will not work:

    c:\...> fossil setting crlf-glob *

The C runtime library will expand that to the list of all files in the current directory, which will probably cause a Fossil error because Fossil expects either “`global`” or nothing after the command line parameter giving the setting’s new value. If you happened to run this in a directory with two files, one of which was called `global`, it might appear to work but do the wrong thing, depending on whether the `global` file name was expanded first or second.

Let’s try again:

    c:\...> fossil setting crlf-glob '*'

That may or may not work, depending on the factors listed above.
On one system where this was tested, it failed because the command shell sees that no file in the current directory matches the glob pattern `'*'`, so the command shell passed those three characters unchanged to `fossil.exe`, which stored them as-is. Then when Fossil went to apply that glob pattern to file names, it saw that the glob pattern is quoted, so it didn’t interpret `*` as meaning “any series of characters;” the quotes made Fossil skip the “looks like a text file” rules only for a file called exactly `'*'` rather than what we wanted, which was to skip those rule checks for all files at the top of the checkout directory.

An approach that *will* work reliably is:

    c:\...> echo * | fossil setting crlf-glob --args -

This works because the built-in command `echo` does not expand its arguments, and the global Fossil option `--args` makes it read further command arguments from `-`, meaning Fossil’s standard input, which is connected to the output of `echo` by the pipe.

Another correct approach is:

    c:\...> fossil setting crlf-glob *,

This works because the trailing comma prevents the command shell from matching any files, unless you happen to have files named with a trailing comma in the current directory. If the pattern matches no files, it is passed into Fossil’s `main()` function as-is by the C runtime system. Since Fossil uses commas to separate multiple glob patterns, this means “all files at the root of the Fossil checkout directory and nothing else.”

————————————————————

Feel free to eliminate my snark in the final bullet item. :)

Also double-check my Windows section rewrites. I tried some of it here, but my Windows-fu is weaker than my POSIX-fu.

A signed contributor agreement form is in the mail.
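
P.S. On items 1 and 12, a tiny sketch might help show how the glob/regex confusion bites. This is only an illustration under stated assumptions: it uses Python's standard `fnmatch` and `re` modules as stand-ins, not Fossil's own matcher, and the file names are made up. The one point it makes is that the same pattern text selects different names depending on whether it is read as a whole-name glob or as a regex searched anywhere in the name.

```python
import fnmatch
import re

# Hypothetical file names, chosen only for illustration.
names = ["README.txt", "src/README.txt", "README.txt.bak"]

pattern = "README*"

# Read as a glob: the pattern has to cover the entire name, and `*`
# stands for "any run of characters". (Python's fnmatch, not Fossil's code.)
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, pattern)]

# Read as a regex: `E*` means "zero or more E", and re.search() accepts
# a match anywhere inside the name.
regex_hits = [n for n in names if re.search(pattern, n)]

print("glob :", glob_hits)
print("regex:", regex_hits)
```

Read as a glob, the pattern skips `src/README.txt` because the name does not start with `README`; read as a regex, it hits all three names. Something along these lines, adapted to Fossil's actual rules, could back up the warning in the document.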