Re: [PATCH groff] tmac/hyphenex.en: add patterns for sequestrate & its derivates
On Wed Dec 18, 2024 at 2:15 AM CET, onf wrote:
> On Wed Dec 18, 2024 at 12:14 AM CET, Tadziu Hoffmann wrote:
> > There are a few other words that don't follow the pattern.
> > "filtrate" is the fluid that has been filtered, but I don't
> > think "to filtrate" is a valid word. And "orientation" is
> > the act or result of orienting, not "orientating".
>
> I didn't mean to imply that it's right this way (although according to
> Oxford, it is[1]). I was just pointing out that it didn't sound wrong
> to me (whereas your examples do; I don't know why).

By the way, although there is no entry for "filtrate", there is in fact
an entry for "orientate", and it's not marked "non-standard" either:

  orien-tate verb (BrE) = ORIENT
Re: [PATCH groff] tmac/hyphenex.en: add patterns for sequestrate & its derivates
On Wed Dec 18, 2024 at 12:14 AM CET, Tadziu Hoffmann wrote:
> > With that said, not being a native speaker, if I had to turn "sequestration"
> > into a verb, I would say "sequestrate" too and it would sound right to me...
>
> There are a few other words that don't follow the pattern.
> "filtrate" is the fluid that has been filtered, but I don't
> think "to filtrate" is a valid word. And "orientation" is
> the act or result of orienting, not "orientating".

I didn't mean to imply that it's right this way (although according to
Oxford, it is[1]). I was just pointing out that it didn't sound wrong
to me (whereas your examples do; I don't know why).

Looking the word up again shows that sequester has another, related
meaning ("to keep a jury together [somewhere to prevent] them from
talking to other people [...]") which sequestrate does not have.

~ onf

[1] The dictionary labels grammatically incorrect words such as "gonna"
    with "non-standard", but sequestrate is not labeled in ANY way,
    indicating it's not incorrect in any way (according to them).
Re: [PATCH groff] tmac/hyphenex.en: add patterns for sequestrate & its derivates
> With that said, not being a native speaker, if I had to turn "sequestration"
> into a verb, I would say "sequestrate" too and it would sound right to me...

There are a few other words that don't follow the pattern.
"filtrate" is the fluid that has been filtered, but I don't
think "to filtrate" is a valid word. And "orientation" is
the act or result of orienting, not "orientating".
Re: [PATCH groff] tmac/hyphenex.en: add patterns for sequestrate & its derivates
Hi Branden,

On Tue Dec 17, 2024 at 8:29 PM CET, G. Branden Robinson wrote:
> [...]
> (Did you hear that the Siberian traps appear to be roaring to life?[1]
> Many of us under the age of 60 can look forward to dying of heat stroke.)

Not really, but I know there are many places where the permafrost is
melting, so it doesn't surprise me. Frankly I don't read such news much;
the little I read every now and then is more than sad enough.

I guess what's worst is that all the purported solutions just result in
more pollution and environmental destruction without really solving
anything (except for income for those pushing them, obviously).

> > I am not sure about "sequestrated" and especially about
> > "sequestrating",
>
> I'm dubious about "sequestrate" itself, and therefore even more so of
> these derived forms. One or two other cases exist of UK English getting
> carried away with reduplicative affixes on verbs, but I can't summon any
> to mind right now. More common is the pointless suffixing of "-al" to
> "make" an adjective out of a word ending in "-ic" that _already is_ an
> adjective, like "ironical". UK English just loves this form of
> morphologic excess. I blame proximity to France.

It's funny because I would expect such stuff to come from American
English given how many of its speakers can't even distinguish between
"its" and "it's" or even "your" and "you're" :)

With that said, not being a native speaker, if I had to turn
"sequestration" into a verb, I would say "sequestrate" too and it would
sound right to me...

> > I have modified your script into the following to be in line with the
> > way I set up hyphenation:
> >   #!/bin/sh
> >   printf '.mso %s.tmac\n.ll 1Z\n\\&%s\n' "$1" "$2" |
>
> One _Z_? What _is_ this unit? And why isn't the formatter complaining
> about it?

Oh well, that's what I get for trying to write something 'smart' from
memory. I was trying to get the `z` unit, but what I really meant to say
was this:
  .ll \n[.H]u

Anyway, you're right: groff complains if I say `1z`, but not when I say
`1Z`.

> >   nroff -ww -Wbreak |
> >   sed -E '/^$/d' |
>
> `-E` is, I think, unnecessary here, since `^` and `$` as zero-width
> anchoring atoms are both valid POSIX BREs, not reserved to EREs. FYI.

Frankly, I don't care. I have a habit of using -E on anything that
doesn't default to ERE, because the last thing I want is accidentally
breaking a working script by changing some regex in a way that makes it
no longer work with BRE and forgetting to add the -E flag.

In my mind, I always want ERE behavior, so I give it the -E flag. Then
I don't have to remember all the differences between the two just to be
able to tell when I need to add the flag. This can be especially
frustrating when GNU extends BRE to include ERE features such as the `+`
quantifier.

> > It hyphenates correctly, too:
> >   se‐ques‐tra‐tion
> >
> > However, I have a file where hyphenation is set up like this:
> >   .mso en.tmac
> >   .de HY
> >   . hy 4
> >   ..
> >
> > (the macro HY is used after .nh to re-enable hyphenation.)
> > [...]
>
> > But when I put:
> >   .hw se-ques-tra-tion
> > after the above requests at the top of the document, it does.
> >
> > I have no idea what might cause this behavior. Running groff with
> > -ww does not reveal anything hyphenation-related.
>
> I think something might be misconfigured in your installation. :(

Yeah, my macros. (:

To expand on the very brief explanation I provided in my previous reply,
I had this:
  .so mac.tmac
  .mso en.tmac
  .de HY
  . hy 4
  ..

The mac.tmac file contains my version of the Mk macros, which set up
hyphenation parameters for Czech. The following lines override those
parameters with English ones. Well, except Mk has this great property of
initializing itself only after you use one of its macros, so that the
Czech hyphenation parameters which I configured in Mk's init were loaded
AFTER the English ones, not before.

I fixed it for now by manually initializing Mk just after loading it.
I will likely get rid of this initialization behavior altogether in the
future.

~ onf
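The BRE/ERE point in the exchange above can be checked directly. What follows is a minimal sketch, assuming a POSIX shell and a `sed` that accepts `-E` (GNU and BSD sed both do). The function names and the sample input are made up for illustration and do not come from the thread.

```shell
#!/bin/sh
# Sketch only: helper names and sample input are hypothetical.

# Delete empty lines, once with a BRE and once with an ERE.
# `^` and `$` are anchors in both dialects, so the results agree.
drop_blank_bre() { sed '/^$/d'; }
drop_blank_ere() { sed -E '/^$/d'; }

# Print lines matching ^a+$ in each dialect.
# In a BRE, `+` is an ordinary character; in an ERE it means "one or more".
match_plus_bre() { sed -n '/^a+$/p'; }
match_plus_ere() { sed -n -E '/^a+$/p'; }

printf 'x\n\ny\n' | drop_blank_bre   # same result as drop_blank_ere
printf 'aa\na+\n' | match_plus_bre   # selects the literal "a+" line
printf 'aa\na+\n' | match_plus_ere   # selects the "aa" line
```

For the `/^$/d` filter in the script the flag makes no difference, which is Branden's point. The `+` case shows the kind of silent change in meaning that onf's habit of always passing `-E` is meant to avoid.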
Re: [PATCH groff] tmac/hyphenex.en: add patterns for sequestrate & its derivates
Hi onf,

At 2024-12-17T19:48:24+0100, onf wrote:
> On Tue Dec 17, 2024 at 7:00 PM CET, G. Branden Robinson wrote:
> > Is that a standard English word? "Sequester" is; sometimes used in
> > U.S. criminal procedure to refer to a process of isolating a jury
> > during its deliberations. I think I've also seen it in fiscal
> > contexts.
> >
> > "sequester, sequestered, sequestering" would all be standard.
> >
> > [...]
> >
> > Hmm. "sequestration" _does_ seem standard to me, though.
>
> From Oxford Advanced Learner's Dictionary of Current English, 6th ed.:
>   se-ques-trate (also se-ques-ter) verb
>     (law) to take control of sb's property or ASSETS until a debt has
>     been paid
>     -> se-ques-tra-tion noun
>
> The word has gained another meaning since this book came out in the
> phrase "carbon sequestration", which britannica.com defines as
> "the long-term storage of carbon in plants, soils, geologic formations,
> and the ocean."

Yes, I'm familiar with that form of the word (as noted above) and this
application of it.

(Did you hear that the Siberian traps appear to be roaring to life?[1]
Many of us under the age of 60 can look forward to dying of heat stroke.)

> I am not sure about "sequestrated" and especially about
> "sequestrating",

I'm dubious about "sequestrate" itself, and therefore even more so of
these derived forms. One or two other cases exist of UK English getting
carried away with reduplicative affixes on verbs, but I can't summon any
to mind right now. More common is the pointless suffixing of "-al" to
"make" an adjective out of a word ending in "-ic" that _already is_ an
adjective, like "ironical". UK English just loves this form of
morphologic excess. I blame proximity to France.

But I digress...

> but I have added them anyway as they seem theoretically possible and I
> didn't want to risk they wouldn't hyphenate correctly.

Then the thing to do is put appropriate `hw` requests in your troffrc
file, into the document, or into a file that your document sources.
GNU troff's hyphenation exception files are not a good first location to
site hyphenations of nonstandard words.

> > If TeX doesn't handle this word, I'm inclined to advise that a
> > document do so itself with the `hw` request.
>
> I dunno. I don't have TeX installed.

I do, but don't know enough TeX to write a counterpart to my "hyphen"
script for it without doing a lot of homework first. Maybe someone else
here does.

> I have modified your script into the following to be in line with the
> way I set up hyphenation:
>   #!/bin/sh
>   printf '.mso %s.tmac\n.ll 1Z\n\\&%s\n' "$1" "$2" |

One _Z_? What _is_ this unit? And why isn't the formatter complaining
about it?

("What are you animals doing in my head? Why is Private Pyle out of his
bunk after lights out? Why is Private Pyle holding that weapon? Why
aren't you stomping Private Pyle's guts out?")

I see I have more work to do on Savannah #64240.

https://savannah.gnu.org/bugs/?64240

>   nroff -ww -Wbreak |
>   sed -E '/^$/d' |

`-E` is, I think, unnecessary here, since `^` and `$` as zero-width
anchoring atoms are both valid POSIX BREs, not reserved to EREs. FYI.

>   tr -d '\n' && echo
>
> It hyphenates correctly, too:
>   se‐ques‐tra‐tion
>
> However, I have a file where hyphenation is set up like this:
>   .mso en.tmac
>   .de HY
>   . hy 4
>   ..
>
> (the macro HY is used after .nh to re-enable hyphenation.)

Seems reasonable.

> ...and the word "sequestration" simply does not hyphenate.

Hmm. I can't reproduce this.

$ cat EXPERIMENTS/onf-hyphen.roff
.ll 10n
.na
.mso en.tmac
.de HY
. hy 4
..
sequestration sequestration
.nh
sequestration sequestration
.HY
sequestration sequestration
.pl \n[nl]u
$ nroff -ww -Wbreak EXPERIMENTS/onf-hyphen.roff
sequestra‐
tion se‐
questra‐
tion
sequestration
sequestration
sequestra‐
tion se‐
questra‐
tion

I get the same results with my working copy and with groff 1.23.0.

I even get the same results with groff 1.22.4, with this expected
additional diagnostic.

troff: EXPERIMENTS/onf-hyphen.roff:3: warning: can't find macro file 'en.tmac'

We didn't have "en.tmac" back then.

> But when I put:
>   .hw se-ques-tra-tion
> after the above requests at the top of the document, it does.
>
> I have no idea what might cause this behavior. Running groff with
> -ww does not reveal anything hyphenation-related.

I think something might be misconfigured in your installation. :(

What version of groff are you running? (Down to the commit, if
necessary. `groff --version` should disclose this information.)

I can try a build of that exact same commit, run it, and maybe we can
compare `pev` and/or `phw` request output.

Regards,
Branden

[1] https://www.theguardian.com/world/2024/dec/10/arctic-tundra-carbon-shift
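Branden's advice above is to keep exceptions for nonstandard words in `hw` requests that the document itself controls. A minimal sketch of that, assuming a POSIX shell: `hw_requests` is a hypothetical helper rather than anything shipped with groff, and the break points are the ones proposed in the patch under discussion, not checked against a dictionary.

```shell
#!/bin/sh
# Sketch only: hw_requests is a made-up helper, not part of groff.

# hw_requests WORD...: print one `.hw` request per pre-hyphenated word.
hw_requests() {
    for w in "$@"; do
        printf '.hw %s\n' "$w"
    done
}

# Break points as proposed in the patch; treat them as unverified.
hw_requests se-ques-tra-te se-ques-tra-ted se-ques-tra-ting se-ques-tra-tion
```

The output could be redirected to a file (any name will do) and pulled into a document with the `so` request, placed after the hyphenation setup at the top of the document, which is where onf reports the `hw` request taking effect.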
Re: [PATCH groff] tmac/hyphenex.en: add patterns for sequestrate & its derivates
On Tue Dec 17, 2024 at 7:48 PM CET, onf wrote:
> > But groff also breaks it just fine for me.
> >
> > $ hyphen sequestration
> > se‐ques‐tra‐tion
> >
> > $ cat ~/bin/hyphen
> > [...]
>
> However, I have a file where hyphenation is set up like this:
>   .mso en.tmac
>   .de HY
>   . hy 4
>   ..
>
> (the macro HY is used after .nh to re-enable hyphenation.)
>
> ...and the word "sequestration" simply does not hyphenate.
> But when I put:
>   .hw se-ques-tra-tion
> after the above requests at the top of the document, it does.
>
> I have no idea what might cause this behavior. Running groff with
> -ww does not reveal anything hyphenation-related.

Ugh. The hyphenation settings were being overridden by another macro
which was being triggered after the above.

Thanks for the assistance, and sorry for bothering you.

~ onf
Re: [PATCH groff] tmac/hyphenex.en: add patterns for sequestrate & its derivates
Hi Branden,

On Tue Dec 17, 2024 at 7:00 PM CET, G. Branden Robinson wrote:
> Is that a standard English word? "Sequester" is; sometimes used in
> U.S. criminal procedure to refer to a process of isolating a jury during
> its deliberations. I think I've also seen it in fiscal contexts.
>
> "sequester, sequestered, sequestering" would all be standard.
>
> [...]
>
> Hmm. "sequestration" _does_ seem standard to me, though.

From Oxford Advanced Learner's Dictionary of Current English, 6th ed.:
  se-ques-trate (also se-ques-ter) verb
    (law) to take control of sb's property or ASSETS until a debt has
    been paid
    -> se-ques-tra-tion noun

The word has gained another meaning since this book came out in the
phrase "carbon sequestration", which britannica.com defines as
"the long-term storage of carbon in plants, soils, geologic formations,
and the ocean."

I am not sure about "sequestrated" and especially about
"sequestrating", but I have added them anyway as they seem theoretically
possible and I didn't want to risk they wouldn't hyphenate correctly.

> Does TeX break these? Our hyphenation patterns, including the
> exceptions, come from TeX.
>
> If TeX doesn't handle this word, I'm inclined to advise that a document
> do so itself with the `hw` request.

I dunno. I don't have TeX installed.

> But groff also breaks it just fine for me.
>
> $ hyphen sequestration
> se‐ques‐tra‐tion
>
> $ cat ~/bin/hyphen
> [...]

I have modified your script into the following to be in line with the
way I set up hyphenation:
  #!/bin/sh
  printf '.mso %s.tmac\n.ll 1Z\n\\&%s\n' "$1" "$2" |
  nroff -ww -Wbreak |
  sed -E '/^$/d' |
  tr -d '\n' && echo

It hyphenates correctly, too:
  se‐ques‐tra‐tion

However, I have a file where hyphenation is set up like this:
  .mso en.tmac
  .de HY
  . hy 4
  ..

(the macro HY is used after .nh to re-enable hyphenation.)

...and the word "sequestration" simply does not hyphenate.
But when I put:
  .hw se-ques-tra-tion
after the above requests at the top of the document, it does.

I have no idea what might cause this behavior. Running groff with
-ww does not reveal anything hyphenation-related.

~ onf
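The script above sets the line length to `1Z`, which the thread elsewhere identifies as a slip. A minimal sketch of the same pipeline with `.ll 1u`, the value used in Branden's original script, is shown here. It assumes a POSIX shell and, for the final step, an installed `nroff` that provides `en.tmac`. The function names `roff_input` and `hyphenate` are hypothetical. The `-E` flag is dropped because this particular regex does not need it.

```shell
#!/bin/sh
# Sketch only: onf's script from the thread with the line length written
# as `1u` instead of `1Z`. Function names are made up for illustration.

# roff_input LANG WORD: print the roff source that is fed to nroff.
# LANG selects the macro file (e.g. "en" loads en.tmac).
roff_input() {
    printf '.mso %s.tmac\n.ll 1u\n\\&%s\n' "$1" "$2"
}

# hyphenate LANG WORD: format WORD on a one-unit line so that it breaks
# at every hyphenation point, then join the pieces onto a single line.
hyphenate() {
    roff_input "$1" "$2" |
        nroff -ww -Wbreak |
        sed '/^$/d' |
        tr -d '\n' && echo
}

# Run only when called with two arguments and nroff is available.
if [ "$#" -eq 2 ] && command -v nroff >/dev/null 2>&1; then
    hyphenate "$1" "$2"
fi
```

Saved as a script, it would be invoked with a language and a word, for example `en sequestration`. Keeping `roff_input` separate makes it easy to inspect exactly what the formatter is given when a word unexpectedly fails to break.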
Re: [PATCH groff] tmac/hyphenex.en: add patterns for sequestrate & its derivates
Hi onf,

At 2024-12-17T18:18:39+0100, onf wrote:
> ---
> These words currently don't hyphenate at all with en.tmac.

Is that a standard English word? "Sequester" is; sometimes used in
U.S. criminal procedure to refer to a process of isolating a jury during
its deliberations. I think I've also seen it in fiscal contexts.

"sequester, sequestered, sequestering" would all be standard.

Does TeX break these? Our hyphenation patterns, including the
exceptions, come from TeX.

If TeX doesn't handle this word, I'm inclined to advise that a document
do so itself with the `hw` request.

Hmm. "sequestration" _does_ seem standard to me, though.

But groff also breaks it just fine for me.

$ hyphen sequestration
se‐ques‐tra‐tion

$ cat ~/bin/hyphen
#!/bin/sh

: ${HY:=4}

for W
do
    printf ".hy $HY\n.ll 1u\n%s\n" "$W" | nroff -Wbreak | sed '/^$/d' \
        | tr -d '\n'
    echo
done

# vim:set ai et sw=4 ts=4 tw=80:

>  tmac/hyphenex.en | 4
>  1 file changed, 4 insertions(+)
>
> diff --git a/tmac/hyphenex.en b/tmac/hyphenex.en
> index 768c0af9d..bd7303613 100644
> --- a/tmac/hyphenex.en
> +++ b/tmac/hyphenex.en
> @@ -59,6 +59,10 @@
>   ring-leaders
>   round-table
>   round-tables
> + se-ques-tra-te
> + se-ques-tra-ted
> + se-ques-tra-ting
> + se-ques-tra-tion
>   single-space
>   single-spaced
>   single-spacing
> --
> 2.47.0

Regards,
Branden