On Thu, Feb 5, 2026 at 4:57 AM Thomas Wolff via Cygwin <[email protected]>
wrote:

> On 05.02.2026 at 12:11, Backwoods BC via Cygwin wrote:
> > On Wed, Feb 4, 2026 at 3:23 PM Dr Bean via Cygwin <[email protected]> wrote:
> >> On Wed, 04 Feb 2026, Brian Inglis via Cygwin wrote:
> >>
> >>> On 2026-02-04 12:03, Thomas Wolff via Cygwin wrote:
> >>>> On 04.02.2026 at 18:10, Brian Inglis via Cygwin wrote:
> >>>>> On 2026-02-04 02:56, Vincent via Cygwin wrote:
> >>>>>> My request is related to an issue I opened in the FLAC Github :
> >>>>>> https://github.com/xiph/flac/issues/861
> >>>>>> After some investigation, the issue is related to the build
> >>>>>> release of the FLAC package provided by Cygwin: the man pages of
> >>>>>> flac(1) and metaflac(1) use the HYPHEN (U+2010) character instead
> >>>>>> of the HYPHEN-MINUS (U+002D) character.
> >>>>>> These two commands expect the HYPHEN-MINUS character, so if you
> >>>>>> copy-paste the man page options into your terminal, they will fail.
> >>>>>> Example : flac ‐‐version
> >>>>>> will return an error: « can't open input file ‐‐version: No such
> >>>>>> file or directory », because « ‐‐version » was copy-pasted from
> >>>>>> the man pages with HYPHEN characters.
> >>>>>> The right string is « --version » with HYPHEN-MINUS (U+002D).
> >>>>>> Example : flac --version
> >>>>>> will return : « flac 1.5.0 »
> >>>>>> Please feel free to read the issue on GitHub
> >>>>>> ( https://github.com/xiph/flac/issues/861 ) for more details, as
> >>>>>> it's easier to read code and quotes with the markdown formatting.
> >>>>>> This is a pretty nasty kind of bug, because it's very difficult to
> >>>>>> distinguish HYPHEN-MINUS and HYPHEN in a terminal. It's also very
> >>>>>> difficult to figure out why the command failed, as the « No such
> >>>>>> file or directory » message does not point to the root cause of
> >>>>>> the problem.
> >>>>>> I think a new build release to fix this, would be very welcome.
> >>>>>> Thank you very much for your time and your great work. :)
> >>>> It’s really a nuisance that man (presumably gnu man, but I don’t
> >>>> remember the details of a previous discussion) changed interpretation
> >>>> of some important characters into „glyphs“ that some witty people
> >>>> thought to be nice but are completely non-functional.
> >>>> It applies not only to „-“ but also to „~“. Look at `man bash` and
> >>>> search for bashrc and you'll see the tilde symbol replaced by an ugly
> >>>> superscript „small tilde“. Why??
> >>>> Package maintainers are forced to adapt their man pages and either
> >>>> replace all occurrences of these characters by corresponding escapes
> >>>> or apply these two global tricks per man page:
> >>>> .char ^ \(ha
> >>>> .char - \N'45'
> >>> It appears to be a consequence more of groff -man being upgraded to
> >>> produce better quality typographic output more consistently with other
> >>> macro packages, output devices, and more comprehensive font, character,
> >>> and glyph sets, while not penalizing the other existing macro packages
> >>> originally designed and intended to produce quality output: see
> >>> groff(7), groff_rfc1345(7), and groff_char(7), for example:
> >>> "The developers of AT&T /troff/ chose mappings for them that would be
> >>> useful for typesetting technical literature in a broad range of
> >>> scientific disciplines
> >>> ...
> >>> Keycap  Appearance and meaning   Special character and meaning
> >>>    "     " neutral double quote   \[dq] neutral double quote
> >>>    '     ’ closing single quote   \[aq] neutral apostrophe
> >>>    -     ‐ hyphen                 \- or \[-] minus sign/Unix dash
> >>>    \     (escape character)       \e or \[rs] reverse solidus
> >>>    ^     ˆ modifier circumflex    \(ha circumflex/caret/“hat”
> >>>    `     ‘ opening single quote   \(ga grave accent
> >>>    ~     ˜ modifier tilde         \(ti tilde"
> >>> Really this tension between compatibility with tty input and
> >>> basic/draft and typographic quality output has existed since the
> >>> earliest days of computerized text formatting and typesetting with
> >>> various levels of higher quality output devices from dot matrix, daisy
> >>> wheel, phototypesetter, electrostatic, laser, and higher quality
> >>> rendering devices.
> >>> [Note: \N'#' refers to the current output font glyph index *NOT* an
> >>> input code.]
> >>>>> Upstream sources seem to provide only .md man sources and no b-r
> >>>>> package for conversion (pandoc unavailable from Cygwin) so man pages
> >>>>> are generated for the upstream sources, and this conversion
> >>>>> generates man page options with plain text hyphen-minus, which are
> >>>>> treated by man as normal text *hyphen* `‐` not plain text *minus*
> >>>>> `-`.
> >>>>> In man pages you use escaped hyphen-minus `\fB\-v\fR` to treat them
> >>>>> as minus text `-` as used in options `-v`.
> >>>>> We see this use of unescaped hyphens in the upstream tar files,
> >>>>> below, so please complain upstream about their man page generation,
> >>>>> and reopen their issue:
> >>>>> ```
> >>>>> $ wget https://mirror.../x86_64/release/flac/flac-1.5.0-1-src.tar.xz
> >>>>> $ tar -xvf flac-1.5.0-1-src.tar.xz
> >>>>> flac-1.5.0-1.src/
> >>>>> flac-1.5.0-1.src/flac-1.5.0.tar.xz            # upstream sources
> >>>>> flac-1.5.0-1.src/FLAC.cygport
> >>>>> $ tar -xvf flac-1.5.0-1.src/flac-1.5.0.tar.xz flac-1.5.0/man/{,meta}flac.1
> >>>>> flac-1.5.0/man/flac.1
> >>>>> flac-1.5.0/man/metaflac.1
> >>>>> $ grep -m5 '\\f[[{]\?B[]}]\\\?-' flac-1.5.0/man/{,meta}flac.1
> >>>>> flac-1.5.0/man/flac.1:\f[B]-\f[R] \f[I]\&...\f[R] ]
> >>>>> flac-1.5.0/man/flac.1:\f[B]flac\f[R] [ \f[B]-d\f[R] |
> >>>>> \f[B]--decode\f[R] | \f[B]-t\f[R] |
> >>>>> flac-1.5.0/man/flac.1:\f[B]--test\f[R] | \f[B]-a\f[R] |
> >>>>> \f[B]--analyze\f[R] ] [
> >>>>> flac-1.5.0/man/flac.1:\f[I]infile.ogg\f[R] | \f[B]-\f[R]
> >>>>> \f[I]\&...\f[R] ]
> >>>>> flac-1.5.0/man/flac.1:\f[B]-d\f[R], analysis with \f[B]-a\f[R] or
> >>>>> testing with \f[B]-t\f[R].
> >>>>> flac-1.5.0/man/metaflac.1:\f[B]-o\f[R] \f[I]filename\f[R]\f[B],
> >>>>> --output- name=\f[R]\f[I]filename\f[R]
> >>>>> flac-1.5.0/man/metaflac.1:\f[B]--preserve-modtime\f[R]
> >>>>> flac-1.5.0/man/metaflac.1:\f[B]--with-filename\f[R]
> >>>>> flac-1.5.0/man/metaflac.1:\f[B]--no-filename\f[R]
> >>>>> flac-1.5.0/man/metaflac.1:\f[B]--no-utf8-convert\f[R]
> >>>>> ```
> >>> --
> >> My experience with the man page of `which`, which
> >> mirrors that of Vincent with FLAC
> >>
> >> http://drbean.sdf.org/LooksLikeHyphen.html
> > My experience is that this is a problem pretty much everywhere on the
> > 'Net. Long ago I wrote a simple filter script to remove all
> > non-printing characters and CR and LF from the clipboard contents and
> > put the result back into the clipboard. I then display the contents
> > for a few seconds before closing the window. This won't properly deal
> > with Unicode in the copied data, but at least you can see that the
> > data is bogus.
> >
> > As a Cygwin newbie long ago, I was constantly getting errors because
> > of spurious CR characters in copied text. Sometimes it was completely
> > non-obvious that this was the problem and it wasn't until I started
> > using my filter script regularly that I stopped getting mysterious
> > errors.
> >
> > If someone wanted to write a clipboard "purifier" that would
> > de-Unicode and de-HTML the data, I'd be forever grateful. I wouldn't
> > have a clue how to go about this myself.
> >
> This isn't the issue. I don't think you'd want your clipboard contents
> mangled from what you copied.
>

You're right, I don't. I want a filter that I can pipe the clipboard
through when I think that there are characters I want "cleaned up."


> The problem discussed here is what `man` provides for copying in the
> first place.
>
> Revised my workaround for man page maintainers:
> .char - \-
> .char ^ \(ha
> .char ~ \(ti
>
> And you may add these or keep their special layout:
> .char ` \(ga
> .char ' \(aq
>
> Thomas


This is a two-part problem. Even if you manage to get man page suppliers to
clean up what they release (I have serious doubts about that happening, but
I hope I'm wrong), that does nothing about pasting from web pages,
documents, etc., where text is often mangled with HTML markup and/or
non-ASCII Unicode characters.
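
Since these look-alike characters are hard to spot by eye, a small checker
can at least show that copied text is bogus before it is pasted. The
following is only a rough sketch, assuming Python 3 is installed; the
function name `report_suspects` is made up for illustration.

```python
import unicodedata


def report_suspects(text):
    """List every character that is not plain printable ASCII.

    Returns tuples of (offset, code point, Unicode name). Newline and tab
    are treated as harmless; CR and other control characters are reported.
    """
    findings = []
    for offset, ch in enumerate(text):
        is_control = ch < " " and ch not in "\n\t"
        if ord(ch) > 0x7E or is_control:
            # Control characters have no Unicode name, hence the default.
            name = unicodedata.name(ch, "<unnamed>")
            findings.append((offset, f"U+{ord(ch):04X}", name))
    return findings


def print_report(text):
    """Print one line per suspicious character."""
    for offset, code, name in report_suspects(text):
        print(f"offset {offset}: {code} {name}")
```

Calling `print_report()` on the text copied from the flac man page would
flag both leading characters of the option as U+2010 HYPHEN.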

A filter such as I proposed would go a long way toward making copying and
pasting less cumbersome. I currently have another script (I admit to being
script-crazed) that dumps the clipboard into a temp file that I then open
in Notepad so that I can clean it up before pasting it. If my proposed
de-gunking filter existed, I could dispense with doing this a lot of the
time.
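
A minimal sketch of what such a filter might look like, assuming Python 3
is installed. The name `degunk`, the mapping table, and the crude tag
pattern are illustrative assumptions, not an existing tool, and the table
is far from exhaustive.

```python
import html
import re
import sys
import unicodedata

# Look-alike punctuation mapped to the ASCII a shell expects.
# Illustrative only: U+2018 is mapped to an apostrophe, which suits web
# text but not man pages, where it may stand for a backtick.
ASCII_EQUIVALENTS = {
    "\u2010": "-",  # HYPHEN
    "\u2011": "-",  # NON-BREAKING HYPHEN
    "\u2013": "-",  # EN DASH
    "\u2014": "-",  # EM DASH
    "\u2212": "-",  # MINUS SIGN
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
    "\u201c": '"',  # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',  # RIGHT DOUBLE QUOTATION MARK
    "\u02c6": "^",  # MODIFIER LETTER CIRCUMFLEX ACCENT
    "\u02dc": "~",  # SMALL TILDE
    "\u00a0": " ",  # NO-BREAK SPACE
}
TRANSLATION = str.maketrans(ASCII_EQUIVALENTS)
TAG = re.compile(r"<[^>]+>")  # crude: would also eat "a < b > c"


def degunk(text, strip_html=False):
    """Return text with look-alikes mapped to ASCII and junk removed."""
    if strip_html:
        # Remove tags first, then decode entities such as &nbsp; or &amp;.
        text = html.unescape(TAG.sub("", text))
    text = text.translate(TRANSLATION)
    # Drop control and format characters (CR, soft hyphen, zero-width
    # space, ...) while keeping newline and tab.
    return "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    )


def main():
    """Filter stdin to stdout; pass --html to strip tags as well."""
    sys.stdout.write(degunk(sys.stdin.read(), "--html" in sys.argv[1:]))
```

Saved as, say, `degunk.py` (a hypothetical name) with a call to `main()`
added at the end, it could be used as `getclip | python3 degunk.py | putclip`,
or read from and write to /dev/clipboard. Tag stripping is opt-in because
the pattern is too blunt to apply to ordinary text.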

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
