Christoffer Stjernlöf <a...@xkqr.org> writes:
> One convenient trick to apply emphasis to parts of a word is to separate
> that part with zero-width non-breaking spaces. This allows us to draw
> attention to puns like "the ex_cite_ment one feels when one's work is
> getting cited" by underlining the "cite" part of the word
> "excitement".[1]
>
> Org doesn't support this out of the box but by adding the zero-width
> non-breaking space to org-emphasis-regexp-components we can get this to
> work.

Regardless of the wider issue you raise: Note that `zero width space'​
(0x200B) already works without modifying org-emphasis-regexp-components,
as it is already covered by the [:space:] character class, unlike `zero
width no-break space' (0xFEFF).[fn:1]
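
If you want to check this on your own setup, here is a small sketch.
Note that [:space:] is resolved against the syntax table of the current
buffer, so the answer may differ between modes; evaluate it from an Org
buffer (e.g. with M-:) to see what Org sees.

  ;; Report whether each character matches the [:space:] class in the
  ;; current buffer's syntax table.
  (dolist (char '(#x200B #xFEFF #x2060))
    (message "U+%04X matches [:space:]: %s"
             char
             (if (string-match-p "[[:space:]]" (string char))
                 "yes"
                 "no")))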

Since it's not a no-break character, 0x200B can lead to unwanted line
breaks. A working solution is to strip zero-width spaces from the final
output with an export filter, which has the added advantage of not
outputting weird invisible characters. It takes a bit of setup, but not
much more than customizing org-emphasis-regexp-components does.
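
To type the character in the first place, `C-x 8 RET 200b RET' works.
A small command saves some keystrokes. This is only a sketch, and the
name my/insert-zero-width-space is made up for illustration:

  (defun my/insert-zero-width-space ()
    "Insert a zero width space (U+200B) at point."
    (interactive)
    (insert "\u200b"))

With that, the pun from the original message is typed as "ex", a zero
width space, "_cite_", another zero width space, and "ment".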

I have this in my .emacs (adapted from TEC's
https://blog.tecosaur.com/tmio/2021-05-31.html):

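  (defun my/org-export-remove-zero-width-space (text backend _info)
    "Remove zero width spaces from TEXT unless BACKEND derives from `org'."
    ;; Returning nil leaves TEXT unchanged, so Org-to-Org export keeps them.
    (unless (org-export-derived-backend-p backend 'org)
      (replace-regexp-in-string "\u200b" "" text)))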

  ;; The filter list is defined in ox.el, so wait until that is loaded.
  (with-eval-after-load 'ox
    (add-to-list 'org-export-filter-final-output-functions
                 #'my/org-export-remove-zero-width-space t))
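
A quick sanity check, assuming the function above has been evaluated, is
to call the filter by hand on a string containing two zero-width spaces.
The `html' back-end symbol and the nil info argument are just stand-ins
here, not what a real export would pass.

  (require 'ox)  ; provides org-export-derived-backend-p

  ;; Should return "ex_cite_ment", with both U+200B characters removed.
  (my/org-export-remove-zero-width-space
   "ex\u200b_cite_\u200bment" 'html nil)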


[fn:1] I don't know why `zero width no-break space' does not count as a
[:space:]; but the philosophical contradictions alone would be
staggering. :-) Also, in Unicode the character named `zero width
no-break space' (0xFEFF) is, apparently, meant to serve as a byte order
mark; its use as an actual no-break space is deprecated in favour of
`word joiner' (0x2060).

Yours,
Christian
