On Mon, Feb 19, 2018 at 09:33:28AM +0200, Oxygen XML Editor Support (Radu 
Coravu)  wrote:
> Hi Bernhard,
> It seems that for "nbsp" which has the decimal equivalent "160" you would
> need to type "ALT" and then "0160", that leading "0" seems to be important.
> The same probably for all other characters, type their decimal equivalent
> but it needs to be four typed figures.

Oh, how quickly we forget certain things.  :)

oXygen has had the ability to enter Unicode characters in the first
plane (the Basic Multilingual Plane) by their four-character
hexadecimal code point value since version 17.1.  I can't recall what
the default hotkey is for invoking
it because I changed mine (back) to F8 as soon as I installed that
version.  I believe I've still got the plugin you guys provided me
during my trial period for 17.0.

Anyway, if Bernhard is happy with using hex instead of decimal, that's
the solution instead of the Windows alt sequences (or the Mac
alt/option sequences either, for that matter).
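
For anyone translating a decimal value into the hex form, here is a
minimal Python sketch.  It assumes the decimal value is the Unicode
code point itself, as it is for nbsp (160); the function name is made
up for illustration.

```python
def decimal_to_hex4(decimal_code: str) -> str:
    """Convert a decimal code such as "0160" to zero-padded, 4-digit hex."""
    # int() ignores the leading zero; "04X" pads back out to four hex digits.
    return format(int(decimal_code, 10), "04X")


print(decimal_to_hex4("0160"))  # 00A0
```

So the "0160" from the alt sequence becomes "00A0" for hex entry.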

Accessing characters in planes beyond the first is difficult in most
programs, including oXygen.  Obviously XML can handle it, but the
accessing problems are twofold:

 1. Entering a code point made up of five or six hex characters on
    the remaining 16 planes (i.e. 0x10000 to 0x10ffff).

 2. Rendering characters which can only be displayed using multiple
    fonts and guaranteeing font fallback capabilities.

I have only one program which can handle both of these natively for
editing and that's GNU Emacs, but in those cases where I need to delve
into the upper planes I can open a file from oXygen in Emacs and
that'll do for now.
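
To make the first of those two problems concrete, this rough Python
sketch reports which plane a code point sits on and how many hex
digits are needed to type it.  The function name is hypothetical; the
only facts relied on are that each plane holds 0x10000 code points and
that Unicode stops at 0x10FFFF.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point on plane 16


def plane_and_digits(code_point: int) -> tuple:
    """Return (plane number, hex digits needed) for a Unicode code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError("not a Unicode code point: %#x" % code_point)
    plane = code_point >> 16  # each plane holds 0x10000 code points
    digits = max(4, len(format(code_point, "X")))
    return plane, digits


print(plane_and_digits(0x00A0))    # (0, 4)
print(plane_and_digits(0x1F926))   # (1, 5)
print(plane_and_digits(0x10FFFF))  # (16, 6)
```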

It might be worth having a look at extending the hex entry feature to
enable a way to enter a hex value of greater than 2 bytes (4 hex
characters), but oXygen takes that input differently to other
programs and so it might be trickier.  Emacs, LibreOffice and other
programs work by activating the hex input function (it's "M-x
insert-char" in Emacs) and then entering the code point hex value.  In
oXygen you enter the hex value as four characters in the document and
then press the hotkey which reads the preceding four characters and
transforms them.
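
The "type the digits, then press the hotkey" approach can be sketched
in Python as below.  This is only an illustration of the idea and not
oXygen's implementation; the function name and the decision to accept
four to six trailing hex digits are assumptions.

```python
import re

# Four to six hex digits sitting immediately before the cursor.
_TRAILING_HEX = re.compile(r"[0-9A-Fa-f]{4,6}$")


def convert_trailing_hex(text_before_cursor: str) -> str:
    """Replace a trailing run of hex digits with the character it names."""
    match = _TRAILING_HEX.search(text_before_cursor)
    if match is None:
        return text_before_cursor
    code_point = int(match.group(), 16)
    # Leave the text alone for surrogates and values beyond Unicode's range.
    if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        return text_before_cursor
    return text_before_cursor[:match.start()] + chr(code_point)


print(convert_trailing_hex("non-breaking: 00a0") == "non-breaking: \u00a0")  # True
print(convert_trailing_hex("1f926") == "\U0001F926")  # True
```

The catch is working out where the hex value starts.  With a fixed
width of four there is nothing to guess, but once five or six digits
are allowed, any hex-looking characters directly before the intended
value get swallowed too, which is the sort of thing that makes this
approach harder to extend than a prompt-first one.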

As for font fallback, there are pretty much no options for handling that
in oXygen, but there are effective workarounds by doing sneaky things
with CSS in the source files as well as the output formats.

I've got my own little Unicode cheat sheet which has been gradually
growing over the last decade or so and covers most of this in more
detail.  Bear in mind two things: first, it's a personal cheat sheet
that I only share because it often answers frequent questions I hear
elsewhere; and second, it's a "living document" that gets updated

That said, it's here:


Or to download it:


It's only ever released as a PDF because of all the font/glyph
embedding.  It claims or attempts to export as PDF/A-1, but only to
ensure font embedding, and it probably won't pass preflight checks
(nor does it need to).

For those few readers of this list who also use Emacs, the last three
pages of that file include those portions of my Emacs init file which
specify the fallback fonts using fontset default.  I've got coverage
from 0x0000 to 0x2ffff and where things occasionally misbehave,
they're easy to identify with the aid of the binding on F16 (i.e. M-x

Finally, my current favourite code point checking tool, for any system
with Perl installed, is unum.pl, available here:


The current version of the cheat sheet discusses it on page 23, but
here's a nice example of what it does:

bash-4.4$ unum.pl 0x1f926
   Octal  Decimal      Hex        HTML    Character   Unicode
 0374446   129318  0x1F926   &#129318;    "🤦"         FACE PALM

Obviously some of us can see that character properly and some can't,
but you all know which it is.
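
On a machine without Perl, roughly the same lookup can be done with
Python's standard unicodedata module.  This is a loose approximation
and not a port of unum.pl; the function name and column layout are
made up, and the character names depend on the Unicode version bundled
with the Python in use.

```python
import unicodedata


def describe_code_point(code_point: int) -> str:
    """Return octal, decimal, hex, HTML entity and name for a code point."""
    char = chr(code_point)
    name = unicodedata.name(char, "<no name>")
    return "0{0:o}  {0:d}  0x{0:X}  &#{0:d};  {1}".format(code_point, name)


print(describe_code_point(0x1F926))
```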

