[slurm-dev] Re: Incorrect handling of non-ASCII characters

Tim Wickberg Mon, 06 Jun 2016 07:27:18 -0700

We've created enhancement bug 2791 [0] to address this, and included some of the discussion from the list. I agree that ASCII-only is an untenable solution for modern software, and would like to work towards proper UTF8 support[1].

Slurm is (mostly) UTF8 compatible. This is why the umlaut in Schrödinger is printed correctly.


There are two issues that I plan to address in a future release:

1) Preventing the output routines from splitting in the middle of a multi-byte UTF8 character. Right now, 'squeue' and other commands may truncate a string in the middle of a multi-byte UTF8 character, which results in garbage (usually little rectangles) instead of the appropriate characters being printed to the screen.

2) Terminal output width calculation for arbitrary UTF8 strings. This is the original reported issue - multi-byte Unicode characters have their widths miscalculated, which throws the justification off, making the terminal output 'ugly'.
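
For illustration only, the two issues above can be sketched in a few lines of C. This is not Slurm's code: the function names utf8_safe_cut() and utf8_codepoints() are hypothetical, and the sketch assumes the input is already valid UTF8. Counting code points is only a first approximation of terminal width, since it ignores double-width and combining characters; a locale-aware routine such as POSIX wcswidth() would be needed for those.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Slurm): return the largest byte
 * length <= max_bytes at which s can be truncated without splitting
 * a multi-byte UTF-8 sequence. Assumes s is valid UTF-8.
 */
size_t utf8_safe_cut(const char *s, size_t max_bytes)
{
    size_t len = strlen(s);
    size_t cut;

    if (len <= max_bytes)
        return len;            /* nothing to truncate */

    cut = max_bytes;
    /*
     * UTF-8 continuation bytes have the bit pattern 10xxxxxx.
     * If the byte at the cut point is a continuation byte, the cut
     * would fall inside a character, so back up to its lead byte.
     */
    while (cut > 0 && ((unsigned char)s[cut] & 0xC0) == 0x80)
        cut--;
    return cut;
}

/*
 * Hypothetical helper (not part of Slurm): count code points by
 * counting every byte that is not a continuation byte. This is an
 * approximation of display width, not an exact column count.
 */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    for (; *s; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

With "Schrödinger" encoded as UTF8 (12 bytes, 11 code points), a 5-byte limit would fall between the two bytes of the 'ö', so utf8_safe_cut() backs up to 4 and the output ends cleanly at "Schr". Padding a column based on utf8_codepoints() rather than strlen() avoids the one-column misalignment that the extra byte would otherwise cause.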

A further issue that I do not currently plan to address is normalization of arbitrary strings[2]. Right now, 'Schrödinger' (umlaut) and 'Schrodinger' (no umlaut) are treated as separate accounts, which could cause some confusion.
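
To make the normalization problem concrete, here is a small sketch in C, assuming names are compared byte for byte (the helper same_bytes() and the constants are hypothetical, not taken from Slurm). It also shows a related case described in the normalization FAQ [2]: the same visible 'ö' can be encoded either as one precomposed code point or as 'o' followed by a combining diaeresis, and those two encodings do not compare equal either.

```c
#include <string.h>

/* Precomposed form: 'ö' is the single code point U+00F6 (bytes C3 B6). */
static const char NAME_PRECOMPOSED[] = "Schr\xc3\xb6" "dinger";

/* Decomposed form: 'o' followed by U+0308 COMBINING DIAERESIS (CC 88). */
static const char NAME_DECOMPOSED[] = "Schro\xcc\x88" "dinger";

/* Plain ASCII spelling with no umlaut at all. */
static const char NAME_ASCII[] = "Schrodinger";

/*
 * Hypothetical helper: byte-wise equality, which is what a plain
 * strcmp()-based lookup sees. No Unicode normalization is applied.
 */
int same_bytes(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

All three strings differ under same_bytes(), even though the first two render identically on most terminals. Treating them as one name would require normalizing both sides to a common form before comparing, which the C standard library does not provide; a Unicode library would be needed for that.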


- Tim

[0] https://bugs.schedmd.com/show_bug.cgi?id=2791
[1] http://utf8everywhere.org/
[2] http://unicode.org/faq/normalization.html

--
Tim Wickberg
Director of Support, SchedMD LLC
Commercial Slurm Development and Support
