To me, the distinction stems from which characters are considered information 
separators (isblank) and which are format effectors (isspace), as subtypes of 
control characters. These matter only for parsing and printing C source file 
tokens before and after the preprocessing phase has completed. Both standards 
treat the use of any characters outside the basic/portable sets as 
implementation-defined, and all applications other than the compiler as 
undefined, so it appears to me that they don't consider it their place to 
comment further.


In a message dated 5/15/2018 4:57:12 PM Eastern Standard Time, 
[email protected] writes:

 
There's a related discussion
(https://www.zsh.org/mla/workers/2018/msg00567.html) on the zsh
mailing list about the [[:blank:]] character class.

That's not one of the original POSIX character classes. I
understand it was added when C99 added isblank() which was
inspired by GNU libc's isblank() (there since at least as far
back as 1991).

I could not find any standard documentation that clarifies
what it is meant to be for. All we know is that it's a subset
of [:space:] and should include at least SPC and TAB.

I always thought that it was meant to be horizontal whitespace.
That is, characters commonly found in text files that introduce
spacing *within* a line.

But then some locales on NetBSD include \v (vertical tabulation)
and most on OpenBSD include \v and \f (form feed/page break).
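
One way to check what a given system does is to probe the six
standard whitespace characters with isblank() and isspace(). The
following is only a minimal sketch: the names ws_class, classify and
report_whitespace are made up for illustration, and it covers
single-byte characters only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper type: membership of one character in the
 * [:blank:] and [:space:] classes of the current locale. */
struct ws_class {
    int blank;
    int space;
};

/* Classify a single-byte character using the current LC_CTYPE. */
struct ws_class classify(unsigned char c)
{
    struct ws_class r;
    r.blank = isblank(c) != 0;
    r.space = isspace(c) != 0;
    /* [:blank:] is expected to be a subset of [:space:]. */
    assert(!r.blank || r.space);
    return r;
}

/* Print one line per standard whitespace character. */
void report_whitespace(FILE *out)
{
    static const struct {
        unsigned char ch;
        const char *name;
    } ws[] = {
        {' ', "SPC"}, {'\t', "TAB"}, {'\n', "LF"},
        {'\v', "VT"}, {'\f', "FF"}, {'\r', "CR"},
    };

    for (size_t i = 0; i < sizeof ws / sizeof ws[0]; i++) {
        struct ws_class c = classify(ws[i].ch);
        fprintf(out, "%-3s blank=%d space=%d\n",
                ws[i].name, c.blank, c.space);
    }
}
```

To see locale-specific behaviour, call setlocale(LC_ALL, "") from
<locale.h> in main() before calling report_whitespace(stdout);
without that call the program runs in the "C" locale, where only SPC
and TAB are blank. Characters outside the single-byte range would
need iswblank() and iswspace() from <wctype.h> instead.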

ISO/IEC TR 30112 (see draft at
http://www.open-std.org/JTC1/SC35/WG5/docs/30112d10.pdf)
wants to exclude (from [:space:] and as a result from [:blank:]
as well) characters that should not be considered as
"boundaries" (like the non-breaking space).

Without clear direction, in practice what [[:blank:]] matches
outside the POSIX locale is completely random and inconsistent
from one system to the next.

Does/should POSIX offer an opinion on that?

-- 
Stephane
