On 3/28/25 12:46 PM, Paul Gilmartin wrote:
On Fri, 28 Mar 2025 11:05:05 -0400, Rick Troth wrote:
EXCELLENT observation
Postel's Law should be followed
https://en.wikipedia.org/wiki/Robustness_principle
No. This leads to chaos. Developers are inclined to obey
"be liberal in what you accept," which leaves no platform on which
to validate "be conservative in what you send."
I strongly (though incompletely) disagree with your observation.
I have always taken robustness to mean, for example, always send
ASCII+CR+LF as a line of internet plain text, but *require* only
ASCII+LF. If the CR is present, fine. It's going to get stripped out
anyhow. (By whatever interpreter is crunching it. I'm not talking about
files, necessarily.)
Stuff like that is NOT chaos.
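To make the CR+LF example concrete, here is a minimal Python sketch of that reading of the principle. The function names are made up for illustration and are not from any library or RFC: the sender always emits CR+LF, and the receiver requires only LF and strips a CR if one precedes it.

```python
def encode_line(text: str) -> bytes:
    """Conservative sender: always terminate the line with CR LF."""
    return text.encode("ascii") + b"\r\n"


def decode_lines(data: bytes) -> list[str]:
    """Liberal receiver: require only LF; strip an optional CR before it."""
    lines = []
    for raw in data.split(b"\n"):
        if raw.endswith(b"\r"):
            raw = raw[:-1]
        lines.append(raw.decode("ascii"))
    # A trailing LF leaves one empty fragment after the split; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

With this, decode_lines(b"a\r\nb\nc\r\n") treats all three lines alike, whichever terminator the sender used.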
z/OS is conservative in what it accepts and I have had problems
building FOSS which was probably "validated" with the liberal gcc.
Both senders and accepters should obey the same rules, which
should be the Standard.
Here is where I am "incomplete" in my objection to "conservative in what
it accepts".
I found that XLC (for normal C, not for C++) rejected a lot of my code
where I had gotten lazy and used // for comments instead of proper /*
... */.
In that case, I appreciate the compile time errors. They only made my
code cleaner.
But I have been bitten by IBM implementations where Big Blue were such
sticklers for the rules that WIDELY accepted exceptions just did not
fly. What a pain!
It's like "right turn on red after stop" in all fifty states except for
these three.
Better to avoid the astonishment factor.
Chaos indeed.
Consumer systems REGULARLY have blanks in filenames, which makes
scripted automation difficult. Other punctuation can also be troublesome.
Yes. There should be extensions to scripting languages to support
utilities generating tokenized lists of filenames.
We could all go with Tcl.
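One convention that already exists for the tokenized lists mentioned above is the NUL-delimited list, as produced by "find -print0" and consumed by "xargs -0". NUL cannot occur inside a Unix file name, so blanks and other punctuation need no quoting. A minimal Python sketch of a consumer follows; the function name is hypothetical:

```python
def split_nul_list(data: bytes) -> list[bytes]:
    """Split a NUL-delimited list of file names into individual names.

    Names stay as bytes because a Unix file name is a byte string,
    not necessarily valid text in any particular encoding.
    """
    # A trailing NUL terminator leaves an empty last item; skip empties.
    return [name for name in data.split(b"\0") if name]
```

Given b"my report.txt\0odd 'name'*.c\0", this returns the two names intact, blanks and quotes included.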
-- R; <><
On 3/27/25 10:52 PM, Joel Ewing wrote:
Saying that any 8-bit combination is "supported" in file names and
paths by Unix is misleading. It would be very bad form to use a
character code that is not associated with a standard displayable
glyph, even if the creation of such a file were allowed, because
reliably working with a file with "invisible" characters in the name
after creation would be impractical. Special characters like
single-quote, double-quote, asterisk, question-mark, etc. may be
allowed, but are inadvisable because their special semantic
significance causes problems when referring to files in commands and
standard utilities. Any broadening of file name rules in z/OS
would most likely need restrictions based on usage context as well.
What is allowed and sometimes useful in current versions of Linux and
Windows is almost any other arbitrary standard Unicode glyph in file
and path names, represented in UTF-8. I would assume that multi-byte
characters count as that many bytes against the 255-byte limit. You
can use characters in non-Latin alphabets and even use special
symbols, such as in the file path
~/gov_SNAFU/👊🇺🇸🔥.txt, provided all the utilities you plan to
use with the file support fonts that provide the appropriate glyphs
and the system allows you to represent Unicode characters -- in this
case U+1F44A (👊), U+1F1FA U+1F1F8 (🇺🇸), and U+1F525 (🔥) -- by some
means that isn't too tedious. Linux provides Unicode text entry
support globally at the Operating System level. Windows
unfortunately requires that support to be provided by individual
applications, and not all provide it.
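The byte-count assumption above can be checked with a short Python sketch. The helper names are hypothetical, and the 255 figure is simply the per-name limit mentioned above, not something queried from a real file system. The emoji are written as escapes so the example does not depend on the reader's font support.

```python
NAME_MAX = 255  # assumed per-name limit in bytes, as discussed above


def name_cost(component: str) -> tuple[int, int]:
    """Return (code points, UTF-8 bytes) for one file name component."""
    return len(component), len(component.encode("utf-8"))


def fits(component: str, limit: int = NAME_MAX) -> bool:
    """True if the UTF-8 encoding of the name fits within the byte limit."""
    return len(component.encode("utf-8")) <= limit


# The file name from the path above: fist, US flag (two code points), fire.
example = "\U0001F44A\U0001F1FA\U0001F1F8\U0001F525.txt"
print(name_cost(example))  # (8, 20)
```

Each of the four emoji code points takes 4 bytes in UTF-8, so the 8-code-point name costs 20 bytes against the limit, and a name of 64 such symbols (256 bytes) would already be too long.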
JC Ewing