Michael Jennings <[EMAIL PROTECTED]> writes:

> The issue centers on the documentation.  Philosophically, in my
> opinion, a program should be written so the documentation is easy to
> read.  In this case a hidden stripping of useless characters means
> that there is one less thing to explain in the manual.

No, it's one *more* thing to explain in the manual.  The only
characters universally agreed to be "useless" in the context of
parsing are the whitespace characters.  *Everything* else is subject
to serious consideration.  For example, "control characters" for you
might be UTF-8-encoded characters for someone else.

Not stripping them away without a very good reason to do so is, for
me, a simple matter of correctness.  The GNU coding standards seem to
suggest the same:

    (...) Or go for generality.  For example, Unix programs often
    have static tables or fixed-size strings, which make for
    arbitrary limits; use dynamic allocation instead.  Make sure your
    program handles NULs and other funny characters in the input
    files.  Add a programming language for extensibility and write
    part of the program in that language.

and:

    Utilities reading files should not drop NUL characters, or any
    other nonprinting characters _including those with codes above
    0177_.  The only sensible exceptions would be utilities
    specifically intended for interface to certain types of terminals
    or printers that can't handle those characters.  Whenever
    possible, try to make programs work properly with sequences of
    bytes that represent multibyte characters, using encodings such
    as UTF-8 and others.

> There is precedent for this.  Microsoft Windows is in some places
> written to get around shortcomings in the processors on which it
> runs.  Such accommodation puts quirkiness in the code, but it gets
> the job done.

In many cases Wget tries to accommodate its environment to ensure
smoother operation.  But with each such accommodation we are forced to
weigh the added "quirkiness" (entropy) of the code against the
benefit.  In this case, implementing correct support for ^Z is not
exactly trivial, and the benefit is minimal -- the ^Z characters don't
even appear in files normally created on platforms supported by Wget,
which are Unix and Windows.

You are trying to convince us otherwise by offering an easier implementation of ^Z, thereby reducing the costs. But unfortunately this easier implementation reduces correctness of the code, and is therefore not an option. Sorry.
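
P.S. To make the point about "control characters" concrete, here is a
small Python sketch.  It is an illustration only and has nothing to do
with Wget's actual code: strip_bytes and decodes_cleanly are
hypothetical helpers made up for this example.  It shows that a byte
which looks like a control code under one encoding can be part of an
ordinary character under another, so blindly dropping it corrupts the
input.

```python
# Illustration only: these helpers are hypothetical, not part of Wget.

def strip_bytes(data: bytes, unwanted: frozenset) -> bytes:
    """Drop every byte whose value is in `unwanted`, ignoring the encoding."""
    return bytes(b for b in data if b not in unwanted)


def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """Return True if `data` is still valid text in `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Bytes 0x80-0x9F are the C1 control codes in ISO-8859-1 ...
C1_CONTROLS = frozenset(range(0x80, 0xA0))
# ... and 0x1A is ^Z, the old DOS end-of-file marker.
CTRL_Z = frozenset({0x1A})

# The euro sign in UTF-8 is E2 82 AC; the middle byte is in the C1 range.
euro_utf8 = "\u20ac".encode("utf-8")
# Cyrillic capital KA (U+041A) in UTF-16LE is 1A 04; the first byte is ^Z.
ka_utf16 = "\u041a".encode("utf-16-le")

if __name__ == "__main__":
    cases = [
        ("euro sign, UTF-8", euro_utf8, C1_CONTROLS, "utf-8"),
        ("Cyrillic KA, UTF-16LE", ka_utf16, CTRL_Z, "utf-16-le"),
    ]
    for label, data, unwanted, encoding in cases:
        stripped = strip_bytes(data, unwanted)
        print(label, data, "->", stripped,
              "still valid:", decodes_cleanly(stripped, encoding))
```

In both cases the input is valid text before stripping and no longer
decodes afterwards, which is the kind of silent damage I want to
avoid.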