On Thursday, June 5, 2014 9:42:28 PM UTC+5:30, Chris Angelico wrote:
> On Fri, Jun 6, 2014 at 1:33 AM, Steven D'Aprano wrote:
> > In the Unix world, text formats and text
> > processing is much more common in user-space apps than binary processing.
> > Perhaps the definitive explanation and celebration of the Unix way is
> > Eric Raymond's "The Art Of Unix Programming":
> > http://www.catb.org/esr/writings/taoup/html/ch05s01.html
> Specifically, this from the opening paragraph:
> Text streams are a valuable universal format because they're easy for
> human beings to read, write, and edit without specialized tools. These
> formats are (or can be designed to be) transparent.
A fact that stops being true when you tie up text with encodings.
For two reasons:
1. The encode/decode function pair mapping between byte-strings and text
cannot be a bijection, because the byte-string set is larger than the text
set: some byte-strings decode to no text at all. This is the error that
Armin was hit by.
2. Since there is not one encoding but a zillion possible ones, we are not
talking about one (possibly universal) data structure but a zillion of
them: "Text streams are a universal format" - but in which encoded
form of text??
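
Both points can be seen in a few lines of Python 3. This is only a rough sketch: the sample bytes and the sample string are made up for illustration, and UTF-8, Latin-1 and UTF-16-LE are just three encodings picked from the zillion.

```python
# Point 1: some byte-strings are not the encoding of any text.
# (Hypothetical sample bytes; 0xFF can never start a UTF-8 sequence.)
raw = b"\xff\xfe\xfd"
try:
    raw.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False  # decode is only a partial function on bytes

# Point 2: one piece of text, several different byte forms.
text = "naïve café"  # made-up sample string
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1")
as_utf16 = text.encode("utf-16-le")

# text -> bytes -> text round-trips as long as the same encoding is used...
encodings = ("utf-8", "latin-1", "utf-16-le")
round_trips = all(text.encode(e).decode(e) == text for e in encodings)

# ...but decoding with a different encoding than the one used to encode
# can succeed silently and hand back different text.
mojibake = as_utf8.decode("latin-1")

print(decoded_ok, round_trips)
print(len(as_utf8), len(as_latin1), len(as_utf16))
print(repr(mojibake))
```

Note that Latin-1 happens to accept every byte-string, so how badly point 1 bites depends on the encoding; with UTF-8 it bites quickly.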