On Sat, 28 Jul 2012 19:34:39 +0300 Eli Zaretskii <[email protected]> wrote:
> > Almost nobody in the MS world uses the ^Z convention on purpose any
> > more; many don't even know about it.

> They might not use or even know this, but the C library does.  And
> since the default open mode for ANSI C functions like fopen and
> Posix-like functions like _open is text, failure to open a binary file
> with O_BINARY resp. "rb" will cause the read operation to stop on the
> first byte whose value is 26.
>
> IOW, you don't need to know about this to be bitten by it.

And I'd thought the problem was due to using a compiler targeted at
DOS.  But no, I'd still get the problem using MinGW on Windows 7 if I
performed the UCA & UCD 4.1.0 collation conformance test naïvely
assuming that the test input file CollationTest_SHIFTED.txt is to be
treated as a text file.  At some stage in the past seven years, this
feature has been fixed, and the file itself no longer contains a raw
U+001A (byte value 26, i.e. ^Z), even though U+001A remains in the test
strings it defines.

Of course, I might have misinterpreted the test.  Perhaps on Windows
one only needed to compare the first 345 strings, not the full 123,088
strings :-)

Richard.
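For anyone who wants to see the effect for themselves, here is a minimal sketch in plain ANSI C.  The helper names count_bytes and write_sample and the sample file contents are my own illustration, not anything from the UCA test suite.  It writes a small file with a byte of value 26 in the middle and counts how many bytes fread delivers for a given fopen mode.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes fread delivers from `path`
   when the file is opened with `mode` ("r" or "rb").
   Returns -1 if the file cannot be opened. */
long count_bytes(const char *path, const char *mode)
{
    char buf[4096];
    size_t got;
    long total = 0;
    FILE *fp = fopen(path, mode);

    if (fp == NULL)
        return -1;
    while ((got = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long)got;
    fclose(fp);
    return total;
}

/* Hypothetical helper: write a 7-byte sample with 0x1A (^Z) in the
   middle.  Written in binary mode so the bytes land on disk as-is.
   Returns 0 on success, -1 on failure. */
int write_sample(const char *path)
{
    static const unsigned char data[] =
        { 'a', 'b', '\n', 0x1A, 'c', 'd', '\n' };
    size_t written;
    FILE *fp = fopen(path, "wb");

    if (fp == NULL)
        return -1;
    written = fwrite(data, 1, sizeof data, fp);
    if (fclose(fp) != 0 || written != sizeof data)
        return -1;
    return 0;
}
```

With "rb", count_bytes should report all 7 bytes on any platform.  With "r" on a C library that honours the ^Z convention, as described in the quoted text, the read is expected to stop at the 0x1A byte; on POSIX systems the two modes behave the same, so the difference will not show up there.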

