I'm adding cut -C to do column-based selection. What should it do about the middle of double-width characters? Right now I'm having it round down, so since Japanese text is double width in monospaced fonts:

  $ cat tests/files/utf8/japan.txt && echo
  私はガラスを食べられます。それは私を傷つけません。
  $ ./cut -C 5-11 tests/files/utf8/japan.txt
  ガラス

I.E. 5 skips the first 2 (which starts at column 4, the next display point _below_ 5), and then it continues to stop before the ending column. (So 5-11 is the same as 5-10, and 5-12 shows 4 characters because the 4th character includes column 12).

This is consistent, but I'm not sure if it's right...? Should the first one round up instead? (Since it's an exclusion range, should the start fail forward and the end fail backwards?)

Dunno...

Rob
_______________________________________________
Toybox mailing list
[email protected]
http://lists.landley.net/listinfo.cgi/toybox-landley.net
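
To make the two candidate behaviors in the message above concrete, here is a minimal sketch of one reading of them. It is not the toybox code: `select_columns` is a hypothetical helper, columns are assumed to be 1-based and inclusive, and the caller is assumed to supply each character's display width (2 for every character in the Japanese sample). Zero-width characters are not handled. In both modes a character that extends past the end column is dropped; the flag only changes what happens to a character that straddles the start column.

```c
/* Hypothetical helper for illustration only, not taken from toybox.
 * Assumptions: columns are 1-based and inclusive; widths[i] is the display
 * width of character i and is at least 1.
 *
 * Stores the half-open character range [*first, *last) that is selected.
 *   start_rounds_down != 0: a character straddling the start column is kept
 *   start_rounds_down == 0: that character is dropped ("fail forward")
 * Either way, a character reaching past the end column is dropped. */
static void select_columns(const int *widths, int count, int start, int end,
                           int start_rounds_down, int *first, int *last)
{
  int col = 1, i;

  /* An empty range at the end of the line until proven otherwise. */
  *first = *last = count;

  for (i = 0; i < count; i++) {
    int left = col;                   /* first column this character covers */
    int right = col + widths[i] - 1;  /* last column this character covers */

    col += widths[i];

    if (*first == count) {
      /* Still searching for the first selected character. */
      if (start_rounds_down ? right < start : left < start) continue;
      *first = i;
    }

    /* Stop before any character that would spill past the end column. */
    if (right > end) {
      *last = i;
      return;
    }
  }
}
```

With every width set to 2, ranges 5-10 and 5-11 both select characters 2 through 4 (ガラス, counting from 0), and 5-12 selects four characters, which matches the output shown above. The two modes only diverge when the start column lands mid-character: for 6-12 the round-down mode still begins at ガ, while the fail-forward mode begins at ラ.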
