On Wednesday, September 11, 2013 10:33:20 AM UTC-5, Edward K. Ream wrote:
> It's possible that file.readline (which at.readline uses) really doesn't handle utf-16 encoded newlines properly. It might be possible to work around that problem by converting the entire input file to unicode first, rather than converting the file on a line-by-line basis. We shall see.
Success! Converting a utf-16 file to unicode all at once does indeed work.

However, the prototype code (a new version of at.readline) has the side effect of breaking the @shadow logic. The solution will be to unify the read code still further. This will take at least several more hours of work.

Indeed, the read code needs a unified way of reading external files containing sentinels. This is a surprisingly tricky process: most files must be scanned twice, at least partially. The first scan determines the encoding contained in the @+leo sentinel. The second scan converts the entire file to unicode, based on that encoding.

But there are several special cases. In particular, if the file starts with a BOM, g.stripBOM
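For illustration only, here is a rough sketch of the two-scan idea in plain Python. It is not Leo's actual read code. The function name read_all_as_unicode, the "-encoding=<name>," field layout in the first sentinel line, the 400-byte head size, and the trick of dropping NUL bytes so an ASCII sentinel can be found inside utf-16 data are all assumptions made for this example. The BOM check uses the standard codecs constants rather than g.stripBOM.

```python
import codecs
import re

# BOMs paired with codec names. The utf-32 marks come first because the
# utf-32-le BOM begins with the same two bytes as the utf-16-le BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, 'utf-32-le'),
    (codecs.BOM_UTF32_BE, 'utf-32-be'),
    (codecs.BOM_UTF8, 'utf-8'),
    (codecs.BOM_UTF16_LE, 'utf-16-le'),
    (codecs.BOM_UTF16_BE, 'utf-16-be'),
]

# Assumed layout of the encoding field in the first sentinel line.
_ENCODING_FIELD = re.compile(rb'-encoding=([A-Za-z0-9_.\-]+),')


def read_all_as_unicode(data, default='utf-8'):
    """Return (text, encoding) for the raw bytes of an external file."""
    # Special case: a leading BOM settles the encoding immediately.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(name), name

    # First scan (partial): look only at the head of the file.
    # Dropping NUL bytes exposes an ASCII sentinel in utf-16/utf-32 data.
    head = data[:400].replace(b'\x00', b'')
    lines = head.splitlines()
    first_line = lines[0] if lines else b''
    match = _ENCODING_FIELD.search(first_line)
    encoding = match.group(1).decode('ascii') if match else default

    # Second scan: convert the whole file at once, not line by line.
    return data.decode(encoding), encoding


if __name__ == '__main__':
    src = '#@+leo-ver=5-thin-encoding=utf-16-le,.\nline one\nline two\n'
    text, encoding = read_all_as_unicode(src.encode('utf-16-le'))
    print(encoding)
    # Lines are split only after decoding, so utf-16 newlines are safe.
    for line in text.splitlines(True):
        print(repr(line))
```

Splitting into lines only after the whole file has been decoded is what avoids the readline problem: the newline test runs on unicode characters rather than on raw utf-16 bytes.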
