On Wednesday, September 11, 2013 10:33:20 AM UTC-5, Edward K. Ream wrote:

> It's possible that file.readline (which at.readline uses) really doesn't
> handle utf-16 encoded newlines properly.  It might be possible to work
> around that problem by converting the entire input file to unicode first,
> rather than converting the file on a line-by-line basis.  We shall see.

Success!  Converting a utf-16 file to unicode all at once does indeed 
work.  However, the prototype code (a new version of at.readline) has the 
side effect of breaking the @shadow logic.  The solution will be to unify the 
read code still further.  This will take at least several more hours of 
work.
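
For anyone following along, here is a minimal sketch of why the order of
operations matters.  It is not Leo's code: both function names are
hypothetical, and it assumes a little-endian utf-16 file with no BOM.

```python
import io

def decode_then_split(data, encoding):
    """Decode the whole byte string first, then split the text into lines."""
    text = data.decode(encoding)
    # io.StringIO splits on '\n' only and keeps the line endings.
    return io.StringIO(text).readlines()

def split_then_decode(data, encoding):
    """Split the raw bytes at the newline byte first, then decode each piece."""
    return [piece.decode(encoding) for piece in data.split(b"\n")]

# In utf-16-le every ASCII character, including the newline, takes two bytes.
sample = "first\nsecond\n".encode("utf-16-le")

print(decode_then_split(sample, "utf-16-le"))

try:
    split_then_decode(sample, "utf-16-le")
except UnicodeDecodeError as err:
    # Splitting at the newline byte leaves its trailing NUL byte at the
    # start of the next piece, so that piece has an odd number of bytes.
    print("line-by-line decoding failed:", err.reason)
```

Splitting the bytes first cuts a two-byte newline in half, which is the kind
of failure a byte-oriented readline can run into.  Decoding everything up
front avoids it.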

Indeed, the read code needs a unified way of reading external files 
containing sentinels.  This is a surprisingly tricky process: most 
files must be scanned twice, at least partially.  The first scan determines 
the encoding contained in the @+leo sentinel. The second scan converts the 
entire file to unicode, based on the encoding.  But there are several 
special cases.  In particular, if the file starts with a BOM, g.stripBOM 
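
A rough sketch of the two-scan idea, for illustration only.  This is not
the code in leoAtFile.py.  The names scan_encoding and read_external_file
are hypothetical, the "-encoding=<name>" field in the @+leo header is an
assumption about the sentinel format, and the BOM check is my own stand-in
rather than g.stripBOM.

```python
import codecs
import re

# BOMs to check, with UTF-32 first so it is not mistaken for UTF-16
# (the UTF-32-LE BOM starts with the UTF-16-LE BOM).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Assumed header shape: "#@+leo-ver=5-thin-encoding=utf-16,."
ENCODING_RE = re.compile(rb"@\+leo[^\n]*?-encoding=([\w-]+)")

def scan_encoding(data, default="utf-8"):
    """First scan: return (encoding, number of BOM bytes to skip)."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    # Crude trick: drop NUL bytes so an ASCII header stored as UTF-16
    # still matches the byte pattern.
    head = data[:1024].replace(b"\x00", b"")
    match = ENCODING_RE.search(head)
    return (match.group(1).decode("ascii") if match else default), 0

def read_external_file(data):
    """Second scan: decode the entire file at once with that encoding."""
    encoding, skip = scan_encoding(data)
    return data[skip:].decode(encoding)
```

Only the first scan looks at raw bytes.  Everything after it works on
unicode text, so line splitting and sentinel parsing never see a
half-decoded character.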
