On Wed, May 23, 2012 at 09:09:27PM +0200, Paul wrote:
> On Wednesday, 23 May 2012 at 19:01:53 UTC, Graham Fawcett wrote:
[...]
> >So I think what you're trying to do is
> >
> >1. read a Latin-1 file, into unicode (internally in D)
> >2. do splitLines(), etc., generating some result
> >3. Convert the result back to latin-1, and output it.
> >
> >Is that right?
> >Graham
>
> Exactly.

The safest way is probably to read the file as binary data (i.e. byte[]), convert it to UTF-8, process it, and finally convert the result back to Latin-1 (in binary form) and output it.

D assumes Unicode internally: a char[] is expected to hold UTF-8. If you read a Latin-1 file as char[], any byte above 0x7F is not valid UTF-8 on its own, so string functions that decode the data may throw or corrupt it. Best use byte[] for reading/writing, and do conversions to/from UTF-8 internally for processing.


T

-- 
Doubt is a self-fulfilling prophecy.
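
For illustration, the read-as-bytes / convert / process / convert-back round trip described above could be sketched roughly as follows. This is only a minimal sketch, not a definitive implementation: it uses ubyte[] rather than byte[], relies on transcode and Latin1String from std.encoding, and the helper names fromLatin1, toLatin1 and processFile are made up for this example. The "processing" step is just splitLines() followed by a rejoin with "\n", which normalizes line endings; characters that do not exist in Latin-1 cannot survive the trip back.

```d
import std.array : join;
import std.encoding : Latin1String, transcode;
import std.file : read, write;
import std.string : splitLines;

/// Hypothetical helper: decode raw Latin-1 bytes into a UTF-8 string.
string fromLatin1(immutable(ubyte)[] raw)
{
    string utf8;
    // Reinterpret the bytes as Latin-1 code units, then transcode to UTF-8.
    transcode(cast(Latin1String) raw, utf8);
    return utf8;
}

/// Hypothetical helper: encode a UTF-8 string back into raw Latin-1 bytes.
immutable(ubyte)[] toLatin1(string text)
{
    Latin1String latin1;
    transcode(text, latin1);
    return cast(immutable(ubyte)[]) latin1;
}

/// Hypothetical helper: read a Latin-1 file, process it as UTF-8,
/// and write the result back out as Latin-1.
void processFile(string inPath, string outPath)
{
    // std.file.read returns untyped bytes; no UTF decoding happens here.
    auto raw = cast(immutable(ubyte)[]) read(inPath);

    string text = fromLatin1(raw);

    // Placeholder processing: split into lines and rejoin them.
    // Real per-line work would go between these two steps.
    string[] lines = text.splitLines();
    string result = lines.join("\n");

    // Write the Latin-1 bytes as binary data.
    write(outPath, toLatin1(result));
}

void main(string[] args)
{
    // Expects: <program> <input file> <output file>
    if (args.length == 3)
        processFile(args[1], args[2]);
}
```

The point of keeping the file I/O in ubyte[] is that nothing ever treats the Latin-1 bytes as UTF-8 by accident; only the string in the middle is real Unicode.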
