On Feb 23, 2007, at 7:36 AM, [EMAIL PROTECTED] wrote:

> When you read a file in, you tell it what the encoding is, either
> explicitly or by default (the default for a TextInputStream is UTF-8).
>

Ah.  This was the bit I was missing, although it is in the LR so  
apparently I need to read more closely.


>> Would the concatenation be changing the encoding?
>
> Yes, if you concatenate two strings of different encodings, then RB
> will convert one or both into some encoding that can represent the
> combined text.  (Of course it doesn't change the strings you're
> concatenating; I'm just talking about the combined result.)
>

OK.

>
> That's good.  From your description, I think there's nothing very
> complex going on here; the data you're reading in simply isn't UTF-8,
> as you're (probably by default) claiming it is.  Change your
> TextInputStream.Encoding to reflect whatever the text actually is
> (perhaps UTF-16?), and all will work fine.
>
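
To make the point above concrete, here is a rough sketch of what happens when the declared encoding doesn't match the bytes. It is written in Python rather than REALbasic, the sample string is made up, and Python's bytes.decode only stands in for setting TextInputStream.Encoding, so treat it as an illustration of the idea rather than of RB's actual behavior.

```python
# Illustration only: Python stands in for REALbasic here, and the sample
# text is hypothetical.

# Pretend this is the raw content of a file that was saved as UTF-16.
raw = "café".encode("utf-16")

# Claiming the bytes are UTF-8 fails outright, because a UTF-16 byte-order
# mark is not valid UTF-8.
try:
    raw.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False

# Declaring the encoding the data really has recovers the original text.
text = raw.decode("utf-16")

# A wrong claim does not always fail loudly. UTF-8 bytes read as Latin-1
# decode without error but produce garbage instead of the accented letter.
garbled = "café".encode("utf-8").decode("latin-1")
```

The last case is the nastier one in practice: nothing raises an error, the text just comes out wrong.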

So it would appear I'm now (always have been?) in the same boat as
90+% of the people who are having trouble with encodings:

There's no reliable way to determine the encoding of a file when it  
is read in (right?).
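
A best-effort guess is still possible, though. The sketch below (Python, not REALbasic) looks for a byte-order mark first, then checks whether the bytes happen to be valid UTF-8. The function name guess_encoding and the Latin-1 fallback are my own assumptions, not anything from RB or this thread, and the result is only a guess: a file with no byte-order mark can be valid in several encodings at once.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Best-effort guess at a text encoding; a heuristic, not a detector.

    Hypothetical helper for illustration only.
    """
    # A byte-order mark, when present, is the strongest hint available.
    # UTF-32 is checked before UTF-16 because the UTF-32 LE mark begins
    # with the same two bytes as the UTF-16 LE mark.
    boms = [
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name

    # No mark: text that is not UTF-8 rarely passes a strict UTF-8 decode,
    # so success here is decent (but not conclusive) evidence.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumed fallback. Latin-1 accepts any byte sequence, so this
        # never fails, but it may well be the wrong answer.
        return "latin-1"
```

For example, guess_encoding(open(path, "rb").read()) gives a name that can be passed to decode, where path is whatever file you are reading. If you know where the file came from, that knowledge beats any guess.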


Thanks Joe and Joe.  I have a much better handle on what's going on now.

-- 
David Glass - Gray Matter Computing
graymattercomputing.com - corepos.com
559-303-4915

Apple Certified Help Desk Specialist
