At the risk of hijacking my own thread, would the fact that the
debugger reports this string as UTF-8 when it apparently isn't (it has
twice as many bytes as it should) be a bug?

When I read the file in, the debugger reports it as UTF-8. Can I not
trust this information?

As I watch this in the debugger, as the lines come in from the text  
file they are reported as UTF-8, but when I drag them out of the  
listbox they are reported as UTF-16.

I know you can't tell me what *specifically* in my code would cause  
that, but what *generically* would cause that?  For instance, I build  
my DropObject from the following bits of information:

1 - the name selected in a popup. This text comes from a constant
defined in RB.
2 - the name of the file (from a popup).
3 - the text from the listbox.

These are concatenated with asterisks as separators so I can parse the
result on the drop. Would the concatenation be changing the encoding?

I don't use DefineEncoding anywhere, as I learned my lesson the last  
time I tried to figure out encodings.
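
As a sketch of one generic cause (an assumption, not a diagnosis of this code): the bytes a string holds and the encoding label attached to it can disagree. Relabeling bytes, which is what DefineEncoding does, changes only the label, while a real conversion rewrites the bytes. The Python below is illustration only, not REALbasic; the variable names are hypothetical and the choice of little-endian UTF-16 without a BOM is assumed.

```python
# Illustration only: Python stands in for the general idea; this is not REALbasic code.
original = "2 Kings"

# Assumption: the bytes actually stored are UTF-16 (little-endian, no BOM).
stored_bytes = original.encode("utf-16-le")

# Relabeling: treat the same bytes as if they were UTF-8. Nothing is converted,
# so every second byte (0x00) shows up as a stray NUL character.
mislabeled = stored_bytes.decode("utf-8")

# Converting: decode with the true encoding, then re-encode as UTF-8.
converted = stored_bytes.decode("utf-16-le").encode("utf-8")

print(repr(mislabeled))
print(len(stored_bytes), len(converted))
```

The relabeled string has 14 characters, half of them NULs, while the converted bytes are the 7 expected for UTF-8. A string whose label says UTF-8 but whose byte count is doubled would be consistent with the first case.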

On Feb 22, 2007, at 6:16 PM, Joe Huber wrote:

> At 5:57 PM -0800 2/22/07, David Glass wrote:
>> 2Kng??  (should be 2 Kings)
>>
>> Where the ?? could be anything, but most often a random(?)  
>> oriental character, an empty space, or other random odd character  
>> (like a delta).
>> To make things more weird, the incorrect characters change each  
>> time I 'step' through lines of code.
>>
>> If I drill to the specific variable, I'm told the encoding is  
>> UTF-8 which is what I want it to be.  In this particular example  
>> the length is reported as 7 and the lengthB as 14.
>>
>> This information does not change (the encoding is always UTF-8,  
>> and the lengths don't change), and the debugger view shows the  
>> text properly, but when I send the text to Drawstring  it comes  
>> out whacked.
>
> 2 Kings would only be 7 bytes in UTF-8, so you do have some sort of
> encoding problem. Since your string has twice as many bytes as
> characters, it may be in UTF-16 encoding, not UTF-8 as you're
> expecting.
>
> If you set the proper encoding on data you read in from an external  
> source, RB will generally handle text encodings for you automatically.
>
> So the two suggestions are to:
>
> 1. Double check that the encoding of the source file is what you  
> really think it is and make sure you're setting the strings to that  
> encoding as you read them in.
>
> 2. Be very suspicious of using DefineEncoding anywhere within your  
> program. Generally you shouldn't need it at all except when you're  
> initially reading in the strings.
>
> BTW The encoding of a string that you get out of an editfield might  
> NOT necessarily be the same encoding as the string you initially  
> wrote in. Don't assume the encoding. Just make sure it's right  
> initially and then let RB do its thing when handling text strings.
>
> Regards,
> Joe Huber
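
The byte counts in the quoted reply can be checked outside REALbasic. This is a minimal Python sketch, used only for the arithmetic; the variable names are hypothetical, and UTF-16 is encoded without a byte-order mark so the count is exactly two bytes per character.

```python
# Illustration only: checks the length / byte-length figures quoted above.
text = "2 Kings"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")  # little-endian, no BOM

char_count = len(text)        # number of characters
utf8_len = len(utf8_bytes)    # bytes if the data is really UTF-8
utf16_len = len(utf16_bytes)  # bytes if the data is really UTF-16

print(char_count, utf8_len, utf16_len)
```

For this all-ASCII string the counts are 7 characters, 7 bytes in UTF-8, and 14 bytes in UTF-16, which matches the reported length of 7 and lengthB of 14 only in the UTF-16 case.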

-- 
David Glass - Gray Matter Computing
graymattercomputing.com - corepos.com
559-303-4915

Apple Certified Help Desk Specialist
