At the risk of hijacking my own thread: if the debugger reports this string as UTF-8 when it apparently isn't (its byte length is twice what it should be), is that a bug?
When I read the file in, it is reported as UTF-8 (in the debugger). Can I not trust this information?

As I watch this in the debugger, the lines are reported as UTF-8 as they come in from the text file, but when I drag them out of the listbox they are reported as UTF-16. I know you can't tell me what *specifically* in my code would cause that, but what *generically* would cause that?

For instance, I build my DropObject from the following bits of information:

1 - the name selected in a popup. This text comes from a constant defined in RB
2 - the name of the file (from a popup).
3 - the text from the listbox

These are concatenated with asterisks so I can parse the result on the drop. Would the concatenation be changing the encoding?

I don't use DefineEncoding anywhere, as I learned my lesson the last time I tried to figure out encodings.

On Feb 22, 2007, at 6:16 PM, Joe Huber wrote:

> At 5:57 PM -0800 2/22/07, David Glass wrote:
>> 2Kng?? (should be 2 Kings)
>>
>> Where the ?? could be anything, but most often a random(?)
>> oriental character, an empty space, or other random odd character
>> (like a delta).
>> To make things more weird, the incorrect characters change each
>> time I 'step' through lines of code.
>>
>> If I drill to the specific variable, I'm told the encoding is
>> UTF-8 which is what I want it to be. In this particular example
>> the length is reported as 7 and the lengthB as 14.
>>
>> This information does not change (the encoding is always UTF-8,
>> and the lengths don't change), and the debugger view shows the
>> text properly, but when I send the text to Drawstring it comes
>> out whacked.
>
> 2 Kings would only be 7 bytes in UTF8, so you do have some sort of
> encoding problem. Since your string has twice as many bytes as
> characters it may be in UTF-16 encoding, not UTF-8 as you're
> expecting.
>
> If you set the proper encoding on data you read in from an external
> source, RB will generally handle text encodings for you automatically.
>
> So the two suggestions are to:
>
> 1. Double check that the encoding of the source file is what you
> really think it is and make sure you're setting the strings to that
> encoding as you read them in.
>
> 2. Be very suspicious of using DefineEncoding anywhere within your
> program. Generally you shouldn't need it at all except when you're
> initially reading in the strings.
>
> BTW The encoding of a string that you get out of an editfield might
> NOT necessarily be the same encoding as the string you initially
> wrote in. Don't assume the encoding. Just make sure it's right
> initially and then let RB do its thing when handling text strings.
>
> Regards,
> Joe Huber

--
David Glass - Gray Matter Computing
graymattercomputing.com - corepos.com
559-303-4915
Apple Certified Help Desk Specialist
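
For reference, the byte arithmetic behind the "7 characters, 14 bytes" observation can be checked outside RB. The following is only a rough sketch in Python (not REALbasic), and it assumes the string really holds UTF-16 bytes while carrying a UTF-8 label; the variable names are made up for illustration and nothing here is taken from the actual project.

```python
# Illustration only: Python, not REALbasic. All names are hypothetical.
text = "2 Kings"

utf8_bytes = text.encode("utf-8")       # one byte per character for plain ASCII
utf16_bytes = text.encode("utf-16-be")  # two bytes per character, no BOM

char_count = len(text)
utf8_len = len(utf8_bytes)
utf16_len = len(utf16_bytes)
print(char_count, utf8_len, utf16_len)

# Assumption: the bytes are UTF-16 but are interpreted as UTF-8.
# Every second byte is then a stray NUL, which a drawing routine may
# render as blanks or odd glyphs.
mislabeled = utf16_bytes.decode("utf-8", errors="replace")
print(repr(mislabeled))
```

If the byte count is exactly twice the character count for plain ASCII text like this, that points toward the data being UTF-16 regardless of what label is attached to it, which matches Joe's reading above.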
