Re: Unicode in variables

Devin Asay Mon, 19 Aug 2013 12:44:21 -0700

On Aug 19, 2013, at 1:29 PM, J. Landman Gay wrote:

> On 8/19/13 2:15 PM, Devin Asay wrote:
>> 
>> On Aug 19, 2013, at 1:03 PM, J. Landman Gay wrote:
>> 
>>> I need to read and process a tab-delimited text file that is in
>>> UTF8 format containing unicode. The final goal is to get it into an
>>> array with the first tabbed item as the keys, preserving all
>>> unicode. There are some HTML format tags in it as well.
>>> 
>>> If I read the file as binfile, carriage returns are all lost.
>> 
>> Jacque,
>> 
>> Where are the files coming from? Maybe they're using ASCII 13 as a
>> line terminator, or ASCII 13 + 10. Can't you replace whatever the
>> native line delimiter is with numToChar(10)?
> 
> I forgot about that. They're ASCII 13, and replacing them does keep the line 
> breaks. Thanks.
> 
> When I run uniEncode(tData,"UTF8") on it, the high-ASCII characters show up 
> in the variable watcher as "+" and an unprintable box. Can I assume the real 
> character is in there? Will it work for text chunking, etc.? When I split it 
> into an array, will the keys be intact?


I would do all of the chunking and splitting before you call uniEncode. Think of 
UTF8 as a reliable storage format, and only convert the text when you are ready 
to display it.
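
A rough sketch of that order of operations, assuming the uniEncode function 
already mentioned in this thread and the unicodeText field property. The 
handler names loadTable and showEntry, the parameter pFilePath, and the field 
"Display" are placeholders made up for illustration, not anything from your 
stack:

```livecode
-- Sketch only: loadTable, showEntry, pFilePath and field "Display" are hypothetical names.
function loadTable pFilePath
   local tData, tTable, tLine
   -- Read the raw bytes so the UTF8 sequences are left untouched.
   put url ("binfile:" & pFilePath) into tData
   -- Normalize CRLF and bare CR line endings to numToChar(10).
   replace numToChar(13) & numToChar(10) with numToChar(10) in tData
   replace numToChar(13) with numToChar(10) in tData
   -- Chunk while the data is still UTF8. Tab (9) and linefeed (10) never
   -- occur inside a multi-byte UTF8 sequence, so these delimiters are safe.
   set the itemDelimiter to tab
   repeat for each line tLine in tData
      if tLine is empty then next repeat
      -- First tabbed item becomes the key, the rest of the line the value.
      put item 2 to -1 of tLine into tTable[item 1 of tLine]
   end repeat
   return tTable
end loadTable

command showEntry pTable, pKey
   -- Convert from UTF8 only at display time.
   set the unicodeText of field "Display" to uniEncode(pTable[pKey], "UTF8")
end showEntry
```

The keys and values stay as UTF8 bytes the whole time, so lookups compare the 
same bytes that were in the file, and only showEntry ever calls uniEncode.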

Devin

Devin Asay
Learn to code with LiveCode University
http://university.livecode.com




_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
