Hi Mark,
thanks a lot for the sample. It works, as does the reverse
conversion from ANSI to UTF-8 by way of UTF-16.
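For reference, here is a minimal sketch of the reverse direction, modeled on
your sample. The byte string, the stem names wide. and utf8., and the '1252'
code page are my own assumptions for illustration (substitute the ANSI code
page that applies on your system). The input is a hex literal so that the
result does not depend on how the script file was saved.

```rexx
/* Sketch: ANSI to UTF8 via UTF16 (mirror image of the UTF8 to ANSI test) */

-- 'Tür' as Windows-1252 bytes (assumed code page); a hex literal keeps
-- the input independent of the encoding of the script file itself
ansiString = '54FC72'x

-- Step 1: ANSI bytes to UTF16, naming the code page the bytes are in
ret = SysToUnicode(ansiString, '1252', , wide.)
if ret \== 0 then do
  say 'Convert ANSI to UTF16 failed. rc:' ret
  exit ret
end

-- Step 2: UTF16 to UTF8
ret = SysFromUnicode(wide.!TEXT, 'UTF8', , , utf8.)
if ret \== 0 then do
  say 'Convert UTF16 to UTF8 failed. rc:' ret
  exit ret
end

-- Show the result as hex so the console code page cannot distort it
say 'UTF8 bytes (hex):' utf8.!TEXT~c2x
```

With code page 1252 as the source, I would expect the single byte FC for
"ü" to come out as the two UTF-8 bytes C3 BC.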
It would certainly be helpful to have an illustrative example of how to
convert to and from UTF-8 in the ooRexx manual. If you think your example
qualifies for the docs, I could submit a patch for the SysToUnicode section.
In addition, when converting from ANSI to UTF-8, I got different results
depending on whether I *saved* the Rexx script in ANSI or OEM format.
That leads me to ask whether it is possible to make the
interpreter UTF-8 aware.
Madou
2013/1/19 Mark Miesfeld <miesf...@gmail.com>
> On Thu, Jan 17, 2013 at 10:03 PM, Mukenx <muk...@gmail.com> wrote:
>
>> I have an oorexx script that receives text strings (json strings) encoded
>> in utf8 (peppered with german diacritics) and would like to convert the
>> strings into ansi format.
>>
>> I discovered the sysFromUnicode and sysToUnicode functions in the oorexx
>> 4.1.1 manual but could not get any meaningful results.
>>
>> here is what I tried:
>> 1) store the text "Tür" in a utf8.txt file in utf8 format
>> 2) and read it back in with rexx in a variable str.
>> fs = .Stream~new('utf8.txt')
>> str = fs~linein
>> fs~close
>> say str
>> say 'rc = 'sysFromUnicode(str, , , , 'outStem.')
>> loop ix over outStem.
>> say 'outstem.'ix' = <'outstem.ix'>'
>> end
>>
>> outputs:
>> Tür
>> rc = 0
>> outstem.!TEXT = <??>
>> outstem.!USEDDEFAULTCHAR = <1>
>>
>> Can someone help me out here?
>>
>
>
> Madou, I can't help much here because I'm not real knowledgeable in this
> area. But, I have a few comments.
>
> The interpreter is ANSI based, so you need the input to SysFromUnicode to
> be a series of bytes where the bytes are in UTF8 format. I would start
> off by not using linein(), but charin() where you give the complete file
> size as an argument and read in the complete file at one time. However,
> I'm not positive that will work because there may be some code page
> translation done.
>
> Second, you need to specify the codepage argument as UTF8, somewhere. I
> looked at the code for SysFromUnicode and SysToUnicode, and the
> documentation for the Windows API it uses. I think that the Windows API
> converts to and from UTF16 *only*. You specify the codepage to use in the
> translation.
>
> To convert UTF8 to ANSI, it looks to me like you would have to first
> convert UTF8 to UTF16 using SysToUnicode() and then take the output of that
> conversion and use SysFromUnicode to convert the UTF16 string to the ANSI
> codepage you are running in on your computer.
>
> The following simple example works for me:
>
> /* Simple UTF8 to ANSI test */
>
> -- Cent Pound Currency signs
> inString = 'c2a2c2a3c2a4'x
> say 'Using string:' inString
> say
>
> ret = SysToUnicode(inString, 'UTF8', , out.)
>
> if ret == 0 then say 'Convert UTF8 to UTF16 succeeded'
> else say 'Convert UTF8 to UTF16 failed. rc:' ret
>
> ret = SysFromUnicode(out.!TEXT, '437', , , ansi.)
> if ret == 0 then say 'Convert UTF16 to ANSI succeeded'
> else say 'Convert UTF16 to ANSI failed. rc:' ret
>
> say 'ANSI text:' ansi.!TEXT
> say "Used conversion character:" boolean2str(ansi.!USEDDEFAULTCHAR)
>
> say
> say 'Code page in console:'
> 'chcp'
>
> ::routine boolean2str
> use strict arg val
> if val then return 'true'
> else return 'false'
>
> Note in the above, for the SysFromUnicode() call, I used the active code
> page number in the console I am working in. Here is the display I get in
> my console. How this will look in this e-mail on your system, I have no
> idea:
>
> Using string: ¢£¤
> Convert UTF8 to UTF16 succeeded
> Convert UTF16 to ANSI succeeded
> ANSI text: ¢£☼
> Used conversion character: false
>
> Code page in console:
> Active code page: 437
>
> In my console, though, the 'Using string' line shows as gibberish, while
> the cent and pound signs display correctly. So, it looks to me like this
> works fine. The currency sign does not make sense to me, but I don't know
> what it should be.
>
> --
> Mark Miesfeld
>
>
>
> ------------------------------------------------------------------------------
> Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
> MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
> with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
> MVPs and experts. SALE $99.99 this month only -- learn more at:
> http://p.sf.net/sfu/learnmore_122912
> _______________________________________________
> Oorexx-users mailing list
> Oorexx-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/oorexx-users
>
>