AW: AW: AW: Why do I still need MacToISO, when working with UTF-8?

2017-01-17 Thread Tiemo Hollmann TB via use-livecode
Yep.
And no, I haven't tested textEncode(myFile,"CP1252").
Tiemo



___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode




Re: AW: AW: Why do I still need MacToISO, when working with UTF-8?

2017-01-17 Thread Kay C Lan via use-livecode
On Tue, Jan 17, 2017 at 1:24 AM, Mark Waddingham via use-livecode
 wrote:
>
> However, the 'endpoints' (i.e. where the developer can 'see' encoded text
> output - e.g. when writing to a file, or encoding for a URL) had to remain
> as before otherwise all existing applications using anything other than
> ASCII text would have broken when moving from 6.7 -> 7.0.
>
But isn't that the point of Tiemo's confusion? His scripts broke when
moving to 7.0! Prior to 7.0 he didn't have to do anything; it all
'just worked'. When he moved to 7.0, where 'unicode' was supposed to
'just work' on all platforms, he used textEncode/textDecode to/from
UTF8 and it isn't working for him (on Mac). Instead he's found that
macToISO (MacRoman to Latin 1) works for him, which seems to be a
step backwards.

There must be something more hidden in his scripts or PHP.

I wonder whether he'd get the same result if he replaced
macToISO(myFile) with textEncode(myFile,"CP1252"). If so, it may
suggest that everything is expecting Latin 1, not unicode.



Re: AW: AW: Why do I still need MacToISO, when working with UTF-8?

2017-01-16 Thread Sébastien Nouat via use-livecode

Hi Tiemo,

As an additional note: if you don't absolutely need to write binary, it
is also possible to use the syntax

    open file <file> for "utf8" text read

or

    open file <file> for "utf8" text write

in which case the engine takes care of encoding and decoding the string
as UTF-8.

Then, when calling

    write <string> to file <file>

<string> can be a LiveCode string with Unicode characters, and it will be
written as UTF-8, and

    read from file <file> for 1 lines

will set "it" to the decoded UTF-8 string.
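As a rough illustration of the file syntax described above, here is a minimal round-trip sketch. The function name roundTripUTF8 and its parameters are hypothetical, and the "utf8" encoding name is spelled as in this message; it assumes LiveCode 7.0 or later.

```livecode
-- Hypothetical helper: writes pText to pPath as UTF-8, then reads it back.
function roundTripUTF8 pPath, pText
   -- Opening with an encoding lets the engine encode the text on write.
   open file pPath for "utf8" text write
   write pText to file pPath
   close file pPath

   -- On read, the engine decodes the UTF-8 bytes back into a text string.
   open file pPath for "utf8" text read
   read from file pPath until EOF
   close file pPath
   return it
end function
```

If the engine behaves as described, the returned string should compare equal to pText even when it contains non-ASCII characters, with no call to macToISO or textEncode in between.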


Regards,

Sebastien



Re: AW: AW: Why do I still need MacToISO, when working with UTF-8?

2017-01-16 Thread Mark Waddingham via use-livecode

Hi Tiemo,


> Thank you for taking the time to clarify. I wasn't aware that the
> internal format on a Mac client is MacRoman. I thought it would be a
> "neutral" UTF-8 format.


Internally, the engine uses either MacRoman/ISO-Latin1 *or* UTF-16 
depending on platform and what the string contains.


However, the 'endpoints' (i.e. where the developer can 'see' encoded 
text output - e.g. when writing to a file, or encoding for a URL) had to 
remain as before otherwise all existing applications using anything 
other than ASCII text would have broken when moving from 6.7 -> 7.0.


You can use the 'utf8' keyword to open utf-8 encoded files; however, you 
have to deal with urlEncode manually (which isn't necessarily a bad 
thing, since your server script, NOT LiveCode, determines what the 
URL-encoded bytes after the '?' mean).


Warmest Regards,

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps



AW: AW: Why do I still need MacToISO, when working with UTF-8?

2017-01-16 Thread Tiemo Hollmann TB via use-livecode
Hi Mark,
Thank you for taking the time to clarify. I wasn't aware that the
internal format on a Mac client is MacRoman. I thought it would be a
"neutral" UTF-8 format.
Thanks
Tiemo


-Original Message-
From: use-livecode [mailto:use-livecode-boun...@lists.runrev.com] On Behalf
Of Mark Waddingham via use-livecode
Sent: Monday, 16 January 2017 17:42
To: How to use LiveCode 
Cc: Mark Waddingham 
Subject: Re: AW: Why do I still need MacToISO, when working with UTF-8?

Hi Tiemo,

Okay so, I'm assuming that all this code is running on the Mac client...

> *put fld "name" into myName*

At this point myName contains a (text) string - thus encoding issues don't
exist (you should think of text strings in memory as being stored in an
'encoding neutral' format).

> *open file myFile for binary write*
> *write myName to file myFile*
> *close file myFile*

This piece of code opens a file on disk and writes to it in the native
encoding of the platform - so MacRoman. The text is converted from the
internal encoding to MacRoman on writing. Thus your text file will be a
MacRoman-encoded text file.

> *open file myFile for binary read*
> *read from file myFile until EOF*
> *close file myFile*
> *put it into myName*

This piece of code will read from a file on disk and assume that it is in
the native encoding of the platform - so, in this case, MacRoman. It will
convert the content of the file from that to the internal encoding.

Up to this point - because you saved and loaded the file on the same
platform the content of myName should be as you expect -- unchanged.

> *if the platform is "MacOS" then put macToISO(theName) into theName*

When run on Mac this line will execute and do the following:

1) Convert theName to a binary string - this uses the native platform
encoding (MacRoman)
2) Map each byte from the MacRoman code index to the ISO Latin-1 code
index

This essentially converts theName from a text string to a binary string
encoded in Latin-1.

> *put URL ("http://myUser:myPW@myURL" & "mySQL.php?" &
> URLEncode(theName))
> into rslt*

This line constructs the URL - it is making the assumption that PHP (at the
other end) will interpret the bytes after the '?' as representing
Latin-1 encoded text.

> Without macToISO on a Mac client theName enters corrupted in the mySQL 
> db

This is most likely because PHP is defaulting to 8859-1 or Latin-1 as the
encoding used in URLEncoded fields in a URL. If you don't apply macToISO, then
you will be passing up MacRoman encoded text (URLencoded) to PHP, which can
happily be decoded as Latin-1 or 8859-1 (Latin-1 is a superset of 8859-1),
but with some chars (such as accented letters) in different places.
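To make the "different places" point concrete, here is a small sketch that reports the byte produced for "ü" under each encoding. The handler name showEncodedBytes is hypothetical; it assumes LiveCode 7.0 or later and that "MacRoman" and "ISO-8859-1" are accepted encoding names for textEncode. In the published code tables, "ü" sits at 0x9F (159) in MacRoman and at 0xFC (252) in ISO-8859-1, and it occupies two bytes in UTF-8.

```livecode
-- Hypothetical handler: shows how one character maps to different bytes.
command showEncodedBytes
   local tChar, tReport
   -- Build "ü" from its code point (U+00FC) so the script's own
   -- file encoding does not influence the result.
   put numToCodepoint(252) into tChar
   put "MacRoman byte:" && byteToNum(byte 1 of textEncode(tChar, "MacRoman")) & return into tReport
   put "ISO-8859-1 byte:" && byteToNum(byte 1 of textEncode(tChar, "ISO-8859-1")) & return after tReport
   put "UTF-8 byte count:" && the number of bytes in textEncode(tChar, "UTF-8") after tReport
   -- Display the report in the message box.
   put tReport
end showEncodedBytes
```

A receiver that assumes Latin-1 but is handed the MacRoman byte will decode it as whatever character lives at that position in its own table, which is how the accented letters end up corrupted.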

What you need to do here is explicitly UTF8 encode theName before passing it
to URLEncode, then explicitly decode it as UTF8 on the PHP side (or set a
property in PHP which changes the default assumption about URLs - I
apologise for not being more accurate here, my knowledge of PHP is a little
stale these days!).
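The client side of that suggestion might look like the following sketch. The function name buildQueryURL and its parameters are hypothetical; it assumes LiveCode 7.0 or later, where textEncode is available, and it only covers the LiveCode half: the server script still has to treat the received bytes as UTF-8.

```livecode
-- Hypothetical helper: appends pName to pBaseURL as a UTF-8, URL-encoded query.
function buildQueryURL pBaseURL, pName
   local tBytes
   -- Convert the text string to UTF-8 bytes explicitly, instead of
   -- relying on the platform's native encoding.
   put textEncode(pName, "UTF-8") into tBytes
   -- Percent-encode those bytes for use after the "?".
   return pBaseURL & "?" & urlEncode(tBytes)
end function
```

It could then be called as `put URL buildQueryURL(tScriptURL, theName) into rslt`, where tScriptURL is an assumed variable holding the address of the PHP script. Because the bytes are UTF-8 on every platform, the platform check and the macToISO call should no longer be needed.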

Warmest Regards,

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps
