Richmond

It's at http://www.rebol.org/view-script.r?script=str-enc-utils.r which I 
forgot to paste into my previous message.

Regards

Peter 

On 25 Jan 2014, at 17:14, Richmond wrote:

> On 25/01/14 01:22, Peter W A Wood wrote:
>> Richmond
>> 
>> It is almost impossible to determine the encoding of text from the contents 
>> of the text. You can take educated guesses, but even when considering just 
>> four different encodings, that is tricky.
>> 
>> You can get an idea of the complexity by taking a quick look at the 
>> encoding? function in this REBOL script. (You should be able to find the 
>> function as there is a big banner with encoding? at the top of it.) The 
>> script counts characters that are likely to be in one encoding but not in 
>> another. For instance, the presence of characters 129, 141, 144 and 157 
>> gives a hint that the text is MacRoman encoded.
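
A minimal sketch of the counting idea described above, written in Python rather than REBOL. It is not the encoding? function from the script. The names guess_encoding and count_hint_bytes are hypothetical, and the choice of four candidates (ASCII, UTF-8, MacRoman, Windows-1252) is an assumption made for illustration. The only detail taken from the message is the set of hint bytes 129, 141, 144 and 157.

```python
# Byte values named in the message as MacRoman hints. They are unassigned
# in Windows-1252, which is why seeing them is informative.
MACROMAN_HINT_BYTES = frozenset({129, 141, 144, 157})


def count_hint_bytes(data: bytes) -> int:
    """Count bytes that hint at MacRoman rather than Windows-1252."""
    return sum(1 for b in data if b in MACROMAN_HINT_BYTES)


def guess_encoding(data: bytes) -> str:
    """Return an educated guess only (hypothetical helper, assumed candidates)."""
    # Pure 7-bit data is valid in all four candidates; report it as ASCII.
    if all(b < 128 for b in data):
        return "ascii"
    # UTF-8 has strict byte patterns, so a clean decode is a strong signal.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # Any hint byte tips the guess towards MacRoman.
    if count_hint_bytes(data) > 0:
        return "mac_roman"
    # Fallback guess when nothing points elsewhere.
    return "cp1252"
```

Note that MacRoman text containing none of the four hint bytes falls through to the Windows-1252 guess, which illustrates why this can only ever be a hint and not a determination.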
> 
> Um? Where is the REBOL script?
> 
> Richmond.
> 
>> 
>> Regards
>> 
>> Peter
>> 
>> 
>> On 25 Jan 2014, at 02:54, Richmond wrote:
>> 
>>> On 22/01/14 20:41, Graham Samuel wrote:
>>>> Richmond, thanks for inching my problem towards a solution. I downloaded 
>>>> your test.
>>>> Clever, in fact too clever for me.
>>> Possibly, but NOT clever enough . . .
>>> 
>>> I would like an easy way to know what character encoding is being used in a 
>>> textField:
>>> 
>>> NOT just whether it is Unicode or Not:
>>> 
>>> There are all sorts of variables such as
>>> 
>>> fontLanguage  [I have never quite worked out how that jibes with Unicode],
>>> 
>>> MacCyrillic,
>>> 
>>> and so on, ad nauseam.
>>> 
>>> ------
>>> 
>>> For the sake of argument, and at the risk of repeating myself:
>>> 
>>> I managed to resurrect a 120 page 'thing' of my wife's, written in mixed 
>>> English and Bulgarian on
>>> Mac OS 9 when Mac OS 9 was all the rage.
>>> 
>>> In the end . . . after a lot of blood, sweat, tears and incredibly coarse 
>>> remarks, I managed to turn it into
>>> a PDF with an embedded text layer . . . allowing, at least, the English to 
>>> be directly transferred into an ODT
>>> document.
>>> 
>>> However my wife will still have the "joy" of having to retype all the 
>>> Bulgarian and all the other bits of text
>>> in various other languages, because they were initially typed on Mac OS 9 
>>> in the "funny ways" Mac did
>>> things then which are not the same as the "funny ways" (a.k.a. Unicode) we 
>>> do things now.
>>> 
>>> Had I had a stack that allowed me to import the document, or copy-paste the 
>>> text, and that could then tell
>>> me the encodings of the various bits (chunks) so I could have run them 
>>> through some merry little algorithms,
>>> life would have been considerably more refreshing.
>>> 
>>> ------------
>>> 
>>> Now, I know the argument about Livecode not being a jollified 
>>> word-processor that was trotted out when I made a few
>>> comments about Supercard having ways of doing paragraphing and so on.
>>> 
>>> And, Livecode may NOT be a jollified word-processor; but if it is meant to 
>>> be a computer programming language
>>> rather than a simplified subset of one, it should have the wherewithal for 
>>> programmers to build a word-processor
>>> without recourse to outside resources. That means (quite apart from 
>>> paragraph breaks, which can be easily arranged in Livecode)
>>> the ability to recognise and tell the programmer all sorts of text-encoding 
>>> standards.
>>> 
>>> -----------
>>> 
>>> Now Graham's "Clever" is jolly gratifying, but, frankly, comparing 2 
>>> textFields is not very clever,
>>> and, while that can differentiate between ASCII text and Unicode text, that 
>>> is as far as it goes.
>>> 
>>> ---------
>>> 
>>> My latest riff is to have a command of the sort:
>>> 
>>> put textEncoding
>>> 
>>> and something of the sort 'plainText', 'RTFtext', 'htmlText', 'unicodeText' 
>>> will be output as a result.
>>> 
>>> And then, for those who really go a bundle on this kind of thing, we might 
>>> extend that to 'UTF8', 'UTF16', 'UTF32' and so forth.
>>> 
>>> Richmond.
>>> 
>>> _______________________________________________
>>> use-livecode mailing list
>>> use-livecode@lists.runrev.com
>>> Please visit this url to subscribe, unsubscribe and manage your 
>>> subscription preferences:
>>> http://lists.runrev.com/mailman/listinfo/use-livecode