Search, Replace and Unicode

2012-12-29 Thread Richmond

'Tis very easy to do this:

on mouseUp
   replace ZaX with XaZ in fld TEKST
end mouseUp

and ZaXbbdsfZvfghXaasn

will magically become:

XaZbbdsfZvfghXaasn.

So, clutching at straws, I tried this:

on mouseUp
   set the useUnicode to true
   replace (numToChar(2367)) with (numToChar(105)) in fld "TEKST"
end mouseUp

and, kaboom-diddy-boom-diddy-boom . . .

it replaced all the instances of Unicode char 2367 with an 'i' (whacko),

BUT . . .

it also did something awful with the rest of the text in the fld; as 
far as I can see

it 'deUnicoded' it.

tried the same sort of thing like this:

on mouseUp
   set the useUnicode to true
   replace (numToChar(2367)) with (numToChar(2311)) in fld "TEKST"
end mouseUp

and got a right whoreson's.

So . . .

the next 'trick' is how to preserve the unicodeText as unicodeText.

Cripes!

Richmond.



___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode


Re: Search, Replace and Unicode

2012-12-29 Thread Richmond

On 12/29/2012 04:35 PM, Richmond wrote:

snip

Richmond has a short memory; searching on the internet he turned up a 
message HE posted about 2 years ago,

which contained this:

on mouseUp
   set the useUnicode to true
   if the unicodeText of fld "FIRST" contains (numToChar(57888) & numToChar(57999)) then
      get the unicodeText of fld "FIRST"
      replace (numToChar(57888) & numToChar(57999)) with (numToChar(57999) & numToChar(57888)) in it
      set the unicodeText of fld "FIRST" to it
   end if
end mouseUp

which works completely.

So Richmond is a bit of a 'twat', but a happy one at least.

Richmond.



Re: Search, Replace and Unicode

2012-12-29 Thread Richmond
Now, of course, the real fun starts when one wants to play around with
wild-cards so that one can, say, swap Z and X around in a unicodeText
field that contains stuff like this:

ZaXddZfXabcdeZoX

As I am unclear how to do that with a non-unicodeText field, the next
step seems a bit problematic, and, quite frankly, churning through
'ZaX', 'ZbX', 'ZcX', etc. (and I'm working with an abugida that features
about 4000 glyphs . . . joy) seems tedious in the extreme.

I thought about churning through a list of unicode addresses like this:

on mouseUp
   put 2200 into CLICKER
   -- NB: the end value on the next line looks truncated; as it stands the loop never stops
   repeat until CLICKER = 5
      set the useUnicode to true
      if the unicodeText of fld "FIRST" contains (numToChar(57888) & numToChar(CLICKER) & numToChar(57999)) then
         get the unicodeText of fld "FIRST"
         replace (numToChar(57888) & numToChar(CLICKER) & numToChar(57999)) with (numToChar(105) & numToChar(CLICKER) & numToChar(105)) in it
         set the unicodeText of fld "FIRST" to it
      end if
      add 1 to CLICKER
   end repeat
end mouseUp

and, theoretically, it works.

the only thing that slightly fusses me about that is what happens if a
unicode address is empty and/or I land up against a control character?
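
A rough sketch of a leaner variant of the loop above: read the unicodeText once into a variable, do every replace there, and write the result back once, so the field is only touched twice. The handler name swapMarkers, its two parameters and the variable names are made up for illustration; the field name and code points are the ones used above. Setting the caseSensitive is a precaution of mine, on the assumption that replace otherwise ignores case when comparing the underlying bytes. It is untested against a real font, and written for the useUnicode/unicodeText way of working used in this thread:

```livecode
-- hypothetical helper: tries every code point from pFirst to pLast
-- as the middle character between the two marker glyphs
on swapMarkers pFirst, pLast
   local tText, tCode, tOld, tNew
   set the useUnicode to true     -- numToChar() now returns 2-byte characters
   set the caseSensitive to true  -- precaution: compare exactly, no case folding
   put the unicodeText of fld "FIRST" into tText
   repeat with tCode = pFirst to pLast
      put numToChar(57888) & numToChar(tCode) & numToChar(57999) into tOld
      put numToChar(105) & numToChar(tCode) & numToChar(105) into tNew
      replace tOld with tNew in tText
   end repeat
   set the unicodeText of fld "FIRST" to tText
end swapMarkers
```

It could be called as, say, swapMarkers 2325, 2431 from a mouseUp handler. A code point that never occurs between the two markers simply produces no match, so empty addresses should cost time rather than cause damage.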



Re: Search, Replace and Unicode

2012-12-29 Thread Richmond

On 12/29/2012 05:48 PM, Richmond wrote:


This works: it doesn't seem to fuss the thing whether there is a glyph
present or not, and there are no problems with any reserved chars that
may be lying around somewhere within that range.

I ran it through unicode addresses from 2325 to 62738, which took about
90 seconds, which is rather too long, so I will cut that down to run
through several ranges of unicode addresses rather than the whole
shebang.




Re: Search, Replace and Unicode

2012-12-29 Thread Phil Davis

Hi Richmond,

Just curious - does setting the lockMessages & lockScreen to true speed
it up any? Or hiding the field? (Sorry if these have already been
answered; I'm not following the thread too closely)



On 12/29/12 8:18 AM, Richmond wrote:
> ran it through unicode addresses from 2325 to 62738, which took about
> 90 seconds, which is rather too long, so
> will cut that down to run through several ranges of unicode addresses
> rather than the whole shebang.


--
Phil Davis




Re: Search, Replace and Unicode

2012-12-29 Thread Richmond

On 12/29/2012 09:38 PM, Phil Davis wrote:

> Hi Richmond,
>
> Just curious - does setting the lockMessages & lockScreen to true
> speed it up any? Or hiding the field? (Sorry if these have already
> been answered; I'm not following the thread too closely)


I really don't know, as I am so thick those possibilities had not
occurred to me.

However, I cracked open my monster Sanskrit font, looked at it in
FontForge, and realised that I was being fairly bl**dy silly crunching
through thousands of irrelevant unicode addresses, so I chopped things
up into 3 REPEAT UNTIL loops for the unicode ranges that were relevant
to my work:

2325 - 2431
57354 - 58498
61952 - 62738

giving a saving of some 58371 iterations and boiling the whole thing
down to about 5 seconds . . . that is on the same sample text I
previously used.
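
The three ranges above could also drive a single pair of nested loops rather than three separate REPEAT UNTIL loops. A minimal sketch, assuming the field FIRST and the marker code points from earlier in the thread; the variable names are invented here, and note that 'repeat with' includes both end points of each range:

```livecode
on mouseUp
   local tRanges, tRange, tCode, tText, tOld, tNew
   set the useUnicode to true
   -- one "first,last" pair per line: the ranges listed above
   put "2325,2431" & return & "57354,58498" & return & "61952,62738" into tRanges
   put the unicodeText of fld "FIRST" into tText
   repeat for each line tRange in tRanges
      repeat with tCode = item 1 of tRange to item 2 of tRange
         put numToChar(57888) & numToChar(tCode) & numToChar(57999) into tOld
         put numToChar(105) & numToChar(tCode) & numToChar(105) into tNew
         replace tOld with tNew in tText
      end repeat
   end repeat
   set the unicodeText of fld "FIRST" to tText
end mouseUp
```

Adding or dropping a range then only means editing the list, not writing another loop.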

HOWEVER . . . now I will play around with setting lockMessages and
lockScreen as per your suggestions to see if there is any appreciable
gain.

Phil; you state that you are not following the thread too closely; is
that because it is not something that interests you, or do you have a
stake in Unicode text manipulation but have been busy elsewhere?

I sometimes get the feeling (err . . . paranoid) that I am the only
person using RR Livecode who is doing 'serious sh*t' with unicodeText;
but, hey, why should I worry? In 1985, at the University of Durham, I
was the only student trying to process text with PASCAL 5 while all the
Physics students were crunching numbers.











Re: Search, Replace and Unicode

2012-12-29 Thread Richmond

On 12/29/2012 09:38 PM, Phil Davis wrote:

> Hi Richmond,
>
> Just curious - does setting the lockMessages & lockScreen to true
> speed it up any? Or hiding the field? (Sorry if these have already
> been answered; I'm not following the thread too closely)


I ran a unicode replace script that took 8 seconds.

With

set the lockMessages to true

in line 1 of the script and

set the lockMessages to false

in the last line, the script took 7 seconds.

---

Using lockScreen, the script took 2 seconds (Wow!)

---

Using lockScreen & lockMessages, the script still took 2 seconds.

---

Obviously lockScreen is a good thing.

Thank you very much for your suggestion!

Richmond.
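
For anyone wanting to repeat the comparison, a simple stopwatch around the replace handler might look like this. It is only a sketch: doTheReplacing is a placeholder name for whichever handler is being timed, and the elapsed time goes to the message box.

```livecode
on mouseUp
   local tStart
   put the milliseconds into tStart
   lock screen      -- sets the lockScreen to true
   lock messages    -- sets the lockMessages to true
   doTheReplacing
   unlock messages
   unlock screen
   put ((the milliseconds - tStart) / 1000) && "seconds"
end mouseUp

-- placeholder: put the replace loop to be timed in here
on doTheReplacing
end doTheReplacing
```

Commenting out one lock/unlock pair at a time gives the separate figures for lockScreen and lockMessages.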



Re: Search, Replace and Unicode

2012-12-29 Thread Phil Davis


On 12/29/12 11:52 AM, Richmond wrote:

> On 12/29/2012 09:38 PM, Phil Davis wrote:
>
>> Hi Richmond,
>>
>> Just curious - does setting the lockMessages & lockScreen to true
>> speed it up any? Or hiding the field? (Sorry if these have already
>> been answered; I'm not following the thread too closely)
>
> I really don't know as I am so thick those possibilities had not
> occurred to me.
>
> However, I cracked open my monster Sanskrit font and looked at it in
> FontForge and realised that I was being fairly bl**dy silly crunching
> through thousands of irrelevant unicode addresses, so chopped things
> up into 3 REPEAT UNTIL loops for the unicode ranges that were relevant
> to my work:
>
> 2325 - 2431
> 57354 - 58498
> 61952 - 62738
>
> giving a saving of some 58371 iterations and boiling the whole
> thing down to about 5 seconds . . . that is on the same sample text I
> previously used.


Wow! Nice gain.



> HOWEVER . . . now I will play around with setting lockMessages and
> lockScreen as per your suggestions to see if there is any appreciable
> gain.
>
> Phil; you state that you are not following the thread too closely;
> is that because it is not something that interests you, or do you have
> a stake in Unicode text manipulation but have been busy elsewhere?


I am being dragged toward Unicode, kicking and screaming. It looms large 
in my future, as one client has asked me to add support for Arabic in 
his training system soon. But I don't have much experience with it. I'm 
hoping the improvements in LC 5.5 and beyond will make for a little 
smoother landing in UnicodeLand.




> I sometimes get the feeling (err . . . paranoid) that I am the only
> person using RR Livecode who is doing 'serious sh*t' with unicodeText;
>
> but, hey, why should I worry?; in 1985, at the University of Durham, I
> was the only student trying to process text with PASCAL 5 while all the
> Physics students were crunching numbers.







--
Phil Davis

