Re: What is LC's internal text format?

2018-11-12 Thread Geoff Canyon via use-livecode
So then why does put textEncode("a","UTF-32") into X;put chartonum(byte 1 of X) put 97? That implies that "byte" 1 is "a", not 111. Likewise, put textEncode("㍁","UTF-32") into X;put chartonum(byte 1 of X) puts 65. I've looked in the dictionary and I don't see anything that comes close to
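The byte values reported above are what little-endian UTF-32 without a byte order mark would produce. A minimal Python sketch (not LiveCode; the helper name utf32_first_byte is hypothetical, and it assumes textEncode emits little-endian output with no BOM) shows the first byte next to the low byte of the code point:

```python
def utf32_first_byte(text: str) -> int:
    """Return the first byte of text encoded as little-endian UTF-32, no BOM."""
    # "utf-32-le" fixes the byte order and omits the byte order mark;
    # plain "utf-32" would prepend a BOM and change what byte 1 is.
    return text.encode("utf-32-le")[0]


for ch in ("a", "㍁"):
    # Print the code point, the first encoded byte, and the code point's low byte.
    print(f"U+{ord(ch):04X}", utf32_first_byte(ch), ord(ch) & 0xFF)
```

In little-endian order, byte 1 is the low byte of the code point, which is why an ASCII letter appears unchanged in that position.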

Re: How to find offsets in Unicode Text fast

2018-11-12 Thread Mark Waddingham via use-livecode
On 2018-11-13 06:35, Geoff Canyon via use-livecode wrote: I didn't realize until now that offset() simply fails with some unicode strings: put offset("a","↘܎qeiuruioqeaaa↘܎qeiuar",13) -- puts 0 On Mon, Nov 12, 2018 at 9:17 PM Geoff Canyon wrote: A few things: 1. It seems codepointOffset

Re: What is LC's internal text format?

2018-11-12 Thread Mark Waddingham via use-livecode
On 2018-11-13 07:15, Geoff Canyon via use-livecode wrote: On Mon, Nov 12, 2018 at 3:50 PM Monte Goulding via use-livecode < use-livecode@lists.runrev.com> wrote: Unless I'm misunderstanding, this hasn't been my observation. Using offset on a string that has been textEncode()ed to UTF-32

Re: What is LC's internal text format?

2018-11-12 Thread Geoff Canyon via use-livecode
On Mon, Nov 12, 2018 at 3:50 PM Monte Goulding via use-livecode < use-livecode@lists.runrev.com> wrote: > Text strings in LiveCode are native encoded (MacRoman or ISO 8859) where > possible and where you don’t explicitly tell the engine > For what it’s worth using `offset` is the wrong thing to

Re: How to find offsets in Unicode Text fast

2018-11-12 Thread Geoff Canyon via use-livecode
I didn't realize until now that offset() simply fails with some unicode strings: put offset("a","↘܎qeiuruioqeaaa↘܎qeiuar",13) -- puts 0 On Mon, Nov 12, 2018 at 9:17 PM Geoff Canyon wrote: > A few things: > > 1. It seems codepointOffset can only find a single character? So it > won't work for

Re: How to find offsets in Unicode Text fast

2018-11-12 Thread Geoff Canyon via use-livecode
A few things: 1. It seems codepointOffset can only find a single character? So it won't work for any search for a multi-character string? 2: codepointOffset seems to work differently for multi-byte characters and regular characters: put codepointoffset("e","↘ndatestest",6) -- puts 3 put

Re: How to find offsets in Unicode Text fast

2018-11-12 Thread Monte Goulding via use-livecode
Hi Folks I was a bit perplexed by this so I had a quick look about the engine and I see the issue. The problem is you are using `offset` which works on characters. Characters in LiveCode are neither Unicode codepoints nor bytes. They are graphemes. This means that when you have chars to skip
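To make the distinction between bytes, codepoints and graphemes concrete, here is a rough Python sketch (not LiveCode). The function rough_grapheme_count is a hypothetical helper that only skips combining marks; real grapheme segmentation follows more rules than this, so treat it as an approximation.

```python
import unicodedata


def rough_grapheme_count(text: str) -> int:
    """Approximate grapheme count: code points that are not combining marks."""
    # Simplification: full grapheme segmentation (Unicode UAX #29) has more
    # rules, e.g. for emoji sequences and Hangul; this handles combining marks only.
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


s = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT
print(len(s.encode("utf-8")))   # number of UTF-8 bytes
print(len(s))                   # number of code points
print(rough_grapheme_count(s))  # approximate number of graphemes
```

The three counts differ for the same string, which is why an offset measured in one unit cannot be used directly as a skip count in another.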

Re: How to find offsets in Unicode Text fast

2018-11-12 Thread Geoff Canyon via use-livecode
On Mon, Nov 12, 2018 at 11:36 AM Ben Rubinstein via use-livecode < use-livecode@lists.runrev.com> wrote: > I'm really confused that case-insensitive should work at all for UTF-16 or > UTF-32; This is so puzzling. I tried this code in a button: on mouseUp put "Ѡ" into x put "ѡ" into y

Re: What is LC's internal text format?

2018-11-12 Thread Monte Goulding via use-livecode
Text strings in LiveCode are native encoded (MacRoman or ISO 8859) where possible and where you don’t explicitly tell the engine it’s unicode (via textDecode) so that they can follow faster single byte code paths. If you use textDecode then the engine will first check if the text can be native
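The idea of keeping text in a single-byte native encoding "where possible" can be illustrated with a small Python sketch. This is not the engine's actual check; fits_native is a hypothetical helper, and "latin-1" and "mac_roman" are Python codec names standing in for ISO 8859 and MacRoman.

```python
def fits_native(text: str, encoding: str = "latin-1") -> bool:
    """Return True if every character of text exists in the single-byte encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        # At least one character has no representation in this encoding,
        # so the text would have to be kept as Unicode instead.
        return False


print(fits_native("offset"))             # plain ASCII text
print(fits_native("café", "mac_roman"))  # accented Latin letter
print(fits_native("↘"))                  # arrow symbol outside both encodings
```

Text that passes such a check can take a one-byte-per-character path; anything else needs a wider representation.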

Re: How to find offsets in Unicode Text fast

2018-11-12 Thread Geoff Canyon via use-livecode
On Mon, Nov 12, 2018 at 11:36 AM Ben Rubinstein via use-livecode < use-livecode@lists.runrev.com> wrote: > I'm really confused that case-insensitive should work at all for UTF-16 or UTF-32; at this point as far as I understand it, LC has no idea that how to correctly interpret the value

What is LC's internal text format?

2018-11-12 Thread Ben Rubinstein via use-livecode
This is something that I've been wondering about for a while. My unexamined assumption had been that in the 'new' fully unicode LC, text was held in UTF-8. However when I saved some text strings in binary I got something like UTF-8 - but not quite. And the recent experiments with offset

Re: .PID file in C:\Users\*\AppData\Local\._LiveCode_\

2018-11-12 Thread Monte Goulding via use-livecode
This is not something I’ve looked at before but it appears the files should be being cleaned up at the end of the session. I guess if LiveCode writing this file is a particular issue for you then you could put in a feature request to toggle relaunch support in the standalone builder. > On 12

Re: How to find offsets in Unicode Text fast

2018-11-12 Thread Niggemann, Bernd via use-livecode
>Coming late to this discussion. Very excited by this approach of converting >everything to UTF-32 in order to do fast offsets. >In the meantime I'd be suspicious about doing a case-insensi

Re: How to find offsets in Unicode Text fast

2018-11-12 Thread Ben Rubinstein via use-livecode
Coming late to this discussion. Very excited by this approach of converting everything to UTF-32 in order to do fast offsets. I'm really confused that case-insensitive should work at all for UTF-16 or UTF-32; at this point as far as I understand it, LC has no idea that how to correctly

Re: How to find offsets in Unicode Text fast

2018-11-12 Thread Brian Milby via use-livecode
I noticed something similar, but did not have a chance to dig into it. If I copied the complex character that Geoff inserted (is that Kanji?) into the string to search, I also got no results for UTF32. But, if I also copied it into the string to find field, then the results worked partially.

Re: How to find offsets in Unicode Text fast

2018-11-12 Thread Niggemann, Bernd via use-livecode
Thank you Brian for putting the test stack up. It makes it easier to test various non-ASCII texts. As your testing shows the UTF16 variant can be misleading. Unfortunately I also found a case of UTF32 not working. I copied from Icelandic Wikipedia from the entry about the capital Reykjavik

[ANN] This Week in LiveCode 154

2018-11-12 Thread panagiotis merakos via use-livecode
Hi all, Read about new developments in LiveCode open source and the open source community in today's edition of the "This Week in LiveCode" newsletter! Read issue #154 here: https://goo.gl/6Cgsur This is a weekly newsletter about LiveCode, focussing on what's been going on in and around

Re: .PID file in C:\Users\*\AppData\Local\._LiveCode_\

2018-11-12 Thread Andre Alves Garzia via use-livecode
Malte, Found it in the source: https://github.com/livecode/livecode/blob/d780d79e800afd65897631f840296075ff6573e9/engine/src/w32relaunch.cpp#L310 As I suspected, it is related to the running process. We still need to hear from the mothership about it but files in AppData/Local should be

Re: Re: .PID file in C:\Users\*\AppData\Local\._LiveCode_\

2018-11-12 Thread Malte Pfaff-Brill via use-livecode
Thanks Andre! I guess I will have to wait for someone from the mothership to join in then. :-/ I am currently trying to get the OSS project I am stewarding approved by a governmental agency and those files might be a showstopper, as I am not supposed to leave traces in the system, besides

Re: .PID file in C:\Users\*\AppData\Local\._LiveCode_\

2018-11-12 Thread Andre Alves Garzia via use-livecode
Malte, I have no idea, but I am running the IDE here and I have two of those files in that folder. Since they are named PID, I suspect that they somehow hold information about the running process ID or something similar. om om andre On 11/12/2018 9:23 AM, Malte Pfaff-Brill via use-livecode

.PID file in C:\Users\*\AppData\Local\._LiveCode_\

2018-11-12 Thread Malte Pfaff-Brill via use-livecode
Hi, Does anybody know what causes files to be created in C:\Users\*\AppData\Local\._LiveCode_\ for a standalone under Windows 7? Does this have something to do with the creation of UUIDs? Can the creation of those files be avoided? Cheers! Malte

Re: Regex replacements (\1, \2) not matching

2018-11-12 Thread panagiotis merakos via use-livecode
Hello all, There is an enhancement request about it: https://quality.livecode.com/show_bug.cgi?id=21534 Best regards, Panos -- On Mon, Nov 12, 2018 at 10:21 AM Kaveh Bazargan via use-livecode < use-livecode@lists.runrev.com> wrote: > Thanks James > > I am using the community edition of

Re: Regex replacements (\1, \2) not matching

2018-11-12 Thread Kaveh Bazargan via use-livecode
Thanks James I am using the community edition of LiveCode, but I remember that Thierry has been v helpful in the past. I just found a similar question I asked 4 years ago and he put lots of time in explaining: https://forums.livecode.com/viewtopic.php?t=21157 I will review that because I have a