Re: What is LC's internal text format?

2018-11-28 Thread Lagi Pittas via use-livecode
Hi Bob, So my rant didn't go to the bit bucket. To answer your question: NO, that wouldn't happen - adding "Side" languages to use with the IDE introduces people to a saner way of doing things. We could use Python and/or JavaScript for the great libraries and make them callable with an LCB

Re: What is LC's internal text format?

2018-11-26 Thread Bob Sneidar via use-livecode
I would be concerned that if a large number of Java coders (far more than the LC coders) were to come on board, we would end up with a Java development environment, as the Java people would dominate the demand and direction of LC. Bob S > On Nov 21, 2018, at 09:00 , Lagi Pittas via

Re: What is LC's internal text format?

2018-11-21 Thread Lagi Pittas via use-livecode
Hi Mark, I can't see any reason why not - except for time (and money). The fact that the language has been "forked" (LiveCode Builder) means there is a precedent for changes to the way the language works. I cannot see why LCB could not be one of the "open language" variants that uses the

Re: What is LC's internal text format?

2018-11-20 Thread Bob Sneidar via use-livecode
> On Nov 20, 2018, at 10:24 , Ben Rubinstein via use-livecode > wrote: > > This isn't about strongly typed variables though, but about when (correct) > conversion is possible. > > LC throws an error if you implicitly ask it to convert the wrong kind of > string to a number - for example,

Re: What is LC's internal text format?

2018-11-20 Thread Geoff Canyon via use-livecode
I'll chip in and point out that the implicit conversion caused significant hiccups in figuring out the offsets issues -- several people (including me) were fooled by the fact that conversion to UTF-32 results in binary data, but can be transparently treated as text. Or maybe I'm

Re: What is LC's internal text format?

2018-11-20 Thread Ben Rubinstein via use-livecode
This isn't about strongly typed variables though, but about when (correct) conversion is possible. LC throws an error if you implicitly ask it to convert the wrong kind of string to a number - for example, add 45 to "horse". (Obviously multiplication is fine: the answer would be "45 horses".)

Re: What is LC's internal text format?

2018-11-20 Thread Mark Wieder via use-livecode
On 11/20/18 8:33 AM, Ben Rubinstein via use-livecode wrote: Would it not be better to refuse to make an assumption, i.e. require an explicit conversion? While I'd love to have the option of strongly typed variables at the scripting level, I know better than to expect that this will ever

Re: What is LC's internal text format?

2018-11-20 Thread Bob Sneidar via use-livecode
I'm not grasping the import of the question here, but it seems to me that the question is about what happens "under the hood", in relation to the format of the data as it is exposed to any I/O. It seems to me that in this context it's academic. If there is a problem with what's going on "under

Re: What is LC's internal text format?

2018-11-20 Thread Ben Rubinstein via use-livecode
Hi Monte, Thanks for this, sorry for delayed reply - I've been away. >> Does textEncode _always_ return a binary string? Or, if invoked with "CP1252", "ISO-8859-1", "MacRoman" or "Native", does it return a string? > > Internally we have different types of values. So we have MCStringRef which

Re: What is LC's internal text format?

2018-11-13 Thread Monte Goulding via use-livecode
> On 14 Nov 2018, at 11:39 am, Monte Goulding via use-livecode > wrote: > >> In 7+ you generally want to use codepoint where previously you >> used char, unless you know you are dealing with a binary string, in which case you >> use byte. > > Sorry! I have written codepoints here when I

Re: What is LC's internal text format?

2018-11-13 Thread Monte Goulding via use-livecode
> On 14 Nov 2018, at 10:44 am, Monte Goulding via use-livecode > wrote: > > In 7+ you generally want to use codepoint where previously you used > char, unless you know you are dealing with a binary string, in which case you use > byte. Sorry! I have written codepoints here when I was

Re: What is LC's internal text format?

2018-11-13 Thread Monte Goulding via use-livecode
> On 14 Nov 2018, at 6:33 am, Ben Rubinstein via use-livecode > wrote: > > That's really helpful - and in parts eye-opening - thanks Mark. > > I have a few follow-up questions. > > Does textEncode _always_ return a binary string? Or, if invoked with > "CP1252", "ISO-8859-1", "MacRoman" or

Re: What is LC's internal text format?

2018-11-13 Thread Geoff Canyon via use-livecode
I never left, I just went silent. But since I'm "back", I'm curious to know what the engine-types think of Bernd's solution for fixing the UTF-32 offsets code. It seems that when converting both the stringToFind and stringToSearch to UTF-32 and then searching the binary with byteOffset, you won't
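
One known pitfall with searching UTF-32 data byte by byte is that a match can start in the middle of a 4-byte unit. The following is a minimal sketch in Python (not LiveCode script, and not necessarily Bernd's solution), assuming little-endian UTF-32 without a byte-order mark. The function name `utf32_offsets` is hypothetical, and the positions it returns count code points, which is not always the same as LiveCode chars.

```python
def utf32_offsets(needle, haystack):
    """Return 1-based code point positions of needle in haystack.

    Both strings are encoded as little-endian UTF-32 and searched as bytes.
    Matches that do not start on a 4-byte boundary are discarded, because
    they straddle two code points and are not real text matches.
    """
    find_bytes = needle.encode("utf-32-le")
    search_bytes = haystack.encode("utf-32-le")
    positions = []
    if not find_bytes:
        return positions
    start = search_bytes.find(find_bytes)
    while start != -1:
        if start % 4 == 0:
            # Convert the 0-based byte offset to a 1-based code point position.
            positions.append(start // 4 + 1)
        # Step forward one byte so overlapping matches are also found.
        start = search_bytes.find(find_bytes, start + 1)
    return positions


print(utf32_offsets("a", "banana"))        # [2, 4, 6]
# U+6100 followed by U+0100 encodes to 00 61 00 00 00 01 00 00, which contains
# the bytes 61 00 00 00 at byte offset 1 even though the text has no "a".
print(utf32_offsets("a", "\u6100\u0100"))  # []
```

Without the alignment check, the second call would report a match at byte offset 1.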

Re: What is LC's internal text format?

2018-11-13 Thread Ben Rubinstein via use-livecode
For the avoidance of doubt, all my outrage is faux outrage. Public life on both sides of the Atlantic (and around the world) has completely exhausted capacity for real outrage. Come back Geoff! Ben On 13/11/2018 17:29, Mark Waddingham via use-livecode wrote: On 2018-11-13 18:21, Geoff

Re: What is LC's internal text format?

2018-11-13 Thread Jerry Jensen via use-livecode
> On Nov 13, 2018, at 2:52 AM, Mark Waddingham via use-livecode > wrote: > > Yes - a byte is not a number, a char is not a number, a bit sequence is not a > number. Reminds me of a clever sig line from somebody on this list. I can’t remember who, so author please step up and take credit.

Re: What is LC's internal text format?

2018-11-13 Thread Ben Rubinstein via use-livecode
That's really helpful - and in parts eye-opening - thanks Mark. I have a few follow-up questions. Does textEncode _always_ return a binary string? Or, if invoked with "CP1252", "ISO-8859-1", "MacRoman" or "Native", does it return a string? > CodepointOffset has signature 'integer

Re: What is LC's internal text format?

2018-11-13 Thread Mark Waddingham via use-livecode
On 2018-11-13 18:21, Geoff Canyon via use-livecode wrote: Nothing I said in this thread has anything to do with optimizing the allOffsets routines; I only used examples from that discussion because they illustrate my puzzlement on the exact topic you (in general) raised: how data types are

Re: What is LC's internal text format?

2018-11-13 Thread Geoff Canyon via use-livecode
On Tue, Nov 13, 2018 at 3:43 AM Ben Rubinstein via use-livecode <use-livecode@lists.runrev.com> wrote: > I'm grateful for all the information, but _outraged_ that the thread that I carefully created separate from the offset thread was so quickly hijacked for the continuing (useful!)

Re: What is LC's internal text format?

2018-11-13 Thread Bob Sneidar via use-livecode
There is a quest in World of Warcraft where the objective is actually to herd cats. It can be done, but only one cat at a time. :-) Bob S > On Nov 13, 2018, at 05:31 , Mark Waddingham via use-livecode > wrote: > > On 2018-11-13 12:43, Ben Rubinstein via use-livecode wrote: >> I'm grateful

Re: What is LC's internal text format?

2018-11-13 Thread Mark Waddingham via use-livecode
On 2018-11-13 12:43, Ben Rubinstein via use-livecode wrote: I'm grateful for all the information, but _outraged_ that the thread that I carefully created separate from the offset thread was so quickly hijacked for the continuing (useful!) detailed discussion on that topic. The phrase

Re: What is LC's internal text format?

2018-11-13 Thread Ben Rubinstein via use-livecode
I'm grateful for all the information, but _outraged_ that the thread that I carefully created separate from the offset thread was so quickly hijacked for the continuing (useful!) detailed discussion on that topic. From recent contributions on both threads I'm getting some more insights, but

Re: What is LC's internal text format?

2018-11-13 Thread Mark Waddingham via use-livecode
On 2018-11-13 11:06, Geoff Canyon via use-livecode wrote: I don't *think* I'm confusing binary string/data with binary numbers -- I was just trying to illustrate that when a Latin Small Letter A (U+0061) gets encoded, somewhere there is stored (four bytes, one of which is) a byte 97, i.e. the bit sequence 1100001, i.e. the

Re: What is LC's internal text format?

2018-11-13 Thread Geoff Canyon via use-livecode
I don't *think* I'm confusing binary string/data with binary numbers -- I was just trying to illustrate that when a Latin Small Letter A (U+0061) gets encoded, somewhere there is stored (four bytes, one of which is) a byte 97, i.e. the bit sequence 1100001, unless computers don't work that way

Re: What is LC's internal text format?

2018-11-13 Thread Mark Waddingham via use-livecode
On 2018-11-13 08:35, Geoff Canyon via use-livecode wrote: So then why does put textEncode("a","UTF-32") into X;put chartonum(byte 1 of X) put 97? Because: 1) textEncode("a", "UTF-32") produces the byte sequence <97,0,0,0> 2) byte 1 of <97,0,0,0> is <97> 3) charToNum(<97>) first
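
Steps 1 and 2 above can be reproduced outside LiveCode. This is a minimal sketch in Python rather than LiveCode script, assuming little-endian UTF-32 with no byte-order mark, which is what the byte sequence <97,0,0,0> implies. The helper name `utf32_bytes` is hypothetical.

```python
def utf32_bytes(text):
    """Return the little-endian UTF-32 bytes of text as a list of integers."""
    # "utf-32-le" writes no byte-order mark, so every code point is exactly 4 bytes.
    return list(text.encode("utf-32-le"))


encoded = utf32_bytes("a")
print(encoded)     # [97, 0, 0, 0]
print(encoded[0])  # 97, the low byte of U+0061
```

In this layout the first byte of each 4-byte unit is always the low 8 bits of the code point, so it only looks like the original character when the code point is below 256.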

Re: What is LC's internal text format?

2018-11-12 Thread Geoff Canyon via use-livecode
So then why does put textEncode("a","UTF-32") into X;put chartonum(byte 1 of X) put 97? That implies that "byte" 1 is "a", not 1100001. Likewise, put textEncode("㍁","UTF-32") into X;put chartonum(byte 1 of X) puts 65. I've looked in the dictionary and I don't see anything that comes close to

Re: What is LC's internal text format?

2018-11-12 Thread Mark Waddingham via use-livecode
On 2018-11-13 07:15, Geoff Canyon via use-livecode wrote: On Mon, Nov 12, 2018 at 3:50 PM Monte Goulding via use-livecode < use-livecode@lists.runrev.com> wrote: Unless I'm misunderstanding, this hasn't been my observation. Using offset on a string that has been textEncode()d to UTF-32

Re: What is LC's internal text format?

2018-11-12 Thread Geoff Canyon via use-livecode
On Mon, Nov 12, 2018 at 3:50 PM Monte Goulding via use-livecode < use-livecode@lists.runrev.com> wrote: > Text strings in LiveCode are native encoded (MacRoman or ISO 8859) where > possible and where you don’t explicitly tell the engine > For what it’s worth using `offset` is the wrong thing to

Re: What is LC's internal text format?

2018-11-12 Thread Monte Goulding via use-livecode
Text strings in LiveCode are native encoded (MacRoman or ISO 8859) where possible and where you don’t explicitly tell the engine it’s unicode (via textDecode) so that they can follow faster single byte code paths. If you use textDecode then the engine will first check if the text can be native
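
The "native where possible" check can be pictured with a small sketch. This is Python, not engine code, and it only illustrates the idea of testing whether text fits a single-byte encoding; it makes no claim about how the engine actually implements the check. The function name `storage_encoding` and the return value "unicode" are hypothetical, and ISO 8859-1 is assumed as the native encoding.

```python
def storage_encoding(text, native="iso-8859-1"):
    """Return the native encoding name if text fits it, else "unicode"."""
    try:
        # Encoding succeeds only if every character exists in the native set.
        text.encode(native)
        return native
    except UnicodeEncodeError:
        return "unicode"


print(storage_encoding("cafe"))       # iso-8859-1
print(storage_encoding("caf\u00e9"))  # iso-8859-1 (U+00E9 is in ISO 8859-1)
print(storage_encoding("\u3341"))     # unicode
```

Passing `native="mac_roman"` runs the same check against Python's MacRoman codec.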