Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-05 Thread Geoff Canyon via use-livecode
I've updated my GitHub to the following, which adopts Brian's "starts with"
(I can't count how many times I've had to re-remember that "starts with" is
faster than comparing to char 1 through ) and added minor
optimizations to the wrapping-up code.

gc

function allOffsets D,S,pCase,pNoOverlaps
   -- returns a comma-delimited list of the offsets of D in S
   set the caseSensitive to pCase is true
   put length(D) into dLength
   put pNoOverlaps and dLength > 1 into pNoOverlaps
   put numtochar(chartonum(char -1 of D) mod 2 + 1) after S
   if not pNoOverlaps then
      repeat with i = 1 to dLength - 1
         if not (char i + 1 to -1 of D is char 1 to dLength - i of D) then next repeat
         put char -i to -1 of D into OV[i]
         put i & cr after kList
      end repeat
   end if
   set the itemDel to D
   put 1 - dLength into C
   if pNoOverlaps or kList is empty then
      repeat for each item i in S
         add length(i) + dLength to C
         put C,"" after R
      end repeat
   else
      repeat for each item i in S
         repeat for each line K in kList
            if i & D begins with OV[K] then put (C + K),"" after R
         end repeat
         add length(i) + dLength to C
         put C,"" after R
      end repeat
   end if
   set the itemDel to comma
   repeat with i = 1 to 999
      if item i of R > 0 then exit repeat
   end repeat
   delete item 1 to i - 1 of R
   if R begins with C then return 0
   return char 1 to -3 - length(C) of R
end allOffsets
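
For readers who want to check the logic outside LiveCode, here is a rough Python sketch of the same split-and-count idea. It is an illustration, not code anyone in the thread posted: the names all_offsets and no_overlaps are made up here, comparison is always case-sensitive (there is no pCase), D is assumed to be non-empty, and the result is a list (empty when nothing is found) rather than a comma-delimited string or 0.

```python
def all_offsets(d, s, no_overlaps=False):
    """Return the 1-based offsets of d in s (sketch; assumes d is non-empty)."""
    n = len(d)
    # Stop character: its code has the opposite parity of d's last character,
    # so the padded string can never end with d.
    s += chr(ord(d[-1]) % 2 + 1)
    # Shifts k at which d can overlap a copy of itself (d[k:] equals a prefix of d).
    shifts = [] if no_overlaps else [k for k in range(1, n) if d[k:] == d[:n - k]]
    offsets = []
    c = 1 - n  # start of the "previous" delimiter; non-positive before the first item
    for item in s.split(d):
        for k in shifts:
            # An overlapping hit starts k characters after the previous hit
            # when the text that follows begins with the last k characters of d.
            if (item + d).startswith(d[-k:]):
                offsets.append(c + k)
        c += len(item) + n
        offsets.append(c)
    offsets.pop()  # the final entry points past the end of the original string
    return [o for o in offsets if o > 0]  # drop entries produced before the first hit
```

With D = "aba" and S = "abababa" this sketch gives [1, 3, 5], or [1, 5] when no_overlaps is true.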
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode


Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-05 Thread Geoff Canyon via use-livecode
On Sun, Nov 4, 2018 at 7:11 PM Mark Wieder via use-livecode <
use-livecode@lists.runrev.com> wrote:

>
> If you're looking for 'romeo' in pText, would you set pOverlaps to true
> or to false?


I'd set it to false; there's no way for "romeo" to overlap. But even if I
were looking for "radar", which could overlap, I'd set it to false if I
were searching an English text document, because there's no word
"radaradar". But as I said, I've switched it to default to finding overlaps.


Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-05 Thread Geoff Canyon via use-livecode
On Sun, Nov 4, 2018 at 7:42 PM Bob Sneidar via use-livecode <
use-livecode@lists.runrev.com> wrote:

> Simply add 1 to the last offset pointer. If after the first iteration you
> return 1, then set the charsToSkip to 2 instead of offset +
> len(searchString) if you take my meaning.
>
> Bob S
>

The method we're using avoids charsToSkip because it suffers mightily with
multi-byte characters. But the latest updates handle overlapping results;
see the other posts in this thread.
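
For comparison, the skip-based approach Bob describes looks roughly like this in Python, where str.find with a start index stands in for LiveCode's offset with charsToSkip. The function name and the overlaps flag are assumptions made for this sketch, and it says nothing about the multi-byte slowdown, which is specific to LiveCode's offset.

```python
def offsets_by_find(d, s, overlaps=True):
    """1-based offsets of d in s, found by repeatedly searching past the last hit."""
    if not d:
        return []  # an empty search string has no meaningful offsets
    # Advance one character to allow overlapping hits, or a full len(d) to forbid them.
    step = 1 if overlaps else len(d)
    found = []
    pos = s.find(d)
    while pos != -1:
        found.append(pos + 1)  # str.find is 0-based; the thread uses 1-based offsets
        pos = s.find(d, pos + step)
    return found
```

A loop like this is also a convenient cross-check for the item-delimiter versions, since it is easy to see that it is correct.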


Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-04 Thread Brian Milby via use-livecode
Here's an image of the stack in my fork of the repo:
https://github.com/bwmilby/alloffsets/blob/bwm/bwm/stack_allOffsets_card_id_1018.png



Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-04 Thread Brian Milby via use-livecode
I’m working on an update to the stack now. Moving buttons to the left side to 
make it easier to add more.

Thanks,
Brian

Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-04 Thread Mark Wieder via use-livecode

On 11/4/18 4:45 PM, Brian Milby via use-livecode wrote:

> My updated solution always looks for overlap but if none are found it uses
> optimized versions of the search (private functions instead of inside the main
> function). I special case for no overlap and a single overlap in the delimiter.
> It is about the same speed as Geoff’s.

Nice. I tried to get tricky and replace that 'repeat with' loop with a
'repeat for each' loop, but ended up about 20% slower. Not at all what I
expected.


--
 Mark Wieder
 ahsoftw...@gmail.com


Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-04 Thread Bob Sneidar via use-livecode
Simply add 1 to the last offset pointer. If after the first iteration you 
return 1, then set the charsToSkip to 2 instead of offset + len(searchString) 
if you take my meaning. 

Bob S


> On Nov 2, 2018, at 17:43 , Geoff Canyon via use-livecode 
>  wrote:
> 
> I like that, changing it. Now available at
> https://github.com/gcanyon/alloffsets
> 
> One thing I don't see how to do without significantly impacting performance
> is to return all offsets if there are overlapping strings. For example:
> 
> allOffsets("aba","abababa")
> 
> would return 1,5, when it might be reasonable to expect it to return 1,3,5.
> Using the offset function with numToSkip would make that easy; adapting
> allOffsets to do so would be harder to do cleanly I think.
> 
> gc
> 
> On Fri, Nov 2, 2018 at 12:17 PM Bob Sneidar via use-livecode <
> use-livecode@lists.runrev.com> wrote:
> 
>> how about allOffsets?
>> 
>> Bob S
>> 
>> 
>>> On Nov 2, 2018, at 09:16 , Geoff Canyon via use-livecode <
>> use-livecode@lists.runrev.com> wrote:
>>> 
>>> All of those return a single value; I wanted to convey the concept of
>>> returning multiple values. To me listOffset implies it does the same
>> thing
>>> as itemOffset, since items come in a list. How about:
>>> 
>>> offsets -- not my favorite because it's almost indistinguishable from
>> offset
>>> offsetsOf -- seems a tad clumsy
>>> 
>>> On Fri, Nov 2, 2018 at 7:41 AM Bob Sneidar via use-livecode <
>>> use-livecode@lists.runrev.com> wrote:
>>> 
>>>> It probably should be named listOffset, like itemOffset or lineOffset.
>>>> 
>>>> Bob S
>>>> 
>>>> 
>>>>> On Nov 1, 2018, at 17:04 , Geoff Canyon via use-livecode <
>>>> use-livecode@lists.runrev.com> wrote:
>>>>> 
>>>>> Nice! I *just* finished creating a github repository for it, and adding
>>>>> support for multi-char search strings, much as you did. I was coming to
>>>> the
>>>>> list to post the update when I saw your post.
>>>>> 
>>>>> Here's the GitHub link: https://github.com/gcanyon/offsetlist
>>>>> 
>>>>> Here's my updated version:
>>>>> 
>>>>> function offsetList D,S,pCase
>>>>> -- returns a comma-delimited list of the offsets of D in S
>>>>> set the caseSensitive to pCase is true
>>>>> set the itemDel to D
>>>>> put length(D) into dLength
>>>>> put 1 - dLength into C
>>>>> repeat for each item i in S
>>>>>add length(i) + dLength to C
>>>>>put C,"" after R
>>>>> end repeat
>>>>> set the itemDel to comma
>>>>> if char -dLength to -1 of S is D then return char 1 to -2 of R
>>>>> put length(C) + 1 into lenC
>>>>> put length(R) into lenR
>>>>> if lenC = lenR then return 0
>>>>> return char 1 to lenR - lenC - 1 of R
>>>>> end offsetList
>>>>> 
>>>>> On Thu, Nov 1, 2018 at 8:28 AM Niggemann, Bernd via use-livecode <
>>>>> use-livecode@lists.runrev.com> wrote:
>>>>> 
>>>>>> Hi Geoff,
>>>>>> 
>>>>>> thank you for this beautiful script.
>>>>>> 
>>>>>> I modified it a bit to accept multi-character search string and also
>> for
>>>>>> case sensitivity.
>>>>>> 
>>>>>> It definitely is a lot faster for unicode text than anything I have
>>>> seen.
>>>>>> 
>>>>>> -
>>>>>> function offsetList D,S, pCase
>>>>>> -- returns a comma-delimited list of the offsets of D in S
>>>>>> -- pCase is a boolean for caseSensitive
>>>>>> set the caseSensitive to pCase
>>>>>> set the itemDel to D
>>>>>> put the length of D into tDelimLength
>>>>>> repeat for each item i in S
>>>>>>add length(i) + tDelimLength to C
>>>>>>put C - (tDelimLength - 1),"" after R
>>>>>> end repeat
>>>>>> set the itemDel to comma
>>>>>> if char -1 of S is D then return char 1 to -2 of R
>>>>>> put length(C) + 1 into lenC
>>>>>> put length(R) into lenR
>>>>>

Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-04 Thread Mark Wieder via use-livecode

On 11/4/18 6:49 PM, Geoff Canyon via use-livecode wrote:


> I'm not sure I agree that it would be so unlikely to know that overlaps
> won't occur (or that it's unreasonable to not want them). If I'm looking
> for every instance of "romeo" in romeo and juliet, then obviously I'm not
> expecting, nor do I want, overlaps.

Sure, but in that case you'd be better off using the faster 'offset'
function. Or do you mean every instance of 'romeo' in the play itself?
There I can see why you'd want to set it to false for speed.


My point isn't really whether pOverlaps should default to true or false, 
but that you need detailed knowledge of the corpus of data before 
calling the function.


If you're looking for 'romeo' in pText, would you set pOverlaps to true 
or to false?


--
 Mark Wieder
 ahsoftw...@gmail.com



Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-04 Thread Geoff Canyon via use-livecode
On Sun, Nov 4, 2018 at 4:34 PM Mark Wieder via use-livecode <
use-livecode@lists.runrev.com> wrote:

> On 11/4/18 10:40 AM, Geoff Canyon via use-livecode wrote:
> > I also added a "with overlaps" option.
>
> My problem with the pWithOverlaps parameter is that is requires a priori
> knowledge of the data being consumed. If you already know there are
> overlaps then you'd set the parameter to true. If you don't know whether
> or not there are overlaps, then you'd need to set it to true so you
> don't miss anything (aside, of course, for the trivial case where you
> don't care whether or not there are overlaps - is there a use case for
> this?).
>
> The only time you would set it to false is after you've already
> determined that there are no overlaps, and the time spent on that would
> probably more than offset the extra processing in the function.


I'm not sure I agree that it would be so unlikely to know that overlaps
won't occur (or that it's unreasonable to not want them). If I'm looking
for every instance of "romeo" in romeo and juliet, then obviously I'm not
expecting, nor do I want, overlaps. Likewise, overlaps can only occur if
the search string allows for them, so "romeo" makes it impossible from the
get-go.
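
Whether a search string can overlap itself depends only on the string: some proper suffix has to equal a prefix of the same length. A minimal Python sketch of that check follows; overlap_shifts is a name invented for this note, and the test it applies is the same one used by the loop that builds kList in allOffsets.

```python
def overlap_shifts(d):
    """Return every shift k (0 < k < len(d)) at which d overlaps a copy of itself."""
    n = len(d)
    # d shifted right by k lines up with itself when its tail d[k:]
    # matches its head d[:n - k].
    return [k for k in range(1, n) if d[k:] == d[:n - k]]
```

For "romeo" the list is empty, so no text can contain overlapping matches; for "radar" it is [4], which is the "radaradar" case.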

That said, it seems reasonable to default overlaps to true rather than
false. I'll set it up that way when I add the modification below.

On Sun, Nov 4, 2018 at 4:02 PM Brian Milby via use-livecode <
use-livecode@lists.runrev.com> wrote:

>
> put kList is not empty into pWithOverlaps
>

Good point -- I suppose it also makes sense (albeit that the speed
improvement would be trivial) to not bother even building kList if the term
to be found is a single character.

gc


Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-04 Thread Brian Milby via use-livecode
My updated solution always looks for overlap but if none are found it uses 
optimized versions of the search (private functions instead of inside the main 
function). I special case for no overlap and a single overlap in the delimiter. 
It is about the same speed as Geoff’s.

Thanks,
Brian

Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-04 Thread Mark Wieder via use-livecode

On 11/4/18 10:40 AM, Geoff Canyon via use-livecode wrote:

> I also added a "with overlaps" option.

My problem with the pWithOverlaps parameter is that it requires a priori
knowledge of the data being consumed. If you already know there are 
overlaps then you'd set the parameter to true. If you don't know whether 
or not there are overlaps, then you'd need to set it to true so you 
don't miss anything (aside, of course, for the trivial case where you 
don't care whether or not there are overlaps - is there a use case for 
this?).


The only time you would set it to false is after you've already 
determined that there are no overlaps, and the time spent on that would 
probably more than offset the extra processing in the function.


--
 Mark Wieder
 ahsoftw...@gmail.com



Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-04 Thread Brian Milby via use-livecode
Logic matches my solution.  I also validated my solution using just the
offset function.  Speed hit for with overlap is similar.  One possible
optimization:

put kList is not empty into pWithOverlaps

If with overlaps was requested but the source delimiter did not contain any
overlaps, then the extra loops are skipped.

Adding a character to the end is clever.  I'll need to incorporate that and
see what it does to my method.

My take on the code updates is here:
https://github.com/bwmilby/alloffsets/blob/bwm/bwm/allOffsets_Scripts/stack_allOffsets_button_id_1026.livecodescript

Stack and index of scripts here:
https://github.com/bwmilby/alloffsets/tree/bwm/bwm



Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-04 Thread Geoff Canyon via use-livecode
Alex, good catch! The code below and at
https://github.com/gcanyon/alloffsets now puts a stop character after the
string to prevent the error you found. I also added a "with overlaps"
option. I think this is correct, and about as efficient as possible, but
thanks to anyone who finds a bug or a faster way.

gc


function allOffsets D,S,pCase,pWithOverlaps
   -- returns a comma-delimited list of the offsets of D in S
   set the caseSensitive to pCase is true
   put length(D) into dLength
   put numtochar(chartonum(char -1 of D) mod 2 + 1) after S
   if pWithOverlaps then
      repeat with i = 1 to dLength - 1
         if not (char i + 1 to -1 of D is char 1 to dLength - i of D) then next repeat
         put char -i to -1 of D into OV[i]
         put i & cr after kList
      end repeat
   end if
   set the itemDel to D
   put 1 - dLength into C
   if pWithOverlaps then
      repeat for each item i in S
         repeat for each line K in kList
            if char 1 to K of (i & D) is OV[K] then put (C + K),"" after R
         end repeat
         add length(i) + dLength to C
         put C,"" after R
      end repeat
   else
      repeat for each item i in S
         add length(i) + dLength to C
         put C,"" after R
      end repeat
   end if
   set the itemDel to comma
   repeat until item 1 of R > 0
      delete item 1 of R
   end repeat
   delete item -1 of R
   if R is empty then return 0 else return char 1 to -2 of R
end allOffsets
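
The stop character works because chartonum(...) mod 2 + 1 is 1 for an even character code and 2 for an odd one, so its parity is always the opposite of the last character of D and the two can never be equal. A small Python check of that claim, assuming ord and chr as stand-ins for chartonum and numtochar (stop_char is a hypothetical helper name):

```python
def stop_char(d):
    """Return a character guaranteed to differ from the last character of d."""
    # Even code -> chr(1) (odd); odd code -> chr(2) (even).
    return chr(ord(d[-1]) % 2 + 1)


# The padded string can therefore never end with the search string itself.
for d in ("a", "aba", "x", "\x01", "\x02"):
    assert not ("some text " + d + stop_char(d)).endswith(d)
```

Because the text no longer ends with D, the last item is never empty and the final counter value can always be discarded.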


Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-03 Thread Brian Milby via use-livecode
>> >>> offsets -- not my favorite because it's almost indistinguishable from
>> >> offset
>> >>> offsetsOf -- seems a tad clumsy
>> >>>
>> >>> On Fri, Nov 2, 2018 at 7:41 AM Bob Sneidar via use-livecode <
>> >>> use-livecode@lists.runrev.com> wrote:
>> >>>
>> >>>> It probably should be named listOffset, like itemOffset or
>> lineOffset.
>> >>>>
>> >>>> Bob S
>> >>>>
>> >>>>
>> >>>>> On Nov 1, 2018, at 17:04 , Geoff Canyon via use-livecode <
>> >>>> use-livecode@lists.runrev.com> wrote:
>> >>>>> Nice! I *just* finished creating a github repository for it, and
>> adding
>> >>>>> support for multi-char search strings, much as you did. I was
>> coming to
>> >>>> the
>> >>>>> list to post the update when I saw your post.
>> >>>>>
>> >>>>> Here's the GitHub link: https://github.com/gcanyon/offsetlist
>> >>>>>
>> >>>>> Here's my updated version:
>> >>>>>
>> >>>>> function offsetList D,S,pCase
>> >>>>>   -- returns a comma-delimited list of the offsets of D in S
>> >>>>>   set the caseSensitive to pCase is true
>> >>>>>   set the itemDel to D
>> >>>>>   put length(D) into dLength
>> >>>>>   put 1 - dLength into C
>> >>>>>   repeat for each item i in S
>> >>>>>  add length(i) + dLength to C
>> >>>>>  put C,"" after R
>> >>>>>   end repeat
>> >>>>>   set the itemDel to comma
>> >>>>>   if char -dLength to -1 of S is D then return char 1 to -2 of R
>> >>>>>   put length(C) + 1 into lenC
>> >>>>>   put length(R) into lenR
>> >>>>>   if lenC = lenR then return 0
>> >>>>>   return char 1 to lenR - lenC - 1 of R
>> >>>>> end offsetList
>> >>>>>
>> >>>>> On Thu, Nov 1, 2018 at 8:28 AM Niggemann, Bernd via use-livecode <
>> >>>>> use-livecode@lists.runrev.com> wrote:
>> >>>>>
>> >>>>>> Hi Geoff,
>> >>>>>>
>> >>>>>> thank you for this beautiful script.
>> >>>>>>
>> >>>>>> I modified it a bit to accept multi-character search string and
>> also
>> >> for
>> >>>>>> case sensitivity.
>> >>>>>>
>> >>>>>> It definitely is a lot faster for unicode text than anything I have
>> >>>> seen.
>> >>>>>> -
>> >>>>>> function offsetList D,S, pCase
>> >>>>>>   -- returns a comma-delimited list of the offsets of D in S
>> >>>>>>   -- pCase is a boolean for caseSensitive
>> >>>>>>   set the caseSensitive to pCase
>> >>>>>>   set the itemDel to D
>> >>>>>>   put the length of D into tDelimLength
>> >>>>>>   repeat for each item i in S
>> >>>>>>  add length(i) + tDelimLength to C
>> >>>>>>  put C - (tDelimLength - 1),"" after R
>> >>>>>>   end repeat
>> >>>>>>   set the itemDel to comma
>> >>>>>>   if char -1 of S is D then return char 1 to -2 of R
>> >>>>>>   put length(C) + 1 into lenC
>> >>>>>>   put length(R) into lenR
>> >>>>>>   if lenC = lenR then return 0
>> >>>>>>   return char 1 to lenR - lenC - 1 of R
>> >>>>>> end offsetList
>> >>>>>> --
>> >>>>>>
>> >>>>>> Kind regards
>> >>>>>> Bernd
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>> Date: Thu, 1 Nov 2018 00:15:37 -0700
>> >>>>>>> From: Geoff Canyon
>> >>>>>>> To: How to use LiveCode 
>> >>>>>>> Subject: Re: How to find the

Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-03 Thread Brian Milby via use-livecode
>>>   set the itemDel to D
> >>>>>   put length(D) into dLength
> >>>>>   put 1 - dLength into C
> >>>>>   repeat for each item i in S
> >>>>>  add length(i) + dLength to C
> >>>>>  put C,"" after R
> >>>>>   end repeat
> >>>>>   set the itemDel to comma
> >>>>>   if char -dLength to -1 of S is D then return char 1 to -2 of R
> >>>>>   put length(C) + 1 into lenC
> >>>>>   put length(R) into lenR
> >>>>>   if lenC = lenR then return 0
> >>>>>   return char 1 to lenR - lenC - 1 of R
> >>>>> end offsetList
> >>>>>
> >>>>> On Thu, Nov 1, 2018 at 8:28 AM Niggemann, Bernd via use-livecode <
> >>>>> use-livecode@lists.runrev.com> wrote:
> >>>>>
> >>>>>> Hi Geoff,
> >>>>>>
> >>>>>> thank you for this beautiful script.
> >>>>>>
> >>>>>> I modified it a bit to accept multi-character search string and also
> >> for
> >>>>>> case sensitivity.
> >>>>>>
> >>>>>> It definitely is a lot faster for unicode text than anything I have
> >>>> seen.
> >>>>>> -
> >>>>>> function offsetList D,S, pCase
> >>>>>>   -- returns a comma-delimited list of the offsets of D in S
> >>>>>>   -- pCase is a boolean for caseSensitive
> >>>>>>   set the caseSensitive to pCase
> >>>>>>   set the itemDel to D
> >>>>>>   put the length of D into tDelimLength
> >>>>>>   repeat for each item i in S
> >>>>>>  add length(i) + tDelimLength to C
> >>>>>>  put C - (tDelimLength - 1),"" after R
> >>>>>>   end repeat
> >>>>>>   set the itemDel to comma
> >>>>>>   if char -1 of S is D then return char 1 to -2 of R
> >>>>>>   put length(C) + 1 into lenC
> >>>>>>   put length(R) into lenR
> >>>>>>   if lenC = lenR then return 0
> >>>>>>   return char 1 to lenR - lenC - 1 of R
> >>>>>> end offsetList
> >>>>>> --
> >>>>>>
> >>>>>> Kind regards
> >>>>>> Bernd
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> Date: Thu, 1 Nov 2018 00:15:37 -0700
> >>>>>>> From: Geoff Canyon
> >>>>>>> To: How to use LiveCode 
> >>>>>>> Subject: Re: How to find the offset of the last instance of a
> >>>>>>>  repeating   character in a string?
> >>>>>>>
> >>>>>>> I was curious if using the itemDelimiter might work for this, so I
> >>>> wrote
> >>>>>>> the below code out of curiosity; but in my quick testing with
> >>>> single-byte
> >>>>>>> characters it was only about 30% faster than the above methods, so
> I
> >>>>>> didn't
> >>>>>>> bother to post it.
> >>>>>>>
> >>>>>>> But Ben Rubinstein just posted about a terrible slow-down doing
> >> pretty
> >>>>>> much
> >>>>>>> this same thing for text with unicode characters. So I ran a simple
> >>>> test
> >>>>>>> with 8000 character long strings that start with a single unicode
> >>>>>>> character, this is about 15x faster than offset() with skip. For
> >>>>>>> 100,000-character lines it's about 300x faster, so it seems to be
> >>>> immune
> >>>>>> to
> >>>>>>> the line-painter issues skip is subject to. So for what it's worth:
> >>>>>>>
> >>>>>>> function offsetList D,S
> >>>>>>> -- returns a comma-delimited list of the offsets of D in S
> >>>>>>> set the itemDel to D
> >>>>>>> repeat for each item i in S
> >>>>>>> add length(i) + 1 to C
> >>>>>>> put C,"" after R
> >>>>>>> end repeat
> >>>>>>> set the itemDel to comma
> >>>>>>> if char -1 of S is D then return char 1 to -2 of R
> >>>>>>> put length(C) + 1 into lenC
> >>>>>>> put length(R) into lenR
> >>>>>>> if lenC = lenR then return 0
> >>>>>>> return char 1 to lenR - lenC - 1 of R
> >>>>>>> end offsetList
> >>>>>>>
> >>>>>>


Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-03 Thread Alex Tweedly via use-livecode

Hi Geoff,

unfortunately the impact of overlapping delimiter strings is more severe 
than simply not finding them. The code on github gets the wrong answer 
if there is an overlapping string at the very end of the search string, e.g.


alloffsets("", "a")    wrongly gives  1,5,10

I suspect the test for

 if char -dLength to -1 of S is D then return char 1 to -2 of R
should be (something like)
  if item -1 of S is empty then return char 1 to -2 of R
but to be honest, I'm not 10% certain of that.

Alex.
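
To make the end-of-string problem concrete, here is a small Python sketch of the earlier offsetList logic. It is an illustration built on assumptions rather than anyone's posted code: livecode_items is a hypothetical helper that imitates LiveCode ignoring one trailing empty item, the function names are invented, comparisons are case-sensitive, and results come back as lists. The case "aa" in "aaa" was picked for this note.

```python
def livecode_items(s, d):
    """Split s on d, dropping one trailing empty piece (assumed LiveCode item rule)."""
    parts = s.split(d)
    if parts[-1] == "":
        parts.pop()
    return parts


def offset_list_old(d, s):
    """Sketch of the earlier offsetList, including its end-of-string test."""
    n = len(d)
    c = 1 - n
    r = []
    for item in livecode_items(s, d):
        c += len(item) + n
        r.append(c)
    if s.endswith(d):
        return r      # every entry is taken to be a real hit
    return r[:-1]     # otherwise the last entry lies past the end


def offset_list_with_stop(d, s):
    """Same counting, with the stop character from the later fix appended first."""
    return offset_list_old(d, s + chr(ord(d[-1]) % 2 + 1))
```

Here offset_list_old("aa", "aaa") yields [1, 4] even though the string is only three characters long: the text ends with "aa", but that final "aa" overlaps the match at 1 and was never used as a delimiter. With the stop character appended the result is [1].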



On 03/11/2018 00:43, Geoff Canyon via use-livecode wrote:

I like that, changing it. Now available at
https://github.com/gcanyon/alloffsets

One thing I don't see how to do without significantly impacting performance
is to return all offsets if there are overlapping strings. For example:

allOffsets("aba","abababa")

would return 1,5, when it might be reasonable to expect it to return 1,3,5.
Using the offset function with numToSkip would make that easy; adapting
allOffsets to do so would be harder to do cleanly I think.

gc

On Fri, Nov 2, 2018 at 12:17 PM Bob Sneidar via use-livecode <
use-livecode@lists.runrev.com> wrote:


how about allOffsets?

Bob S



On Nov 2, 2018, at 09:16 , Geoff Canyon via use-livecode <
use-livecode@lists.runrev.com> wrote:

All of those return a single value; I wanted to convey the concept of
returning multiple values. To me listOffset implies it does the same thing
as itemOffset, since items come in a list. How about:

offsets -- not my favorite because it's almost indistinguishable from offset
offsetsOf -- seems a tad clumsy

On Fri, Nov 2, 2018 at 7:41 AM Bob Sneidar via use-livecode <
use-livecode@lists.runrev.com> wrote:


It probably should be named listOffset, like itemOffset or lineOffset.

Bob S



On Nov 1, 2018, at 17:04 , Geoff Canyon via use-livecode <

use-livecode@lists.runrev.com> wrote:

Nice! I *just* finished creating a github repository for it, and adding
support for multi-char search strings, much as you did. I was coming to

the

list to post the update when I saw your post.

Here's the GitHub link: https://github.com/gcanyon/offsetlist

Here's my updated version:

function offsetList D,S,pCase
  -- returns a comma-delimited list of the offsets of D in S
  set the caseSensitive to pCase is true
  set the itemDel to D
  put length(D) into dLength
  put 1 - dLength into C
  repeat for each item i in S
 add length(i) + dLength to C
 put C,"" after R
  end repeat
  set the itemDel to comma
  if char -dLength to -1 of S is D then return char 1 to -2 of R
  put length(C) + 1 into lenC
  put length(R) into lenR
  if lenC = lenR then return 0
  return char 1 to lenR - lenC - 1 of R
end offsetList

On Thu, Nov 1, 2018 at 8:28 AM Niggemann, Bernd via use-livecode <
use-livecode@lists.runrev.com> wrote:


Hi Geoff,

thank you for this beautiful script.

I modified it a bit to accept multi-character search string and also

for

case sensitivity.

It definitely is a lot faster for unicode text than anything I have

seen.

-
function offsetList D,S, pCase
  -- returns a comma-delimited list of the offsets of D in S
  -- pCase is a boolean for caseSensitive
  set the caseSensitive to pCase
  set the itemDel to D
  put the length of D into tDelimLength
  repeat for each item i in S
 add length(i) + tDelimLength to C
 put C - (tDelimLength - 1),"" after R
  end repeat
  set the itemDel to comma
  if char -1 of S is D then return char 1 to -2 of R
  put length(C) + 1 into lenC
  put length(R) into lenR
  if lenC = lenR then return 0
  return char 1 to lenR - lenC - 1 of R
end offsetList
--

Kind regards
Bernd






Date: Thu, 1 Nov 2018 00:15:37 -0700
From: Geoff Canyon
To: How to use LiveCode 
Subject: Re: How to find the offset of the last instance of a
 repeating   character in a string?

I was curious if using the itemDelimiter might work for this, so I

wrote

the below code out of curiosity; but in my quick testing with

single-byte

characters it was only about 30% faster than the above methods, so I

didn't

bother to post it.

But Ben Rubinstein just posted about a terrible slow-down doing

pretty

much

this same thing for text with unicode characters. So I ran a simple

test

with 8000 character long strings that start with a single unicode
character, this is about 15x faster than offset() with skip. For
100,000-character lines it's about 300x faster, so it seems to be

immune

to

the line-painter issues skip is subject to. So for what it's worth:

function offsetList D,S
-- returns a comma-delimited list of the offsets of D in S
set the itemDel to D
repeat for each item i in S
add length(i) + 1 to C
put C,"" after R
end repeat
set the itemDel to comma
if char -1 of S is D then return char 1 to -

Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-02 Thread Brian Milby via use-livecode
Here is something... probably needs some optimization

function allOffsets2 D,S,pCase
   local dLength, C, R
   -- returns a comma-delimited list of the offsets of D in S
   set the caseSensitive to pCase is true
   set the itemDel to D
   put length(D) into dLength
   put 1 - dLength into C

   if dLength > 1 then
      local n, i, j, D2, L2
      put 0 into n
      repeat with i = 2 to dLength
         if char i to -1 of D is char 1 to -i of D then
            add 1 to n
            put char (1-i) to -1 of D into D2[n]
            put i-1 into L2[n]
         end if
      end repeat
   end if

   repeat for each item i in S
      if C > 0 and n > 0 then
         repeat with j = 1 to n
            if i&D begins with D2[j] then
               put C+L2[j],"" after R
            end if
         end repeat
      end if
      add length(i) + dLength to C
      put C,"" after R
   end repeat
   set the itemDel to comma
   delete char -1 of R

   if item -1 of R > len(S) then
      if the number of items of R is 1 then
         return 0
      else
         delete item -1 of R
      end if
   end if

   if char -dLength to -1 of S is D then
      return R
   end if

   repeat with j = n down to 1
      if char -len(D2[j]) to -1 of S is D2[j] then
         delete item -1 of R
      end if
   end repeat
   return R
end allOffsets2


I think a couple of private functions would be good.  One for 0 overlap,
one for a single overlap, then a final general one for any number of
overlaps (the core of the above).  After the loop that generates D2/L2 I
would branch based on n to avoid the additional comparisons inside the loop.
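
The loop that fills D2/L2 above looks for the shifts at which the search string can overlap a copy of itself. As a rough illustration only, here is that one step sketched in Python rather than LiveCode; the function name self_overlap_shifts is made up for this note and is not part of anyone's posted code.

```python
def self_overlap_shifts(d):
    """Return every shift k (1 <= k < len(d)) at which d overlaps itself.

    A shift k qualifies when the tail of d starting at index k equals
    the head of d of the same length, e.g. "aba" overlaps itself at k = 2.
    """
    return [k for k in range(1, len(d)) if d[k:] == d[:len(d) - k]]


if __name__ == "__main__":
    for word in ("aba", "aaa", "abc"):
        print(word, self_overlap_shifts(word))
```

An empty result means the string cannot overlap itself, which corresponds to the "0 overlap" case that could take the fast path without any extra comparisons.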

On Fri, Nov 2, 2018 at 9:45 PM Alex Tweedly via use-livecode <
use-livecode@lists.runrev.com> wrote:

> Oh dear - answering my own posts  rarely a good sign :-)
>
>
> On 03/11/2018 02:10, Alex Tweedly via use-livecode wrote:
> >
> > On 03/11/2018 00:43, Geoff Canyon via use-livecode wrote:
> >> One thing I don't see how to do without significantly impacting
> >> performance
> >> is to return all offsets if there are overlapping strings. For example:
> >>
> >> allOffsets("aba","abababa")
> >>
> >> would return 1,5, when it might be reasonable to expect it to return
> >> 1,3,5.
> >> Using the offset function with numToSkip would make that easy; adapting
> >> allOffsets to do so would be harder to do cleanly I think.
> >>
> > Can I suggest changing it to "someOffsets()" :-) :-)
> >
> > But seriously, can you not iteratively run "allOffsets" ?
> >
> Answer : NO. That doesn't work.
> However, there is a more efficient way that does work - but it needs to
> be tested before I post it.
>
> -- Alex.
>


Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-02 Thread Alex Tweedly via use-livecode

Oh dear - answering my own posts is rarely a good sign :-)


On 03/11/2018 02:10, Alex Tweedly via use-livecode wrote:


On 03/11/2018 00:43, Geoff Canyon via use-livecode wrote:
One thing I don't see how to do without significantly impacting 
performance

is to return all offsets if there are overlapping strings. For example:

allOffsets("aba","abababa")

would return 1,5, when it might be reasonable to expect it to return 
1,3,5.

Using the offset function with numToSkip would make that easy; adapting
allOffsets to do so would be harder to do cleanly I think.


Can I suggest changing it to "someOffsets()" :-) :-)

But seriously, can you not iteratively run "allOffsets" ?


Answer : NO. That doesn't work.
However, there is a more efficient way that does work - but it needs to 
be tested before I post it.


-- Alex.



Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-02 Thread Alex Tweedly via use-livecode


On 03/11/2018 00:43, Geoff Canyon via use-livecode wrote:

I like that, changing it. Now available at
https://github.com/gcanyon/alloffsets

One thing I don't see how to do without significantly impacting performance
is to return all offsets if there are overlapping strings. For example:

allOffsets("aba","abababa")

would return 1,5, when it might be reasonable to expect it to return 1,3,5.
Using the offset function with numToSkip would make that easy; adapting
allOffsets to do so would be harder to do cleanly I think.


Can I suggest changing it to "someOffsets()" :-) :-)

But seriously, can you not iteratively run "allOffsets" ?
something like  (typed straight into email - totally untested)

function allOffsets pDel, pStr
 repeat with c = 1 to 255  -- or some other upper limit ?
    if NOT pDel contains numtochar(c) then
       put numtochar(c) into c
       exit repeat
    end if
  end repeat
  repeat forever
    put someOffsets(pDel, pStr) into newR
    if the number of items in newR = 0 then exit repeat
    repeat for each item I in newR
       put c into char I of newR
    end repeat
    put newR after R
  end repeat
  sort items of R numeric
  return R
end alloffsets

-- Alex.


Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-02 Thread Geoff Canyon via use-livecode
I like that, changing it. Now available at
https://github.com/gcanyon/alloffsets

One thing I don't see how to do without significantly impacting performance
is to return all offsets if there are overlapping strings. For example:

allOffsets("aba","abababa")

would return 1,5, when it might be reasonable to expect it to return 1,3,5.
Using the offset function with numToSkip would make that easy; adapting
allOffsets to do so would be harder to do cleanly I think.

gc
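
To make the overlap question concrete, here is a small sketch of the "offset with a skip" style of search mentioned above. It is written in Python rather than LiveCode purely so it can be run anywhere; the name all_offsets and the overlaps flag are invented for this illustration. It makes no claim about speed, which is the reason the item-delimiter version exists.

```python
def all_offsets(d, s, overlaps=True):
    """Return the 1-based offsets of d in s.

    With overlaps=True the search resumes one character after each hit,
    so overlapping matches are reported; otherwise it resumes after the
    whole match, like splitting on a delimiter would.
    """
    if not d:
        raise ValueError("search string must not be empty")
    step = 1 if overlaps else len(d)
    offsets = []
    i = s.find(d)
    while i != -1:
        offsets.append(i + 1)  # convert 0-based index to 1-based offset
        i = s.find(d, i + step)
    return offsets


if __name__ == "__main__":
    print(all_offsets("aba", "abababa"))                   # [1, 3, 5]
    print(all_offsets("aba", "abababa", overlaps=False))   # [1, 5]
```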

On Fri, Nov 2, 2018 at 12:17 PM Bob Sneidar via use-livecode <
use-livecode@lists.runrev.com> wrote:

> how about allOffsets?
>
> Bob S

Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-02 Thread Bob Sneidar via use-livecode
how about allOffsets?

Bob S


> On Nov 2, 2018, at 09:16 , Geoff Canyon via use-livecode 
>  wrote:
> 
> All of those return a single value; I wanted to convey the concept of
> returning multiple values. To me listOffset implies it does the same thing
> as itemOffset, since items come in a list. How about:
> 
> offsets -- not my favorite because it's almost indistinguishable from offset
> offsetsOf -- seems a tad clumsy

Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-02 Thread Geoff Canyon via use-livecode
All of those return a single value; I wanted to convey the concept of
returning multiple values. To me listOffset implies it does the same thing
as itemOffset, since items come in a list. How about:

offsets -- not my favorite because it's almost indistinguishable from offset
offsetsOf -- seems a tad clumsy

On Fri, Nov 2, 2018 at 7:41 AM Bob Sneidar via use-livecode <
use-livecode@lists.runrev.com> wrote:

> It probably should be named listOffset, like itemOffset or lineOffset.
>
> Bob S

Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-02 Thread Bob Sneidar via use-livecode
It probably should be named listOffset, like itemOffset or lineOffset. 

Bob S


> On Nov 1, 2018, at 17:04 , Geoff Canyon via use-livecode 
>  wrote:
> 
> Nice! I *just* finished creating a github repository for it, and adding
> support for multi-char search strings, much as you did. I was coming to the
> list to post the update when I saw your post.
> 
> Here's the GitHub link: https://github.com/gcanyon/offsetlist
> 
> Here's my updated version:
> 
> function offsetList D,S,pCase
>   -- returns a comma-delimited list of the offsets of D in S
>   set the caseSensitive to pCase is true
>   set the itemDel to D
>   put length(D) into dLength
>   put 1 - dLength into C
>   repeat for each item i in S
>  add length(i) + dLength to C
>  put C,"" after R
>   end repeat
>   set the itemDel to comma
>   if char -dLength to -1 of S is D then return char 1 to -2 of R
>   put length(C) + 1 into lenC
>   put length(R) into lenR
>   if lenC = lenR then return 0
>   return char 1 to lenR - lenC - 1 of R
> end offsetList


Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-01 Thread Geoff Canyon via use-livecode
Nice! I *just* finished creating a github repository for it, and adding
support for multi-char search strings, much as you did. I was coming to the
list to post the update when I saw your post.

Here's the GitHub link: https://github.com/gcanyon/offsetlist

Here's my updated version:

function offsetList D,S,pCase
   -- returns a comma-delimited list of the offsets of D in S
   set the caseSensitive to pCase is true
   set the itemDel to D
   put length(D) into dLength
   put 1 - dLength into C
   repeat for each item i in S
      add length(i) + dLength to C
      put C,"" after R
   end repeat
   set the itemDel to comma
   if char -dLength to -1 of S is D then return char 1 to -2 of R
   put length(C) + 1 into lenC
   put length(R) into lenR
   if lenC = lenR then return 0
   return char 1 to lenR - lenC - 1 of R
end offsetList

On Thu, Nov 1, 2018 at 8:28 AM Niggemann, Bernd via use-livecode <
use-livecode@lists.runrev.com> wrote:

> Hi Geoff,
>
> thank you for this beautiful script.
>
> I modified it a bit to accept multi-character search string and also for
> case sensitivity.
>
> It definitely is a lot faster for unicode text than anything I have seen.
>
> -
> function offsetList D,S, pCase
>    -- returns a comma-delimited list of the offsets of D in S
>    -- pCase is a boolean for caseSensitive
>    set the caseSensitive to pCase
>    set the itemDel to D
>    put the length of D into tDelimLength
>    repeat for each item i in S
>       add length(i) + tDelimLength to C
>       put C - (tDelimLength - 1),"" after R
>    end repeat
>    set the itemDel to comma
>    if char -1 of S is D then return char 1 to -2 of R
>    put length(C) + 1 into lenC
>    put length(R) into lenR
>    if lenC = lenR then return 0
>    return char 1 to lenR - lenC - 1 of R
> end offsetList
> --
>
> Kind regards
> Bernd


Re: How to find the offset of the last instance of a repeating character in a string? (Geoff Canyon)

2018-11-01 Thread Niggemann, Bernd via use-livecode
Hi Geoff,

thank you for this beautiful script.

I modified it a bit to accept multi-character search string and also for case 
sensitivity.

It definitely is a lot faster for unicode text than anything I have seen.

-
function offsetList D,S, pCase
   -- returns a comma-delimited list of the offsets of D in S
   -- pCase is a boolean for caseSensitive
   set the caseSensitive to pCase
   set the itemDel to D
   put the length of D into tDelimLength
   repeat for each item i in S
      add length(i) + tDelimLength to C
      put C - (tDelimLength - 1),"" after R
   end repeat
   set the itemDel to comma
   if char -1 of S is D then return char 1 to -2 of R
   put length(C) + 1 into lenC
   put length(R) into lenR
   if lenC = lenR then return 0
   return char 1 to lenR - lenC - 1 of R
end offsetList
--

Kind regards
Bernd





> 
> Date: Thu, 1 Nov 2018 00:15:37 -0700
> From: Geoff Canyon
> To: How to use LiveCode 
> Subject: Re: How to find the offset of the last instance of a
>   repeating   character in a string?
> 
> I was curious if using the itemDelimiter might work for this, so I wrote
> the below code out of curiosity; but in my quick testing with single-byte
> characters it was only about 30% faster than the above methods, so I didn't
> bother to post it.
> 
> But Ben Rubinstein just posted about a terrible slow-down doing pretty much
> this same thing for text with unicode characters. So I ran a simple test
> with 8000 character long strings that start with a single unicode
> character, this is about 15x faster than offset() with skip. For
> 100,000-character lines it's about 300x faster, so it seems to be immune to
> the line-painter issues skip is subject to. So for what it's worth:
> 
> function offsetList D,S
>   -- returns a comma-delimited list of the offsets of D in S
>   set the itemDel to D
>   repeat for each item i in S
>  add length(i) + 1 to C
>  put C,"" after R
>   end repeat
>   set the itemDel to comma
>   if char -1 of S is D then return char 1 to -2 of R
>   put length(C) + 1 into lenC
>   put length(R) into lenR
>   if lenC = lenR then return 0
>   return char 1 to lenR - lenC - 1 of R
> end offsetList
> 




Re: How to find the offset of the last instance of a repeating character in a string?

2018-11-01 Thread Geoff Canyon via use-livecode
I was curious if using the itemDelimiter might work for this, so I wrote
the below code out of curiosity; but in my quick testing with single-byte
characters it was only about 30% faster than the above methods, so I didn't
bother to post it.

But Ben Rubinstein just posted about a terrible slow-down doing pretty much
this same thing for text with unicode characters. So I ran a simple test
with 8000 character long strings that start with a single unicode
character, this is about 15x faster than offset() with skip. For
100,000-character lines it's about 300x faster, so it seems to be immune to
the line-painter issues skip is subject to. So for what it's worth:

function offsetList D,S
   -- returns a comma-delimited list of the offsets of D in S
   set the itemDel to D
   repeat for each item i in S
      add length(i) + 1 to C
      put C,"" after R
   end repeat
   set the itemDel to comma
   if char -1 of S is D then return char 1 to -2 of R
   put length(C) + 1 into lenC
   put length(R) into lenR
   if lenC = lenR then return 0
   return char 1 to lenR - lenC - 1 of R
end offsetList
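
As a usage sketch for the original question: the last occurrence is just the last item of the list that offsetList returns. The handler name lastOffsetOf and the sample URL are made up for illustration, and this assumes a single-character delimiter, which is what offsetList as written handles.

```livecode
function lastOffsetOf pChar, pString
   -- hypothetical helper: last item of the offsets list;
   -- offsetList returns 0 when pChar is absent, so this does too
   return item -1 of offsetList(pChar, pString)
end lastOffsetOf

-- example call (sample URL borrowed from elsewhere in this thread):
-- put lastOffsetOf("/", "https://www.my.org/assets/general/may/2018.zip")
```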



Re: How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Kay C Lan via use-livecode
On Tue, Oct 30, 2018 at 2:33 AM Keith Clarke via use-livecode
 wrote:
>
> I’m trying to separate paths & pages from a list of URLs and so looking to 
> identify the position of the last ‘/‘ character.
>
If that is all you are after then I think setting the itemDelimiter to
"/" and separating the 'item -1' (page) from 'items 1 to -2' (path)
would give you a very simple and readable solution.  The only problem is
if you have the unlikely but not impossible situation where you have
paths that contain no pages.  Because of the known gotcha with LC and
how it counts items when the last item is empty you may need to
include an 'if' statement.

Try this, create a new Stack with a field and a button.

Into the field load the following text:

https://www.my.org/assets/general/february/
https://www.my.org/assets/general/march/
https://www.my.org/assets/general/april/2018.zip
https://www.my.org/assets/general/may/2018.zip
https://www.my.org/assets/general/june/2018.zip
https://www.my.org/assets/general/july/2018.zip
https://www.my.org/assets/general/july/2017.html
https://www.my.org/assets/general/july/2016.text
https://www.my.org/assets/general/july/2015.jpg
https://www.my.org/assets/general/august/2018.zip
https://www.my.org/assets/general/september/2018.zip
https://www.my.org/assets/general/october/2018.zip
https://www.my.org/assets/general/november/
https://www.my.org/assets/general/december/

Into the button load the following script (be careful of line breaks
there are 16 lines of code):

on mouseUp
   put fld 1 into tText
   set the itemDelimiter to "/"
   repeat for each line tLine in tText
      if (char -1 of tLine = "/") then --usual problem with dealing with empty last items
         put empty into tPath[tLine]
      else
         if (tPath[item 1 to -2 of tLine] = empty) then  --initial entry
            put item -1 of tLine into tPath[item 1 to -2 of tLine]
         else  --multiple entries
            put tPath[item 1 to -2 of tLine] & cr & item -1 of tLine into tPath[item 1 to -2 of tLine]
         end if
      end if
   end repeat
   breakpoint
end mouseUp

There is a breakpoint at the end so the script will pause and you can
inspect the variables.  You'll see that an array is created with each
unique path as a key and its pages as the element.  In the case of 'july'
you will see that four pages are all listed, one per line.

From there it should open a world of possibilities to arrange, sort
and sift through the paths.
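
For a single URL, the same itemDelimiter idea can be sketched as a small function. The name splitURL and the two-line return format are assumptions for illustration, not part of the script above, and the sketch assumes the URL contains at least one "/".

```livecode
function splitURL pURL
   -- hypothetical helper: returns the path on line 1 and the page on line 2
   set the itemDelimiter to "/"
   if char -1 of pURL is "/" then
      -- trailing slash: the whole URL is a path and there is no page
      return pURL & cr
   end if
   return (item 1 to -2 of pURL) & "/" & cr & item -1 of pURL
end splitURL
```

The itemDelimiter is local to the handler, so setting it here does not affect the caller.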

HTH


Re: How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Alex Tweedly via use-livecode

On 29/10/2018 22:32, Mark Wieder via use-livecode wrote:
> On 10/29/2018 08:32 AM, Keith Clarke via use-livecode wrote:
>> I’m trying to separate paths & pages from a list of URLs and so 
>> looking to identify the position of the last ‘/‘ character.



How about 

function rightmostSlashOf p
   set the itemdelimiter to "/"
   return (the number of chars in p) - (the number of chars in item -1 of p)
end rightmostSlashOf
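
A quick way to see the difference between the length-based calculation and searching for the text of the last item: offset() reports the first match, so a path whose last segment also appears earlier gives a misleading result. The sketch below uses the example string posted in this thread; the handler name is made up for illustration.

```livecode
on testRightmostSlash
   put "toplevel/somename/another/somename" into tPath
   set the itemDelimiter to "/"
   -- searching for the text of the last item finds its FIRST occurrence
   put offset(item -1 of tPath, tPath) into tByOffset
   -- subtracting the length of the last item lands on the final "/"
   put (the number of chars in tPath) - (the number of chars in item -1 of tPath) into tByLength
   put tByOffset && tByLength  -- 10 26
end testRightmostSlash
```

Note that the length-based version assumes the string does not end with "/"; with a trailing slash, item -1 is the segment before it.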




Re: How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Alex Tweedly via use-livecode

"toplevel/somename/another/somename"


On 29/10/2018 22:32, Mark Wieder via use-livecode wrote:
> On 10/29/2018 08:32 AM, Keith Clarke via use-livecode wrote:
>> I’m trying to separate paths & pages from a list of URLs and so 
>> looking to identify the position of the last ‘/‘ character.
>
> function rightmostSlashOf pText
>    set the itemdelimiter to "/"
>    return offset(item -1 of pText, pText)
> end rightmostSlashOf





Re: How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Bob Sneidar via use-livecode
Oh right you are! 

Bob S


> On Oct 29, 2018, at 16:04 , Mark Wieder via use-livecode 
>  wrote:
> 
> On 10/29/2018 03:55 PM, Bob Sneidar via use-livecode wrote:
>> That will only give him the item, not the character position.
> 
> Nope. It returns the position.
> 
> -- 
> Mark Wieder
> ahsoftw...@gmail.com
> 


Re: How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Mark Wieder via use-livecode

On 10/29/2018 03:55 PM, Bob Sneidar via use-livecode wrote:
> That will only give him the item, not the character position.


Nope. It returns the position.

--
 Mark Wieder
 ahsoftw...@gmail.com



Re: How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Bob Sneidar via use-livecode
That will only give him the item, not the character position. But it's a start. 
You can now get the number of characters of item 1 to -2 of pText +1. I didn't 
know the text you were searching had regular delimiters, and you were searching 
for the last delimiter. That makes things *much* easier. 

Bob S


> On Oct 29, 2018, at 15:32 , Mark Wieder via use-livecode 
>  wrote:
> 
> On 10/29/2018 08:32 AM, Keith Clarke via use-livecode wrote:
> 
>> I’m trying to separate paths & pages from a list of URLs and so looking to 
>> identify the position of the last ‘/‘ character.
> 
> function rightmostSlashOf pText
>   set the itemdelimiter to "/"
>   return offset(item -1 of pText, pText)
> end rightmostSlashOf
> 
> -- 
> Mark Wieder
> ahsoftw...@gmail.com
> 

Re: How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Mark Wieder via use-livecode

On 10/29/2018 08:32 AM, Keith Clarke via use-livecode wrote:
> I’m trying to separate paths & pages from a list of URLs and so looking to 
> identify the position of the last ‘/‘ character.


function rightmostSlashOf pText
   set the itemdelimiter to "/"
   return offset(item -1 of pText, pText)
end rightmostSlashOf

--
 Mark Wieder
 ahsoftw...@gmail.com


Re: How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Bob Sneidar via use-livecode
In dBase/Foxpro they had an AT function synonymous (roughly) with our offset 
function. They also had a RAT (Reverse AT) function. I needed something like 
this many moons ago. 

What I did to get all occurrences is I have a "pointer" variable I maintain 
with the position of the first character after the last instance of the string 
found. But to get the actual position in the original text, you have to add the 
pointer to the offset like so:

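put 0 into tPointer
repeat
   -- offset() returns a position relative to the skipped characters
   put offset(tVar, tTextChunk, tPointer) into tNextPos
   if tNextPos = 0 then exit repeat
   add tPointer to tNextPos  -- convert to a position in the original text
   put char tNextPos to tNextPos + length(tVar) - 1 of tTextChunk into aFoundChunks[tNextPos][length(tVar)]
   -- skip everything up to and including the end of this match
   put tNextPos + length(tVar) - 1 into tPointer
end repeat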

Something along those lines. Not tested, but you get the idea. 
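
For the reverse case (the RAT idea), a minimal sketch is to test candidate positions from the right-hand end and stop at the first match. The name reverseOffset is hypothetical, and this has not been tuned for long Unicode text, where indexed char access can be slow.

```livecode
function reverseOffset pNeedle, pHaystack
   -- hypothetical helper in the spirit of RAT:
   -- returns the position of the last match, or 0 if there is none
   put length(pNeedle) into tLen
   if tLen = 0 then return 0
   repeat with i = length(pHaystack) - tLen + 1 down to 1
      if char i to i + tLen - 1 of pHaystack is pNeedle then return i
   end repeat
   return 0
end reverseOffset
```

Like offset(), the comparison follows the current caseSensitive setting.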

Bob S


> On Oct 29, 2018, at 08:32 , Keith Clarke via use-livecode 
>  wrote:
> 
> Folks,
> Is there a simple way to find the offset of a character from the ‘right’ end 
> of a string, rather than the beginning - or alternatively get a list of all 
> occurrences?
> 
> I’m trying to separate paths & pages from a list of URLs and so looking to 
> identify the position of the last ‘/‘ character.
> 
> Thanks & regards,
> Keith

Re: How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Bob Sneidar via use-livecode
Looks like Devin beat me to it. :-)

Bob S



Re: How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Keith Clarke via use-livecode
Perfect, thanks Devin - I was hoping to see ‘offsets’ in the docs under 
‘offset’, so this will do nicely! :-)
Best,
Keith   


Re: How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Devin Asay via use-livecode
On Oct 29, 2018, at 9:32 AM, Keith Clarke via use-livecode 
 wrote:
> 
> Folks,
> Is there a simple way to find the offset of a character from the ‘right’ end 
> of a string, rather than the beginning - or alternatively get a list of all 
> occurrences?
> 
> I’m trying to separate paths & pages from a list of URLs and so looking to 
> identify the position of the last ‘/‘ character.
> 
> Thanks & regards,
> Keith


There was a discussion on this topic on the list a few years ago, and I saved 
these functions in my script library:

From Peter Brigham:
These are utility functions I use constantly for text processing. 
Offsets(str,cntr) returns a comma-delimited list of all the offsets of str in 
cntr. Lineoffsets(str,cntr) does the same with lineoffsets. Then you can 
iterate over the list of offsets to do whatever you want to each instance of 
str in cntr. I keep them in a utility stack that is in the stacksInUse, so it is 
available to all stacks. I don't use regex, as I have never gotten the regex 
syntax to stick in my head firmly enough to find it natural, and in any case 
doing it by script turns out to be as fast or faster.

Peter's lineOffsets function returns a line number for each found char offset. 
I added a function that returns only unique line numbers.

function offsets str,cntr
   -- returns a comma-delimited list of
   -- all the offsets of str in cntr
   put "" into oList
   put 0 into startPoint
   repeat
      put offset(str,cntr,startPoint) into os
      if os = 0 then exit repeat
      add os to startPoint
      put startPoint & "," after oList
   end repeat
   if oList = "" then return "0"
   return item 1 to -1 of oList
end offsets

function lineOffsetsAll str,cntr
   -- returns a comma-delimited list of
   -- all the lineoffsets of str in cntr
   # (returns a line number for ALL instances)
   put offsets(str,cntr) into charList
   if charList = "0" then return "0"
   put the number of items of charList into nbr
   put "" into oList
   repeat for each item n in charList
      put the number of lines of (char 1 to n of cntr) \
            & "," after oList
   end repeat
   return item 1 to -1 of oList
end lineOffsetsAll

# added by Devin Asay
function lineOffsets pStr,pSearchTxt
   # (returns only unique line numbers)
   put empty into tList
   put 0 into tStartLine
   repeat
      put lineOffset(pStr,pSearchTxt,tStartLine) into tLineNum
      if tLineNum = 0 then exit repeat
      add tLineNum to tStartLine
      put tStartLine & "," after tList
   end repeat
   if tList is empty then return "0"
   return item 1 to -1 of tList
end lineOffsets
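
Applied to the question of splitting a URL at its last "/", a usage sketch might look like the following. The handler name, the variable names and the sample URL are assumptions for illustration; only offsets() comes from the functions above.

```livecode
on splitExample
   put "https://www.my.org/assets/general/may/2018.zip" into tURL
   -- the last item of the offsets list is the position of the last "/"
   put item -1 of offsets("/", tURL) into tLastSlash
   if tLastSlash > 0 then
      put char 1 to tLastSlash of tURL into tPath        -- up to and including the last "/"
      put char tLastSlash + 1 to -1 of tURL into tPage   -- empty when the URL ends in "/"
   end if
end splitExample
```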

Hope this helps.

Devin

Devin Asay
Director
Office of Digital Humanities
Brigham Young University


How to find the offset of the last instance of a repeating character in a string?

2018-10-29 Thread Keith Clarke via use-livecode
Folks,
Is there a simple way to find the offset of a character from the ‘right’ end of 
a string, rather than the beginning - or alternatively get a list of all 
occurrences?

I’m trying to separate paths & pages from a list of URLs and so looking to 
identify the position of the last ‘/‘ character.

Thanks & regards,
Keith