Hi Geoff,

Unfortunately, the impact of overlapping delimiter strings is more severe than simply not finding them. The code on GitHub gets the wrong answer if there is an overlapping string at the very end of the search string, e.g.

alloffsets("aaaa", "aaaaaaaaa")    wrongly gives  1,5,10

I suspect the test for

 if char -dLength to -1 of S is D then return char 1 to -2 of R
should be (something like)
  if item -1 of S is empty then return char 1 to -2 of R
but to be honest, I'm not 100% certain of that.
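
To spell out why the result goes wrong: with the item delimiter set to "aaaa", the nine-character string splits into the items "", "" and "a", so the loop records 1, 5 and then a phantom 10 for the last item, which has no delimiter after it. Because the last four characters of S happen to equal D, the end-of-string test keeps that phantom entry. Below is a minimal sketch of one possible alternative test. It assumes that the phantom entry is always length(S) + 1 and has not been tested against the GitHub code. The function name allOffsetsChecked is hypothetical, and the body otherwise follows the offsetList code quoted further down in this thread.

```livecode
function allOffsetsChecked D, S, pCase
   -- Hypothetical variant: same loop as the quoted offsetList code,
   -- only the end-of-string handling differs.
   set the caseSensitive to (pCase is true)
   set the itemDel to D
   put length(D) into dLength
   put 1 - dLength into C
   put empty into R
   repeat for each item i in S
      add length(i) + dLength to C
      put C & comma after R
   end repeat
   set the itemDel to comma
   if R is empty then return 0
   delete char -1 of R -- drop the trailing comma
   -- Assumption: when S does not end with a parsed delimiter, the last
   -- item has no delimiter after it, so the final entry is length(S) + 1
   -- and cannot be a real offset.
   if C > length(S) then delete item -1 of R
   if R is empty then return 0
   return R
end allOffsetsChecked
```

Under that assumption, allOffsetsChecked("aaaa", "aaaaaaaaa") should give 1,5, and so should allOffsetsChecked("aaaa", "aaaaaaaa"), where the string really does end with a delimiter.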

Alex.



On 03/11/2018 00:43, Geoff Canyon via use-livecode wrote:
I like that, changing it. Now available at
https://github.com/gcanyon/alloffsets

One thing I don't see how to do without significantly impacting performance
is to return all offsets if there are overlapping strings. For example:

allOffsets("aba","abababa")

would return 1,5, when it might be reasonable to expect it to return 1,3,5.
Using the offset function with numToSkip would make that easy; adapting
allOffsets to do so would be harder to do cleanly I think.
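
For comparison, a minimal sketch of the offset-with-skip approach mentioned above, which does report overlapping matches. The function name allOffsetsOverlapping is hypothetical and is not part of the GitHub code. It relies on offset() returning a position relative to the skipped characters, and, going by the timings reported earlier in this thread, it is likely to be much slower on long unicode text.

```livecode
function allOffsetsOverlapping pFind, pString, pCase
   -- Hypothetical helper: returns a comma-delimited list of every offset
   -- of pFind in pString, overlapping matches included, or 0 if none.
   set the caseSensitive to (pCase is true)
   put empty into tResult
   put 0 into tSkip
   repeat forever
      -- the value returned is counted from the char after tSkip
      put offset(pFind, pString, tSkip) into tPos
      if tPos = 0 then exit repeat
      add tPos to tSkip -- tSkip is now the absolute position of this match
      put tSkip & comma after tResult
   end repeat
   if tResult is empty then return 0
   return char 1 to -2 of tResult
end allOffsetsOverlapping
```

Because the next search starts one character after the start of the previous match rather than after its end, allOffsetsOverlapping("aba", "abababa") gives 1,3,5.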

gc

On Fri, Nov 2, 2018 at 12:17 PM Bob Sneidar via use-livecode <
use-livecode@lists.runrev.com> wrote:

how about allOffsets?

Bob S


On Nov 2, 2018, at 09:16 , Geoff Canyon via use-livecode <
use-livecode@lists.runrev.com> wrote:
All of those return a single value; I wanted to convey the concept of
returning multiple values. To me listOffset implies it does the same
thing as itemOffset, since items come in a list. How about:

offsets -- not my favorite because it's almost indistinguishable from offset
offsetsOf -- seems a tad clumsy

On Fri, Nov 2, 2018 at 7:41 AM Bob Sneidar via use-livecode <
use-livecode@lists.runrev.com> wrote:

It probably should be named listOffset, like itemOffset or lineOffset.

Bob S


On Nov 1, 2018, at 17:04 , Geoff Canyon via use-livecode <
use-livecode@lists.runrev.com> wrote:
Nice! I *just* finished creating a GitHub repository for it, and adding
support for multi-char search strings, much as you did. I was coming to
the list to post the update when I saw your post.

Here's the GitHub link: https://github.com/gcanyon/offsetlist

Here's my updated version:

function offsetList D,S,pCase
  -- returns a comma-delimited list of the offsets of D in S
  set the caseSensitive to pCase is true
  set the itemDel to D
  put length(D) into dLength
  put 1 - dLength into C
  repeat for each item i in S
     add length(i) + dLength to C
     put C,"" after R
  end repeat
  set the itemDel to comma
  if char -dLength to -1 of S is D then return char 1 to -2 of R
  put length(C) + 1 into lenC
  put length(R) into lenR
  if lenC = lenR then return 0
  return char 1 to lenR - lenC - 1 of R
end offsetList

On Thu, Nov 1, 2018 at 8:28 AM Niggemann, Bernd via use-livecode <
use-livecode@lists.runrev.com> wrote:

Hi Geoff,

thank you for this beautiful script.

I modified it a bit to accept a multi-character search string and also
to support case sensitivity.

It definitely is a lot faster for unicode text than anything I have seen.
-----------------------------
function offsetList D,S, pCase
  -- returns a comma-delimited list of the offsets of D in S
  -- pCase is a boolean for caseSensitive
  set the caseSensitive to pCase
  set the itemDel to D
  put the length of D into tDelimLength
  repeat for each item i in S
     add length(i) + tDelimLength to C
     put C - (tDelimLength - 1),"" after R
  end repeat
  set the itemDel to comma
  if char -1 of S is D then return char 1 to -2 of R
  put length(C) + 1 into lenC
  put length(R) into lenR
  if lenC = lenR then return 0
  return char 1 to lenR - lenC - 1 of R
end offsetList
------------------------------

Kind regards
Bernd





Date: Thu, 1 Nov 2018 00:15:37 -0700
From: Geoff Canyon
To: How to use LiveCode <use-livecode@lists.runrev.com>
Subject: Re: How to find the offset of the last instance of a repeating character in a string?

I was curious if using the itemDelimiter might work for this, so I wrote
the code below; but in my quick testing with single-byte characters it
was only about 30% faster than the above methods, so I didn't bother to
post it.

But Ben Rubinstein just posted about a terrible slow-down doing pretty
much this same thing for text with unicode characters. So I ran a simple
test: with 8,000-character strings that start with a single unicode
character, this is about 15x faster than offset() with skip. For
100,000-character lines it's about 300x faster, so it seems to be immune
to the line-painter issues skip is subject to. So for what it's worth:

function offsetList D,S
  -- returns a comma-delimited list of the offsets of D in S
  set the itemDel to D
  repeat for each item i in S
     add length(i) + 1 to C
     put C,"" after R
  end repeat
  set the itemDel to comma
  if char -1 of S is D then return char 1 to -2 of R
  put length(C) + 1 into lenC
  put length(R) into lenR
  if lenC = lenR then return 0
  return char 1 to lenR - lenC - 1 of R
end offsetList


_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your
subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
