Re: How to get word offset all instances of a string in a chunk of text?

2018-08-31 Thread Richard Gaskin via use-livecode
Mike Kerner wrote: > Since the topic of processes came up a few weeks ago I've been > thinking about what it would take to build a process/threading > framework. I wonder if a text processing subprocessor, written > and compiled... I haven't yet come across good use cases for the desktop, but

Re: How to get word offset all instances of a string in a chunk of text?

2018-08-31 Thread Mike Kerner via use-livecode
Since the topic of processes came up a few weeks ago I've been thinking about what it would take to build a process/threading framework. I wonder if a text processing subprocessor, written and compiled in 6 would be worth everyone's time. The main app would hand off the data and the command to

Re: How to get word offset all instances of a string in a chunk of text?

2018-08-31 Thread Keith Clarke via use-livecode
Thanks Alex, HH & Jim for all the help & ideas. Just to close out the thread with a solution for future reference, the code below now extracts from a text source a list of unique words, cleaned up against a noise-word list, with word frequency, word & and a comma-delimited string of the word
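
The code from this message is cut off in the archive preview. As a rough illustration of the kind of result it describes (unique words, filtered against a noise-word list, with a frequency and a comma-delimited list of word offsets), here is a minimal sketch. It is not Keith's code: the handler name wordIndex, its parameters, the space- or line-separated noise-word format, and the tab-delimited output layout are all assumptions, and it relies on the trueWord chunk from LiveCode 7 or later.

```livecode
-- Hypothetical sketch, not the code posted in the thread.
-- pNoiseWords: words to skip, separated by spaces or line breaks (assumed format).
-- Returns one line per unique word: frequency, tab, word, tab, offsets.
function wordIndex pText, pNoiseWords
   local tOffset, tKey, tCount, tOffsets, tResult
   put 0 into tOffset
   repeat for each trueWord tWord in pText
      -- count every word, noise or not, so offsets stay true positions
      add 1 to tOffset
      if tWord is among the words of pNoiseWords then next repeat
      -- fold case so "The" and "the" share one entry
      put toLower(tWord) into tKey
      add 1 to tCount[tKey]
      put tOffset & comma after tOffsets[tKey]
   end repeat
   repeat for each key tKey in tCount
      -- char 1 to -2 drops the trailing comma of the offset list
      put tCount[tKey] & tab & tKey & tab & char 1 to -2 of tOffsets[tKey] & cr after tResult
   end repeat
   delete the last char of tResult
   -- most frequent words first
   sort lines of tResult descending numeric by word 1 of each
   return tResult
end wordIndex
```

A call such as wordIndex(field "Source", "a an and of the") would then return the index as text; the field name here is only an example.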

Re: How to get word offset all instances of a string in a chunk of text?

2018-08-30 Thread Curry Kenworthy via use-livecode
Jim: > This just doesn’t work in all cases That's the key though, don't repeat when it's not necessary! A day with no repeats is an efficient day. ;) Best wishes, Curry Kenworthy Custom Software Development LiveCode Training and Consulting http://livecodeconsulting.com/

Re: How to get word offset all instances of a string in a chunk of text?

2018-08-30 Thread Jim Lambert via use-livecode
> I wrote: > > Then there is also this repeat-less approach using arrays and filter: > function findWordOffsets pText, pSearchTerm > put replaceText(pText,"\W+"," ") into pText > split pText by space > combine pText with cr and tab > filter pText with "*" & tab &

Re: How to get word offset all instances of a string in a chunk of text?

2018-08-30 Thread Jim Lambert via use-livecode
> On 30/08/2018 10:24, Keith Clarke via use-livecode wrote: >> Folks, >> Is there a single-pass mechanism or more efficient way of returning the >> wordOffset of each instance of ‘the’ in ‘the quick brown fox jumped over the >> lazy dog’ than to use two passes through the text? Then there is

Re: How to get word offset all instances of a string in a chunk of text?

2018-08-30 Thread Curry Kenworthy via use-livecode
hh: > Sadly LC 9 is about 10 times slower > than LC 6 with such fast scripts. Yes, I've been doing some benchmarks and LC 9 usually takes anywhere from 2x to 8x as long to perform a job. With or without text being involved. It is a serious problem that should not be neglected across

Re: How to get word offset all instances of a string in a chunk of text?

2018-08-30 Thread hh via use-livecode
> Alex T. wrote:
>
> put 0 into tOffset
> repeat for each trueWord W in tSource
>    add 1 to tOffset
>    if W = myWord then
>       put tOffset & comma after tOffsetList
>    end if
> end repeat

This is (whether trueWord or word chunks used) probably the fastest method for an offset counting
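
For anyone who wants to try the quoted loop directly, here is a minimal sketch that wraps it in a function. The handler name wordOffsets and the parameter names are hypothetical, not from the thread, and the sketch assumes LiveCode 7 or later, where the trueWord chunk is available.

```livecode
-- Hypothetical wrapper around the single-pass loop quoted above.
-- Returns a comma-delimited list of the word offsets of pWord in pSource.
function wordOffsets pSource, pWord
   local tOffset, tOffsetList
   put 0 into tOffset
   repeat for each trueWord tWord in pSource
      add 1 to tOffset
      -- "=" ignores case unless the caseSensitive property is set to true
      if tWord = pWord then
         put tOffset & comma after tOffsetList
      end if
   end repeat
   -- drop the trailing comma left by the loop
   if tOffsetList is not empty then delete the last char of tOffsetList
   return tOffsetList
end wordOffsets
```

With the sentence from the original question, put wordOffsets("the quick brown fox jumped over the lazy dog", "the") should yield 1,7, since ‘the’ is the first and seventh word.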

Re: How to get word offset all instances of a string in a chunk of text?

2018-08-30 Thread hh via use-livecode
For a more general context see http://www.runrev.com/pipermail/use-livecode//2004-February/032280.html Sadly LC 9 is about 10 times slower than LC 6 with such fast scripts. For example LC 6.7.11 needs about 500 ms to evaluate a 1 MByte string, LC 9.0.0 needs about 5 seconds.

Re: How to get word offset all instances of a string in a chunk of text?

2018-08-30 Thread Alex Tweedly via use-livecode
OK, this time I'm just typing into email - haven't tested these suggestions :-) On 30/08/2018 10:24, Keith Clarke via use-livecode wrote: Folks, Is there a single-pass mechanism or more efficient way of returning the wordOffset of each instance of ‘the’ in ‘the quick brown fox jumped over the

How to get word offset all instances of a string in a chunk of text?

2018-08-30 Thread Keith Clarke via use-livecode
Folks, Is there a single-pass mechanism or more efficient way of returning the wordOffset of each instance of ‘the’ in ‘the quick brown fox jumped over the lazy dog’ than to use two passes through the text? Pass-1. Count the instances of ‘the’ into an array and then Pass-2. Repeat for the count