Re: Trying to code the fastest algorithm...

Alex Tweedly Sun, 05 Jun 2005 13:14:36 -0700

jbv wrote:


yes, you're right. That was a typo.
Actually my script is slightly more complex, and I had to strip
off a couple of things to make the problem easier to explain...
hence the typo...

So here's a clean version of my code :

put "" into newList
repeat for each line j in mySentences
   put j into a
   replace " " with itemdelimiter in a
   put "" into b

   repeat for each item i in a
       get itemoffset(i,myReference)
       if it>0 then
           put it & itemdelimiter after b
       end if
   end repeat
   delete last char in b

   if b contains itemdelimiter then
       put b & cr after newList
   end if
end repeat

I don't see anything here which "keeps only those sentences containing
more than 1 word" ?


it's the test :
   if b contains itemdelimiter then
which runs about 25% to 30% faster than
   if number of items of b > 1 then

cute !! very nice - I'll remember that one ....

Since you're asking for "fastest" algorithm, I assume at least one of
these numbers is large - and it's unlikely the same algorithm would
excel at both ends of the spectrum .....

I have 2 or 3 approaches in mind ... any clues you have on the
characteristics of the data would help decide which ones to pursue .....
e.g.   build an assoc array of the words which occur in myReferences,
where the content of the array element is the word number within the
references, and lookup each word in the sentence with that ... should be
faster than "itemOffset" but probably not enough to justify it if the
number of words in myReferences is very small.


The "array" approach crossed my mind, but I'm afraid it's not feasable :
the set of reference words is unpredictable. It is actually derived from a
sentence entered by the end user, and after some processing / selection
a set words is extracted from that sentence and then compared with other
sentences in a data base...

Still feasible ....( typing straight into email - typos possible )

put empty into myArray
put 0 into count
repeat for each item W in myReference
   add 1 to count
   put W && count & cr after myArray
end repeat
split myArray by CR and space

I'll try some simple experiments based on the figures you gave.... morelater.


--
Alex Tweedly       http://www.tweedly.net

No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.323 / Virus Database: 267.6.2 - Release Date: 04/06/2005

_______________________________________________
use-revolution mailing list
[email protected]
http://lists.runrev.com/mailman/listinfo/use-revolution

Re: Trying to code the fastest algorithm...

Reply via email to