Say you had a custom property set, "theProps", and the keys of this set were 1 through 500,000.
So, basically, you had an array with half a million elements stored as a custom property set. Then you wanted to search that array in such a way that the search returned the name and content of the first custom property that contains the search term.
Would the following method be fastest?
  Set the custompropertyset to "theProps"
  Put 0 into Z
  Repeat for each element E in the customproperties of <myobject>
    Add 1 to Z
    If E contains <searchterm> then exit repeat
  End repeat
  Put Z && the customproperties[Z] of <myobject> into field "feedback"
My questions are these:
1) Is there a better or faster-access way to store the array than as a custom property set?
Using "repeat for each line" on a string benchmarked about 20% faster here than using "repeat for each element" on an array.
I only tested on 40,000 lines, though. In general "repeat for each..." scales well, so I would feel fairly confident extrapolating my results to larger data sets.
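A minimal sketch of the "repeat for each line" approach, assuming the data is already a return-delimited string with one record per line. The handler name firstMatchingLine and the parameters pData and pTerm are made up for illustration; this is not the code that was benchmarked.

```livecode
-- Hypothetical helper: pData holds one record per line, pTerm is the text to find.
-- Returns the line number, a tab, and the line text for the first match,
-- or empty if no line contains pTerm.
function firstMatchingLine pData, pTerm
   local tLineNum
   put 0 into tLineNum
   repeat for each line tLine in pData
      -- keep our own counter so the engine never has to count lines for us
      add 1 to tLineNum
      if tLine contains pTerm then
         return tLineNum & tab & tLine
      end if
   end repeat
   return empty
end firstMatchingLine
```

If the data starts out as an array, "combine tArray using return and tab" should turn it into lines of key, tab, value first, provided none of the values contain a return or a tab. In that case the key travels with each line, so the counter is only needed when the line number itself matters.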
2) Is there a search method that is faster than doing all those comparisons in Transcript? For example, lineOffset and itemOffset are supposed to be very fast. Is it possible to use itemOffset on an array, or is there anything that works like an elementOffset command would work, if it existed?
3) Would it be faster to combine the array, and use itemOffset?
Offset can rip through large blocks of text very fast, but it's not very precise.
For example, in my case I had to do comparisons on specific items within a line. LineOffset will get you to that line, but won't tell you where within that line it is.
With a low number of hits lineOffset can be faster to find the line, and then you could evaluate specific elements to find the item if needed.
But for larger numbers of hits it should be slower, since once you find the line you still need to "get line x", and that requires the engine to count lines. Requiring the engine to count lines is the bottleneck.
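To make the trade-off concrete, here is a rough sketch of collecting every hit with lineOffset, using its optional third parameter (the number of lines to skip). The handler name allMatchingLines and its parameters are hypothetical, and pData is assumed to be a return-delimited string.

```livecode
-- Hypothetical helper: returns the line numbers of all lines containing pTerm,
-- one number per line, or empty if there are none.
function allMatchingLines pData, pTerm
   local tSkip, tFound, tHits
   put 0 into tSkip
   repeat
      -- lineOffset reports the match relative to the lines skipped
      put lineOffset(pTerm, pData, tSkip) into tFound
      if tFound is 0 then exit repeat
      -- convert the relative result to an absolute line number
      add tFound to tSkip
      put tSkip & return after tHits
   end repeat
   -- drop the trailing return
   if tHits is not empty then delete the last char of tHits
   return tHits
end allMatchingLines
```

Each pass has to skip over the lines already searched, and any later "get line x of pData" has to count lines again, which is where the cost grows as the number of hits goes up.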
So for my purposes, using "repeat for each line" gave me a consistently scalable solution that allowed me to query any items within a line without ever having to count lines. So, lazy person that I am, I stopped there and moved on to other things. :)
4) Could filter be made to work in this situation, or would it only give the value of the element, but not the name of the element?
Filter may benchmark the fastest if you're looking for an item anywhere in a line (never tried it myself, since I need to find matches for specific items within lines, but worth testing).
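An untested sketch of what the filter approach might look like, assuming the array has been flattened into a return-delimited string in which each line is a key, a tab, and the value. The handler name linesContaining and its parameters are hypothetical.

```livecode
-- Hypothetical helper: returns only the lines of pData that contain pTerm.
-- Assumes pTerm has no wildcard characters (*, ?, [ or ]), since filter
-- treats those as part of the pattern.
function linesContaining pData, pTerm
   -- pData is passed by value, so filtering it here leaves the caller's copy alone
   filter pData with "*" & pTerm & "*"
   return pData
end linesContaining
```

Because each surviving line would still start with its key, this returns the name of the element as well as its value. The catch is that the pattern is applied to the whole line, so a search term that happens to appear in a key would match too.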
5) Anything faster that I am not thinking of?
Probably. Search algorithms are a deep subject, and there's always one more clever way to solve a given problem.
I stopped benchmarking these things once I found that "repeat for each line" was coming out okay. Because I have so many comparisons to perform on specific items within a line, it gave me a robust (though admittedly brute-force) solution with acceptable performance on data sets larger than my app's audience will need in real-world use.
But with half a million records it raises the question: Why not consider a database, where searching is done by compiled code optimized by people who specialize in such things?
--
Richard Gaskin
Fourth World Media Corporation
Rev tools and more: http://www.fourthworld.com/rev

use-revolution mailing list
http://lists.runrev.com/mailman/listinfo/use-revolution
