> "Find" commands accept wild cards and evaluate using collation algorithms
> (case-insensitive comparison plus some other locale-specific rules). Is it
> really fair to compare the two against object keys?
I'm not sure what "fair" means here, but it's definitely not an apples-to-apples comparison. Find in array and Find in sorted array are variants on the same thing, so they're easy to compare. Object keys? No idea. As far as I can tell, 4D refuses to offer any information about how they work that the company will stand behind publicly. That's okay, I'm used to black-box testing and I like it...but it's time-consuming.

The point of these comparisons isn't to figure out if one approach is "better" than another so much as how they work *in the real world.* Put another way, the goal is to come up with some rules of thumb about what to use when. Binary search kicks ass, and I know why. Object key lookups kick ass, and I don't know why. My take-aways:

* Use objects when they're easier or more appropriate for the problem at hand.

* Use sorted arrays when they're easier or more appropriate for the problem at hand.

* Don't shift from arrays to objects based on a notion that they're "faster."

* Consider objects instead of arrays if you don't have, or can't be sure of, a sorted array order, because object key lookups are way faster than sequential array traversal (Find in array).

* Don't worry about speed at all unless you've got a solid reason to.

Thinking back on my tests, a few points for anyone who wants to tweak them:

-- If you want sequential searches to look better, just search for the first items. Search time should be directly related to the position of the target in the array. I avoided this trap on purpose.

-- I used very small text values for lookups and keys! Long strings might behave differently; I don't know. I would actually find that an interesting result, if anyone feels like checking.

-- The object keys are inserted into the test object in sorted order. This should not make any difference if there's a hash underneath, but we don't actually know that. Although it does seem likely.
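For anyone who wants to see why the three lookup strategies scale so differently, here's a sketch in Python (standing in for the 4D commands, since 4D's internals are a black box; the function names are mine, not 4D's):

```python
import bisect

# Small text values in sorted order, like the test data described above.
keys = [f"k{i:05d}" for i in range(10_000)]

def find_in_array(arr, target):
    """Sequential scan, like Find in array: cost grows with the
    position of the target, which is the trap mentioned above."""
    for i, v in enumerate(arr):
        if v == target:
            return i
    return -1

def find_in_sorted_array(arr, target):
    """Binary search, like Find in sorted array: O(log n), but it
    requires that you have (and can trust) the sort order."""
    i = bisect.bisect_left(arr, target)
    return i if i < len(arr) and arr[i] == target else -1

# Object-key lookup: a hash table gives O(1) expected time with no
# sort-order requirement at all.
obj = {k: i for i, k in enumerate(keys)}

target = keys[-1]  # last item: worst case for the sequential scan
assert find_in_array(keys, target) == 9_999
assert find_in_sorted_array(keys, target) == 9_999
assert obj[target] == 9_999
```

Searching for the last item, the scan touches all 10,000 entries, the binary search about 14, and the dict lookup effectively one, which matches the rule of thumb above: prefer object keys over sequential traversal when you can't rely on a sorted order.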
From the few results I've gotten, I'd wildly guess that there's:

-- An excellent hash function, where "excellent" means low collision, high dispersal, and fast.

-- A secondary structure off the hash bins that is itself smart. So, not a linked list (the CS 101 approach), but a second good hash or a tree of some kind. Or something else.

-- A pretty large range of hash slots to reduce secondary lookup times.

-- Probably some smart scheme for changing the hash table size dynamically under stress. That's an expensive maneuver (or normally is; I can think of ways to make it not too expensive).

Just speculating, I'm probably wrong in every detail here. Doesn't matter. It's a black box.

**********************************************************************
4D Internet Users Group (4D iNUG)
FAQ:      http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options:  http://lists.4d.com/mailman/options/4d_tech
Unsub:    mailto:[email protected]
**********************************************************************
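To make the speculation about 4D's object internals concrete, here's a toy Python hash table with the guessed ingredients: hashed bins, a smarter-than-linked-list secondary structure (a sorted list searched by bisection), and dynamic resizing under load. All names and the load-factor-of-2 policy are my own assumptions; this is illustrative, not 4D's actual design.

```python
import bisect

class ToyHash:
    """Toy hash table: bins hold sorted (key, value) lists, and the
    bin count doubles when the load factor exceeds 2."""

    def __init__(self, slots=8):
        self.slots = slots
        self.bins = [[] for _ in range(slots)]
        self.count = 0

    def _bin(self, key):
        # The "hash function" step: pick a bin by hashing the key.
        return self.bins[hash(key) % self.slots]

    def put(self, key, value):
        b = self._bin(key)
        # Secondary structure: binary search within the bin, not a
        # linear walk down a linked list.
        i = bisect.bisect_left(b, (key,))
        if i < len(b) and b[i][0] == key:
            b[i] = (key, value)          # overwrite existing key
            return
        b.insert(i, (key, value))
        self.count += 1
        if self.count > 2 * self.slots:  # load factor > 2
            self._resize(self.slots * 2)

    def get(self, key):
        b = self._bin(key)
        i = bisect.bisect_left(b, (key,))
        if i < len(b) and b[i][0] == key:
            return b[i][1]
        raise KeyError(key)

    def _resize(self, slots):
        # The "expensive maneuver": rehash every entry into new bins.
        old = self.bins
        self.slots = slots
        self.bins = [[] for _ in range(slots)]
        self.count = 0
        for b in old:
            for k, v in b:
                self.put(k, v)

t = ToyHash()
for i in range(100):
    t.put(f"k{i}", i)
assert t.get("k42") == 42
assert t.slots > 8  # the table grew under load
```

Even this crude version keeps bins short enough that lookups stay fast as the table grows, which is consistent with the flat lookup times observed in the black-box tests.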

