Hi Michael, Doing an index for every possible substring would be slow and add a large overhead: it would generally be unsuitable for substring matching. In Visual Studio 2010 intellisense only matches based on starts with OR pascal cased words inside the string; so that does make it more viable for indexing the same string multiple times based on words inside the string . If the requirement is exactly the same as 2010, then a tree would be a better approach as you could index each Word in each string based purely on Case. If the requirement is a case insensitive substring search, then a tree is likely to be much slower and have a large memory overhead. For typical business applications the "Like" matching via SQL, or a case insensitive sub string is what is usually wanted, definitely not a case sensitive.
|-----Original Message----- |From: [email protected] [mailto:ozdotnet- |[email protected]] On Behalf Of Michael Minutillo |Sent: Monday, 17 May 2010 7:07 PM |To: ozDotNet |Subject: Re: Filtering algorithm strategies to mimic intellisense | |This is how it works and I'd bet that the IntelliSense index is based on some kind |of trie (or a http://en.wikipedia.org/wiki/Radix_tree) built from this data. You |can still do a "contains" search with a trie but you'd have to index every |possible substring (which could become a time/space concern): i.e. | |void AddString(string newString) { | AddCore(newString); // Actually adds the string to the trie | if(newString.Length > 1) | AddString(newString.SubString(1)); |} | |On Mon, May 17, 2010 at 10:04 AM, Matt Siebert <[email protected]> |wrote: | | | I think VS2010 uses more than just simple 'contains' logic for matching |substrings. It distinguishes between upper and lower case chars to delimit |substrings. | | For example, suppose I have a variable named "fooBar", then |intellisense filters nicely when I type "foo" or "bar". "FB" kind of works but |doesn't actually select "fooBar". "fBar" and "ooBar" don't seem to do anything. | | I suspect its done this way so intellisense can break up the identifiers |into reasonable substrings that are likely to be used for lookups. These |substrings can then be used to index the identifiers using an appropriate |structure. Matches on these indexed substrings are then done using |StartsWith() instead of Contains(), so if the substrings are sorted then a more |efficient search can be used. | | On Mon, May 17, 2010 at 11:13 AM, Winston Pang |<[email protected]> wrote: | | | yeah i did realise it must be a O(n^2) operation. | | Still trying to figure out how I can improve it. Since a contains |operation will naturally be an O(n). | | I'm wondering if there's other forms of optimizations i can do as |well. | | | On Mon, May 17, 2010 at 10:35 AM, Mitch Wheat |<[email protected]> wrote: | | | Sorry, I meant order O(N) for the linear search. | | | | If you search degrades rapidly as the list increases, that |doesn't sound like O(N) performance, but more like O(N^2). | | | | Mitch | | | | From: [email protected] [mailto:ozdotnet- |[email protected]] On Behalf Of Mitch Wheat | Sent: Monday, 17 May 2010 8:33 AM | To: 'ozDotNet' | Subject: RE: Filtering algorithm strategies to mimic |intellisense | | | | Performing a linear search is O(N^2), so it will perform |poorly as the list grows, as you have observed. | | | | Either keep the list sorted and use binary search O(log N) |or use some other standard data structure such as a tree. | | | | Mitch | | | | From: [email protected] [mailto:ozdotnet- |[email protected]] On Behalf Of Winston Pang | Sent: Monday, 17 May 2010 7:41 AM | To: ozDotNet | Subject: Filtering algorithm strategies to mimic |intellisense | | | | Hey everyone, | | So I'm building a intellisense like autocomplete. I've |stumbled on some perf issues because its literally just iterating over the list and |re-applying a filter function on every item, based on every key stroke. The perf |degrades obviously as more times is in the list. | | I was wondering, does anyone have any strategies or past |expeirence with smarter ways to filter a list. Now this list is just a list of a |custom objects, with a particular display field that is a string. Filtering is filtered |based on a contains for each item. So it's essentially mimicing VS2010's |contains for intellisense. | | The VS2010 intellisense seems to be extremely speedy. | | Cheers, | | | Winston | | | | | | |-- |Michael M. Minutillo |Indiscriminate Information Sponge |Blog: http://wolfbyte-net.blogspot.com
