RE: Filtering algorithm strategies to mimic intellisense

Bill McCarthy Mon, 17 May 2010 03:24:20 -0700

Hi Michael,

Doing an index for every possible substring would be slow and add a large
overhead: it would generally be unsuitable for substring matching. In Visual
Studio 2010 intellisense only matches based on starts with OR pascal cased
words inside the string; so that does make it more viable for indexing the
same string multiple times based on words inside the string . 
If the requirement is exactly the same as 2010, then a tree would be a
better approach as you could index each Word in each string based purely on
Case. If the requirement is a case insensitive substring search, then a tree
is likely to be much slower and have a large memory overhead.  For typical
business applications the "Like" matching via SQL, or a case insensitive sub
string is what is usually wanted, definitely not a case sensitive.



|-----Original Message-----
|From: [email protected] [mailto:ozdotnet-
|[email protected]] On Behalf Of Michael Minutillo
|Sent: Monday, 17 May 2010 7:07 PM
|To: ozDotNet
|Subject: Re: Filtering algorithm strategies to mimic intellisense
|
|This is how it works and I'd bet that the IntelliSense index is based on
some kind
|of trie (or a http://en.wikipedia.org/wiki/Radix_tree) built from this
data. You
|can still do a "contains" search with a trie but you'd have to index every
|possible substring (which could become a time/space concern): i.e.
|
|void AddString(string newString) {
|  AddCore(newString); // Actually adds the string to the trie
|  if(newString.Length > 1)
|    AddString(newString.SubString(1));
|}
|
|On Mon, May 17, 2010 at 10:04 AM, Matt Siebert <[email protected]>
|wrote:
|
|
|       I think VS2010 uses more than just simple 'contains' logic for
matching
|substrings.  It distinguishes between upper and lower case chars to delimit
|substrings.
|
|       For example, suppose I have a variable named "fooBar", then
|intellisense filters nicely when I type "foo" or "bar".  "FB" kind of works
but
|doesn't actually select "fooBar".  "fBar" and "ooBar" don't seem to do
anything.
|
|       I suspect its done this way so intellisense can break up the
identifiers
|into reasonable substrings that are likely to be used for lookups.  These
|substrings can then be used to index the identifiers using an appropriate
|structure.  Matches on these indexed substrings are then done using
|StartsWith() instead of Contains(), so if the substrings are sorted then a
more
|efficient search can be used.
|
|       On Mon, May 17, 2010 at 11:13 AM, Winston Pang
|<[email protected]> wrote:
|
|
|               yeah i did realise it must be a O(n^2) operation.
|
|               Still trying to figure out how I can improve it. Since a
contains
|operation will naturally be an O(n).
|
|               I'm wondering if there's other forms of optimizations i can
do as
|well.
|
|
|               On Mon, May 17, 2010 at 10:35 AM, Mitch Wheat
|<[email protected]> wrote:
|
|
|                       Sorry, I meant order O(N) for the linear search.
|
|
|
|                       If you search degrades rapidly as the list
increases, that
|doesn't sound like O(N) performance, but more like O(N^2).
|
|
|
|                       Mitch
|
|
|
|                       From: [email protected]
[mailto:ozdotnet-
|[email protected]] On Behalf Of Mitch Wheat
|                       Sent: Monday, 17 May 2010 8:33 AM
|                       To: 'ozDotNet'
|                       Subject: RE: Filtering algorithm strategies to mimic
|intellisense
|
|
|
|                       Performing a linear search is O(N^2), so it will
perform
|poorly as the list grows, as you have observed.
|
|
|
|                       Either keep the list sorted and use binary search
O(log N)
|or use some other standard data structure such as a tree.
|
|
|
|                       Mitch
|
|
|
|                       From: [email protected]
[mailto:ozdotnet-
|[email protected]] On Behalf Of Winston Pang
|                       Sent: Monday, 17 May 2010 7:41 AM
|                       To: ozDotNet
|                       Subject: Filtering algorithm strategies to mimic
|intellisense
|
|
|
|                       Hey everyone,
|
|                       So I'm building a intellisense like autocomplete.
I've
|stumbled on some perf issues because its literally just iterating over the
list and
|re-applying a filter function on every item, based on every key stroke. The
perf
|degrades obviously as more times is in the list.
|
|                       I was wondering, does anyone have any strategies or
past
|expeirence with smarter ways to filter a list. Now this list is just a list
of a
|custom objects, with a particular display field that is a string. Filtering
is filtered
|based on a contains for each item. So it's essentially mimicing VS2010's
|contains for intellisense.
|
|                       The VS2010 intellisense seems to be extremely
speedy.
|
|                       Cheers,
|
|
|                       Winston
|
|
|
|
|
|
|--
|Michael M. Minutillo
|Indiscriminate Information Sponge
|Blog: http://wolfbyte-net.blogspot.com

RE: Filtering algorithm strategies to mimic intellisense

Reply via email to