On Thu, Jul 9, 2009 at 8:20 AM, Palit, Nilanjan<[email protected]> wrote:
>
> -----Original Message-----
> From: Ronald J Kimball
> Sent: Thursday, July 09, 2009 10:18 AM
> Subject: Re: [Boston.pm] binary search on a list of sorted strings in memory
>
>> Another option would be a dictionary tree, in which each node is a single
>> letter, so each word is represented by a path through the tree.  I don't
>> know how suitable it would be for this purpose.  It is a cool structure
>> though.  :)
>
> Back from CS algorithm class, the Boyer-Moore algorithm seems ideally
> suited for a fast pattern matching scheme that you are looking for.
> Because it searches in reverse order and can quickly find & abandon
> target strings that do *not* match, it seems its worth a look.

Boyer-Moore is a great way to speed up looking for a substring
anywhere in another string.  But when the substring has to be at the
start of the other string, Boyer-Moore is *not* as efficient as the
naive "compare the start".

> Plenty of Google hits to describe it.
>
> Apparently, Perl's RE optimizer uses this for simple string searches
> (http://mkweb.bcgsc.ca/intranet/perlbook/prog/ch24_02.htm).

My understanding is that it is great with ASCII, but the transition
tables get unwieldy with Unicode.

Ben

_______________________________________________
Boston-pm mailing list
[email protected]
http://mail.pm.org/mailman/listinfo/boston-pm

Reply via email to