Tom Lane writes:
> I wonder why you didn't propose Boyer-Moore instead, as that would have
> some advantage for natural language text as well.
> The difficulty with B-M is the need for a table indexed by character
> code, which at first glance looks impractical for wchars. But it seems
> to me that we could use "wchar % 256" as the table index, meaning that
> wchars with the same trailing byte share the same table entry. That
> would lose some efficiency compared to an exact implementation, but the
> limited table size would outweigh that except in the most pathological
> cases.
hash table?
The main difficulty with BM is verification and understanding "good
suffix shift" (the second part of BM) (I don't understand it entirely).
----
Ajtkulov Pavel
[EMAIL PROTECTED]
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly