"David Rowley" <[EMAIL PROTECTED]> writes: > I've made and attached the changes Heikki recommended.
I looked this over a bit and was immediately confused by one thing: the introductory comment says that the skip table size ought to be based on the length of the haystack, which makes sense to me, but the code is actually initializing it on the basis of len2, ie, the length of the needle. Isn't that a bug? Was the same bug present in the tests you made to determine the best table sizes? Stylistic issues: I really dislike having two copies of the heuristic size-setting code. I think it's worth introducing a second test of use_wchar in order to arrange text_position_setup like this: ... existing code ... choose skiptablesize initialize skip table (this loop doesn't depend on char width) if (!state->use_wchar) process char needle else process wide-char needle Also it might be worth having local-variable copies of skiptablesize and len2 for this initialization loop: for (ai = 0; ai <= state->skiptablesize; ai++) state->skiptable[ai] = state->len2; I'm not at all sure whether a C compiler will think that it's allowed to avoid fetching these values again on each iteration, seeing that the loop is storing into other integer members of *state. Given that this loop is exactly the overhead we're trying to control by adjusting skiptablesize, making it as tight as possible seems worthwhile. > Now that the skip table is a member of TextPositionState, I was not quite > sure if I should #define a size for it. It would certainly look neater, only > the code that defines the skip table size in text_position_start assumes 256 > elements. I tend to agree that a #define is pointless there, since there's no easy way to make the relevant code do anything with it. It would be worthwhile to point out adjacent to the code what the max length is, though. I had gotten as far as rewriting your introductory comment like this /* * Prepare the skip table for Boyer-Moore-Horspool searching. First we * must determine how big a skip table to use. The declaration of * TextPositionState allows up to 256 elements, but with haystack lengths * of only a few characters we don't really want to have to initialize so * many elements --- it would take too long in comparison to the actual * search time. So we choose a useful skip table size based on the * haystack length. before noticing that what I was writing wasn't what the code actually did. Another point that doesn't seem to be included in your comments is that the skip table size *must* be a power of 2 because we use bit masking. (In fact state->skiptablesize might better be named skiptablemask since it's really 1 less than the table size.) BTW, try to avoid introducing so much random vertical space. IMHO it does little for readability and much to make routines too long to fit in one screenful. Generally the PG code doesn't use double blank lines anywhere inside a function, nor blank lines before a right brace line. pg_indent will clean this up to some extent, but it would be better if the code looked more like the project style in the first place. regards, tom lane -- Sent via pgsql-patches mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches