I wrote:
> Teodor Sigaev <[EMAIL PROTECTED]> writes:
>> 2 Snowball's compiling infrastructure doesn't support Windows target.

> Yeah.  Another problem with using their original source code is that
> running the Snowball compiler during build would not work for
> cross-compiled builds of Postgres, at least not without solving the
> problem of building some code for the host platform instead of the
> target.

> So what I'm thinking now is we should import libstemmer instead of the
> snowball_code representation.  I haven't gotten as far as thinking about
> exactly how to lay out the files though.

I've done some more work on this point.  After looking at the Snowball
code in more detail, I'm thinking it'd be a good idea to keep it at
arm's length in a loadable shared library, instead of incorporating it
directly into the backend.  This is because they don't see anything
wrong with exporting random global function names like "eq_v" and
"skip_utf8"; so the probability of name collisions is a bit too high for
my taste.  The current tsearch_core patch envisions having a couple of
the snowball stemmers in the core backend and the rest in a loadable
library, but I suggest we just put them all in a loadable library, with
the only entry points being the snowball_init() and snowball_lexize()
tsearch dictionary support functions.  (I am thinking of having just one
such function pair, with the init function taking an init option to
select which stemmer to use, instead of a separate Postgres function
pair per stemmer.)
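
To make the "one function pair, stemmer chosen by an init option" idea
concrete, here is a minimal standalone sketch.  It is not taken from the
attached patch and does not use the Postgres or libstemmer APIs: the
demo_* names, the demo_stemmer struct, and the toy stemming rules are all
hypothetical stand-ins.  The point is only that the per-language routines
can stay file-local (static) behind a lookup table, so the module exports
just two symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one stemmer; NOT the libstemmer API. */
typedef struct demo_stemmer
{
    const char *language;
    /* returns the length of the stem within word */
    size_t      (*stem) (const char *word, size_t len);
} demo_stemmer;

/* Toy rule: strip one trailing 's' from words longer than 3 chars. */
static size_t
demo_stem_english(const char *word, size_t len)
{
    if (len > 3 && word[len - 1] == 's')
        return len - 1;
    return len;
}

/* Toy rule: leave the word unchanged. */
static size_t
demo_stem_none(const char *word, size_t len)
{
    (void) word;
    return len;
}

/* File-local table; nothing here is visible outside this module. */
static const demo_stemmer demo_stemmers[] = {
    {"english", demo_stem_english},
    {"simple", demo_stem_none},
};

/* Init entry point: select a stemmer by name, NULL if unknown. */
const demo_stemmer *
demo_snowball_init(const char *language)
{
    size_t      i;

    assert(language != NULL);
    for (i = 0; i < sizeof(demo_stemmers) / sizeof(demo_stemmers[0]); i++)
    {
        if (strcmp(demo_stemmers[i].language, language) == 0)
            return &demo_stemmers[i];
    }
    return NULL;
}

/*
 * Lexize entry point: write the NUL-terminated stem into out.
 * Returns the stem length, or -1 if out is too small.
 */
int
demo_snowball_lexize(const demo_stemmer *st, const char *word,
                     char *out, size_t outsize)
{
    size_t      n;

    assert(st != NULL && word != NULL && out != NULL);
    n = st->stem(word, strlen(word));
    if (n + 1 > outsize)
        return -1;
    memcpy(out, word, n);
    out[n] = '\0';
    return (int) n;
}
```

A caller would do something like demo_snowball_init("english") once and
then pass the returned pointer to demo_snowball_lexize() for each word;
adding a language means adding one table row, not a new function pair.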

Attached is a rough proof-of-concept patch for this.  It doesn't do
anything useful, but it does prove that we can compile and link the
Snowball stemmers into a Postgres loadable module with only trivial
changes to their source code.  The code compiles cleanly (zero warnings
in gcc).  The file layout is

src/backend/snowball/Makefile           our files
src/backend/snowball/README
src/backend/snowball/dict_snowball.c
src/backend/snowball/libstemmer/*.c     their .c files

src/include/snowball/header.h           intercepting .h file
src/include/snowball/libstemmer/*.h     their .h files
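
As a rough illustration of what an "intercepting" header can do in
general (this is an assumption for illustration, not a description of
the header.h in the attached tarball): it gets included ahead of the
third-party code and uses macros to redirect calls such as malloc/free
to our own routines, so their .c files need little or no editing.  The
demo_* and thirdparty_* names below are hypothetical, and the counting
allocator merely stands in for whatever allocator the host would supply.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Host-side allocator wrappers (hypothetical); count live blocks. */
static size_t demo_live_allocs = 0;

static void *
demo_alloc(size_t n)
{
    void       *p = malloc(n);  /* real malloc: macro not defined yet */

    if (p != NULL)
        demo_live_allocs++;
    return p;
}

static void
demo_free(void *p)
{
    if (p != NULL)
    {
        demo_live_allocs--;
        free(p);                /* real free: macro not defined yet */
    }
}

/* The "intercepting header" part: everything below sees these. */
#define malloc(n) demo_alloc(n)
#define free(p)   demo_free(p)

/* Stand-in for unmodified third-party code using plain malloc/free. */
static char *
thirdparty_copy(const char *s)
{
    size_t      n;
    char       *copy;

    assert(s != NULL);
    n = strlen(s) + 1;
    copy = malloc(n);           /* expands to demo_alloc(n) */
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

static void
thirdparty_release(char *p)
{
    free(p);                    /* expands to demo_free(p) */
}
```

The third-party functions never mention demo_alloc, yet every
allocation they make goes through it; that is the sense in which the
header sits between their code and ours.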

If there are no objections, I'll push forward with completing the
dictionary support functions to go with this infrastructure.

                        regards, tom lane

Attachment: snowball-add.tar.gz
