I wrote:
> Teodor Sigaev <[EMAIL PROTECTED]> writes:
>> 2 Snowball's compiling infrastructure doesn't support Windows target.

> Yeah.  Another problem with using their original source code is that
> running the Snowball compiler during build would not work for
> cross-compiled builds of Postgres, at least not without solving the
> problem of building some code for the host platform instead of the
> target.

> So what I'm thinking now is we should import libstemmer instead of the
> snowball_code representation.  I haven't gotten as far as thinking about
> exactly how to lay out the files though.

I've done some more work on this point.  After looking at the Snowball
code in more detail, I'm thinking it'd be a good idea to keep it at arm's
length in a loadable shared library, instead of incorporating it directly
into the backend.  This is because they don't see anything wrong with
exporting random global function names like "eq_v" and "skip_utf8"; so
the probability of name collisions is a bit too high for my taste.

The current tsearch_core patch envisions having a couple of the snowball
stemmers in the core backend and the rest in a loadable library, but I
suggest we just put them all in a loadable library, with the only entry
points being snowball_init() and snowball_lexize() tsearch dictionary
support functions.  (I am thinking of having just one such function pair,
with the init function taking an init option to select which stemmer to
use, instead of a separate Postgres function pair per stemmer.)

Attached is a rough proof-of-concept patch for this.  It doesn't do
anything useful, but it does prove that we can compile and link the
Snowball stemmers into a Postgres loadable module with only trivial
changes to their source code.  The code compiles cleanly (zero warnings
in gcc).
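To make the "one function pair, stemmer chosen by an init option" idea
concrete, here is a minimal standalone sketch.  It is not taken from the
attached patch and uses no Postgres or Snowball headers: the names
demo_snowball_init(), demo_snowball_lexize(), DemoDict, and the two toy
stemmers are all hypothetical stand-ins.  The real dictionary support
functions would use the tsearch calling conventions and the libstemmer
stemmers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stemmer signature: returns the length of the stem of word. */
typedef size_t (*stem_fn) (const char *word, size_t len);

/*
 * Toy stand-ins for real stemmers.  They are static, so their names do not
 * leak into the global namespace the way "eq_v" or "skip_utf8" would.
 */
static size_t
stem_identity(const char *word, size_t len)
{
    (void) word;
    return len;                 /* leave the word unchanged */
}

static size_t
stem_strip_s(const char *word, size_t len)
{
    /* crude illustration only: drop one trailing 's' */
    if (len > 1 && word[len - 1] == 's')
        return len - 1;
    return len;
}

/* Table mapping the init option value to a stemmer. */
typedef struct
{
    const char *name;
    stem_fn     stem;
} stemmer_entry;

static const stemmer_entry stemmer_table[] = {
    {"toy_identity", stem_identity},
    {"toy_strip_s", stem_strip_s},
    {NULL, NULL}                /* list terminator */
};

/* Per-dictionary state produced by the init function. */
typedef struct
{
    stem_fn     stem;
} DemoDict;

/* Entry point 1: select a stemmer by name; NULL if unknown or out of memory. */
DemoDict *
demo_snowball_init(const char *language)
{
    const stemmer_entry *e;

    for (e = stemmer_table; e->name != NULL; e++)
    {
        if (strcmp(e->name, language) == 0)
        {
            DemoDict   *d = malloc(sizeof(DemoDict));

            if (d != NULL)
                d->stem = e->stem;
            return d;
        }
    }
    return NULL;
}

/* Entry point 2: stem word into out; returns 0 on success, -1 if out is too small. */
int
demo_snowball_lexize(const DemoDict *d, const char *word,
                     char *out, size_t outlen)
{
    size_t      stemlen;

    assert(d != NULL && word != NULL && out != NULL);
    stemlen = d->stem(word, strlen(word));
    if (stemlen + 1 > outlen)
        return -1;
    memcpy(out, word, stemlen);
    out[stemlen] = '\0';
    return 0;
}
```

A caller would invoke demo_snowball_init() once with the stemmer name,
then pass the returned state to demo_snowball_lexize() for each word and
free() it when done.  Only the two entry points are externally visible;
adding a stemmer means adding a table row, not another function pair.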
The file layout is

src/backend/snowball/Makefile               our files
src/backend/snowball/README
src/backend/snowball/dict_snowball.c
src/backend/snowball/libstemmer/*.c         their .c files
src/include/snowball/header.h               intercepting .h file
src/include/snowball/libstemmer/*.h         their .h files

If there're no objections, I'll push forward with completing the
dictionary support functions to go with this infrastructure.

			regards, tom lane
Description: snowball-add.tar.gz