Is there any way to install the dictionary without running make? That is, are binary versions of it available? I am running PostgreSQL on Windows servers.
On 9/13/07, Oleg Bartunov <[EMAIL PROTECTED]> wrote:
>
> On Thu, 13 Sep 2007, Laimonas Simutis wrote:
>
> > Hey guys,
> >
> > maybe anyone using tsearch2 could advise on this. With the default
> > installation, url, host and some other tokens are processed with the
> > simple dictionary. Thus a term like mywebsite.com gets stored as
> > 'mywebsite.com'. The parser correctly assigns a token id of type host
> > to the term, but the dictionary the term gets routed through is simple,
> > and what gets stored is mywebsite.com
> >
> > The questions are:
> >
> > 1) is there a dictionary available that I could utilize that will remove
> > .com, .net, .org, etc? I could write one myself, but after seeing some
> > sample dictionary implementations and the C code, which I try to avoid,
> > I got scared a bit.
>
> Yes, we have dict_regex, which was developed by Sergey Karpov, see details
> http://lynx.sao.ru/~karpov/software/postgres_dict_regex.html
> It uses the pcre library and you need to know perl regexps.
>
> > 2) has anyone else dealt with this maybe in a different way?
>
> sure, preprocess text using your preferred language before passing it to
> to_tsvector
>
> > Thanks for any suggestions and help,
> >
> > Laimis
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
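
For reference, the preprocessing route suggested in the quoted reply (clean the text before it reaches to_tsvector) might look roughly like the sketch below. This is only an illustration under assumptions: the function name strip_tlds, the suffix list, and the regular expression are all made up for this example and are not part of tsearch2 or dict_regex. It needs no compiler, so it would also work on a Windows server.

```python
import re

# Hypothetical suffix list (an assumption for this sketch); extend as needed.
TLDS = ("com", "net", "org")

# Matches a host-like token that ends in one of the suffixes above.
# Group 1 captures everything before the final ".com" / ".net" / ".org".
# The negative lookahead keeps names such as "foo.com.au" or
# "mywebsite.community" untouched, while still allowing a trailing
# sentence period after the host name.
_HOST_RE = re.compile(
    r"\b((?:[a-z0-9-]+\.)*[a-z0-9-]+)\.(?:%s)(?!\.?[\w-])" % "|".join(TLDS),
    re.IGNORECASE,
)


def strip_tlds(text):
    """Return text with the listed suffixes removed from host-like tokens."""
    return _HOST_RE.sub(r"\1", text)


if __name__ == "__main__":
    print(strip_tlds("Thus a term like mywebsite.com gets stored."))
```

The cleaned string would then be passed to to_tsvector as an ordinary query parameter from whatever database driver is in use. The same cleaning has to be applied to the search terms at query time, otherwise a search for mywebsite.com will not match the stored 'mywebsite'.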