> > I have tested in a locale-enabled environment and found a bug. Included
> > is the new version of the patches.
> Your patch causes a crash in tsearch2's installcheck with
> 'initdb -E UTF8 --locale C'. A simple way to reproduce:
> # select to_tsquery('default', '''New York''');
> server closed the connection unexpectedly
>          This probably means the server terminated abnormally
>          before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.

It seems to be a bug in the original tsearch2. Here is the patch.

------------------------------------------------------------------
*** wordparser/parser.c~        2007-01-07 09:54:39.000000000 +0900
--- wordparser/parser.c 2007-01-11 10:33:41.000000000 +0900
***************
*** 51,57 ****
        if (prs->charmaxlen > 1)
        {
                prs->usewide = true;
!               prs->wstr = (wchar_t *) palloc(sizeof(wchar_t) * prs->lenstr);
                prs->lenwstr = char2wchar(prs->wstr, prs->str, prs->lenstr);
        }
        else
--- 51,57 ----
        if (prs->charmaxlen > 1)
        {
                prs->usewide = true;
!               prs->wstr = (wchar_t *) palloc(sizeof(wchar_t) * (prs->lenstr+1));
                prs->lenwstr = char2wchar(prs->wstr, prs->str, prs->lenstr);
        }
        else
------------------------------------------------------------------
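
To illustrate the sizing rule the patch applies, here is a minimal standalone
sketch. It is not tsearch2 code: to_wide() is a hypothetical helper, and it
uses the standard malloc()/mbstowcs() in place of palloc()/char2wchar(). The
assumption is that the conversion may store a terminating null after the
converted characters, so a buffer of exactly len elements can be one
element short.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of tsearch2): convert a null-terminated
 * multibyte string of len bytes into a newly allocated wide string.
 * Returns NULL on allocation failure or invalid input.
 */
static wchar_t *
to_wide(const char *str, size_t len, size_t *wlen)
{
	wchar_t    *wstr;
	size_t		n;

	assert(str != NULL && wlen != NULL);

	/* len bytes yield at most len wide chars; +1 leaves room for L'\0' */
	wstr = malloc(sizeof(wchar_t) * (len + 1));
	if (wstr == NULL)
		return NULL;

	/* mbstowcs stores the terminator only if it fits within the limit */
	n = mbstowcs(wstr, str, len + 1);
	if (n == (size_t) -1)
	{
		free(wstr);
		return NULL;
	}

	wstr[n] = L'\0';			/* safe: n <= len, buffer holds len + 1 */
	*wlen = n;
	return wstr;
}
```

With an all-ASCII input such as "New York" (8 bytes), the result has 8 wide
characters plus the terminator, which is the case where a buffer of only
lenstr elements has no spare slot.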

> >> ! static int p_isalnum(TParser *prs) {
> ...
> >> !          if (lc_ctype_is_c())
> >> !          {
> >> !                  if (c > 0x7f)
> >> !                          return 1;
> 
> I have some doubts that every character greater than 0x7f is an alpha
> symbol.
> Is it a simplifying assumption or a workaround?

Yeah, it's a workaround. Since tsearch2 has no character classes other
than alpha/numeric/latin, Asian characters have to fall into one of
them.
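
A rough standalone sketch of the idea follows. Only the "c > 0x7f returns 1"
branch comes from the quoted patch fragment; sketch_isalnum(), its
ctype_is_c parameter (standing in for lc_ctype_is_c()), and the fallback
branches are assumptions made for illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <wctype.h>

/*
 * Hypothetical stand-in for the patched p_isalnum(). The real function
 * takes a TParser *; here the character and the locale flag are passed
 * in directly so the sketch compiles on its own.
 */
static int
sketch_isalnum(unsigned int c, bool ctype_is_c)
{
	if (ctype_is_c)
	{
		/* workaround: with LC_CTYPE = C, count every non-ASCII char as alnum */
		if (c > 0x7f)
			return 1;
		/* assumed fallback: plain ASCII classification */
		return isalnum((int) c) != 0;
	}

	/* assumed fallback: let the locale classify the wide character */
	return iswalnum((wint_t) c) != 0;
}
```

Under this sketch, U+3042 (a hiragana letter) is classified as alphanumeric
in the C locale, while an ASCII space or quote is not.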
--
Tatsuo Ishii
SRA OSS, Inc. Japan
