Re: [HACKERS] Extending range of to_tsvector et al

2012-10-01 Thread Tom Lane
john knightley writes: > On Mon, Oct 1, 2012 at 11:58 AM, Dan Scott wrote: >> So... perhaps LC_CTYPE=C is a possible workaround for you? > LC_CTYPE would not be a workaround - this database needs to be in > utf8, the full text search is to be used for a mediawiki. You're confusing locale and

Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread john knightley
On Mon, Oct 1, 2012 at 11:58 AM, Dan Scott wrote: > Hi John: > > On Sun, Sep 30, 2012 at 11:45 PM, john knightley > wrote: >> Dear Dan, >> >> thank you for your reply. >> >> The OS I am using is Ubuntu 12.04, with PostgreSQL 9.1.5 installed on >> a utf8 locale >> >> A short 5 line dictionary file

Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread john knightley
On Mon, Oct 1, 2012 at 12:11 PM, Tom Lane wrote: > john knightley writes: >> The OS I am using is Ubuntu 12.04, with PostgreSQL 9.1.5 installed on >> a utf8 locale > >> A short 5 line dictionary file is sufficient to test:- > >> raeuz >> 我们 >> 𦘭𥎵 >> 𪽖𫖂 >> 󶒘󴮬 > >> line 1 "raeuz" Zhuang word writte

Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread Tom Lane
john knightley writes: > The OS I am using is Ubuntu 12.04, with PostgreSQL 9.1.5 installed on > a utf8 locale > A short 5 line dictionary file is sufficient to test:- > raeuz > 我们 > 𦘭𥎵 > 𪽖𫖂 > 󶒘󴮬 > line 1 "raeuz" Zhuang word written using English letters and show up > unde

Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread Dan Scott
Hi John: On Sun, Sep 30, 2012 at 11:45 PM, john knightley wrote: > Dear Dan, > > thank you for your reply. > > The OS I am using is Ubuntu 12.04, with PostgreSQL 9.1.5 installed on > a utf8 locale > > A short 5 line dictionary file is sufficient to test:- > > raeuz > 我们 > 𦘭𥎵 > 𪽖𫖂 > 󶒘󴮬 > > line 1

Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread john knightley
Dear Dan, thank you for your reply. The OS I am using is Ubuntu 12.04, with PostgreSQL 9.1.5 installed on a utf8 locale. A short 5 line dictionary file is sufficient to test:- raeuz 我们 𦘭𥎵 𪽖𫖂 󶒘󴮬 line 1 "raeuz" Zhuang word written using English letters and shows up under to_tsvector ok line 2 "我们" u
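
A small Python sketch, not part of the thread, for checking which Unicode plane and general category each character of the five test words falls in. It uses only the standard `unicodedata` module; the helper names `describe` and `is_private_use` are made up for illustration. The category reported depends on the Unicode version bundled with the Python in use, and the sketch says nothing about how PostgreSQL's own parser classifies these characters.

```python
import unicodedata

# The five test words quoted in the message above.
WORDS = ["raeuz", "我们", "𦘭𥎵", "𪽖𫖂", "󶒘󴮬"]


def describe(ch):
    """Return (code point, plane number, general category) for one character."""
    cp = ord(ch)
    return (f"U+{cp:04X}", cp >> 16, unicodedata.category(ch))


def is_private_use(ch):
    """True if the character's general category is Co (private use)."""
    return unicodedata.category(ch) == "Co"


if __name__ == "__main__":
    # Print only ASCII descriptions so the output is safe on any terminal.
    for index, word in enumerate(WORDS, start=1):
        for ch in word:
            print(f"line {index}:", *describe(ch), "PUA" if is_private_use(ch) else "")
```

Characters outside plane 0, or with category `Co`, are the ones worth checking first when a word is missing from the `to_tsvector` output.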

Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread Dan Scott
On Sun, Sep 30, 2012 at 1:56 PM, johnkn63 wrote: > When using to_tsvector a number of newer Unicode characters and PUA > characters are not included. How do I add the characters which I desire to > be found? I've just started digging into this code a bit, but from what I've found src/backend/tse