Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-18 Thread Trevor Talbot
Digging through the simple vs advanced user discussion, I don't think expression indexes are really the right idea. It seems a bit fragile, you need a certain amount of knowledge about the optimizer to figure out if your queries can even use the index, and it's just plain ugly. It also seems like

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?f

2007-08-18 Thread Bruce Momjian
Josh Berkus wrote: Bruce, Oh, so you want the config inside each tsvector value. ?Interesting idea. Yeah, hasn't anyone suggested this before? It seems like the obvious solution. A TSvector constructed with en_US is NOT the same as a vector constructed with fr_FR and it's silly to

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-18 Thread Bruce Momjian
Trevor Talbot wrote: Digging through the simple vs advanced user discussion, I don't think expression indexes are really the right idea. It seems a bit fragile, you need a certain amount of knowledge about the optimizer to figure out if your queries can even use the index, and it's just plain

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-18 Thread Trevor Talbot
On 8/18/07, Bruce Momjian [EMAIL PROTECTED] wrote: Remember an expression index can be a user-created function so you can embed whatever you want in your function and just index it's output, just like you would with a trigger creating a separate column. Well, you could create a function that

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-18 Thread Tom Lane
Oleg Bartunov [EMAIL PROTECTED] writes: Tom and Bruce, what version of patch you're using ? Bruce complained about using OID in arguments of functions, but AFAIR, it was removed in 0.58 version of patch. I'm working from the 0.58 version --- that was the latest last I looked. There are still

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-18 Thread Bruce Momjian
Trevor Talbot wrote: On 8/18/07, Bruce Momjian [EMAIL PROTECTED] wrote: Remember an expression index can be a user-created function so you can embed whatever you want in your function and just index it's output, just like you would with a trigger creating a separate column. Well, you

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-18 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Bruce Momjian wrote: Trevor Talbot wrote: On 8/18/07, Bruce Momjian [EMAIL PROTECTED] wrote: Remember an expression index can be a user-created function so you can embed whatever you want in your function and just index it's output, just like

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-18 Thread Trevor Talbot
On 8/18/07, Bruce Momjian [EMAIL PROTECTED] wrote: Trevor Talbot wrote: Well, you could create a function that returns a tsvector, but how do you get that to work with queries? I've been under the impression the expressions need to match (in the normal case, be the same function with

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-18 Thread Bruce Momjian
Joshua D. Drake wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Bruce Momjian wrote: Trevor Talbot wrote: On 8/18/07, Bruce Momjian [EMAIL PROTECTED] wrote: Remember an expression index can be a user-created function so you can embed whatever you want in your function and just

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-18 Thread Bruce Momjian
Trevor Talbot wrote: On 8/18/07, Bruce Momjian [EMAIL PROTECTED] wrote: Trevor Talbot wrote: Well, you could create a function that returns a tsvector, but how do you get that to work with queries? I've been under the impression the expressions need to match (in the normal case, be

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Robert Treat
On Thursday 16 August 2007 15:58, Bruce Momjian wrote: Josh Berkus wrote: All, First off, I'll assert that backup/restore is a serious issue and while the folks who want Tsearch in core now are dismissing it, we'll be fielding the complaints later. Any solution which involves setting a

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Bruce Momjian
Robert Treat wrote: The only remaining problem I see is that the rest of the documentation relies heavily on default_text_search_config when in fact the most common usage with tables and indexes can't use it. tsquery can use the default easily, but I am betting that tsvector usually

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Tom Lane
Oleg Bartunov [EMAIL PROTECTED] writes: On Thu, 16 Aug 2007, Josh Berkus wrote: First off, I'll assert that backup/restore is a serious issue and while the folks who want Tsearch in core now are dismissing it, we'll be fielding the complaints later. Any solution which involves setting a GUC

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Luke Lonergan
on ma treo -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Friday, August 17, 2007 11:28 AM Eastern Standard Time To: Oleg Bartunov Cc: Josh Berkus; pgsql-hackers@postgresql.org Subject:Re: [HACKERS] tsearch2 in PostgreSQL 8.3? Oleg Bartunov [EMAIL

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Tom Lane
Robert Treat [EMAIL PROTECTED] writes: What exactly does default_text_search_config buy us? I think it is supposed to simplify things, but it sounds like it adds a bunch of corner cases, Well, the main thing we'd lose if we remove it is all trace of upward compatibility from the contrib

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Bruce Momjian
on ma treo -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Friday, August 17, 2007 11:28 AM Eastern Standard Time To: Oleg Bartunov Cc: Josh Berkus; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] tsearch2 in PostgreSQL 8.3? Oleg Bartunov

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: Well, the main thing we'd lose if we remove it is all trace of upward compatibility from the contrib version of tsearch. I don't think this is all that big of a deal. In fact I would expect it going from contrib to core and never had

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Bruce Momjian
Tom Lane wrote: Josh Berkus [EMAIL PROTECTED] writes: Here's something not to forget in this whole business: the present TSearch2 implementation permits you to have a different tsvector configuration for each *row*, not just each column. That is, applications can be built with per-cell

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Josh Berkus
Folks, Here's something not to forget in this whole business: the present TSearch2 implementation permits you to have a different tsvector configuration for each *row*, not just each column. That is, applications can be built with per-cell configs. I know of at least one out there: Ubuntu's

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Tom Lane wrote: Robert Treat [EMAIL PROTECTED] writes: What exactly does default_text_search_config buy us? I think it is supposed to simplify things, but it sounds like it adds a bunch of corner cases, Well, the main thing we'd lose if we

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes: Here's something not to forget in this whole business: the present TSearch2 implementation permits you to have a different tsvector configuration for each *row*, not just each column. That is, applications can be built with per-cell configs. Certainly.

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Bruce Momjian
Josh Berkus wrote: Folks, Here's something not to forget in this whole business: the present TSearch2 implementation permits you to have a different tsvector configuration for each *row*, not just each column. That is, applications can be built with per-cell configs. I know of at

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Ron Mayer
Joshua D. Drake wrote: Tom Lane wrote: Robert Treat [EMAIL PROTECTED] writes: What exactly does default_text_search_config buy us? I think it is supposed to simplify things, but it sounds like it adds a bunch of corner cases, Well, the main thing we'd lose if we remove it is all trace of

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Josh Berkus
Bruce, Oh, so you want the config inside each tsvector value.  Interesting idea. Yeah, hasn't anyone suggested this before? It seems like the obvious solution. A TSvector constructed with en_US is NOT the same as a vector constructed with fr_FR and it's silly to pretend that they are

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Josh Berkus
Tom, It might be an obvious solution, but to some other problem than the one we have. The problem we are trying to address is how to know which config to use to construct a *new* tsvector. Oh, right. Back to the circular arguments then ... -- --Josh Josh Berkus PostgreSQL @ Sun San

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes: Oh, so you want the config inside each tsvector value. Interesting idea. Yeah, hasn't anyone suggested this before? It seems like the obvious solution. It might be an obvious solution, but to some other problem than the one we have. The problem we are

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread tomas
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Fri, Aug 17, 2007 at 04:06:15PM -0700, Josh Berkus wrote: Bruce, Oh, so you want the config inside each tsvector value.  Interesting idea. Yeah, hasn't anyone suggested this before? It seems like the obvious solution. A TSvector

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Mike Rylander
On 8/18/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Fri, Aug 17, 2007 at 04:06:15PM -0700, Josh Berkus wrote: Bruce, Oh, so you want the config inside each tsvector value. Interesting idea. Yeah, hasn't anyone suggested this

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Oleg Bartunov
On Sat, 18 Aug 2007, [EMAIL PROTECTED] wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Fri, Aug 17, 2007 at 04:06:15PM -0700, Josh Berkus wrote: Bruce, Oh, so you want the config inside each tsvector value. б═Interesting idea. Yeah, hasn't anyone suggested this before? It

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Oleg Bartunov
Tom and Bruce, what version of patch you're using ? Bruce complained about using OID in arguments of functions, but AFAIR, it was removed in 0.58 version of patch. I and Teodor are very busy and just can't follow all discussions, so we have to rely on people's wisdom. If we have so many

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-17 Thread Oleg Bartunov
On Sat, 18 Aug 2007, Mike Rylander wrote: On 8/18/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Fri, Aug 17, 2007 at 04:06:15PM -0700, Josh Berkus wrote: Bruce, Oh, so you want the config inside each tsvector value. Interesting idea.

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-16 Thread Michael Paesold
Bruce Momjian wrote: Uh, how are we going to prevent the auto-casting to tsvector from using the default GUC config, e.g. CREATE INDEX i ON x USING GIN(col)? This is where I started to see the need for education and error-prone nature of the default GUC just wasn't worth having it, though I

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-16 Thread Tom Lane
Michael Paesold [EMAIL PROTECTED] writes: Teodor Sigaev wrote: 2) added operator class for text and varchar CREATE INDEX idxname ON tblname USING GIN ( textcolumn ); So just remove the operator class or don't specify it as default operator class for GIN, and the thing is gone. Perhaps there

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-16 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes: Yeah, unfortunately we overlooked the implications of the conversion to tsvector being environment-dependent. Those opclasses will have to go away again. AFAICS the only safe way to build an index directly on a text column is CREATE INDEX idxname ON

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-16 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes: Is there a null configuration which could be the default for the casts? If we did that then there would be no point in having a GUC variable, because it wouldn't control anything. While that is certainly an alternative solution, I think it's riskier than

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-16 Thread Josh Berkus
All, First off, I'll assert that backup/restore is a serious issue and while the folks who want Tsearch in core now are dismissing it, we'll be fielding the complaints later. Any solution which involves setting a GUC at restore time *which could vary per table or even column* isn't

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-16 Thread Oleg Bartunov
On Thu, 16 Aug 2007, Josh Berkus wrote: All, First off, I'll assert that backup/restore is a serious issue and while the folks who want Tsearch in core now are dismissing it, we'll be fielding the complaints later. Any solution which involves setting a GUC at restore time *which could vary

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-16 Thread Bruce Momjian
Josh Berkus wrote: All, First off, I'll assert that backup/restore is a serious issue and while the folks who want Tsearch in core now are dismissing it, we'll be fielding the complaints later. Any solution which involves setting a GUC at restore time *which could vary per table or even

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Magnus Hagander
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Since I don't think that a datatype solution is the way to go, I don't feel that we are as far away from an agreement as Bruce is worried about. Well, from where I sit, there is one person saying give me the foot gun,

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Oleg Bartunov
On Wed, 15 Aug 2007, Magnus Hagander wrote: Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Since I don't think that a datatype solution is the way to go, I don't feel that we are as far away from an agreement as Bruce is worried about. Well, from where I sit,

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Simon Riggs
On Tue, 2007-08-14 at 17:41 -0400, Tom Lane wrote: I've just finished re-reading the prior thread, and here are what seem to me to be the salient points: * Oleg, Teodor, and all of the old-line users of tsearch2 are comfortable with setting up a trigger to maintain a materialized tsvector

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Simon Riggs
On Wed, 2007-08-15 at 08:10 +0200, Magnus Hagander wrote: Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Since I don't think that a datatype solution is the way to go, I don't feel that we are as far away from an agreement as Bruce is worried about. Well,

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Bruce Momjian
Magnus Hagander wrote: But I would like a design that is bulletproof in dump/reload scenarios, and I think it's fair to question that aspect of the tsearch2 design because we've seen many reports of people having trouble updating databases that use tsearch2. dump/reload is *the* biggest

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Magnus Hagander
On Wed, Aug 15, 2007 at 10:23:00AM -0400, Bruce Momjian wrote: Magnus Hagander wrote: But I would like a design that is bulletproof in dump/reload scenarios, and I think it's fair to question that aspect of the tsearch2 design because we've seen many reports of people having trouble

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: ISTM that the functional index would be considerably smaller than the additional column approach, since tsvectors can be quite long. That seems like a very desirable thing with larger textbases. However, without an additional column certain queries would

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Ron Mayer
Magnus Hagander wrote: I don't use the functional index part, but for new users I can see how that's certainly a *lot* easier. Can someone with modern hardware test to see if it's still quite a bit slower than the extra column. I had tried it too years ago; and found the functional index to

[HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Bruce Momjian
A lot of work has been done to try to get /contrib/tsearch2 into the core backend for 8.3, but we have hit a roadblock in how to handle multiple text search configurations. (FYI, the documentation is at http://momjian.us/expire/textsearch/HTML/textsearch.html.) There are three options for

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Alvaro Herrera
Bruce Momjian wrote: A lot of work has been done to try to get /contrib/tsearch2 into the core backend for 8.3, but we have hit a roadblock in how to handle multiple text search configurations. (FYI, the documentation is at http://momjian.us/expire/textsearch/HTML/textsearch.html.) There

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Heikki Linnakangas
Bruce Momjian wrote: #3 requires more code and is probably not something we want to do at this stage in 8.3 development. It requires passing typmod values between functions and operators (not something we have done easily in the past). It does? I was thinking of implicitly creating a new

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Heikki Linnakangas
Alvaro Herrera wrote: What is the worst consequence of mismatching configuration? Does it cause a system crash? A backend hang? A corrupted index? Lost data? Or does it, as I assume, just fail to return the exact result set that would be returned if the correct configuration was supplied?

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Kenneth Marshall
On Tue, Aug 14, 2007 at 03:15:44PM -0400, Alvaro Herrera wrote: Bruce Momjian wrote: A lot of work has been done to try to get /contrib/tsearch2 into the core backend for 8.3, but we have hit a roadblock in how to handle multiple text search configurations. (FYI, the documentation is at

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Bruce Momjian
Heikki Linnakangas wrote: Bruce Momjian wrote: #3 requires more code and is probably not something we want to do at this stage in 8.3 development. It requires passing typmod values between functions and operators (not something we have done easily in the past). It does? I was thinking

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Alvaro Herrera
Bruce Momjian wrote: Basically, the default GUC doesn't work because of: error prone if super-user only, non-super-user doesn't work on restore if non-super-user, can cause mismatch (perhaps this is the best option), and restore still a problem (no storage of

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Bruce Momjian
Alvaro Herrera wrote: Bruce Momjian wrote: Basically, the default GUC doesn't work because of: error prone if super-user only, non-super-user doesn't work on restore if non-super-user, can cause mismatch (perhaps this is the best option), and restore still a

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Mike Rylander
On 8/14/07, Alvaro Herrera [EMAIL PROTECTED] wrote: Bruce Momjian wrote: Basically, the default GUC doesn't work because of: error prone if super-user only, non-super-user doesn't work on restore if non-super-user, can cause mismatch (perhaps this is the best

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: Alvaro Herrera wrote: I haven't really seen anyone else arguing about this. I wonder whether you are being overly zealous about it. Uh, OK, but no one has told me how a database restore without a configuration name would work, so I am all ears. It's

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Bruce Momjian
Mike Rylander wrote: On 8/14/07, Alvaro Herrera [EMAIL PROTECTED] wrote: Bruce Momjian wrote: Basically, the default GUC doesn't work because of: error prone if super-user only, non-super-user doesn't work on restore if non-super-user, can cause mismatch

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: The other point is that we should have a good idea of the API because if it gets into 8.3 it will be harder to change. Yeah, once it's in core we have a pretty strong backwards-compatibility restriction to deal with. Someone upthread claimed we can

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Bruce Momjian
Tom Lane wrote: * For queries, there is not anything very wrong with having a default configuration, but the trick is how to get the planner to match that up with an index that's written with the two-parameter form of to_tsvector. One hackish possibility is to define the single-parameter form

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Since I don't think that a datatype solution is the way to go, I don't feel that we are as far away from an agreement as Bruce is worried about. Well, from where I sit, there is one person saying give me the foot gun, and Heikki saying

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: The other point is that we should have a good idea of the API because if it gets into 8.3 it will be harder to change. Yeah, once it's in core we have a pretty strong backwards-compatibility restriction to deal with. Someone

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Since I don't think that a datatype solution is the way to go, I don't feel that we are as far away from an agreement as Bruce is worried about. Well, from where I sit, there is one person saying give me the foot

Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-14 Thread Andrew Dunstan
Bruce Momjian wrote: The people who actually use tsearch2 seem to all have the same opinion ... so I think we can't go too far in the bullet-proofing direction. Yeah. But I would like a design that is bulletproof in dump/reload scenarios, and I think it's fair to question that aspect