Re: [Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-30 Thread Josh Berkus
Ishii-san, Ok, probably we need to copy the English stemming rule to the one for Japanese. Pardon my ignorance here, but is the concept of stemming even relevant to Japanese/Chinese/Korean? What little I know about ideographic languages suggests it wouldn't work well. And surely the specific

Re: [Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-30 Thread Tatsuo Ishii
Ishii-san, Ok, probably we need to copy the English stemming rule to the one for Japanese. Pardon my ignorance here, but is the concept of stemming even relevant to Japanese/Chinese/Korean? What little I know about ideographic languages suggests it wouldn't work well. And surely the

Re: [HACKERS] tsearch in core patch

2007-06-27 Thread Teodor Sigaev
But why do you need them to be different at all? Just make it russian Russian_Russia russian ru_RU Does that not work for some reason? I'd like to have unique names of configuration. So, if user sets GUC variable or call function with configuration's name then postgres should not have

Re: [Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-25 Thread Mike Rylander
On 6/25/07, Tom Lane wrote: Well, it's not hard at all to find chunks of English text that have embedded bits of French, Spanish, or what-have-you, but that's not an argument for trying to intermix the stemmers. I doubt that such simple bits of program could tell the language

Re: [Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-25 Thread Tom Lane
Mike Rylander writes: I can certainly understand the benefit of making the default configuration a simple locale to language map, but there are definitely uses for searching using different stemmers/stop-lists even within the same corpus/index. So, as a datapoint for the

Re: [Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-25 Thread Mike Rylander
On 6/25/07, Tom Lane wrote: Mike Rylander writes: I can certainly understand the benefit of making the default configuration a simple locale to language map, but there are definitely uses for searching using different stemmers/stop-lists even within the

Re: [Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-24 Thread Tatsuo Ishii
Tatsuo Ishii wrote: japanese '{ja_JP, C}' How would we know C - japanese? You can't do that. You can't have different languages (not locales) mapping to the same 'tsearch language' because the stemmer doesn't know that a specific word is in english or japanese. So you have two

Re: [HACKERS] tsearch in core patch

2007-06-24 Thread Tatsuo Ishii
I would be surprised if C locale defaulted to anything except English. Don't be surprised. The mechanism of collation is too simple for Japanese Kanji, and locale is not useful for Japanese anyway. That's why Japanese installations of PostgreSQL tend to use C locale. -- Tatsuo Ishii SRA OSS, Inc.

Re: [Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-24 Thread Tom Lane
Tatsuo Ishii writes: Ok, probably we need to copy the English stemming rule to the one for Japanese. Pardon my ignorance here, but is the concept of stemming even relevant to Japanese/Chinese/Korean? What little I know about ideographic languages suggests it wouldn't work

Re: [Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-24 Thread Tatsuo Ishii
Tatsuo Ishii writes: Ok, probably we need to copy the English stemming rule to the one for Japanese. Pardon my ignorance here, but is the concept of stemming even relevant to Japanese/Chinese/Korean? What little I know about ideographic languages suggests it wouldn't

Re: [HACKERS] tsearch in core patch

2007-06-23 Thread Euler Taveira de Oliveira
Alvaro Herrera wrote: What I was really suggesting was having a table mapping locale names into tsearch languages. Then the configuration could be made based on the language, not on the locale name. So the stopword list is for russian, regardless of whether the locale is Russian_Russia or

Re: [HACKERS] tsearch in core patch

2007-06-23 Thread Oleg Bartunov
On Sat, 23 Jun 2007, Euler Taveira de Oliveira wrote: Will it be possible to disable stemming or stopwords removal? I'm asking this 'cause sometimes stemming doesn't lead to good results and/or stopwords are relevant. Maybe it could be GUC variables ('enable_stemming' and

Re: [Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-23 Thread Euler Taveira de Oliveira
Tatsuo Ishii wrote: japanese '{ja_JP, C}' How would we know C - japanese? You can't do that. You can't have different languages (not locales) mapping to the same 'tsearch language' because the stemmer doesn't know that a specific word is in english or japanese. So you have two options: (a)

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Teodor Sigaev
3) ALTER FULLTEXT CONFIGURATION cfgname ADD/ALTER/DROP MAPPING done Why not rename ALTER FULLTEXT CONFIGURATION to ALTER TEXT SEARCH CONFIGURATION here too? It's renamed too. most languages can be written using UNICODE charset and UTF-8 encoding, so neither charset nor encoding can be used

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Teodor Sigaev
The recommendation I was making was to use the language name, not the encoding name, in the user-visible configuration. How does it determine language of db automatically?

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Bruce Momjian
Teodor Sigaev wrote: The recommendation I was making was to use the language name, not the encoding name, in the user-visible configuration. How does it determine language of db automatically? I don't think we are going to do language selection automatically --- the user is going to have to

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Teodor Sigaev
I don't think we are going to do language selection automatically --- the user is going to have to set tsearch_conf_name. Are you suggesting removing a long-lived feature of tsearch? In that case we don't need cfglocale (or cfglanguage as Tom suggested) and cfgdefault columns in pg_ts_cfg at all.

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Tom Lane
Alvaro Herrera writes: I very much doubt that the different spanishes are any different in the stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc; but in the case of portuguese I'm not so sure. Maybe there are other examples (like chinese, but I'm not sure

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Tom Lane
Teodor Sigaev writes: I don't think we are going to do language selection automatically --- the user is going to have to set tsearch_conf_name. Are you suggesting removing a long-lived feature of tsearch? In that case we don't need cfglocale (or cfglanguage as Tom suggested)

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Alvaro Herrera
Teodor Sigaev wrote: how many languages use the ISO8859-1 locale? ISO8859-1 is an encoding, not a locale. I meant, if we use the encoding name (for example PG_LATIN1) we can't distinguish languages which use that encoding (for example Italian and Finnish and some more), but using

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Bruce Momjian
Tom Lane wrote: Alvaro Herrera writes: I very much doubt that the different spanishes are any different in the stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc; but in the case of portuguese I'm not so sure. Maybe there are other examples (like

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Oleg Bartunov
On Fri, 22 Jun 2007, Bruce Momjian wrote: Tom Lane wrote: Alvaro Herrera writes: I very much doubt that the different spanishes are any different in the stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc; but in the case of portuguese I'm not so sure.

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Magnus Hagander
Tom Lane wrote: Alvaro Herrera writes: I very much doubt that the different spanishes are any different in the stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc; but in the case of portuguese I'm not so sure. Maybe there are other examples (like chinese,

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Alvaro Herrera
Magnus Hagander wrote: Tom Lane wrote: Alvaro Herrera writes: I very much doubt that the different spanishes are any different in the stemming rules, so there's no need for es_ES, es_PE, es_AR, es_CL etc; but in the case of portuguese I'm not so sure. Maybe there are

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread teodor
That may have been true until we started supporting Windows... Swedish_Sweden.1252 is what I get on my machine, for example. Principle is the same, but values certainly aren't. Well, at least the name is not itself translated, so a mapping table is not right out of the question. If they

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Bruce Momjian
Michael Glaesemann wrote: On Jun 22, 2007, at 9:28 , Tom Lane wrote: Is the point here for initdb to be able to establish a sane default initially? Seems to me it can guess the language from the first component of the locale (ru_RU - russian). How would this work for initdb with

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Alvaro Herrera
[EMAIL PROTECTED] wrote: So, final proposal: rename cfglocale to cfglanguages and store in it an array of language names which is produced from the first part of locale names: russian '{ru_RU, Russian_Russia}' spanish '{es_ES, es_CL, Spanish_Spain, Spanish_Chile}' Comments? Why not do it the

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Tatsuo Ishii
On Jun 22, 2007, at 9:28 , Tom Lane wrote: Is the point here for initdb to be able to establish a sane default initially? Seems to me it can guess the language from the first component of the locale (ru_RU - russian). How would this work for initdb with locale C? I'm worrying about

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Michael Glaesemann
On Jun 22, 2007, at 9:28 , Tom Lane wrote: Is the point here for initdb to be able to establish a sane default initially? Seems to me it can guess the language from the first component of the locale (ru_RU - russian). How would this work for initdb with locale C? Michael Glaesemann grzm

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread teodor
Why not do it the other way around? es_ES spanish Spanish_Spain spanish ru_RU russian pt_BR portuguese_brazil That way you don't need any funny index. Or do you need the list of locales for each language? (but even if you do, you can easily obtain it by indexing
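The "other way around" layout quoted in this message can be pictured with a minimal Python sketch. It is illustrative only: the table and both function names are hypothetical and are not taken from the patch. The four entries are the examples quoted in the message, and dropping the part after the first dot is an assumption based on the locale names mentioned elsewhere in the thread (for example Swedish_Sweden.1252).

```python
# Hypothetical lookup table in the "locale name -> language" direction.
# The entries are the examples quoted in the thread; nothing here is
# taken from the actual tsearch patch.
LOCALE_TO_LANGUAGE = {
    "es_ES": "spanish",
    "Spanish_Spain": "spanish",
    "ru_RU": "russian",
    "pt_BR": "portuguese_brazil",
}


def language_for_locale(locale_name):
    """Return the mapped language, or None when the locale is unknown.

    Assumption: any suffix after the first dot (an encoding or code page
    such as ".UTF-8" or ".1252") is ignored before the lookup.
    """
    base = locale_name.split(".", 1)[0]
    return LOCALE_TO_LANGUAGE.get(base)


def locales_for_language(language):
    """Invert the table to list every locale mapped to one language."""
    return sorted(
        loc for loc, lang in LOCALE_TO_LANGUAGE.items() if lang == language
    )
```

In this sketch the C locale has no entry, so language_for_locale("C") returns None. That mirrors the open question in the thread about what initdb should assume for locale C rather than answering it.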

Re: [Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-22 Thread Tatsuo Ishii
How would this work for initdb with locale C? I'm worrying about that too. english '{en_GB, en_US, C}' I suppose that locale name always has a dot separator except C locale --- which is well known exception So we would have to?: japanese '{ja_JP, C}' How would we know C - japanese?

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Tom Lane
Tatsuo Ishii writes: On Jun 22, 2007, at 9:28 , Tom Lane wrote: Is the point here for initdb to be able to establish a sane default initially? Seems to me it can guess the language from the first component of the locale (ru_RU - russian). How would this work for initdb

[Fwd: Re: [HACKERS] tsearch in core patch]

2007-06-22 Thread teodor
How would this work for initdb with locale C? I'm worrying about that too. english '{en_GB, en_US, C}' I suppose that locale name always has a dot separator except C locale --- which is well known exception

Re: [HACKERS] tsearch in core patch

2007-06-22 Thread Alvaro Herrera
[EMAIL PROTECTED] wrote: Why not do it the other way around? es_ES spanish Spanish_Spain spanish ru_RU russian pt_BR portuguese_brazil That way you don't need any funny index. Or do you need the list of locales for each language?

[HACKERS] tsearch in core patch

2007-06-21 Thread Teodor Sigaev
http://www.sigaev.ru/misc/tsearch_core-0.52.gz Plan was: 1) rename FULLTEXT to TEXT SEARCH in SQL command done 2) rework Snowball stemmers as Tom suggested done 3) ALTER FULLTEXT CONFIGURATION cfgname ADD/ALTER/DROP MAPPING done 4) remove support of default configuration per scheme. Default

Re: [HACKERS] tsearch in core patch

2007-06-21 Thread Hannu Krosing
On Thu, 2007-06-21 at 21:44, Teodor Sigaev wrote: http://www.sigaev.ru/misc/tsearch_core-0.52.gz Plan was: 1) rename FULLTEXT to TEXT SEARCH in SQL command done 2) rework Snowball stemmers as Tom suggested done 3) ALTER FULLTEXT CONFIGURATION cfgname

Re: [HACKERS] tsearch in core patch

2007-06-21 Thread Tom Lane
Hannu Krosing writes: On Thu, 2007-06-21 at 21:44, Teodor Sigaev wrote: 6) use encoding names instead of locale's names in configuration Ugh. I missed that knowledge of encoding doesn't allow determining the exact language most languages can be written

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-23 Thread Peter Eisentraut
On Thursday, 22 February 2007 18:07, Markus Schiltknecht wrote: I agree that enhancing the parser with non-standard constructs isn't good. Generally? Wow! This would mean PostgreSQL would always lag behind other RDBMSes, regarding ease of use. Please don't do that! You are confusing making a

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-23 Thread Peter Eisentraut
On Thursday, 22 February 2007 14:33, Teodor Sigaev wrote: \df says only types of arguments, not a meaning. Only if you don't provide argument names. -- Peter Eisentraut http://developer.postgresql.org/~petere/

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Markus Schiltknecht
Hi, Peter Eisentraut wrote: Oleg Bartunov wrote: It's not so big addition to the gram.y, see a list of commands http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. As we still need to discuss the syntax: is there a proposal for what a function-based syntax would look like?

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Teodor Sigaev
In that proposed syntax, I would drop all =, ,, (, and ). They don't seem necessary and they are untypical for SQL commands. I'd compare with CREATE FUNCTION or CREATE SEQUENCE for SQL commands that do similar things. I was looking at CREATE TYPE mostly. With removing =, ,, (, and ) in

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Teodor Sigaev
CREATE FULLTEXT CONFIGURATION myfts LIKE template_cfg AS DEFAULT; SELECT add_fulltext_config('myfts', 'template_cfg', True); That's simple, but what about CREATE FULLTEXT MAPPING ON cfgname FOR lexemetypename[, ...] WITH dictname1[, ...]; ? SELECT create_fulltext_mapping(cfgname,

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Andrew Dunstan
Teodor Sigaev wrote: In that proposed syntax, I would drop all =, ,, (, and ). They don't seem necessary and they are untypical for SQL commands. I'd compare with CREATE FUNCTION or CREATE SEQUENCE for SQL commands that do similar things. I was looking at CREATE TYPE mostly. With removing

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Pavel Stehule
CREATE FULLTEXT CONFIGURATION myfts LIKE template_cfg AS DEFAULT; SELECT add_fulltext_config('myfts', 'template_cfg', True); That's simple, but what about CREATE FULLTEXT MAPPING ON cfgname FOR lexemetypename[, ...] WITH dictname1[, ...]; ? SELECT create_fulltext_mapping(cfgname,

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Markus Schiltknecht
Hi, Andrew Dunstan wrote: If we are worried about the size of the transition table and keeping it in cache (see remarks from Tom upthread) then adding more keywords seems a bad idea, as it will surely expand the table. OTOH, I'd hate to make that a design criterion. Yeah, me too.

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Markus Schiltknecht
Hi, Pavel Stehule wrote: Functions may not seem effective, but users need not learn new syntax. Are you serious? That argument speaks exactly *for* extending the grammar. From other databases, users are used to: CREATE TABLE ... (SQL) CREATE INDEX ... (SQL) CREATE FULLTEXT INDEX ...

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Joshua D. Drake
And users are constantly complaining that PostgreSQL doesn't have fulltext indexing capabilities (if they don't know about tsearch2) or about how hard it is to use tsearch2. SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) is readable. Hardly. Because it's not

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Pavel Stehule
And users are constantly complaining that PostgreSQL doesn't have fulltext indexing capabilities (if they don't know about tsearch2) or about how hard it is to use tsearch2. SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) is readable. Hardly. Because it's not

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Joshua D. Drake
Pavel Stehule wrote: And users are constantly complaining that PostgreSQL doesn't have fulltext indexing capabilities (if they don't know about tsearch2) or about how hard it is to use tsearch2. SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) is readable.

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Pavel Stehule
I am not talking about stored procedures. I am talking about a very ugly, counter intuitive syntax above. Initializing full text should be as simple as: CREATE INDEX foo USING FULLTEXT(bar); (or something similar) Or: CREATE TABLE foo (id serial, names text FULLTEXT); Anything more

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Joshua D. Drake
CREATE TABLE foo (id serial, names text FULLTEXT); Anything more complicated is a waste of cycles. Joshua D. Drake I agree. Question: what about multilanguage fulltext. CREATE INDEX foo USING FULLTEXT(bar) [ WITH czech_dictionary ]; CREATE TABLE foo (id serial, names text FULLTEXT [

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Pavel Stehule
CREATE TABLE foo (id serial, names text FULLTEXT); Anything more complicated is a waste of cycles. Joshua D. Drake I agree. Question: what about multilanguage fulltext. CREATE INDEX foo USING FULLTEXT(bar) [ WITH czech_dictionary ]; CREATE TABLE foo (id serial, names text FULLTEXT [

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Robert Treat
On Thursday 25 January 2007 12:51, Oleg Bartunov wrote: On Thu, 25 Jan 2007, Nikolay Samokhvalov wrote: On 1/25/07, Teodor Sigaev wrote: It should be clear enough for now - dump data from old db and load into new one. But dump should be without any contrib/tsearch2

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-21 Thread Markus Schiltknecht
Hi, Tom Lane wrote: You mean four different object types. I'm not totally clear on bison's scaling behavior relative to the number of productions You really want to trade parser performance (which is *very* implementation specific) for ease of use? Bison generates a LALR [1] parser, which

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-21 Thread Florian G. Pflug
Markus Schiltknecht wrote: Hi, Tom Lane wrote: You mean four different object types. I'm not totally clear on bison's scaling behavior relative to the number of productions You really want to trade parser performance (which is *very* implementation specific) for ease of use? Bison

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-21 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian writes: Oleg Bartunov wrote: It's not so big addition to the gram.y, see a list of commands http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. I looked at the diff file and the major change in gram.y is the creation of a new

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-21 Thread Markus Schiltknecht
Hi, Florian G. Pflug wrote: According to http://en.wikipedia.org/wiki/LR_parser processing one token in any LR(1) parser in the worst case needs to a) Do a lookup in the action table with the current (state, token) pair b) Do a lookup in the goto table with a (state, rule) pair. c) Push one
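The per-token work listed in this message (an action-table lookup, a goto-table lookup, and a push onto the state stack) can be sketched with a toy table-driven LR parser in Python. This is only an illustration under stated assumptions: the grammar (S -> ( S ) | x), the hand-built tables, and all names below are hypothetical and have nothing to do with PostgreSQL's gram.y or with the tables bison generates.

```python
# Toy grammar (hypothetical, for illustration only):
#   rule 1: S -> ( S )
#   rule 2: S -> x
# RULES maps a rule number to (left-hand side, length of right-hand side).
RULES = {1: ("S", 3), 2: ("S", 1)}

# ACTION[(state, token)] -> ("shift", next_state) | ("reduce", rule) | ("accept", None)
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}

# GOTO[(state, nonterminal)] -> state entered after a reduction
GOTO = {(0, "S"): 1, (2, "S"): 4}


def parse(text):
    """Return True if text is in the toy language, False otherwise."""
    tokens = list(text) + ["$"]   # "$" marks end of input
    stack = [0]                   # stack of parser states
    pos = 0
    while True:
        # One dictionary lookup per step: the action-table access.
        action = ACTION.get((stack[-1], tokens[pos]))
        if action is None:
            return False          # no table entry: syntax error
        kind, arg = action
        if kind == "shift":
            stack.append(arg)     # push one state, consume one token
            pos += 1
        elif kind == "reduce":
            lhs, length = RULES[arg]
            del stack[-length:]   # pop one state per right-hand-side symbol
            # Goto-table access, then push the resulting state.
            stack.append(GOTO[(stack[-1], lhs)])
        else:
            return True           # accept
```

In this sketch each shift pushes exactly one state and each reduce pops at most as many states as were pushed earlier, so the total work stays proportional to the input length for this grammar; whether that matters for a grammar the size of PostgreSQL's is the question the thread is debating.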

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-21 Thread Tom Lane
Florian G. Pflug writes: Markus Schiltknecht wrote: I didn't find hard facts about runtime complexity of LALR, though (pointers are very welcome). a) and b) should be O(1). Processing one token pushes at most one state onto the stack, so overall no more than N states can be

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-21 Thread Brian Hurt
Markus Schiltknecht wrote: Hi, I recall having read something about rewriting the parser. Together with Tom being worried about parser performance and knowing GCC has switched to a hand written parser some time ago, I suspected bison to be slow. That's why I've asked. This has little to

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-21 Thread Florian G. Pflug
Markus Schiltknecht wrote: Are there any ongoing efforts to rewrite the parser (i.e. using another algorithm, like a recursive descent parser)? Why would you want to do that? I recall having read something about rewriting the parser. Together with Tom being worried about parser performance

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-21 Thread Tom Lane
Florian G. Pflug writes: Markus Schiltknecht wrote: Are there any ongoing efforts to rewrite the parser (i.e. using another algorithm, like a recursive descent parser)? Why would you want to do that? Last, but not least, the C and C++ syntax is basically set in stone - At

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-21 Thread Peter Eisentraut
Oleg Bartunov wrote: It's not so big addition to the gram.y, see a list of commands http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. In that proposed syntax, I would drop all =, ,, (, and ). They don't seem necessary and they are untypical for SQL commands. I'd compare with

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-21 Thread Peter Eisentraut
Joshua D. Drake wrote: This is like the third time we have been around this problem. The syntax is clear and reasonable imo. But others have differing opinions. Can we stop arguing about it and just include? If there are specific issues beyond syntax that is one thing, but at this point

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-21 Thread Oleg Bartunov
On Thu, 22 Feb 2007, Peter Eisentraut wrote: Oleg Bartunov wrote: It's not so big addition to the gram.y, see a list of commands http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. In that proposed syntax, I would drop all =, ,, (, and ). They don't seem necessary and they are

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-20 Thread Bruce Momjian
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it.

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-20 Thread Bruce Momjian
FYI, I added this to the patches queue because I think we decided full-text indexing should be in the core. If I am wrong, please let me know. Teodor Sigaev wrote: We (Oleg and me) are glad to present tsearch in core

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-20 Thread Alvaro Herrera
Bruce Momjian wrote: FYI, I added this to the patches queue because I think we decided full-text indexing should be in the core. If I am wrong, please let me know. One of the objections I remember to this particular implementation was that configuration should be done using functions rather

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-20 Thread Oleg Bartunov
On Tue, 20 Feb 2007, Alvaro Herrera wrote: Bruce Momjian wrote: FYI, I added this to the patches queue because I think we decided full-text indexing should be in the core. If I am wrong, please let me know. One of the objections I remember to this particular implementation was that

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-20 Thread Bruce Momjian
Oleg Bartunov wrote: On Tue, 20 Feb 2007, Alvaro Herrera wrote: Bruce Momjian wrote: FYI, I added this to the patches queue because I think we decided full-text indexing should be in the core. If I am wrong, please let me know. One of the objections I remember to this particular

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-20 Thread Joshua D. Drake
It's not so big addition to the gram.y, see a list of commands http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. SQL commands make FTS syntax clear and follow tradition to manage system objects. From the user's side, I'd be very unhappy to configure FTS, which can be very complex,

Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-20 Thread Tom Lane
Bruce Momjian writes: Oleg Bartunov wrote: It's not so big addition to the gram.y, see a list of commands http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. I looked at the diff file and the major change in gram.y is the creation of a new object type FULLTEXT,

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-26 Thread Naz Gassiep
Andrew Dunstan wrote: I am constantly running into this: Q. Does PostgreSQL have full text indexing? A. Yes it is in contrib. Q. But that isn't part of core. A. *sigh* Where on the website can I see what plugins are included with PostgreSQL? Where on the website can I see the Official

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Bernd Helmle
On Wed, 24 Jan 2007 22:27:10 +0100, Peter Eisentraut wrote: I wrote: The closest I could find is Oracle Text, the full-text search for Oracle. Oh, and note that Oracle Text is an extension and not included in the Oracle database product proper. Same with DB2 NSE,

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Dawid Kuroczko
On 1/24/07, Andrew Dunstan wrote: Peter Eisentraut wrote: contrib is a horrible misnomer. Can we maybe bite the bullet and call it something else? plugins? How about 'modules' or 'extras' or 'extensions'? :) standard-plugins might be more informative. I think of them as

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Peter Eisentraut
Dawid Kuroczko wrote: This is the reason I like 'modules' best. It makes one think that it is something maybe part of core, maybe not, but it has been isolated into separate entity for maintenance reasons. On etymological grounds, modules would also be my favorite, but the term module is

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Teodor Sigaev
This is a fairly large patch and I would like the chance to review it before it goes in --- we'll commit tomorrow is not exactly a decent review window. Not a problem. One possible argument for this over the contrib version is a saner approach to dumping and restoring configurations. However,

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Teodor Sigaev
the patch. I'm personally not sold on the need for modifications to the SQL grammar, for example, as opposed to just using a set of SQL-callable functions and some new system catalogs. SQL grammar isn't changed significantly - just add variants of CREATE/DROP/ALTER/COMMENT commands. Next,

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Joshua D. Drake
Teodor Sigaev wrote: the patch. I'm personally not sold on the need for modifications to the SQL grammar, for example, as opposed to just using a set of SQL-callable functions and some new system catalogs. SQL grammar isn't changed significantly - just add variants of CREATE/DROP/ALTER

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Nikolay Samokhvalov
On 1/25/07, Teodor Sigaev wrote: It should be clear enough for now - dump data from old db and load into new one. But dump should be without any contrib/tsearch2 related functions. Upgrading from 8.1.x to 8.2.x was not trivial because of a very trivial change in API (actually not

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Teodor Sigaev
though that we still have the more odd grammar of actually using Tsearch to query. Although I don't really have a better suggestion without adding some ungodly obscure operator. IMHO, best possible solution is 'WHERE table.text_field @ text'. Operator @ internally makes equivalent of

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-25 Thread Oleg Bartunov
On Thu, 25 Jan 2007, Nikolay Samokhvalov wrote: On 1/25/07, Teodor Sigaev wrote: It should be clear enough for now - dump data from old db and load into new one. But dump should be without any contrib/tsearch2 related functions. Upgrading from 8.1.x to 8.2.x was not trivial

[HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Teodor Sigaev
We (Oleg and I) are glad to present the tsearch-in-core patch for pgsql. Basically, the layout, functions, methods, types, etc. are the same as in the current tsearch2, with a lot of improvements: - pg_ts_* tables are now in pg_catalog - parsers, dictionaries, configurations now have owner and namespace

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Peter Eisentraut
Teodor Sigaev wrote: If there aren't objections then we plan commit patch tomorrow or after tomorrow. I still haven't heard any argument for why this would be necessary or desirable at all, other than that it looks better for marketing reasons, which I will counter by saying that it looks

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Joshua D. Drake
Peter Eisentraut wrote: Teodor Sigaev wrote: If there aren't objections then we plan commit patch tomorrow or after tomorrow. I still haven't heard any argument for why this would be necessary or desirable at all, other than that it looks better for marketing reasons, which I will

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Tom Lane
Teodor Sigaev wrote: If there aren't objections then we plan commit patch tomorrow or after tomorrow. This is a fairly large patch and I would like the chance to review it before it goes in --- we'll commit tomorrow is not exactly a decent review window. Peter Eisentraut

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Andrew Dunstan
Joshua D. Drake wrote: Peter Eisentraut wrote: Teodor Sigaev wrote: If there aren't objections then we plan commit patch tomorrow or after tomorrow. I still haven't heard any argument for why this would be necessary or desirable at all, other than that it looks better for

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Jeff Davis
On Wed, 2007-01-24 at 19:15 +0100, Peter Eisentraut wrote: Teodor Sigaev wrote: If there aren't objections then we plan commit patch tomorrow or after tomorrow. I still haven't heard any argument for why this would be necessary or desirable at all, other than that it looks better for

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread David Fetter
On Wed, Jan 24, 2007 at 01:53:54PM -0500, Andrew Dunstan wrote: Joshua D. Drake wrote: Peter Eisentraut wrote: Teodor Sigaev wrote: If there aren't objections then we plan commit patch tomorrow or after tomorrow. I still haven't heard any argument for why this would be

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Jeremy Drake
On Wed, 24 Jan 2007, Peter Eisentraut wrote: Teodor Sigaev wrote: If there aren't objections then we plan commit patch tomorrow or after tomorrow. I still haven't heard any argument for why this would be necessary or desirable at all, other than that it looks better for marketing

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Joshua D. Drake
Jeremy Drake wrote: On Wed, 24 Jan 2007, Peter Eisentraut wrote: Teodor Sigaev wrote: If there aren't objections then we plan commit patch tomorrow or after tomorrow. I still haven't heard any argument for why this would be necessary or desirable at all, other than that it looks better for

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Neil Conway
On Wed, 2007-01-24 at 13:49 -0500, Tom Lane wrote: 2) once we put this in core we are going to be stuck with supporting its SQL API forever. Are we convinced that this API is the one we want? I don't recall even having seen any proposal or discussion. There has been some prior discussion:

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Andrew Dunstan
Jeremy Drake wrote: On Wed, 24 Jan 2007, Peter Eisentraut wrote: I still haven't heard any argument for why this would be necessary or desirable at all, other than that it looks better for marketing reasons, which I will counter by saying that it looks worse for marketing reasons because our

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Peter Eisentraut
Andrew Dunstan wrote: contrib is a horrible misnomer. Can we maybe bite the bullet and call it something else? plugins? -- Peter Eisentraut http://developer.postgresql.org/~petere/

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Peter Eisentraut
Jeff Davis wrote: On that point, why do we have /contrib? It's for plugins that are so version-dependent that they can't exist as a separate project, as I understand it. No. (I don't know a better and succinct answer, but that is not it.) -- Peter Eisentraut

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Peter Eisentraut
Jeremy Drake wrote: I for one am greatly looking forward to tsearch2 being in core. I was very fond of the plugin mechanism, until I signed up with a hosting provider. Yes, you have told us about your hosting provider before. Just make sure your next hosting provider does not refuse to

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Joshua D. Drake
Peter Eisentraut wrote: Jeremy Drake wrote: I for one am greatly looking forward to tsearch2 being in core. I was very fond of the plugin mechanism, until I signed up with a hosting provider. Yes, you have told us about your hosting provider before. Just make sure your next hosting

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Peter Eisentraut
Neil Conway wrote: But I agree that we need considerably more discussion before committing the patch. I'm personally not sold on the need for modifications to the SQL grammar, for example, as opposed to just using a set of SQL-callable functions and some new system catalogs. In particular, I

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Stefan Kaltenbrunner
Neil Conway wrote: On Wed, 2007-01-24 at 13:49 -0500, Tom Lane wrote: 2) once we put this in core we are going to be stuck with supporting its SQL API forever. Are we convinced that this API is the one we want? I don't recall even having seen any proposal or discussion. There has been some

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Andrew Dunstan
Peter Eisentraut wrote: Andrew Dunstan wrote: contrib is a horrible misnomer. Can we maybe bite the bullet and call it something else? plugins? standard-plugins might be more informative. I think of them as being like perl's standard modules, things that are part of the

Re: [HACKERS] tsearch in core patch, for inclusion

2007-01-24 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Neil Conway wrote: Another question that would be easier to resolve before the patch is committed is naming: the patch currently uses a mix of full text and tsearch[2] as the name of the full-text search feature. If we're going to bless this as
