On Mon, 30 Jul 2007, Bruce Momjian wrote:

Bruce Momjian wrote:
We have to decide if we want a GUC default_text_search_config, and if so
when can it be changed.

Right now there are three ways to create a tsvector (or tsquery)

        ::tsvector
        to_tsvector(value)
        to_tsvector(config, value)

(ignoring plainto_tsvector)

Only the last one specifies the configuration. The others use the
configuration specified by default_text_search_config.  (We had an
previous discussion on what the default value of
default_text_search_config should be, and it was decided it should be
set via initdb based on a flag or the locale.)

Now, because most people use a single configuration, they can just set
default_text_search_config and there is no need to specify the
configuration name.

However, expression indexes cause a problem here:

        
http://momjian.us/expire/fulltext/HTML/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX

We recommend that users create an expression index on the column they
want to do a full text search on, e.g.

        CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector(body));

However, the big problem is that the expressions used in expression
indexes should not change their output based on the value of a GUC
variable (because it would corrupt the index), but in the case above,
default_text_search_config controls what configuration is used, and
hence the output of to_tsvector is changed if default_text_search_config
changes.

We have a few possible options:

        1) Document the problem and do nothing else.
        2) Make default_text_search_config a postgresql.conf-only
           setting, thereby making it impossible to change by non-super
           users, or make it a super-user-only setting.
        3) Remove default_text_search_config and require the
           configuration to be specified in each function call.

If we remove default_text_search_config, it would also make ::tsvector
casting useless as well.

OK, I just found a case that I think is going to make #3 a requirement
(remove default_text_search_config).

How is a CREATE INDEX ... to_tsvector(col) going to restore from a
pg_dump?  I see no way of guaranteeing that the
default_text_search_config is correct on the restore, and in fact I
don't think we have any way of knowing the default_text_search_config
used for the index.

And if we have to require the configuration name in CREATE INDEX, it has
to be used in WHERE, so we might as well just remove the default
capability and always require the configuration name.

this is very rare use case for text searching 1. expression index without configuration name
2. default_text_search_config can be changed by somebody

If somebody really need it, then he should be adviced to use configuration name, else we don't guarantee that somebody could change default_text_search_config variable and this could lead to incorrect dump/restore.

I don't think we should remove default_text_search_config because of this rare case.

        Regards,
                Oleg

PS.

Bruce, I'm in the mountains the Northern Caucasia and internet is
a bit  unreliable :(

_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Reply via email to