On Wed, 8 Aug 2007, Bruce Momjian wrote:
Heikki Linnakangas wrote:
Sure, but you have to make sure you use the right configuration in the
trigger, no? Does the tsquery have to use the same configuration?
I wish I knew this myself. :-) Whatever I had done happened to work
but that was largely through people on IRC walking me through it.
This illustrates the major issue --- that this has to be simple for
people to get started, while keeping the capabilities for experienced
users.
I am now thinking that making users always specify the configuration
name and not allowing :: casting is going to be the best approach. We
can always add more in 8.4 after it is in wide use.
I just read the docs and I'm trying to get a grip of the problem here.
If I understood correctly, the basic issue is that a tsvector datum
created using configuration A is incompatible with a tsquery datum
created using configuration B, in the sense that you won't get
reasonable results if you use the tsquery to search the tsvector, or do
ranking or highlighting. If the configurations happen to be similar
enough, it can work, but not in general.
not fair. There are many cases when one can intentionally use different
configurations. But I agree, this is not for beginners.
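A minimal sketch of the mismatch described above, assuming the built-in
'english' and 'simple' configurations and the two-argument forms of
to_tsvector and to_tsquery (the sample sentence is made up for
illustration):

```sql
-- Same configuration on both sides: document and query are both
-- stemmed to 'run', so they match.
SELECT to_tsvector('english', 'The cats were running')
       @@ to_tsquery('english', 'running');   -- true

-- Different configurations: the vector holds the stem 'run', but the
-- 'simple' configuration leaves the query term as 'running'.
SELECT to_tsvector('english', 'The cats were running')
       @@ to_tsquery('simple', 'running');    -- false
```

Neither statement raises an error, which is why a mismatch is easy to
miss.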
That underlying issue manifests itself in many ways, including:
- if you create a table with a field of type tsvector, typically kept
up-to-date by triggers, and do a search on it using a different
configuration, you get incorrect results.
again, you might want to use different configuration.
- using an expression index instead of a tsvector-field, and always
explicitly specifying the configuration, you can avoid that problem (a
query with a different configuration won't use the index). But an
expression index, without explicitly specifying the configuration, will
get corrupted if you change the default configuration.
The same problem arises if you (accidentally) drop a constraint from a
table and then get surprised by the SELECT results.
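The expression-index case above could look roughly like this. It is only
a sketch: foo and textcol are the names used in the query later in this
message, while the id column, the index name, and the choice of the
'english' and 'simple' configurations are assumptions.

```sql
-- Hypothetical table; only foo and textcol come from the thread.
CREATE TABLE foo (id serial PRIMARY KEY, textcol text);

-- The configuration is part of the indexed expression itself.
CREATE INDEX foo_textcol_fts_idx
    ON foo USING gin (to_tsvector('english', textcol));

-- Can use the index: the expression matches the index definition.
SELECT * FROM foo
 WHERE to_tsvector('english', textcol) @@ to_tsquery('english', 'search');

-- Cannot use the index: a different configuration makes this a
-- different expression, so the table is scanned instead.
SELECT * FROM foo
 WHERE to_tsvector('simple', textcol) @@ to_tsquery('simple', 'search');
```

Because the configuration name is stored in the index definition, a later
change of the default configuration does not affect what this index
contains.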
Removing the default configuration setting altogether removes the 2nd
problem, but that's not good from a usability point of view. And it
doesn't solve the general issue, you can still do things like:
SELECT * FROM foo WHERE to_tsvector('confA', textcol) @@
True, but in that case you are specifically naming different
configurations, so it is hopefully obvious you have a mismatch.
ISTM we should have a separate tsvector and tsquery data type for each
configuration, and throw an error if you try to mix and match them in a
query. to_tsquery and to_tsvector would be a new kind of polymorphic
functions that work with the types. Or we could automatically create a
copy of them when you create a new configuration. We could have a
default configuration setting and rewrite queries that don't explicitly
specify a configuration to use the default.
That is going to make multiple configurations quite complex in the
backend, and I think for little value.
You could still get into trouble if you alter the configuration after
starting to use it. We could solve that by not allowing you to ALTER
CONFIGURATION, at least not if it's used in tables or indexes. Forcing
people to create a new configuration, and to recreate all indexes and
tsvector columns every time you add a word to a stop-list, for example,
seems too onerous, though. Not sure what to do about that.
Yea, seems more work than is necessary. If we require the configuration
to be always supplied, and document that mismatches are a problem, I
think we are in good shape.
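As a sketch of what "always supplied" might look like for a
trigger-maintained column: the table, column, and trigger names below are
hypothetical, and the example assumes the tsvector_update_trigger
function as it exists in released PostgreSQL, which takes a
schema-qualified configuration name.

```sql
-- Hypothetical table with a stored tsvector column.
CREATE TABLE docs (
    id       serial PRIMARY KEY,
    body     text,
    body_tsv tsvector
);

-- The trigger names the configuration explicitly.
CREATE TRIGGER docs_tsv_update
    BEFORE INSERT OR UPDATE ON docs
    FOR EACH ROW
    EXECUTE PROCEDURE
        tsvector_update_trigger(body_tsv, 'pg_catalog.english', body);

INSERT INTO docs (body) VALUES ('The cats were running');

-- The query names the same configuration, so both sides are stemmed
-- the same way and the inserted row is found.
SELECT id FROM docs
 WHERE body_tsv @@ to_tsquery('english', 'running');
```

With the name spelled out in both places, a mismatch is at least visible
when reading the trigger definition next to the query.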
We should agree that all you describe is only for DUMMY users.
From the authors' point of view, I dislike your approach of treating text searching
as a very limited tool. But I understand that we should preserve people from
I want easy setup and error-proof functionality for beginners,
while leaving experienced users free to develop complex search engines.
Can we have a separate, safe interface for text searching and explicitly
recommend it for beginners?
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83