Re: [HACKERS] A couple of tsearch loose ends

2007-08-22 Thread Stefan Kaltenbrunner

Tom Lane wrote:
> Dimitri Fontaine <[EMAIL PROTECTED]> writes:
>> I don't understand why this ALTER variation is so different from existing
>> ones, but maybe the following syntax can't work:
>>   ALTER TEXT SEARCH DICTIONARY swedish ALTER STOPWORDS SET swedish;
>
> You'd have to repeat the whole command for each option to be changed,
> which given the amount of typing involved seems a bit unpleasant.
>
> There are also historical differences between what is allowed by
> the SET var = value syntax and what is allowed in the
> parenthesized-option-list syntax.  Introducing an inconsistency between
> ALTER and CREATE doesn't seem appetizing.
>
> (BTW, does anyone want to teach psql's tab-completion about the new
> text search statements?)


I will take a stab at doing that ...


Stefan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] A couple of tsearch loose ends

2007-08-22 Thread Tom Lane
Dimitri Fontaine <[EMAIL PROTECTED]> writes:
> I don't understand why this ALTER variation is so different from existing
> ones, but maybe the following syntax can't work:
>   ALTER TEXT SEARCH DICTIONARY swedish ALTER STOPWORDS SET swedish;

You'd have to repeat the whole command for each option to be changed,
which given the amount of typing involved seems a bit unpleasant.

There are also historical differences between what is allowed by
the SET var = value syntax and what is allowed in the
parenthesized-option-list syntax.  Introducing an inconsistency between
ALTER and CREATE doesn't seem appetizing.

(BTW, does anyone want to teach psql's tab-completion about the new
text search statements?)

regards, tom lane



Re: [HACKERS] A couple of tsearch loose ends

2007-08-22 Thread Dimitri Fontaine
Hi list,

On Tuesday 21 August 2007, Tom Lane wrote:
> CREATE TEXT SEARCH DICTIONARY swedish (
> TEMPLATE = snowball,
> LANGUAGE = swedish,
> STOPWORDS = swedish
> );
>
> ALTER TEXT SEARCH DICTIONARY swedish (
> STOPWORDS
> );
>
> this dictionary would have LANGUAGE = swedish and no stopwords option.
>
> Any objections to changing it like that?

I don't understand why this ALTER variation is so different from existing 
ones, but maybe the following syntax can't work:
  ALTER TEXT SEARCH DICTIONARY swedish ALTER STOPWORDS SET swedish;

For dropping an option, could one of those commands do?
  ALTER TEXT SEARCH DICTIONARY swedish DROP STOPWORDS;
  ALTER TEXT SEARCH DICTIONARY swedish ALTER STOPWORDS SET NULL;

Not sure if it's doable or if it really looks more like other ALTER commands, 
but I think I'd like it more this way :)

Hope this helps,
-- 
dim




Re: [HACKERS] A couple of tsearch loose ends

2007-08-21 Thread Oleg Bartunov

On Tue, 21 Aug 2007, Tom Lane wrote:

> When you look at it, this is downright silly.  Why don't we flatten
> the two levels together and write something like
>
> CREATE TEXT SEARCH DICTIONARY swedish (
>    TEMPLATE = snowball,
>    LANGUAGE = swedish,
>    STOPWORDS = swedish
> );

A dictionary is a program with its own options, so we can't know in advance
which options it actually uses. We can reserve some options, though.
This is a very useful feature.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] A couple of tsearch loose ends

2007-08-21 Thread Bruce Momjian
Tom Lane wrote:
> There are a couple of naming issues that I left untouched while
> reviewing the tsearch patch, but wanted to bring up for discussion.
> 
> One thing that had me confused for a while is that the patch uses
> the word "template" in two different ways.  The main use is that a
> "template" is an object encapsulating the superuser-only aspects of
> defining a dictionary.  When you do CREATE TEXT SEARCH DICTIONARY
> you have to specify a template to base it on.  So in this context
> a dictionary and its template are different kinds of objects, and
> there's a persistent connection between them.

What has me concerned is the idea of database templates being different
from text search dictionary templates.  Why can't they function the same
way?

> On the other hand, CREATE TEXT SEARCH CONFIGURATION also uses the
> word "template", but in this case it's an optional specification
> of an existing configuration that gets copied.  So here, the config
> and the template are the same kind of object, and there's no
> connection between them after the copy is made.
> 
> This seems a bit confusing, and I wonder whether we ought not
> change the terminology for one thing or the other.  I don't
> particularly want to rename text search templates ... that would
> be quite a bit of work at this point ... so what I'd suggest is
> that the option to CREATE TEXT SEARCH CONFIGURATION be renamed
> "COPY" instead of "TEMPLATE".  Another thought here is that I'm
> inclined to drop the "with map" option and just always copy the
> source configuration exactly.  If you don't want the map, the
> only other information the source can provide is a parser name,
> which you might as well just give directly.

Agreed on the use of COPY.  I already pointed out this confusion in a
previous email.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] A couple of tsearch loose ends

2007-08-21 Thread Tom Lane
Teodor Sigaev <[EMAIL PROTECTED]> writes:
> I don't have any objections. "with map" was introduced when other options
> existed: locale and the default flag.

OK, I'll make that happen.

>> The other thing that was bugging me was that a lot of the dictionary
>> types have init options that are named things like DictFile, AffFile,
>> etc.

> DictFile and AffFile are the files of ispell (or ispell-derived)
> dictionaries. We don't manage those files: they require a lot of
> linguistic knowledge which we don't have, and I don't expect that there is
> such a person in the pgsql community. So we just use them.

Hmm ... I suppose, but I'd still prefer that the option names didn't
include the word "file".

Also, while revising the reference pages for the syntax changes I made,
I realized that there's further simplification possible for the
dictionary commands.  I changed these commands to use the same
"definition list" construct that's used by CREATE OPERATOR and such.
It has the nice property that the option "keywords" aren't actually
keywords in the eyes of the grammar, they're just any identifiers.
So what we have got as of CVS HEAD is

CREATE TEXT SEARCH DICTIONARY name (
TEMPLATE = template
[, OPTION = init_options ]
)

ALTER TEXT SEARCH DICTIONARY name (
OPTION = init_options
)

where "init_options" is supposed to be a string literal containing stuff
like
'Language=swedish, StopWords=swedish'

When you look at it, this is downright silly.  Why don't we flatten
the two levels together and write something like

CREATE TEXT SEARCH DICTIONARY swedish (
TEMPLATE = snowball,
LANGUAGE = swedish,
STOPWORDS = swedish
);

The original implementation couldn't do that but it's easy in the
definition-list grammar.  This is even more useful for ALTER, because
it'd be possible to change the value of one option without having to
write out the values of all the others.  What I'd suggest is that
we adopt the convention that an option is dropped if its name appears
with no value, otherwise it's kept unless overridden with a new value.
So after

ALTER TEXT SEARCH DICTIONARY swedish (
STOPWORDS
);

this dictionary would have LANGUAGE = swedish and no stopwords option.
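
The convention proposed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the backend implementation: the function name alter_options is hypothetical, a dictionary's options are modelled as a plain dict, None stands for "name given with no value", and names are lower-cased on the assumption that they behave like unquoted identifiers.

```python
from typing import Dict, Optional


def alter_options(current: Dict[str, str],
                  changes: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Apply an ALTER-style definition list to an existing option set.

    Hypothetical helper: a value of None means the option name appeared
    with no value, so the option is dropped. Any other value overrides
    the stored one. Options not mentioned in `changes` are kept.
    """
    # Assumption: option names are case-insensitive, like unquoted identifiers.
    result = {name.lower(): value for name, value in current.items()}
    for name, value in changes.items():
        key = name.lower()
        if value is None:
            result.pop(key, None)   # name with no value: drop the option
        else:
            result[key] = value     # name = value: add or override
    return result


# State after the CREATE example (the template itself is not an option here).
swedish = {"language": "swedish", "stopwords": "swedish"}

# ALTER TEXT SEARCH DICTIONARY swedish ( STOPWORDS );
print(alter_options(swedish, {"STOPWORDS": None}))  # {'language': 'swedish'}
```

Under this reading, changing a single option never requires restating the others, which is the advantage claimed for the flattened syntax.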

Any objections to changing it like that?

regards, tom lane



Re: [HACKERS] A couple of tsearch loose ends

2007-08-21 Thread Teodor Sigaev

> "COPY" instead of "TEMPLATE".  Another thought here is that I'm
> inclined to drop the "with map" option and just always copy the
> source configuration exactly.  If you don't want the map, the
> only other information the source can provide is a parser name,
> which you might as well just give directly.

I don't have any objections. "with map" was introduced when other options
existed: locale and the default flag.

> The other thing that was bugging me was that a lot of the dictionary
> types have init options that are named things like DictFile, AffFile,
> etc.  As I mentioned before, I dislike the fact that these things are
> out in the filesystem rather than inside the database, and hope that
> that will change eventually.  So I think that these names are not

DictFile and AffFile are the files of ispell (or ispell-derived)
dictionaries. We don't manage those files: they require a lot of
linguistic knowledge which we don't have, and I don't expect that there is
such a person in the pgsql community. So we just use them.

Managing stop words is much simpler, so the list could be stored in the
database rather than in a file.




Re: [HACKERS] A couple of tsearch loose ends

2007-08-21 Thread Pavel Stehule
Hello

>
> The other thing that was bugging me was that a lot of the dictionary
> types have init options that are named things like DictFile, AffFile,
> etc.  As I mentioned before, I dislike the fact that these things are
> out in the filesystem rather than inside the database, and hope that
> that will change eventually.  So I think that these names are not
> future-proof and should be altered to not use the word "file";
> especially so in view of the fact that as committed, the patch doesn't
> let you specify a path name for them.  I already did that to StopFile,
> which is now StopWords, but did not touch the other dictionary options.
> I'm not sure what to do with DictFile, because that doesn't seem to have
> any special meaning at all once you take out "file" ...
>

And what about dictionary-based languages?

Regards
Pavel Stehule
