[PATCHES] Improvement to pg_trgm readme

2004-11-25 Thread Christopher Kings-Lynne
This adds mention of my latest tweak to the tsearch2/pg_trgm 
integration.  It is much better to create a word list of unstemmed words 
than stemmed ones.

Chris
Index: contrib/pg_trgm/README.pg_trgm
===
RCS file: /projects/cvsroot/pgsql/contrib/pg_trgm/README.pg_trgm,v
retrieving revision 1.1
diff -c -r1.1 README.pg_trgm
*** contrib/pg_trgm/README.pg_trgm  31 May 2004 17:18:11 -  1.1
--- contrib/pg_trgm/README.pg_trgm  26 Nov 2004 01:31:39 -
***
*** 100,110 
The first step is to generate an auxiliary table containing all
the unique words in the Tsearch2 index:
  
!   CREATE TABLE words AS 
!   SELECT word FROM stat('SELECT vector FROM documents');
  
!   Where 'documents' is the table that contains the Tsearch2 index
!   column 'vector', of type 'tsvector'.
  
Next, create a trigram index on the word column:
  
--- 100,114 
The first step is to generate an auxiliary table containing all
the unique words in the Tsearch2 index:
  
!   CREATE TABLE words AS SELECT word FROM
!   stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
  
!   Where 'documents' is a table that has a text field 'bodytext'
!   that TSearch2 is used to search.  The use of the 'simple' dictionary
!   with the to_tsvector function, instead of just using the already
!   existing vector is to avoid creating a list of already stemmed
!   words.  This way, only the original, unstemmed words are added
!   to the word list.
  
Next, create a trigram index on the word column:
  

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PATCHES] SQL conformance related patch

2004-11-25 Thread Troels Arvin
On Thu, 25 Nov 2004 22:13:01 +0100, Peter Eisentraut wrote:

> I could buy "SQL:1999", because then "SQL" would be a macro expanding to 
> "ISO/IEC 9075".  But I don't see how SQL-92 fits in there.

There was a naming change between SQL-92 and SQL:1999, according to the
mentioned book.
  The ":" is because the standard's name has been brought in line with
ISO's naming standards. The expansion to four digits was because of the
general year-2000 trend.

>> Note 1:
>> http://books.elsevier.com/mk/default.asp?isbn=1558604561 has a
>> section on this subject.
> 
> Note that this book is titled "SQL 1999 - Understanding Relational 
> Language Components".  Apparently, they are not sure either.

Yes, that's ironic. But I have the book right here. There is clearly a ":"
in the name on the front page, as well as in the colophon.

-- 
Greetings from Troels Arvin, Copenhagen, Denmark



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PATCHES] SQL conformance related patch

2004-11-25 Thread Peter Eisentraut
Troels Arvin wrote:
> We have tried to use the official[1] short names:
> SQL-92
> SQL:1999
> SQL:2003

I could buy "SQL:1999", because then "SQL" would be a macro expanding to 
"ISO/IEC 9075".  But I don't see how SQL-92 fits in there.

> Note 1:
> http://books.elsevier.com/mk/default.asp?isbn=1558604561 has a
> section on this subject.

Note that this book is titled "SQL 1999 - Understanding Relational 
Language Components".  Apparently, they are not sure either.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PATCHES] SQL conformance related patch

2004-11-25 Thread Troels Arvin
On Thu, 25 Nov 2004 19:16:47 +0100, Peter Eisentraut wrote:

> Btw., does anyone mind if I change the names of the standards to
> 
> SQL 1992
> SQL 1999
> SQL 2003
> 
> ?  The other styles seem to be rather contrived and are not applied
> consistently.

We have tried to use the official[1] short names:
SQL-92
SQL:1999
SQL:2003

- But it's true that those short names are not used consistently
throughout the documentation.

I see no reason not to use the offical short names; to me, they look OK
text-wise. But it boils down to a matter of taste.


Note 1:
http://books.elsevier.com/mk/default.asp?isbn=1558604561 has a section on
this subject.

-- 
Greetings from Troels Arvin, Copenhagen, Denmark



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PATCHES] SQL conformance related patch

2004-11-25 Thread Peter Eisentraut
Troels Arvin wrote:
> Simon Riggs, Elein Mustain, and I have worked on adjustments to the
> information schema and parts of the documentation to reflect upcoming
> changes in PostgreSQL 8 and changes from SQL:1999 to SQL:2003.

Thanks.  I will evaluate this patch.

Btw., does anyone mind if I change the names of the standards to

SQL 1992
SQL 1999
SQL 2003

?  The other styles seem to be rather contrived and are not applied 
consistently.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PATCHES] pg_dumplo schema support

2004-11-25 Thread Karel Zak

Hi,

it looks users still use pg_dumplo from contrib tree:

On Thu, 2004-11-18 at 17:45 +0100, Düster Horst wrote:
> Hallo Karel
>
> I tried to use your pg_dumplo but it seems, that the version 
> contributed with pg-7.4.5 does not support schemas.

here is a patch with:

* support absolute paths in dump index 
* support for schemas
* queries to catalog uses "pg_catalog." prefix
* default directory is $PWD or "."
* check index file format
* schema support is backward compatible (default schema is "public")

Maybe it's too late for 8.0, but I think nobody test it during beta
releases. I tested it now with this patch and it works for me. If you
think that 8.1 will better, please apply it to 8.1.

Thanks,
Karel

-- 
Karel Zak <[EMAIL PROTECTED]>


pg_dumplo-schema-11232004.patch.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PATCHES] SQL conformance related patch

2004-11-25 Thread Troels Arvin
Hello,

Simon Riggs, Elein Mustain, and I have worked on adjustments to the
information schema and parts of the documentation to reflect upcoming
changes in PostgreSQL 8 and changes from SQL:1999 to SQL:2003.

Attached is the result of that: A "diff -c" patch for current CVS HEAD.
The patch is also available via HTTP[1].

The patch subsumes Simon's patch of November 21 with subject "SQL
Conformance introductory section"[2]. Peter's objection to Simon's
choice of words has been accommodated.

We've had some discussions regarding feature ID F411 (Time zone
specification). If we were to be strict, PostgreSQL's support for F411
should be changed to NO, due to a difference in TIMESTAMP literal
handling. As it is, the status remains YES, but a comment has been added
in the information schema, and the datetime documentation has been
expanded a bit.

Simon also has ideas for improvement of the conformance documentation
page (features.sgml), but that's probably a version 8.1 thing. I think
he'd better describe the ideas himself.

The list of probably-not-but-just-maybe-supported features in my message
as of October 20[3] remains unchanged, but as previously discussed on
this list, time is running out.

As previously mentioned, I've created a little list of "low hanging
fruits" for PostgreSQL's standards-conformance - i.e. features in
SQL:2003 which PostgreSQL currently doesn't support but which might
actually be relatively easy to implement:
http://troels.arvin.dk/db/pgsql/conformance/low-hanging-fruits.txt



References:

1:
http://troels.arvin.dk/db/pgsql/conformance/pgsql-sql-conformance.patch

2:
http://article.gmane.org/gmane.comp.db.postgresql.devel.patches/10318

3:
http://article.gmane.org/gmane.comp.db.postgresql.devel.documentation/1278

-- 
Greetings from Troels Arvin, Copenhagen, Denmark
Index: doc/src/sgml/datatype.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/datatype.sgml,v
retrieving revision 1.150
diff -c -r1.150 datatype.sgml
*** doc/src/sgml/datatype.sgml	20 Sep 2004 22:48:25 -	1.150
--- doc/src/sgml/datatype.sgml	25 Nov 2004 11:01:11 -
***
*** 1744,1752 
   
  
   
!   For timestamp [without time zone], any explicit time
!   zone specified in the input is silently ignored. That is, the
!   resulting date/time value is derived from the explicit date/time
fields in the input value, and is not adjusted for time zone.
   
  
--- 1744,1763 
   
  
   
!   The SQL standard differentiates timestamp without time zone 
!   and timestamp with time zone literals by the existence of a 
!   +; or -. Hence, according to the standard, 
!   TIMESTAMP '2004-10-19 10:23:54'
!   is a timestamp without time zone, while
!   TIMESTAMP '2004-10-19 10:23:54+02'
!   is a timestamp with time zone.
!   PostgreSQL 
!   differs from the standard by requiring that timestamp with time zone 
!   literals be explicitly typed:
!   TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'
!   If a literal is not explicitly indicated as being of timestamp with time zone,
!   PostgreSQL will silently ignore any time zone indication in the literal.
!   That is, the resulting date/time value is derived from the date/time
fields in the input value, and is not adjusted for time zone.
   
  
***
*** 1778,1783 
--- 1789,1795 
as timezone local time.  A different zone reference can
be specified for the conversion using AT TIME ZONE.
   
+ 
  
  
  
Index: doc/src/sgml/features.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/features.sgml,v
retrieving revision 2.22
diff -c -r2.22 features.sgml
*** doc/src/sgml/features.sgml	29 Nov 2003 19:51:37 -	2.22
--- doc/src/sgml/features.sgml	25 Nov 2004 11:01:11 -
***
*** 7,35 
  
   
This section attempts to outline to what extent
!   PostgreSQL conforms to the SQL standard.
!   Full compliance to the standard or a complete statement about the
!   compliance to the standard is complicated and not particularly
!   useful, so this section can only give an overview.
!   
  
   
The formal name of the SQL standard is ISO/IEC 9075 Database
!   Language SQL.  A revised version of the standard is released
!   from time to time; the most recent one appearing in 1999.  That
!   version is referred to as ISO/IEC 9075:1999, or informally as SQL99.
!   The version prior to that was SQL92.
!   PostgreSQL development tends to aim for
conformance with the latest official version of the standard where
such conformance does not contradict traditional features or common
!   sense.  At the time of this writing, balloting is under way for a
!   new revision of the standard, which, if approved, will eventually
!   become the conformance target for future
!   Postg