[DOCS] Documentation bug in 8.3?
Reading through the text search data type docs:

    http://www.postgresql.org/docs/8.3/static/datatype-textsearch.html#DATATYPE-TSVECTOR

it says:

    Optionally, integer position(s) can be attached to any or all of the
    lexemes:

    SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector;
                                      tsvector
    -------------------------------------------------------------------------------
     'a':1,6,10 'on':5 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'rat':12 'sat':4

    A position normally indicates the source word's location in the
    document. Positional information can be used for proximity ranking.
    Position values can range from 1 to 16383; larger numbers are silently
    clamped to 16383. Duplicate position entries are discarded.

However in my testing of 8.3 duplicate position entries are not discarded:

    test=> SELECT 'a:1 b:1'::tsvector;
       tsvector
    -------------
     'a':1 'b':1
    (1 row)

--
  Bruce Momjian  <[EMAIL PROTECTED]>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [DOCS] Documentation bug in 8.3?
Bruce Momjian <[EMAIL PROTECTED]> writes:
> clamped to 16383. Duplicate position entries are discarded.
>
> However in my testing of 8.3 duplicate position entries are not
> discarded:
>
> test=> SELECT 'a:1 b:1'::tsvector;
>    tsvector
> -------------
>  'a':1 'b':1
> (1 row)

Those aren't duplicates, because they're not attached to the same
lexeme. The comment is talking about this behavior:

regression=# SELECT 'a:1 a:1'::tsvector;
 tsvector
----------
 'a':1
(1 row)

regression=# SELECT 'a:1,2,1'::tsvector;
 tsvector
----------
 'a':1,2
(1 row)

			regards, tom lane
Re: [DOCS] Documentation bug in 8.3?
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > clamped to 16383. Duplicate position entries are discarded.
> >
> > However in my testing of 8.3 duplicate position entries are not
> > discarded:
> >
> > test=> SELECT 'a:1 b:1'::tsvector;
> >    tsvector
> > -------------
> >  'a':1 'b':1
> > (1 row)
>
> Those aren't duplicates, because they're not attached to the same
> lexeme. The comment is talking about this behavior:
>
> regression=# SELECT 'a:1 a:1'::tsvector;
>  tsvector
> ----------
>  'a':1
> (1 row)
>
> regression=# SELECT 'a:1,2,1'::tsvector;
>  tsvector
> ----------
>  'a':1,2
> (1 row)

OK, thanks. I will clarify the documentation. Patch attached and
applied.

--
  Bruce Momjian  <[EMAIL PROTECTED]>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

Index: doc/src/sgml/datatype.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v
retrieving revision 1.222
diff -c -c -r1.222 datatype.sgml
*** doc/src/sgml/datatype.sgml	2 Jan 2008 19:53:13 -0000	1.222
--- doc/src/sgml/datatype.sgml	12 Jan 2008 21:50:51 -0000
***************
*** 3330,3336 ****
      document. Positional information can be used for
      proximity ranking. Position values can range from 1 to 16383;
      larger numbers are silently clamped to 16383.
!     Duplicate position entries are discarded.
--- 3330,3336 ----
      document. Positional information can be used for
      proximity ranking. Position values can range from 1 to 16383;
      larger numbers are silently clamped to 16383.
!     Duplicate positions for the same lexeme are discarded.
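
For anyone reading this thread later, the rule as clarified above can be modeled in a few lines. This is only a toy sketch in Python, not PostgreSQL's parser: the function name normalize and the constant MAX_POS are made up for illustration, and it assumes simple space-separated lexeme:pos,pos tokens with no quoting and no weight labels. It shows that positions are de-duplicated per lexeme (so 'a:1 b:1' keeps both entries) and that values above 16383 are clamped.

```python
from collections import defaultdict

MAX_POS = 16383  # upper bound on positions, as stated in the quoted 8.3 docs


def normalize(text):
    """Toy model: map each lexeme to its sorted, de-duplicated positions.

    Assumes plain 'lexeme:pos,pos' tokens separated by whitespace; quoting
    and weight labels (A-D) are not handled in this sketch.
    """
    positions = defaultdict(set)
    for token in text.split():
        lexeme, _, pos_list = token.partition(":")
        positions[lexeme]  # register the lexeme even if it has no positions
        for pos in filter(None, pos_list.split(",")):
            # Clamp first, then add to the set: duplicates for the SAME
            # lexeme collapse, while other lexemes keep their own sets.
            positions[lexeme].add(min(int(pos), MAX_POS))
    return {lexeme: sorted(pos) for lexeme, pos in positions.items()}


print(normalize("a:1 b:1"))   # {'a': [1], 'b': [1]}
print(normalize("a:1 a:1"))   # {'a': [1]}
print(normalize("a:1,2,1"))   # {'a': [1, 2]}
```

The set is keyed by lexeme, which is the whole point of the wording change in the patch: position 1 under 'a' and position 1 under 'b' are different entries.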
