[HACKERS] syntax for reaching into records, specifically ts_stat results

2008-12-09 Thread Dan Chak

Dear Postgres Folk,

In working with tsvectors (added in 8.3), I've come to a place where  
my syntax-fu has failed me.  I've resorted to turning a result set of  
records into strings so that I can regexp out the record fields I  
need, rather than access them directly, as I'm sure it's possible to  
do with the right syntactic formulation.  Although my solution works,  
I'm sure it's much less efficient than it could be, and hope someone  
on the list can help do this the right way.


Basically, I would like to transpose a series of tsvectors (one per  
row) into columns.  E.g., as tsvects, I have this:


test=# select * from tsvects;
 sentence_id |tsvect
-+--
   1 | 'fox':3 'brown':2 'quick':1
   2 | 'lazi':1 'eleph':3 'green':2

Instead I want this:

 sentence_id | word  | freq
-+---+--
   1 | fox   | 1
   1 | brown | 1
   1 | quick | 1
   2 | lazi  | 1
   2 | eleph | 1
   2 | green | 1

I am able to generate this with the following view, but the problem is  
that to create it, I must first cast the ts_stat results to a string,  
and then regexp out the pertinent pieces:


create or replace view words as
select sentence_id,
   substring(stat from '^\\(([^,]+),') as word,
   substring(stat from ',([^,]+)\\)$') as freq
  from (select sentence_id,
   ts_stat('select tsvect from tsvects where sentence_id  
= ' ||

  tsvects.sentence_id)::text as stat
  from tsvects
   ) as foo;

It seems like there should be a way to access fields in the records  
returned from ts_stat directly, but I can't figure out how.  Here's  
the result of the subquery:


test=# select sentence_id,
   ts_stat('select tsvect from tsvects where sentence_id  
= ' ||

  tsvects.sentence_id)::text as stat
  from tsvects;
 sentence_id |stat
-+-
   1 | (fox,1,1)
   1 | (brown,1,1)
   1 | (quick,1,1)
   2 | (lazi,1,1)
   2 | (eleph,1,1)
   2 | (green,1,1)
(6 rows)

If I try to get at the elements (which I believe are named 'word',  
'ndoc', 'nentry'), I get a variety of syntax errors:


test=# select sentence_id,
test-#stat['word'],
test-#stat['nentry']
test-#   from (select sentence_id,
test(#ts_stat('select tsvect from tsvects where  
sentence_id = ' ||

test(#   tsvects.sentence_id) as stat
test(#   from tsvects
test(#) as foo;
ERROR:  cannot subscript type record because it is not an array

If I say stat.word (instead of subscripting), I get 'missing FROM- 
clause entry for table stat'.  If I say foo.stat.word, I get  
'ERROR:  schema foo does not exist'.


Any ideas on how to get into these records with resorting to text  
parsing?


Thanks,
Dan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] syntax for reaching into records, specifically ts_stat results

2008-12-09 Thread Oleg Bartunov
try 
select * from ts_stat()

btw, performance of ts_stat() was greatly improved in 8.4.

Oleg
On Tue, 9 Dec 2008, Dan Chak wrote:


Dear Postgres Folk,

In working with tsvectors (added in 8.3), I've come to a place where my 
syntax-fu has failed me.  I've resorted to turning a result set of records 
into strings so that I can regexp out the record fields I need, rather than 
access them directly, as I'm sure it's possible to do with the right 
syntactic formulation.  Although my solution works, I'm sure it's much less 
efficient than it could be, and hope someone on the list can help do this the 
right way.


Basically, I would like to transpose a series of tsvectors (one per row) into 
columns.  E.g., as tsvects, I have this:


test=# select * from tsvects;
sentence_id |tsvect
-+--
 1 | 'fox':3 'brown':2 'quick':1
 2 | 'lazi':1 'eleph':3 'green':2

Instead I want this:

sentence_id | word  | freq
-+---+--
 1 | fox   | 1
 1 | brown | 1
 1 | quick | 1
 2 | lazi  | 1
 2 | eleph | 1
 2 | green | 1

I am able to generate this with the following view, but the problem is that 
to create it, I must first cast the ts_stat results to a string, and then 
regexp out the pertinent pieces:


create or replace view words as
select sentence_id,
 substring(stat from '^\\(([^,]+),') as word,
 substring(stat from ',([^,]+)\\)$') as freq
from (select sentence_id,
 ts_stat('select tsvect from tsvects where sentence_id = ' ||
tsvects.sentence_id)::text as stat
from tsvects
 ) as foo;

It seems like there should be a way to access fields in the records returned 
from ts_stat directly, but I can't figure out how.  Here's the result of the 
subquery:


test=# select sentence_id,
 ts_stat('select tsvect from tsvects where sentence_id = ' ||
tsvects.sentence_id)::text as stat
from tsvects;
sentence_id |stat
-+-
 1 | (fox,1,1)
 1 | (brown,1,1)
 1 | (quick,1,1)
 2 | (lazi,1,1)
 2 | (eleph,1,1)
 2 | (green,1,1)
(6 rows)

If I try to get at the elements (which I believe are named 'word', 'ndoc', 
'nentry'), I get a variety of syntax errors:


test=# select sentence_id,
test-#stat['word'],
test-#stat['nentry']
test-#   from (select sentence_id,
test(#ts_stat('select tsvect from tsvects where sentence_id = 
' ||

test(#   tsvects.sentence_id) as stat
test(#   from tsvects
test(#) as foo;
ERROR:  cannot subscript type record because it is not an array

If I say stat.word (instead of subscripting), I get 'missing FROM-clause 
entry for table stat'.  If I say foo.stat.word, I get 'ERROR:  schema foo 
does not exist'.


Any ideas on how to get into these records with resorting to text parsing?

Thanks,
Dan




Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] syntax for reaching into records, specifically ts_stat results

2008-12-09 Thread Tom Lane
Dan Chak [EMAIL PROTECTED] writes:
 If I say stat.word (instead of subscripting), I get 'missing FROM- 
 clause entry for table stat'.  If I say foo.stat.word, I get  
 'ERROR:  schema foo does not exist'.

I think the syntax you need is (stat).word etc.  See Field Selection
here:
http://www.postgresql.org/docs/8.3/static/sql-expressions.html#AEN1679
The reason for the parens is exactly to distinguish whether the leading
word is a table or column name.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] syntax for reaching into records, specifically ts_stat results

2008-12-09 Thread Dan Chak

Oleg,

This syntax works fine until I also want to get the sentence_id  
column in there as well, so that I can differentiate one set of  
ts_stat results from another.  With the syntax where ts_stat is  
treated like a table, it isn't possible to run ts_stat separately on  
multiple tsvectors as I'm doing below.


Is there some generic record access syntax that I can use?

Thanks,
Dan

On Dec 9, 2008, at 3:04 PM, Oleg Bartunov wrote:


try select * from ts_stat()
btw, performance of ts_stat() was greatly improved in 8.4.

Oleg
On Tue, 9 Dec 2008, Dan Chak wrote:


Dear Postgres Folk,

In working with tsvectors (added in 8.3), I've come to a place  
where my syntax-fu has failed me.  I've resorted to turning a  
result set of records into strings so that I can regexp out the  
record fields I need, rather than access them directly, as I'm sure  
it's possible to do with the right syntactic formulation.  Although  
my solution works, I'm sure it's much less efficient than it could  
be, and hope someone on the list can help do this the right way.


Basically, I would like to transpose a series of tsvectors (one per  
row) into columns.  E.g., as tsvects, I have this:


test=# select * from tsvects;
sentence_id |tsvect
-+--
1 | 'fox':3 'brown':2 'quick':1
2 | 'lazi':1 'eleph':3 'green':2

Instead I want this:

sentence_id | word  | freq
-+---+--
1 | fox   | 1
1 | brown | 1
1 | quick | 1
2 | lazi  | 1
2 | eleph | 1
2 | green | 1

I am able to generate this with the following view, but the problem  
is that to create it, I must first cast the ts_stat results to a  
string, and then regexp out the pertinent pieces:


create or replace view words as
select sentence_id,
substring(stat from '^\\(([^,]+),') as word,
substring(stat from ',([^,]+)\\)$') as freq
from (select sentence_id,
ts_stat('select tsvect from tsvects where sentence_id =  
' ||

   tsvects.sentence_id)::text as stat
   from tsvects
) as foo;

It seems like there should be a way to access fields in the records  
returned from ts_stat directly, but I can't figure out how.  Here's  
the result of the subquery:


test=# select sentence_id,
ts_stat('select tsvect from tsvects where sentence_id =  
' ||

   tsvects.sentence_id)::text as stat
   from tsvects;
sentence_id |stat
-+-
1 | (fox,1,1)
1 | (brown,1,1)
1 | (quick,1,1)
2 | (lazi,1,1)
2 | (eleph,1,1)
2 | (green,1,1)
(6 rows)

If I try to get at the elements (which I believe are named 'word',  
'ndoc', 'nentry'), I get a variety of syntax errors:


test=# select sentence_id,
test-#stat['word'],
test-#stat['nentry']
test-#   from (select sentence_id,
test(#ts_stat('select tsvect from tsvects where  
sentence_id = ' ||

test(#   tsvects.sentence_id) as stat
test(#   from tsvects
test(#) as foo;
ERROR:  cannot subscript type record because it is not an array

If I say stat.word (instead of subscripting), I get 'missing FROM- 
clause entry for table stat'.  If I say foo.stat.word, I get  
'ERROR:  schema foo does not exist'.


Any ideas on how to get into these records with resorting to text  
parsing?


Thanks,
Dan




Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] syntax for reaching into records, specifically ts_stat results

2008-12-09 Thread Dan Chak

That works perfectly!

Thanks,
Dan

On Dec 9, 2008, at 3:13 PM, Tom Lane wrote:


Dan Chak [EMAIL PROTECTED] writes:

If I say stat.word (instead of subscripting), I get 'missing FROM-
clause entry for table stat'.  If I say foo.stat.word, I get
'ERROR:  schema foo does not exist'.


I think the syntax you need is (stat).word etc.  See Field Selection
here:
http://www.postgresql.org/docs/8.3/static/sql-expressions.html#AEN1679
The reason for the parens is exactly to distinguish whether the  
leading

word is a table or column name.

regards, tom lane



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] syntax for reaching into records, specifically ts_stat results

2008-12-09 Thread Oleg Bartunov

On Tue, 9 Dec 2008, Dan Chak wrote:


Oleg,

This syntax works fine until I also want to get the sentence_id column in 
there as well, so that I can differentiate one set of ts_stat results from 
another.  With the syntax where ts_stat is treated like a table, it isn't 
possible to run ts_stat separately on multiple tsvectors as I'm doing below.


Is there some generic record access syntax that I can use?


write function



Thanks,
Dan

On Dec 9, 2008, at 3:04 PM, Oleg Bartunov wrote:


try select * from ts_stat()
btw, performance of ts_stat() was greatly improved in 8.4.

Oleg
On Tue, 9 Dec 2008, Dan Chak wrote:


Dear Postgres Folk,

In working with tsvectors (added in 8.3), I've come to a place where my 
syntax-fu has failed me.  I've resorted to turning a result set of records 
into strings so that I can regexp out the record fields I need, rather 
than access them directly, as I'm sure it's possible to do with the right 
syntactic formulation.  Although my solution works, I'm sure it's much 
less efficient than it could be, and hope someone on the list can help do 
this the right way.


Basically, I would like to transpose a series of tsvectors (one per row) 
into columns.  E.g., as tsvects, I have this:


test=# select * from tsvects;
sentence_id |tsvect
-+--
   1 | 'fox':3 'brown':2 'quick':1
   2 | 'lazi':1 'eleph':3 'green':2

Instead I want this:

sentence_id | word  | freq
-+---+--
   1 | fox   | 1
   1 | brown | 1
   1 | quick | 1
   2 | lazi  | 1
   2 | eleph | 1
   2 | green | 1

I am able to generate this with the following view, but the problem is 
that to create it, I must first cast the ts_stat results to a string, and 
then regexp out the pertinent pieces:


create or replace view words as
select sentence_id,
   substring(stat from '^\\(([^,]+),') as word,
   substring(stat from ',([^,]+)\\)$') as freq
from (select sentence_id,
   ts_stat('select tsvect from tsvects where sentence_id = ' ||
  tsvects.sentence_id)::text as stat
  from tsvects
   ) as foo;

It seems like there should be a way to access fields in the records 
returned from ts_stat directly, but I can't figure out how.  Here's the 
result of the subquery:


test=# select sentence_id,
   ts_stat('select tsvect from tsvects where sentence_id = ' ||
  tsvects.sentence_id)::text as stat
  from tsvects;
sentence_id |stat
-+-
   1 | (fox,1,1)
   1 | (brown,1,1)
   1 | (quick,1,1)
   2 | (lazi,1,1)
   2 | (eleph,1,1)
   2 | (green,1,1)
(6 rows)

If I try to get at the elements (which I believe are named 'word', 'ndoc', 
'nentry'), I get a variety of syntax errors:


test=# select sentence_id,
test-#stat['word'],
test-#stat['nentry']
test-#   from (select sentence_id,
test(#ts_stat('select tsvect from tsvects where 
sentence_id = ' ||

test(#   tsvects.sentence_id) as stat
test(#   from tsvects
test(#) as foo;
ERROR:  cannot subscript type record because it is not an array

If I say stat.word (instead of subscripting), I get 'missing FROM-clause 
entry for table stat'.  If I say foo.stat.word, I get 'ERROR:  schema 
foo does not exist'.


Any ideas on how to get into these records with resorting to text parsing?

Thanks,
Dan




Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83






Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] syntax for reaching into records, specifically ts_stat results

2008-12-09 Thread Oleg Bartunov

ok, here is a function ( credits to Teodor )

CREATE OR REPLACE FUNCTION ts_stat(tsvector, OUT word text, OUT ndoc
integer, OUT nentry integer)
RETURNS SETOF record AS
$$
SELECT ts_stat('SELECT ' || quote_literal( $1::text ) || '::tsvector');
$$ LANGUAGE SQL RETURNS NULL ON NULL INPUT IMMUTABLE;

use it like

 select id, (ts_stat(fts)).* from apod where id=1;

Oleg


On Tue, 9 Dec 2008, Oleg Bartunov wrote:


On Tue, 9 Dec 2008, Dan Chak wrote:


Oleg,

This syntax works fine until I also want to get the sentence_id column in 
there as well, so that I can differentiate one set of ts_stat results from 
another.  With the syntax where ts_stat is treated like a table, it isn't 
possible to run ts_stat separately on multiple tsvectors as I'm doing 
below.


Is there some generic record access syntax that I can use?


write function



Thanks,
Dan

On Dec 9, 2008, at 3:04 PM, Oleg Bartunov wrote:


try select * from ts_stat()
btw, performance of ts_stat() was greatly improved in 8.4.

Oleg
On Tue, 9 Dec 2008, Dan Chak wrote:


Dear Postgres Folk,

In working with tsvectors (added in 8.3), I've come to a place where my 
syntax-fu has failed me.  I've resorted to turning a result set of 
records into strings so that I can regexp out the record fields I need, 
rather than access them directly, as I'm sure it's possible to do with 
the right syntactic formulation.  Although my solution works, I'm sure 
it's much less efficient than it could be, and hope someone on the list 
can help do this the right way.


Basically, I would like to transpose a series of tsvectors (one per row) 
into columns.  E.g., as tsvects, I have this:


test=# select * from tsvects;
sentence_id |tsvect
-+--
   1 | 'fox':3 'brown':2 'quick':1
   2 | 'lazi':1 'eleph':3 'green':2

Instead I want this:

sentence_id | word  | freq
-+---+--
   1 | fox   | 1
   1 | brown | 1
   1 | quick | 1
   2 | lazi  | 1
   2 | eleph | 1
   2 | green | 1

I am able to generate this with the following view, but the problem is 
that to create it, I must first cast the ts_stat results to a string, and 
then regexp out the pertinent pieces:


create or replace view words as
select sentence_id,
   substring(stat from '^\\(([^,]+),') as word,
   substring(stat from ',([^,]+)\\)$') as freq
from (select sentence_id,
   ts_stat('select tsvect from tsvects where sentence_id = ' ||
  tsvects.sentence_id)::text as stat
  from tsvects
   ) as foo;

It seems like there should be a way to access fields in the records 
returned from ts_stat directly, but I can't figure out how.  Here's the 
result of the subquery:


test=# select sentence_id,
   ts_stat('select tsvect from tsvects where sentence_id = ' ||
  tsvects.sentence_id)::text as stat
  from tsvects;
sentence_id |stat
-+-
   1 | (fox,1,1)
   1 | (brown,1,1)
   1 | (quick,1,1)
   2 | (lazi,1,1)
   2 | (eleph,1,1)
   2 | (green,1,1)
(6 rows)

If I try to get at the elements (which I believe are named 'word', 
'ndoc', 'nentry'), I get a variety of syntax errors:


test=# select sentence_id,
test-#stat['word'],
test-#stat['nentry']
test-#   from (select sentence_id,
test(#ts_stat('select tsvect from tsvects where 
sentence_id = ' ||

test(#   tsvects.sentence_id) as stat
test(#   from tsvects
test(#) as foo;
ERROR:  cannot subscript type record because it is not an array

If I say stat.word (instead of subscripting), I get 'missing FROM-clause 
entry for table stat'.  If I say foo.stat.word, I get 'ERROR:  schema 
foo does not exist'.


Any ideas on how to get into these records with resorting to text 
parsing?


Thanks,
Dan




Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83






Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83




Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: