[GENERAL] PG9.0 planner difference to 8.3 - majorly bad performance

2011-01-29 Thread Uwe Schroeder

Maybe someone here can make sense of this. 
I'm trying to upgrade an 8.3 system to a 9.0 system. Usual procedure: dump, 
restore, vacuum full, reindex.

Both - old and new - run on the same hardware and the postgresql.conf settings 
are identical.

You'll probably ask for the table definitions, which I'm happy to provide, but 
I'll omit them here for the sake of a shorter message. Basically everything is 
identical: data, tables, indexes, hardware, config.

I should mention that the tables are really views - maybe something in the 
views changed in 9.0.2.

I run this query on the 8.3 system:

explain analyze SELECT count(v_bprofile_comments_club16.id)  FROM 
v_bprofile_comments_club16  

 
WHERE v_bprofile_comments_club16.profile_id = '5584' AND 
v_bprofile_comments_club16.approved = true;

                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=6294.37..6294.38 rows=1 width=4) (actual time=269.633..269.635 rows=1 loops=1)
   ->  Nested Loop  (cost=1889.71..6273.06 rows=8523 width=4) (actual time=156.585..266.325 rows=1641 loops=1)
         ->  Nested Loop  (cost=1889.71..5858.47 rows=779 width=16) (actual time=156.565..237.216 rows=1641 loops=1)
               ->  Nested Loop  (cost=1889.71..4488.01 rows=763 width=12) (actual time=156.453..200.174 rows=1641 loops=1)
                     ->  Index Scan using bprofile_pkey on bprofile m  (cost=0.00..4.27 rows=1 width=4) (actual time=0.140..0.145 rows=1 loops=1)
                           Index Cond: (id = 5584)
                     ->  Hash Left Join  (cost=1889.71..4476.11 rows=763 width=16) (actual time=156.302..194.762 rows=1641 loops=1)
                           Hash Cond: (b.uid = ug.user_id)
                           ->  Hash Join  (cost=1821.55..4399.44 rows=763 width=20) (actual time=151.372..183.103 rows=1641 loops=1)
                                 Hash Cond: (c.from_id = b.id)
                                 ->  Index Scan using bprofile_comments_status_idx on bprofile_comments c  (cost=0.00..2558.77 rows=1531 width=12) (actual time=0.140..21.559 rows=1660 loops=1)
                                       Index Cond: ((profile_id = 5584) AND (approved = true))
                                       Filter: approved
                                 ->  Hash  (cost=1726.15..1726.15 rows=7632 width=8) (actual time=151.131..151.131 rows=14782 loops=1)
                                       ->  Hash Left Join  (cost=61.50..1726.15 rows=7632 width=8) (actual time=2.622..119.268 rows=14782 loops=1)
                                             Hash Cond: (b.uid = ugi.user_id)
                                             Filter: (gi.group_name IS NULL)
                                             ->  Seq Scan on bprofile b  (cost=0.00..1579.44 rows=15265 width=8) (actual time=0.058..64.033 rows=15265 loops=1)
                                                   Filter: (NOT deleted)
                                             ->  Hash  (cost=55.12..55.12 rows=510 width=13) (actual time=2.526..2.526 rows=231 loops=1)
                                                   ->  Nested Loop  (cost=0.00..55.12 rows=510 width=13) (actual time=0.136..1.909 rows=231 loops=1)
                                                         ->  Seq Scan on tg_group gi  (cost=0.00..1.07 rows=1 width=13) (actual time=0.041..0.050 rows=1 loops=1)
                                                               Filter: ((group_name)::text = 'Club16'::text)
                                                         ->  Index Scan using user_group_group_idx on user_group ugi  (cost=0.00..45.80 rows=660 width=8) (actual time=0.084..1.071 rows=231 loops=1)
                                                               Index Cond: (ugi.group_id = gi.group_id)
                           ->  Hash  (cost=55.35..55.35 rows=1025 width=4) (actual time=4.866..4.866 rows=1025 loops=1)
                                 ->  Index Scan using user_group_group_idx on user_group ug  (cost=0.00..55.35 rows=1025 width=4) (actual time=0.058..2.766 rows=1025 loops=1)
                                       Index Cond: (group_id = 2)
               ->  Index Scan using bphotos_profile_primary_idx on bphotos p  (cost=0.00..1.78 rows=1 width=4) (actual time=0.012..0.015 rows=1 loops=1641)
                     Index Cond: ((p.profile_id = b.id) AND (p.is_primary = true))
 

Re: [GENERAL] PG9.0 planner difference to 8.3 - majorly bad performance

2011-01-29 Thread Stephen Frost
* Uwe Schroeder (u...@oss4u.com) wrote:
 Now I turn off the 8.3 instance and start the 9.0 instance. Remember, 
 everything is identical. Here the same query again:

Everything isn't identical if you just started PG 9.0 though - presumably
the 8.3 instance had everything cached already.  What happens if you
run this query again under 9.0?

 The duration suddenly goes from 270 milliseconds to 173 seconds! The index 
 scan on bprofile_comments_status_idx suddenly shows 15288 loops, where it 
 should be 1 loop just like before. So somehow the 9.0 planner gets it all 
 wrong.

You do have a different plan, but both of them have a Nested Loop, with
a Hash table built inside it.  The 9.0 plan does also do the index scan
inside the loop, but if you look at the actual time, that's not really
causing a huge difference.  One thing I'm wondering about is if 9.0 is
getting a more accurate view of the amount of data and is realizing that
it might go over work_mem with the big Hash Left Join, and so decides
against it.  What does your work_mem setting look like on each system?
Have you tried increasing it?
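As a sketch of Stephen's suggestion: work_mem can be raised for a single session, so nothing in postgresql.conf has to change for the test ('64MB' is only an illustrative value, not a recommendation for this workload):

```sql
-- Check the current setting first
SHOW work_mem;

-- Raise it for this session only, then re-run the slow query
SET work_mem = '64MB';
EXPLAIN ANALYZE SELECT count(v_bprofile_comments_club16.id)
FROM v_bprofile_comments_club16
WHERE v_bprofile_comments_club16.profile_id = '5584'
  AND v_bprofile_comments_club16.approved = true;
```

If the plan flips back to the 8.3 shape with a higher work_mem, that points at the hash-spill theory above.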

Thanks,

Stephen

 I also noticed that normally I get an iowait of a few percent during such 
 operations (on 8.3), whereas with PG 9 I get 0 iowait and 100% CPU. PG 9 has 
 a much smaller memory footprint than 8.3 in the same configuration - so this 
 all makes very little sense to me. Maybe someone here has an idea.
 
 Thanks
 
 Uwe
 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general


signature.asc
Description: Digital signature


Re: [GENERAL] Adding more space, and a vacuum question.

2011-01-29 Thread Craig Ringer

On 01/29/2011 05:12 AM, Herouth Maoz wrote:


The machine has no additional room for internal disks. It is a recent
purchase and not likely to be replaced any time soon.


Newly acquired or not, it sounds like it isn't sized correctly for the 
load and needs an upgrade if it can't be shifted into a more suitable 
role and replaced.



Now, my position
is that the best solution would be to add an external hard disk, via
USB/firewire


eSATA? Via a PCI or PCIe add-in SATA controller if there's no existing 
eSATA.


FireWire is usable for a database. USB is too ... kind of. Performance 
will be poor because of the high latency, CPU-heavy non-DMA access done 
by the USB stack.


For something read-only, that might be OK.


and use it for the archive tables. My sysadmin, on the
other hand, wants to mount a storage machine remotely and use it for the
extra tablespace, as the storage machine is a more reliable hardware.


If you have iSCSI or ATA-over-Ethernet disk volumes you can mount, that 
might be a good idea. I'd personally avoid NFS or SMB.


That said, again if it's read-only you might be fine.


think that remote mounted volumes are not a proper device for a
database, as the network is subject to load and I've run into frozen
mounts in both NFS and SMB in the past. Never mind being slower.


Remote *file* *level* shares aren't great for databases, IMO.


a. Is it normal for vacuum processes to take two weeks?


Define "really huge" and describe the hardware; without numbers it's 
hard to know. What version of Pg are you using?


Was it a standalone VACUUM or was it an autovacuum worker?


b. What happens if the vacuum process is stopped? Are the tuples
partially recovered, or are they only recovered if the process completes
properly?


I *think* tuples become available progressively, but I'm not certain of 
that.



c. Is there anything I can do to make vacuums shorter?


Do it much more often.

Use Pg 8.4 or later, with visibility map.
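The knobs behind "do it much more often" and shorter vacuums can be sketched like this (values illustrative only, not tuned for this machine; autovacuum settings go in postgresql.conf and need a reload):

```sql
-- For a manual VACUUM: give the dead-tuple list more memory (session-level)
SET maintenance_work_mem = '256MB';
VACUUM VERBOSE archived_messages;   -- hypothetical table name

-- For autovacuum, cost-based throttling is usually what stretches a vacuum
-- out for weeks; in postgresql.conf (8.3 defaults shown in comments):
-- autovacuum_vacuum_cost_delay = 10ms   -- default is 20ms
-- vacuum_cost_limit = 1000              -- default is 200
```

Less delay and a higher cost limit let vacuum do more I/O per second, at the price of more load on the disks while it runs.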

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG9.0 planner difference to 8.3 - majorly bad performance

2011-01-29 Thread Thom Brown
On 29 January 2011 09:11, Uwe Schroeder u...@oss4u.com wrote:

 Maybe someone here can make sense of this.
 I'm trying to upgrade a 8.3 system to a 9.0 system. Usual procedure dump, 
 restore, vac full, reindex.

 Both - old and new - run on the same hardware and the postgresql.conf 
 settings are identical.

 You'll probably ask for the table definitions, which I'm happy to provide, 
 but I'll omit them here for the sake of a shorter message. Basically 
 everything is identical: data, tables, indexes, hardware,
 config.

 I should mention that the tables are really views - maybe something in the 
 views changed in 9.0.2

 I run this query on the 8.3 system:

 explain analyze SELECT count(v_bprofile_comments_club16.id)  FROM 
 v_bprofile_comments_club16
 WHERE v_bprofile_comments_club16.profile_id = '5584' AND 
 v_bprofile_comments_club16.approved = true;
                                                                               
                                 QUERY PLAN
 --------------------------------------------------------------------------------
  Aggregate  (cost=6294.37..6294.38 rows=1 width=4) (actual time=269.633..269.635 rows=1 loops=1)
    ->  Nested Loop  (cost=1889.71..6273.06 rows=8523 width=4) (actual time=156.585..266.325 rows=1641 loops=1)
          ->  Nested Loop  (cost=1889.71..5858.47 rows=779 width=16) (actual time=156.565..237.216 rows=1641 loops=1)
                ->  Nested Loop  (cost=1889.71..4488.01 rows=763 width=12) (actual time=156.453..200.174 rows=1641 loops=1)
                      ->  Index Scan using bprofile_pkey on bprofile m  (cost=0.00..4.27 rows=1 width=4) (actual time=0.140..0.145 rows=1 loops=1)
                            Index Cond: (id = 5584)
                      ->  Hash Left Join  (cost=1889.71..4476.11 rows=763 width=16) (actual time=156.302..194.762 rows=1641 loops=1)
                            Hash Cond: (b.uid = ug.user_id)
                            ->  Hash Join  (cost=1821.55..4399.44 rows=763 width=20) (actual time=151.372..183.103 rows=1641 loops=1)
                                  Hash Cond: (c.from_id = b.id)
                                  ->  Index Scan using bprofile_comments_status_idx on bprofile_comments c  (cost=0.00..2558.77 rows=1531 width=12) (actual time=0.140..21.559 rows=1660 loops=1)
                                        Index Cond: ((profile_id = 5584) AND (approved = true))
                                        Filter: approved
                                  ->  Hash  (cost=1726.15..1726.15 rows=7632 width=8) (actual time=151.131..151.131 rows=14782 loops=1)
                                        ->  Hash Left Join  (cost=61.50..1726.15 rows=7632 width=8) (actual time=2.622..119.268 rows=14782 loops=1)
                                              Hash Cond: (b.uid = ugi.user_id)
                                              Filter: (gi.group_name IS NULL)
                                              ->  Seq Scan on bprofile b  (cost=0.00..1579.44 rows=15265 width=8) (actual time=0.058..64.033 rows=15265 loops=1)
                                                    Filter: (NOT deleted)
                                              ->  Hash  (cost=55.12..55.12 rows=510 width=13) (actual time=2.526..2.526 rows=231 loops=1)
                                                    ->  Nested Loop  (cost=0.00..55.12 rows=510 width=13) (actual time=0.136..1.909 rows=231 loops=1)
                                                          ->  Seq Scan on tg_group gi  (cost=0.00..1.07 rows=1 width=13) (actual time=0.041..0.050 rows=1 loops=1)
                                                                Filter: ((group_name)::text = 'Club16'::text)
                                                          ->  Index Scan using user_group_group_idx on user_group ugi  (cost=0.00..45.80 rows=660 width=8) (actual time=0.084..1.071 rows=231 loops=1)
                                                                Index Cond: (ugi.group_id = gi.group_id)
                            ->  Hash  (cost=55.35..55.35 rows=1025 width=4) (actual time=4.866..4.866 rows=1025 loops=1)
                                  ->  Index Scan using user_group_group_idx on user_group ug  (cost=0.00..55.35 rows=1025 width=4) (actual time=0.058..2.766 rows=1025 loops=1)
                                        Index Cond: (group_id = 2)
                ->  Index Scan using bphotos_profile_primary_idx on bphotos p  (cost=0.00..1.78 rows=1 width=4) (actual time=0.012..0.015 rows=1 loops=1641)
                      Index Cond: ((p.profile_id = b.id) AND (p.is_primary = true))
          ->  Index Scan using bphotos_profile_primary_idx on bphotos p  (cost=0.00..0.52 rows=1 width=4) (actual time=0.008..0.011 rows=1 loops=1641)
                Index Cond: ((p.profile_id = b.id) AND (p.is_primary = true))
  Total runtime: 270.808 ms
 (33 rows)


 As you 

Re: [GENERAL] Full Text Index Scanning

2011-01-29 Thread Matt Warner
Thanks Oleg. I'm going to have to experiment with this so that I understand
it better.

Matt

On Fri, Jan 28, 2011 at 1:12 PM, Oleg Bartunov o...@sai.msu.su wrote:

 Matt, I'd try to use prefix search on original string concatenated with
 reverse string:

 Just tried on some spare table

 knn=# \d spot_toulouse
      Table "public.spot_toulouse"
    Column   |       Type        | Modifiers
 ------------+-------------------+-----------
  clean_name | character varying |


 1. create index knn=# create index clean_name_tlz_idx on spot_toulouse
 using gin(to_tsvector('french', clean_name || ' ' || reverse(clean_name)));
 2.
 select clean_name from spot_toulouse where to_tsvector('french',
 clean_name|| ' ' ||  reverse(clean_name) ) @@ to_tsquery('french','the:* |
 et:*');

 Select looks cumbersome, but you can always write wrapper functions. The
 only drawback I see for now is that the ranking function will be a bit
 confused, since the coordinates of the original and reversed words will not
 be the same, but again, it's possible to obtain the tsvector from a custom
 function which is aware of the reversing.

 Good luck, and let me know if this helps you.

 Oleg
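Oleg's trick above - indexing the reversed string so that a suffix match becomes a prefix match - can be illustrated outside the database (plain Python 3, hypothetical data; in the real query the GIN index does this work):

```python
def matches_prefix_or_suffix(word, fragment):
    """Emulate tsquery 'frag:*' run against both word and reverse(word):
    a prefix match on the reversed word is a suffix match on the original."""
    word = word.lower()
    return word.startswith(fragment) or word[::-1].startswith(fragment[::-1])

names = ["toulouse", "housemartin", "lighthouse"]

# 'light' matches as an ordinary prefix...
print([n for n in names if matches_prefix_or_suffix(n, "light")])
# ...while 'house' as a *suffix* is caught via the reversed string
print([n for n in names if matches_prefix_or_suffix(n, "house")])
```

This is why a single GIN index over `clean_name || ' ' || reverse(clean_name)` can answer both ends of a word with prefix queries only.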


 On Fri, 28 Jan 2011, Matt Warner wrote:

  I'm in the process of migrating a project from Oracle to Postgres and have
 run into a feature question. I know that Postgres has a full-text search
 feature, but it does not allow scanning the index (as opposed to the
 data).
 Specifically, in Oracle you can do select * from table where
 contains(colname,'%part_of_word%') > 1. While this isn't terribly efficient,
 it's much faster than full-scanning the raw data and is relatively quick.

 It doesn't seem that Postgres works this way. Attempting to do this
 returns
 no rows: select * from table where to_tsvector(colname) @@
 to_tsquery('%part_of_word%')

 The reason I want to do this is that the partial word search does not
 involve dictionary words (it's scanning names).

 Is this something Postgres can do? Or is there a different way to scan
 the index?

 TIA,

 Matt


Regards,
Oleg
 _
 Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
 Sternberg Astronomical Institute, Moscow University, Russia
 Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
 phone: +007(495)939-16-83, +007(495)939-23-83



Re: [GENERAL] Adding more space, and a vacuum question.

2011-01-29 Thread Herouth Maoz

On 29/01/11 13:57, Craig Ringer wrote:

On 01/29/2011 05:12 AM, Herouth Maoz wrote:


The machine has no additional room for internal disks. It is a recent
purchase and not likely to be replaced any time soon.


Newly acquired or not, it sounds like it isn't sized correctly for the 
load and needs an upgrade if it can't be shifted into a more suitable 
role and replaced.

Sigh. Budget considerations, you know.



Now, my position
is that the best solution would be to add an external hard disk, via
USB/firewire


eSATA? Via a PCI or PCIe add-in SATA controller if there's no existing 
eSATA.
Oh, yes, I forgot about eSATA. I meant basically a real local connection 
rather than a network one.




FireWire is usable for a database. USB is too ... kind of. Performance 
will be poor because of the high latency, CPU-heavy non-DMA access 
done by the USB stack.


For something read-only, that might be OK.


and use it for the archive tables. My sysadmin, on the
other hand, wants to mount a storage machine remotely and use it for the
extra tablespace, as the storage machine is a more reliable hardware.


If you have iSCSI or ATA-over-Ethernet disk volumes you can mount, 
that might be a good idea. I'd personally avoid NFS or SMB.


OK.


That said, again if it's read-only you might be fine.
The question is: if the read-only tablespace gets stuck/frozen, what 
happens to the read-write part of the database, which is absolutely 
essential to have in good responsive working order?

a. Is it normal for vacuum processes to take two weeks?


Define "really huge" and describe the hardware; without numbers it's 
hard to know. What version of Pg are you using?
Pg 8.3.11. The tables have more than 200,000,000 records. About the 
hardware, I'm not entirely in the loop, but it has two dual-core Intel 
Xeon 5130 CPUs, 4G of memory, and its system disk (111G) is separate 
from the database disk (825G). The disks are hardware RAID, but I'm not 
sure which level, and I think they are 10,000 RPM but I could be wrong.


Was it a standalone VACUUM or was it an autovacuum worker?

Autovacuum worker.

TIA,
Herouth

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Full Text Index Scanning

2011-01-29 Thread Matt Warner
Reverse isn't a built-in Postgres function, so I found one and installed it.
However, attempting to use it in creating an index gets me the message
"ERROR: functions in index expression must be marked IMMUTABLE", even
though the function declaration already has the IMMUTABLE keyword.

Is there a specific version of the reverse function you're using? Or am I
just missing something obvious? This is Postgres 9, BTW.
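A sketch of the usual pre-9.1 workaround, assuming the installed function is PL/pgSQL and the actual CREATE statement merely lacks the keyword (reusing Oleg's spot_toulouse example; adjust names to your schema):

```sql
-- Declare the function IMMUTABLE so it is usable in an index expression
CREATE OR REPLACE FUNCTION reverse(text) RETURNS text AS $$
DECLARE
    result text := '';
    i int;
BEGIN
    -- Walk the string back to front, one character at a time
    FOR i IN REVERSE length($1)..1 LOOP
        result := result || substr($1, i, 1);
    END LOOP;
    RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

CREATE INDEX clean_name_tlz_idx ON spot_toulouse
    USING gin (to_tsvector('french', clean_name || ' ' || reverse(clean_name)));
```

Note that to_tsvector must be the two-argument form with an explicit configuration ('french' here); the one-argument form depends on a session setting and is not immutable either.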

Thanks,

Matt

On Sat, Jan 29, 2011 at 6:46 AM, Matt Warner m...@warnertechnology.comwrote:

 Thanks Oleg. I'm going to have to experiment with this so that I understand
 it better.

 Matt


 On Fri, Jan 28, 2011 at 1:12 PM, Oleg Bartunov o...@sai.msu.su wrote:

 Matt, I'd try to use prefix search on original string concatenated with
 reverse string:

 Just tried on some spare table

 knn=# \d spot_toulouse
      Table "public.spot_toulouse"
    Column   |       Type        | Modifiers
 ------------+-------------------+-----------
  clean_name | character varying |


 1. create index knn=# create index clean_name_tlz_idx on spot_toulouse
 using gin(to_tsvector('french', clean_name || ' ' || reverse(clean_name)));
 2.
 select clean_name from spot_toulouse where to_tsvector('french',
 clean_name|| ' ' ||  reverse(clean_name) ) @@ to_tsquery('french','the:* |
 et:*');

 Select looks cumbersome, but you can always write wrapper functions. The
 only drawback I see for now is that the ranking function will be a bit
 confused, since the coordinates of the original and reversed words will not
 be the same, but again, it's possible to obtain the tsvector from a custom
 function which is aware of the reversing.

 Good luck, and let me know if this helps you.

 Oleg


 On Fri, 28 Jan 2011, Matt Warner wrote:

  I'm in the process of migrating a project from Oracle to Postgres and
 have
 run into a feature question. I know that Postgres has a full-text search
 feature, but it does not allow scanning the index (as opposed to the
 data).
 Specifically, in Oracle you can do select * from table where
 contains(colname,'%part_of_word%') > 1. While this isn't terribly efficient,
 it's much faster than full-scanning the raw data and is relatively quick.

 It doesn't seem that Postgres works this way. Attempting to do this
 returns
 no rows: select * from table where to_tsvector(colname) @@
 to_tsquery('%part_of_word%')

 The reason I want to do this is that the partial word search does not
 involve dictionary words (it's scanning names).

 Is this something Postgres can do? Or is there a different way to scan
 the index?

 TIA,

 Matt


Regards,
Oleg
 _
 Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
 Sternberg Astronomical Institute, Moscow University, Russia
 Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
 phone: +007(495)939-16-83, +007(495)939-23-83





[GENERAL] pymssql: Problem with Unicode string

2011-01-29 Thread orgilhp
Hello!
I am using pymssql-1.9.908. And I have MSSQL2008 database server.
I ran following code from python 2.6:
---
import pymssql
conn = pymssql.connect(host='testserver', user='sa',
password='sa', database='testdb', as_dict=True)
crms = conn.cursor()
crms.execute('SELECT cc_Name FROM tblUsers')
for line in crms.fetchall():
print 'cc_Name:', self.unicodify(line['cc_Name'])
break
crms.close()
---
the result is: cc_Name: ?, ???
But the actual data is "Аймаг, хот", and I expect that the output would
be cc_Name: Аймаг, хот.

Here is my question:
How can I get Unicode data from an MSSQL database using the pymssql library?

Any suggestion would be highly appreciated!
regards,
Orgil.D
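The '?' output is the classic sign of a client/server charset mismatch: the Cyrillic data is being forced through a single-byte codepage that cannot represent it. A runnable illustration of the effect in plain Python 3 (the usual pymssql-side fix is to request a UTF-8 connection, e.g. a charset argument to connect() or a client charset entry in freetds.conf - both are assumptions to verify against your pymssql/FreeTDS versions):

```python
name = 'Аймаг, хот'

# Forcing the Unicode data through a single-byte Western codepage
# replaces every Cyrillic character with '?', exactly the output seen above
mangled = name.encode('cp1252', 'replace')
print(mangled)  # b'?????, ???'

# A UTF-8 capable connection round-trips the data intact
roundtrip = name.encode('utf-8').decode('utf-8')
print(roundtrip == name)
```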

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] pymssql Connection to the database failed for an unknown reason

2011-01-29 Thread orgilhp
Hello
I am using pymssql to connect to MSSQL2008 database. But an error
occurs:
--
 import pymssql
 conn = pymssql.connect(host='orgilhpnb\mssql2008', user='erp', 
 password='123', database='eoffice_clone')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/pymodules/python2.6/pymssql.py", line 609, in connect
    raise InterfaceError, e[0]
pymssql.InterfaceError: Connection to the database failed for an
unknown reason.
--

The host name, user, password and database name are all correct.
Why do I get an error? Please help me!

Any suggestion would be highly appreciated!

Best regards,
Orgil

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pymssql: Problem with Unicode string

2011-01-29 Thread Thom Brown
On 29 January 2011 14:52, orgilhp orgi...@gmail.com wrote:
 Hello!
 I am using pymssql-1.9.908. And I have MSSQL2008 database server.
 I ran following code from python 2.6:
 ---
        import pymssql
        conn = pymssql.connect(host='testserver', user='sa',
 password='sa', database='testdb', as_dict=True)
        crms = conn.cursor()
        crms.execute('SELECT cc_Name FROM tblUsers')
        for line in crms.fetchall():
            print 'cc_Name:', self.unicodify(line['cc_Name'])
            break
        crms.close()
 ---
 the result is: cc_Name: ?, ???
 But the actual data is "Аймаг, хот", and I expect that the output would
 be cc_Name: Аймаг, хот.

 Here is my question:
 How to get Unicode data from MSSQL database using pymssql library?

 Any suggestion would be highly appreciated!
 regards,
 Orgil.D

This is a PostgreSQL mailing list.  Please direct this question to an
MSSQL or Python list instead.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pymssql: Problem with Unicode string

2011-01-29 Thread Raymond O'Donnell

On 29/01/2011 14:52, orgilhp wrote:

Here is my question:
How to get Unicode data from MSSQL database using pymssql library?


I think you'd have a better chance of getting an answer from a list on 
MSSQL; this one is devoted to PostgreSQL.


That said, they're a helpful bunch around here, so you never know. :-)

Ray.

--
Raymond O'Donnell :: Galway :: Ireland
r...@iol.ie

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pymssql Connection to the database failed for an unknown reason

2011-01-29 Thread Thom Brown
On 29 January 2011 13:08, orgilhp orgi...@gmail.com wrote:
 Hello
 I am using pymssql to connect to MSSQL2008 database. But an error
 occurs:
 --
 import pymssql
 conn = pymssql.connect(host='orgilhpnb\mssql2008', user='erp', 
 password='123', database='eoffice_clone')
 Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/pymodules/python2.6/pymssql.py", line 609, in connect
     raise InterfaceError, e[0]
 pymssql.InterfaceError: Connection to the database failed for an
 unknown reason.
 --

 The host name, user, password and database name are all correct.
 Why do I get an error? Please help me!

 Any suggestion would be highly appreciated!

 Best regards,
 Orgil

Again, this is a PostgreSQL list.  Please use this list for
PostgreSQL-related queries only.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Full Text Index Scanning

2011-01-29 Thread Oleg Bartunov

What version of Pg are you running? Try the latest version.

Oleg

On Sat, 29 Jan 2011, Matt Warner wrote:


Reverse isn't a built-in Postgres function, so I found one and installed it.
However, attempting to use it in creating an index gets me the message
"ERROR: functions in index expression must be marked IMMUTABLE", even
though the function declaration already has the IMMUTABLE keyword.

Is there a specific version of the reverse function you're using? Or am I
just missing something obvious? This is Postgres 9, BTW.

Thanks,

Matt

On Sat, Jan 29, 2011 at 6:46 AM, Matt Warner m...@warnertechnology.comwrote:


Thanks Oleg. I'm going to have to experiment with this so that I understand
it better.

Matt


On Fri, Jan 28, 2011 at 1:12 PM, Oleg Bartunov o...@sai.msu.su wrote:


Matt, I'd try to use prefix search on original string concatenated with
reverse string:

Just tried on some spare table

knn=# \d spot_toulouse
     Table "public.spot_toulouse"
   Column   |       Type        | Modifiers
------------+-------------------+-----------
 clean_name | character varying |


1. create index knn=# create index clean_name_tlz_idx on spot_toulouse
using gin(to_tsvector('french', clean_name || ' ' || reverse(clean_name)));
2.
select clean_name from spot_toulouse where to_tsvector('french',
clean_name|| ' ' ||  reverse(clean_name) ) @@ to_tsquery('french','the:* |
et:*');

Select looks cumbersome, but you can always write wrapper functions. The
only drawback I see for now is that the ranking function will be a bit
confused, since the coordinates of the original and reversed words will not
be the same, but again, it's possible to obtain the tsvector from a custom
function which is aware of the reversing.

Good luck, and let me know if this helps you.

Oleg


On Fri, 28 Jan 2011, Matt Warner wrote:

 I'm in the process of migrating a project from Oracle to Postgres and

have
run into a feature question. I know that Postgres has a full-text search
feature, but it does not allow scanning the index (as opposed to the
data).
Specifically, in Oracle you can do select * from table where
contains(colname,'%part_of_word%') > 1. While this isn't terribly efficient,
it's much faster than full-scanning the raw data and is relatively quick.

It doesn't seem that Postgres works this way. Attempting to do this
returns
no rows: select * from table where to_tsvector(colname) @@
to_tsquery('%part_of_word%')

The reason I want to do this is that the partial word search does not
involve dictionary words (it's scanning names).

Is this something Postgres can do? Or is there a different way to scan
the index?

TIA,

Matt



   Regards,
   Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83








Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG9.0 planner difference to 8.3 - majorly bad performance

2011-01-29 Thread Uwe Schroeder


 On 29 January 2011 09:11, Uwe Schroeder u...@oss4u.com wrote:
  Maybe someone here can make sense of this.
  I'm trying to upgrade a 8.3 system to a 9.0 system. Usual procedure dump,
  restore, vac full, reindex.
  
  Both - old and new - run on the same hardware and the postgresql.conf
  settings are identical.
  
  You'll probably ask for the table definitions, which I'm happy to
  provide, but I'll omit them here for the sake of a shorter message.
  Basically everything is identical: data, tables, indexes, hardware,
  config.
  
  I should mention that the tables are really views - maybe something in
  the views changed in 9.0.2
  
  I run this query on the 8.3 system:
  
  explain analyze SELECT count(v_bprofile_comments_club16.id)  FROM
  v_bprofile_comments_club16 WHERE v_bprofile_comments_club16.profile_id =
  '5584' AND v_bprofile_comments_club16.approved = true;
                                  QUERY PLAN
  --------------------------------------------------------------------------
   Aggregate  (cost=6294.37..6294.38 rows=1 width=4) (actual time=269.633..269.635 rows=1 loops=1)
     ->  Nested Loop  (cost=1889.71..6273.06 rows=8523 width=4) (actual time=156.585..266.325 rows=1641 loops=1)
           ->  Nested Loop  (cost=1889.71..5858.47 rows=779 width=16) (actual time=156.565..237.216 rows=1641 loops=1)
                 ->  Nested Loop  (cost=1889.71..4488.01 rows=763 width=12) (actual time=156.453..200.174 rows=1641 loops=1)
                       ->  Index Scan using bprofile_pkey on bprofile m  (cost=0.00..4.27 rows=1 width=4) (actual time=0.140..0.145 rows=1 loops=1)
                             Index Cond: (id = 5584)
                       ->  Hash Left Join  (cost=1889.71..4476.11 rows=763 width=16) (actual time=156.302..194.762 rows=1641 loops=1)
                             Hash Cond: (b.uid = ug.user_id)
                             ->  Hash Join  (cost=1821.55..4399.44 rows=763 width=20) (actual time=151.372..183.103 rows=1641 loops=1)
                                   Hash Cond: (c.from_id = b.id)
                                   ->  Index Scan using bprofile_comments_status_idx on bprofile_comments c  (cost=0.00..2558.77 rows=1531 width=12) (actual time=0.140..21.559 rows=1660 loops=1)
                                         Index Cond: ((profile_id = 5584) AND (approved = true))
                                         Filter: approved
                                   ->  Hash  (cost=1726.15..1726.15 rows=7632 width=8) (actual time=151.131..151.131 rows=14782 loops=1)
                                         ->  Hash Left Join  (cost=61.50..1726.15 rows=7632 width=8) (actual time=2.622..119.268 rows=14782 loops=1)
                                               Hash Cond: (b.uid = ugi.user_id)
                                               Filter: (gi.group_name IS NULL)
                                               ->  Seq Scan on bprofile b  (cost=0.00..1579.44 rows=15265 width=8) (actual time=0.058..64.033 rows=15265 loops=1)
                                                     Filter: (NOT deleted)
                                               ->  Hash  (cost=55.12..55.12 rows=510 width=13) (actual time=2.526..2.526 rows=231 loops=1)
                                                     ->  Nested Loop  (cost=0.00..55.12 rows=510 width=13) (actual time=0.136..1.909 rows=231 loops=1)
                                                           ->  Seq Scan on tg_group gi  (cost=0.00..1.07 rows=1 width=13) (actual time=0.041..0.050 rows=1 loops=1)
                                                                 Filter: ((group_name)::text = 'Club16'::text)
                                                           ->  Index Scan using user_group_group_idx on user_group ugi  (cost=0.00..45.80 rows=660 width=8) (actual time=0.084..1.071 rows=231 loops=1)
                                                                 Index Cond: (ugi.group_id = gi.group_id)
                             ->  Hash  (cost=55.35..55.35 rows=1025 width=4) (actual time=4.866..4.866 rows=1025 loops=1)
                                   ->  Index Scan using user_group_group_idx on user_group ug  (cost=0.00..55.35 rows=1025 width=4) (actual time=0.058..2.766 rows=1025 loops=1)
                                         Index Cond: (group_id = 2)
                 ->  Index Scan using bphotos_profile_primary_idx on bphotos p  (cost=0.00..1.78 rows=1 width=4) (actual time=0.012..0.015 rows=1 loops=1641)
                       Index Cond: ((p.profile_id = b.id) AND (p.is_primary = true))
           ->  Index Scan using bphotos_profile_primary_idx on bphotos p  (cost=0.00..0.52 rows=1 width=4) (actual time=0.008..0.011 rows=1 loops=1641)
                 Index Cond: ((p.profile_id = b.id) AND (p.is_primary = true))
   Total runtime: 270.808 ms
  (33 rows)
  
  
  As you can see, the query performs nicely (for the hardware used).
  
  Now I turn off the 8.3 instance and start the 9.0 instance. Remember,
  everything is identical. Here the same query again:
  
explain analyze SELECT count(v_bprofile_comments_club16.id)  FROM
v_bprofile_comments_club16 WHERE v_bprofile_comments_club16.profile_id =
'5584' AND v_bprofile_comments_club16.approved = true;

                                QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=6278.48..6278.49 rows=1 width=4) (actual time=173253.190..173253.192 rows=1 loops=1)
   ->  Nested Loop  (cost=83.87..6278.45 rows=11 width=4) (actual time=5485.258..173248.693 rows=1851 loops=1)
         ->  Nested Loop  (cost=83.87..6275.95 rows=1 width=16) (actual time=5485.216..173213.895 rows=1851 loops=1)
               ->  Nested Loop  (cost=83.87..6269.67 rows=1

Re: [GENERAL] PG9.0 planner difference to 8.3 - majorly bad performance

2011-01-29 Thread Uwe Schroeder


 * Uwe Schroeder (u...@oss4u.com) wrote:
  Now I turn off the 8.3 instance and start the 9.0 instance. Remember, 
everything is identical. Here the same query again:
 Everything isn't identical if you just started PG 9.0 though - presumably
 the 8.3 instance had everything cached already.  What happens if you
 run this query again under 9.0..?

The 8.3 instance is also just started. I run both on the same system (for 
testing) so I turn one off to have the memory available. 
But yes, I did run the queries multiple times in a row with no major 
improvement.


 
  The duration suddenly goes from 270 milliseconds to 173 seconds! The
  index scan on bprofile_comments_status_idx suddenly shows 15288 loops,
  where it should be 1 loop just like before. So somehow the 9.0 planner
  gets it all wrong.
 
 You do have a different plan, but both of them have a Nested Loop, with
 a Hash table built inside it.  The 9.0 does also do the index scan
 inside the loop, but if you look at the actual time, that's not really
 causing a huge difference.  One thing I'm wondering about is if 9.0 is
 getting a more accurate view of the amount of data and is realizing that
 it might go over work_mem with the big Hash Left Join, and so decides
 against it.  What does your work_mem setting look like on each system?
 Have you tried increasing it?
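
Checking and raising work_mem for a single session, as suggested here, can be
done without touching postgresql.conf; a sketch using the query and values
from this thread:

```sql
-- Show the currently effective setting:
SHOW work_mem;

-- Raise it for this session only, then re-run the plan:
SET work_mem = '80MB';
EXPLAIN ANALYZE
SELECT count(v_bprofile_comments_club16.id)
FROM v_bprofile_comments_club16
WHERE v_bprofile_comments_club16.profile_id = '5584'
  AND v_bprofile_comments_club16.approved = true;
```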

What has me bummed is the index scan on 

Index Scan using bprofile_comments_status_idx on bprofile_comments c  
(cost=0.00..2558.77 rows=1531 width=12) (actual time=0.140..21.559 rows=1660 
loops=1)

vs

Index Scan using bprofile_comments_status_idx on bprofile_comments c  
(cost=0.00..4328.64 rows=1751 width=12) (actual time=0.033..8.097 rows=1872 
loops=15288)

Unless I read this wrong, the upper (8.3) index scan fetches 1660 rows in up 
to 21ms
the 9.0 plan comes up with an index scan on the same data which fetches 1872 
rows in 8 ms but loops 15288 times (that's actually the number of records in 
the referenced table), which in my book makes this scan take up to 8 x 15288 = 
122304 ms, or about 122 seconds.
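
The per-loop-times-loop-count arithmetic above can be checked directly
(numbers taken from the two plans; this is only an upper bound on the scan's
contribution, since "actual time" is a per-loop range):

```sql
-- 9.0: ~8.097 ms worst case per index scan, repeated 15288 times
-- 8.3: ~21.559 ms for a single scan
SELECT 8.097 * 15288 AS pg90_worst_case_ms,   -- roughly 123,787 ms
       21.559 * 1    AS pg83_ms;
```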

work_mem is set to 50MB and increasing it to 80MB makes no difference 



 
   Thanks,
 
   Stephen
 
  I also noticed that normally I get an iowait with a few percent during
  such operations (on 8.3), where with pg9 I get 0 iowait and 100% CPU.
  PG9 has a much smaller memory footprint than 8.3 in the same
  configuration - so this all makes very little sense to me. Maybe someone
  here has an idea.
  
  Thanks
  
  Uwe




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG9.0 planner difference to 8.3 - majorly bad performance

2011-01-29 Thread Stephen Frost
* Uwe Schroeder (u...@oss4u.com) wrote:
 Yes, the database is vacuumed and analyzed. The bad plan from 9.0 improves by 
 2 seconds when I go for a really high statistics target of 5000.

What if you go back to 10..?

Stephen
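
For reference, the statistics target being discussed can be set session-wide
or per column; a sketch (the table/column names are taken from the earlier
plans, the targets are the values mentioned in this thread):

```sql
-- Session-wide; affects ANALYZE runs issued in this session:
SET default_statistics_target = 10;   -- the old pre-8.4 default
ANALYZE;

-- Or per column, which is cheaper than raising it everywhere:
ALTER TABLE bprofile_comments ALTER COLUMN profile_id SET STATISTICS 1000;
ANALYZE bprofile_comments;
```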




Re: [GENERAL] Full Text Index Scanning

2011-01-29 Thread Matt Warner
9.0.2

On Sat, Jan 29, 2011 at 9:35 AM, Oleg Bartunov o...@sai.msu.su wrote:

 What version of Pg you run ? Try latest version.

 Oleg


 On Sat, 29 Jan 2011, Matt Warner wrote:

  Reverse isn't a built-in Postgres function, so I found one and installed
 it.
 However, attempting to use it in creating an index gets me the message
 ERROR:  functions in index expression must be marked IMMUTABLE, even
 though the function declaration already has the immutable argument.

 Is there a specific version of the reverse function you're using? Or am I
 just missing something obvious? This is Postgres 9, BTW.

 Thanks,

 Matt
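
One common cause of the IMMUTABLE error Matt hits is that many reverse()
implementations floating around are declared without a volatility marker (the
default is VOLATILE). A minimal SQL-language version that is marked IMMUTABLE
is sketched below; it assumes PostgreSQL 9.0 (string_agg with ORDER BY needs
9.0+, and 9.1 later added a built-in reverse()):

```sql
CREATE OR REPLACE FUNCTION reverse(text) RETURNS text AS $$
    -- walk the characters and glue them back together in reverse order;
    -- substr, length and string_agg are all immutable, so the function
    -- can legitimately be marked IMMUTABLE and used in an index expression
    SELECT string_agg(substr($1, i, 1), '' ORDER BY i DESC)
    FROM generate_series(1, length($1)) AS g(i);
$$ LANGUAGE sql IMMUTABLE STRICT;
```

With a definition like that in place, the expression index on
to_tsvector(... || reverse(clean_name)) should be accepted.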

 On Sat, Jan 29, 2011 at 6:46 AM, Matt Warner m...@warnertechnology.com
 wrote:

  Thanks Oleg. I'm going to have to experiment with this so that I
 understand
 it better.

 Matt


 On Fri, Jan 28, 2011 at 1:12 PM, Oleg Bartunov o...@sai.msu.su wrote:

  Matt, I'd try to use prefix search on original string concatenated with
 reverse string:

 Just tried on some spare table

 knn=# \d spot_toulouse
      Table "public.spot_toulouse"
     Column   |       Type        | Modifiers
  ------------+-------------------+-----------
   clean_name | character varying |


 1. create index:

 knn=# create index clean_name_tlz_idx on spot_toulouse
       using gin(to_tsvector('french', clean_name || ' ' || reverse(clean_name)));

 2. query:

 select clean_name from spot_toulouse
 where to_tsvector('french', clean_name || ' ' || reverse(clean_name))
       @@ to_tsquery('french', 'the:* | et:*');

 The select looks cumbersome, but you can always write wrapper functions. The
 only drawback I see for now is that the ranking function will be a bit
 confused, since the coordinates of the original and reversed words will not
 be the same; but again, it's possible to obtain the tsvector from a custom
 function which is aware of the reversing.

 Good luck, and let me know if this helps you.

 Oleg


 On Fri, 28 Jan 2011, Matt Warner wrote:

  I'm in the process of migrating a project from Oracle to Postgres and

 have
 run into a feature question. I know that Postgres has a full-text
 search
 feature, but it does not allow scanning the index (as opposed to the
 data).
 Specifically, in Oracle you can do select * from table where
 contains(colname,'%part_of_word%')1. While this isn't terribly
 efficient,
 it's much faster than full-scanning the raw data and is relatively
 quick.

 It doesn't seem that Postgres works this way. Attempting to do this
 returns
 no rows: select * from table where to_tsvector(colname) @@
 to_tsquery('%part_of_word%')

 The reason I want to do this is that the partial word search does not
 involve dictionary words (it's scanning names).

 Is this something Postgres can do? Or is there a different way to scan
 the index?

 TIA,

 Matt
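
As background to Matt's question: tsquery only supports prefix matching (the
':*' syntax), not infix patterns, which is exactly why Oleg's reverse-string
trick comes up. A small illustration (using the 'simple' config so stemming
doesn't interfere):

```sql
SELECT to_tsvector('simple', 'warner') @@ to_tsquery('simple', 'warn:*');  -- true: prefix match
-- there is no tsquery spelling for '%arne%'; a suffix search can be
-- emulated by matching a prefix of the reversed word instead.
```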


Regards,
   Oleg
 _
 Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
 Sternberg Astronomical Institute, Moscow University, Russia
 Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
 phone: +007(495)939-16-83, +007(495)939-23-83








Re: [GENERAL] PG9.0 planner difference to 8.3 - majorly bad performance

2011-01-29 Thread Uwe Schroeder

del

 The duration suddenly goes from 270 milliseconds to 173 seconds! The index
 scan on bprofile_comments_status_idx suddenly shows 15288 loops, where it
 should be 1 loop just like before. So somehow the 9.0 planner gets it all
 wrong.
 
 I also noticed that normally I get an iowait with a few percent during such
 operations (on 8.3), where with pg9 I get 0 iowait and 100% CPU. PG9 has a
 much smaller memory footprint than 8.3 in the same configuration - so this
 all makes very little sense to me. Maybe someone here has an idea.

Usually someone here has a lot more knowledge than me and comes up with a 
viable hint rather quickly. Not so this time I'm afraid.
That tells me that something deep down changed.

Here are a few observations I made:

in PG9, NOT IN queries with a subselect (i.e. select x from y where x.id not in 
(select id from some_other_table)) perform a heck of a lot better than in 
8.x. On the same note, when you rewrite the query to use a left outer join 
with an IS NULL where clause, PG9 performs horribly slowly. A query like 
select x from y left outer join z on z.id = y.id where z.id is null performs 
something like 1000 times slower than in 8.x.
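
For concreteness, the two anti-join spellings being compared (same
placeholder names as in the paragraph above):

```sql
-- NOT IN form (the one reported fast on 9.0 here):
select x from y where x.id not in (select id from some_other_table);

-- left outer join / IS NULL form (the one reported fast on 8.x):
select x from y left outer join z on z.id = y.id where z.id is null;
```

One caveat worth remembering: the two are only equivalent when the subselect
column is NOT NULL; a single NULL in some_other_table.id makes the NOT IN
form return no rows at all.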

I'm not an expert at reading explain output, but every time I try the left 
outer solution to something that's basically a NOT IN, I get lousy 
performance. Looking at the explain, a left outer join now always implies as 
many loops over a nested block as there are rows in the referenced table; 8.x 
actually returns the rows in one loop. This seems to be an issue of what is 
done in what order: 8.x puts the left outer scan pretty early in the query, 
so the remaining joins already use a limited dataset, while 9.0 puts the left 
outer join towards the end of the nested blocks, which makes it work on the 
full set.

Maybe this makes sense to someone with more in-depth knowledge of the changes.

So far, to me it looks like this: if you use 9.x, avoid left outer joins and 
use NOT IN; on pg versions prior to 9.x, avoid NOT IN and use left outer joins.

odd :-)

Uwe


