[PERFORM] How many views is ok?
I have a database of company data, and part of it is a table of information about employees. I need each employee to have access only to his own row. PostgreSQL cannot do this through its privilege system alone, because privileges can only be granted on whole tables.

One possibility is to create a view for each employee that selects only his data, and grant the employee privileges on that view. But I am not sure whether such a number of views has performance drawbacks, or even whether PostgreSQL can support it (I expect it can). I would need several tables protected like this, and it could result in, say, 1000 views at maximum.

Because access to the DB will go through a PHP information system, another possibility to protect the data is to let the IS connect with more privileges than the user really has, but have it retrieve only data for that user.

The view approach seems far clearer than this, but I'm not sure whether PostgreSQL can handle it without problems. Thanks for any reply :-)

---
Petr Kavan
Database Development
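For concreteness, the per-employee-view idea being described would look something like the sketch below; employees, login, and jsmith are hypothetical names, not from the original message:

    -- one view per employee, restricted to that employee's row
    CREATE VIEW emp_jsmith AS
        SELECT * FROM employees WHERE login = 'jsmith';

    -- lock down the base table, then grant access via the view only
    REVOKE ALL ON employees FROM jsmith;
    GRANT SELECT ON emp_jsmith TO jsmith;

Repeating this for every employee and every protected table is what leads to the ~1000 views in question.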
[PERFORM] Performance pb vs SQLServer.
Hi,

I have a performance issue:

I run PG (8.0.3) and SQLServer2000 on a Windows2000 Server (P4 1.5GHz, 512MB). I have a table (3,200,000 rows) and I run this single query:

    select cod from mytable group by cod

I have an index on cod (char(4) - 88 different values).

PG = ~20 sec.
SQLServer = < 8 sec

The explain is:

    HashAggregate (cost=64410.09..64410.09 rows=55 width=8)
      -> Seq Scan on mytable (cost=0.00..56325.27 rows=3233927 width=8)

If I switch to "enable_hashagg = false" (just for a try...) the planner will choose my index:

    Group (cost=0.00..76514.01 rows=55 width=8)
      -> Index Scan using myindex on mytable (cost=0.00..68429.20
         rows=3233927 width=8)

but performance is comparable to the previous test. So with or without using the index I get the same result.

Thanks for help.

Stéphane COEZ
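A methodological aside: plain EXPLAIN only shows the planner's cost estimates. EXPLAIN ANALYZE actually runs the query and reports real row counts and timings, which makes comparisons like the one above more meaningful:

    EXPLAIN ANALYZE SELECT cod FROM mytable GROUP BY cod;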
Re: [Fwd: [PERFORM] PG8 Tuning]
Hi Paul,

I was passed your message regarding DSS workload with Postgres on Solaris (I am not on the alias). Performance is relative to your workload. Can you actually send us what you are doing in your queries, updates, etc.? I have been running a few tests myself, and here are my rules of thumb; your mileage may vary:

http://blogs.sun.com/roller/page/jkshah?entry=tuning_postgresql_8_0_2

* Increasing checkpoint_segments certainly helps. (I went as far as increasing the WAL logfile size from 16MB to 256MB, which requires recompiling, and then using a correspondingly lower number of checkpoint segments; frequent logfile rotations also decrease performance.)

* Moving pg_xlog to a different file system and mounting that file system with "forcedirectio" also helps a lot (this increases throughput by another 2x to 5x or more). This can be done either by adding forcedirectio to the mount options in /etc/vfstab, or for existing mounts as follows:

    mount -o remount,forcedirectio /filesystem

  (Note: the database files should not use forcedirectio, otherwise the file system cache will not be used for them.)

* I actually reduced the PG buffer pool to 1GB or less, since performance seemed to decrease as I increased the buffer pool size (depending on your workload).

* If you are using SPARC, the following /etc/system settings will help:

    set segmap_percent=60
    set ufs:freebehind=0

  This allocates 60% of RAM to the file system buffer (for the database files) and caches all files regardless of size (PostgreSQL data files are 1GB segments by default). This will help your repeat queries significantly.

Other things depend on what queries you are running. If you send me a few samples, I can send you the appropriate DTrace scripts (Solaris 10 or higher) to figure out what's happening.

Regards,
Jignesh

Jignesh K. Shah
MTS Software Engineer
Sun Microsystems, Inc., MDE-Horizontal Technologies
Email: [EMAIL PROTECTED]
Phone: (781) 442 3052
http://blogs.sun.com/jkshah

----- Original Message -----
From: Paul Johnson <[EMAIL PROTECTED]>
Date: Thu, 11 Aug 2005 13:23:21 +0100 (BST)
To: pgsql-performance@postgresql.org
Subject: [PERFORM] PG8 Tuning

Hi all, we're running PG8 on a Sun V250 with 8GB RAM and 2*1.3GHz SPARC CPUs running Solaris 10. The DB cluster is on an external fibre-attached Sun T3 array that has 9*36GB drives configured as a single RAID5 LUN.

The system is for the sole use of a couple of data warehouse developers, hence we are keen to use 'aggressive' tuning options to maximise performance. So far we have made the following changes and measured the impact on our test suite:

1) Increase checkpoint_segments from 3 to 64. This made a 10x improvement in some cases.

2) Increase work_mem from 1,024 to 524,288.

3) Increase shared_buffers from 1,000 to 262,143 (2 GB). This required setting SHMMAX=4294967295 (4 GB) in /etc/system and re-booting the box. Question - can Postgres only use 2GB RAM, given that shared_buffers can only be set as high as 262,143 (8K pages)?

So far so good...

4) Move pg_xlog to an internal disk within the V250. This has had a severe *negative* impact on performance. The copy job has gone from 2 mins to 12 mins, and a simple SQL job from 1 min to 7 mins; we have not even run the long SQL jobs. I'm guessing that this is because pg_xlog has gone from a 9-spindle LUN to a single-spindle disk?

In cases such as this, where an external storage array with a hardware RAID controller is used, the normal advice to separate the data from the pg_xlog seems to come unstuck. Or are we missing something?

Cheers,

Paul Johnson.
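For reference, Paul's three parameter changes as they would appear in postgresql.conf (8.0 semantics; the SHMMAX change is made separately in /etc/system):

    checkpoint_segments = 64    # up from the default of 3
    work_mem = 524288           # in KB under 8.0, i.e. 512MB per sort/hash step
    shared_buffers = 262143     # in 8KB pages, i.e. just under 2GB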
Re: [PERFORM] How many views is ok?
Petr Kavan wrote:

> I have a database of company data, and part of it is a table of
> information about employees. I need each employee to have access only
> to his own row. PostgreSQL cannot do this through its privilege system
> alone, because privileges can only be granted on whole tables.
>
> One possibility is to create a view for each employee that selects only
> his data, and grant the employee privileges on that view. But I am not
> sure whether such a number of views has performance drawbacks, or even
> whether PostgreSQL can support it (I expect it can). I would need
> several tables protected like this, and it could result in, say, 1000
> views at maximum.
>
> Because access to the DB will go through a PHP information system,
> another possibility to protect the data is to let the IS connect with
> more privileges than the user really has, but have it retrieve only
> data for that user.
>
> The view approach seems far clearer than this, but I'm not sure whether
> PostgreSQL can handle it without problems.

We do a similar thing tying users to per-row permissions. We have one view per table, and it works fine. I would recommend that you do something similar.

Basically, just make the view:

    CREATE VIEW just_me AS
        SELECT * FROM user_table WHERE username = session_user;
    REVOKE ALL ON user_table FROM PUBLIC;
    GRANT SELECT ON just_me TO PUBLIC;

A view accesses its underlying tables with the privileges of the view's owner (much as a SECURITY DEFINER function runs as its creator), so create the view as a privileged user. The REVOKE prevents users from querying user_table directly. 'session_user' makes the view filter on the name of the actually connected user.

This should allow a user to see only their own row in the database (whichever rows have a username matching the connected name).

Now, this only works if the PHP front end connects specifically as the given user (our system is set up to do this). If you don't do it this way, you'll always be stuck with the IS layer doing the restriction. Even if you create a view per user, if your PHP layer has the right to look at other tables/views, it doesn't really help.

Good luck,
John
=:->
Re: [PERFORM] Performance pb vs SQLServer.
Stéphane COEZ wrote:

> Hi,
>
> I have a performance issue:
>
> I run PG (8.0.3) and SQLServer2000 on a Windows2000 Server (P4 1.5GHz,
> 512MB). I have a table (3,200,000 rows) and I run this single query:
>
>     select cod from mytable group by cod
>
> I have an index on cod (char(4) - 88 different values).
>
> PG = ~20 sec.
> SQLServer = < 8 sec
>
> The explain is:
>
>     HashAggregate (cost=64410.09..64410.09 rows=55 width=8)
>       -> Seq Scan on mytable (cost=0.00..56325.27 rows=3233927 width=8)
>
> If I switch to "enable_hashagg = false" (just for a try...) the planner
> will choose my index:
>
>     Group (cost=0.00..76514.01 rows=55 width=8)
>       -> Index Scan using myindex on mytable (cost=0.00..68429.20
>          rows=3233927 width=8)
>
> but performance is comparable to the previous test. So with or without
> using the index I get the same result.

My guess is that this is part of a larger query. There isn't really much you can do. If you want all 3.2M rows, then you have to wait for them to be pulled in.

What you generally can do for performance is to restructure things so that you *don't* have to touch all 3.2M rows. If you are just trying to determine what the unique entries are for cod, you are probably better off doing some normalization and keeping a separate table of cod values.

I'm guessing the reason your query is faster with SQLServer is because of how Postgres handles MVCC. Basically, it still has to fetch the main page to determine if a row exists, while SQL Server doesn't do MVCC, so it can just look things up in the index.

You might also try a different query, something like:

    SELECT DISTINCT cod FROM mytable ORDER BY cod;

(You may or may not want the ORDER BY; try it both ways.) It might be possible for the planner to realize that all you want is unique rows; just doing a GROUP BY doesn't tell it that.

John
=:->
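A minimal sketch of the normalization John suggests, reusing the mytable/cod names from the original post (how cod_values is kept current -- a trigger or application logic -- is left open):

    -- a small lookup table holding each distinct cod once
    CREATE TABLE cod_values (cod char(4) PRIMARY KEY);

    -- seed it from the existing data
    INSERT INTO cod_values SELECT DISTINCT cod FROM mytable;

    -- the 20-second query then becomes a scan of ~88 rows
    SELECT cod FROM cod_values;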
Re: [PERFORM] How many views is ok?
"Petr Kavan" <[EMAIL PROTECTED]> writes: > Possibility is to create a view for each employee that chooses only his data > and give employee privileges to this view. But I am not sure if such number > of views does not have some performance drawbacks or even if postgre can > support it (I expect i can). Do you really need more than one view? I'd consider something like create view emp_view as select * from emp where name = current_user; This requires that your Postgres usernames match up with something in the underlying table, of course. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Performance pb vs SQLServer.
On Sun, Aug 14, 2005 at 07:27:38PM -0500, John Arbash Meinel wrote:
> My guess is that this is part of a larger query. There isn't really much
> you can do. If you want all 3.2M rows, then you have to wait for them to
> be pulled in.

To me, it looks like he'll get 88 rows, not 3.2M. Surely we must be able to do something better than a full sequential scan in this case?

    test=# create table foo ( bar char(4) );
    CREATE TABLE
    test=# insert into foo values ('aaaa');
    INSERT 24773320 1
    test=# insert into foo values ('aaaa');
    INSERT 24773321 1
    test=# insert into foo values ('bbbb');
    INSERT 24773322 1
    test=# select * from foo group by bar;
     bar
    ------
     aaaa
     bbbb
    (2 rows)

I considered doing some odd magic with generate_series() and subqueries with LIMIT 1, but it was a bit too weird in the end :-)

/* Steinar */
--
Homepage: http://www.sesse.net/
Re: [PERFORM] Performance pb vs SQLServer.
On Sun, Aug 14, 2005 at 07:27:38PM -0500, John Arbash Meinel wrote:
> If you are just trying to determine what the unique entries are for cod,
> you are probably better off doing some normalization and keeping a
> separate table of cod values.

Pah, I missed this part of the e-mail -- you can ignore most of my (other) reply, then :-)

/* Steinar */
--
Homepage: http://www.sesse.net/
Re: [PERFORM] Performance pb vs SQLServer.
Steinar H. Gunderson wrote:

> To me, it looks like he'll get 88 rows, not 3.2M. Surely we must be able
> to do something better than a full sequential scan in this case?
>
>     test=# create table foo ( bar char(4) );
>     CREATE TABLE
>     test=# insert into foo values ('aaaa');
>     test=# insert into foo values ('aaaa');
>     test=# insert into foo values ('bbbb');
>     test=# select * from foo group by bar;
>      bar
>     ------
>      aaaa
>      bbbb
>     (2 rows)
>
> I considered doing some odd magic with generate_series() and subqueries
> with LIMIT 1, but it was a bit too weird in the end :-)

I think a plain GROUP BY is not smart enough to detect that it doesn't need all the rows (since it is generally used because you want aggregate values of other columns). I think you would want something like SELECT DISTINCT, possibly with an ORDER BY, rather than a GROUP BY (which was my final suggestion).

John
=:->
Re: [PERFORM] Performance pb vs SQLServer.
"Steinar H. Gunderson" <[EMAIL PROTECTED]> writes: > To me, it looks like he'll get 88 rows, not 3.2M. Surely we must be able to > do something better than a full sequential scan in this case? Not really. There's been some speculation about implementing index "skip search" --- once you've verified there's at least one visible row of a given index value, tell the index to skip to the next different value instead of handing back any of the remaining entries of the current value. But it'd be a lot of work and AFAICS not useful for very many kinds of queries besides this. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Performance pb vs SQLServer.
On Sun, Aug 14, 2005 at 09:18:45PM -0400, Tom Lane wrote:
> Not really. There's been some speculation about implementing index
> "skip search" --- once you've verified there's at least one visible
> row of a given index value, tell the index to skip to the next different
> value instead of handing back any of the remaining entries of the
> current value. But it'd be a lot of work and AFAICS not useful for
> very many kinds of queries besides this.

This is probably a completely wrong way of handling it all, but could it be done in a PL/PgSQL function like this? (Pseudo-code, sort of; I'm not very well versed in the actual syntax, but I'd guess you get the idea.)

    x = ( SELECT foo FROM table ORDER BY foo LIMIT 1 );
    WHILE x IS NOT NULL
        RETURN NEXT x;
        x = ( SELECT foo FROM table WHERE foo > x ORDER BY foo LIMIT 1 );
    END;

(Replace with max() and min() for 8.1, of course.)

/* Steinar */ -- fond of horrible hacks :-)
--
Homepage: http://www.sesse.net/
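For anyone wanting to try it, here is one runnable rendering of Steinar's sketch as a set-returning PL/pgSQL function, reusing the mytable/cod names from the original post (an untested sketch, not from the thread itself):

    CREATE OR REPLACE FUNCTION distinct_cod() RETURNS SETOF text AS $$
    DECLARE
        x text;
    BEGIN
        -- start at the smallest value, then repeatedly jump to the next
        -- distinct value; each probe is a small index scan rather than
        -- a pass over all 3.2M rows
        SELECT INTO x cod FROM mytable ORDER BY cod LIMIT 1;
        WHILE x IS NOT NULL LOOP
            RETURN NEXT x;
            SELECT INTO x cod FROM mytable WHERE cod > x ORDER BY cod LIMIT 1;
        END LOOP;
        RETURN;
    END;
    $$ LANGUAGE plpgsql;

    -- usage:
    -- SELECT * FROM distinct_cod();

With 88 distinct values this does on the order of 89 index probes, which is exactly the "skip search" behavior Tom describes, emulated by hand.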
Re: [PERFORM] Performance pb vs SQLServer.
One little thing: did you shut down SQL2000 while testing PostgreSQL? Remember that PostgreSQL relies on the system cache, while SQL2000 takes a large part of memory as its own buffer, which is then unavailable to the operating system. I suspect the results would be much the same, but it would be a fairer test.

> I'm guessing the reason your query is faster with SQLServer is because
> of how postgres handles MVCC. Basically, it still has to fetch the main
> page to determine if a row exists. While SQL server doesn't do MVCC, so
> it can just look things up in the index.

Another thing [almost offtopic]: I would like to add something to illustrate what MVCC (multiversion concurrency control) means and what its consequences are. Just do this. Open two psql sessions:

Session 1:
    begin;
    update any_table set any_column = 'value_a' where other_column = 'value_b';
    -- do not commit

Session 2:
    select * from any_table where other_column = 'value_b';

Watch the result.

Session 1:
    commit;

Session 2:
    select * from any_table where other_column = 'value_b';

Watch the result.

Now open two sessions in Query Analyzer and do the same thing:

Session 1:
    begin tran
    update any_table set any_column = 'value_a' where other_column = 'value_b'
    -- do not commit

Session 2:
    select * from any_table where other_column = 'value_b'

Wait for the result. Wait... wait... (Oh, a lock! OK, when you get tired, go back to session 1.)

Session 1:
    commit

Session 2:
    Now watch the result.

Which one was faster?

["very, very offtopic"] OK, this comparison is just as useless as the other one, because it compares oranges with apples (it's fun anyway). I just chose an example that shows the best of PostgreSQL against the 'not so nice' behavior of MSSQL2000 (no service pack; it's my desktop system -- I'll repeat the test later with SP4 and different isolation levels and check the results). Furthermore, MSSQL2000 is 5 years old now. Does anybody still have the same cellular phone or computer? (I don't want to know :-) ). The big question is 'What do you need?'. No system can give you everything; that's just marketing noise. Sorry for my English and the noise. [End of offtopic]

Long life, little spam and prosperity.
Re: [PERFORM] How many views is ok?
Hey, that trick with session_user is great! :-) Thank you all very much, this will certainly help.

---
Petr Kavan
Database Development

----- Original Message -----
From: "John Arbash Meinel" <[EMAIL PROTECTED]>
To: "Petr Kavan" <[EMAIL PROTECTED]>
Cc:
Sent: Monday, August 15, 2005 2:22 AM
Subject: Re: [PERFORM] How many views is ok?