Re: [PERFORM] hardware and For PostgreSQL
Magnus Hagander wrote:
> Ron St-Pierre wrote:
>> Joe Uhl wrote:
>>> I realize there are people who discourage looking at Dell, but I've been
>>> very happy with a larger ball of equipment we ordered recently from
>>> them. Our database servers consist of a PowerEdge 2950 connected to a
>>> PowerVault MD1000 with a 1 meter SAS cable.
>>
>> We have a similar piece of equipment from Dell (the PowerEdge), and when
>> we had a problem with it we received excellent service from them. When
>> our raid controller went down (machine < 1 year old), Dell helped to
>> diagnose the problem and installed a new one at our hosting facility,
>> all within 24 hours.
>
> 24 hours?! I have a new one for my HP boxes onsite in 4 hours, including
> a tech if needed...
>
> But I assume Dell also has service-agreement deals you can get to get
> the level of service you'd want. (But you won't get it for a
> non-brand-name server, most likely)
>
> Bottom line - don't underestimate the service you get from the vendor
> when something breaks. Because eventually, something *will* break.
>
> //Magnus

Yeah, the response time depends on the service level purchased. I generally
go with 24 hour because everything is redundant, so a day of downtime isn't
going to bring services down (though it could make them slow depending on
what fails), but you can purchase 4 hr and in some cases even 2 hr. I had a
"gold" level support contract on a server that failed a while back, and
within 3 net hours they diagnosed and fixed the problem by getting onsite
and replacing the motherboard and a CPU. I haven't had any of our 24 hr
support level devices fail yet, so I don't have anything to compare there.

If you do go with Dell and want the higher support contracts, I'll restate
that a small business account is the way to go. Typically the prices are
better, to the point that a support-level upgrade appears free when compared
to the best shopping-cart combo I can come up with.

Joe
Re: [PERFORM] Hardware for PostgreSQL
> > You're likely better off (performance-wise) putting it on the same disk
> > as the database itself if that one has better RAID, for example.
> I'm thinking along the lines of since nothing much writes to the OS
> Disk, I should(keyword) be safe.

You are almost certainly wrong about this; think "syslog".

--
Adam Tauno Williams, Network & Systems Administrator
Consultant - http://www.whitemiceconsulting.com
Developer - http://www.opengroupware.org
[PERFORM] select max(field) from table much faster with a group by clause?
Hi,

I have a table "login" with approx 600,000 tuples, a person table with
approx 10 tuples.

When running

  select max("when") from login where userid='userid'

it takes a second or two, but when adding "group by userid" the planner
decides on using another plan, and it gets *much* faster. See example below.
Number of tuples per user varies from zero to a couple of thousands. It
seems to be slower when there are no tuples at all, but it is always slow.

This is only for max() and min(). For count(), the plan is the same; it
always uses "Aggregate".

Any ideas about this? Do we need to add "group by userid" to our code base
to optimize, or is there another way? Updating postgresql to 8.2 is a long
term option, but I'd like a short term option as well...

Regards,
Palle

pp=# select version();
                                            version
-------------------------------------------------------------------------------------------------
 PostgreSQL 8.1.8 on amd64-portbld-freebsd6.1, compiled by GCC cc (GCC) 3.4.4 [FreeBSD] 20050518
(1 row)

Time: 0,530 ms

pp=# explain analyze SELECT max("when") FROM login WHERE userid='girgen' ;
                                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=323.80..323.81 rows=1 width=0) (actual time=3478.781..3478.785 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..323.80 rows=1 width=8) (actual time=3478.768..3478.768 rows=0 loops=1)
           ->  Index Scan Backward using login_when_idx on "login"  (cost=0.00..131461.90 rows=406 width=8) (actual time=3478.759..3478.759 rows=0 loops=1)
                 Filter: (("when" IS NOT NULL) AND (userid = 'sarah.gilliam1'::text))
 Total runtime: 3478.868 ms
(6 rows)

Time: 3480,442 ms

pp=# explain analyze SELECT max("when") FROM login WHERE userid='girgen' group by userid;
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.00..648.44 rows=1 width=25) (actual time=0.191..0.191 rows=0 loops=1)
   ->  Index Scan using login_userid_idx on "login"  (cost=0.00..646.40 rows=406 width=25) (actual time=0.183..0.183 rows=0 loops=1)
         Index Cond: (userid = 'sarah.gilliam1'::text)
 Total runtime: 0.243 ms
(4 rows)

Time: 0,938 ms

pp=# \d login
              Table "public.login"
 Column |           Type           |     Modifiers
--------+--------------------------+--------------------
 userid | text                     |
 kursid | integer                  |
 when   | timestamp with time zone |
 mode   | text                     | default 'pm'::text
Indexes:
    "login_kurs_user_idx" btree (kursid, userid)
    "login_userid_idx" btree (userid)
    "login_when_idx" btree ("when")
Foreign-key constraints:
    "pp_fk1" FOREIGN KEY (userid) REFERENCES person(userid) ON UPDATE CASCADE ON DELETE CASCADE
    "pp_fk2" FOREIGN KEY (kursid) REFERENCES course(id) ON UPDATE CASCADE ON DELETE CASCADE
Re: [Fwd: Re: [PERFORM] Outer joins and Seq scans]
Sami Dalouche <[EMAIL PROTECTED]> writes:
> Compare that to the following query, that is exactly the same except
> that the City table is inner'joined instead of outer joined
> ...
> the explain analyze is available at :
> http://www.photosdesami.com/temp/exp6.txt

AFAICS it's just absolutely blind luck that that query is fast. The planner
chooses to do the contactinf7_/city8_ join first, and because that happens
to return no rows at all, all the rest of the query falls out in no time,
even managing to avoid the scan of adcreatedevent. If there were any rows
out of that join it would be a great deal slower.

There is a pretty significant semantic difference between the two queries,
too, now that I look closer: when you make
  "... join City city8_ on contactinf7_.city_id=city8_.id"
a plain join instead of left join, that means the join to contactinf7_ can
be reduced to a plain join as well, because no rows with nulls for
contactinf7_ could possibly contribute to the upper join's result. That
optimization doesn't apply in the original form of the query, which
restricts the planner's freedom to rearrange things.

			regards, tom lane
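To make the join-reduction point concrete, here is a minimal sketch with
made-up table names (not the actual schema behind exp6.txt):

  -- With both joins LEFT, ads whose contact info (or city) is missing are
  -- kept as NULL-extended rows, so the planner has to preserve the outer
  -- joins and their ordering constraints:
  SELECT a.id
  FROM ad a
  LEFT JOIN contactinfo ci ON ci.id = a.contactinfo_id
  LEFT JOIN city c ON c.id = ci.city_id;

  -- Turning the city join into a plain JOIN means ci.city_id must be
  -- non-NULL for a row to survive, so no NULL-extended contactinfo row can
  -- contribute. The planner can then treat the contactinfo join as a plain
  -- join too, and is free to reorder all three joins:
  SELECT a.id
  FROM ad a
  LEFT JOIN contactinfo ci ON ci.id = a.contactinfo_id
  JOIN city c ON c.id = ci.city_id;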
Re: [PERFORM] select max(field) from table much faster with a group by clause?
Palle Girgensohn <[EMAIL PROTECTED]> writes:
> When running
> select max("when") from login where userid='userid'
> it takes a second or two, but when adding "group by userid" the planner
> decides on using another plan, and it gets *much* faster. See example
> below.

It's only faster for cases where there are few or no rows for the
particular userid ...

> Number of tuples per user varies from zero to a couple of thousands.

The planner is using an intermediate estimate of 406 rows. You might be
well advised to increase the statistics target for login.userid --- with
luck that would help it to choose the right plan type for both common and
uncommon userids.

			regards, tom lane
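The 406-row estimate comes from the column statistics for login.userid; as
a rough illustration, those can be inspected through the standard pg_stats
view:

  -- n_distinct and the most-common-values list drive the row estimate the
  -- planner uses for a condition like userid = 'girgen':
  SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
  FROM pg_stats
  WHERE tablename = 'login' AND attname = 'userid';

A larger statistics target makes ANALYZE sample more rows and keep a longer
most-common-values list, which usually tightens the per-userid estimates.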
Re: [Fwd: Re: [PERFORM] Outer joins and Seq scans]
Thanks for your answer.

So, basically, what you are saying is that there is nothing particularly
wrong with the query, nor with its optimization? So if I need performance
for this query, I should just revert to other techniques (giving more
memory to postgres, caching outside postgres, etc.)?

Regards,
Sami Dalouche

On Thursday, November 1, 2007 at 09:29 -0400, Tom Lane wrote:
> Sami Dalouche <[EMAIL PROTECTED]> writes:
> > Compare that to the following query, that is exactly the same except
> > that the City table is inner'joined instead of outer joined
> > ...
> > the explain analyze is available at :
> > http://www.photosdesami.com/temp/exp6.txt
>
> AFAICS it's just absolutely blind luck that that query is fast. The
> planner chooses to do the contactinf7_/city8_ join first, and because
> that happens to return no rows at all, all the rest of the query falls
> out in no time, even managing to avoid the scan of adcreatedevent.
> If there were any rows out of that join it would be a great deal slower.
>
> There is a pretty significant semantic difference between the two
> queries, too, now that I look closer: when you make
> "... join City city8_ on contactinf7_.city_id=city8_.id"
> a plain join instead of left join, that means the join to contactinf7_
> can be reduced to a plain join as well, because no rows with nulls for
> contactinf7_ could possibly contribute to the upper join's result.
> That optimization doesn't apply in the original form of the query,
> which restricts the planner's freedom to rearrange things.
>
> 			regards, tom lane
Re: [PERFORM] select max(field) from table much faster with a group by clause?
--On torsdag, november 01, 2007 09.43.39 -0400 Tom Lane <[EMAIL PROTECTED]> wrote:

> Palle Girgensohn <[EMAIL PROTECTED]> writes:
>> When running
>> select max("when") from login where userid='userid'
>> it takes a second or two, but when adding "group by userid" the planner
>> decides on using another plan, and it gets *much* faster. See example
>> below.
>
> It's only faster for cases where there are few or no rows for the
> particular userid ...
>
>> Number of tuples per user varies from zero to a couple of thousands.
>
> The planner is using an intermediate estimate of 406 rows. You might be
> well advised to increase the statistics target for login.userid --- with
> luck that would help it to choose the right plan type for both common
> and uncommon userids.

Unfortunately, altering statistics doesn't help. I see no difference when
altering the value from 10 (default) to 100, 1000 or 10. :-( Are there any
other things I can modify?

OH, btw, maybe something in the postgresql.conf sucks?

max_connections = 100
shared_buffers = 3                # min 16 or max_connections*2, 8KB each
temp_buffers = 2500               # min 100, 8KB each
max_prepared_transactions = 100   # can be 0 or more
work_mem = 16384                  # min 64, size in KB
maintenance_work_mem = 16384      # min 1024, size in KB
max_stack_depth = 32768           # min 100, size in KB
max_fsm_pages = 50
max_fsm_relations = 2
max_files_per_process = 2000
fsync = off
checkpoint_segments = 50          # in logfile segments, min 1, 16MB each
effective_cache_size = 1          # typically 8KB each
random_page_cost = 1.8
geqo = on
geqo_threshold = 10
from_collapse_limit = 8
join_collapse_limit = 8           # 1 disables collapsing of explicit
Re: [PERFORM] select max(field) from table much faster with a group by clause?
Palle Girgensohn <[EMAIL PROTECTED]> writes:
> Unfortunately, altering statistics doesn't help. I see no difference when
> altering the value from 10 (default) to 100, 1000 or 10. :-(

Um, you did re-ANALYZE the table after changing the setting?

			regards, tom lane
Re: [PERFORM] select max(field) from table much faster with a group by clause?
--On torsdag, november 01, 2007 11.06.57 -0400 Tom Lane <[EMAIL PROTECTED]> wrote:

> Palle Girgensohn <[EMAIL PROTECTED]> writes:
>> Unfortunately, altering statistics doesn't help. I see no difference
>> when altering the value from 10 (default) to 100, 1000 or 10. :-(
>
> Um, you did re-ANALYZE the table after changing the setting?

alter table login alter userid SET statistics 1000;
vacuum analyze login;
Re: [PERFORM] select max(field) from table much faster with a group by clause?
On 11/1/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> Palle Girgensohn <[EMAIL PROTECTED]> writes:
> > Unfortunately, altering statistics doesn't help. I see no difference
> > when altering the value from 10 (default) to 100, 1000 or 10. :-(
>
> Um, you did re-ANALYZE the table after changing the setting?

And he changed it with

  ALTER TABLE name ALTER [ COLUMN ] column SET STATISTICS integer

right?
Re: [PERFORM] select max(field) from table much faster with a group by clause?
Palle Girgensohn <[EMAIL PROTECTED]> writes:
> --On torsdag, november 01, 2007 11.06.57 -0400 Tom Lane <[EMAIL PROTECTED]> wrote:
>> Um, you did re-ANALYZE the table after changing the setting?

> alter table login alter userid SET statistics 1000;
> vacuum analyze login;

Hm, that's the approved procedure all right. But the plans didn't change at
all? Not even the estimated number of rows?

			regards, tom lane
Re: [PERFORM] select max(field) from table much faster with a group by clause?
"Palle Girgensohn" <[EMAIL PROTECTED]> writes: > Are there any other things I can modify? You might consider an index on . Keep in mind that every new index imposes an incremental cost on every update and insert and increases the time for vacuum. > max_prepared_transactions = 100 # can be 0 or more Are you actually using prepared transactions (are you synchronising multiple databases using a transaction manager)? If not then set this to 0 as it takes some resources. > maintenance_work_mem = 16384# min 1024, size in KB Raising this might decrease vacuum times if that's a problem. > fsync = off You realize that this means if the system loses power or the kernel crashes you could have data corruption? Do you take very frequent backups or can you reconstruct your data? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support! ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Hardware for PostgreSQL
Ketema wrote:
> I am trying to build a very Robust DB server that will support 1000+
> concurrent users (all ready have seen max of 237 no pooling being
> used). I have read so many articles now that I am just saturated. I
> have a general idea but would like feedback from others.

Describe a bit better. 1,000 users or 1,000 simultaneous connections? Ie,
do you have a front-end where someone logs on, gets a connection, and keeps
it for the duration, or is it a web-type app where each request might
connect-query-disconnect? If the latter, are your connections persistent?
How many queries/second do you expect? How complex are the queries
(retrieve single record or data-mining)? Read-only or lots of updates? Do
the read-queries need to be done every time or are they candidates for
caching?

> RAM? The more the merrier right?

Generally, true. But once you reach the point that everything can fit in
RAM, more is just wasted $$$. And, according to my reading, there are cases
where more RAM can hurt - basically if you manage to create a situation
where your queries are large enough to just flush cache so you don't
benefit from caching but are hurt by spending time checking cache for the
data.

> Who has built the biggest baddest Pg server out there and what do you
> use?

Not me. Someone just showed me live system monitoring data on one of his
several PG machines. That one was clocking multi-thousand TPS on a server
(Sun??) with 128GB RAM. That much RAM makes "top" look amusing.

Several of the social-networking sites are using PG - generally spreading
load over several (dozens) of servers. They also make heavy use of pooling
and caching - think dedicated memcached servers offering a combined pool of
several TB RAM.

For pooling, pgbouncer seems to have a good reputation. Tests on my current
production server show it shaving a few ms off every
connect-query-disconnect cycle. Connects are fairly fast in PG but that
delay becomes a significant issue under heavy load.

Test pooling carefully, though. If you blindly run everything through your
pooler instead of just selected apps, you can end up with unexpected
problems when one client changes a backend setting like "set
statement_timeout to 5". If the next client assigned to that backend
connection runs a long-duration analysis query, it is likely to fail.

Cheers,
Steve
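To make that last gotcha concrete, here is roughly what it looks like from
the two clients' point of view (the table name and timeout value are only
illustrative):

  -- Client A, routed through the pooler, changes a per-backend setting and
  -- then releases the pooled connection:
  SET statement_timeout = 5000;   -- 5 seconds; sticks to that backend

  -- Client B is later handed the same backend connection and runs a
  -- long-running analysis query:
  SELECT count(*)
  FROM big_history_table h
  JOIN big_history_table h2 ON h2.parent_id = h.id;
  -- ERROR:  canceling statement due to statement timeout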
Re: [PERFORM] select max(field) from table much faster with a group by clause?
--On torsdag, november 01, 2007 09.43.39 -0400 Tom Lane <[EMAIL PROTECTED]> wrote:

> Palle Girgensohn <[EMAIL PROTECTED]> writes:
>> When running
>> select max("when") from login where userid='userid'
>> it takes a second or two, but when adding "group by userid" the planner
>> decides on using another plan, and it gets *much* faster. See example
>> below.
>
> It's only faster for cases where there are few or no rows for the
> particular userid ...

Well, no, not really. See below. OTOH, it is sometimes a bit slower. It
seems to depend on how far away from the estimated number of rows you get?
Weird?

>> Number of tuples per user varies from zero to a couple of thousands.
>
> The planner is using an intermediate estimate of 406 rows. You might be
> well advised to increase the statistics target for login.userid --- with
> luck that would help it to choose the right plan type for both common
> and uncommon userids.

I'll try that, thanks!

--

pp=# SELECT max("when") FROM login WHERE userid='kudo' group by userid;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 6,194 ms

pp=# SELECT max("when") FROM login WHERE userid='kudo' ;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 992,391 ms

pp=# SELECT max("when") FROM login WHERE userid='kudo' ;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 779,582 ms

pp=# SELECT max("when") FROM login WHERE userid='kudo' ;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 818,667 ms

pp=# SELECT max("when") FROM login WHERE userid='kudo' ;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 640,242 ms

pp=# SELECT max("when") FROM login WHERE userid='kudo' group by userid;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 18,384 ms

pp=# SELECT count(*) FROM login WHERE userid='kudo' group by userid;
 count
-------
  1998
(1 row)

Time: 12,762 ms

pp=# explain analyze SELECT max("when") FROM login WHERE userid='kudo' group by userid;
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.00..648.44 rows=1 width=25) (actual time=24.700..24.703 rows=1 loops=1)
   ->  Index Scan using login_userid_idx on "login"  (cost=0.00..646.40 rows=406 width=25) (actual time=0.140..16.931 rows=1998 loops=1)
         Index Cond: (userid = 'kudo'::text)
 Total runtime: 24.779 ms
(4 rows)

Time: 25,633 ms

pp=# explain analyze SELECT max("when") FROM login WHERE userid='kudo' ;
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=323.93..323.94 rows=1 width=0) (actual time=1400.994..1400.997 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..323.93 rows=1 width=8) (actual time=1400.975..1400.979 rows=1 loops=1)
           ->  Index Scan Backward using login_when_idx on "login"  (cost=0.00..131515.87 rows=406 width=8) (actual time=1400.968..1400.968 rows=1 loops=1)
                 Filter: (("when" IS NOT NULL) AND (userid = 'kudo'::text))
 Total runtime: 1401.057 ms
(6 rows)

Time: 1401,881 ms

pp=# SELECT userid, count("when") FROM login WHERE userid in ('girgen', 'kudo') group by userid;
 userid | count
--------+-------
 kudo   |  1998
 girgen |  1120
(2 rows)

pp=# explain analyze SELECT max("when") FROM login WHERE userid='girgen' group by userid;
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.00..648.44 rows=1 width=25) (actual time=25.137..25.141 rows=1 loops=1)
   ->  Index Scan using login_userid_idx on "login"  (cost=0.00..646.40 rows=406 width=25) (actual time=0.121..20.712 rows=1120 loops=1)
         Index Cond: (userid = 'girgen'::text)
 Total runtime: 25.209 ms
(4 rows)

Time: 25,986 ms

pp=# explain analyze SELECT max("when") FROM login WHERE userid='girgen' ;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=323.93..323.94 rows=1 width=0) (actual time=6.695..6.698 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..323.93 rows=1 width=8) (actual time=6.669..6.675 rows=1 loops=1)
           ->  Index Scan Backward using login_when_idx on "login"  (cost=0.00..131515.87 rows=406 width=8) (actual time=6.660..6.660 rows=1 loops=1)
                 Filter: (("when" IS NOT NULL) AND (userid = 'girgen'::text))
 Total runtime: 6.785 ms
(6 rows)
Re: [PERFORM] Hardware for PostgreSQL
Magnus Hagander wrote:
> Ow Mun Heng wrote:
>>> You're likely better off (performance-wise) putting it on the same disk
>>> as the database itself if that one has better RAID, for example.
>> I'm thinking along the lines of since nothing much writes to the OS
>> Disk, I should(keyword) be safe.
>
> Unless it's *always* in the cache (not so likely), reads will also move
> the heads...

And if you aren't mounted noatime, reads will also cause a write.

Cheers,
Steve
[PERFORM] How to avoid hashjoin and mergejoin
I am comparing the same query on two different PG 8.2 servers, one Linux
(8GB RAM) and one Windows (32GB RAM). Both have similar drives and CPUs.

The Windows postgresql.conf is pretty well tuned, but it looks like someone
had wiped out the Linux config so the default one was re-installed. All
performance-related memory allocation values seem to be set to the
defaults, but mods have been made: max_connections = 100 and shared_buffers
= 32MB.

The performance for this query is terrible on the Linux server, and good on
the Windows server - presumably because the original Linux PG config has
been lost. This query requires that we "set enable_seqscan to 'off';"

Still, the Linux server did not create the same, fast plan as the Windows
server. In order to get the same plan we had to:

set enable_hashjoin to 'off';
set enable_mergejoin to 'off';

The plans were now similar, using nested loops and bitmapped heap scans.
Now the Linux query outperformed the Windows query.

Question: Can anyone tell me which config values would have made PG select
hash joins and merge joins when the nested loop/bitmap heap scan
combination was faster?

Carlo
Re: [PERFORM] How to avoid hashjoin and mergejoin
On 11/1/07, Carlo Stonebanks <[EMAIL PROTECTED]> wrote:
> I am comparing the same query on two different PG 8.2 servers, one Linux
> (8GB RAM) and one Windows (32GB RAM). Both have similar drives and CPUs.
>
> The Windows postgresql.conf is pretty well tuned but it looks like
> someone had wiped out the Linux config so the default one was
> re-installed. All performance-related memory allocation values seem to be
> set to the defaults, but mods have been made: max_connections = 100 and
> shared_buffers = 32MB.
>
> The performance for this query is terrible on the Linux server, and good
> on the Windows server - presumably because the original Linux PG config
> has been lost. This query requires that "set enable_seqscan to 'off';"

Have you run analyze on the server yet?

A few general points on performance tuning. With 8.2 you should set
shared_buffers to a pretty big chunk of memory on linux, up to 25% or so.
That means 32 Meg shared buffers is REAL low for a linux server. Try
running anywhere from 512Meg up to 1Gig for starters and see if that helps
too. Also turn up work_mem to something like 16 to 32 meg, then restart the
server after making these changes.

Then give us the explain analyze output with all the enable_xxx set to ON.

summary: analyze, increase shared_buffers and work_mem, give us explain
analyze.
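A rough way to check and try those numbers from psql (the values below are
just the starting points suggested above, not measured recommendations):

  -- What is the Linux server actually running with right now?
  SHOW shared_buffers;
  SHOW work_mem;

  -- work_mem can be raised for the current session to test its effect:
  SET work_mem = '32MB';

  -- shared_buffers can only be changed in postgresql.conf (e.g. something
  -- like shared_buffers = 512MB on the 8GB box) followed by a restart.

  -- And refresh planner statistics after the config was re-installed:
  ANALYZE;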
Re: [PERFORM] How to avoid hashjoin and mergejoin
"Carlo Stonebanks" <[EMAIL PROTECTED]> writes: > Still, the Linux server did not create the same, fast plan as the Windows > server. In order to get the same plan we had to: > set enable_hashjoin to 'off'; > set enable_mergejoin to 'off'; This is just about never the appropriate way to solve a performance problem, as it will inevitably create performance problems in other queries. What I'm wondering is whether the tables have been ANALYZEd recently, and also whether there are any nondefault postgresql.conf settings in use on the other server. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Hardware for PostgreSQL
On Thu, 2007-11-01 at 11:16 -0700, Steve Crawford wrote:
> Magnus Hagander wrote:
>> Ow Mun Heng wrote:
>>>> You're likely better off (performance-wise) putting it on the same
>>>> disk as the database itself if that one has better RAID, for example.
>>> I'm thinking along the lines of since nothing much writes to the OS
>>> Disk, I should(keyword) be safe.
>>
>> Unless it's *always* in the cache (not so likely), reads will also move
>> the heads...
>
> And if you aren't mounted noatime, reads will also cause a write.

/dev/VolGroup00/LogVol01  /             ext3  defaults,noatime        1 1
/dev/md0                  /raid1_array  ext3  noatime,data=writeback  1 1

Yep..yep..
[PERFORM] hardware for PostgreSQL
Hello,

I am new to setting up PostgreSQL machines for our operational environments
and would appreciate it if someone can take a look at this setup; throw
tomatoes if it looks too bad.

We're expecting an initial load of about 5 million text meta-data records
to our database, and are expecting upwards of 50 million records by 2008.
We are expecting 40 "connect-query-disconnect" clients every 5 minutes or
so, and are looking at 15 connections/sec on our front facing components.

We've designed a set of Dell systems which we are planning to stick into
our Slony/PgPool-II hybrid cluster, taking over our current random
hodgepodge of machines we used when first experimenting. Each of these
systems will be identical. Speed is important but we are putting more
weight on the storage aspects. Below is our model system:

Dell PowerEdge Energy 2950
(2) Quad Core Intel Xeon L5320, 2x4MB Cache, 1.86GHz, 1066MHz FSB
4GB 667MHz Dual Ranked DIMMs, Energy Smart
PERC 5/i, x8 Backplane, Integrated Controller Card

Hard Drive Configuration: Integrated SAS/SATA RAID 1/RAID 5

Hard Drive 1 (For Operating System): 36GB 10K RPM SAS 3Gbps 2.5-in Hot Plug HD
Hard Drive 2 (For logs): 36GB 10K RPM SAS 3Gbps 2.5-in Hot Plug HD
Hard Drives 3,4,5,6 (In a RAID 5 Configuration): (4) 146GB 10K SAS 3Gbps Hard Drive, 2.5-in, Hot Plug

Network Adapter: Dual Embedded Broadcom NetXtreme II 5708 Gigabit Ethernet NIC

It's overkill for our initial system, but we are shooting for a system that
allows for growth. If someone can let us know if we're on the right path or
are shooting ourselves in the foot with this setup, I'd appreciate it.

Thanks,
- Mark
Re: [PERFORM] hardware for PostgreSQL
On 11/1/07, Mark Floyd <[EMAIL PROTECTED]> wrote:
> Hello,
> Dell PowerEdge Energy 2950
> (2) Quad Core Intel Xeon L5320, 2x4MB Cache, 1.86GHz, 1066MHz FSB
> 4GB 667MHz Dual Ranked DIMMs, Energy Smart
>
> PERC 5/i, x8 Backplane, Integrated Controller Card
>
> Hard Drive Configuration: Integrated SAS/SATA RAID 1/RAID 5
>
> Hard Drive 1 (For Operating System): 36GB 10K RPM SAS 3Gbps 2.5-in
> Hot Plug HD
> Hard Drive 2 (For logs): 36GB 10K RPM SAS 3Gbps 2.5-in Hot Plug HD
>
> Hard Drives 3,4,5,6 (In a RAID 5 Configuration): (4) 146GB 10K SAS
> 3Gbps Hard Drive, 2.5-in, Hot Plug

If you can fit 8 drives in it, for the love of god add two more and mirror
your OS and xlog drives (I assume that's what you mean by drive 2 for
logs). Running a server on non-redundant drives is not the best way to do
things.

And if you can live on ~300 Gigs of storage instead of 450 Gigs, look into
RAID-10 for your data array. RAID-10 is noticeably faster than RAID-5 for
any database that sees a fair bit of writing activity.

> It's overkill for our initial system but we are shooting for a system
> that allows for growth. If someone can let us know if we're on the
> right path or are shooting ourselves in the foot with this setup I'd
> appreciate it.

Other than the 8 cores, it's not really overkill. And depending on your
usage patterns, 8 cores may well not be overkill either.
Re: [PERFORM] How to avoid hashjoin and mergejoin
<< This is just about never the appropriate way to solve a performance
problem, as it will inevitably create performance problems in other
queries. >>

In this particular example, this was done to "force" the query on the Linux
box to use the same plan as on the Windows box, to prove that - once the
correct plan was chosen - the Linux box could at least MATCH the Windows
box.

That being said, I should mention this: we take certain "core" queries that
we know are essential and embed them in plpgsql SRFs that save the various
settings, modify them as required for the query, then restore them after
the rows are returned (see the sketch at the end of this message). Does
this address the problem you mentioned?

<< What I'm wondering is whether the tables have been ANALYZEd recently >>

This is SUPPOSED to be done after a restore - but I will verify, thanks for
the reminder.

<< and also whether there are any nondefault postgresql.conf settings in
use on the other server. >>

Definitely - this is what alerted me to the fact that there was something
suspicious. We try to optimize our memory settings (based on various tuning
docs, advice from here, and good old trial-and-error). Since the new config
had barely any changes, I knew something was wrong.

Carlo

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]]
Sent: November 1, 2007 5:42 PM
To: Carlo Stonebanks
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to avoid hashjoin and mergejoin

"Carlo Stonebanks" <[EMAIL PROTECTED]> writes:
> Still, the Linux server did not create the same, fast plan as the Windows
> server. In order to get the same plan we had to:
> set enable_hashjoin to 'off';
> set enable_mergejoin to 'off';

This is just about never the appropriate way to solve a performance
problem, as it will inevitably create performance problems in other
queries. What I'm wondering is whether the tables have been ANALYZEd
recently, and also whether there are any nondefault postgresql.conf
settings in use on the other server.

			regards, tom lane
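For illustration, a wrapper of that general shape might look like this -
the function name, the core_data table and the forced settings are made up
(this is not the actual production code), and it uses transaction-local
set_config() rather than an explicit save/modify/restore of each setting:

  CREATE OR REPLACE FUNCTION core_report(p_group_id integer)
  RETURNS SETOF integer AS $$
  DECLARE
      r integer;
  BEGIN
      -- is_local = true: the overrides revert automatically when the
      -- surrounding transaction ends, so callers are not left with the
      -- planner settings changed.
      PERFORM set_config('enable_hashjoin',  'off', true);
      PERFORM set_config('enable_mergejoin', 'off', true);

      -- the "core" query goes here
      FOR r IN SELECT d.id FROM core_data d WHERE d.group_id = p_group_id LOOP
          RETURN NEXT r;
      END LOOP;

      RETURN;
  END;
  $$ LANGUAGE plpgsql;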
Re: [PERFORM] select max(field) from table much faster with a group by clause?
On Thu, Nov 01, 2007 at 02:07:55PM +0100, Palle Girgensohn wrote:
> I have a table "login" with approx 600,000 tuples, a person table with
> approx 10 tuples.
> When running
> select max("when") from login where userid='userid'
> it takes a second or two, but when adding "group by userid" the planner
> decides on using another plan, and it gets *much* faster. See example
> below.
> pp=# explain analyze SELECT max("when") FROM login WHERE userid='girgen' ;

just do:

create index q on login (userid, "when");

and you should be fine. if it will not help, rewrite the query as:

select "when"
from login
where userid = 'girgen'
order by userid desc, "when" desc
limit 1;

depesz

--
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA. here's my CV!" :)
http://www.depesz.com/ - a blog for you (and my CV)