Re: [PERFORM] HFS+ pg_test_fsync performance
2014-04-15 0:32 GMT+02:00 Mel Llaguno mllag...@coverity.com:

I was given anecdotal information regarding HFS+ performance under OSX as being unsuitable for production PG deployments, and that pg_test_fsync could be used to measure the relative speed versus other operating systems (such as Linux). In my performance lab, I have a number of similarly equipped Linux hosts (Ubuntu 12.04 64-bit LTS Server w/ 128GB RAM / 2 OWC 6G Mercury Extreme SSDs / 7200rpm SATA3 HDD / 16 E5-series cores) that I used to capture baseline Linux numbers. As we generally recommend our customers use SSDs (the S3700 recommended by PG), I wanted to perform a comparison. On these beefy machines I ran the following tests:

SSD:

# pg_test_fsync -f ./fsync.out -s 30
30 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                         n/a
        fdatasync                2259.652 ops/sec     443 usecs/op
        fsync                    1949.664 ops/sec     513 usecs/op
        fsync_writethrough                    n/a
        open_sync                2245.162 ops/sec     445 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                         n/a
        fdatasync                2161.941 ops/sec     463 usecs/op
        fsync                    1891.894 ops/sec     529 usecs/op
        fsync_writethrough                    n/a
        open_sync                1118.826 ops/sec     894 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write open_sync sizes.)
         1 * 16kB open_sync write    2171.558 ops/sec     460 usecs/op
         2 *  8kB open_sync writes   1126.490 ops/sec     888 usecs/op
         4 *  4kB open_sync writes    569.594 ops/sec    1756 usecs/op
         8 *  2kB open_sync writes    285.149 ops/sec    3507 usecs/op
        16 *  1kB open_sync writes    142.528 ops/sec    7016 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different descriptor.)
        write, fsync, close      1947.557 ops/sec     513 usecs/op
        write, close, fsync      1951.082 ops/sec     513 usecs/op

Non-Sync'ed 8kB writes:
        write                  481296.909 ops/sec       2 usecs/op

HDD:

# pg_test_fsync -f /tmp/fsync.out -s 30
30 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                         n/a
        fdatasync                 105.783 ops/sec    9453 usecs/op
        fsync                      27.692 ops/sec   36111 usecs/op
        fsync_writethrough                    n/a
        open_sync                 103.399 ops/sec    9671 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                         n/a
        fdatasync                 104.647 ops/sec    9556 usecs/op
        fsync                      27.223 ops/sec   36734 usecs/op
        fsync_writethrough                    n/a
        open_sync                  55.839 ops/sec   17909 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write open_sync sizes.)
         1 * 16kB open_sync write     103.581 ops/sec    9654 usecs/op
         2 *  8kB open_sync writes     55.207 ops/sec   18113 usecs/op
         4 *  4kB open_sync writes     28.320 ops/sec   35311 usecs/op
         8 *  2kB open_sync writes     14.581 ops/sec   68582 usecs/op
        16 *  1kB open_sync writes      7.407 ops/sec  135003 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different descriptor.)
        write, fsync, close        27.228 ops/sec   36727 usecs/op
        write, close, fsync        27.108 ops/sec   36890 usecs/op

Non-Sync'ed 8kB writes:
        write                  466108.001 ops/sec       2 usecs/op

---

So far, so good. Local HDD vs. SSD shows a significant difference in fsync performance. Here are the corresponding fstab entries:

/dev/mapper/cim-base  /opt/cim  ext4  defaults,noatime,nodiratime,discard  0 2   (SSD)
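For a quick sanity check of numbers like these without installing pg_test_fsync, here is a rough sketch using GNU dd on Linux (the file name and block count are just illustrative; this is not what pg_test_fsync itself runs):

```shell
# oflag=dsync opens the output file with O_DSYNC, so each 8kB block
# waits for stable storage -- roughly comparable to the synchronous
# write methods above. dd reports elapsed time and throughput on
# stderr when it finishes.
dd if=/dev/zero of=./dsync-probe.tmp bs=8k count=100 oflag=dsync
rm -f ./dsync-probe.tmp
```

On a commodity HDD this takes on the order of a second (roughly one platter rotation per write); on an SSD it finishes almost instantly, mirroring the fdatasync gap shown above.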
Re: [PERFORM] HFS+ pg_test_fsync performance
My 2 cents: the results are not surprising. In the Linux environment, the I/O calls of pg_test_fsync use O_DIRECT (PG_O_DIRECT) together with the O_SYNC or O_DSYNC flags, so in practice it waits for the answer from the storage, bypassing the cache in sync mode. On Mac OS X it does not do so: it issues only the O_SYNC or O_DSYNC calls without O_DIRECT, so in practice it is using the filesystem cache even though it asks for synchronous I/O.

Bye
Mat Dba

Thanks for the explanation. Given that OSX always seems to use the filesystem cache, is there a way to measure fsync performance that is equivalent to Linux? Or will pg_test_fsync results always be inflated under OSX? The reason I ask is that we would like to make a case with a customer that PG performance on OSX/HFS+ would be sub-optimal compared to using Linux/EXT4 (or FreeBSD/UFS2, for that matter).

Thanks, Mel
Re: [PERFORM] [GENERAL] Approach to Data Summary and Analysis
On Mon, Apr 14, 2014 at 12:27 PM, Robert DiFalco robert.difa...@gmail.com wrote:

I have several related tables that represent a call state. Let's think of these as phone calls to simplify things. Sometimes I need to determine the last time a user was called, the last time a user answered a call, or the last time a user completed a call. The basic schema is something like this:

CREATE TABLE calls (
  id      BIGINT NOT NULL, -- sequence generator
  user_id BIGINT NOT NULL,
  called  TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id),
  FOREIGN KEY (user_id) REFERENCES my_users(id) ON DELETE CASCADE
);

CREATE TABLE calls_answered (
  id       BIGINT NOT NULL,
  answered TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id),
  FOREIGN KEY (id) REFERENCES calls(id) ON DELETE CASCADE
);

And so on for calls_connected, calls_completed, call_errors, etc.

Occasionally I will want to know things like "When was the last time a user answered a call" or "How many times has a user been called". I can do these queries using a combination of MAX or COUNT, but I'm concerned about the performance. The last answered call:

SELECT MAX(a.id)
  FROM calls_answered a JOIN calls c ON c.id = a.id
 WHERE c.user_id = ?;

Or the number of answered calls:

SELECT COUNT(a.id)
  FROM calls_answered a JOIN calls c ON c.id = a.id
 WHERE c.user_id = ?;

Sometimes I might want to get this data for a whole bunch of users. For example, give me all users who have not answered a call in the last 5 days. Or even what percentage of users called actually answered a call. This approach could become a performance issue.

So the other option is to create a call_summary table that is updated with triggers. The summary table would need fields like user_id, last_call_id, call_count, last_answered_id, answered_count, last_completed_id, last_completed_count, etc. My only issue with a summary table is that I don't want a bunch of NULL fields. For example, if the user was *called* but has never *answered* a call, then the last_call_id and call_count fields on the summary table would be non-NULL but the last_answer_id and answer_count fields WOULD be NULL. But over time all fields would eventually become non-NULL.

So that leads me to a summary table for EACH call state. Each summary table would have a user_id, a ref_id, and a count -- one summary table for each state, e.g. call_summary, call_answered_summary, etc. This approach has the downside that it creates a lot of tables and triggers. It has the upside of being pretty efficient without having to deal with NULL values. It's also pretty easy to reason about.

So for my question -- is the choice between these a personal-preference sort of thing, or is there a right or wrong approach? Am I missing another approach that would be better? I'm okay with SQL but I'm no expert, so I'm not sure if there is an accepted DESIGN PATTERN for this that I am missing.

Thanks!

My initial thought is: that design is over-normalized. The thing you are trying to model is the call, and it has several attributes, some of which may be unknown or not applicable (which is what NULL is for). So my thought would be to do something like this:

CREATE TABLE calls (
  id       BIGINT NOT NULL, -- sequence generator
  user_id  BIGINT NOT NULL,
  called   TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
  answered TIMESTAMPTZ
  PRIMARY KEY (id),
  FOREIGN KEY (user_id) REFERENCES my_users(id) ON DELETE CASCADE
);

--
I asked the Internet how to train my cat, and the Internet told me to get a dog.
Re: [PERFORM] [GENERAL] Approach to Data Summary and Analysis
On Tue, Apr 15, 2014 at 10:56 AM, Chris Curvey ch...@chriscurvey.com wrote:
On Mon, Apr 14, 2014 at 12:27 PM, Robert DiFalco robert.difa...@gmail.com wrote:
[...]
(Sorry, fat-fingered and hit send too early...)

CREATE TABLE calls (
  id        BIGINT NOT NULL, -- sequence generator
  user_id   BIGINT NOT NULL,
  called    TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
  answered  TIMESTAMPTZ NULL,
  connected TIMESTAMPTZ NULL,
  completed TIMESTAMPTZ NULL,
  PRIMARY KEY (id),
  FOREIGN KEY (user_id) REFERENCES my_users(id) ON DELETE CASCADE
);

Then your queries end up looking like this:

-- last time john answered
SELECT MAX(answered) FROM calls
 WHERE answered IS NOT NULL AND user_id = ?;

-- users that have not answered a call in the last five days
-- (I can think of a few ways to interpret that phrase)
SELECT my_users.*
  FROM my_users
 WHERE NOT EXISTS (
         SELECT * FROM calls
          WHERE calls.user_id = my_users.id
            AND answered >= now() - interval '5 days');

-- average ring time
SELECT avg(extract(epoch FROM (answered - called)))
  FROM calls
 WHERE answered IS NOT NULL;
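If the single-table design wins, the per-state lookups can stay fast with partial indexes; a hypothetical sketch (the index name is made up, not from the thread):

```sql
-- Index only the rows that were actually answered; never-answered
-- calls (often the majority) add nothing to the index.
CREATE INDEX calls_user_last_answered_idx
    ON calls (user_id, answered DESC)
    WHERE answered IS NOT NULL;

-- "Last time this user answered" then becomes a cheap index lookup:
SELECT max(answered)
  FROM calls
 WHERE user_id = ? AND answered IS NOT NULL;
```

The WHERE clause on the index must match the query's predicate for the planner to consider it.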
Re: [PERFORM] [GENERAL] Approach to Data Summary and Analysis
Actually that was exactly the initial table design. There were more fields, because for my use case there were a lot more states, and certain states have additional data (for example, when a call goes from answered to connected it also gets the user_id of the person being connected to). So that one table started getting a LOT of columns, which started making it hard to reason about.

The more normalized version has a couple of things going for it. COUNT, MIN, MAX, etc. are very fast because I don't have to conditionally add NULL checks. Everything is inserted, so for the millions of calls that get made, the normalized schema was much more efficient for writing. It was also easier to understand. The answered table only has calls that were answered, the error table only has calls that resulted in an error after being connected, etc.

I know this kind of gets into a religious area when discussing NULLs and what level of normalization is appropriate, so I don't want to spark any of that on this thread. But only doing inserts and never doing updates or deletes performed very well for large data sets.

That said, I could explore a compromise between the monolithic-table approach and the completely normalized set of tables approach. Thanks for your input!

On Tue, Apr 15, 2014 at 8:12 AM, Chris Curvey ch...@chriscurvey.com wrote:
[...]
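The trigger-maintained summary table mentioned as a compromise could be sketched like this (all names are hypothetical; note the upsert syntax shown, INSERT ... ON CONFLICT, arrived in PostgreSQL 9.5, so on the 9.3-era servers current when this thread was written it would need a classic update-then-insert loop instead):

```sql
CREATE TABLE call_summary (
    user_id      BIGINT PRIMARY KEY
                 REFERENCES my_users(id) ON DELETE CASCADE,
    call_count   BIGINT NOT NULL DEFAULT 0,
    last_call_id BIGINT
);

CREATE FUNCTION bump_call_summary() RETURNS trigger AS $$
BEGIN
    -- One upsert per new call keeps the calls table itself insert-only.
    INSERT INTO call_summary (user_id, call_count, last_call_id)
    VALUES (NEW.user_id, 1, NEW.id)
    ON CONFLICT (user_id) DO UPDATE
        SET call_count   = call_summary.call_count + 1,
            last_call_id = NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER calls_summary_trg
    AFTER INSERT ON calls
    FOR EACH ROW EXECUTE PROCEDURE bump_call_summary();
```

This preserves the insert-only write path on calls while making "how many times has this user been called" a single-row read.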
[PERFORM] Testing strategies
I have a client wanting to test PostgreSQL on ZFS running Linux. Other than pgbench, are there any other benchmarks that are easy to run? One of the possible concerns is fragmentation over time. Any ideas on how to fragment the database before running pgbench? There is also some concern about fragmentation of the WAL logs; I am looking at testing with and without the WAL logs on ZFS. Any other specific concerns?

Dave Cramer
credativ ltd (Canada)
78 Zina St
Orangeville, ON
Canada. L9W 1E8
Office: +1 (905) 766-4091
Mobile: +1 (519) 939-0336
===
Canada: http://www.credativ.ca
USA: http://www.credativ.us
Germany: http://www.credativ.de
Netherlands: http://www.credativ.nl
UK: http://www.credativ.co.uk
India: http://www.credativ.in
===
Re: [PERFORM] Tuning Postgres for Single connection use
On Mon, Apr 14, 2014 at 5:19 PM, Jeff Janes jeff.ja...@gmail.com wrote:
On Mon, Apr 14, 2014 at 2:46 PM, Nick Eubank nickeub...@gmail.com wrote:

Any rules of thumb for work_mem, maintenance_work_mem, shared_buffers, etc. for a database that DOESN'T anticipate concurrent connections and that is doing lots of aggregate functions on large tables? All the advice I can find online on tuning (this: http://wiki.postgresql.org/wiki/Performance_Optimization, this: http://media.revsys.com/talks/djangocon/2011/secrets-of-postgresql-performance.pdf, this: http://www.revsys.com/writings/postgresql-performance.html, etc.) is written for people anticipating lots of concurrent connections.

I'm a social scientist looking to use Postgres not as a database to be shared by multiple users, but rather as my own tool for manipulating a massive data set (I have 5 billion transaction records, 600GB in CSV, and want to pull out unique user pairs, estimate aggregates for individual users, etc.). This also means almost no writing, except for the creation of new tables based on selections from the main table. I'm on a Windows 8 VM with 16GB RAM, SCSI VMware HD, and 3 cores, if that's important.

I'd go with a small shared_buffers, like 128MB, and let the OS cache as much as possible. This minimizes the amount of double buffering. And set work_mem to about 6GB, then bump it up if that doesn't seem to cause problems. In the scenario you describe, it is probably no big deal if you guess too high. Monitor the process; if it starts to go nuts, just kill it and start again with a lower work_mem. If it is a single-user system, you can afford to be adventurous. If you need to build indexes, you should bump up maintenance_work_mem, but I would just do that in the local session, not system-wide.

Cheers, Jeff

Quick followup, Jeff: it seems that I can't set work_mem above about 1GB (can't get to 2GB; when I update the config, the values just don't change in SHOW ALL -- integer constraint?). Is there a workaround, or should I tweak something else accordingly?

Thanks!
Nick
(Properly bottom-posted this time?)
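Jeff's suggestions, collected as a postgresql.conf fragment for reference (the values are the examples from this thread, not universal recommendations, and the maintenance_work_mem figure is an illustrative guess; the thread goes on to find a ~2GB cap on Windows):

```
shared_buffers       = 128MB   # keep small; let the OS cache do the work
work_mem             = 6GB     # single-session, single-query use only
maintenance_work_mem = 1GB     # example value; better set per-session
                               # before index builds
```

After editing the file, the server must reread it (e.g. pg_ctl reload) before SHOW ALL reflects the change.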
Re: [PERFORM] Tuning Postgres for Single connection use
On Tue, Apr 15, 2014 at 9:12 AM, Nick Eubank nickeub...@gmail.com wrote:
[...]
Quick followup, Jeff: it seems that I can't set work_mem above about 1GB (can't get to 2GB; when I update the config, the values just don't change in SHOW ALL -- integer constraint?). Is there a workaround, or should I tweak something else accordingly?

What version are you using? What is the exact line you put in your config file? Did you get any errors when using that config file? Are you sure you actually reloaded the server, so that it reread the config file, rather than just changing the file and then not applying the change?

I usually set work_mem within a psql connection, in which case you need to quote the setting if you use units:

set work_mem = '3GB';

But if you set it system-wide in the config file, the quotes should not be needed.

Thanks!
Nick
(Properly bottom-posted this time?)

Looked good to me.

Cheers, Jeff
Re: [PERFORM] Tuning Postgres for Single connection use
From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Nick Eubank
Sent: Tuesday, April 15, 2014 11:12 AM
To: Jeff Janes
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Tuning Postgres for Single connection use

[...]

Quick followup, Jeff: it seems that I can't set work_mem above about 1gb (can't get to 2gb. When I update config, the values just don't change in SHOW ALL -- integer constraint?). Is there a work around, or should I tweak something else accordingly?

[Schnabel, Robert D.] Nick, I asked the same question a while ago about work_mem on Windows. See this thread:
http://www.postgresql.org/message-id/17895.1315869...@sss.pgh.pa.us

Bob

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Tuning Postgres for Single connection use
Jeff Janes jeff.ja...@gmail.com writes:
[...]
I usually set work_mem within a psql connection, in which case you need to quote the setting if you use units: set work_mem = '3GB';

FWIW, it's generally considered a seriously *bad* idea to set work_mem as high as 1GB in postgresql.conf: you're promising that each query running on the server can use 1GB per sort or hash step. You probably don't have the machine resources to honor that promise. (If you do, I'd like to have your IT budget ;-)) Recommended practice is to keep the global setting conservatively small, and bump it up locally in your session (with SET) for individual queries that need the very large value.

But having said that, Postgres doesn't try to enforce any such practice. My money is on what Jeff is evidently thinking: you forgot to do "pg_ctl reload", or else the setting is too large for your platform, in which case there should have been a complaint in the postmaster log. As noted elsewhere, the limit for Windows is a hair under 2GB even if it's 64-bit Windows.

regards, tom lane
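The "hair under 2GB" ceiling falls out of how the setting is stored: work_mem is an integer count of kilobytes, capped at INT_MAX/1024 on platforms (including 64-bit Windows) where the relevant C type is 32 bits wide. A sketch of the arithmetic:

```shell
# Largest work_mem accepted when kilobytes are counted in a signed
# 32-bit integer:
echo "$(( (2**31 - 1) / 1024 )) kB"   # 2097151 kB, just under 2 GiB
```

Hence values like '3GB' are rejected (or silently not applied) on such platforms regardless of installed RAM.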
Re: [PERFORM] Tuning Postgres for Single connection use
On Tuesday, April 15, 2014, Tom Lane t...@sss.pgh.pa.us wrote:
[...]

Thanks, Tom -- quick follow-up: I know that 1GB work_mem is a terrible idea for normal Postgres installations with lots of concurrent users, but in my situation, where there will only ever be one connection running one query, why is that a problem on a machine with 16GB of RAM?
Re: Robert -- thanks for that clarification!
Re: [PERFORM] Testing strategies
On Tue, Apr 15, 2014 at 12:57 PM, Dave Cramer davecra...@gmail.com wrote:
I have a client wanting to test PostgreSQL on ZFS running Linux. Other than pg_bench are there any other benchmarks that are easy to test?

Check Gregory Smith's article about testing disks [1].

[1] http://www.westnet.com/~gsmith/content/postgresql/pg-disktesting.htm

--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
[PERFORM] Workaround for working_mem max value in windows?
Hi all,

A few years ago someone said Postgres on Windows can't set work_mem above about 2GB (www.postgresql.org/message-id/17895.1315869...@sss.pgh.pa.us -- seems to be the same for maintenance_work_mem). I'm finding the limit is still present. I'm doing single-user, single-connection, data-intensive queries and would like to set a higher value on Windows to make better use of the 16GB of built-in RAM (I don't control the platform, so I can't jump to Linux). Has anyone found a workaround?

Thanks!
Nick