Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Tom Lane wrote: PFC [EMAIL PROTECTED] writes: What version of PostgreSQL are you using? I think newbies should be pushed a bit to use the latest versions, How about pushed *hard* ? I'm constantly amazed at the number of people who show up in the lists saying they installed 7.3.2 or whatever random version they found in a dusty archive somewhere. Please upgrade is at least one order of magnitude more valuable configuration advice than anything else we could tell them. (picking up an old thread while at a boring wait at the airport.. anyway) I keep trying to think of more nad better ways to do this :-) Perhaps we should put some text on the bug reporting form (and in the documentation about bug reporting) that's basically don't bother reporting a bug unless you're on the latest in a branch, and at least make sure you're on one of the maojr releases listed on www.postgresql.org? Seems reasonable? //Magnus ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Magnus, don't bother reporting a bug unless you're on the latest in a branch, and at least make sure you're on one of the maojr releases listed on www.postgresql.org? Seems reasonable? absolutely. Should be standard practice. Harald -- GHUM Harald Massa persuadere et programmare Harald Armin Massa Spielberger Straße 49 70435 Stuttgart 0173/9409607 fx 01212-5-13695179 - EuroPython 2007 will take place in Vilnius, Lithuania from Monday 9th July to Wednesday 11th July. See you there!
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Greg, I'd like to see people have a really simple set of questions to get them past the completely undersized initial configuration phase, then ship them toward resources to help educate about the parts that could be problematic for them based on what they do or don't know. I don't see an inconsistancy that I'd expect people to have a reasonable guess for max_connections, while also telling them that setting sort_mem is important, a middle value has been assigned, but a really correct setting isn't something they can expect the simple config tool to figure out for them; here's a pointer to the appropriate documentation to learn more. I disagree that this is acceptable, especially when we could set a better value using an easy-to-understand question. It's also been my experience (in 3 years of professional performance tuning) that most users *don't* have an accurate guess for max_connections. I'm really not clear on why you think what flavor of application do you have? is a difficult question. It's certainly one that my clients were able to answer easily. Overall, it seems like you're shooting for a conf tool which only really works for web apps, which isn't my personal goal or I think a good use of our time. I'm still of the opinion that recommendations for settings like max_fsm_pages and maintenance_work_mem should come out of a different type of tool that connects to the database. Well, there's several steps to this: 1) Run conf tool when installing PG; 2) Run conf tool++ after application is first up and running; 3) Run conf tool++ after application has been in production The (1) tool should at least provide a configuration which isn't going to lead to long term issues. For example, dramatically underallocating fsm_pages can result in having to run VACUUM FULL and the associated downtime, so it's something we want to avoid at the outset. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Greg, We've hashed through this area before, but for Lance's benefit I'll reiterate my dissenting position on this subject. If you're building a tool for dummies, my opinion is that you shouldn't ask any of this information. I think there's an enormous benefit to providing something that takes basic sizing information and gives conservative guidelines based on that--as you say, safe, middle-of-the-road values--that are still way, way more useful than the default values. The risk in trying to make a complicated tool that satisfies all the users Josh is aiming his more sophisticated effort at is that you'll lose the newbies. The problem is that there are no safe, middle-of-the-road values for some things, particularly max_connections and work_mem. Particularly, there are very different conf profiles between reporting applications and OLTP/Web. We're talking about order-of-magnitude differences here, not just a few points. e.g.: Web app, typical machine: max_connections = 200 work_mem = 256kb default_statistics_target=100 autovacuum=on Reporting app, same machine: max_connections = 20 work_mem = 32mb default_statistics_target=500 autovacuum=off Possibly we could make the language of the application type selection less technical, but I don't see it as dispensible even for a basic tool. I wouldn't even bother asking how many CPUs somebody has for what Lance is building. The kind of optimizations you'd do based on that are just too complicated to expect a tool to get them right and still be accessible to a novice. CPUs affects the various cpu_cost parameters, but I can but the idea that this should only be part of the advanced tool. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, 26 Jun 2007, Josh Berkus wrote: The problem is that there are no safe, middle-of-the-road values for some things, particularly max_connections and work_mem. Your max_connections concern is one fact that haunts the idea of just giving out some sample configs for people. Lance's tool asks outright the expectation for max_connections which I think is the right thing to do. Web app, typical machine: work_mem = 256kb default_statistics_target=100 autovacuum=on Reporting app, same machine: work_mem = 32mb default_statistics_target=500 autovacuum=off I think people are stuck with actually learning a bit about work_mem whether they like it or not, because it's important to make it larger but we know going too high will be a problem with lots of connections doing sorts. As for turning autovacuum on/off and the stats target, I'd expect useful defaults for those would come out of how the current sample is asking about read vs. write workloads and expected database size. Those simple to understand questions might capture enough of the difference between your two types here. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Greg, Your max_connections concern is one fact that haunts the idea of just giving out some sample configs for people. Lance's tool asks outright the expectation for max_connections which I think is the right thing to do. ... I think people are stuck with actually learning a bit about work_mem whether they like it or not, because it's important to make it larger but we know going too high will be a problem with lots of connections doing sorts. I find it extremely inconsistent that you want to select middle-of-the-road defaults for some values and ask users detailed questions for other values. Which are we trying to do, here? Given an application type selection, which is a question which can be written in easy-to-understand terms, these values can be set at reasonable defaults. In fact, for most performance tuning clients I had, we never actually looped back and tested the defaults by monitoring pg_temp, memstat and the log; performance was acceptable with the approximate values. As for turning autovacuum on/off and the stats target, I'd expect useful defaults for those would come out of how the current sample is asking about read vs. write workloads and expected database size. Those simple to understand questions might capture enough of the difference between your two types here. Both of the questions you cite above are unlikely to result in accurate answers from users, and the read vs. write answer is actually quite useless except for the extreme cases (e.g. read-only or mostly-write). The deciding answer in turning autovacuum off is whether or not the user does large bulk loads / ETL operations, which autovac would interfere with. The fact that we can't expect an accurate answer on database size (except from the minority of users who already have a full production DB) will be a chronic stumbling block for any conf tool we build. Quite a number of settings want to know this figure: max_fsm_pages, maintenance_work_mem, max_freeze_age, etc. Question is, would order-of-magnitude answers be likely to have better results? i.e.: How large is your database expected to grow? [] Less than 100MB / thousands of rows [] 100mb to 1gb / tens to hundreds of thousands of rows [] 1 gb to 10 gb / millions of rows [] 10 gb to 100 gb / tens to hundreds of millions of rows [] 100 gb to 1 TB / billions of rows [] more than 1 TB / many billions of rows ... users might have better guesses within those rough ranges, and it would give us enough data to set rough values. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, 26 Jun 2007, Josh Berkus wrote: I find it extremely inconsistent that you want to select middle-of-the-road defaults for some values and ask users detailed questions for other values. Which are we trying to do, here? I'd like to see people have a really simple set of questions to get them past the completely undersized initial configuration phase, then ship them toward resources to help educate about the parts that could be problematic for them based on what they do or don't know. I don't see an inconsistancy that I'd expect people to have a reasonable guess for max_connections, while also telling them that setting sort_mem is important, a middle value has been assigned, but a really correct setting isn't something they can expect the simple config tool to figure out for them; here's a pointer to the appropriate documentation to learn more. The fact that we can't expect an accurate answer on database size (except from the minority of users who already have a full production DB) will be a chronic stumbling block for any conf tool we build. I'm still of the opinion that recommendations for settings like max_fsm_pages and maintenance_work_mem should come out of a different type of tool that connects to the database. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies - feedback adjustable control
[EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] On Fri, 22 Jun 2007, Sabin Coanda wrote: Instead of (or in addition to) configure dozens of settings, what do you say about a feedback adjustable control based on the existing system statistics and parsing logs (e.g http://pgfouine.projects.postgresql.org/index.html ) ? something like this would be useful for advanced tuneing, but the biggest problem is that it's so difficult to fingoure out a starting point. bad choices at the starting point can cause several orders of magnatude difference in the database performsnce. In addition we know that the current defaults are bad for just about everyone (we just can't decide what better defaults would be) You are right. But an automatic tool beeing able to take decisions by different inputs, would be able to set a startup configuration too, based on the hw/sw environment, and interactive user requirements. this horrible starting point gives people a bad first impression that a simple tool like what's being discussed can go a long way towards solving. Well, I think to an automatic tool, not an utopian application good for everything. For instance the existing automatic daemon have some abilities, bat not all of the VACUUM command. I'm realistic that good things may be done in steps, not once. I would be super happy if an available automatic configuration tool would be able to set for the beginning just the shared_buffers or max_fsm_pages based on the available memory. Adjustments can be done later. Regards, Sabin ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Fri, 22 Jun 2007, Adam Tauno Williams wrote: Just as an aside; how come the installation/setup Tutorial section - http://www.postgresql.org/docs/8.2/interactive/tutorial-start.html - doesn't mention setting some rough reasonable defaults in postgresql.conf or even a reference to the parameter documentation section. I think that anyone who has been working with the software long to know what should go into such a section has kind of forgotten about this part of the documentation by the time they get there. It is an oversight and yours is an excellent suggestion. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies - feedback adjustable control
On Fri, 22 Jun 2007, Sabin Coanda wrote: Instead of (or in addition to) configure dozens of settings, what do you say about a feedback adjustable control based on the existing system statistics and parsing logs Take a look at the archive of this list for the end of April/Early May. There's a thread there named Feature Request --- was: PostgreSQL Performance Tuning that addressed this subject in length I think you'll find interesting reading. I personally feel there's much more long-term potential for a tool that inspects the database, but the needs of something looking for getting good starting configuration file (before there necessarily is even a populated database) is different enough that it may justify building two different tools. I would suggest you or anything else building the starter configuration tool not stray from the path of getting the most important variables set to reasonable values. Trying to satisfy every possible user is the path that leads to a design so complicated that it's unlikely you'll ever get a finished build done at all. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Thu, 21 Jun 2007, Scott Marlowe wrote: Generally I agree, however, how about a first switch, for beginner / intermediate / advanced. You're describing a perfectly reasonable approach for a second generation tool in this area. I think it would be very helpful for the user community to get a first generation one that works fairly well before getting distracted at all by things like this. The people capable of filling out the intermediate/advanced settings can probably just do a bit of reading and figure out most of what they should be doing themselves. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Fri, 2007-06-22 at 02:32 -0400, Greg Smith wrote: On Thu, 21 Jun 2007, Scott Marlowe wrote: Generally I agree, however, how about a first switch, for beginner / intermediate / advanced. You're describing a perfectly reasonable approach for a second generation tool in this area. I think it would be very helpful for the user community to get a first generation one that works fairly well before getting distracted at all by things like this. The people capable of filling out the intermediate/advanced settings can probably just do a bit of reading and figure out most of what they should be doing themselves. Just as an aside; how come the installation/setup Tutorial section - http://www.postgresql.org/docs/8.2/interactive/tutorial-start.html - doesn't mention setting some rough reasonable defaults in postgresql.conf or even a reference to the parameter documentation section. It seems like such a reference should exist between - http://www.postgresql.org/docs/8.2/interactive/tutorial-arch.html - and - http://www.postgresql.org/docs/8.2/interactive/tutorial-accessdb.html At least something along those lines should be said at http://www.postgresql.org/docs/8.2/interactive/install-post.html Personally, as DBA for more than a decade, I've got 0 sympathy for people who setup a database but can't be bothered to read the documentation. But in the case of PostgreSQL the documentation could do a better job of driving users to even the existence [and importance of] postgresql.conf and routine maintenance techniques. http://www.postgresql.org/docs/8.2/interactive/runtime-config.html http://www.postgresql.org/docs/8.2/interactive/maintenance.html Seems to me that even a remake of something like - http://www.iiug.org/~waiug/old/forum2000/SQLTunning/sld001.htm - focused on PostgreSQL would be novel and very interesting. Just my two cents. PostgreSQL is awesome, BTW. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies - feedback adjustable control
Campbell, Lance [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Below is a link to the HTML JavaScript configuration page I am creating: http://www.webservices.uiuc.edu/postgresql/ I had many suggestions. Based on the feedback I received, I put together the initial list of questions. This list of questions can be change. Instead of (or in addition to) configure dozens of settings, what do you say about a feedback adjustable control based on the existing system statistics and parsing logs (e.g http://pgfouine.projects.postgresql.org/index.html ) ? Such an application improved with notifications would be useful for experimented users, too. A database is not static and it may evolve to different requirements. The initial configuration may be deprecated after one year. Regards, Sabin
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies - feedback adjustable control
On Fri, 22 Jun 2007, Sabin Coanda wrote: Instead of (or in addition to) configure dozens of settings, what do you say about a feedback adjustable control based on the existing system statistics and parsing logs (e.g http://pgfouine.projects.postgresql.org/index.html ) ? something like this would be useful for advanced tuneing, but the biggest problem is that it's so difficult to fingoure out a starting point. bad choices at the starting point can cause several orders of magnatude difference in the database performsnce. In addition we know that the current defaults are bad for just about everyone (we just can't decide what better defaults would be) this horrible starting point gives people a bad first impression that a simple tool like what's being discussed can go a long way towards solving. David Lang ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance wrote: Now I am at the difficult part, what parameters to calculate and how to calculate them. Everything below has to do with PostgreSQL version 8.2: The parameters I would think we should calculate are: max_connections shared_buffers work_mem maintenance_work_mem effective_cache_size random_page_cost Any other variables? I am open to suggestions. we also should scale max_fsm_pages according to the database size and workload answers - I also note that the configuration file it generates seems to look like on for PostgreSQL 7.x or something - I think we should just include the specific parameters to change. Stefan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Now I am at the difficult part, what parameters to calculate and how to calculate them. Everything below has to do with PostgreSQL version 8.2: The parameters I would think we should calculate are: max_connections shared_buffers work_mem maintenance_work_mem effective_cache_size random_page_cost Any other variables? I am open to suggestions. I know this is mainly about tuning for performance but I do think you ought to give the option to change at least 'listen_address'. Something like: Accept connections on: - Local connections (Unix sockets/localhost) - All TCP/IP interfaces - Specific IP addresses: ___ (comma-seperated list) and maybe a pointer to the pg_hba.conf docs for further info. Regards, Ben ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Greg Smith wrote: On Tue, 19 Jun 2007, Josh Berkus wrote: I don't think the mostly reads / mostly writes question covers anything, nor is it likely to produce accurate answers. Instead, we need to ask the users to characterize what type of application they are running: T1) Please characterize the general type of workload you will be running on this database. Choose one of the following four... We've hashed through this area before, but for Lance's benefit I'll reiterate my dissenting position on this subject. If you're building a tool for dummies, my opinion is that you shouldn't ask any of this information. I think there's an enormous benefit to providing something that takes basic sizing information and gives conservative guidelines based on that--as you say, safe, middle-of-the-road values--that are still way, way more useful than the default values. The risk in trying to make a complicated tool that satisfies all the users Josh is aiming his more sophisticated effort at is that you'll lose the newbies. Generally I agree, however, how about a first switch, for beginner / intermediate / advanced. The choice you make determines how much detail we ask you about your setup. Beginners get two or three simple questions, intermediate a handful, and advanced gets grilled on everything. Then, just write the beginner and maybe intermediate to begin with and ghost out the advanced until it's ready. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, 19 Jun 2007, Josh Berkus wrote: I don't think the mostly reads / mostly writes question covers anything, nor is it likely to produce accurate answers. Instead, we need to ask the users to characterize what type of application they are running: T1) Please characterize the general type of workload you will be running on this database. Choose one of the following four... We've hashed through this area before, but for Lance's benefit I'll reiterate my dissenting position on this subject. If you're building a tool for dummies, my opinion is that you shouldn't ask any of this information. I think there's an enormous benefit to providing something that takes basic sizing information and gives conservative guidelines based on that--as you say, safe, middle-of-the-road values--that are still way, way more useful than the default values. The risk in trying to make a complicated tool that satisfies all the users Josh is aiming his more sophisticated effort at is that you'll lose the newbies. Scan the archives of this mailing list for a bit. If you look at what people discover they've being nailed by, it's rarely because they need to optimize something like random_page_cost. It's usually because they have a brutally wrong value for one of the memory or vacuum parameters that are very easy to provide reasonable suggestions for without needing a lot of information about the server. I wouldn't even bother asking how many CPUs somebody has for what Lance is building. The kind of optimizations you'd do based on that are just too complicated to expect a tool to get them right and still be accessible to a novice. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
At 4:35p -0400 on 19 Jun 2007, Lance Campbell wrote: The parameters I would think we should calculate are: max_connections shared_buffers work_mem maintenance_work_mem effective_cache_size random_page_cost From an educational/newb standpoint, I notice that the page currently spews out a configuration file completely in line with what's currently there, comments and all. May I suggest highlighting what has been altered, perhaps above or below the textbox? It would make it immediately obvious, and easier to add an explanation of the thought process involved. Something like What's changed from the default: li pstrongmax_connections = 5/strong/p pThis follows directly from you put above. It is the maximum number of concurrent connections Postgres will allow./p /li li pstrongshared_buffers = 1/strong/p pThis setting will take some time to get exactly right for your needs. Postgres uses this for .../p /li Not something that necessarily needs to be spelled out in the .conf file, but would, IMVHO, help minimally educate. Kevin ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance wrote: Please think simple. I stress the word simple. The real challenge here is getting the formulas correct. Someone mentioned to not focus on the values but just get something out there for everyone to help tweak. I agree! What questions do you think should be asked in order to figure out what values should go into the formulas for the configuration suggestions? My thoughts: What version of PostgreSQL are you using? OK, obviously not needed if embedded in the manuals. How many connections will be made to PostgreSQL? OK (but changed order) How much memory will be available to PostgreSQL? Would structure it like: - What is total memory of your machine? - How much do you want to reserve for other apps (e.g. apache/java)? Also: - How many disks will PG be using? - How much data do you think you'll store? - Will your usage be: mostly reads|balance of read+write|mostly writes - Are your searches: all very simple|few complex|lots of complex queries Then, with the output provide a commentary stating reasons why for the chosen values. e.g. random_page_cost = 1.0 Because you have [effective_cache_size = 1GB] and [total db size = 0.5GB] the cost of fetching a page is the same no matter what order you fetch them in. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance writes: 3) I suggested JavaScript because most people that get started with PostgreSQL will go to the web in order to find out about issues relating Why not c? It could then go into contrib. Anyways.. language is likely the least important issue.. As someone mentioned.. once the formulas are worked out it can be done in a few languages.. as people desire.. How much memory will be available to PostgreSQL? How many connections will be made to PostgreSQL? Will this be a dedicated Postgresql server? Will there be mostly reads or will there also be significant amount of writes? Are you on a RAID system or do you have several disks over which you would like to run postgresql on? ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Francisco and Richard, Why ask about disk or raid? How would that impact any settings in postgresql.conf? I did forget the obvious question: What OS are you using? Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu -Original Message- From: Francisco Reyes [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 19, 2007 11:58 AM To: Campbell, Lance Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] PostgreSQL Configuration Tool for Dummies Campbell, Lance writes: 3) I suggested JavaScript because most people that get started with PostgreSQL will go to the web in order to find out about issues relating Why not c? It could then go into contrib. Anyways.. language is likely the least important issue.. As someone mentioned.. once the formulas are worked out it can be done in a few languages.. as people desire.. How much memory will be available to PostgreSQL? How many connections will be made to PostgreSQL? Will this be a dedicated Postgresql server? Will there be mostly reads or will there also be significant amount of writes? Are you on a RAID system or do you have several disks over which you would like to run postgresql on? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance wrote: Francisco and Richard, Why ask about disk or raid? How would that impact any settings in postgresql.conf? Well, random_page_cost will depend on how fast your disk system can locate a non-sequential page. If you have a 16-disk RAID-10 array that's noticably less time than a single 5400rpm IDE in a laptop. I did forget the obvious question: What OS are you using? Tricky to keep simple, isn't it :-) -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
What version of PostgreSQL are you using? I think newbies should be pushed a bit to use the latest versions, maybe with some advice on how to setup the apt sources (in debian/ubuntu) to get them. How much memory will be available to PostgreSQL? How many connections will be made to PostgreSQL? I also think Postgres newbies using PHP should be encouraged to use something like ligttpd/fastcgi instead of Apache. The fastcgi model permits use of very few database connections and working PHP processes since lighttpd handles all the slow transfers to the client asynchronously. You can do the same with two Apache instances, one serving static pages and acting as a proxy for the second Apache serving dynamic pages. With this setup, even low-end server setups (For our personal sites, a friend and I share a dedicated server with 256MB of RAM, which we rent for 20€ a month). This thing will never run 200 Apache processes, but we have no problem with lighttpd/php-fcgi and postgres. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance writes: Francisco and Richard, Why ask about disk or raid? How would that impact any settings in postgresql.conf? If the user has 2 disks and says that he will do a lot of updates he could put pg_xlog in the second disk. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On 6/19/07, Francisco Reyes [EMAIL PROTECTED] wrote: Campbell, Lance writes: Francisco and Richard, Why ask about disk or raid? How would that impact any settings in postgresql.conf? If the user has 2 disks and says that he will do a lot of updates he could put pg_xlog in the second disk. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster Let's not ask about disk or raid at this level of sanity tuning. It is important for a newbie to take the right first step. When it comes to disks, we start talking I/O, SATA, SCSI and the varying degrees of SATA and SCSI, and controller cards. Then we throw in RAID and the different levels therein. Add to that, we can talk about drivers controlling these drives and which OS is faster, more stable, etc. As you can see, a newbie would get drowned. So, please keep it simple. I know many people on this list are Gurus. We know you are the best in this field, but we are not and are just trying to improve what we have. -- Yudhvir Singh Sidhu 408 375 3134 cell
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Yudhvir, I completely agree. I was just putting together a similar email. Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Y Sidhu Sent: Tuesday, June 19, 2007 12:49 PM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] PostgreSQL Configuration Tool for Dummies On 6/19/07, Francisco Reyes [EMAIL PROTECTED] wrote: Campbell, Lance writes: Francisco and Richard, Why ask about disk or raid? How would that impact any settings in postgresql.conf? If the user has 2 disks and says that he will do a lot of updates he could put pg_xlog in the second disk. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster Let's not ask about disk or raid at this level of sanity tuning. It is important for a newbie to take the right first step. When it comes to disks, we start talking I/O, SATA, SCSI and the varying degrees of SATA and SCSI, and controller cards. Then we throw in RAID and the different levels therein. Add to that, we can talk about drivers controlling these drives and which OS is faster, more stable, etc. As you can see, a newbie would get drowned. So, please keep it simple. I know many people on this list are Gurus. We know you are the best in this field, but we are not and are just trying to improve what we have. -- Yudhvir Singh Sidhu 408 375 3134 cell
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, 19 Jun 2007 12:58:26 -0400 Francisco Reyes [EMAIL PROTECTED] wrote: Campbell, Lance writes: 3) I suggested JavaScript because most people that get started with PostgreSQL will go to the web in order to find out about issues relating Why not c? Why not whatever and install it on www.PostgreSQL.org? Is there any reason that this tool would need to be run on every installation. Run it on the site and it can always be up to date and can be written in whatever language is easiest to maintain on the mother system. I would also like to make a pitch for a JavaScript-free tool. Just collect all the pertinent information, work it out and display the results in a second page. Some people just don't like JavaScript and turn it off even if we can run it in our browser. -- D'Arcy J.M. Cain [EMAIL PROTECTED] | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
D'Arcy, I wanted to put it on the www.postgresql.org site. That is what I said in my original email. I don't believe anyone from the actual project has contacted me. I am setting up a JavaScript version first. If someone wants to do a different one feel free. I will have all of the calculations in the JavaScript so it should be easy to do it in any language. Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of D'Arcy J.M. Cain Sent: Tuesday, June 19, 2007 12:32 PM To: Francisco Reyes Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] PostgreSQL Configuration Tool for Dummies On Tue, 19 Jun 2007 12:58:26 -0400 Francisco Reyes [EMAIL PROTECTED] wrote: Campbell, Lance writes: 3) I suggested JavaScript because most people that get started with PostgreSQL will go to the web in order to find out about issues relating Why not c? Why not whatever and install it on www.PostgreSQL.org? Is there any reason that this tool would need to be run on every installation. Run it on the site and it can always be up to date and can be written in whatever language is easiest to maintain on the mother system. I would also like to make a pitch for a JavaScript-free tool. Just collect all the pertinent information, work it out and display the results in a second page. Some people just don't like JavaScript and turn it off even if we can run it in our browser. -- D'Arcy J.M. Cain [EMAIL PROTECTED] | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On 6/19/07, Campbell, Lance [EMAIL PROTECTED] wrote: Below is a link to the HTML JavaScript configuration page I am creating: http://www.webservices.uiuc.edu/postgresql/ I had many suggestions. Based on the feedback I received, I put together the initial list of questions. This list of questions can be change. Memory There are many different ways to ask about memory. Rather than ask a series of questions I went with a single question, #2. If it is better to ask about the memory in a series of questions then please give me the questions you would ask and why you would ask each of them. From my understanding the primary memory issue as it relates to PostgreSQL is how much memory is available to PostgreSQL. Remember that this needs to be as simple as possible. My next step is to list the PostgreSQL parameters found in the postgresql.conf file and how I will generate their values based on the questions I have so far. I will primarily focus on PostgreSQL 8.2.x. Once I have a consensus from everyone then I will put functionality behind the Generate Suggested Settings button. Thanks for all of the feedback, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu Lance, Simply awesome! -- Yudhvir Singh Sidhu 408 375 3134 cell
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance writes: For the 6) Are your searches: How about having many simple ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, 19 Jun 2007, Y Sidhu wrote: On 6/19/07, Francisco Reyes [EMAIL PROTECTED] wrote: Campbell, Lance writes: Francisco and Richard, Why ask about disk or raid? How would that impact any settings in postgresql.conf? If the user has 2 disks and says that he will do a lot of updates he could put pg_xlog in the second disk. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster Let's not ask about disk or raid at this level of sanity tuning. It is important for a newbie to take the right first step. When it comes to disks, we start talking I/O, SATA, SCSI and the varying degrees of SATA and SCSI, and controller cards. Then we throw in RAID and the different levels therein. Add to that, we can talk about drivers controlling these drives and which OS is faster, more stable, etc. As you can see, a newbie would get drowned. So, please keep it simple. I know many people on this list are Gurus. We know you are the best in this field, but we are not and are just trying to improve what we have. I strongly agree. besides, the number and types of drives, raid configurations, etc is so variable that I strongly believe that the right answer is going to be something along the lines of 'run this tool and then enter the number(s) that the tool reports' and then let the tool measure the end result of all the variables rather then trying to calculate the results. David Lang ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, 19 Jun 2007, Campbell, Lance wrote: Memory There are many different ways to ask about memory. Rather than ask a series of questions I went with a single question, #2. If it is better to ask about the memory in a series of questions then please give me the questions you would ask and why you would ask each of them. From my understanding the primary memory issue as it relates to PostgreSQL is how much memory is available to PostgreSQL. Remember that this needs to be as simple as possible. there are three catagories of memory useage 1. needed by other software 2. available for postgres 3. needed by the OS it's not clear if what you are asking is #2 or a combination of #2 and #3 IMHO you should ask for #2 and #3, possibly along the lines of how much memory is in the machine that isn't already used by other applications David Lang ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
there are three catagories of memory useage 1. needed by other software 2. available for postgres 3. needed by the OS There's actually only two required memory questions: M1) How much RAM do you have on this machine? M2) Is this: () Dedicated PostgreSQL Server? () Server shared with a few other applications? () Desktop? I don't think the mostly reads / mostly writes question covers anything, nor is it likely to produce accurate answers. Instead, we need to ask the users to characterize what type of application they are running: T1) Please characterize the general type of workload you will be running on this database. Choose one of the following four: () WEB: any scripting-language application which mainly needs to support 90% or more data reads, and many rapid-fire small queries over a large number of connections. Examples: forums, content management systems, directories. () OLTP: this application involves a large number of INSERTs, UPDATEs and DELETEs because most users are modifying data instead of just reading it. Examples: accounting, ERP, logging tools, messaging engines. () Data Warehousing: also called decision support and BI, these database support a fairly small number of large, complicated reporting queries, very large tables, and large batch data loads. () Mixed/Other: if your application doesn't fit any of the above, our script will try to pick safe, middle-of-the-road values. Hmmm, drop question (6) too. (2) should read: What is the maximum number of database connections which you'll need to support? If you don't know, we'll pick a default. Other questions we need: How many/how fast processors do you have? Pick the option which seems closest to what you have: () A single laptop processor () Single or dual older processors (1ghz) () Dual or quad current consumer processors (2ghz+) () Large, recent multi-core server system What OS Are You Using, of course, needs to have Linux, Solaris, BSD, OSX and Windows. At some point, this tool will also need to generate for the user any shmem settings that they need to make on the OS. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, Jun 19, 2007 at 10:49:01AM -0700, Y Sidhu wrote: On 6/19/07, Francisco Reyes [EMAIL PROTECTED] wrote: Campbell, Lance writes: Francisco and Richard, Why ask about disk or raid? How would that impact any settings in postgresql.conf? If the user has 2 disks and says that he will do a lot of updates he could put pg_xlog in the second disk. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster Let's not ask about disk or raid at this level of sanity tuning. It is important for a newbie to take the right first step. When it comes to disks, we start talking I/O, SATA, SCSI and the varying degrees of SATA and SCSI, and controller cards. Then we throw in RAID and the different levels therein. Add to that, we can talk about drivers controlling these drives and which OS is faster, more stable, etc. As you can see, a newbie would get drowned. So, please keep it simple. I know many people on this list are Gurus. We know you are the best in this field, but we are not and are just trying to improve what we have. Ignoring the i/o subsystem in db configuration, there's an idea. You could request some bonnie++ output (easy to aquire) as a baseline, do your magic analysis based on this, and skip it if it is not provided with a warning. Course the magic may be harder to come by. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
PFC [EMAIL PROTECTED] writes: What version of PostgreSQL are you using? I think newbies should be pushed a bit to use the latest versions, How about pushed *hard* ? I'm constantly amazed at the number of people who show up in the lists saying they installed 7.3.2 or whatever random version they found in a dusty archive somewhere. Please upgrade is at least one order of magnitude more valuable configuration advice than anything else we could tell them. If the configurator is a live tool on the website, then it could be aware of the latest release numbers and prod people with an appropriate amount of urgency depending on how old they say their version is. This may be the one good reason not to provide it as a standalone program. (No, we shouldn't make it try to phone home for latest release numbers --- in the first place, that won't work if the machine is really isolated from the net, and in the second place people will be suspicious of the motives.) regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Lance, The parameters I would think we should calculate are: max_connections shared_buffers work_mem maintenance_work_mem effective_cache_size random_page_cost Actually, I'm going to argue against messing with random_page_cost. It's a cannon being used when a slingshot is called for. Instead (and this was the reason for the What kind of CPU? question) you want to reduce the cpu_* costs. I generally find that if cpu_* are reduced as appropriate to modern faster cpus, and effective_cache_size is set appropriately, a random_page_cost of 3.5 seems to work for appropriate choice of index scans. If you check out my spreadsheet version of this: http://pgfoundry.org/docman/view.php/1000106/84/calcfactors.sxc ... you'll see that the approach I found most effective was to create profiles for each of the types of db applications, and then adjust the numbers based on those. Other things to adjust: wal_buffers checkpoint_segments commit_delay vacuum_delay autovacuum Anyway, do you have a pgfoundry ID? I should add you to the project. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 6: explain analyze is your friend