[PERFORM] How clustering for scale out works in PostgreSQL
I am *expecting 1000+ hits to my PostgreSQL DB* and I doubt my standalone DB will be able to handle it. So I want to *scale out by adding more servers to share the load*. For this, I want to do clustering. I am *curious to know how clustering works in PostgreSQL.* (I don't want to know how to setup cluster - as of now. Just want to know how clustering works). When I look at some of the content available while googling, I am getting more and more confused, as I find that in most of the sites, clustering is used interchangeably with replication. *My purpose is scale out to handle more load, not high availability.* Can any one please help me with the details or guide me to use urls. -- View this message in context: http://postgresql.1045698.n5.nabble.com/How-clustering-for-scale-out-works-in-PostgreSQL-tp5768917.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] How clustering for scale out works in PostgreSQL
On 08/29/2013 07:59 AM, Richard Huxton wrote: On 29/08/13 13:14, bsreejithin wrote: I am *expecting 1000+ hits to my PostgreSQL DB* and I doubt my standalone DB will be able to handle it. We are going to need a little more detail here. In a normal environment 1000+ hits isn't that much, even if the hit is generating a dozen queries per page. A more appropriate action would be to consider the amount of transaction per second and the type of queries the machine will be doing. You will want to look into replication, hot standby as well as read only scaling with pgpool-II. OMG! 1000 hits every year! And hits too - not just any type of query :-) Seriously, if you try describing your setup, what queries make up your hits and what you mean by 1000 then there are people on this list who can tell you what sort of setup you'll need. While you're away googling though, replication is indeed the term you want. In particular hot standby which lets you run read-only queries on the replicas. Sarcasm with new recruits to the community is not the way to go. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] How clustering for scale out works in PostgreSQL
On Thu, Aug 29, 2013 at 11:13 AM, bsreejithin bsreejit...@gmail.com wrote: Thanks a lot Joshua and others who have responded.. I am sorry about not putting in more details in my initial post. What I posted is about a new setup that's going to come up..Discussions are on whether to setup DB cluster to handle 1000 concurrent users. Ok. That's a start. Can you tell us more about what these users are doing? What kind of queries are being issued to the database? How often (per user or total per time)? __ *Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley* 1750 Wallace Ave | St Charles, IL 60174-3401 Office: 630.313.7818 mike.blackw...@rrd.com http://www.rrdonnelley.com http://www.rrdonnelley.com/ * mike.blackw...@rrd.com*
Re: [PERFORM] How clustering for scale out works in PostgreSQL
The performance test that was conducted was for 1 Hour. There are 6 transactions. 2 DB inserts and 4 SELECTs. Every 2 minutes there will be 4 SELECTs. And every 3 minutes there will be 2 DB inserts. -- View this message in context: http://postgresql.1045698.n5.nabble.com/How-clustering-for-scale-out-works-in-PostgreSQL-tp5768917p5768957.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] How clustering for scale out works in PostgreSQL
On 08/29/2013 09:42 AM, bsreejithin wrote: The performance test that was conducted was for 1 Hour. There are 6 transactions. 2 DB inserts and 4 SELECTs. Every 2 minutes there will be 4 SELECTs. And every 3 minutes there will be 2 DB inserts. This shouldn't be a problem with proper hardware and a connection pooler. The concern isn't the 1000 sessions, it is the creating and destroying in rapid succession of 1000 connections. A connection pooler will resolve that issue. Sincerely, JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] How clustering for scale out works in PostgreSQL
-Original Message- From: pgsql-performance-ow...@postgresql.org [mailto:pgsql- performance-ow...@postgresql.org] On Behalf Of bsreejithin Sent: Thursday, August 29, 2013 12:42 PM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How clustering for scale out works in PostgreSQL The performance test that was conducted was for 1 Hour. There are 6 transactions. 2 DB inserts and 4 SELECTs. Every 2 minutes there will be 4 SELECTs. And every 3 minutes there will be 2 DB inserts. -- View this message in context: http://postgresql.1045698.n5.nabble.com/How-clustering-for-scale-out- works-in-PostgreSQL-tp5768917p5768957.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. With that kind of activity, you don't need clustering for your 1000 users. What you need is PgBouncer, it should solv your problem. Please read some docs on PgBouncer, it's light-weight and very easy to setup. Regards, Igor Neyman -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] How clustering for scale out works in PostgreSQL
Ok Igor..Will check out PgBouncer..Thanks a lot. -- View this message in context: http://postgresql.1045698.n5.nabble.com/How-clustering-for-scale-out-works-in-PostgreSQL-tp5768917p5768960.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] How clustering for scale out works in PostgreSQL
Thanks Joshua..Will look to use connection pooler which Igor mentioned.. -- View this message in context: http://postgresql.1045698.n5.nabble.com/How-clustering-for-scale-out-works-in-PostgreSQL-tp5768917p5768961.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] How clustering for scale out works in PostgreSQL
bsreejithin wrote on 29.08.2013 18:13: PostgreSQL version was* 8.2*. 8.2 has long been deprecated. For a new system you should use 9.2 (or at least 9.1) Thomas -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] How clustering for scale out works in PostgreSQL
Ya..sure...Migration to 9.2 is one of the activities planned and in fact it's already on track.Thanks Thomas On Thu, Aug 29, 2013 at 11:16 PM, Thomas Kellerer [via PostgreSQL] ml-node+s1045698n5768973...@n5.nabble.com wrote: bsreejithin wrote on 29.08.2013 18:13: PostgreSQL version was* 8.2*. 8.2 has long been deprecated. For a new system you should use 9.2 (or at least 9.1) Thomas -- Sent via pgsql-performance mailing list ([hidden email]http://user/SendEmail.jtp?type=nodenode=5768973i=0) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- If you reply to this email, your message will be added to the discussion below: http://postgresql.1045698.n5.nabble.com/How-clustering-for-scale-out-works-in-PostgreSQL-tp5768917p5768973.html To unsubscribe from How clustering for scale out works in PostgreSQL, click herehttp://postgresql.1045698.n5.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_codenode=5768917code=YnNyZWVqaXRoaW5AZ21haWwuY29tfDU3Njg5MTd8MTYxODQyODgxOA== . NAMLhttp://postgresql.1045698.n5.nabble.com/template/NamlServlet.jtp?macro=macro_viewerid=instant_html%21nabble%3Aemail.namlbase=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespacebreadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml -- View this message in context: http://postgresql.1045698.n5.nabble.com/How-clustering-for-scale-out-works-in-PostgreSQL-tp5768917p5768974.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
[PERFORM] Optimising views
Hi all, I've started using views in an attempt to try and simplify some of my more complex reporting and running into some snags and I'm trying to figure out how Postgres decides to join data with my view. Just to present a simplified representation of what I am trying to accomplish the following would match pretty well with what I'm doing. At the center I have a table called jobs that holds information about jobs clients have requested. Related to my jobs are a number of different tables that each hold different types of costs. There is a simple table called directcosts that simply holds single items that represent a cost on that job There is a more complex invoice/detail structure for invoices received by suppliers who did things in relation to the job And a few more structures like that. What I currently do when I need to report on the total costs per job, and which works very well, is the following: select jobid, jobdesc, sum(cost) from ( select jobid, jobdesc, dcamount as cost from jobs join directcosts on jobid = dcjobid where some filter for my jobs union all select jobid, jobdesc, detamount as cost from jobs join invoiceheader on jobid = invjobid join invoicedetail on detinvid = invid where some filter for my jobs ) totalcosts group by jobid, jobdesc Work well enough.. But as I'm using the same data in different reports and I though a view might be smart. So I created a view: create view v_costs as select dcjobid as costjobid, sum(dcamount) as costamount from directcosts group by dcjobid union all select invjobid as costjobid, sum(detamount) as costamount from invoiceheader join finvoicedetail on detinvid = invid group by invjobid And rewrote my report to: select jobid, jobdesc, sum(costamount) from jobs join v_costs on costjobid = jobid where some filter for my jobs group by jobid, jobdesc Now what I was hoping for was that postgres would start at my jobs table, find the records I'm trying to report on and then index scan on the related tables and start aggregating the amounts. What it seems to do is to first execute the view to get totals for all the jobs in the database and join that result set with my 2 or 3 jobs that match my filter. What is it about my view that prevents postgres to effectively use it? The group bys? the union? Cheers, Bastiaan Olij -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance