Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-14 Thread Craig Ringer

On 07/14/2012 09:26 AM, B Sreejith wrote:


Dear Robert,

We need to scale up both size and load.
Could you please provide steps I need to follow.



For load, first you need to build a representative sample of your 
application's querying patterns by logging queries and analysing the 
logs. Produce a load generator based on that data, set up a test copy of 
your database, and start pushing the query rate up to see what happens.


For simpler loads you can write a transaction script for pgbench based 
on your queries.


For size: Copy your data set, then start duplicating it with munged 
copies. Repeat, then use the load generator you wrote for the first part 
to see how scaling the data up affects your queries. See if anything is 
unacceptably slow (the auto_explain module is useful here) and examine it.


The truth is that predicting how complex database driven apps will scale 
is insanely hard, because access patterns change as data sizes and user 
counts grow. You're likely to land up tuning for a scenario that's quite 
different to the one that you actually face when you start hitting 
scaling limitations. This doesn't mean you should not investigate, it 
just means your trials don't prove anything and the optimisations you 
make based on what you learn may not gain you much.


--
Craig Ringer


Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-14 Thread John Jones
Hammerora is a good start but does have some issues when trying to get it
started. You can also try PGBench. As someone said, there is a plethora of
choices. It all depends on what you want to measure or accomplish.

John Jones

On Sat, Jul 14, 2012 at 1:48 AM, Craig Ringer ring...@ringerc.id.au wrote:

  On 07/14/2012 09:26 AM, B Sreejith wrote:

 Dear Robert,

 We need to scale up both size and load.
 Could you please provide steps I need to follow.


 For load, first you need to build a representative sample of your
 application's querying patterns by logging queries and analysing the logs.
 Produce a load generator based on that data, set up a test copy of your
 database, and start pushing the query rate up to see what happens.

 For simpler loads you can write a transaction script for pgbench based on
 your queries.

 For size: Copy your data set, then start duplicating it with munged
 copies. Repeat, then use the load generator you wrote for the first part to
 see how scaling the data up affects your queries. See if anything is
 unacceptably slow (the auto_explain module is useful here) and examine it.

 The truth is that predicting how complex database driven apps will scale
 is insanely hard, because access patterns change as data sizes and user
 counts grow. You're likely to land up tuning for a scenario that's quite
 different to the one that you actually face when you start hitting scaling
 limitations. This doesn't mean you should not investigate, it just means
 your trials don't prove anything and the optimisations you make based on
 what you learn may not gain you much.

 --
 Craig Ringer



Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-14 Thread B Sreejith
Dear All,
Thanks alot for all the invaluable comments.
Regards,
  Sreejith.
On Jul 14, 2012 2:19 PM, Craig Ringer ring...@ringerc.id.au wrote:

  On 07/14/2012 09:26 AM, B Sreejith wrote:

 Dear Robert,

 We need to scale up both size and load.
 Could you please provide steps I need to follow.


 For load, first you need to build a representative sample of your
 application's querying patterns by logging queries and analysing the logs.
 Produce a load generator based on that data, set up a test copy of your
 database, and start pushing the query rate up to see what happens.

 For simpler loads you can write a transaction script for pgbench based on
 your queries.

 For size: Copy your data set, then start duplicating it with munged
 copies. Repeat, then use the load generator you wrote for the first part to
 see how scaling the data up affects your queries. See if anything is
 unacceptably slow (the auto_explain module is useful here) and examine it.

 The truth is that predicting how complex database driven apps will scale
 is insanely hard, because access patterns change as data sizes and user
 counts grow. You're likely to land up tuning for a scenario that's quite
 different to the one that you actually face when you start hitting scaling
 limitations. This doesn't mean you should not investigate, it just means
 your trials don't prove anything and the optimisations you make based on
 what you learn may not gain you much.

 --
 Craig Ringer



Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-14 Thread Robert Klemme
On Sat, Jul 14, 2012 at 11:50 AM, B Sreejith bsreejit...@gmail.com wrote:
 Dear All,
 Thanks alot for all the invaluable comments.

Additionally to Craig's excellent advice to measurements there's
something else you can do: with the knowledge of the queries your
application fires against the database you can evaluate your schema
and index definitions.  While there is no guarantee that your
application will scale well if all indexes are present you believe
need to be present based on that inspection, you can pretty easily
identify tables with can be improved.  These are tables which a) are
known to grow large and b) do not have indexes nor no indexes which
support the queries your application does against these tables which
will result in full table scans.  Any database which scales in size
will sooner or later hit a point where full table scans of these large
tables will be extremely slow.  If these queries are done during
regular operation (and not nightly maintenance windows for example)
then you pretty surely have identified a show stopper.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-14 Thread Sergey Konoplev
 We have around 15 to 18 separate products.What we are told to do is to check
 the scalability of the underlying DB of each product (application).

 Sounds like your client / boss has a case of buzz-word-itis. Scalability
 means lots of different things:

Yes, it is still not clear what exactly they want from you, but that
is what I would do...

I would take the metrics Craig described. These ones:

 - How well it copes with growth of data sizes
 - How well it copes with growth of query rates / activity
 - How well it copes with larger user counts (may not be the same as prior)
- Also hard drives activity, CPU, etc

And started to collect this statistics using monitoring tools like
http://www.cacti.net/, for example.

After a week/month/quarter, as time passes and the database activity
and size changes, you will see how the measurements are changed
(usually degraded). So you would be able to make conclusions on
whether your environment meets current requirements or not and to
forecast critical points.

As Craig mentioned, you may also try to simulate your database
activity either with pgbench. I would just like to show you this
article http://www.westnet.com/~gsmith/content/postgresql/pgbench-scaling.htm
where you will find some hints for your case.

Also look at the playback tools
http://wiki.postgresql.org/wiki/Statement_Playback.

-- 
Sergey Konoplev

a database architect, software developer at PostgreSQL-Consulting.com
http://www.postgresql-consulting.com

Jabber: gray...@gmail.com Skype: gray-hemp Phone: +79160686204

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-14 Thread Craig Ringer

On 07/14/2012 08:17 PM, Robert Klemme wrote:

On Sat, Jul 14, 2012 at 11:50 AM, B Sreejith bsreejit...@gmail.com wrote:

Dear All,
Thanks alot for all the invaluable comments.

Additionally to Craig's excellent advice to measurements there's
something else you can do: with the knowledge of the queries your
application fires against the database you can evaluate your schema
and index definitions.  While there is no guarantee that your
application will scale well if all indexes are present
Don't forget that sometimes it's better to DROP an index that isn't used 
much, or that only helps occasional queries that aren't time-sensitive. 
Every index has a cost to maintain - it slows down your inserts and 
updates and it competes for disk cache with things that might be more 
beneficial.

b) do not have indexes nor no indexes which
support the queries your application does against these tables which
will result in full table scans.
A full table scan is not inherently a bad thing, even for a huge table. 
Sometimes you just need to examine every row, and the fastest way to do 
that is without a doubt a full table scan.


Remember, a full table scan won't tend to push everything out of 
shared_buffers, so it can also avoid competition for cache.


(If anyone ever wants concurrent scans badly enough to implement them, 
full table scans with effective_io_concurrency  1 will become a *lot* 
faster for some types of query).


--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-13 Thread Sreejith Balakrishnan
Dear @,

Is there any tool or some sort of script available, for PostgreSQL, which 
can be used to measure scalability of an application's database. Or is 
there any guideline on how to do this.

I am a bit confused about the concept of measuring scalability of an 
application's database.

How is the scalability measured? 

Is it like loading the DB with a bulk data volume and then do performance 
testing by using tools like JMeter?

Could any one kindly help me on this..

Thanks,
Sreejith.
=-=-=
Notice: The information contained in this e-mail
message and/or attachments to it may contain 
confidential or privileged information. If you are 
not the intended recipient, any dissemination, use, 
review, distribution, printing or copying of the 
information contained in this e-mail message 
and/or attachments to it are strictly prohibited. If 
you have received this communication in error, 
please notify us by reply e-mail or telephone and 
immediately and permanently delete the message 
and any attachments. Thank you




Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-13 Thread Sergey Konoplev
On Tue, Jul 10, 2012 at 12:21 PM, Sreejith Balakrishnan
sreejith.balakrish...@tcs.com wrote:
 Dear @,

 Is there any tool or some sort of script available, for PostgreSQL, which
 can be used to measure scalability of an application's database. Or is there
 any guideline on how to do this.

scalability of an application's database can be understood either
like a relation of transactions per second to database size or like an
ability of database to be sharded/partitioned or may be like something
else.

Could you please explain more specifically the original task?
What is the goal of it?

 I am a bit confused about the concept of measuring scalability of an
 application's database.

 How is the scalability measured?

 Is it like loading the DB with a bulk data volume and then do performance
 testing by using tools like JMeter?

 Could any one kindly help me on this..

 Thanks,
 Sreejith.

 =-=-=
 Notice: The information contained in this e-mail
 message and/or attachments to it may contain
 confidential or privileged information. If you are
 not the intended recipient, any dissemination, use,
 review, distribution, printing or copying of the
 information contained in this e-mail message
 and/or attachments to it are strictly prohibited. If
 you have received this communication in error,
 please notify us by reply e-mail or telephone and
 immediately and permanently delete the message
 and any attachments. Thank you



-- 
Sergey Konoplev

a database and software architect
http://www.linkedin.com/in/grayhemp

Jabber: gray...@gmail.com Skype: gray-hemp Phone: +79160686204

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-13 Thread Robert Klemme
On Tue, Jul 10, 2012 at 10:21 AM, Sreejith Balakrishnan
sreejith.balakrish...@tcs.com wrote:
 Is there any tool or some sort of script available, for PostgreSQL, which
 can be used to measure scalability of an application's database. Or is there
 any guideline on how to do this.

 I am a bit confused about the concept of measuring scalability of an
 application's database.

You cannot measure scalability of a database as such.  You need to
know the nature of the load (i.e. operations executed against the DB -
how many INSERT, UPDATE, DELETE and SELECT, against which tables and
with what frequency and criteria).  And then, as Sergey said, you need
to define whether you want to scale up the load or the size - or both.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-13 Thread B Sreejith
Dear Sergev,

We have around 15 to 18 separate products.What we are told to do is to
check the scalability of the underlying DB of each product (application).
That's the requirement.Nothing more was explained to us.That's why I said
earlier that I am confused on how to approach this.

Regards,
Sreejith.
On Jul 14, 2012 12:08 AM, Sergey Konoplev gray...@gmail.com wrote:

 On Tue, Jul 10, 2012 at 12:21 PM, Sreejith Balakrishnan
 sreejith.balakrish...@tcs.com wrote:
  Dear @,
 
  Is there any tool or some sort of script available, for PostgreSQL, which
  can be used to measure scalability of an application's database. Or is
 there
  any guideline on how to do this.

 scalability of an application's database can be understood either
 like a relation of transactions per second to database size or like an
 ability of database to be sharded/partitioned or may be like something
 else.

 Could you please explain more specifically the original task?
 What is the goal of it?

  I am a bit confused about the concept of measuring scalability of an
  application's database.
 
  How is the scalability measured?
 
  Is it like loading the DB with a bulk data volume and then do performance
  testing by using tools like JMeter?
 
  Could any one kindly help me on this..
 
  Thanks,
  Sreejith.
 
  =-=-=
  Notice: The information contained in this e-mail
  message and/or attachments to it may contain
  confidential or privileged information. If you are
  not the intended recipient, any dissemination, use,
  review, distribution, printing or copying of the
  information contained in this e-mail message
  and/or attachments to it are strictly prohibited. If
  you have received this communication in error,
  please notify us by reply e-mail or telephone and
  immediately and permanently delete the message
  and any attachments. Thank you



 --
 Sergey Konoplev

 a database and software architect
 http://www.linkedin.com/in/grayhemp

 Jabber: gray...@gmail.com Skype: gray-hemp Phone: +79160686204

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-13 Thread B Sreejith
Dear Robert,

We need to scale up both size and load.
Could you please provide steps I need to follow.

Warm regards,
Sreejith.
On Jul 14, 2012 1:37 AM, Robert Klemme shortcut...@googlemail.com wrote:

 On Tue, Jul 10, 2012 at 10:21 AM, Sreejith Balakrishnan
 sreejith.balakrish...@tcs.com wrote:
  Is there any tool or some sort of script available, for PostgreSQL, which
  can be used to measure scalability of an application's database. Or is
 there
  any guideline on how to do this.
 
  I am a bit confused about the concept of measuring scalability of an
  application's database.

 You cannot measure scalability of a database as such.  You need to
 know the nature of the load (i.e. operations executed against the DB -
 how many INSERT, UPDATE, DELETE and SELECT, against which tables and
 with what frequency and criteria).  And then, as Sergey said, you need
 to define whether you want to scale up the load or the size - or both.

 Kind regards

 robert

 --
 remember.guy do |as, often| as.you_can - without end
 http://blog.rubybestpractices.com/

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-13 Thread Craig Ringer

On 07/14/2012 09:21 AM, B Sreejith wrote:


Dear Sergev,

We have around 15 to 18 separate products.What we are told to do is to 
check the scalability of the underlying DB of each product (application).
That's the requirement.Nothing more was explained to us.That's why I 
said earlier that I am confused on how to approach this.




Sounds like your client / boss has a case of buzz-word-itis. 
Scalability means lots of different things:


- How well it copes with growth of data sizes
- How well it copes with growth of query rates / activity
- How well it copes with larger user counts (may not be the same as prior)
- Whether it's easily sharded onto multiple systems
- Whether it has any locking choke-points that serialize common operations
- 

Perhaps most importantly, your database is only as scalable as your 
application's use of it. Two apps can use exactly the same database 
structure, but one of them can struggle massively under load another one 
barely notices. For example, if one app does this (pseudocode):


SELECT id FROM customer WHERE 
FOR attribute IN customer
   SELECT :attribute.name FROM customer WHERE id = :customer.id
   IF attribute.is_changed THEN
   UPDATE customer SET :attribute.name = :attribute.new_value WHERE 
id = :customer.id

   END IF

and another just does:

UPDATE customer
SET attribute1 = value1, attribute2 = value2, attribute3 = value3
WHERE 


The first will totally melt down under load that isn't significantly 
different from idle as far as the second one is concerned.


That's a ridiculously bad example for the first app, but real examples 
that aren't much better arise from badly tuned or badly written object 
relational management systems. The classic N+1 selects problem and 
massive inefficient multiple left outer joins are classics.


Thus, you can't really evaluate the scalability of the database under 
load separately from the application that's using it and the workload.


--
Craig Ringer