Re: [ADMIN] Data split -- Creating a copy of database without outage

2012-05-30 Thread Igor Shmain
Thank you, Liu, for your suggestion. 

 

I might be missing something (I am new to postgres), but it seems that your
suggestion will not help much in my case.  Since the number of db requests
will grow with time (more users will come into the system), the current
server will become incapable of serving all the requests quickly enough.

 

The idea is to increase overall calculation capacity of the system by adding
more servers to it. Throwing more hardware to a single main server is not an
option in my case. Creating multiple replicas (slaves) is also not an good
option - it would be way more efficient to have a group of db servers, each
serving only some subset of users and hosting data for those users only.
Buying new servers in advance is not an option too.

 

What I am looking for is switching some of the users to another db server
when the capacity of the existing server(s) is not enough. The point is to
do it without interrupting the users' work (so they do not see that horrible
Sorry! This site is under maintenance.).

 

If I missed something it would be very kind of you to point this out.

 

Thank you once again,

-igorS

 

 

From: Haifeng Liu [mailto:liuhaif...@live.com] 
Sent: May-29-12 9:13 PM
To: Igor Shmain
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Data split -- Creating a copy of database without
outage

 

Why not use a partitioned table? You can write a trigger to control which
partition the coming data should be inserted.

 

Regards,



Liu Haifeng

Home: http://liuhaifeng.com

 





 

On May 30, 2012, at 4:13 AM, Igor Shmain wrote:





Can you please help with advice?

I need to design a solution for a database which will grow and will require
a horizontal split at some moment.

Here is how I am planning to do it: Every record in every table has a shard
number. After the database becomes too large (gets too many requests), the
tables need to be horizontally split. It means that for every table all
records with some specific shard numbers need to be moved to a new database.

My plan is to (1) create a copy of the database on a new server
(backup/restore?), (2) synchronize the databases (how?), and then (3)
forward all requests for the moved shards to the new database. (I will get
rid of the unneeded shards later). An important requirement: there should be
no outage for the users.

What are the ways to achieve it?

 

Thank you in advance,
-igorS

 

 



Re: [ADMIN] Data split -- Creating a copy of database without outage

2012-05-30 Thread Craig James
On Wed, May 30, 2012 at 8:14 AM, Igor Shmain igor.shm...@gmail.com wrote:

 Thank you, Liu, for your suggestion. 

 ** **

 I might be missing something (I am new to postgres), but it seems that
 your suggestion will not help much in my case.  Since the number of db
 requests will grow with time (more users will come into the system), the
 current server will become incapable of serving all the requests quickly
 enough.

 ** **

 The idea is to increase overall calculation capacity of the system by
 adding more servers to it. Throwing more hardware to a single main server
 is not an option in my case. Creating multiple replicas (slaves) is also
 not an good option – it would be way more efficient to have a group of db
 servers, each serving only some subset of users and hosting data for those
 users only. Buying new servers in advance is not an option too.

 ** **

 What I am looking for is switching some of the users to another db server
 when the capacity of the existing server(s) is not enough. The point is to
 do it without interrupting the users’ work (so they do not see that
 horrible “Sorry! This site is under maintenance…”).


Londiste is very good at replicating a database.  It's normally used for
real-time backup or to load-balance read-only applications, but it can also
be used for the situation you describe.  At some point when you decide it's
necessary to split your database, you would create a new database on a
second server, create an empty duplicate schema (using pg_dump's
schema-only feature) then install Londiste.  Londiste would replicate your
database in real time up to the point where you were ready to make the
actual switch.

To switch, you'd simply stop Londiste and simultaneously reconfigure your
load-balancing system so that a subset of your users were directed to the
new database.  Then you could uninstall Londiste, and clean out each
database by removing the user data that is for users on the other server.

You might also consider partitioning.  If you know in advance that you're
going to be discarding large subsets of the data, it will be much more
efficient if you partition it at the outset.  When you split  your system
in two, the cleanup process will be nothing more than dropping partitions.
You won't be plagued by bloated indexes and files.

Craig



 

 ** **

 If I missed something it would be very kind of you to point this out.

 ** **

 Thank you once again,

 -igorS

 ** **

 ** **

 *From:* Haifeng Liu [mailto:liuhaif...@live.com]
 *Sent:* May-29-12 9:13 PM
 *To:* Igor Shmain
 *Cc:* pgsql-admin@postgresql.org
 *Subject:* Re: [ADMIN] Data split -- Creating a copy of database without
 outage

 ** **

 Why not use a partitioned table? You can write a trigger to control which
 partition the coming data should be inserted.

 ** **

 Regards,

 

 Liu Haifeng

 Home: http://liuhaifeng.com

 ** **



 

 ** **

 On May 30, 2012, at 4:13 AM, Igor Shmain wrote:



 

 Can you please help with advice?

 I need to design a solution for a database which will grow and will
 require a horizontal split at some moment.

 Here is how I am planning to do it: Every record in every table has a
 shard number. After the database becomes too large (gets too many
 requests), the tables need to be horizontally split. It means that for
 every table all records with some specific shard numbers need to be moved
 to a new database.

 My plan is to (1) create a copy of the database on a new server
 (backup/restore?), (2) synchronize the databases (how?), and then (3)
 forward all requests for the moved shards to the new database. (I will get
 rid of the unneeded shards later). An important requirement: there should
 be no outage for the users.

 What are the ways to achieve it?

  

 Thank you in advance,
 -igorS

  

 ** **



Re: [ADMIN] Data split -- Creating a copy of database without outage

2012-05-30 Thread Amador Alvarez

Hi ,

I would start with a single high performance tuned database focusing 
mainly on dealing efficiently with concurrent activity and identifying 
the real hot spots.
If you check out that you really need to go forward on database power, 
consider on adding  new databases and relocate some users whenever you 
need it.


Take a look on the different options related to replication, tunning and 
database balancers, set up some testings and get a deeper understanding 
on your real needs and performance.


A.A.



On 05/30/2012 08:14 AM, Igor Shmain wrote:


Thank you, Liu, for your suggestion.

I might be missing something (I am new to postgres), but it seems that 
your suggestion will not help much in my case.  Since the number of db 
requests will grow with time (more users will come into the system), 
the current server will become incapable of serving all the requests 
quickly enough.


The idea is to increase overall calculation capacity of the system by 
adding more servers to it. Throwing more hardware to a single main 
server is not an option in my case. Creating multiple replicas 
(slaves) is also not an good option -- it would be way more efficient 
to have a group of db servers, each serving only some subset of users 
and hosting data for those users only. Buying new servers in advance 
is not an option too.


What I am looking for is switching some of the users to another db 
server when the capacity of the existing server(s) is not enough. The 
point is to do it without interrupting the users' work (so they do not 
see that horrible Sorry! This site is under maintenance...).


If I missed something it would be very kind of you to point this out.

Thank you once again,

-igorS

*From:*Haifeng Liu [mailto:liuhaif...@live.com]
*Sent:* May-29-12 9:13 PM
*To:* Igor Shmain
*Cc:* pgsql-admin@postgresql.org
*Subject:* Re: [ADMIN] Data split -- Creating a copy of database 
without outage


Why not use a partitioned table? You can write a trigger to control 
which partition the coming data should be inserted.


Regards,



Liu Haifeng

Home: http://liuhaifeng.com



On May 30, 2012, at 4:13 AM, Igor Shmain wrote:



Can you please help with advice?

I need to design a solution for a database which will grow and will 
require a horizontal split at some moment.


Here is how I am planning to do it: Every record in every table has a 
shard number. After the database becomes too large (gets too many 
requests), the tables need to be horizontally split. It means that for 
every table all records with some specific shard numbers need to be 
moved to a new database.


My plan is to (1) create a copy of the database on a new server 
(backup/restore?), (2) synchronize the databases (how?), and then (3) 
forward all requests for the moved shards to the new database. (I will 
get rid of the unneeded shards later). An important requirement: there 
should be no outage for the users.


What are the ways to achieve it?

Thank you in advance,
-igorS



Re: [ADMIN] Data split -- Creating a copy of database without outage

2012-05-30 Thread Igor Shmain
Thank you Craig and thank you Amador for your great help!

 

Craig's ideas are very useful - I definitely want to look deeper into
Londiste and to use partitioning.

 

Amador's approach makes perfect sense to me.

 

 

Best wishes,

-igor