Re: [PERFORM] Database conversion woes...

2004-02-06 Thread Kevin Carpenter
First, I just wanted to say thank you all for the quick and helpful
answers.  With all the input I know I am on the right track.  With that
in mind I created a Perl script to do my migrations, moving each
database name to a schema name.  I had done a lot of reading on
converting around the mismatch of data types that MySQL likes to use.
I must say it is VERY nice having an intelligent system that, say,
won't let a date of '0000-00-00' be entered.  Luckily I didn't have to
deal with any enumerations.
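
A minimal sketch of that db-name-to-schema-name approach, using Perl
DBI with the mysql and Pg drivers (the connection details, database
names, and the type-translation step are placeholders, not the actual
script):

    #!/usr/bin/perl
    # Sketch: migrate each MySQL database into a same-named PostgreSQL
    # schema.  All names and credentials below are hypothetical.
    use strict;
    use warnings;
    use DBI;

    my @client_dbs = ('client_a', 'client_b');    # hypothetical db names

    my $pg = DBI->connect('dbi:Pg:dbname=combined', 'postgres', '',
                          { RaiseError => 1 });

    for my $db (@client_dbs) {
        my $my = DBI->connect("dbi:mysql:database=$db", 'root', '',
                              { RaiseError => 1 });

        # One schema per former MySQL database.
        $pg->do(qq{CREATE SCHEMA "$db"});

        # Real code would translate column types here (e.g. MySQL's
        # zero dates to NULL) and then COPY the rows across.
        my $tables = $my->selectcol_arrayref('SHOW TABLES');
        warn "would migrate $db.$_\n" for @$tables;

        $my->disconnect;
    }
    $pg->disconnect;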

So the conversion goes on.  I will definitely be back and forth in here
as I get the new queries written and start migrating all I can back into
the pg backend, using plpgsql or C for the stored procedures where
required.  The Mammoth replicator has been working well.  I had tried
pgsql-r and had limited success with it, and dbmirror was just taking
too long, having to do 4 db transactions just to mirror one command.
I have eserv but was never really a Java kind of guy.

Alright then - back to my code.  Again thanks for the help and info.

Kevin



Re: [PERFORM] Database conversion woes...

2004-02-03 Thread Jeff
On Tue, 03 Feb 2004 11:42:59 -0500
Kevin Carpenter [EMAIL PROTECTED] wrote:

 For starters, I am moving roughly 50 separate databases, each of which
 represents one of our clients and is roughly 500 megs to 3 gigs in
 size.
  Currently we are using MySQL replication, and so I am looking at
 Mammoth's replicator for this one.  However, I have seen it only allows
 one DB to be replicated at a time.  With the size of each single db, I

I don't know too much about Mammoth's, but judging from how the others
work, you should be able to run a replicator for each db.  (Or hack up
a shell script to run the replicator for each db; either way, each db
will be replicated independently of the others.)
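
Either way would look something like the sketch below; 'replicate' and
its flag are stand-ins for whatever command the replication package
actually ships with:

    #!/usr/bin/perl
    # Sketch: launch one replicator process per database.  The command
    # name and its flag are hypothetical placeholders.
    use strict;
    use warnings;

    my @dbs = ('client_a', 'client_b', 'client_c');  # hypothetical names

    for my $db (@dbs) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            exec('replicate', '--dbname', $db)       # hypothetical CLI
                or die "exec replicate failed: $!";
        }
    }
    wait() for @dbs;    # reap one child per database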

 don't know how I could put them all together under one roof, and if I
 was going to, what are the maximums that Postgres can handle for
 tables in one db?  We track over 2 million new points of data
 (records) a day, and are moving to 5 million in the next year.
 

From the docs:

Maximum size for a database            unlimited (4 TB databases exist)
Maximum size for a table               16 TB on all operating systems
Maximum size for a row                 1.6 TB
Maximum size for a field               1 GB
Maximum number of rows in a table      unlimited
Maximum number of columns in a table   250-1600, depending on column types
Maximum number of indexes on a table   unlimited
 
...

My largest PG db is 50GB. 

My busiest PG db runs about 50 updates/deletes/inserts per second,
sustained throughout the day (it bursts up to 150 now and then), plus
about 40 selects per second.  And the machine it is running on is
typically 95% idle (quad 2 GHz Xeon).

-- 
Jeff Trout [EMAIL PROTECTED]
http://www.jefftrout.com/
http://www.stuarthamm.net/



Re: [PERFORM] Database conversion woes...

2004-02-03 Thread Richard Huxton
On Tuesday 03 February 2004 16:42, Kevin Carpenter wrote:

 Thanks in advance - I will give more detail if needed; just looking for
 some open directions and maybe some kicks to fuel my thinking in other
 areas.

I've taken to doing a lot of my data manipulation (version conversions etc) in 
PG even if the final resting place is MySQL.

It's generally not too difficult to transfer data, but you will have
problems with MySQL's more relaxed attitude to data types (things like
all-zero timestamps).  I tend to write a script to tidy the data before
export, and repeatedly restore from backup until the script corrects
all problems.  I'm not sure how convenient that'll be with dozens of
gigs of data; it might be practical to start with the smaller databases
and let your script grow in capabilities before importing the larger
ones.
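
For instance, the all-zero-timestamp problem can be scrubbed on the
MySQL side before export with something as simple as this (the table
and column names are hypothetical):

    #!/usr/bin/perl
    # Sketch: rewrite MySQL's all-zero timestamps as NULL so that
    # PostgreSQL will accept the rows on import.  Names are hypothetical.
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:mysql:database=client_a', 'root', '',
                           { RaiseError => 1 });

    $dbh->do(q{UPDATE orders
               SET    shipped_at = NULL
               WHERE  shipped_at = '0000-00-00 00:00:00'});

    $dbh->disconnect;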

-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Database conversion woes...

2004-02-03 Thread Christopher Browne
[EMAIL PROTECTED] (Kevin Carpenter) writes:
 I am doing a massive database conversion from MySQL to PostgreSQL for
 a company I am working for.  It has a few quirks for which I haven't
 been able to nail down the answers I need by reading and searching
 through previous list info.

 For starters, I am moving roughly 50 separate databases, each of
 which represents one of our clients and is roughly 500 megs to 3 gigs
 in size.  Currently we are using MySQL replication, and so I am
 looking at Mammoth's replicator for this one.  However, I have seen it
 only allows one DB to be replicated at a time.  With the size of each
 single db, I don't know how I could put them all together under one
 roof, and if I was going to, what are the maximums that Postgres can
 handle for tables in one db?  We track over 2 million new points of
 data (records) a day, and are moving to 5 million in the next year.

I'll be evasive about replication, because the answers are pretty
painful :-(, but as for the rest of it, nothing about this sounds
challenging.

There is room for debate here as to whether you should have:
 a) one postmaster and many database instances,
 b) one postmaster, one (or a few) database instances, with the client
    'split' done via schemas, or
 c) several postmasters, many database instances.

Replication will tend to work best with scenario b), which minimizes
the number of connections needed to manage replication; that's
definitely a factor worth considering.

It can also be worthwhile to spread vastly differing kinds of activity
across different backends so that they can have separate buffer caches.
If all the activity is shared across one postmaster, it is all shared
across one buffer cache, and there are pathological situations
occasionally observed in practice where one process thrashes the shared
cache, thereby hurting performance for every other process using that
backend.  In such a case, it may be best to give the ill-behaved
process its own database instance with a small cache that it can thrash
without inconveniencing others.
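
As a concrete (hypothetical) illustration, the two instances'
postgresql.conf files might differ only in cache size; the values here
are illustrative, and in 7.x shared_buffers counts 8 kB pages:

    # postgresql.conf for the main instance
    shared_buffers = 8192     # 8192 x 8 kB pages = 64 MB of cache

    # postgresql.conf for the ill-behaved workload's instance
    shared_buffers = 512      # 512 x 8 kB pages = 4 MB to thrash in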

Jan Wieck is working on some improvements for buffer management in 7.5
that may improve the situation vis-a-vis buffering, but that is
certainly not something ready to deploy in production just yet.

 Second, what about the physical database size - what are the limits
 there?  I have seen that it was 4 gigs on Linux in a message from 2000,
 but what about now?  Have we found ways past that?

There's NO problem with having enormous databases now; each table is
represented as one or more files (when a table crosses a size barrier,
often configured at 1 GB, PostgreSQL creates an extent and extends into
another file), and for there to be problems with this, they would have
to be _really crippling_ OS problems.
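
On disk, a large table just shows up as numbered 1 GB segment files
under the database's directory (the OIDs here are made up):

    $PGDATA/base/17145/18237      # first 1 GB segment of the table
    $PGDATA/base/17145/18237.1    # second segment
    $PGDATA/base/17145/18237.2    # and so on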
-- 
(format nil [EMAIL PROTECTED] cbbrowne cbbrowne.com)
http://www3.sympatico.ca/cbbrowne/linuxxian.html
We come to bury DOS, not to praise it.
-- Paul Vojta [EMAIL PROTECTED], paraphrasing a quote of
Shakespeare



Re: [PERFORM] Database conversion woes...

2004-02-03 Thread Richard Huxton
On Tuesday 03 February 2004 22:29, Kevin wrote:
 The Mammoth replicator has been working well.  I had tried
 pgsql-r and had limited success with it, and dbmirror was just
 taking too long, having to do 4 db transactions just to mirror one
 command.  I have eserv but was never really a Java kind of guy.

When this is over and you've got the time, I don't suppose you could put 
together a few hundred words describing your experiences with the Mammoth 
replicator - there are a couple of places they could be posted.

-- 
  Richard Huxton
  Archonet Ltd
