Re: [PERFORM] Database conversion woes...
First, just wanted to say thank you all for the quick and helpful answers. With all the input I know I am on the right track. With that in mind, I created a Perl script to do my migrations, moving each database name to a schema name. I had done a lot of reading on conversion issues arising from the mismatched data types that MySQL likes to use. I must say it is VERY nice having an intelligent system that, say, won't let a date of '0000-00-00' be entered. Luckily I didn't have to deal with any enumerations.

So the conversion goes on. I will definitely be back and forth in here as I get the new queries written and start migrating all I can back into the pg backend, using plpgsql or C for the stored procedures where required.

The Mammoth replicator has been working well. I had tried pgsql-r and had limited success with it, and dbmirror was just taking too long, having to do 4 db transactions just to mirror one command. I have eserv but was never really a Java kind of guy.

Alright then - back to my code. Again, thanks for the help and info.

Kevin

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
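The db-name-to-schema migration Kevin describes could be sketched roughly as below. His actual script was Perl; here the database names, the `convert_dump.pl` filter, and the target database name `clients` are all hypothetical placeholders, not his real setup.

```python
# Sketch of a per-database migration plan: each MySQL database becomes
# one PostgreSQL schema. "convert_dump.pl" and the db names are
# hypothetical stand-ins for whatever tidy/convert step is used.

def migration_plan(db_names, pg_db="clients"):
    """One (schema DDL, dump-and-load command) pair per MySQL database."""
    plan = []
    for db in db_names:
        ddl = "CREATE SCHEMA %s;" % db
        cmd = ("mysqldump --compatible=postgresql %s "
               "| convert_dump.pl --schema %s "
               "| psql %s" % (db, db, pg_db))
        plan.append((ddl, cmd))
    return plan

if __name__ == "__main__":
    for ddl, cmd in migration_plan(["client_a", "client_b"]):
        print(ddl)
        print(cmd)
```

With ~50 client databases, driving the whole conversion from one generated plan like this keeps the db-to-schema mapping in a single place.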
Re: [PERFORM] Database conversion woes...
On Tue, 03 Feb 2004 11:42:59 -0500, Kevin Carpenter [EMAIL PROTECTED] wrote:

> For starters, I am moving roughly 50 separate databases, each of which
> represents one of our clients and is roughly 500 megs to 3 gigs in size.
> Currently we are using MySQL replication, and so I am looking at
> Mammoth's replicator for this one. However, I have seen it only allows
> one DB to be replicated at a time.

Not knowing too much about Mammoth's, but from how the others work, you should be able to run a replicator for each db. (Or hack a shell script up to run the replicator for each db. Either way, each db will be replicated independently of the others.)

> With the size of each single db, I don't know how I could put them all
> together under one roof, and if I was going to, what are the maximums
> that Postgres can handle for tables in one db? We track over 2 million
> new points of data (records) a day, and are moving to 5 million in the
> next year.

From the docs:

  Maximum size for a database:           unlimited (4 TB databases exist)
  Maximum size for a table:              16 TB on all operating systems
  Maximum size for a row:                1.6 TB
  Maximum size for a field:              1 GB
  Maximum number of rows in a table:     unlimited
  Maximum number of columns in a table:  250 - 1600, depending on column types
  Maximum number of indexes on a table:  unlimited

My largest PG db is 50GB. My busiest PG db runs about 50 update/delete/inserts per second, sustained throughout the day (it bursts up to 150 now and then), plus about 40 selects per second. And the machine it is running on is typically 95% idle. (Quad 2GHz Xeon.)

--
Jeff Trout [EMAIL PROTECTED]
http://www.jefftrout.com/
http://www.stuarthamm.net/
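The "run a replicator for each db" suggestion could be scripted along these lines. The `replicator` command name, its `--dbname` flag, and the database list are hypothetical placeholders, not Mammoth's real CLI.

```python
# Sketch: build one independent replicator invocation per database,
# so each db is replicated on its own. The command name and flag are
# hypothetical; substitute whatever the real replicator expects.
import shlex

DATABASES = ["client_a", "client_b", "client_c"]  # ~50 in Kevin's case

def replicator_cmd(dbname):
    """Command line to replicate a single database independently."""
    return "replicator --dbname %s" % shlex.quote(dbname)

def all_cmds(dbnames):
    return [replicator_cmd(db) for db in dbnames]

if __name__ == "__main__":
    for cmd in all_cmds(DATABASES):
        print(cmd)
```

Printing the commands (rather than spawning them directly) makes the list easy to feed into a shell script or process supervisor, which is essentially the hack being suggested.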
Re: [PERFORM] Database conversion woes...
On Tuesday 03 February 2004 16:42, Kevin Carpenter wrote:

> Thanks in advance, will give more detail - just looking for some open
> directions and maybe some kicks to fuel my thought in other areas.

I've taken to doing a lot of my data manipulation (version conversions etc.) in PG, even if the final resting place is MySQL. It's generally not too difficult to transfer data, but you will have problems with MySQL's more relaxed attitude to data types (things like all-zero timestamps). I tend to write a script to tidy the data before export, and repeatedly restore from backup until the script corrects all problems. Not sure how convenient that'll be with dozens of gigs of data. It might be practical to start with the smaller databases and let your script grow in capabilities before importing the larger ones.

--
Richard Huxton
Archonet Ltd
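The tidy-before-export pass for MySQL's relaxed types could look something like this minimal sketch. The row format (a dict per row) and column names are hypothetical; the point is the all-zero-timestamp fix-up, since PostgreSQL rejects those values outright.

```python
# Sketch of a pre-export tidy pass: MySQL permits all-zero dates and
# timestamps, which PostgreSQL will refuse, so map them to NULL (None)
# before loading. Row structure here is a hypothetical example.
ZERO_STAMPS = {"0000-00-00", "0000-00-00 00:00:00"}

def tidy_row(row):
    """Return a copy of the row with all-zero date/timestamps set to None."""
    return {col: (None if val in ZERO_STAMPS else val)
            for col, val in row.items()}

if __name__ == "__main__":
    dirty = {"id": 7, "created": "0000-00-00 00:00:00", "name": "acme"}
    print(tidy_row(dirty))
```

Running a pass like this repeatedly against a restored backup, and extending it each time the load fails, matches the iterate-until-clean workflow described above.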
Re: [PERFORM] Database conversion woes...
[EMAIL PROTECTED] (Kevin Carpenter) writes:

> I am doing a massive database conversion from MySQL to PostgreSQL for a
> company I am working for. This has a few quirks to it for which I
> haven't been able to nail down the answers I need from reading and
> searching through previous list info.
>
> For starters, I am moving roughly 50 separate databases, each of which
> represents one of our clients and is roughly 500 megs to 3 gigs in
> size. Currently we are using MySQL replication, and so I am looking at
> Mammoth's replicator for this one. However, I have seen it only allows
> one DB to be replicated at a time. With the size of each single db, I
> don't know how I could put them all together under one roof, and if I
> was going to, what are the maximums that Postgres can handle for tables
> in one db? We track over 2 million new points of data (records) a day,
> and are moving to 5 million in the next year.

I'll be evasive about replication, because the answers are pretty painful :-(, but as for the rest of it, nothing about this sounds challenging.

There is room for debate here as to whether you should have:

a) One postmaster and many database instances,
b) One postmaster, one (or a few) database instances, and do the client 'split' via schemas, or
c) Several postmasters, many database instances.

Replication will tend to work best with scenario b), which minimizes the number of connections that are needed to manage replication; that's definitely a factor worth considering.

It is also possible for it to be worthwhile to spread vastly differing kinds of activity across different backends so that they can have separate buffer caches. If all the activity is shared across one postmaster, that means it is all shared across one buffer cache, and there are pathological situations occasionally observed in practice where one process will be trashing the shared cache, thereby injuring performance for all other processes using that backend.
In such a case, it may be best to give the ill-behaved process its own database instance with a small cache that it can thrash on without inconveniencing others.

Jan Wieck is working on some improvements to buffer management in 7.5 that may improve the situation vis-a-vis buffering, but that is certainly not something ready to deploy in production just yet.

> Second, what about the physical database size - what are the limits
> there? I have seen that it was 4 gig on Linux from a 2000 message, but
> what about now? Have we found ways past that?

There's NO problem with having enormous databases now; each table is represented as one or more files (if you break a size barrier, often configured as 1GB, it creates an extent and extends into another file), and for there to be problems with this, the problems would be _really crippling_ OS problems.

--
(format nil [EMAIL PROTECTED] cbbrowne cbbrowne.com)
http://www3.sympatico.ca/cbbrowne/linuxxian.html
"We come to bury DOS, not to praise it."
-- Paul Vojta [EMAIL PROTECTED], paraphrasing a quote of Shakespeare
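Scenario b) above (one postmaster, client split via schemas) can be sketched with per-connection `search_path` settings. The schema and table names are hypothetical; a real driver (e.g. psycopg) would execute these strings rather than printing them.

```python
# Sketch of scenario b): one database, one schema per client. Each
# connection sets search_path so unqualified table names resolve to
# that client's schema. Names below are hypothetical examples.

def use_client(schema):
    """Per-connection statement selecting one client's schema."""
    return "SET search_path TO %s, public;" % schema

def client_table(schema, table):
    """Fully qualified name, for cross-client queries that bypass search_path."""
    return "%s.%s" % (schema, table)

if __name__ == "__main__":
    print(use_client("client_a"))
    print(client_table("client_a", "data_points"))
```

The appeal for replication is that all ~50 clients then live behind a single postmaster and a single connection, at the cost of sharing one buffer cache as discussed above.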
Re: [PERFORM] Database conversion woes...
On Tuesday 03 February 2004 22:29, Kevin wrote:

> The Mammoth replicator has been working well. I had tried pgsql-r and
> had limited success with it, and dbmirror was just taking too long,
> having to do 4 db transactions just to mirror one command. I have eserv
> but was never really a Java kind of guy.

When this is over and you've got the time, I don't suppose you could put together a few hundred words describing your experiences with the Mammoth replicator - there are a couple of places they could be posted.

--
Richard Huxton
Archonet Ltd