Barry, I had a similar experience moving a big Oracle database to Postgres 8.0.3 on Linux. The first impact was similar: huge performance problems, and the main cause was bad planner choices. In our case the culprit was bad parameter types in the JDBC set methods (I guess you use Java too). For Oracle we used the NUMERIC type to set primary keys, but the Postgres column type was BIGINT, and with a NUMERIC parameter Postgres simply refused to use the index. Imagine that kicking in on a 100-million-row table, with a sequential scan starting a few times a second... that made the DB unusable. So we fixed the code to keep using NUMERIC for Oracle and to use BIGINT for Postgres, and that is just as important on the setNull calls.
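A minimal sketch of the kind of change we made (the method and variable names are made up, but the JDBC calls are the real ones):

    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.sql.Types;

    // Bind a primary key with the type the target database actually
    // uses, so the planner can match the index on the id column.
    static void bindId(PreparedStatement ps, int index, Long id,
                       boolean isPostgres) throws SQLException {
        int sqlType = isPostgres ? Types.NUMERIC : Types.BIGINT;
        if (id == null) {
            ps.setNull(index, sqlType);        // the type matters here too
        } else {
            ps.setObject(index, id, sqlType);  // typed bind, not plain setObject
        }
    }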
One very useful tool was the following query, which shows the queries that have already been running for a while:

    prepare ps as
        SELECT procpid, substring(current_query for 97),
               to_char((now() - query_start), 'HH24:MI:SS') as t
        FROM pg_stat_activity
        WHERE current_query not like '%<insufficient%'
          AND current_query not like '%IDLE%'
        ORDER BY t desc;

Then you just "execute ps;" in psql.

Other problems were caused by complex queries where more than 2 tables were joined. For Oracle we were giving "hints" in the form of special comments to point to the right index or the right plan, but that's not an option for Postgres (yet?). The fix in this case was to use explicit JOIN syntax, which does influence the Postgres planner's choices (first sketch below). That fixed another class of issues for us.

Another problem: if you want to avoid worst-case plans and can live with a generic plan for all cases, you might force the use of server-side prepared statements everywhere. I had to do that; a lot of queries were performing very badly without it. Maybe that could also be solved by raising the statistics targets where needed, but in my case the generic plan was always good enough, by design: we rely on the DB picking a good generic plan in all cases. One typical example for us: a LIMIT query selecting 20 rows out of 100 million, with a where clause that actually matches only 1 row in the last chunk. It went for an index scan, but on the wrong index. The right index would have selected exactly that 1 row; the wrong one had to cruise through a few million rows, because the LIMIT fooled the planner into thinking it would find 20 matching rows quickly. Once I forced a server-side prepared statement, it went for the right index and all was good. I actually set this in our connection pool:

    ((PGConnection) connection).setPrepareThreshold(1);

but it is also possible to set/reset it at the statement level (second sketch below); I just haven't found a query I'd need to do that for yet... the DB is steady now.

Another issue: on Oracle we had some function-based indexes returning NULL for the uninteresting rows, to keep the index small. This is easier to implement on Postgres using a partial index, which has a much simpler syntax than the Oracle hack and is easier to handle. The catch was that we needed to change the where clause compared to Oracle so that Postgres actually picks the partial index. There are cases where the planner can't figure out that it can use the index, especially if you use prepared statements and one of the parameters appears in the index condition; in that case you need to add the matching restriction to the where clause to point Postgres at the partial index (third sketch below). Partial indexes speed up the inserts and updates on those tables, and can speed up some selects too.

Hmmm... that's about what I recall now. Beside the Postgres admin stuff: have you ANALYZEd your data after the import? I forgot to do that at first, and almost went straight back to Oracle... and after a few days it was also very clear that running the autovacuum daemon is a must :-)

And: for big data sets it is important to tweak all the performance settings in the config file, otherwise you get surprises (last sketch below). We had been running a smaller instance of the same code on Postgres for quite a while before deciding to migrate a big one, and that one was cruising along happily with the default settings, so the first time we actually needed to optimize was with a data set that had a lot of data in it...
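First sketch, the explicit-join rewrite (table and column names are made up; join_collapse_limit is a real setting, and if I remember right, forcing it to 1 makes the planner follow the JOIN order exactly as written):

    -- Implicit joins: the planner picks the join order itself.
    SELECT o.id, c.name
    FROM orders o, customers c, order_items i
    WHERE c.id = o.customer_id AND i.order_id = o.id
      AND c.region = 'EU';

    -- Explicit JOINs: with join_collapse_limit = 1 the written order
    -- is the order the planner uses.
    SET join_collapse_limit = 1;
    SELECT o.id, c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    JOIN order_items i ON i.order_id = o.id
    WHERE c.region = 'EU';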
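Second sketch, the statement-level variant (the query text is made up; PGStatement is the pgjdbc interface the driver's statements implement):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import org.postgresql.PGStatement;

    // Force a named server-side prepare for this one statement only,
    // instead of pool-wide on the connection.
    static PreparedStatement prepareServerSide(Connection conn, String sql)
            throws SQLException {
        PreparedStatement ps = conn.prepareStatement(sql);
        // Threshold 1 = switch to a server-side prepare on the first run.
        ((PGStatement) ps).setPrepareThreshold(1);
        return ps;
    }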
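Third sketch, the partial-index point (names are made up). The important part is that the query repeats the index predicate literally, because a bound parameter in that position keeps the planner from proving it can use the index:

    -- Instead of an Oracle function-based index that returns NULL for
    -- uninteresting rows, a partial index only covers what we care about:
    CREATE INDEX tasks_pending_idx ON tasks (account_id)
        WHERE status = 'PENDING';

    -- Picks the partial index: the predicate is spelled out as a constant.
    SELECT * FROM tasks WHERE account_id = ? AND status = 'PENDING';

    -- May NOT pick it: with status bound as a parameter, the generic
    -- plan can't assume the value matches the index predicate.
    SELECT * FROM tasks WHERE account_id = ? AND status = ?;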
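Finally, the kind of config settings we had to revisit for the big data set. The values below are placeholders, not recommendations; size them for your own RAM and workload (these are the 8.0-era parameter names):

    # postgresql.conf (illustrative values only)
    shared_buffers = 20000          # shared cache, in 8kB pages
    work_mem = 16384                # per-sort/hash memory, in kB
    maintenance_work_mem = 131072   # for VACUUM, CREATE INDEX, in kB
    effective_cache_size = 100000   # planner's idea of the OS cache, in pages
    checkpoint_segments = 16        # fewer, slower checkpoints

HTH,
Csaba.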
On Wed, 2005-08-17 at 06:42, Barry Lind wrote:
> We just moved a large production instance of ours from Oracle to
> Postgres 8.0.3 on linux. When running on Oracle the machine hummed
> along using about 5% of the CPU, easily handling the fairly constant
> load; after moving the data to Postgres the machine was pretty much
> maxed out on CPU and could no longer keep up with the transaction
> volume. On a hunch I switched the jdbc driver to using the V2
> protocol and the load on the machine dropped down to what it was when
> using Oracle and everything was fine.
>
> Now obviously I have found a work around for the performance problem,
> but I really don’t want to rely on using the V2 protocol forever, and
> don’t want to have to recommend to our customers that they need to run
> with the V2 protocol. So I would like to resolve the problem and be
> able to move back to a default configuration with the V3 protocol and
> the benefits thereof.
>
> The problem is that I don’t really know where to begin to debug a
> problem like this. In development environments and testing
> environments we have not seen performance problems with the V3
> protocol in the jdbc driver. But they don’t come close to approaching
> the transaction volume of this production instance.
>
> What I see when running the V3 protocol under ‘top’ is that the
> postgres processes are routinely using 15% or more of the CPU each;
> when running the V2 protocol they use more like 0.3%.
>
> Does anyone have any suggestions on an approach to debug a problem
> like this?
>
> Thanks,
> --Barry