[PERFORM] View based upon function won't use index on joins
Hello,

I've inherited some very... interestingly... designed tables, and am trying to figure out how to make them usable. I've got an ugly hack in place, but it will not use an index properly, and I'm hoping someone can point me in the right direction. Production is running 8.1.3, but I'm testing in 8.3.3. I know that's not good, but I'm seeing the exact same problem in both, so hopefully fixing it in one will fix the other. All tables/functions/views are included at the bottom, somewhat truncated to reduce length/repetition.

The table in question (though not the only one with this problem) has a series of 24 column pairs per row, one column holding a code and the other a value; the codes identify the type of value. Any code/value combination could be populated in any of these pairs. The row is keyed by an id number/qualifier pair, so for a single id number/qualifier there can be anywhere from 0 to 24 populated pairs. We need to go in for a single key and pull a list of all codes/values. Hopefully that makes sense.

I created a set-returning function that pulls in the row for a specific number/qualifier combination, checks each code to see if it is null/empty, and if not returns a record containing the code/value. For various reasons I needed to create a view based upon this. Since postgres does not like set-returning pl/pgsql functions in the select list, the only way I could get the view to work was to create a sql wrapper function that simply pulls the results of the underlying pl/pgsql function.

I have the view working, and if I pull straight from the view it uses the index (on id_nbr, id_qfr) properly. However, if I try to join to another table on the indexed fields, I get a sequential scan. This is not ideal at all.

I know a lot of this is bad practice and ugly, but I need to get something that will work. Any ideas? I'm willing to rework any and all of the views/functions; redesigning the tables is sadly not an option at this time.

Ugly table:

CREATE TABLE value_codes
(
  id_nbr integer NOT NULL,
  id_qfr character(1) NOT NULL,
  val_1_cd_1 character varying(30),
  val_1_amt_1 numeric(10,2),
  val_1_cd_2 character varying(30),
  val_1_amt_2 numeric(10,2),
  ...
  val_2_cd_12 character varying(30),
  val_2_amt_12 numeric(10,2),
  CONSTRAINT value_codes_pkey PRIMARY KEY (id_nbr, id_qfr)
) WITH (OIDS=TRUE);

Joined table:

CREATE TABLE main_table
(
  id_nbr integer NOT NULL,
  id_qfr character(1) NOT NULL,
  create_dt character(8),
  create_tm character(8),
  CONSTRAINT main_table_pkey PRIMARY KEY (id_nbr, id_qfr)
) WITH (OIDS=FALSE);

CREATE INDEX main_table_create_dt_index ON main_table USING btree (create_dt);

Initial function:

CREATE OR REPLACE FUNCTION get_value_codes(
    IN fun_id_nbr integer,
    IN fun_id_qfr character,
    OUT value_code character varying,
    OUT value_amount numeric)
  RETURNS SETOF record AS
$BODY$
declare
  current_row record;
begin
  select val_1_cd_1, val_1_amt_1,
         val_1_cd_2, val_1_amt_2,
         ...
         val_2_cd_12, val_2_amt_12
    into current_row
    from value_codes
   where id_nbr = fun_id_nbr
     and id_qfr = fun_id_qfr;

  if current_row.val_1_cd_1 is not null and current_row.val_1_cd_1 != '' then
    value_code := current_row.val_1_cd_1;
    value_amount := current_row.val_1_amt_1;
    return next;
  end if;
  ...
  if current_row.val_2_cd_12 is not null and current_row.val_2_cd_12 != '' then
    value_code := current_row.val_2_cd_12;
    value_amount := current_row.val_2_amt_12;
    return next;
  end if;

  return;
end;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE
  COST 100
  ROWS 10;

Wrapper function:

CREATE OR REPLACE FUNCTION get_value_codes_wrapper(
    IN id_nbr integer,
    IN id_qfr character,
    OUT value_code character varying,
    OUT value_amount numeric)
  RETURNS SETOF record AS
$BODY$
  SELECT * FROM get_value_codes($1, $2);
$BODY$
  LANGUAGE 'sql' VOLATILE
  COST 100
  ROWS 10;

View:

CREATE OR REPLACE VIEW value_codes_view AS
  SELECT value_codes.id_nbr,
         value_codes.id_qfr,
         (get_value_codes_wrapper(value_codes.id_nbr, value_codes.id_qfr)).value_code AS value_code,
         (get_value_codes_wrapper(value_codes.id_nbr, value_codes.id_qfr)).value_amount AS value_amount
    FROM value_codes;

Simple query explained:

explain analyze select * from value_codes_view where id_nbr >= 9000;

Index Scan using value_codes_pkey on value_codes  (cost=0.00..128.72 rows=53 width=6) (actual time=17.593..172.031 rows=15 loops=1)
  Index Cond: (id_nbr >= 9000)
Total runtime: 172.141 ms

Join query explained:

explain analyze
select *
  from main_table, value_codes_view
 where create_dt >= '20091001'
   and main_table.id_nbr = value_codes_view.id_nbr
   and main_table.id_qfr = value_codes_view.id_qfr;

Hash Join  (cost=24.38..312425.40 ro
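One other idea I've been toying with, in case it matters: rewriting the view as a plain SQL unpivot, with one UNION ALL arm per code/value pair, so there is no function for the planner to see through. This is an untested sketch (the view name is made up, and only the first two of the 24 arms are shown); I'm also not certain how much the 8.1/8.3 planner can push join conditions down into the arms, though constant conditions like id_nbr >= 9000 should at least be usable in each arm:

CREATE OR REPLACE VIEW value_codes_union_view AS
  SELECT id_nbr, id_qfr, val_1_cd_1 AS value_code, val_1_amt_1 AS value_amount
    FROM value_codes
   WHERE val_1_cd_1 IS NOT NULL AND val_1_cd_1 != ''
  UNION ALL
  SELECT id_nbr, id_qfr, val_1_cd_2, val_1_amt_2
    FROM value_codes
   WHERE val_1_cd_2 IS NOT NULL AND val_1_cd_2 != ''
  -- ...and so on for the remaining 22 cd/amt pairs...
  ;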
Re: [PERFORM] View based upon function won't use index on joins
This seems to result in the same problem: if I pull for a specific id_nbr/id_qfr, postgres uses the index without a problem, but if I try to join the two tables/views it insists on doing a sequential scan (actually two in this case) and will not use the index. Any other ideas/explanations?

That being said, I probably need to look into arrays more; I haven't used them at all in my relatively brief experience with postgres. More research!

2009/11/20 Віталій Тимчишин

> How about
>
> CREATE OR REPLACE VIEW value_codes_view AS
> select * from (
>   SELECT value_codes.id_nbr,
>          value_codes.id_qfr,
>          (ARRAY[val_1_cd_1, ... , val_2_cd_12])[i] as value_code,
>          (ARRAY[val_1_amt_1, ... , val_2_amt_12])[i] as value_amount
>     FROM value_codes, generate_series(1,24) i) a
> where value_code is not null and value_code != '';
>
> ?
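To make sure I understand the trick here: the ARRAY[...] constructor builds a 24-element array out of each row, and the generate_series counter is used as an index into it, so each source row fans out into 24 candidate rows before the null/empty filter. A tiny standalone test I put together (made-up values, just to see the mechanism) behaves the way I'd expect:

select i, (array['a','b','c'])[i] as elem
  from generate_series(1,3) i;

 i | elem
---+------
 1 | a
 2 | b
 3 | c
(3 rows)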
Re: [PERFORM] View based upon function won't use index on joins
I don't think so. I actually dumped the tables involved into stripped-down versions of themselves in a new database for testing, so the data involved should be completely fresh. I ran a vacuum analyze after the dump, of course. Just for paranoia's sake, though, I did the following:

explain analyze
select id_nbr, id_qfr, val_1_cd_1, val_1_cd_2, ... val_2_amt_12
  from value_codes, main_table
 where main_table.create_dt >= '20091001'
   and main_table.id_nbr = value_codes.id_nbr
   and main_table.id_qfr = value_codes.id_qfr;

with the following results:

"Nested Loop  (cost=0.00..1592.17 rows=132 width=150) (actual time=0.093..1.075 rows=4 loops=1)"
"  ->  Index Scan using main_table_create_dt_index on main_table  (cost=0.00..21.47 rows=194 width=6) (actual time=0.035..0.249 rows=53 loops=1)"
"        Index Cond: (create_dt >= '20091001'::bpchar)"
"  ->  Index Scan using value_codes_pkey on value_codes  (cost=0.00..8.08 rows=1 width=150) (actual time=0.007..0.007 rows=0 loops=53)"
"        Index Cond: ((value_codes.id_nbr = main_table.id_nbr) AND (value_codes.id_qfr = main_table.id_qfr))"
"Total runtime: 1.279 ms"

So the join itself uses the indexes just fine; it's only when I go through the view that the planner falls back to sequential scans. I'm stumped. I'm starting to think that I'm trying to get postgres to do something it just doesn't do. Short of throwing a trigger on the table to populate a second table with the same data solely for reporting purposes, which I hate to do for obvious reasons, I don't know what else to do. And this is only one example of this situation in the databases I'm dealing with; I was hoping to come up with a more generic solution that I could apply in any number of locations.

I do very much appreciate the responses. I've been gradually getting deeper and deeper into postgres, and am still very much learning as I go. All advice is very helpful. Thanks.

2009/11/20 Віталій Тимчишин

> On 20 November 2009 17:01, Jonathan Foy wrote:
>
>> This seems to result in the same problem: if I pull for a specific
>> id_nbr/id_qfr, postgres uses the index without a problem, but if I try
>> to join the two tables/views it insists on doing a sequential scan
>> (actually two in this case) and will not use the index. Any other
>> ideas/explanations?
>
> Have you tried to do the same join when not using the views or converting
> the columns into records? Maybe the problem is not in the conversion, but in
> something simpler, like statistics or index bloat?
>
> Best regards, Vitalii Tymchyshyn
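P.S. If I do end up falling back on the trigger idea, this is roughly what I have in mind. It's only a sketch with invented names (value_codes_flat, flatten_value_codes), only the first code/value pair spelled out, and no DELETE trigger or initial backfill yet:

-- Narrow reporting table, keyed the same way as the source table
CREATE TABLE value_codes_flat
(
  id_nbr integer NOT NULL,
  id_qfr character(1) NOT NULL,
  value_code character varying(30),
  value_amount numeric(10,2)
);

CREATE INDEX value_codes_flat_key ON value_codes_flat (id_nbr, id_qfr);

CREATE OR REPLACE FUNCTION flatten_value_codes() RETURNS trigger AS
$BODY$
begin
  -- re-derive this key's flattened rows from scratch on every write
  delete from value_codes_flat
   where id_nbr = new.id_nbr and id_qfr = new.id_qfr;

  if new.val_1_cd_1 is not null and new.val_1_cd_1 != '' then
    insert into value_codes_flat
    values (new.id_nbr, new.id_qfr, new.val_1_cd_1, new.val_1_amt_1);
  end if;
  -- ...repeat for the remaining 23 cd/amt pairs...

  return new;
end;
$BODY$ LANGUAGE plpgsql;

CREATE TRIGGER value_codes_flatten
  AFTER INSERT OR UPDATE ON value_codes
  FOR EACH ROW EXECUTE PROCEDURE flatten_value_codes();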
[PERFORM] Vacuum running out of memory
Hello,

My vacuums have suddenly started to fail, seemingly at random, and I am confused.

I'm running 8.1.3, with close to a dozen servers and up to 150 databases each, and 8GB of RAM per server. Vacuums have started to fail on all servers (though only the occasional vacuum) with the following error:

VACUUM,ERROR: out of memory
VACUUM,DETAIL: Failed on request of size 268435452

I have some terrible tables that I inherited, for which I recently created tons of indexes in order to make them useful. I had a post a couple of weeks ago detailing my problem with trying to get a function working to simplify the data; I fell back on partial indexes, built only where the column values are not null/empty. Since the values are almost always null/empty, I was able to dramatically speed up access without eating much disk space, but I did throw an extra 200 indexes into each database. Shortly afterwards I started getting the occasional vacuum failure with the above error.

I'm not sure if it's a coincidence or not, but my maintenance_work_mem is set to 262144 KB, which matches the failed request size above: 262144 KB * 1024 = 268,435,456 bytes, almost exactly the 268,435,452 bytes requested.

I initially assumed that with 200 * 150 additional relations I was exceeding my max_fsm_relations setting, which is 60,000. However, as a test I ran a verbose vacuum analyze on a small table to get the statistics at the end, which gave me the following:

INFO:  free space map contains 2239943 pages in 28445 relations
DETAIL:  A total of 2623552 page slots are in use (including overhead).
2623552 page slots are required to track all free space.
Current limits are:  8000000 page slots, 60000 relations, using 50650 KB.

which seems to indicate I'm well within my limits. (For curiosity's sake, which relations count toward that limit? From what I can tell it's only tables and indexes; functions, views, triggers, etc. shouldn't contribute, should they?)

Am I interpreting this wrong? Does anyone have any insight as to what is going wrong? I can provide more information if needed.

Thanks,
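P.S. In case it's useful, here's what I've been running to sanity-check the settings, plus a count of the relations I believe have free-space-map entries. My assumption is that only things with actual storage (plain tables, indexes, toast tables) count, and note that pg_class only covers the current database, while the FSM limits are shared across the whole cluster:

SHOW maintenance_work_mem;
SHOW max_fsm_pages;
SHOW max_fsm_relations;

-- relations with storage: plain tables (r), indexes (i), toast tables (t)
SELECT count(*) FROM pg_class WHERE relkind IN ('r', 'i', 't');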
Re: [PERFORM] Vacuum running out of memory
I was wondering if that was the problem. Am I correct, then, in thinking that the failure occurs when a vacuum tries to grab the 256 MB defined by maintenance_work_mem and the system simply does not have that much available? Any idea why that would suddenly start happening? The indexes I created shouldn't have affected that, should they?

And point taken about the update. I'm pushing to get us to 8.4, unsuccessfully so far, but management might be more amenable to minor version upgrades, since as I understand it there shouldn't be any risk of application problems with minor version changes.

On Tue, Dec 8, 2009 at 11:22 AM, Tom Lane wrote:

> Jonathan Foy writes:
> > My vacuums have suddenly started to fail, seemingly at random. I am
> > confused.
>
> > I'm running 8.1.3, with close to a dozen servers, up to 150 databases each.
> > I have 8GB of RAM. Vacuums have started to fail on all servers (though only
> > the occasional vacuum) with the following error:
>
> > VACUUM,ERROR: out of memory
> > VACUUM,DETAIL: Failed on request of size 268435452
>
> I'd back off maintenance_work_mem if I were you. I think you don't have
> enough RAM to be running a lot of concurrent VACUUMs all with the same
> large memory consumption.
>
> Also, if it's really 8.1.3, consider an update to 8.1.something-recent.
> Not only are you exposed to a number of very serious known bugs, but
> this patch in particular would likely help you:
> http://archives.postgresql.org/pgsql-committers/2007-09/msg00377.php
>
>                         regards, tom lane
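For the archives: I'm going to back the setting off as suggested. If I'm reading the 8.1 docs correctly, the value is in kilobytes, and it can also be lowered for a single session before a manual vacuum rather than cluster-wide, something like:

-- per-session, before a manual vacuum (value is in KB on 8.1):
SET maintenance_work_mem = 131072;  -- 128 MB instead of 256 MB
VACUUM ANALYZE value_codes;

The cluster-wide equivalent would be maintenance_work_mem = 131072 in postgresql.conf followed by a reload, which should rein in the concurrent vacuums across all the databases at once.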