Re: [PERFORM] View based upon function won't use index on joins

2009-11-23 Thread Robert Haas
2009/11/20 Jonathan Foy the...@gmail.com:
 Shy of just throwing a trigger in the
 table to actually populate a second table with the same data solely for
 reporting purposes,

That's what I would do in your situation, FWIW.  Query optimization is
a hard problem even under the best of circumstances; getting the
planner to DTRT with a crazy schema is - well, really hard.

...Robert

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] View based upon function won't use index on joins

2009-11-20 Thread Віталій Тимчишин
How about

CREATE OR REPLACE VIEW value_codes_view AS
select * from (
 SELECT value_codes.id_nbr,
  value_codes.id_qfr,
  (ARRAY[val_1_cd_1, ... , val_2_cd_12])[i] as value_code,
  (ARRAY[val_1_amt_1, ... , val_2_amt_12])[i] as value_amount,
   FROM value_codes, generate_series(1,24) i) a
where value_code is not null and value_code != '';
?


Re: [PERFORM] View based upon function won't use index on joins

2009-11-20 Thread Jonathan Foy
This seems to result in the same problem; should I attempt to pull for a
specific id_nbr/id_qfr, postgres uses the index without a problem. If I try
to join the two tables/views however, it insists on doing a sequential scan
(actually two in this case) and will not use the index.  Any other
ideas/explanations?

That being said, I probably need to look into arrays more.  I haven't used
them at all in my relatively brief experience with postgres.  More research!

2009/11/20 Віталій Тимчишин tiv...@gmail.com

 How about


 CREATE OR REPLACE VIEW value_codes_view AS
 select * from (

  SELECT value_codes.id_nbr,
   value_codes.id_qfr,
   (ARRAY[val_1_cd_1, ... , val_2_cd_12])[i] as value_code,
   (ARRAY[val_1_amt_1, ... , val_2_amt_12])[i] as value_amount,
FROM value_codes, generate_series(1,24) i) a
 where value_code is not null and value_code != '';
 ?



Re: [PERFORM] View based upon function won't use index on joins

2009-11-20 Thread Віталій Тимчишин
20 листопада 2009 р. 17:01 Jonathan Foy the...@gmail.com написав:

 This seems to result in the same problem; should I attempt to pull for a
 specific id_nbr/id_qfr, postgres uses the index without a problem. If I try
 to join the two tables/views however, it insists on doing a sequential scan
 (actually two in this case) and will not use the index.  Any other
 ideas/explanations?


Have you tried to do same (join) when not using the viewes or converting
columns into records? May be the problem is not in conversion, but in
something simplier, like statistics or index bloat?

Best regards, Vitalii Tymchyshyn


Re: [PERFORM] View based upon function won't use index on joins

2009-11-20 Thread Jonathan Foy
I don't think so. I actually dumped the tables involved into stripped down
versions of themselves in a new database for testing, so the data involved
should be completely fresh.  I ran a vacuum analyze after the dump of
course.

Just for paranoia's sake though I did do the following:

explain analyze select id_nbr, id_qfr,
val_1_cd_1,
val_1_cd_2,
...
val_2_amt_12
from value_codes
where main_table.create_dt = '20091001'
and main_table.id_nbr = value_codes.id_nbr
and main_table.id_qfr = value_codes.id_qfr

with the following results

Nested Loop  (cost=0.00..1592.17 rows=132 width=150) (actual
time=0.093..1.075 rows=4 loops=1)
  -  Index Scan using main_table_create_dt_index on main_table
(cost=0.00..21.47 rows=194 width=6) (actual time=0.035..0.249 rows=53
loops=1)
Index Cond: (create_dt = '20091001'::bpchar)
  -  Index Scan using value_codes_pkey on value_codes  (cost=0.00..8.08
rows=1 width=150) (actual time=0.007..0.007 rows=0 loops=53)
Index Cond: ((value_codes.id_nbr = main_table.id_nbr) AND
(value_codes.id_qfr = main_table.id_qfr))
Total runtime: 1.279 ms


I'm stumped.  I'm starting to think that I'm trying to get postgres to do
something that it just doesn't do.  Shy of just throwing a trigger in the
table to actually populate a second table with the same data solely for
reporting purposes, which I hate to do for obvious reasons, I don't know
what else to do.  And this is only one example of this situation in the
databases that I'm dealing with, I was hoping to come up with a more generic
solution that I could apply in any number of locations.

I do very much appreciate the responses...I've been gradually getting deeper
and deeper into postgres, and am still very much learning as I go.  All
advice is very helpful.

Thanks..

2009/11/20 Віталій Тимчишин tiv...@gmail.com



 20 листопада 2009 р. 17:01 Jonathan Foy the...@gmail.com написав:

 This seems to result in the same problem; should I attempt to pull for a
 specific id_nbr/id_qfr, postgres uses the index without a problem. If I try
 to join the two tables/views however, it insists on doing a sequential scan
 (actually two in this case) and will not use the index.  Any other
 ideas/explanations?


 Have you tried to do same (join) when not using the viewes or converting
 columns into records? May be the problem is not in conversion, but in
 something simplier, like statistics or index bloat?

 Best regards, Vitalii Tymchyshyn



[PERFORM] View based upon function won't use index on joins

2009-11-19 Thread Jonathan Foy
Hello,

I've inherited some very...interestingly...
designed tables, and am trying to figure out how to make them usable.  I've
got an ugly hack in place, but it will not use an index properly, and I'm
hoping someone will be able to point me in the right direction.

Production is running 8.1.3, but I'm testing in 8.3.3.  I know that's not
good, but I'm seeing the exact same problem in both, so hopefully fixing it
in one will fix the other.

All tables/functions/views are included at the bottom, somewhat truncated to
reduce length/repetition.

The table in question (though not the only one with this problem) has a
series of 24 column pairs per row, one holding a code and the other a
value.  Any code/value combo could be populated in any of these fields (the
codes identify the type of value).  The row is keyed into based upon an id
number/qualifier pair.  So, for a single id number/qualifier, there can be
from 0 to 24 populated pairs.  We need to go in for a single key and pull a
list of all codes/values.  Hopefully that makes sense.

I created a set-returning function that would pull in the row for a specific
number/qualifier combination, check each code to see if it was null/empty,
and if not it would return a record containing the code/value.

For various reasons I needed to create a view based upon this.  Due to
postgres not liking having set-returning pl/pgsql functions in select
statements, the only way that I could get the view to work was to create a
pl/sql wrapper that simply pulls the results of the prior pl/pgsql function.

I have the view working, and if I pull straight from the view it uses the
index properly (on id_nbr, id_qfr).  However, if I try to join to another
table, based upon the indexed fields, I get a sequential scan.  This is not
ideal at all.  I know a lot of this is bad practice and ugly, but I need to
get something that will work.

Any ideas?  I'm willing to rework any and all as far as views/functions are
concerned, redesigning the tables is sadly not an option at this time.


Ugly table:

CREATE TABLE value_codes
(
  id_nbr integer NOT NULL,
  id_qfr character(1) NOT NULL,
  val_1_cd_1 character varying(30),
  val_1_amt_1 numeric(10,2),
  val_1_cd_2 character varying(30),
  val_1_amt_2 numeric(10,2),
  ...
  val_2_cd_12 character varying(30),
  val_2_amt_12 numeric(10,2),
  CONSTRAINT value_codes_pkey PRIMARY KEY (id_nbr, id_qfr)
)
WITH (
  OIDS=TRUE
);



Joined table:

CREATE TABLE main_table
(
  id_nbr integer NOT NULL,
  id_qfr character(1) NOT NULL,
  create_dt character(8),
  create_tm character(8),
  CONSTRAINT main_table_pkey PRIMARY KEY (id_nbr, id_qfr)
)
WITH (
  OIDS=FALSE
);

CREATE INDEX main_table_create_dt_index
  ON main_table
  USING btree
  (create_dt);



Initial function:

CREATE OR REPLACE FUNCTION get_value_codes(IN fun_id_nbr integer,
IN fun_id_qfr character,
OUT value_code character varying,
OUT value_amount numeric)
  RETURNS SETOF record AS
$BODY$
declare
current_rowrecord;
begin

selectval_1_cd_1,
val_1_amt_1,
val_1_cd_2,
val_1_amt_2,
...
val_2_cd_12,
val_2_amt_12
into current_row
from value_codes
where   id_nbr = fun_id_nbr
and id_qfr = fun_id_qfr;

if
current_row.val_1_cd_1 is not null
and current_row.val_1_cd_1 != ''
then
value_code := current_row.val_1_cd_1;
value_amount := current_row.val_1_amt_1;

return next;
end if;
...
if
current_row.val_2_cd_12 is not null
and current_row.val_2_cd_12 != ''
then
value_code := current_row.val_2_cd_12;
value_amount := current_row.val_2_amt_12;

return next;
end if;

return;
end;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE
  COST 100
  ROWS 10;



Wrapper function:

CREATE OR REPLACE FUNCTION get_value_codes_wrapper(IN id_nbr integer,
IN id_qfr character,
OUT value_code character varying,
OUT value_amount numeric)
  RETURNS SETOF record AS
$BODY$
SELECT * FROM get_value_codes($1, $2);
$BODY$
  LANGUAGE 'sql' VOLATILE
  COST 100
  ROWS 10;



View:

CREATE OR REPLACE VIEW value_codes_view AS
 SELECT value_codes.id_nbr,
  value_codes.id_qfr,
  (get_value_codes_wrapper(value_codes.id_nbr,
value_codes.id_qfr)).value_code AS value_code,
  (get_value_codes_wrapper(value_codes.id_nbr,
value_codes.id_qfr)).value_amount AS value_amount
   FROM value_codes;



Simple query Explained:

explain analyze select * from value_codes_view where id_nbr = 9000;

Index Scan using value_codes_pkey on value_codes  (cost=0.00..128.72 rows=53
width=6) (actual time=17.593..172.031 rows=15 loops=1)
  Index Cond: (id_nbr = 9000)
Total runtime: 172.141 ms


Join query explained:

explain analyze select * from main_table, value_codes_view
where create_dt = '20091001'
and main_table.id_nbr = value_codes_view.id_nbr
and main_table.id_qfr = value_codes_view.id_qfr;

Hash Join  (cost=24.38..312425.40