Greg Stark wrote:
However that's not enough to explain what you've shown. How about you show the
actual query and actual plan you're working with? The plan you've shown can't
result from the query you sent.
Mea culpa, sort of. But ... in fact, the plan I sent *was* from query I sent, with the
table/column names changed for clarity. This time I'll send the plan "raw".
(This is PG 8.0.1.)
chm=> explain select count(1) from (select normalize_add_salt(smiles) from
chm(> salt_smiles order by db_no) as foo;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Aggregate (cost=69.95..69.95 rows=1 width=0)
-> Subquery Scan foo (cost=0.00..67.93 rows=806 width=0)
-> Index Scan using salt_smiles_pkey on salt_smiles (cost=0.00..59.87
rows=806 width=30)
(3 rows)
As pointed out by Tom and others, this query DOES in fact call the
normalize_add_salt() function.
Now here's the weird part. (And where my original posting went wrong -- sorry
for the error! I got the two queries mixed up.)
I originally had a more complex query, the purpose being to guarantee that the
function was called on the strings in the order specified. (More on this
below.) Here is the original query I used:
chm=> explain select count(1) from (select normalize_add_salt(smiles)
chm(> from (select smiles from salt_smiles order by db_no) as foo) as bar;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Aggregate (cost=67.94..67.94 rows=1 width=0)
-> Subquery Scan foo (cost=0.00..65.92 rows=806 width=0)
-> Index Scan using salt_smiles_pkey on salt_smiles (cost=0.00..57.86
rows=806 width=30)
(3 rows)
Notice that the plans are essentially identical, yet in this one the function does NOT get called.
I proved this by brute force, inserting "char **p = NULL; *p = "foo";" into the
C code to guarantee a segmentation violation if the function gets called. In the first case it
does SIGSEGV, and in the second case it does not.
Now the reason for this more-complex query with an additional subselect is that
the SMILES (which, by the way, are a lexical way of representing chemical
structures - see www.daylight.com), must be passed to the function in a
particular order (hence the ORDER BY). In retrospect I realize the optimizer
apparently flattens this query anyway (hence the identical plans, above).
But the weird thing is that, in spite of flattening, which would appear to make
the queries equivalent, the function gets called in one case, and not in the
other.
Steinar H. Gunderson asked:
select count(1) from (select foo_init(value) from foo_init_table order by
value_id) as foo;
Why not just count(foo_init(value))?
Because the SMILES must be processed in a specific order, hence the more
complex queries.
The simple answer to this whole problem is what Steinar wrote:
This works well, but it requires me to actually retrieve the function's
value 800 times.
Is this actually a problem?
No, it's just a nuisance. It occurs to me that in spite of the ORDER BY
expression, Postgres is free to evaluate the function first, THEN sort the
results, which means the SMILES would be processed in random order anyway.
I.e. my ORDER BY clause is useless for the intended purpose.
So the only way I can see to get this right is to pull the SMILES into my application with the ORDER BY to ensure I have them in the correct order, then send them back one at a time via a "select normalize_add_salt(smiles)", meaning I'll retrieve 800 strings and then send them back.
I just thought there ought to be a way to do this all on the PG server instead of sending
all these strings back and forth. I'd like to say to Postgres, "Just do it this
way, OK?" But the optimizer can't be turned off, so I guess I have to do it the
slow way. The good news is that this is just an initialization step, after which I
typically process thousands of molecules, so the extra overhead won't kill me.
Thanks to all for your help.
Craig
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly