I decided to move the discussion to -hackers ... 

Hello All,

I've recently  proposed a patch 
http://archives.postgresql.org/pgsql-patches/2006-02/msg00165.php
to fix an old problem with dropped columns, but recently 
http://archives.postgresql.org/pgsql-patches/2006-02/msg00246.php
Neil Conway have discovered the problem even with the patch. After looking 
on the problem it seems that my patch just discovered an additional set of 
incosistencies of Postgres in the work with droppped columns. 

I looked in the source of those problems and it seems that I have found 
it, but it is situated rather deep in Postgres and I don't have enough 
expertise too judge it. So I would be interested if someone clarify the 
issue for me and how the problem should be fixed.

So the problem discovered by Neil (after applying the patch):

> create table usno (ra real, dec real, bmag real, rmag real, ipix int8);
> create function ret_next_check() returns setof usno as $$
> declare
>    r record;
> begin
>    for r in select * from usno loop
>        return next r;
>    end loop;
>    return;
> end;
> $$ language plpgsql;
>
> insert into usno values (1.0, 2.0, 3.0, 4.0, 5);
> select * from ret_next_check();
> alter table usno drop column ipix;
> select * from ret_next_check(); -- fails with 
 ERROR:  wrong record type  supplied in RETURN NEXT
CONTEXT:  PL/pgSQL function "ret_next_check" line 5 at return next

In terms of postgres source code, the problem come from the fact that for 
some reason, during the execution of last query ("select * from 
ret_next_check();", when already the column "ipix" is deleted) 
the function exec_stmt_return_next() and some parent functions still 
see in the tupdesc the deleted column and !!! without attisdropped flag!).
And this is certainly the error  (as I understand...)

So, I tried track down the problem. 

Roughly, the problem come from the execution of the 

pl_exec.c:1547 exec_run_select(estate, stmt->query, 0, &portal);

the query here is "select * from usno" 

and that exec_run_select() routine produce the portal with the 
"bad" tupDesc (The deleted column is there and without attisdropped 
flag set). 

Tracking that again I came to the function ExecInitSeqScan()
In that function the function InitScanRelation set the tupdesc 
scanstate.ss_ScanTupleSlot[0].tts_tupleDescriptor (if I 
understand correctly it is read from the heap) and that 
tupdesc is correct!  (so the last column is marked as attisdropped)
But then in the function ExecInitSeqScan() the function 
ExecAssignResultTypeFromTL(&scanstate->ps) set the 
scanstate.ps.ps_ResultTupleSlot.tts_tupleDescriptor tupdesc which is 
incorrect ( last column is not marked as attisdropped)(that tupdesc 
seem to be produced somehow from the plan, I don't understand...)
And finally when that ExecInitSeqScan() returns to ExecInitNode() the 
scanstate is casted to the to the (PlanState *) and the correct tupdesc is 
not used anymore. Only the incorrect is used. And therefore that incorrect 
tupdesc lead to the 
ERROR:  wrong record type  supplied in RETURN NEXT
CONTEXT:  PL/pgSQL function "ret_next_check" line 5 at return next
(because of new patched  version of compatible_tupdesc() which now take 
care of deleted columns).  

So, did I make the right conclusions ? What can be the right fix of that ?
Comments ? 

Sorry if I was not very clear, it's the first time I'm looking so deeply 
in Postgres.

Regards,
        Sergey

*****************************************************
Sergey E. Koposov
Max Planck Institute for Astronomy
Web: http://lnfm1.sai.msu.ru/~math 
E-mail: [EMAIL PROTECTED]




---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Reply via email to