> > > That sounds fantastic. It'd help this copy variant retain festure parity > > with normal copy. And it'd bring us closer to being able to FETCH in non > > queries. > > On second thought, though, this couldn't exactly duplicate the existing > COPY syntax, because COPY relies heavily on the rowtype of the named > target table to tell it what it's copying. You'd need some new syntax > to provide the list of column names and types, which puts a bit of > a hole in the "syntax we already know" argument. A SRF-returning-record > would have a leg up on that, because we do have existing syntax for > defining the concrete rowtype that any particular call returns. > > regards, tom lane >
I would like to make COPY itself a SRF. That's a bit beyond my capabilities, so if that is the route we want to go, I will need help. The syntax would probably look like this (new bits in bold): WITH my_copy AS ( COPY FROM 'example.csv' TO *RESULT SET(c1 text, c2 integer, dummy1 text, dummy2 text, c5 date)* WITH (FORMAT CSV) *RETURNING c1, c2, c3* ) SELECT ... FROM my_copy LEFT OUTER JOIN ref_table ... The RESULT SET (colspecs) bit would be the rsinfo currently used by copy_srf(). It would be nice if the CTE declaration could take types, but it can't. The RETURNING clause here doesn't return all the columns made available from the COPY. That would be nice, but not required because the same filtration could be done when the CTE is referenced. So if we require RETURNING * be the only returning option I'd be fine with that. If we're ok with adding a function like copy_srf() to the core, will we still be happy with it when COPY does get a RETURNING clause? Somewhat off-topic: here's some other observations of a n00b who spent a fair amount of time looking at the copy.c code. 1. BeginCopyFrom and NextCopyFrom pull attlist/tupdesc info from ->rel, repeatedly. If we were going to try to leverage that code we'd need to store those things in a separate cstate member so that we add complexity only in the initialization of the copy state data struct, pulling the result structure from rsinfo rather than a relation. There's probably a minor performance gain to be had in keeping that info around. Refactoring those two procs to allow for a pre-set attlist/tupdesc would help. 2. NextCopyFrom() checks every single time to see if it's row 0 and if it should skip this header row. I know a single (row_num == 0 && has_header) isn't much extra processing, but shouldn't we digest and discard headers before going into the per-row loop? 3. All the code that handles indexes, constraints, buffering, etc, simply doesn't apply in the SRF context. 4. The code somewhat needlessly mixes code for the COPY FROM and COPY TO cases. There's probably a good reason for this, but it made for a lot of clutter in achieving my very narrow goal.