I had been wondering how to do such an insertion efficiently in the context of 
SPI, but it seems that there is no SPI_copy equivalent that would allow the 
query parse and plan to be avoided.

Your query above would need to be planned too, although the plan will be 
trivial.

Ah yes, I meant that I had not found a way to avoid it (for multi-row inserts etc) from a stored proc context where I have SPI functions available.
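
Roughly what I had in mind, purely as a sketch (the table and columns are made up): preparing once with SPI_prepare and reusing the plan for each row at least avoids the repeated parse and plan, though it is still one executor call per row.

#include "postgres.h"
#include "executor/spi.h"
#include "catalog/pg_type.h"

/* Sketch only: insert nrows rows through one prepared plan, so the
 * INSERT is parsed and planned just once.  "detail" is a made-up table. */
static void
insert_rows(Datum *ids, Datum *payloads, int nrows)
{
    Oid        argtypes[2] = { INT4OID, TEXTOID };
    SPIPlanPtr plan;
    int        i;

    SPI_connect();

    plan = SPI_prepare("INSERT INTO detail(id, payload) VALUES ($1, $2)",
                       2, argtypes);
    if (plan == NULL)
        elog(ERROR, "SPI_prepare failed: %s",
             SPI_result_code_string(SPI_result));

    for (i = 0; i < nrows; i++)
    {
        Datum values[2];

        values[0] = ids[i];
        values[1] = payloads[i];

        if (SPI_execute_plan(plan, values, NULL, false, 0) != SPI_OK_INSERT)
            elog(ERROR, "SPI_execute_plan failed");
    }

    SPI_finish();
}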

You should not try to use it as a general bulk load facility. And it will not 
be as fast as COPY for several reasons, including that the JSON parsing 
routines are necessarily much heavier than the COPY parse routines, which have 
in any case been optimized over quite a long period. Also, a single json datum 
is limited to no more than 1GB. If you have such a datum, parsing it involves 
having it in memory and then taking a copy (I wonder if we could avoid that 
step - will take a look). Then each object is decomposed into a hash table of 
key/value pairs, which is then used to construct the record datum. Each field 
name in the result record is used to look up the value in the hash table - 
this happens once in the case of populate_record() and once per object in the 
array in the case of populate_recordset(). In the latter case the resulting 
records are put into a tuplestore structure (which spills to disk if 
necessary), which is then returned to the caller once all the objects in the 
json array have been processed.

COPY doesn't have these sorts of issues. It knows, without having to look 
anything up, where each datum is in each record, and it stashes the result 
straight into the target table. It can read and insert huge numbers of rows 
without significant memory implications.
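
The pattern it's aimed at is a single set-oriented statement over the whole 
array. From an SPI caller that might look something like this (a sketch only - 
"target" is a stand-in table name):

#include "postgres.h"
#include "executor/spi.h"
#include "catalog/pg_type.h"
#include "utils/builtins.h"

/* Sketch: expand a whole json array of objects into rows of a made-up
 * table "target" with a single parse/plan and a single executor run. */
static void
insert_json_array(const char *json_text)
{
    Oid   argtypes[1] = { TEXTOID };
    Datum values[1];

    SPI_connect();

    values[0] = CStringGetTextDatum(json_text);

    if (SPI_execute_with_args(
            "INSERT INTO target "
            "SELECT * FROM json_populate_recordset(NULL::target, $1::json)",
            1, argtypes, values, NULL, false, 0) != SPI_OK_INSERT)
        elog(ERROR, "insert from json failed");

    SPI_finish();
}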

Yes - but I don't think I can use COPY from a stored proc context, can I? If I could use binary COPY from a stored proc that has received a binary param and unpacked the data, it would be handy.

If SPI provided a way to perform a COPY into a temp table, driven by some callback or iterator that yields rows to it, that would do the trick, I guess.

Perhaps if you give us a higher level view of what you're trying to achieve we 
can help you better.

I had been trying to identify a way to work with record sets where the records might be used for inserts, or for update or delete statements, preferably without forming a large custom SQL statement that must then be parsed and planned (and which would be a PITA if I wanted to use the SQL-C preprocessor or language bindings that like to prepare a statement and execute it with params).
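
What I'm picturing - again just a sketch against a made-up "detail" table - is that the same one-statement-per-batch shape would cover updates and deletes too, each run with a single SPI_execute_with_args() call and the whole batch passed as one parameter:

/* apply a batch of {"id": ..., "qty": ...} changes in one statement */
static const char *batch_update_sql =
    "UPDATE detail d SET qty = r.qty "
    "  FROM json_populate_recordset(NULL::detail, $1::json) r "
    " WHERE d.id = r.id";

/* delete a batch of ids passed as a single int4[] parameter */
static const char *batch_delete_sql =
    "DELETE FROM detail WHERE id = ANY($1)";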

The data I work with has a master-detail structure and insertion performance matters, so I'm trying to limit manipulations to one statement per table per logical operation even where there are multiple detail rows.
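
Concretely - and once more only as a sketch, with "master", "detail" and their columns invented - one parameterized INSERT for the master row plus one set-oriented INSERT for all its detail rows would look something like:

#include "postgres.h"
#include "executor/spi.h"
#include "catalog/pg_type.h"
#include "utils/array.h"
#include "utils/builtins.h"

/* Sketch: insert one master row and all of its detail rows using exactly
 * one statement per table.  Table and column names are hypothetical. */
static void
insert_master_detail(const char *name, int32 *qtys, int nqtys)
{
    Oid     mtypes[1] = { TEXTOID };
    Datum   mvals[1];
    Oid     dtypes[2] = { INT4OID, INT4ARRAYOID };
    Datum   dvals[2];
    Datum  *elems;
    Datum   master_id;
    bool    isnull;
    int     i;

    SPI_connect();

    /* One statement for the master table, returning its generated id. */
    mvals[0] = CStringGetTextDatum(name);
    if (SPI_execute_with_args(
            "INSERT INTO master(name) VALUES ($1) RETURNING id",
            1, mtypes, mvals, NULL, false, 0) != SPI_OK_INSERT_RETURNING)
        elog(ERROR, "master insert failed");
    master_id = SPI_getbinval(SPI_tuptable->vals[0], SPI_tuptable->tupdesc,
                              1, &isnull);

    /* One set-oriented statement for all detail rows, driven by an array. */
    elems = (Datum *) palloc(nqtys * sizeof(Datum));
    for (i = 0; i < nqtys; i++)
        elems[i] = Int32GetDatum(qtys[i]);

    dvals[0] = master_id;
    dvals[1] = PointerGetDatum(construct_array(elems, nqtys, INT4OID,
                                               sizeof(int32), true, 'i'));
    if (SPI_execute_with_args(
            "INSERT INTO detail(master_id, qty) "
            "SELECT $1, q FROM unnest($2) AS q",
            2, dtypes, dvals, NULL, false, 0) != SPI_OK_INSERT)
        elog(ERROR, "detail insert failed");

    SPI_finish();
}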

Sometimes the network latency can be a pain too, which also argues for a single RPC-style call that unpacks and inserts the data locally.

Cheers
James


