Re: [PATCHES] COPY-able csv log outputs

Greg Smith Mon, 28 May 2007 11:28:14 -0700

On Sun, 20 May 2007, Andrew Dunstan wrote:

I've had a preference for INSERT from the beginning here that this reinforces.
COPY is our standard bulk insert mechanism. I think arguing against it would be a very hard sell.

Let me say my final piece on this subject... if I considered this data to be strictly bulk insert, then I'd completely agree here. Most of the really interesting applications I was planning to build on top of this mechanism are more interactive than that, though. Here's a sample:

- Write a daemon that lives on the server, connects to a logging database, and pops into an idle loop based on LISTEN.
- A client app wants to see the recent log files. It uses NOTIFY to ask the daemon for them and LISTENs for a response.
- The daemon wakes up, reads all the log files since it last did something, and appends those records to the log file table. It sends out a NOTIFY to say the log file table is current.

That enables remote clients to grab the log files from the server whenever they please, so they can actually monitor themselves. Benchmarking is the initial app I expect to call this, and with some types of tests I expect the daemon to be importing every 10 minutes or so.

Assuming a unique index on the data to prevent duplication is a required feature, I can build this using the COPY format logs as well, but that requires I either a) am 100% perfect in making sure I never pass over the same data twice, which is particularly annoying when the daemon gets restarted, or b) break the COPY into single lines and insert them one at a time, at which point I'm not bulk loading at all. If these were INSERT statements instead, I'd have a lot more tolerance for error, because the worst problem I'd ever run into is spewing some unique key violation errors into the logs if I accidentally imported too much. With COPY, any mistake or synchronization issue and I lose the whole import.
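
To make the trade-off above concrete, here is a minimal sketch of the two loading styles. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, a single rolled-back transaction as a stand-in for one COPY, and a hypothetical log_entries table whose columns and unique key are invented for the example rather than taken from the patch.

```python
import sqlite3

# Hypothetical table layout; the real CSV log columns are not assumed here.
SCHEMA = """
CREATE TABLE log_entries (
    log_time     TEXT,
    session_line INTEGER,
    message      TEXT,
    UNIQUE (log_time, session_line)
)
"""
INSERT_SQL = "INSERT INTO log_entries VALUES (?, ?, ?)"


def make_log_table():
    """Create an in-memory database holding the hypothetical log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def bulk_load(conn, rows):
    """All-or-nothing load, standing in for a single COPY.

    One duplicate key aborts the transaction, so no row of the batch survives.
    """
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.executemany(INSERT_SQL, rows)
        return len(rows)
    except sqlite3.IntegrityError:
        return 0


def row_by_row_load(conn, rows):
    """One INSERT per row; a duplicate costs only that row."""
    loaded = 0
    for row in rows:
        try:
            with conn:
                conn.execute(INSERT_SQL, row)
            loaded += 1
        except sqlite3.IntegrityError:
            pass  # already imported: the "unique key violation" case
    return loaded


def count_rows(conn):
    """Return how many log rows the table currently holds."""
    return conn.execute("SELECT COUNT(*) FROM log_entries").fetchone()[0]


# A first import, then a second batch that accidentally overlaps it.
first = [("2007-05-28 11:00:00", 1, "a"), ("2007-05-28 11:00:01", 2, "b")]
overlap = [("2007-05-28 11:00:02", 3, "c"), ("2007-05-28 11:00:01", 2, "b")]

copy_like = make_log_table()
bulk_load(copy_like, first)
bulk_load(copy_like, overlap)        # whole batch is lost, including row "c"

insert_like = make_log_table()
row_by_row_load(insert_like, first)
row_by_row_load(insert_like, overlap)  # row "c" lands, the duplicate is skipped
```

In this sketch the overlapping batch leaves the COPY-style table at two rows and the INSERT-style table at three, which is the difference in error tolerance described above. The price of the second style is one statement and one transaction per row.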

I don't mean to try and stir this back up again as an argument (particularly not on this list). There are plenty of other apps where COPY is clearly the best approach, you can easily make a case that my app is a fringe application rather than a mainstream one, and on the balance this job is still far easier than my current approach of parsing the logs. I just wanted to give a sample of how using COPY impacts the dynamics of how downstream applications will have to work with this data, so you can see that my contrary preference isn't completely random here.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

