Re: [PATCHES] CREATE TABLE LIKE INCLUDING INDEXES support
Hi,

>> Since this patch is going to consider creating unique/primary indexes
>> assuming them to be constraints,
>
> If it does that it will be rejected. There is a difference here and that
> difference has to be maintained. The correct way to think about this is
> that a pg_constraint entry of type unique or primary key has an associated
> index that is part of its implementation (and therefore has an internal
> dependency on the constraint). But they are far from being the same thing.

Thanks Tom, I understand the difference now. I have a working patch and will
send it to Neil for review tomorrow.

Regards,
Nikhils
--
EnterpriseDB               http://www.enterprisedb.com
Re: [PATCHES] UTF8MatchText
Tom Lane wrote:
> You could imagine trying to do % a byte at a time (and indeed that's what
> I'd been thinking it did) but that gets you out of sync which breaks the
> _ case.

It is only when you have a pattern like '%_' that this is a problem, and we
could detect this and go byte by byte when it's not. Now we check
(*p == '\\') || (*p == '_') in each iteration when we scan over characters
for '%'; we could do it once and have different loops for the two cases.

Other than this part, which I think can be optimized, I don't see anything
wrong with the idea behind the patch. Making the '%' case fast might be an
important optimization for a lot of use cases: it's not uncommon for '%' to
match a bigger part of the string than the rest of the pattern does.

It's easy to make a mistake when one is used to thinking about simple
fixed-size characters like ASCII. Strange that this simple topic can be so
difficult to think about... :-)

/Dennis
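To make the two-loop idea concrete, here is a minimal, self-contained sketch,
assuming single-byte characters and no escape handling (toy_like is an
illustrative stand-in, not the like_match.c code). The test on the pattern
byte that follows '%' is done once, outside the scan-ahead loop:

#include <stdbool.h>

/*
 * Toy LIKE matcher ('%' and '_' only, single-byte, no escapes),
 * illustrating the loop split: the test on the pattern byte after '%'
 * is hoisted out of the scan-ahead loop, so the common case advances
 * on a cheap byte comparison.
 */
static bool
toy_like(const char *t, const char *p)
{
    while (*p)
    {
        if (*p == '%')
        {
            p++;
            if (*p == '\0')
                return true;        /* trailing '%' matches the rest */

            if (*p != '_')
            {
                /* fast loop: next pattern byte is ordinary, so only
                 * positions whose byte equals it can start a match */
                for (; *t; t++)
                    if (*t == *p && toy_like(t, p))
                        return true;
            }
            else
            {
                /* slow loop: '_' can start a match at any position */
                for (; *t; t++)
                    if (toy_like(t, p))
                        return true;
            }
            return false;
        }
        if (*t == '\0')
            return false;
        if (*p != '_' && *p != *t)
            return false;
        t++, p++;
    }
    return *t == '\0';
}

With this split, toy_like("postgresql", "%gre%") only ever runs the fast
loop, while a pattern like '%_x' falls into the slow loop. In the real code
the hoisted test would be the (*p == '\\') || (*p == '_') check quoted above.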
Re: [PATCHES] Preliminary GSSAPI Patches
I finally got to testing that updated patch. It's fine per se, but it was
missing the updated README.GSSAPI file. Herewith fixed.

gss.patch3.bz2
Description: Binary data

On May 12, 2007, at 9:53 AM, Henry B. Hotz wrote:
> These patches are updated as discussed to remove the incomplete feature.
> Unfortunately I have a wedding to go to this weekend and won't get them
> tested until next week. Will post when I've done so.
>
> On Mar 31, 2007, at 3:41 PM, Henry B. Hotz wrote:
>> These patches have been reasonably tested (and cross-tested) on Solaris 9
>> (SPARC) and MacOS 10.4 (both G4 and Intel) with the native GSSAPI
>> libraries. They implement the gss-np and (incompletely) the gss
>> authentication methods. Unlike the current krb5 method, GSSAPI has native
>> support in Java and (with the SSPI) on Windows.
>>
>> I still have bugs in the security layer for the gss method. Hopefully I
>> will finish getting them ironed out today or tomorrow.
>>
>> Documentation is in the README.GSSAPI file. Make sure you get it created
>> when you apply the patches.
>>
>> gss.patches.gz
>> gss.patch2.bz2
>>
>> Just to cover the legal bases: I don't consider these changes to be
>> significant enough to require the involvement of the JPL clearance
>> process. JPL has already ruled that they do not fall afoul of any ITAR
>> restrictions. I am not imposing any license restrictions myself either,
>> but some credit in the release notes, or wherever, would be appreciated.

The opinions expressed in this message are mine, not those of Caltech, JPL,
NASA, or the US Government. [EMAIL PROTECTED], or [EMAIL PROTECTED]
Re: [PATCHES] UTF8MatchText
Dennis Bjorklund wrote:
> Tom Lane wrote:
>> You could imagine trying to do % a byte at a time (and indeed that's what
>> I'd been thinking it did) but that gets you out of sync which breaks the
>> _ case.
>
> It is only when you have a pattern like '%_' that this is a problem, and
> we could detect this and go byte by byte when it's not. Now we check
> (*p == '\\') || (*p == '_') in each iteration when we scan over characters
> for '%'; we could do it once and have different loops for the two cases.
>
> Other than this part, which I think can be optimized, I don't see anything
> wrong with the idea behind the patch. Making the '%' case fast might be an
> important optimization for a lot of use cases: it's not uncommon for '%'
> to match a bigger part of the string than the rest of the pattern does.

Are you sure? The big remaining char-matching bottleneck will surely be in
the code that scans for a place to start matching a %. But that's exactly
where we can't use byte matching for charsets that might include both AB
and BA as characters - the pattern might contain %BA and the string AB.

However, this isn't a danger for UTF8, which leads me to think that we do
indeed need a special case for UTF8, but for a different improvement from
the one proposed in the original patch. I'll post an updated patch shortly.

cheers

andrew
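The reason UTF8 is immune is that the encoding is self-synchronizing: lead
bytes and continuation bytes occupy disjoint ranges, so no byte inside one
character can ever equal the first byte of another. A small runnable demo
of the property (utf8_is_lead_byte is a name invented for this demo):

#include <stdio.h>

/*
 * UTF-8 byte classes: lead bytes are 0xxxxxxx or 11xxxxxx,
 * continuation bytes are 10xxxxxx.  The ranges never overlap.
 */
static int
utf8_is_lead_byte(unsigned char b)
{
    return (b & 0xC0) != 0x80;      /* everything except 10xxxxxx */
}

int
main(void)
{
    const unsigned char s[] = "a\xC3\xA9b";     /* "aéb" in UTF-8 */

    for (int i = 0; s[i] != '\0'; i++)
        printf("byte 0x%02X: %s\n", s[i],
               utf8_is_lead_byte(s[i]) ? "starts a character"
                                       : "continuation byte");
    return 0;
}

This is exactly why a first-byte scan for '%' can use plain byte equality in
UTF-8: an equality match on the pattern's lead byte can only occur at a real
character boundary in the data.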
Re: [PATCHES] Seq scans status update
Heikki Linnakangas wrote:
> In any case, I'd like to see more test results before we make a decision.
> I'm running tests with DBT-2 and a seq scan running in the background to
> see if the cache-spoiling effect shows up. I'm also trying to get hold of
> some bigger hardware to run on. Running these tests takes some calendar
> time, but the hard work has already been done. I'm going to start
> reviewing Jeff's synchronized scans patch now.

Here are the results of the DBT-2 tests:
http://community.enterprisedb.com/seqscan/imola/

In each of these tests, at the end of rampup a script is started that issues
a SELECT COUNT(*) FROM stock in a loop, with a 2 minute delay between the
end of the previous query and the start of the next one.

The patch makes the seq scans go significantly faster. In the 1 hour test
period, the patched tests perform roughly 30-100% more selects than the
unpatched tests.

With 100 and 105 warehouses, the patch also significantly reduces the impact
of the seq scan on other queries; response times are lower with the patch.
With 120 warehouses the reduction in impact is not as clear, but when you
plot the response times it's still there (the plots on the response-times
charts page are useless because they're overwhelmed by the checkpoint
spike).

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [PATCHES] UTF8MatchText
oops. patch attached this time

Andrew Dunstan wrote:
> I wrote:
>> It is only when you have a pattern like '%_' that this is a problem, and
>> we could detect this and go byte by byte when it's not. Now we check
>> (*p == '\\') || (*p == '_') in each iteration when we scan over
>> characters for '%'; we could do it once and have different loops for the
>> two cases. Other than this part, which I think can be optimized, I don't
>> see anything wrong with the idea behind the patch. Making the '%' case
>> fast might be an important optimization for a lot of use cases: it's not
>> uncommon for '%' to match a bigger part of the string than the rest of
>> the pattern does.
>
> Are you sure? The big remaining char-matching bottleneck will surely be
> in the code that scans for a place to start matching a %. But that's
> exactly where we can't use byte matching for charsets that might include
> both AB and BA as characters - the pattern might contain %BA and the
> string AB.
>
> However, this isn't a danger for UTF8, which leads me to think that we do
> indeed need a special case for UTF8, but for a different improvement from
> the one proposed in the original patch. I'll post an updated patch
> shortly.

Here is a patch that implements this. Please analyse for possible breakage.

cheers

andrew

Index: src/backend/utils/adt/like.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/like.c,v
retrieving revision 1.68
diff -c -r1.68 like.c
*** src/backend/utils/adt/like.c	27 Feb 2007 23:48:08 -0000	1.68
--- src/backend/utils/adt/like.c	20 May 2007 14:16:22 -0000
***************
*** 28,48 ****
  #define LIKE_ABORT						(-1)
  
! static int	MatchText(char *t, int tlen, char *p, int plen);
! static int	MatchTextIC(char *t, int tlen, char *p, int plen);
! static int	MatchBytea(char *t, int tlen, char *p, int plen);
! static text *do_like_escape(text *, text *);
! static int	MBMatchText(char *t, int tlen, char *p, int plen);
! static int	MBMatchTextIC(char *t, int tlen, char *p, int plen);
  static text *MB_do_like_escape(text *, text *);
  
  /*
   * Support routine for MatchText. Compares given multibyte streams
   * as wide characters. If they match, returns 1 otherwise returns 0.
   *
   */
! static int
  wchareq(char *p1, char *p2)
  {
  	int			p1_len;
--- 28,51 ----
  #define LIKE_ABORT						(-1)
  
! static int	SB_MatchText(char *t, int tlen, char *p, int plen);
! static int	SB_MatchTextIC(char *t, int tlen, char *p, int plen);
! static text *SB_do_like_escape(text *, text *);
! static int	MB_MatchText(char *t, int tlen, char *p, int plen);
  static text *MB_do_like_escape(text *, text *);
+ 
+ static int	UTF8_MatchText(char *t, int tlen, char *p, int plen);
+ 
+ static int	GenericMatchText(char *s, int slen, char* p, int plen);
+ static int	mbtexticlike(text *str, text *pat);
  
  /*
   * Support routine for MatchText. Compares given multibyte streams
   * as wide characters. If they match, returns 1 otherwise returns 0.
   *
   */
! static inline int
  wchareq(char *p1, char *p2)
  {
  	int			p1_len;
***************
*** 72,86 ****
   * of getting a single character transformed to the system's wchar_t format.
   * So now, we just downcase the strings using lower() and apply regular LIKE
   * comparison. This should be revisited when we install better locale support.
-  *
-  * Note that MBMatchText and MBMatchTextIC do exactly the same thing now.
-  * Is it worth refactoring to avoid duplicated code? They might become
-  * different again in the future.
   */
  
  /* Set up to compile like_match.c for multibyte characters */
  #define CHAREQ(p1, p2) wchareq(p1, p2)
- #define ICHAREQ(p1, p2) wchareq(p1, p2)
  #define NextChar(p, plen) \
  	do { int __l = pg_mblen(p); (p) +=__l; (plen) -=__l; } while (0)
  #define CopyAdvChar(dst, src, srclen) \
--- 75,87 ----
   * of getting a single character transformed to the system's wchar_t format.
   * So now, we just downcase the strings using lower() and apply regular LIKE
   * comparison. This should be revisited when we install better locale support.
   */
  
+ #define NextByte(p, plen)	((p)++, (plen)--)
+ #define BYTEEQ(p1, p2)	(*(p1) == *(p2))
+ 
  /* Set up to compile like_match.c for multibyte characters */
  #define CHAREQ(p1, p2) wchareq(p1, p2)
  #define NextChar(p, plen) \
  	do { int __l = pg_mblen(p); (p) +=__l; (plen) -=__l; } while (0)
  #define CopyAdvChar(dst, src, srclen) \
***************
*** 89,122 ****
  		while (__l-- > 0) \
  			*(dst)++ = *(src)++; \
  	} while (0)
  
! #define MatchText	MBMatchText
! #define MatchTextIC	MBMatchTextIC
  #define do_like_escape	MB_do_like_escape
Re: [PATCHES] Concurrent psql patch
Gregory Stark wrote:
> Attached is an updated patch.

This patch appears to add a nonexistent test to the regression schedules.

cheers

andrew
Re: [PATCHES] UTF8MatchText
Andrew Dunstan [EMAIL PROTECTED] writes:
> Are you sure? The big remaining char-matching bottleneck will surely be in
> the code that scans for a place to start matching a %. But that's exactly
> where we can't use byte matching for charsets that might include both AB
> and BA as characters - the pattern might contain %BA and the string AB.
> However, this isn't a danger for UTF8, which leads me to think that we do
> indeed need a special case for UTF8, but for a different improvement from
> the one proposed in the original patch. I'll post an updated patch
> shortly.

> Here is a patch that implements this. Please analyse for possible
> breakage.

On the strength of this analysis, shouldn't we drop the separate UTF8 match
function and just use SB_MatchText for UTF8?

It strikes me that we may be overcomplicating matters in another way too.
If you believe that the %-scan code is now the bottleneck - that is, the
key loop is where we have pattern '%foo' and we are trying to match 'f' to
each successive data position - then you should be bothered that
SB_MatchTextIC is applying tolower() to 'f' again for each data character.
Worst case we could have O(N^2) applications of tolower() during a match.

I think there's a fair case to be made that we should get rid of
SB_MatchTextIC and implement *all* the case-insensitive variants by means
of an initial lower() call. This would leave us with just two match
functions and allow considerable unification of the setup logic.

			regards, tom lane
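A minimal sketch of the initial-lower() idea, assuming single-byte input
(like_match stands in for SB_MatchText; a toy version appears earlier in
this digest; real server code would use lower(), which understands
multibyte encodings and locales, rather than tolower()):

#include <ctype.h>
#include <stdlib.h>
#include <string.h>

int like_match(const char *t, const char *p);   /* case-sensitive core */

/* Make a downcased copy of a NUL-terminated string. */
static char *
downcase_copy(const char *s)
{
    char *out = malloc(strlen(s) + 1);
    char *d = out;

    for (; *s; s++)
        *d++ = (char) tolower((unsigned char) *s);
    *d = '\0';
    return out;
}

/*
 * Case-insensitive LIKE via one up-front downcasing pass: O(N)
 * tolower() calls total, instead of O(N^2) in the worst case when
 * tolower() is applied inside the %-scan loop.
 */
int
like_match_ic(const char *t, const char *p)
{
    char *lt = downcase_copy(t);
    char *lp = downcase_copy(p);
    int   r  = like_match(lt, lp);

    free(lt);
    free(lp);
    return r;
}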
Re: [PATCHES] build/install xml2 when configured with libxml
On 4/15/07, Peter Eisentraut [EMAIL PROTECTED] wrote:
> Well, if we're going to make libxslt an explicit thing, then it'd be
> trivial to add an xslt transformation function into the core, and then I
> think we can claim equivalent support. But we'll have to check the
> details, of course.
>
> I have been thinking, however, that I don't want to add more and more
> library dependencies into the server. libxml2 was necessary to some
> extent. But xslt functionality could easily be provided as a module. This
> would be easy to do and might be useful even for 8.3. But I don't really
> know how to label that. Having a contrib/xslt alongside contrib/xml2
> would probably be confusing. Ideas?

The current CVS configure is really confusing: it has a --with-xslt option,
while there is no XSLT support in the core. At least let's change the
option's help text to something like "build with XSLT support (currently
used for contrib/xml2 only)"...

--
Best regards,
Nikolay
Re: [PATCHES] build/install xml2 when configured with libxml
Nikolay Samokhvalov wrote:
> The current CVS configure is really confusing: it has a --with-xslt
> option, while there is no XSLT support in the core. At least let's change
> the option's help text to something like "build with XSLT support
> (currently used for contrib/xml2 only)"...

contrib is a misnomer at best. When 8.3 branches I intend to propose that
we abandon it altogether, in line with some previous discussions. We can
change the configure help text if people think it matters that much - which
seems to me much more potentially useful than changing comments.

cheers

andrew
Re: [PATCHES] UTF8MatchText
Tom Lane wrote:
> On the strength of this analysis, shouldn't we drop the separate UTF8
> match function and just use SB_MatchText for UTF8?

Possibly - IIRC I looked at that and there was some reason I didn't, but
I'll look again.

> It strikes me that we may be overcomplicating matters in another way too.
> If you believe that the %-scan code is now the bottleneck - that is, the
> key loop is where we have pattern '%foo' and we are trying to match 'f'
> to each successive data position - then you should be bothered that
> SB_MatchTextIC is applying tolower() to 'f' again for each data
> character. Worst case we could have O(N^2) applications of tolower()
> during a match.
>
> I think there's a fair case to be made that we should get rid of
> SB_MatchTextIC and implement *all* the case-insensitive variants by means
> of an initial lower() call. This would leave us with just two match
> functions and allow considerable unification of the setup logic.

Yeah, quite possibly. I'm also wondering if we are wasting effort
downcasing what will in most cases be the same pattern over and over again.
Maybe we need to look at memoizing that somehow, or at least test to see
whether that would be a gain.

We're getting quite a long way from the original patch :-)

cheers

andrew
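A minimal sketch of the memoization being floated here: cache the most
recently downcased pattern so repeated calls with the same pattern skip the
lower() work. Purely illustrative (one cache slot, not thread-safe, no
PostgreSQL memory-context handling, single-byte tolower() standing in for a
real lower() call):

#include <ctype.h>
#include <stdlib.h>
#include <string.h>

static char *last_pat = NULL;       /* pattern seen on the previous call */
static char *last_lower = NULL;     /* its downcased form */

const char *
downcased_pattern(const char *pat)
{
    size_t len;

    if (last_pat != NULL && strcmp(last_pat, pat) == 0)
        return last_lower;          /* cache hit: no downcasing work */

    free(last_pat);
    free(last_lower);
    len = strlen(pat);
    last_pat = malloc(len + 1);
    last_lower = malloc(len + 1);
    strcpy(last_pat, pat);

    for (size_t i = 0; i < len; i++)
        last_lower[i] = (char) tolower((unsigned char) pat[i]);
    last_lower[len] = '\0';
    return last_lower;
}

For a LIKE run over many rows the pattern is a constant, so after the first
row every call is a cache hit; whether the strcmp() check costs less than
the downcasing it saves is exactly the kind of thing that would need
testing.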
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
Koichi Suzuki [EMAIL PROTECTED] writes:
> As replied to "Patch queue triage" by Tom, here's a simplified patch to
> mark WAL records as compressable, with no increase in WAL itself.
> Compression/decompression commands will be posted separately to PGFoundry
> for further review.

Applied with some minor modifications. I didn't like the idea of
suppressing the sanity check on WAL record length; I think that's fairly
important. Instead, I added a provision for an XLOG_NOOP WAL record type
that can be used to fill in the extra space.

The way I envision that working is that the compressor removes backup
blocks and converts each compressible WAL record to have the same contents
and length it would've had if written without backup blocks. Then, it
inserts an XLOG_NOOP record with length set to indicate the amount of extra
space that needs to be chewed up - but in the compressed version of the WAL
file, XLOG_NOOP's data area is not actually stored. The decompressor need
only scan the file looking for XLOG_NOOP and insert the requisite number of
zero bytes (and maybe recompute the XLOG_NOOP's CRC, depending on whether
you want it to be valid for the short-format record in the compressed
file).

There will also be some games to be played for WAL page boundaries, but you
had to do that anyway.

			regards, tom lane
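A sketch of that decompressor rule, under invented names and a simplified
record header (RecHeader is not the real XLogRecord layout, and the
XLOG_NOOP value here is assumed for illustration):

#include <stdio.h>
#include <stdint.h>

#define XLOG_NOOP 0x20              /* assumed record-info value */

/* Stand-in header: the real XLogRecord also carries xid, prev-link, CRC. */
typedef struct
{
    uint32_t xl_len;                /* length of the record's data area */
    uint8_t  xl_info;               /* record type */
} RecHeader;

/*
 * Copy one record from the compressed file to the restored file.
 * Ordinary records pass through verbatim; an XLOG_NOOP record's data
 * area, which the compressed file omits, is re-expanded as zero bytes.
 */
static int
pass_one_record(FILE *in, FILE *out)
{
    RecHeader rec;

    if (fread(&rec, sizeof(rec), 1, in) != 1)
        return 0;                   /* end of input */
    fwrite(&rec, sizeof(rec), 1, out);

    if (rec.xl_info == XLOG_NOOP)
    {
        /* data area was never stored: chew up the space with zeros */
        for (uint32_t i = 0; i < rec.xl_len; i++)
            fputc(0, out);
    }
    else
    {
        /* ordinary record: copy its data area through unchanged */
        for (uint32_t i = 0; i < rec.xl_len; i++)
        {
            int c = fgetc(in);
            if (c == EOF)
                return 0;
            fputc(c, out);
        }
    }
    return 1;
}

A real implementation would also recompute the record CRC where needed and
handle WAL page boundaries, as noted above.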
Re: [PATCHES] build/install xml2 when configured with libxml
On 5/20/07, Andrew Dunstan [EMAIL PROTECTED] wrote:
> contrib is a misnomer at best. When 8.3 branches I intend to propose that
> we abandon it altogether, in line with some previous discussions. We can
> change the configure help text if people think it matters that much -
> which seems to me much more potentially useful than changing comments.

Actually, I meant the configure help text, not any comment in the code :-)

--
Best regards,
Nikolay
Re: [PATCHES] UTF8MatchText
Andrew Dunstan [EMAIL PROTECTED] writes:
> Yeah, quite possibly. I'm also wondering if we are wasting effort
> downcasing what will in most cases be the same pattern over and over
> again. Maybe we need to look at memoizing that somehow, or at least test
> to see whether that would be a gain.

Someone (Itagaki-san, IIRC) suggested that we ought to convert x ILIKE y
into lower(x) LIKE lower(y) at some fairly early stage, definitely before
constant-folding in the planner. That would take care of that issue without
any run-time mechanism, and would open opportunities for making use of an
index on lower(x).

I recall thinking at the time that there were some potential downsides, but
right at the moment I'm darned if I can see any --- especially if we're
going to make ILIKE do this uniformly at runtime anyway.

			regards, tom lane
Re: [PATCHES] UTF8MatchText
Tom Lane wrote:
> Andrew Dunstan [EMAIL PROTECTED] writes:
>> Yeah, quite possibly. I'm also wondering if we are wasting effort
>> downcasing what will in most cases be the same pattern over and over
>> again. Maybe we need to look at memoizing that somehow, or at least test
>> to see whether that would be a gain.
>
> Someone (Itagaki-san, IIRC) suggested that we ought to convert x ILIKE y
> into lower(x) LIKE lower(y) at some fairly early stage, definitely before
> constant-folding in the planner. That would take care of that issue
> without any run-time mechanism, and would open opportunities for making
> use of an index on lower(x).
>
> I recall thinking at the time that there were some potential downsides,
> but right at the moment I'm darned if I can see any --- especially if
> we're going to make ILIKE do this uniformly at runtime anyway.

Sounds like a TODO item. I'm already a bit concerned about scope creep for
this item.

cheers

andrew
Re: [PATCHES] UTF8MatchText
Andrew Dunstan [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> Andrew Dunstan [EMAIL PROTECTED] writes:
>>> Yeah, quite possibly. I'm also wondering if we are wasting effort
>>> downcasing what will in most cases be the same pattern over and over
>>> again. Maybe we need to look at memoizing that somehow, or at least
>>> test to see whether that would be a gain.
>>
>> Someone (Itagaki-san, IIRC) suggested that we ought to convert x ILIKE y
>> into lower(x) LIKE lower(y) at some fairly early stage, definitely
>> before constant-folding in the planner.
>
> Sounds like a TODO item. I'm already a bit concerned about scope creep
> for this item.

Agreed, I don't want to tackle this right now --- I'm just suggesting it's
probably a better answer than memoizing at runtime.

			regards, tom lane
Re: [PATCHES] UTF8MatchText
Tom Lane wrote:
> On the strength of this analysis, shouldn't we drop the separate UTF8
> match function and just use SB_MatchText for UTF8?

We still call NextChar() after _, and I think we probably need to, don't
we? If so we can't just marry the cases.

cheers

andrew
Re: [PATCHES] UTF8MatchText
Andrew Dunstan [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> On the strength of this analysis, shouldn't we drop the separate UTF8
>> match function and just use SB_MatchText for UTF8?
>
> We still call NextChar() after _, and I think we probably need to, don't
> we? If so we can't just marry the cases.

Doh, you're right ... but on third thought, what happens with a pattern
containing %_? If % tries to advance bytewise then we'll be trying to apply
NextChar in the middle of a data character, and bad things ensue. I think
we need to go back to the scheme with SB_ and MB_ variants and no special
case for UTF8.

			regards, tom lane
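A small demo of the %_ hazard, with utf8_seq_len mimicking the shape of a
pg_mblen()-style lookup for UTF-8 (it is not the server function): if a
bytewise '%' advance stops inside a multibyte character, the following '_'
consumes a fragment rather than a character, and the walk is out of sync
from then on:

#include <stdio.h>

/*
 * Report the length of a UTF-8 sequence from its lead byte; a stray
 * continuation byte falls through to the nonsense answer of 1.
 */
static int
utf8_seq_len(unsigned char b)
{
    if ((b & 0x80) == 0x00) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;                       /* continuation byte: invalid start */
}

int
main(void)
{
    const unsigned char s[] = "\xC3\xA9x";  /* "éx": 2-byte char, then 'x' */
    int pos = 1;                            /* bytewise '%' landed mid-char */
    int len = utf8_seq_len(s[pos]);

    /* '_' is supposed to consume exactly one character; from an invalid
     * start position it consumes a lone continuation byte instead */
    printf("'_' would consume %d byte(s) starting at 0x%02X - "
           "a fragment, not a character\n", len, s[pos]);
    return 0;
}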
Re: [PATCHES] UTF8MatchText
Tom Lane wrote:
> Andrew Dunstan [EMAIL PROTECTED] writes:
>> Tom Lane wrote:
>>> On the strength of this analysis, shouldn't we drop the separate UTF8
>>> match function and just use SB_MatchText for UTF8?
>>
>> We still call NextChar() after _, and I think we probably need to, don't
>> we? If so we can't just marry the cases.
>
> Doh, you're right ... but on third thought, what happens with a pattern
> containing %_? If % tries to advance bytewise then we'll be trying to
> apply NextChar in the middle of a data character, and bad things ensue.
> I think we need to go back to the scheme with SB_ and MB_ variants and no
> special case for UTF8.

My head is spinning with all these variants. I'll look at it tomorrow.

cheers

andrew
Re: [PATCHES] COPY-able csv log outputs
I got a chance to review this patch over the weekend. The basic API seems
good, it met all my requirements, and there were no surprises with how the
GUC variable controlled the feature.

The most fundamental issue I have with the interface is that using COPY
makes it difficult to put any unique index on the resulting table. I like
to have a unique index on my imported log table because it rejects the dupe
records if you accidentally import the same section of log file twice. COPY
tosses the whole thing if there's an index violation, which is a problem
during a regular import because you will occasionally come across lines
with the same timestamp that are similar in every way except for their
statement; putting an index on timestamp+statement seems impractical.

I've had a preference for INSERT from the beginning here that this
reinforces. I'm planning to just work around this issue by doing the COPY
into a temporary table and then INSERTing from there. I didn't want to just
let the concern pass by without mentioning it though. It crosses my mind
that inserting some sort of unique log file line ID number would prevent
the dupe issue and make for better ordering (it's possible to have two
lines with the same timestamp show up in the wrong order now); I'm not sure
that's a practical idea to consider.

The basic coding of the patch seemed OK to me, but someone who is much more
familiar than myself with the mechanics of pipes should take a look at that
part of the patch before committing; it's complicated code and I can't
comment on it.

There are some small formatting issues that need to be fixed, particularly
in the host+port mapping. I can fix those myself and submit a slightly
updated patch. There are some documentation improvements I want to make
before this goes in as well.

The patch is actually broken fairly hard right now because of the
last-minute switch from INSERT to COPY FROM CSV as the output format. It
outputs missing fields as NULL (fine for INSERT), which chokes the CSV
import when the session_start timestamp is missing. All of those NULL
values need to be replaced with nothing for proper CSV syntax; there should
just be the comma for the next field. I worked around this with

    copy pglog from '/opt/pgsql/testlog.csv' with CSV null as 'NULL';

I can fix that too when I'm revising. I plan to have a version free of
obvious bugs ready to re-submit by next weekend.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PATCHES] Updateable cursors patch
> maybe just send a better error message

I do the transformation of a WHERE CURRENT OF clause into a WHERE clause at
the analyze stage itself (which is the right place to do it, I think);
hence this error. If I postpone this transformation until after rewrite,
then the usual error will be thrown if there are no rules. It is easily
doable, but I want to confirm that it will not break any other part. I am
looking into it now.

Rgds,
Arul Shaji

Jaime Casanova wrote:
> On 5/17/07, Jaime Casanova [EMAIL PROTECTED] wrote:
>> On 5/17/07, FAST PostgreSQL [EMAIL PROTECTED] wrote:
>>> No. It works with scrollable cursors. It will work for cursors/selects
>>> which do not put the results in some store, such as WITH HOLD/GROUP BY/
>>> ORDER BY etc. But most of these restrictions apply to a normal 'SELECT
>>> FOR UPDATE' anyway. (With the ORDER BY clause, the implementation is as
>>> per the SQL standard.)
>>
>> your patch doesn't work with updatable views because they don't have
>> ctid columns
>>
>> ERROR: column "ctid" does not exist
>> STATEMENT: update vfoo set des_cta = des_cta || ' - prueba' where
>> current of foo;
>> ERROR: current transaction is aborted, commands ignored until end of
>> transaction block
>>
>> is this sane behavior? to accept creating cursors FOR UPDATE on views
>> and then fail the UPDATE ... WHERE CURRENT OF and roll back the entire
>> transaction? comments?
>
> maybe just send a better error message
Re: [PATCHES] COPY-able csv log outputs
Greg Smith wrote:
> I got a chance to review this patch over the weekend. The basic API seems
> good, it met all my requirements, and there were no surprises with how
> the GUC variable controlled the feature.
>
> The most fundamental issue I have with the interface is that using COPY
> makes it difficult to put any unique index on the resulting table. I like
> to have a unique index on my imported log table because it rejects the
> dupe records if you accidentally import the same section of log file
> twice. COPY tosses the whole thing if there's an index violation, which
> is a problem during a regular import because you will occasionally come
> across lines with the same timestamp that are similar in every way except
> for their statement; putting an index on timestamp+statement seems
> impractical.

Does the format not include the per-process line number? (I know I briefly
looked at this patch previously, but I forget the details.) One reason I
originally included line numbers in log_line_prefix was to handle this sort
of problem.

> I've had a preference for INSERT from the beginning here that this
> reinforces.

COPY is our standard bulk-insert mechanism. I think arguing against it
would be a very hard sell.

> I'm planning to just work around this issue by doing the COPY into a
> temporary table and then INSERTing from there. I didn't want to just let
> the concern pass by without mentioning it though. It crosses my mind that
> inserting some sort of unique log file line ID number would prevent the
> dupe issue and make for better ordering (it's possible to have two lines
> with the same timestamp show up in the wrong order now); I'm not sure
> that's a practical idea to consider.

I guess that answers my question. We should definitely provide a unique
line key.

> The basic coding of the patch seemed OK to me, but someone who is much
> more familiar than myself with the mechanics of pipes should take a look
> at that part of the patch before committing; it's complicated code and I
> can't comment on it.
>
> There are some small formatting issues that need to be fixed,
> particularly in the host+port mapping. I can fix those myself and submit
> a slightly updated patch. There are some documentation improvements I
> want to make before this goes in as well.
>
> The patch is actually broken fairly hard right now because of the
> last-minute switch from INSERT to COPY FROM CSV as the output format. It
> outputs missing fields as NULL (fine for INSERT), which chokes the CSV
> import when the session_start timestamp is missing. All of those NULL
> values need to be replaced with nothing for proper CSV syntax; there
> should just be the comma for the next field. I worked around this with
>
>     copy pglog from '/opt/pgsql/testlog.csv' with CSV null as 'NULL';

I missed that before - yes, it should be fixed.

cheers

andrew
Re: [PATCHES] COPY-able csv log outputs
Greg Smith [EMAIL PROTECTED] writes:
> The most fundamental issue I have with the interface is that using COPY
> makes it difficult to put any unique index on the resulting table. I like
> to have a unique index on my imported log table because it rejects the
> dupe records if you accidentally import the same section of log file
> twice. COPY tosses the whole thing if there's an index violation, which
> is a problem during a regular import because you will occasionally come
> across lines with the same timestamp that are similar in every way except
> for their statement; putting an index on timestamp+statement seems
> impractical.

Essentially, the above is arguing that you want a unique index but you
can't be bothered to invent an actually-unique key. This doesn't seem a
sound argument to me. If we need a unique key, let's find one.

			regards, tom lane