Re: [PATCHES] CREATE TABLE LIKE INCLUDING INDEXES support

2007-05-20 Thread NikhilS

Hi,



 Since this patch is going to consider creating unique/primary indexes
 assuming them to be constraints,

If it does that it will be rejected.  There is a difference here and
that difference has to be maintained.

The correct way to think about this is that a pg_constraint entry of
type unique or primary key has an associated index that is part of
its implementation (and therefore has an internal dependency on the
constraint).  But they are far from being the same thing.



Thanks Tom, I understand the difference now. I have a working patch and will
send it to Neil for review tomorrow.

Regards,
Nikhils

--
EnterpriseDB   http://www.enterprisedb.com


Re: [PATCHES] UTF8MatchText

2007-05-20 Thread Dennis Bjorklund

Tom Lane skrev:

You could imagine trying to do
% a byte at a time (and indeed that's what I'd been thinking it did)
but that gets you out of sync which breaks the _ case.


It is only when you have a pattern like '%_' that this is a problem, and 
we could detect that and go byte by byte when it's not. Right now we check (*p 
== '\\') || (*p == '_') in each iteration when we scan over characters 
for '%'; we could do it once and have different loops for the two cases.


Other than this part that I think can be optimized I don't see anything 
wrong with the idea behind the patch. To make the '%' case fast might be 
an important optimization for a lot of use cases. It's not uncommon that 
'%' matches a bigger part of the string than the rest of the pattern.


It's easy to make a mistake when one is used to thinking about simple 
fixed-size characters like ASCII. Strange that this simple topic can be 
so difficult to think about... :-)


/Dennis

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PATCHES] Preliminary GSSAPI Patches

2007-05-20 Thread Henry B. Hotz
I finally got to testing that updated patch.  It's fine per-se, but  
was missing the updated README.GSSAPI file.  Herewith fixed.




gss.patch3.bz2
Description: Binary data


On May 12, 2007, at 9:53 AM, Henry B. Hotz wrote:

These patches are updated as discussed to remove the incomplete  
feature.  Unfortunately I have a wedding to go to this weekend and  
won't get them tested until next week.  Will post when I've done so.


On Mar 31, 2007, at 3:41 PM, Henry B. Hotz wrote:

These patches have been reasonably tested (and cross-tested) on  
Solaris 9 (SPARC) and MacOS 10.4 (both G4 and Intel) with the  
native GSSAPI libraries.  They implement the gss-np and  
(incompletely) the gss authentication methods.  Unlike the current  
krb5 method, GSSAPI has native support in Java and (with the SSPI)  
on Windows.


I still have bugs in the security layer for the gss method.   
Hopefully will finish getting them ironed out today or tomorrow.


Documentation is in the README.GSSAPI file.  Make sure you get it  
created when you apply the patches.


gss.patches.gz


gss.patch2.bz2


Just to cover the legal bases:  I don't consider these changes to  
be significant enough to require the involvement of the JPL  
clearance process.  JPL has already ruled that they do not fall  
afoul of any ITAR restrictions.


I am not imposing any license restrictions myself either, but some  
credit in the release notes, or wherever, would be appreciated.


The opinions expressed in this message are mine,
not those of Caltech, JPL, NASA, or the US Government.
[EMAIL PROTECTED], or [EMAIL PROTECTED]





Re: [PATCHES] UTF8MatchText

2007-05-20 Thread Andrew Dunstan



Dennis Bjorklund wrote:

Tom Lane skrev:

You could imagine trying to do
% a byte at a time (and indeed that's what I'd been thinking it did)
but that gets you out of sync which breaks the _ case.


It is only when you have a pattern like '%_' when this is a problem 
and we could detect this and do byte by byte when it's not. Now we 
check (*p == '\\') || (*p == '_') in each iteration when we scan over 
characters for '%', and we could do it once and have different loops 
for the two cases.


Other than this part that I think can be optimized I don't see 
anything wrong with the idea behind the patch. To make the '%' case 
fast might be an important optimization for a lot of use cases. It's 
not uncommon that '%' matches a bigger part of the string than the 
rest of the pattern.





Are you sure? The big remaining char-matching bottleneck will surely be 
in the code that scans for a place to start matching a %. But that's 
exactly where we can't use byte matching for cases where the charset 
might include AB and BA as characters - the pattern might contain %BA 
and the string AB. However, this isn't a danger for UTF8, which leads me 
to think that we do indeed need a special case for UTF8, but for a 
different improvement from that proposed in the original patch. I'll 
post an updated patch shortly.


cheers

andrew



Re: [PATCHES] Seq scans status update

2007-05-20 Thread Heikki Linnakangas

Heikki Linnakangas wrote:
In any case, I'd like to see more test results before we make a 
decision. I'm running tests with DBT-2 and a seq scan running in the 
background to see if the cache-spoiling effect shows up. I'm also trying 
to get hold of some bigger hardware to run on. Running these tests takes 
some calendar time, but the hard work has already been done. I'm going 
to start reviewing Jeff's synchronized scans patch now.


Here are the results of the DBT-2 tests:

http://community.enterprisedb.com/seqscan/imola/

In each of these tests, at the end of rampup a script is started that 
issues a SELECT COUNT(*) FROM stock in a loop, with 2 minute delay 
between end of previous query and start of next one.


The patch makes the seq scans go significantly faster. In the 1 hour 
test period, the patched tests perform roughly 30-100% more selects 
than the unpatched tests.


With 100 and 105 warehouses, it also significantly reduces the impact of 
the seq scan on other queries; response times are lower with the patch. 
With 120 warehouses the reduction of impact is not as clear, but when 
you plot the response times it's still there (the plots on the response 
times charts-page are useless because they're overwhelmed by the 
checkpoint spike).


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [PATCHES] UTF8MatchText

2007-05-20 Thread Andrew Dunstan


oops. patch attached this time


Andrew Dunstan wrote:



I wrote:





It is only when you have a pattern like '%_' when this is a problem 
and we could detect this and do byte by byte when it's not. Now we 
check (*p == '\\') || (*p == '_') in each iteration when we scan 
over characters for '%', and we could do it once and have different 
loops for the two cases.


Other than this part that I think can be optimized I don't see 
anything wrong with the idea behind the patch. To make the '%' case 
fast might be an important optimization for a lot of use cases. It's 
not uncommon that '%' matches a bigger part of the string than the 
rest of the pattern.





Are you sure? The big remaining char-matching bottleneck will surely 
be in the code that scans for a place to start matching a %. But 
that's exactly where we can't use byte matching for cases where the 
charset might include AB and BA as characters - the pattern might 
contain %BA and the string AB. However, this isn't a danger for UTF8, 
which leads me to think that we do indeed need a special case for 
UTF8, but for a different improvement from that proposed in the 
original patch. I'll post an updated patch shortly.




Here is a patch that implements this. Please analyse for possible 
breakage.


cheers

andrew




Index: src/backend/utils/adt/like.c
===
RCS file: /cvsroot/pgsql/src/backend/utils/adt/like.c,v
retrieving revision 1.68
diff -c -r1.68 like.c
*** src/backend/utils/adt/like.c	27 Feb 2007 23:48:08 -	1.68
--- src/backend/utils/adt/like.c	20 May 2007 14:16:22 -
***
*** 28,48 
  #define LIKE_ABORT		(-1)
  
  
! static int	MatchText(char *t, int tlen, char *p, int plen);
! static int	MatchTextIC(char *t, int tlen, char *p, int plen);
! static int	MatchBytea(char *t, int tlen, char *p, int plen);
! static text *do_like_escape(text *, text *);
  
! static int	MBMatchText(char *t, int tlen, char *p, int plen);
! static int	MBMatchTextIC(char *t, int tlen, char *p, int plen);
  static text *MB_do_like_escape(text *, text *);
  
  /*
   * Support routine for MatchText. Compares given multibyte streams
   * as wide characters. If they match, returns 1 otherwise returns 0.
   *
   */
! static int
  wchareq(char *p1, char *p2)
  {
  	int			p1_len;
--- 28,51 
  #define LIKE_ABORT		(-1)
  
  
! static int	SB_MatchText(char *t, int tlen, char *p, int plen);
! static int	SB_MatchTextIC(char *t, int tlen, char *p, int plen);
! static text *SB_do_like_escape(text *, text *);
  
! static int	MB_MatchText(char *t, int tlen, char *p, int plen);
  static text *MB_do_like_escape(text *, text *);
  
+ static int	UTF8_MatchText(char *t, int tlen, char *p, int plen);
+ 
+ static int	GenericMatchText(char *s, int slen, char* p, int plen);
+ static int	mbtexticlike(text *str, text *pat);
+ 
  /*
   * Support routine for MatchText. Compares given multibyte streams
   * as wide characters. If they match, returns 1 otherwise returns 0.
   *
   */
! static inline int
  wchareq(char *p1, char *p2)
  {
  	int			p1_len;
***
*** 72,86 
   * of getting a single character transformed to the system's wchar_t format.
   * So now, we just downcase the strings using lower() and apply regular LIKE
   * comparison.	This should be revisited when we install better locale support.
-  *
-  * Note that MBMatchText and MBMatchTextIC do exactly the same thing now.
-  * Is it worth refactoring to avoid duplicated code?  They might become
-  * different again in the future.
   */
  
  /* Set up to compile like_match.c for multibyte characters */
  #define CHAREQ(p1, p2) wchareq(p1, p2)
- #define ICHAREQ(p1, p2) wchareq(p1, p2)
  #define NextChar(p, plen) \
  	do { int __l = pg_mblen(p); (p) +=__l; (plen) -=__l; } while (0)
  #define CopyAdvChar(dst, src, srclen) \
--- 75,87 
   * of getting a single character transformed to the system's wchar_t format.
   * So now, we just downcase the strings using lower() and apply regular LIKE
   * comparison.	This should be revisited when we install better locale support.
   */
  
+ #define NextByte(p, plen)	((p)++, (plen)--)
+ #define BYTEEQ(p1, p2)		(*(p1) == *(p2))
+ 
  /* Set up to compile like_match.c for multibyte characters */
  #define CHAREQ(p1, p2) wchareq(p1, p2)
  #define NextChar(p, plen) \
  	do { int __l = pg_mblen(p); (p) +=__l; (plen) -=__l; } while (0)
  #define CopyAdvChar(dst, src, srclen) \
***
*** 89,122 
  		 while (__l-- > 0) \
  			 *(dst)++ = *(src)++; \
  	   } while (0)
  
! #define MatchText	MBMatchText
! #define MatchTextIC MBMatchTextIC
  #define do_like_escape	MB_do_like_escape
  
  

Re: [PATCHES] Concurrent psql patch

2007-05-20 Thread Andrew Dunstan



Gregory Stark wrote:

Attached is an updated patch.
  


This patch appears to add a nonexistent test to the regression schedules.

cheers

andrew




Re: [PATCHES] UTF8MatchText

2007-05-20 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Are you sure? The big remaining char-matching bottleneck will surely 
 be in the code that scans for a place to start matching a %. But 
 that's exactly where we can't use byte matching for cases where the 
 charset might include AB and BA as characters - the pattern might 
 contain %BA and the string AB. However, this isn't a danger for UTF8, 
 which leads me to think that we do indeed need a special case for 
 UTF8, but for a different improvement from that proposed in the 
 original patch. I'll post an updated patch shortly.

 Here is a patch that implements this. Please analyse for possible 
 breakage.

On the strength of this analysis, shouldn't we drop the separate
UTF8 match function and just use SB_MatchText for UTF8?

It strikes me that we may be overcomplicating matters in another way
too.  If you believe that the %-scan code is now the bottleneck, that
is, the key loop is where we have pattern '%foo' and we are trying to
match 'f' to each successive data position, then you should be bothered
that SB_MatchTextIC is applying tolower() to 'f' again for each data
character.  Worst-case we could have O(N^2) applications of tolower()
during a match.  I think there's a fair case to be made that we should
get rid of SB_MatchTextIC and implement *all* the case-insensitive
variants by means of an initial lower() call.  This would leave us with
just two match functions and allow considerable unification of the setup
logic.

regards, tom lane



Re: [PATCHES] build/install xml2 when configured with libxml

2007-05-20 Thread Nikolay Samokhvalov

On 4/15/07, Peter Eisentraut [EMAIL PROTECTED] wrote:

Well, if we're going to make libxslt an explicit thing, then it'd be
trivial to add an xslt transformation function into the core, and then
I think we can claim equivalent support.  But we'll have to check the
details, of course.

I have been thinking, however, that I don't want to add more and more
library dependencies into the server.  libxml2 was necessary to some
extent.  But xslt functionality could easily be provided as a module.
This would be easy to do and might be useful even for 8.3.  But I don't
really know how to label that.  Having a contrib/xslt alongside
contrib/xml2 would probably be confusing.  Ideas?


The current CVS configure is really confusing: it has a --with-xslt
option, while there is no XSLT support in the core. At least let's
change the option's comment to something like "build with XSLT support
(currently it is used for contrib/xml2 only)"...

--
Best regards,
Nikolay



Re: [PATCHES] build/install xml2 when configured with libxml

2007-05-20 Thread Andrew Dunstan



Nikolay Samokhvalov wrote:


The current CVS' configure is really confusing: it has --with-xslt
option, while there is no XSLT support in the core. At least let's
change the option's comment to smth like build with XSLT support (now
it is used for contrib/xml2 only)...



contrib is a misnomer at best. When 8.3 branches I intend to propose 
that we abandon it altogether, in line with some previous discussions.


We can change the configure help text if people think it matters that 
much - which seems to me much more potentially useful than changing 
comments.


cheers

andrew



Re: [PATCHES] UTF8MatchText

2007-05-20 Thread Andrew Dunstan



Tom Lane wrote:

On the strength of this analysis, shouldn't we drop the separate
UTF8 match function and just use SB_MatchText for UTF8?
  


Possibly - IIRC I looked at that and there was some reason I didn't, but 
I'll look again.



It strikes me that we may be overcomplicating matters in another way
too.  If you believe that the %-scan code is now the bottleneck, that
is, the key loop is where we have pattern '%foo' and we are trying to
match 'f' to each successive data position, then you should be bothered
that SB_MatchTextIC is applying tolower() to 'f' again for each data
character.  Worst-case we could have O(N^2) applications of tolower()
during a match.  I think there's a fair case to be made that we should
get rid of SB_MatchTextIC and implement *all* the case-insensitive
variants by means of an initial lower() call.  This would leave us with
just two match functions and allow considerable unification of the setup
logic.


  



Yeah, quite possibly. I'm also wondering if we are wasting effort 
downcasing what will in most cases be the same pattern over and over 
again. Maybe we need to look at memoizing that somehow, or at least test 
to see if that would be a gain.


We're getting quite a long way from the original patch :-)

cheers

andrew



Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-05-20 Thread Tom Lane
Koichi Suzuki [EMAIL PROTECTED] writes:
 As replied to Patch queue triage by Tom, here's simplified patch to
 mark WAL record as compressable, with no increase in WAL itself.
 Compression/decompression commands will be posted separately to PG
 Foundary for further review.

Applied with some minor modifications.  I didn't like the idea of
suppressing the sanity-check on WAL record length; I think that's
fairly important.  Instead, I added a provision for an XLOG_NOOP
WAL record type that can be used to fill in the extra space.
The way I envision that working is that the compressor removes
backup blocks and converts each compressible WAL record to have the
same contents and length it would've had if written without backup
blocks.  Then, it inserts an XLOG_NOOP record with length set to
indicate the amount of extra space that needs to be chewed up --
but in the compressed version of the WAL file, XLOG_NOOP's data
area is not actually stored.  The decompressor need only scan
the file looking for XLOG_NOOP and insert the requisite number of
zero bytes (and maybe recompute the XLOG_NOOP's CRC, depending on
whether you want it to be valid for the short-format record in the
compressed file).  There will also be some games to be played for
WAL page boundaries, but you had to do that anyway.

regards, tom lane



Re: [PATCHES] build/install xml2 when configured with libxml

2007-05-20 Thread Nikolay Samokhvalov

On 5/20/07, Andrew Dunstan [EMAIL PROTECTED] wrote:


contrib is a misnomer at best. When 8.3 branches I intend to propose
that we abandon it altogether, in line with some previous discussions.

We can change the configure help text if people think it matters that
much - which seems to me much more potentially useful than changing
comments.


Actually, I meant configure help text, not any comment in the code :-)

--
Best regards,
Nikolay



Re: [PATCHES] UTF8MatchText

2007-05-20 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Yeah, quite possibly. I'm also wondering if we are wasting effort 
 downcasing what will in most cases be the same pattern over and over 
 again. Maybe we need to look at memoizing that somehow, or at least test 
 to see if that would be a gain.

Someone (Itagaki-san IIRC) suggested that we ought to convert
x ILIKE y into lower(x) LIKE lower(y) at some fairly early
stage, definitely before constant-folding in the planner.  That
would take care of that issue without any run-time mechanism,
and would open opportunities for making use of an index on lower(x).

I recall thinking at the time that there were some potential downsides,
but right at the moment I'm darned if I can see any --- especially
if we're going to make ILIKE do this uniformly at runtime anyway.

regards, tom lane



Re: [PATCHES] UTF8MatchText

2007-05-20 Thread Andrew Dunstan



Tom Lane wrote:

Andrew Dunstan [EMAIL PROTECTED] writes:
  
Yeah, quite possibly. I'm also wondering if we are wasting effort 
downcasing what will in most cases be the same pattern over and over 
again. Maybe we need to look at memoizing that somehow, or at least test 
to see if that would be a gain.



Someone (Itagaki-san IIRC) suggested that we ought to convert
x ILIKE y into lower(x) LIKE lower(y) at some fairly early
stage, definitely before constant-folding in the planner.  That
would take care of that issue without any run-time mechanism,
and would open opportunities for making use of an index on lower(x).

I recall thinking at the time that there were some potential downsides,
but right at the moment I'm darned if I can see any --- especially
if we're going to make ILIKE do this uniformly at runtime anyway.


  
Sounds like a TODO item. I'm already concerned a bit about scope creep 
for this item.


cheers

andrew



Re: [PATCHES] UTF8MatchText

2007-05-20 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
 Yeah, quite possibly. I'm also wondering if we are wasting effort 
 downcasing what will in most cases be the same pattern over and over 
 again. Maybe we need to look at memoizing that somehow, or at least test 
 to see if that would be a gain.
 
 Someone (Itagaki-san IIRC) suggested that we ought to convert
 x ILIKE y into lower(x) LIKE lower(y) at some fairly early
 stage, definitely before constant-folding in the planner.
 
 Sounds like a TODO item. I'm already concerned a bit about scope creep 
 for this item.

Agreed, I don't want to tackle this right now --- I'm just suggesting
it's probably a better answer than memoizing at runtime.

regards, tom lane



Re: [PATCHES] UTF8MatchText

2007-05-20 Thread Andrew Dunstan



Tom Lane wrote:


On the strength of this analysis, shouldn't we drop the separate
UTF8 match function and just use SB_MatchText for UTF8?
  


We still call NextChar() after _, and I think we probably need to, 
don't we? If so we can't just marry the cases.



cheers

andrew



Re: [PATCHES] UTF8MatchText

2007-05-20 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 On the strength of this analysis, shouldn't we drop the separate
 UTF8 match function and just use SB_MatchText for UTF8?

 We still call NextChar() after _, and I think we probably need to, 
 don't we? If so we can't just marry the cases.

Doh, you're right ... but on third thought, what happens with a pattern
containing %_?  If % tries to advance bytewise then we'll be trying to
apply NextChar in the middle of a data character, and bad things ensue.

I think we need to go back to the scheme with SB_ and MB_ variants and
no special case for UTF8.

regards, tom lane



Re: [PATCHES] UTF8MatchText

2007-05-20 Thread Andrew Dunstan



Tom Lane wrote:

Andrew Dunstan [EMAIL PROTECTED] writes:
  

Tom Lane wrote:


On the strength of this analysis, shouldn't we drop the separate
UTF8 match function and just use SB_MatchText for UTF8?
  


  
We still call NextChar() after _, and I think we probably need to, 
don't we? If so we can't just marry the cases.



Doh, you're right ... but on third thought, what happens with a pattern
containing %_?  If % tries to advance bytewise then we'll be trying to
apply NextChar in the middle of a data character, and bad things ensue.

I think we need to go back to the scheme with SB_ and MB_ variants and
no special case for UTF8.


  


My head is spinning with all these variants. I'll look at it tomorrow.

cheers

andrew



Re: [PATCHES] COPY-able csv log outputs

2007-05-20 Thread Greg Smith
I got a chance to review this patch over the weekend.  Basic API seems 
good, met all my requirements, no surprises with how the GUC variable 
controlled the feature.


The most fundamental issue I have with the interface is that using COPY 
makes it difficult to put any unique index on the resulting table.  I like 
to have a unique index on my imported log table because it rejects the 
dupe records if you accidentally import the same section of log file 
twice.  COPY tosses the whole thing if there's an index violation, which 
is a problem during a regular import because you will occasionally come 
across lines with the same timestamp that are similar in every way except 
for their statement; putting an index on the timestamp+statement seems 
impractical.


I've had a preference for INSERT from the beginning here that this 
reinforces.  I'm planning to just work around this issue by doing the COPY 
into a temporary table and then INSERTing from there.  I didn't want to 
just let the concern pass by without mentioning it though.  It crosses my 
mind that inserting some sort of unique log file line ID number would 
prevent the dupe issue and make for better ordering (it's possible to have 
two lines with the same timestamp show up in the wrong order now); I'm not 
sure that's a practical idea to consider.


The basic coding of the patch seemed OK to me, but someone who is much 
more familiar than myself with the mechanics of pipes should take a look 
at that part of the patch before committing; it's complicated code and I 
can't comment on it.  There are some small formatting issues that need to 
be fixed, particularly in the host+port mapping.  I can fix those myself 
and submit a slightly updated patch.  There's some documentation 
improvements I want to make before this goes in as well.


The patch is actually broken fairly hard right now because of the switch 
from INSERT to COPY FROM CSV as the output format at the last minute.  It 
outputs missing fields as NULL (fine for INSERT), which chokes the CSV 
import when the session_start timestamp is missing.  All of those NULL 
values need to be replaced with nothing for proper CSV syntax; there 
should just be the comma for the next field.  I worked around this with


copy pglog from '/opt/pgsql/testlog.csv' with CSV null as 'NULL';

I can fix that too when I'm revising.  I plan to have a version free of 
obvious bugs to re-submit ready by next weekend.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PATCHES] Updateable cursors patch

2007-05-20 Thread FAST PostgreSQL


 maybe just send a better error message

I do the transformation of a WHERE CURRENT OF clause into a WHERE clause 
at the analyze stage itself (which I think is the right place to do it). 
Hence this error. If I can postpone this transformation until after 
rewrite, then the usual error will be thrown if there are no rules. It 
is easily doable, but I want to confirm that this won't break any other 
part. I am looking into it now.


Rgds,
Arul Shaji


Jaime Casanova wrote:

On 5/17/07, Jaime Casanova [EMAIL PROTECTED] wrote:

On 5/17/07, FAST PostgreSQL [EMAIL PROTECTED] wrote:
 No. It works with scrollable cursors. It will work for cursors/selects
 which do not put the results in some store, such as WITH HOLD/GROUP
 BY/ORDER BY etc. But most of these restrictions apply for normal
 'SELECT FOR UPDATE' anyway. (With the ORDER BY clause, the
 implementation is as per the SQL standard.)


your patch doesn't work with updatable views because they don't have
ctid columns

ERROR:  column ctid does not exist
STATEMENT:  update vfoo set des_cta = des_cta || ' - prueba' where
current of foo;
ERROR:  current transaction is aborted, commands ignored until end of
transaction block

is this sane behavior? to accept create cursors for update on views
and then failing to update where current of and rollback the entire
transaction?

comments?



maybe just send a better error message






Re: [PATCHES] COPY-able csv log outputs

2007-05-20 Thread Andrew Dunstan



Greg Smith wrote:
I got a chance to review this patch over the weekend.  Basic API seems 
good, met all my requirements, no surprises with how the GUC variable 
controlled the feature.


The most fundamental issue I have with the interface is that using 
COPY makes it difficult to put any unique index on the resulting 
table.  I like to have a unique index on my imported log table because 
it rejects the dupe records if you accidentally import the same 
section of log file twice.  COPY tosses the whole thing if there's an 
index violation, which is a problem during a regular import because 
you will occasionally come across lines with the same timestamp that 
are similar in every way except for their statement; putting an index 
on the timestamp+statement seems impractical.


Does the format not include the per-process line number? (I know i 
briefly looked at this patch previously, but I forget the details.) One 
reason I originally included line numbers in log_line-prefix was to 
handle this sort of problem.




I've had a preference for INSERT from the beginning here that this 
reinforces.


COPY is our standard bulk insert mechanism. I think arguing against it 
would be a very hard sell.


I'm planning to just work around this issue by doing the COPY into a 
temporary table and then INSERTing from there.  I didn't want to just 
let the concern pass by without mentioning it though.  It crosses my 
mind that inserting some sort of unique log file line ID number would 
prevent the dupe issue and make for better ordering (it's possible to 
have two lines with the same timestamp show up in the wrong order 
now), not sure that's a practical idea to consider.


I guess that answers my question. We should definitely provide a unique 
line key.


The basic coding of the patch seemed OK to me, but someone who is much 
more familiar than myself with the mechanics of pipes should take a 
look at that part of the patch before committing; it's complicated 
code and I can't comment on it.  There are some small formatting 
issues that need to be fixed, particularly in the host+port mapping.  
I can fix those myself and submit a slightly updated patch.  There's 
some documentation improvements I want to make before this goes in as 
well.


The patch is actually broken fairly hard right now because of the 
switch from INSERT to COPY FROM CSV as the output format at the last 
minute.  It outputs missing fields as NULL (fine for INSERT), which 
chokes the CSV import when the session_start timestamp is missing.  
All of those NULL values need to be replaced with nothing for 
proper CSV syntax; there should just be the comma for the next field.  I 
worked around this with


copy pglog from '/opt/pgsql/testlog.csv' with CSV null as 'NULL';




I missed that before - yes it should be fixed.

cheers

andrew



Re: [PATCHES] COPY-able csv log outputs

2007-05-20 Thread Tom Lane
Greg Smith [EMAIL PROTECTED] writes:
 The most fundamental issue I have with the interface is that using COPY 
 makes it difficult to put any unique index on the resulting table. I like 
 to have a unique index on my imported log table because it rejects the 
 dupe records if you accidentally import the same section of log file 
 twice.  COPY tosses the whole thing if there's an index violation, which 
 is a problem during a regular import because you will occasionally come 
 across lines with the same timestamp that are similar in every way except 
 for their statement; putting an index on the timestamp+statement seems 
 impractical.

Essentially the above is arguing that you want a unique index but you
can't be bothered to invent an actually-unique key.  This doesn't seem
a sound argument to me.  If we need a unique key, let's find one.

regards, tom lane
