date:20051008

Re: [PATCHES] [HACKERS] Patching dblink.c to avoid warning about

2005-10-08 Thread Joe Conway


Bruce Momjian wrote:

There was also a problem in that if two cursors were opened, the first
close would close the transaction.  I have fixed that code by changing
the xact variable in to a counter that keeps track of the number of
opened cursors and commits only when they are all closed.

Both the dblink.c patch and the regression patch are at:

ftp://candle.pha.pa.us/pub/postgresql/mypatches



OK, I'll take a look, but I won't have time for a couple of days (I'm 
not at home -- visiting my dad for his 80th birthday -- and have no 
broadband access).


Joe

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [PATCHES] [HACKERS] Kerberos brokenness and oops question in 8.1beta2

2005-10-08 Thread Magnus Hagander

> The point is I'm having a hard time seeing what the actual 
> gain is in not changing it back. If the principal name 
> mismatches, we're going to get rejected anyway, so it's not 
> really a problem there. Even though the gain in changing it 
> back isn't all that big either, why should we introduce  
> abackwards-incompatibility if there is no real gain in a 
> different part of the code.


Here's a patch that fixes the big problem and reverts the behaviour of
appl_version to be compatible with 8.0. It's easy enough to isolate the
changes that are around the appl_version - one line in
backend/libpq/auth.c call to krb5_recvauth and one in
interfaces/libpq/fe-auth.c call to krb5_sendauth.

The call in backend/libpq/auth.c to krb5_sname_to_principal in 8.1beta2
was completely broken for a scenario where you *didn't* use virtual
hosts, by setting pg_krb5_server to NULL... The call is needed there as
well.

//Magnus


krb5fix.patch
Description: krb5fix.patch

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PATCHES] [HACKERS] Patching dblink.c to avoid warning about open

2005-10-08 Thread Bruce Momjian

Joe Conway wrote:
> Bruce Momjian wrote:
> > There was also a problem in that if two cursors were opened, the first
> > close would close the transaction.  I have fixed that code by changing
> > the xact variable in to a counter that keeps track of the number of
> > opened cursors and commits only when they are all closed.
> > 
> > Both the dblink.c patch and the regression patch are at:
> > 
> > ftp://candle.pha.pa.us/pub/postgresql/mypatches
> > 
> 
> OK, I'll take a look, but I won't have time for a couple of days (I'm 
> not at home -- visiting my dad for his 80th birthday -- and have no 
> broadband access).

No problem -- thanks.  I have slimmed down the patch by applying the
cosmetic parts to CVS.  Use the URL above to get the newest versions of
the dblink.c and regression changes.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

[PATCHES] [PATCH] Using pread instead of lseek (with analysis)

2005-10-08 Thread Martijn van Oosterhout

Hi,

Below is a patch that tries to determine (for linux) if the pread is
emulated or not (using an autoconf test), and if so use it instead of
lseek. It doesn't remove all uses of lseek. For example, PostgreSQL
uses lseek to find the end of the file when extending. So in a bulk
load, there is one seek per block written anyway. Also, the xlog seems
to be completely seperate code.

Short summary: Actual benefit, (on Linux anyway) not distinguishable
from the noise.

Longer summary: It reduces CPU usage slightly (few percent user,
overall almost nothing), but the benefit is usually swamped by the
actual cost of the read from disk (system cache miss). I would actually
make the change because it simplifies the code a bit.

Comparisons on other systems may be worthwhile.

What follows is the detailed analysis of the change. My conclusion is
that on Linux at least where the syscall overhead is so low, it's not
worth the change for performance. It is cleaner code-wise IMHO.


Initially, the table is 1.5 million rows of random integer data. Worst
case scenario: ordered output run with seqscan disabled. Table is 66MB,
Index is 56MB and includes all three data columns. The queries took 98
+/- 1 seconds. Over the entire query it consistantly loaded about 55MB
of data from disk (about 50% system cache miss rate).

Query: select count(*) from (select * from bigtable order by a,b,c)

Exhibit A below is the vmstat output with 10 second averages. The
interesting columns are cpu us (user) and cpu sy (system). Note, system
is defined as stuff not user and not idle. Waiting for the I/O to
complete is counted as system, even though it's not doing all that
much. It shows a very, very slight reduction in user time, though it is
measurable and repeatable (both run a number of times with same
results).

What this tells me is that lseek time is swamped by actual read time,
so not a major benefit in most cases.

Exhibit B is the same but with only 1 million rows (Table 48MB, Index
29MB), this fitting in cache memory. The original index had some slack
from deleting records (I did a vacuum full though). The queries now
take 60 +/- 1 seconds. As you can see even without accessing the disk,
it still isn't making a big difference.

Exhibit C is the output of strace -c on the second test. As you can see
there is a difference, but marginal.

Exhibit D is the output of a seq scan, in case anyone figured I might
be buggering the queries somehow. As you can see, it shows that the
index scan is extremely expensive. Only 5900 reads for the seq scan
(enough to read the whole table) instead of 89. The shared buffers
is the default 1000. Upping this would close the gap somewhat but that
is irrelevent for this test (lseek vs pread).

Exhibit E is the strace -c output, for completeness. lseek doesn't
appear in the top 10.

If you read this far, thanks for the attention! If you're running
another system, it would be nice to test. Remember to update the
autoconf code for your system!

Have a nice day,

Misc details:

command line:
  time echo "select count(*) from(select * from bigtable order by a,b,c) x" | \ 
./postgres -f s -D /var/lib/postgres/data test
System: 
  Linux kleptog 2.6.8-1-686 #1 Thu Nov 25 04:34:30 UTC 2004 i686 GNU/Linux
  128MB RAM
  Intel(R) Celeron(TM) CPU 1000MHz

Exhibit (A): Index scan, dataset > memory

Using ordinary llseek/read:

procs ---memory-- ---swap-- -io --system-- cpu
 r  b   swpd   free   buff  cache   si   sobibo   incs us sy id wa
 1  0  4   1372940 10756400   53418 1070   145 18 56  0 27
 1  0  4   1648992 10726000   548 3 1071   147 18 63  0 19
 1  0  4   1552984 10734800   542 4 1069   144 20 69  0 11
 1  0  4   1432980 10749600   550 6 1071   146 19 69  0 12
 1  0  4   1648664 10760000   560 3 1071   148 19 67  0 15
 1  0  4   1576632 10772800   597 3 1076   157 19 68  0 13
 2  0  4   1768640 10752800   61814 1080   163 19 67  0 14

Using pread():

procs ---memory-- ---swap-- -io --system-- cpu
 r  b   swpd   free   buff  cache   si   sobibo   incs us sy id wa
 1  0  4   1680600 10761600   53418 1070   145 17 56  0 28
 1  0  4   1656644 10764400   543 5 1070   147 16 64  0 19
 1  0  4   1344624 10797600   540 2 1068   142 19 70  0 11
 1  0  4   1656632 10765200   555 6 1071   147 17 70  0 13
 1  0  4   1608540 10779600   56413 1074   149 18 68  0 15
 1  0  4   1392504 10806800   598 4 1076   158 17 69  0 14
 0  1  4   1368460 10812000   626 3 1079   165 17 69  0 14

Exhibit (B): Index scan, dataset < memory

Using ordinary llseek/read:

procs ---memory-- ---swap-- -io --system-- cpu
 r  b   swpd   free

[PATCHES] test, please ignore

2005-10-08 Thread Bruce Momjian

test email, please ignore.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [PATCHES] [HACKERS] Patching dblink.c to avoid warning about open

2005-10-08 Thread Bruce Momjian


[ reposted]

Tom Lane wrote:
> Bruce Momjian  writes:
> > Well, as I said in the patch email:
> 
> > The reported problem is that dblink_open/dblink_close() (for cursor
> > reads) do a BEGIN/COMMIT regardless of the transaction state of the
> > remote connection.  There was code in dblink.c to track the remote
> > transaction state (rconn), but it was not being maintained or used.
> 
> You should lose the remoteXactOpen flag entirely, in favor of just
> testing PQtransactionStatus() on-the-fly when necessary.  Simpler,
> more reliable, not notably slower.
> 
> With that change, the separate remoteConn struct could be dropped
> altogether in favor of just using the PGconn pointer.  This would
> make things notationally simpler, and in fact perhaps allow undoing
> the bulk of the edits in your patch.  As-is I think the patch is
> pretty risky to apply during beta.

I have developed a dblink regression patch that tests for the bug.  It
opens a transaction in the client code, opens/closes a cursor, then
tries a DECLARE.  Without my patch, this fails because the
dblink_close() closes the transaction, but with my patch it succeeds. 
Here is the test:

-- test opening cursor in a transaction
SELECT dblink_exec('myconn','BEGIN');

-- an open transaction will prevent dblink_open() from opening its own
SELECT dblink_open('myconn','rmt_foo_cursor','SELECT * FROM foo');

-- this should not close the cursor because the client opened it
SELECT dblink_close('myconn','rmt_foo_cursor');

-- this should succeed because we have an open transaction
SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM 
foo');

-- commit remote transaction
SELECT dblink_exec('myconn','COMMIT');

Here is the failure reported by the current code:

-- this should succeed because we have an open transaction
SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR SELECT * FROM 
foo');
ERROR:  sql error
DETAIL:  ERROR:  DECLARE CURSOR may only be used in transaction blocks

There was also a problem in that if two cursors were opened, the first
close would close the transaction.  I have fixed that code by changing
the xact variable in to a counter that keeps track of the number of
opened cursors and commits only when they are all closed.

Both the dblink.c patch and the regression patch are at:

ftp://candle.pha.pa.us/pub/postgresql/mypatches

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PATCHES] [BUGS] BUG #1927: incorrect timestamp returned

2005-10-08 Thread Tom Lane

Bruce Momjian  writes:
> Tom Lane wrote:
>> I think the solution is that timestamp_out needs to decide how many
>> fractional digits it wants to display, and then round off the input
>> accordingly, *before* it breaks the input down into y/m/d/h/m/s fields.
>> This "60.00" business is happening because the rounding is done only on
>> the seconds-and-fractional-seconds field.

> Well, the testing showed that the one with the most 9's was actually
> rounded up to a whole number by timestamp_in, meaning we never have a
> chance to adjust it in timestamp_out.  I am assuming you will be able to
> round the middle test value up to a whole number in timestamp_out so the
> 60 number will disappear.

After further thought I've realized that the current methodology for
applying JROUND (ie, to apply it to a whole timestamp or time value)
is fundamentally bogus.  The macro is trying to round off to six fraction
digits, but if you've got many digits on the left side of the decimal point
then you cannot guarantee that six fraction digits of precision are
available to work with.  It is *only* sensible to JROUND a value
representing a fractional-seconds field (or possibly seconds +
fractional seconds).  Accordingly, the majority of the JROUND calls in
the existing code are indeed bogus and should go away.  The only one
that's not bogus is the one in dt2time(), which is properly applied to a
fractional second ... but the problem with that one is that there's no
provision for coping with the possibility that the fractional second
rounds up to 1.0.  In that situation, we need to be able to propagate
the roundoff to higher fields to avoid the "23:59:60" syndrome.

Since dt2time doesn't have a way to propagate the roundoff into the date
part, we can't fix this right there.  It needs to be handled at the
caller level (timestamp2tm and related routines).

Attached is a proposed patch that cleans this up.  It's not quite
complete because I haven't propagated the changes into ecpglib,
but I thought I could put it out for comment in this form.  

The patch does the following:

1. Replace JROUND with TSROUND (to round to MAX_TIMESTAMP_PRECISION
places) and TIMEROUND (to round to MAX_TIME_PRECISION places).

2. Remove JROUND calls in calculational routines.

3. In timestamp2tm, time2tm, timetz2tm, and interval2tm, round off
the fractional-seconds field to the right number of places, check
for rounding up to 1, and cope as needed.

4. Make the sprintf field width in EncodeTimeOnly match
MAX_TIME_PRECISION.  Because it was only printing 9 digits and not 10,
there were cases where a properly rounded fractional time would still
print as "60.0".  (I think at some point we reduced it from 10 to 9
as a hacky way of dealing with a different bug report ... but now I
feel we've gotten the right solution at last.)

The patch passes the regression tests and appears to fix the problems.

Comments?

regards, tom lane

*** contrib/btree_gist/btree_ts.c.orig  Fri Jul  1 09:44:55 2005
--- contrib/btree_gist/btree_ts.c   Sat Oct  8 13:12:37 2005
***
*** 122,130 
*gmt -= (tz * INT64CONST(100));
  #else
*gmt -= tz;
-   *gmt = JROUND(*gmt);
  #endif
- 
}
return gmt;
  }
--- 122,128 
*** src/backend/utils/adt/date.c.orig   Thu Sep  8 22:31:49 2005
--- src/backend/utils/adt/date.cSat Oct  8 13:12:45 2005
***
*** 944,953 
--- 944,961 
  #else
double  trem;

+ recalc:
trem = time;
TMODULO(trem, tm->tm_hour, (double)SECS_PER_HOUR);
TMODULO(trem, tm->tm_min, (double)SECS_PER_MINUTE);
TMODULO(trem, tm->tm_sec, 1.0);
+   trem = TIMEROUND(trem);
+   /* roundoff may need to propagate to higher-order fields */
+   if (trem >= 1.0)
+   {
+   time = ceil(time);
+   goto recalc;
+   }
*fsec = trem;
  #endif

***
*** 1837,1845 
--- 1845,1861 
  #else
double  trem = time->time;

+ recalc:
TMODULO(trem, tm->tm_hour, (double)SECS_PER_HOUR);
TMODULO(trem, tm->tm_min, (double)SECS_PER_MINUTE);
TMODULO(trem, tm->tm_sec, 1.0);
+   trem = TIMEROUND(trem);
+   /* roundoff may need to propagate to higher-order fields */
+   if (trem >= 1.0)
+   {
+   trem = ceil(time->time);
+   goto recalc;
+   }
*fsec = trem;
  #endif

*** src/backend/utils/adt/datetime.c.orig   Sat Jul 23 10:25:33 2005
--- src/backend/utils/adt/datetime.cSat Oct  8 13:12:46 2005
***
*** 3488,3495 
sprintf(str, "%02d:%02d", tm->tm_hour, tm->tm_min);

/*
!* Print fractional seconds if any.  The field widths here should be
!* at least equal to the larger of MAX_TIME_PRECISION and
 * MAX_TIMESTAMP_PRECISION.
 */
if (fsec != 0)
--- 3488,3495 
sprintf(str, "%02d:%02d",

Re: [PATCHES] [HACKERS] Kerberos brokenness and oops question in 8.1beta2

2005-10-08 Thread Tom Lane

"Magnus Hagander" <[EMAIL PROTECTED]> writes:
> Here's a patch that fixes the big problem and reverts the behaviour of
> appl_version to be compatible with 8.0.

Applied with trivial stylistic cleanups.

BTW, the documentation seems a bit broken:

krb_server_hostname (string)

 Sets the hostname part of the service principal. This, combined
 with krb_srvname, is used to generate the complete service
 principal, i.e. krb_server_hostname/[EMAIL PROTECTED]

I suppose one of those last two should be "krb_srvname", but which?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [PATCHES] [HACKERS] Kerberos brokenness and oops question in 8.1beta2

2005-10-08 Thread Magnus Hagander

> > Here's a patch that fixes the big problem and reverts the 
> behaviour of 
> > appl_version to be compatible with 8.0.
> 
> Applied with trivial stylistic cleanups.
> 
> BTW, the documentation seems a bit broken:
> 
> krb_server_hostname (string)
> 
>  Sets the hostname part of the service principal. This, combined
>  with krb_srvname, is used to generate the complete service
>  principal, i.e. krb_server_hostname/[EMAIL PROTECTED]
> 
> I suppose one of those last two should be "krb_srvname", but which?

Yes.
It should be krb_srvname/[EMAIL PROTECTED]

Now that you mention it, krb_server_hostname should probably mentioned
in chapter 19.2.3 somewhere. 

//Magnus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [PATCHES] [HACKERS] Kerberos brokenness and oops question in 8.1beta2

2005-10-08 Thread Tom Lane

"Magnus Hagander" <[EMAIL PROTECTED]> writes:
>> I suppose one of those last two should be "krb_srvname", but which?

> Yes.
> It should be krb_srvname/[EMAIL PROTECTED]

OK, will fix.

> Now that you mention it, krb_server_hostname should probably mentioned
> in chapter 19.2.3 somewhere. 

Yeah, it just says

hostname is the fully qualified host name of the server machine.

which is clearly insufficient detail.  Please send some proposed text.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PATCHES] [PATCH] Using pread instead of lseek (with analysis)

2005-10-08 Thread Qingqing Zhou


"Martijn van Oosterhout"  wrote
>
> Below is a patch that tries to determine (for linux) if the pread is
> emulated or not (using an autoconf test), and if so use it instead of
> lseek.

Just FYI, this was proposed 2 years ago:

http://archives.postgresql.org/pgsql-patches/2003-02/msg00197.php

The basic problem I read from the above thread is that it does not give 
enough performance proofs on *various* platforms.

Regards,
Qingqing 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [PATCHES] [HACKERS] Kerberos brokenness and oops question in 8.1beta2

2005-10-08 Thread Tom Lane

BTW, it appears to me that this patch has also broken the claim in the
manual that

If [krb_server_hostname is] not set, the default is to allow any
service principal matching an entry in the keytab.

The reason that was true was that we passed a NULL "server" value to
krb5_recvauth(), which with this patch we never do anymore.

I'm not sure if this represents a serious loss of flexibility or not,
but in any case the documentation needs an update.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [PATCHES] [PATCH] Using pread instead of lseek (with analysis)

2005-10-08 Thread Tom Lane

Martijn van Oosterhout  writes:
> What follows is the detailed analysis of the change. My conclusion is
> that on Linux at least where the syscall overhead is so low, it's not
> worth the change for performance. It is cleaner code-wise IMHO.

How is it cleaner code-wise to debug and maintain two #ifdef'd code
paths instead of one?  (And when I say "debug", I mean "I don't believe
FileSeek still works".  One reason that the patch seems unmaintainable
to me as-is is that it creates a subtle, critical, and undocumented
difference in the semantic meaning of the seekPos variable: in one case
it's tracking the kernel's own seek position, and in the other it isn't.)

> What this tells me is that lseek time is swamped by actual read time,
> so not a major benefit in most cases.

It would be reasonable to check results in fully-cached cases, which
would be the best real-world scenario for this to show any improvement
in.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [PATCHES] [PATCH] Using pread instead of lseek (with analysis)

2005-10-08 Thread Martijn van Oosterhout

On Sat, Oct 08, 2005 at 05:27:11PM -0400, Tom Lane wrote:
> How is it cleaner code-wise to debug and maintain two #ifdef'd code
> paths instead of one?  (And when I say "debug", I mean "I don't believe
> FileSeek still works".  One reason that the patch seems unmaintainable
> to me as-is is that it creates a subtle, critical, and undocumented
> difference in the semantic meaning of the seekPos variable: in one case
> it's tracking the kernel's own seek position, and in the other it isn't.)

I mean in the situation where we would only use pread. As for the
difference, no code outside of fd.c can assume anying about the kernel
seek position because it can't see it (the file might be closed!).
Unless something bypasses the abstraction?

> It would be reasonable to check results in fully-cached cases, which
> would be the best real-world scenario for this to show any improvement
> in.

If you look, I did that and even then it simply didn't make a
difference. lseek is 10 microseconds, you'd need to do hundreds of
thousands of them to make a difference. And any gain would disappear in
just the rotational latency of a hard disk read.

I satisfied my curiosity and showed it just isn't worth it at this
point.
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

pgp4zc73DtXI1.pgp
Description: PGP signature

Re: [PATCHES] [HACKERS] Patching dblink.c to avoid warning about

Re: [PATCHES] [HACKERS] Kerberos brokenness and oops question in 8.1beta2

Re: [PATCHES] [HACKERS] Patching dblink.c to avoid warning about open

[PATCHES] [PATCH] Using pread instead of lseek (with analysis)

[PATCHES] test, please ignore

Re: [PATCHES] [HACKERS] Patching dblink.c to avoid warning about open

Re: [PATCHES] [BUGS] BUG #1927: incorrect timestamp returned

Re: [PATCHES] [HACKERS] Kerberos brokenness and oops question in 8.1beta2

Re: [PATCHES] [HACKERS] Kerberos brokenness and oops question in 8.1beta2

Re: [PATCHES] [HACKERS] Kerberos brokenness and oops question in 8.1beta2

Re: [PATCHES] [PATCH] Using pread instead of lseek (with analysis)

Re: [PATCHES] [HACKERS] Kerberos brokenness and oops question in 8.1beta2

Re: [PATCHES] [PATCH] Using pread instead of lseek (with analysis)

Re: [PATCHES] [PATCH] Using pread instead of lseek (with analysis)

14 matches

Site Navigation

Mail list logo

Footer information