Re: [HACKERS] bitmap scans, btree scans, and tid order

2005-05-16 Thread Jeffrey Baker
Tom Lane wrote:
Jeffrey W. Baker [EMAIL PROTECTED] writes:
I see that Tom has already done the infrastructure work by adding
getmulti, but getmulti isn't used by nodeIndexscan.c, only
nodeBitmapIndexscan.c.  Will btree index scans be executed by creating
in-memory bitmaps in 8.1, or will some scans still be executed the usual
way?

We aren't going to remove the existing indexscan behavior, because
bitmap scans lose the ordering of the underlying index.  There are many
situations where that ordering is important.  (See for instance the
recent changes to make MAX/MIN use that behavior.)
Would you take a patch that retained the optimized executions of plans 
returning 1 tuple and also fixed the random heap problem?

-jwb
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] bitmap scans, btree scans, and tid order

2005-05-16 Thread Neil Conway
Jeffrey Baker wrote:
Would you take a patch that retained the optimized executions of plans 
returning 1 tuple and also fixed the random heap problem?
Can you elaborate on what you're proposing? Obviously sorted b+-tree 
output is important for a lot more than just min()/max(). I don't see an 
obvious way to produce sorted output from a bitmap tree index scan 
without requiring an additional sort step (which would be rather 
pointless -- the whole point of the optimization is to avoid an 
additional sort).

-Neil
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] bitmap scans, btree scans, and tid order

2005-05-16 Thread Jeffrey Baker
Neil Conway wrote:
Jeffrey Baker wrote:
Would you take a patch that retained the optimized executions of plans 
returning 1 tuple and also fixed the random heap problem?

Can you elaborate on what you're proposing? Obviously sorted b+-tree 
output is important for a lot more than just min()/max(). I don't see an 
obvious way to produce sorted output from a bitmap tree index scan 
without requiring an additional sort step (which would be rather 
pointless -- the whole point of the optimization is to avoid an 
additional sort).
I understand the importance of returning tuples in index order for many 
plans (although I probably haven't thought of all the cases.  min/max is 
the most obvious; order by + limit is another).  The only problem I'm 
trying to solve is when an indexscan returns a large result, causing the 
heap to be visited in index order, which is to say random order, from 
the disk's perspective.  When I investigated this last year, sorting the 
intermediate result of the index scan in disk order was good for a 
reduction by two-thirds in actual execution time, and sorting the scan 
result in chunks of 1000 tuples was enough to reduce the time by half.

I'm considering one of the following courses of action:
Change nodeIndexscan.c to call index_getmulti, and to handle multiple 
tuples returned.  That code would sort the tuple array and store the 
tuples in the result in disk order.

-or-
Change the planner/executor to use the bitmap scan in all cases where 
index order is unimportant.  From my reading of the current code, the 
bitmap scan is only used in case of a join.

-jwb
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-16 Thread Mark Cave-Ayland
 
 -Original Message-
 From: Simon Riggs [mailto:[EMAIL PROTECTED]
 Sent: 12 May 2005 16:52
 To: Mark Cave-Ayland (External)
 Cc: 'Christopher Kings-Lynne'; 'Tom Lane'; 'Bruce Momjian'; 
 pgsql-hackers@postgresql.org
 Subject: RE: [HACKERS] Cost of XLogInsert CRC calculations

(cut)
 
 It might be possible to speed things up by a factor of two using a
 two- byte lookup table rather than a one byte lookup. This would take 
 up 65536 table entries, which is only 256KB. If we held that in shared 
 memory we could calculate the entries at startup, or read them in from 
 a file. It would only use the same space as if we had 255 connections, 
 so not a great increase in memory usage overall. Need to look at the
 effect of retrieving everything from memory rather than from 
 cache, so it probably wouldn't be twice as fast.
 
 Best Regards, Simon Riggs


Hi everyone,

As per this thread, I decided to spend a little time over the weekend
looking at the performance of the existing CRC64 code in order to determine
whether we could code the algorithm in a more efficient manner. In order to
do this, I devised a couple of test programs using code cut/pasted from
pg_crc.h (and pg_resetxlog) in order perform some timings. 

These tests were done on two systems: a P4 1.8GHz laptop with MingW (Win32)
and gcc 3.2.3, and a Xeon 2.8GHz server running FC1 and gcc 3.3.2. 

The program used to time the CRC64 calculation simply did the following:

1) Create an 8k buffer of data

2) Fill the buffer with some repeatable values; in this case the
algorithm used was the following:

for (i = 0 ; i  8192 ; i++ )
{
*data = (char )(i % 256);
}

3) Calculate the CRC64 of the buffer 10,000 times and display the
result,
making a note of the start and end times. Using this information an
estimate 
of the cost of a single CRC calculation can esaily be found.

I noticed looking at the code that there were two versions of the CRC code,
one using 32-bit arguments wrapped within a check for INT64_IS_BUSTED and
another using 64-bit unsigned integers. Since I wasn't sure initially which
one was being used, I decided to test both of them :) Both programs were
compiled using gcc's -O2 flag for optimisation.

The first thing to notice about the results was that I obtained different
CRC values using both algorithms on Win32, which were both different to the
results obtained under Linux:

Win32 MingW:
1) INT64_IS_BUSTED code (32-bit):   0x541d4a27 (crc1) 0x8dfa57ae
(crc0)
2) 64-bit code: 0x817f722b1f17b253 (crc0)

Linux (FC1):
1) INT64_IS_BUSTED code (32-bit):   0x21d064a2 (crc1) 0xe7c74332
(crc0)
2) 64-bit code: 0x21d064a2e7c74332 (crc0)

The second interesting thing to notice was the comparative speeds:

Win32 MingW:
1) INT64_IS_BUSTED code (32-bit):   ~57us per calculation
2) 64-bit code: ~130us per calculation

Linux (FC1):
1) INT64_IS_BUSTED code (32-bit):   ~29us per calculation
2) 64-bit code: ~76us per calculation


Looking at the flags in pg_config.h from both source builds, it would appear
that the 64-bit code is being used since the compiler defines
HAVE_LONG_LONG_INT_64. So that means there could be a potential 100%
performance increase by just using the existing INT64_IS_BUSTED code on x86
32 bit processors. I don't know given the rate of xlog records being written
whether or not this translates to more than a couple of minutes in an insert
of a million records or so.

I did have a look at the assembly code produced by the INT64_IS_BUSTED code
and it appears fairly tight; I'm no x86 expert but I think it would be hard
to optimise what was produced by the compiler. If anyone else is interested
in looking at these results then I will happily send the test programs
off-list - however my main concern is that the Win32 CRC64 results just
don't seem to be right :(


Kind regards,

Mark.


WebBased Ltd
17 Research Way
Plymouth
PL6 8BT 

T: +44 (0)1752 797131
F: +44 (0)1752 791023
W: http://www.webbased.co.uk



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] postgreSQL as deductive DBMS

2005-05-16 Thread Дмитрий Летучий
Hello all,

I have some ideas how to increase expressive power of the PostgreSQL query 
language. It is

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] postgreSQL as deductive DBMS

2005-05-16 Thread Dmitriy Letuchy
Hello all,

I have some ideas how to increase expressive power of the PostgreSQL query 
language. It is not a secret that SQL is very poor to express many important 
queries, and we have to use means of procedural extensions of SQL to realize 
them. However this is not good idea to split query language into two parts 
(declarative and procedural) at least because query language must finally 
operate with the objects of data domain, not with the bits the objects consist 
of. Thus another alternative to increase expressive power of query language is 
to develop its declarative (i.e. nonprocedural) part. And here we come to 
deductive database (DDB) with its logic language Datalog.
Every logic query is a set of inverse implications, which describe what to 
find (i.e. exactly declarative approach) and not how to find (i.e. exactly 
procedural approach). We can translate logic query into set of SQL commands and 
then run them to get result. Some samples with the DLQ compiler can be 
downloaded from www.datalab.kharkov.ua (DLQ is our original version of Datalog 
which was developed with the purpose to tie closely RDB and DDB).
Now some words about what must be done to realize described feature. The 
simple quickest way but the way without future is to write language handler. 
Other more correct way is to slightly extend DML part of SQL and more 
essentially extend DDL. For example, we have relation Inheritance with two 
attributes ClassID and ParentID. Now we want to define all descendants or all 
ancestors. For this goal we define predicate inheritance_all with the next two 
rules (i.e. inverse implications):

inheritance_all(ClassID, ParentID) :- inheritance(ClassID, ParentID);
inheritance_all(ClassID, ParentID) :- inheritance(ClassID, X), 
inheritance_all(X, ParentID).

We put this rules into database and call, for example, the next SQL 
commands:

--   find all descendents
SELECT * FROM ddb_name.inheritance_all(_, _) 

-- find all descendents from ParentID = 1
SELECT * FROM ddb_name.inheritance_all(_, 1)

where ddb_name is the name of deductive database where our rules are kept,  _ 
designates anonymous variable (see Prolog notation for details).

Regards, Dmitriy

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-16 Thread Mark Cave-Ayland

 -Original Message-
 From: Mark Cave-Ayland [mailto:[EMAIL PROTECTED] 
 Sent: 16 May 2005 09:04
 To: 'Simon Riggs'
 Cc: 'Christopher Kings-Lynne'; 'Tom Lane'; 'Bruce Momjian'; 
 'pgsql-hackers@postgresql.org'
 Subject: RE: [HACKERS] Cost of XLogInsert CRC calculations

(cut)

 The program used to time the CRC64 calculation simply did the 
 following:

(cut)

   2) Fill the buffer with some repeatable values; in this case the
   algorithm used was the following:
 
   for (i = 0 ; i  8192 ; i++ )
   {
   *data = (char )(i % 256);
   }

Sigh, so it would help if I had added the offset to the data pointer... ;)

for (i = 0 ; i  8192 ; i++ )
{
*(data + i) = (char )(i % 256);
}

This now gives the following (correct) result on both platforms:
Win32: 1.8GHz P4, WinXP
Linux: 2.8GHz Xeon, FC1


Win32 UINT64:   0x782104059a01660   (crc0)
~158us
Win32 UINT32:   0x78210405 (crc1), 0x59a01660 (crc0)~58us

FC1 UINT64: 0x782104059a01660 (crc0)
~76us
FC1 UINT32: 0x78210405 (crc1), 0x59a01660 (crc0)
~29us


Note that we still find that using the INT64_IS_BUSTED code is over 100%
quicker than the UINT64 code for CRC64 calculation, and I believe it is not
being used by default under Linux or Win32 for 32 bit platforms. Of course
Tom's suggestion of going for CRC32 across the board would hopefully solve
this entirely and bring the times down a little further too.


Kind regards,

Mark.


WebBased Ltd
17 Research Way
Plymouth
PL6 8BT 

T: +44 (0)1752 797131
F: +44 (0)1752 791023
W: http://www.webbased.co.uk



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] SO_KEEPALIVE

2005-05-16 Thread Dennis Bjorklund
How come we don't set SO_KEEPALIVE in libpq? 

Is there any reason why we wouldn't want it on?

-- 
/Dennis Björklund


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] farsi faq has not been added yet in website,

2005-05-16 Thread Mahmoud Taghizadeh
would you please fix it, why farsi faq is not in web site?With Regards,--taghi
		Discover Yahoo! 
Have fun online with music videos, cool games, IM & more. Check it out!

Re: [HACKERS] postgreSQL as deductive DBMS

2005-05-16 Thread Michael Meskes
Having written my thesis about deductive DBS I cannot resist giving my 2
cent.

On Mon, May 16, 2005 at 01:42:24PM +0400, Dmitriy Letuchy wrote:
 Now some words about what must be done to realize described feature. The 
 simple quickest way but the way without future is to write language handler. 
 Other more correct way is to slightly extend DML part of SQL and more 
 essentially extend DDL. For example, we have relation Inheritance with two 
 attributes ClassID and ParentID. Now we want to define all descendants or all 
 ancestors. For this goal we define predicate inheritance_all with the next 
 two rules (i.e. inverse implications):
 
 inheritance_all(ClassID, ParentID) :- inheritance(ClassID, ParentID);
 inheritance_all(ClassID, ParentID) :- inheritance(ClassID, X), 
 inheritance_all(X, ParentID).
 
 We put this rules into database and call, for example, the next SQL 
 commands:
 
 --   find all descendents
 SELECT * FROM ddb_name.inheritance_all(_, _) 

How do you plan to execute this statement. As you mentioned above all
logic queries can be rewritten into SQL ones, but you have to find a way
to handle recursion. I would think the best way is to add recursion to
SQL and then completely rewrite the statements. 

Also, and that's where it starts to become interesting, how do you plan
to handle negation inside recursion?

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] postgreSQL as deductive DBMS

2005-05-16 Thread Josh Berkus
Dimitry,

 Thus another alternative to increase expressive power of query language is
 to develop its declarative (i.e. nonprocedural) part. And here we come to
 deductive database (DDB) with its logic language Datalog.

You may want to look at the work of Rada Chirkova, who has already written a 
PostgreSQL-parse-tree-to-DataLog converter:
http://research.csc.ncsu.edu/selftune/Report_031005.pdf

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-16 Thread Tom Lane
Mark Cave-Ayland [EMAIL PROTECTED] writes:
 I didn't post the sources to the list originally as I wasn't sure if the
 topic were of enough interest to warrant a larger email. I've attached the
 two corrected programs as a .tar.gz - crctest.c uses uint32, whereas
 crctest64.c uses uint64.

I did some experimentation and concluded that gcc is screwing up
big-time on optimizing the CRC64 code for 32-bit Intel.  It does much
better on every other architecture though.

Here are some numbers with gcc 3.2.3 on an Intel Xeon machine.  (I'm
showing the median of three trials in each case, but the numbers were
pretty repeatable.  I also tried gcc 4.0.0 on this machine and got
similar numbers.)

gcc -O1 crctest.c   0.328571 s
gcc -O2 crctest.c   0.297978 s
gcc -O3 crctest.c   0.306894 s

gcc -O1 crctest64.c 0.358263 s
gcc -O2 crctest64.c 0.773544 s
gcc -O3 crctest64.c 0.770945 s

When -O2 is slower than -O1, you know the compiler is blowing it :-(.
I fooled around with non-default -march settings but didn't see much
change.

Similar tests on a several-year-old Pentium 4 machine, this time with
gcc version 3.4.3:

gcc -O1 -march=pentium4 crctest.c   0.486266 s
gcc -O2 -march=pentium4 crctest.c   0.520237 s
gcc -O3 -march=pentium4 crctest.c   0.520299 s

gcc -O1 -march=pentium4 crctest64.c 0.928107 s
gcc -O2 -march=pentium4 crctest64.c 1.247673 s
gcc -O3 -march=pentium4 crctest64.c 1.654102 s

Here are some comparisons showing that the performance difference is
not inherent:

IA64 (Itanium 2), gcc 3.2.3:

gcc -O1 crctest.c   0.898595 s
gcc -O2 crctest.c   0.599005 s
gcc -O3 crctest.c   0.598824 s

gcc -O1 crctest64.c 0.524257 s
gcc -O2 crctest64.c 0.524168 s
gcc -O3 crctest64.c 0.524140 s

X86_64 (Opteron), gcc 3.2.3:

gcc -O1 crctest.c   0.46 s
gcc -O2 crctest.c   0.46 s
gcc -O3 crctest.c   0.46 s

gcc -O1 crctest64.c 0.41 s
gcc -O2 crctest64.c 0.41 s
gcc -O3 crctest64.c 0.41 s

PPC64 (IBM POWER4+), gcc 3.2.3

gcc -O1 crctest.c   0.819492 s
gcc -O2 crctest.c   0.819427 s
gcc -O3 crctest.c   0.820616 s

gcc -O1 crctest64.c 0.751639 s
gcc -O2 crctest64.c 0.894250 s
gcc -O3 crctest64.c 0.888959 s

PPC (Mac G4), gcc 3.3

gcc -O1 crctest.c   0.949094 s
gcc -O2 crctest.c   1.011220 s
gcc -O3 crctest.c   1.013847 s
gcc -O1 crctest64.c 1.314093 s
gcc -O2 crctest64.c 1.015367 s
gcc -O3 crctest64.c 1.011468 s

HPPA, gcc 2.95.3:

gcc -O1 crctest.c   1.796604 s
gcc -O2 crctest.c   1.676023 s
gcc -O3 crctest.c   1.676476 s
gcc -O1 crctest64.c 2.022798 s
gcc -O2 crctest64.c 1.916185 s
gcc -O3 crctest64.c 1.904094 s

Given the lack of impressive advantage to the 64-bit code even on 64-bit
architectures, it might be best to go with the 32-bit code everywhere,
but I also think we have grounds to file a gcc bug report.

Anyone want to try it with non-gcc compilers?  I attach a slightly
cleaned-up version of Mark's original (doesn't draw compiler warnings
or errors on what I tried it on).

regards, tom lane



bin54DRmb8Gwp.bin
Description: crctest.tar.gz

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] SO_KEEPALIVE

2005-05-16 Thread Tom Lane
Dennis Bjorklund [EMAIL PROTECTED] writes:
 How come we don't set SO_KEEPALIVE in libpq? 
 Is there any reason why we wouldn't want it on?

Is there any reason we *would* want it on?  The server-side keepalive
should be sufficient to get whatever useful impact it might have.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Best way to scan on-disk bitmaps

2005-05-16 Thread Greg Stark


Tom Lane [EMAIL PROTECTED] writes:

 I think it would be easy to change the planner and btree to handle
 this (where easy means I remember where all the skeletons are
 buried).  But I don't know the gist code hardly at all.  Can anyone
 offer an informed opinion on whether gist can handle this now, and
 if not what it would take to handle it?

Currently gist indexes only use the first column for page splits, making
multi-key gist indexes basically useless. The problem is it's hard to imagine
an API for a pickSplit function that could handle multi-key indexes with
disparate data types and operator classes. I had an idea of a new way to deal
with gist indexes that simplified the API and side-stepped the whole issue but
you raised concerns that it might be too limiting.

Unfortunately the mailing list archive seems to have missed this discussion.
I've attached three messages from the discussion at the time.

---BeginMessage---
Oleg  Teodor,
If I understand the code correctly, GiST will only pass the first 
attribute of each index tuple to the user-defined PickSplit method when 
it wants to split a node. (see circa line 1269 of gist.c)

Is this a wise design decision? Granted, in many situations the first 
attribute in the index is sufficient to make a reasonable decision about 
how to divide the node into two halves, but I don't think that is 
universally true. For example, consider a two column index whose first 
attribute has a small number of distinct values. It could well be that 
all the first attribute values in a node-to-be-split would be the same. 
Only passing the first attribute to PickSplit would result in an 
essentially random distribution of tuples among the split nodes, rather 
than allowing the GiST extension to make use of the second attribution 
to partition the nodes. That's an extreme example, but it is easy to 
construct more realistic scenarios (basically, any situation in which 
the cardinality of the first index attribute is low -- a reasonably 
common occurrence with a multi-column index, I believe).

I'm not sure whether this would be a problem in practice. Speculation: 
repeated invocations of PickSplit are one of the main factors in 
deriving the ultimate shape of the GiST tree. Poor distribution of keys 
resulting from PickSplit would eventually result in unnecessarily loose 
bounding predicates in internal nodes, which would hurt performance.

Comments welcome,
Neil
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

---End Message---
---BeginMessage---
Greg Stark [EMAIL PROTECTED] writes:

 I'm not sure that GiST indexes behave the same way as btree indexes for the
 multi-column case. 
 
 In a btree index the second column is entirely subordinate to the first
 column. In a GiST index the data is multi-dimensional, and all dimensions are
 equally important.

In fact on further consideration I do have a proposal.

If you look at the GiST implementations for various index types you'll see
that many (all?) take the same approach for PickSplit. In fact they pretty
much all have the same code copy/pasted to handle it.

The approach they take is to have a function which calculates an abstract
distance between any two entries. There's an algorithm that they use to pick
the split based on this distance function.

If you abandoned PickSplit and instead exposed this distance function as the
external API then the behaviour for multi-column indexes is clear. You
calculate the distance along all the axes and calculate the diagonal distance.

I think abandoning PickSplit for the distance function might also mean you
don't need a separate function for Penalty either, but I'm not sure on that.

-- 
greg



---End Message---
---BeginMessage---
Greg Stark [EMAIL PROTECTED] writes:
 The approach they take is to have a function which calculates an
 abstract distance between any two entries. There's an algorithm that
 they use to pick the split based on this distance function.

 If you abandoned PickSplit and instead exposed this distance
 function as the external API then the behaviour for multi-column
 indexes is clear. You calculate the distance along all the axes and
 calculate the diagonal distance.

Hmm ... the problem with that is the assumption that different opclasses
will compute similarly-scaled distances.  If opclass A generates
distances in the range (0,1e6) while B generates in the range (0,1),
combining them with Euclidean distance won't work well at all.  OTOH you
can't blindly normalize, because in some cases maybe the data is such
that a massive difference in distances is truly appropriate.

I'm also a bit leery of the assumption that every GiST application can
reduce its PickSplit logic to Euclidean distances.

regards, tom lane


---End Message---



 (hash and rtree are not at issue since they don't support multi-key
 indexes.)

It seems like it would be trivial to make 

Re: [HACKERS] pgFoundry

2005-05-16 Thread Lamar Owen
On Saturday 07 May 2005 16:23, Joshua D. Drake wrote:
  What does it mean to track the status of something? How would the
  status change except by discussion? What would be the point of announcing
  the status of something without allowing people to comment?

 No one said anything about not letting people comment or discuss. What I
 am suggesting is a better public presentation of what the heck is going
 on with PostgreSQL development.

This has been a perennial problem and has existed at least since version 
6.1.0, as that's when I first noticed it.  Looking at the website alone 
doesn't show the dynamic development that is actually happening, and has 
never shown it.  I very nearly passed PostgreSQL by because it looked 
unmaintained at the time from looking at the website.  That cannot be said at 
this juncture, but at the same time that is not real indication of how 
dynamic things really are.

From a sidelines point of view, a developer status summary page would allow 
one to follow development without having to read every message in HACKERS.  
At this point in my work, I am unable to follow development like I once did 
(one reason I stepped down as RPM maintainer) and have no real idea of the 
direction for 8.1 as a result.

To put it much more bluntly:  PostgreSQL development (both the process and the 
codebase) has one of the steepest learning curves around, steeper even than 
Plone, which is acknowledged by many to have an incredibly steep learning 
curve.

The only way in my mind to get this dynamism on the website is to make the 
website part of the process at some level.  If someone has to go through and 
digest for the website (hacker-bits, a la general bits) then that takes away 
developer resources (because someone has to be fairly close to the 
development process to do that sort of digestion).  Rather, if developers are 
using a system that automatically pushes to a web interface (or even uses a 
web interface with a cli component) then the status is automatically 
generated and the work of updating status is distributed.

  I think you have a severely flawed idea of how free software development
  proceeds.

 Then you obviously aren't paying attention. Look at other major OSS
 projects. They have these things in place. Even the Linux kernel has a
 bugzilla (although I am not advocating bugzilla). Not to mention KDE,
 Gnome, Debian..

 These projects also have reasonably defined milestones for particular
 releases and show status of those milestones during the release.

Virtually all OSS projects I am involved with publish a generalized road map 
online.   Some are more organized than others.

PostgreSQL has a different culture, this is true.  But it is somewhat of an 
intimidating culture for an outsider; once 'inside' (which takes 6 months or 
more unless you're another Tom Lane (I  love going back and reading the way 
he just 'jumped in' to the project)) this is one of the friendliest 
development communities around.  One might say the high bar of entry and the 
steep learning curve is partly the reason for that; culture changes take 
careful thought to implement, and a web-published development might easily 
change the whole culture of the project.

The biggest flamewars I have seen here involve one of the following topics:
1.) GPL vs BSD
2.) MySQL
3.) Multithreaded versus multiprocess.

Most other communities (with the notable exception of the GNUradio community, 
which is even more polite than this one, something I thought was not 
possible) have many more hot button topics than three.

Like any other management process (sorry, those of you who think OSS means 'no 
management' - there is management here) the PostgreSQL process is unique due 
to the unique collection of members, and what works for one community won't 
necessarily work for another.

Having said that, I'd love an 'at-a-glance' development status showing, 
possibly in graphical terms, where each subproject of the PostgreSQL core is 
at right now, updated frequently, as I've changed into more of a user role 
than a developer role; I can see the forest now, instead of all the RPM 
bushes and prairie grass.
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] SO_KEEPALIVE

2005-05-16 Thread Dennis Bjorklund
On Mon, 16 May 2005, Tom Lane wrote:

  How come we don't set SO_KEEPALIVE in libpq? 
  Is there any reason why we wouldn't want it on?
 
 Is there any reason we *would* want it on?  The server-side keepalive
 should be sufficient to get whatever useful impact it might have.

Wouldn't the client also want to know that the server is not there
anymore? I talked to Gaetano Mendola (I think, but you never know on irc
:-) and he had some clients that had been hanging around for 3 days after
the server had been down and later up again (stuck in recv).

Server-side keepalive is enough for the server to clean up when clients
disapears, but this do nothing to help clients detect that the server is
gone. So I don't see what server side keepalive has to do with it.

-- 
/Dennis Björklund


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] alternate regression dbs?

2005-05-16 Thread Joe Conway
Tom Lane wrote:
Andrew Dunstan [EMAIL PROTECTED] writes:
Try attached ... season to taste. The bulk of it is changes for dblink 
which has the dbname hardcoded.
Joe, any objections here?
Hmm, I can't find the message with the attachment, in my inbox or in the 
list archives. Can anyone point me to it?

Joe
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] pgFoundry

2005-05-16 Thread Andrew Dunstan

Lamar Owen wrote:
Look at other major OSS
projects. They have these things in place. Even the Linux kernel has a
bugzilla (although I am not advocating bugzilla). Not to mention KDE,
Gnome, Debian..
   

 

These projects also have reasonably defined milestones for particular
releases and show status of those milestones during the release.
   

Virtually all OSS projects I am involved with publish a generalized road map 
online.   Some are more organized than others.

PostgreSQL has a different culture, this is true. 
 

I don't think anybody is arguing for a radical change in culture - 
certainly I would not be so presumptuous after only a couple of years 
:-)  But a roadmap could be useful in many ways. It need not tie anybody 
down, if positioned right, but can help people to see where things are 
going, and where the gaps are. This could in a sense be as simple as 
prioritising the TODO list. Right now anybody who wants to contribute 
and looks at the list has no idea if the item is considered important or 
even if it is still thought to be desirable. There are many changes that 
can be rung on this theme - you would probably want to keep the roadmap 
process as light as possible for the cultural reasons you mention.

cheers
andrew

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-16 Thread Mark Cave-Ayland
Hi Tom,

I didn't post the sources to the list originally as I wasn't sure if the
topic were of enough interest to warrant a larger email. I've attached the
two corrected programs as a .tar.gz - crctest.c uses uint32, whereas
crctest64.c uses uint64.


Kind regards,

Mark.


WebBased Ltd
17 Research Way
Plymouth
PL6 8BT 

T: +44 (0)1752 797131
F: +44 (0)1752 791023
W: http://www.webbased.co.uk
 

 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED] 
 Sent: 16 May 2005 15:01
 To: Mark Cave-Ayland (External)
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Cost of XLogInsert CRC calculations
 
 
 Mark Cave-Ayland [EMAIL PROTECTED] writes:
  Sigh, so it would help if I had added the offset to the data 
  pointer... ;)
 
 Would you post the corrected program so people can try it on 
 a few other architectures?  No point in reinventing the 
 wheel, even if it is a pretty trivial wheel.
 
   regards, tom lane
 


crctest.tar.gz
Description: Binary data

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] bitmap scans, btree scans, and tid order

2005-05-16 Thread Jeffrey W. Baker
On Mon, 2005-05-16 at 09:53 -0400, Tom Lane wrote:
 Jeffrey Baker [EMAIL PROTECTED] writes:
  Change the planner/executor to use the bitmap scan in all cases where 
  index order is unimportant.  From my reading of the current code, the 
  bitmap scan is only used in case of a join.
 
 This is a fallacy, and I think your concern is largely mistaken.  Have
 you experimented with the cases you are worried about?

Perhaps I have not stated the problem clearly.  Believe me, I have
experimented extensively with this problem.  So here's the statement of
the problem: during an index scan, the heap is visited in index order,
which can be and frequently is random order on the disk.  An index scan
that currently takes 15 minutes on my database takes only 5 when the
tuples are fetched in strict disk order, and takes 8 minutes if the
tuples are fetched in disk order in groups of 1000.  The problem exists
when the scan is expected to return a lot of tuples, but the planner
chooses an index scan anyway.  In many cases, sequential scan would be
faster than index scan + random heap visits.

 It's entirely possible that the current cost model needs adjustment to
 make the planner pick the bitmap scan in more cases.  However, it is
 easy to demonstrate (either by thought-experiment or actual trial) that
 a bitmap scan isn't necessarily a win.

I agree.  There's certainly a threshold below which the bitmap would not
make sense.  It's also possible that changing the btree scan to work in
groups of tuples instead of single tuples would make more sense, which
is why I ventured two different solution to the problem.

  The overhead of maintaining the
 bitmap is far from zero, and you don't get anything unless the resulting
 pattern of heap page fetches is significantly cleaner than before.

Bitmaps scans naturally come back in TID order.  I realize this isn't
1:1 correspondence with disk order, but it's a good start.  I also like
the way the index scan and heap scan are decoupled in the bitmap code.
It leaves room for more serious optimization of disk access, which is
relatively difficult in the synchronous, one-at-a-time btree code.

 Therefore, a patch that eliminates cost-estimation in favor of trying to
 force one choice or the other is definitely not likely to be received
 favorably ...

That's not the idea.

-jwb



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] pgFoundry

2005-05-16 Thread Josh Berkus
Andrew,

 down, if positioned right, but can help people to see where things are
 going, and where the gaps are. This could in a sense be as simple as
 prioritising the TODO list. Right now anybody who wants to contribute
 and looks at the list has no idea if the item is considered important or
 even if it is still thought to be desirable. There are many changes that
 can be rung on this theme - you would probably want to keep the roadmap
 process as light as possible for the cultural reasons you mention.

The substantial problem here is that nobody *wants* to create a roadmap type 
document.   If you can find a volunteer, it'd be worth discussing -- I can 
see a way we can make a roadmap without being deceptive about how we get 
features.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] pgFoundry

2005-05-16 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 I don't think anybody is arguing for a radical change in culture - 
 certainly I would not be so presumptuous after only a couple of years 
 :-)  But a roadmap could be useful in many ways. It need not tie anybody 
 down, if positioned right, but can help people to see where things are 
 going, and where the gaps are. This could in a sense be as simple as 
 prioritising the TODO list.

I think that even getting that done would turn into a flamew^H^H^H^Hhuge
distraction.  The way things really work around here is that individual
developers have their own priorities and they work on what seems most
important to them at the time.  (In some cases those priorities may be
set by their companies more than by the individuals, but that's
irrelevant from the community's perspective.)  ISTM any sort of
project-wide prioritization would be either (1) meaningless or (2) a
guaranteed-to-fail attempt to assert control over other contributors.

But the TODO list could certainly be made more informative without
getting into that swamp.  I don't think it does very well at conveying
the relative sizes of the work items, nor the extent to which there is
consensus about how particular problems ought to be solved (the fact
that something is on TODO does not necessarily mean that all the major
contributors have bought into it...).  And of course you're right that
it tells nothing at all about whether progress is currently being made
on a given item.  The markers indicating that someone has expressed
interest in an item don't mean they are actively doing anything with it.

The real difficulty here is exactly what Lamar noted: who's going to do
the work?  Bruce seems to be swamped already, so we'd need a new
volunteer to maintain a more useful TODO list, and there doesn't seem
to be anyone who wants to do it and has the depth of familiarity with
the project to do a good job of it.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] pgFoundry

2005-05-16 Thread Robert Treat
On Mon, 2005-05-16 at 12:50, Lamar Owen wrote:
 From a sidelines point of view, a developer status summary page would allow 
 one to follow development without having to read every message in HACKERS.  
 At this point in my work, I am unable to follow development like I once did 
 (one reason I stepped down as RPM maintainer) and have no real idea of the 
 direction for 8.1 as a result.

I don't think anyone is against this idea, but as of yet no one has
stepped forward with the time to keep such a thing updated. 

 The only way in my mind to get this dynamism on the website is to make the 
 website part of the process at some level.  If someone has to go through and 
 digest for the website (hacker-bits, a la general bits) then that takes away 
 developer resources (because someone has to be fairly close to the 
 development process to do that sort of digestion).  Rather, if developers are 
 using a system that automatically pushes to a web interface (or even uses a 
 web interface with a cli component) then the status is automatically 
 generated and the work of updating status is distributed.

One idea I've tossed around is requiring patches to include release
notes, and then display the release notes on the web site as a done so
far type of list. It doesn't get you what is under active development,
but would get you a more up-to-date picture of changes as a release
evolves. 


Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] alternate regression dbs?

2005-05-16 Thread Andrew Dunstan

Joe Conway wrote:
Tom Lane wrote:
Andrew Dunstan [EMAIL PROTECTED] writes:
Try attached ... season to taste. The bulk of it is changes for 
dblink which has the dbname hardcoded.

Joe, any objections here?
Hmm, I can't find the message with the attachment, in my inbox or in 
the list archives. Can anyone point me to it?

my fault - I sent the original to the wrong list - meanwhile, Tom, who 
was copied on the original, replied to that :-)

Anyway, see 
http://archives.postgresql.org/pgsql-patches/2005-05/msg00179.php

cheers
andrew

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[HACKERS] Returning the name of a primary key

2005-05-16 Thread Juan Pablo Espino
Hello all

I need to write a function that retrieve the name of at least one
table primary key, if it exists.  The only argument passed to the
function is the table name.  I have thought something like this:

char *
give_pkey(char * table_char)
TupleDesc   tupdesc;
Form_pg_attribute   att;
Form_pg_index   ind;
int i, c=0, temp=-1;

tupdesc = (TupleDesc) RelationNameGetTupleDesc(tabla_char);

ind = something that idicates which table is
for (i=0; i(tupdesc-natts); i++)
{
att = tupdesc-attrs[i];
c = c + 1;

/* Something that can compare each attribute to determine
if it is a primary key ?*/
if ((ind-indisprimary)  (temp=-1))
{
 temp = c;
 att = tupdesc-attrs[temp];
 }
  }
  return pstrdup(NameStr(att-attname));
}

Sorry, I don't have much experience in c programming, thanks in
advance for any suggestions, regards,

Juan P. Espino

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] bitmap scans, btree scans, and tid order

2005-05-16 Thread Hannu Krosing
On P, 2005-05-15 at 23:58 -0700, Jeffrey Baker wrote:

 I'm considering one of the following courses of action:
 
 Change nodeIndexscan.c to call index_getmulti, and to handle multiple 
 tuples returned.  That code would sort the tuple array and store the 
 tuples in the result in disk order.
 
 -or-
 
 Change the planner/executor to use the bitmap scan in all cases where 
 index order is unimportant.  From my reading of the current code, the 
 bitmap scan is only used in case of a join.

I think the 2nd option is probably the way to go.  Probably not _all_
cases - it's probably cheper to not build a bitmap for cases where the
index returns only a few (or just one) rows.

OTOH, some scheme, where you fill sort_mem-sized memory structure with
tuples, which are fetched in heap order but stored in that structure in
index order could also be an interesting optimisastion for sort-order
preserving btree-index scans. this could also be used for bigger sort as
first step by saving each chunk to disk and then merging them back into
result.

in pseudocode one iteretion of this could go like this

TIDARRAY = ARRAY[1000]
I = 0
FOR TID in FETC_FROM_BTREE(0, 1000):
ARRAY [I] := (TID, I)
SORT_ARRAY_ON_TID()# this must be much faster than sorting
   # whole tuples for this method to be a win
   # over bitmap_scan + sort.
   # using some self-ordering structure like
   # btree/rbtree for TIDARRAY is also an option
OUTARRAY = ARRAY[1000]
FOR (TID, I) IN TIDARRAY:
OUTARRAY[I] = FETCH_FROM_HEAP(TID)
RETURN OUTARRAY

This can probably not use bitmap scan code, but has the nice property of
doing the disk accesses in file order but still returning the result in
index order.

-- 
Hannu Krosing [EMAIL PROTECTED]


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] bitmap scans, btree scans, and tid order

2005-05-16 Thread Tom Lane
Jeffrey W. Baker [EMAIL PROTECTED] writes:
 On Mon, 2005-05-16 at 09:53 -0400, Tom Lane wrote:
 This is a fallacy, and I think your concern is largely mistaken.  Have
 you experimented with the cases you are worried about?

 Perhaps I have not stated the problem clearly.  Believe me, I have
 experimented extensively with this problem.

Sorry, perhaps I wasn't clear: have you experimented *with CVS tip*?
The current code is certainly capable of choosing either seqscan,
bitmap scan, or traditional index scan depending on the given query.
For example,

regression=# explain analyze select * from tenk1 where unique1 between 100 and 
1000;
QUERY PLAN
--
 Bitmap Heap Scan on tenk1  (cost=9.58..381.53 rows=930 width=244) (actual 
time=6.185..18.034 rows=901 loops=1)
   Recheck Cond: ((unique1 = 100) AND (unique1 = 1000))
   -  Bitmap Index Scan on tenk1_unique1  (cost=0.00..9.58 rows=930 width=0) 
(actual time=4.522..4.522 rows=901 loops=1)
 Index Cond: ((unique1 = 100) AND (unique1 = 1000))
 Total runtime: 23.784 ms
(5 rows)

regression=# explain analyze select * from tenk1 where unique2 between 100 and 
1000;
 QUERY PLAN
-
 Index Scan using tenk1_unique2 on tenk1  (cost=0.00..45.88 rows=805 width=244) 
(actual time=0.154..14.856 rows=901 loops=1)
   Index Cond: ((unique2 = 100) AND (unique2 = 1000))
 Total runtime: 20.331 ms
(3 rows)

(The reason these apparently-identical queries get different plans is
that the unique2 column is physically ordered, so the plain indexscan
gets a very high correlation discount.)  There are enable_foo switches
to let you force selection of any one of the three ways for testing
purposes.

 It's also possible that changing the btree scan to work in
 groups of tuples instead of single tuples would make more sense, which
 is why I ventured two different solution to the problem.

My feeling is that that would add a lot of complexity for dubious win.
The reason it's dubious is that it would penalize some cases, for
instance LIMIT-type queries where you aren't going to fetch many tuples
anyway.  I think that at least for the time being (8.1 time frame) we
should leave traditional indexscans alone and concentrate on being sure
we are getting the most we can out of the new bitmap scan code.  Only
after that dust has settled will we have hard facts with which to argue
about whether compromise behaviors might be wins.

I think the work that's most needed at the moment is to test the
bitmap-scan cost model to see if it has much to do with reality...

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Returning the name of a primary key

2005-05-16 Thread Alvaro Herrera
On Mon, May 16, 2005 at 01:25:46PM -0500, Juan Pablo Espino wrote:

 I need to write a function that retrieve the name of at least one
 table primary key, if it exists.  The only argument passed to the
 function is the table name.  I have thought something like this:

Why mess around with a C function, if you can do it in straight SQL?
You can write a SQL function if need be.

-- 
Alvaro Herrera (alvherre[a]surnet.cl)
Voy a acabar con todos los humanos / con los humanos yo acabaré
voy a acabar con todos / con todos los humanos acabaré (Bender)

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] bitmap scans, btree scans, and tid order

2005-05-16 Thread Jeffrey W. Baker
On Mon, 2005-05-16 at 14:35 -0400, Tom Lane wrote:
 Jeffrey W. Baker [EMAIL PROTECTED] writes:
  It's also possible that changing the btree scan to work in
  groups of tuples instead of single tuples would make more sense, which
  is why I ventured two different solution to the problem.
 
 My feeling is that that would add a lot of complexity for dubious win.
 The reason it's dubious is that it would penalize some cases, for
 instance LIMIT-type queries where you aren't going to fetch many tuples
 anyway.  I think that at least for the time being (8.1 time frame) we
 should leave traditional indexscans alone and concentrate on being sure
 we are getting the most we can out of the new bitmap scan code.  Only
 after that dust has settled will we have hard facts with which to argue
 about whether compromise behaviors might be wins.

I agree.  I'll look at how my workload behaves with CVS code.  I wasn't
proposing this for 8.1 inclusion, and the TODO isn't marked for 8.1.

 I think the work that's most needed at the moment is to test the
 bitmap-scan cost model to see if it has much to do with reality...

Alright.

-jwb

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] pgFoundry

2005-05-16 Thread Lamar Owen
On Monday 16 May 2005 14:12, Robert Treat wrote:
 On Mon, 2005-05-16 at 12:50, Lamar Owen wrote:
  The only way in my mind to get this dynamism on the website is to make
  the website part of the process at some level.  If someone has to go

 One idea I've tossed around is requiring patches to include release
 notes, and then display the release notes on the web site as a done so
 far type of list. It doesn't get you what is under active development,
 but would get you a more up-to-date picture of changes as a release
 evolves.

Well, there is always the '-committers' list; if all CVS commits had/have 
meaningful commit changelog notices, then that could drive something.

There are two things being talked about here:
1.) A forward-looking rad map;
2.) A status indication of where development is happening, and a history of 
past development.

In my mind 2 is more important than 1, for all the reasons Tom already 
mentioned.
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] pgFoundry

2005-05-16 Thread Tom Lane
Robert Treat [EMAIL PROTECTED] writes:
 One idea I've tossed around is requiring patches to include release
 notes, and then display the release notes on the web site as a done so
 far type of list. It doesn't get you what is under active development,
 but would get you a more up-to-date picture of changes as a release
 evolves. 

We did do that (not very rigorously) during the 7.4 release cycle.
I'm not sure why we fell out of the habit again for 8.0.  It seems
like a reasonable idea to me.  I don't think I'd necessarily do it
exactly the way it was done before though --- rather than keep the
info in release.sgml, which is big and hard to edit already, maybe
a separate plain-text file would be easier to work with.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Returning the name of a primary key

2005-05-16 Thread Tom Lane
Juan Pablo Espino [EMAIL PROTECTED] writes:
 I need to write a function that retrieve the name of at least one
 table primary key, if it exists.  The only argument passed to the
 function is the table name.  I have thought something like this:

You need to be searching the list of indexes, not the attributes per se.
ATExecDropNotNull() might be a useful example.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] pgFoundry

2005-05-16 Thread Lamar Owen
On Monday 16 May 2005 14:07, Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  This could in a sense be as simple as
  prioritising the TODO list.

 But the TODO list could certainly be made more informative without
 getting into that swamp. 

We need to prune the TODO list to make it more useful.  This will be hard, 
this is true, but if a pruning discussion for each item on the list is held 
then the real interest in the item would stick out like a sore thumb.  It's 
just too big and not really hierarchical.

Are we too close to freeze and beta on 8.1 to have this sort of discussion?  
It doesn't need to be discussed close to a beta cycle, IMO, or it could 
easily turn into the huge distraction Tom speaks of.
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-16 Thread Simon Riggs
On Mon, 2005-05-16 at 12:12 +0100, Mark Cave-Ayland wrote:
 This now gives the following (correct) result on both platforms:
   Win32: 1.8GHz P4, WinXP
   Linux: 2.8GHz Xeon, FC1
 
 
   Win32 UINT64:   0x782104059a01660   (crc0)
 ~158us
   Win32 UINT32:   0x78210405 (crc1), 0x59a01660 (crc0)~58us
 
   FC1 UINT64: 0x782104059a01660 (crc0)
 ~76us
   FC1 UINT32: 0x78210405 (crc1), 0x59a01660 (crc0)
 ~29us
 
 
 Note that we still find that using the INT64_IS_BUSTED code is over 100%
 quicker than the UINT64 code for CRC64 calculation, and I believe it is not
 being used by default under Linux or Win32 for 32 bit platforms. Of course
 Tom's suggestion of going for CRC32 across the board would hopefully solve
 this entirely and bring the times down a little further too.

I think perhaps that the difference in hardware is the reason for the
difference in elapsed time, not the OS.

The performance gain is disturbing. I think its supposed to be the other
way around isn't it? Like having INT64 is supposed to be good...

Perhaps the BIOS on your systems don't correctly support 64-bit, so
mimicking it costs more. 

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] keepalive

2005-05-16 Thread Gaetano Mendola
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi all,
this weekend our DB server crashed ( 7.4.x ) and the DB engine was relocated
on another server. So far so well.

Unfortunatelly most of our clients are still on a recv trying to
recv data from the old DB engine instance:

#  netstat -anp | grep ESTABLISHED | grep 54497
tcp0  0 193.251.135.46:5449710.10.50.47:5432ESTABLISHED 
27331/psql

# ps -eafwww | grep 27331
apache   27331 27327  0 May13 ?00:00:00 psql -t -c SELECT 
is_authorized_flag FROM v_ts_quota WHERE login='*' -h x.y.z.w  yyy


as you can see this connection is there since 3 days now.

Digging on it I see these sockets are not using the keepalive options:


netstat -anop | grep ESTABLISHED | grep 54497
tcp0  0 193.251.135.46:5449710.10.50.47:5432ESTABLISHED 
27331/psql  off (0.00/0/0)

normaly on other connections I see:

tcp0  0 127.0.0.1:199   127.0.0.1:32784 ESTABLISHED 
1255/snmpd  keepalive (7102.52/0/0)


The same happens with JDBC connections ( the keepalive missing ).


There is a reason for this or a way to instruct psql to use the keepalive 
option ?


Regards
Gaetano Mendola




-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCiJNF7UpzwH2SGd4RAh8aAJ0WOBEzzjYf1gj1OaFGsFBE9mr4hgCfUhna
D1F420Pa94lvrA04xA73tiE=
=muzn
-END PGP SIGNATURE-


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-16 Thread Hannu Krosing
On E, 2005-05-16 at 12:12 +0100, Mark Cave-Ayland wrote:
  -Original Message-
  From: Mark Cave-Ayland [mailto:[EMAIL PROTECTED] 
  Sent: 16 May 2005 09:04
  To: 'Simon Riggs'
  Cc: 'Christopher Kings-Lynne'; 'Tom Lane'; 'Bruce Momjian'; 
  'pgsql-hackers@postgresql.org'
  Subject: RE: [HACKERS] Cost of XLogInsert CRC calculations
 

 
 Win32 UINT64: 0x782104059a01660 (crc0) ~158us
 Win32 UINT32: 0x78210405 (crc1), 0x59a01660 (crc0)  ~58us
 FC1 UINT64:   0x782104059a01660 (crc0)  ~76us
 FC1 UINT32:   0x78210405 (crc1), 0x59a01660 (crc0)  ~29us

It would interesting to see how much of the time taken is actual algorithm 
and how much is getting at the data (i.e the fact that the page has to go 
through CPU at all, instead DMA'ing it to disk). 

In your test setup you probably have the whole thing in CPU cache already, 
but still it would be interesting to time the same thing with some simple 
algorithm, like XOR over the whole 8k page, for comparison.

-- 
Hannu Krosing [EMAIL PROTECTED]


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Best way to scan on-disk bitmaps

2005-05-16 Thread Teodor Sigaev
About page splitting algorithm in GiST in multikey case. For the beginning, page 
is splitted by calling pickSplit method of key of first column (pickSplit method 
is defined for opclass and it is a user function), then it try to find equal 
values  of first column in left and right pages ( gist.c lines 1264-1901 ). If 
it is, then GiST core will try to resort tuples with first equal keys between 
left and right pages using penalty method for second and higher column's key. If 
it's not, it leave pages untouched. But unions for parent page of second and 
higher column's keys will be formed.

So, if index is defined as 'using gist (a,b,c)' then, in principle, GiST index 
can speed up queries like 'a=V1 and c=V2'.  But it will not usable for queries 
( b=V3 and c=V2 ). By the way, instead of '=' operation may be used other 
operations. Number of supported operations by GiST is indefinite unlike, for 
example, btree which supported only five: , =, =, =, .


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] bitmap scans, btree scans, and tid order

2005-05-16 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes:

 Jeffrey W. Baker [EMAIL PROTECTED] writes:
  I see that Tom has already done the infrastructure work by adding
  getmulti, but getmulti isn't used by nodeIndexscan.c, only
  nodeBitmapIndexscan.c.  Will btree index scans be executed by creating
  in-memory bitmaps in 8.1, or will some scans still be executed the usual
  way?
 
 We aren't going to remove the existing indexscan behavior, because
 bitmap scans lose the ordering of the underlying index.  There are many
 situations where that ordering is important.  (See for instance the
 recent changes to make MAX/MIN use that behavior.)

Hm. There are other circumstances where the ordering doesn't matter. When
there's another unrelated ORDER BY clause or merge join wrapped around the
index scan for example.

This suggests one new 8.1 optimization strategy may be to add strategic no-op
OR clauses to cause 8.1 to use a bitmapOr node.

For example something like this where flag isn't very selective (say 25%)
might run more slowly than a sequential scan because of the random access
pattern.

SELECT id,name,flag
  FROM tab
 WHERE flag
 ORDER BY name

But adding a no-op bitmapOr node like:

SELECT id,name,flag
  FROM tab
 WHERE flag
OR indexed_never_true_flag
 ORDER BY name

Might run faster, perhaps even more quickly than the sequential scan because
the bitmap avoids the random access pattern but doesn't have to read the whole
table.

-- 
greg


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] bitmap scans, btree scans, and tid order

2005-05-16 Thread Mike Rylander
On 5/16/05, Tom Lane [EMAIL PROTECTED] wrote:
 regression=# explain analyze select * from tenk1 where unique1 between 100 
 and 1000;
 QUERY PLAN
 --
  Bitmap Heap Scan on tenk1  (cost=9.58..381.53 rows=930 width=244) (actual 
 time=6.185..18.034 rows=901 loops=1)
Recheck Cond: ((unique1 = 100) AND (unique1 = 1000))
-  Bitmap Index Scan on tenk1_unique1  (cost=0.00..9.58 rows=930 width=0) 
 (actual time=4.522..4.522 rows=901 loops=1)
  Index Cond: ((unique1 = 100) AND (unique1 = 1000))
  Total runtime: 23.784 ms
 (5 rows)
 
 regression=# explain analyze select * from tenk1 where unique2 between 100 
 and 1000;
  QUERY PLAN
 -
  Index Scan using tenk1_unique2 on tenk1  (cost=0.00..45.88 rows=805 
 width=244) (actual time=0.154..14.856 rows=901 loops=1)
Index Cond: ((unique2 = 100) AND (unique2 = 1000))
  Total runtime: 20.331 ms
 (3 rows)
 

Tom (or anyone with some round tuits and CVS-tip savy), if you have a
chance at some point would you post the non-bitmap version of the
query for tenk2 from above?  I'd be very interested to see if the
bitmap forced TID order fetch actually does help, or if it's
outweighed by the bitmap setup overhead.

TIA

-- 
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] SQL Request Size

2005-05-16 Thread Simon Riggs

What is the maximum SQL statement length that the server will accept?

The libpq message length identifier is 4 bytes, which indicates that the
max length is 4GB. But thats not exactly the same thing...

Most other systems have a SQL request size limit much smaller than this,
though I can't find reference to this. 

Thanks,

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-16 Thread Tom Lane
Mark Cave-Ayland [EMAIL PROTECTED] writes:
 Sigh, so it would help if I had added the offset to the data pointer... ;)

Would you post the corrected program so people can try it on a few other
architectures?  No point in reinventing the wheel, even if it is a
pretty trivial wheel.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] bitmap scans, btree scans, and tid order

2005-05-16 Thread Tom Lane
Jeffrey Baker [EMAIL PROTECTED] writes:
 Change the planner/executor to use the bitmap scan in all cases where 
 index order is unimportant.  From my reading of the current code, the 
 bitmap scan is only used in case of a join.

This is a fallacy, and I think your concern is largely mistaken.  Have
you experimented with the cases you are worried about?

It's entirely possible that the current cost model needs adjustment to
make the planner pick the bitmap scan in more cases.  However, it is
easy to demonstrate (either by thought-experiment or actual trial) that
a bitmap scan isn't necessarily a win.  The overhead of maintaining the
bitmap is far from zero, and you don't get anything unless the resulting
pattern of heap page fetches is significantly cleaner than before.
Therefore, a patch that eliminates cost-estimation in favor of trying to
force one choice or the other is definitely not likely to be received
favorably ...

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] SQL Request Size

2005-05-16 Thread Hannu Krosing
On E, 2005-05-16 at 21:18 +0100, Simon Riggs wrote:
 What is the maximum SQL statement length that the server will accept?
 
 The libpq message length identifier is 4 bytes, which indicates that the
 max length is 4GB. But thats not exactly the same thing...
 
 Most other systems have a SQL request size limit much smaller than this,
 though I can't find reference to this. 

I've had problems with a query generated by slony that was a few hundred
kilobytes in size. It gave me query too complex error, probably
overflow somewhere in planner/optimiser.

-- 
Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] performance of bitmap scans in nested loop joins

2005-05-16 Thread Tom Lane
I wrote:
 ... Where we are losing is mostly on the actual manipulation
 of the bitmaps (particularly hash_seq_search which is done in
 tbm_begin_iterate; and it looks like memory allocation for the bitmap
 hashtables is nontrivial too).  I had already had a TODO item to look
 into speeding up hash_seq_search ... will see what I can find.

I got around to looking at this some more.  It seems that the basic
problem is that dynahash.c isn't optimized for very small hashtables.
The test case I'm looking at (which may be an oversimplification of
Sergey's problem) never has more than one entry in the hashtables it
creates for in-memory bitmaps, and so the hashtable overhead is pretty
significant.  Particularly the overhead of creating and deleting 'em.

The other uses of dynahash.c that are at all performance-critical seem
to deal with hashtables that are (a) fairly large and (b) long-lived.
So I'm thinking that hacking dynahash.c itself to improve this situation
would probably be counterproductive overall.

An idea that comes to mind is to allow tidbitmap.c to handle the cases
of zero or one page entry directly, storing the page entry in a simple
field of the TIDBitmap struct.  Only when the bitmap needs entries for
more than one heap page will we create a subsidiary hash table.  (Note
that we could store bits for more than one tuple, if they happen to be
on the same heap page.)

This would uglify the logic in tidbitmap.c a bit, but it doesn't seem
completely preposterous.  The main objection I can see is that it would
only optimize the zero-or-one-page cases, which might be insufficient.

Anyone have a better idea for speeding up operations on small bitmaps?

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] SQL Request Size

2005-05-16 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 What is the maximum SQL statement length that the server will accept?

There is no fixed limit, short of where you start to overflow memory
and/or stack depth.
http://archives.postgresql.org/pgsql-general/2001-02/msg00776.php

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] SO_KEEPALIVE

2005-05-16 Thread Hannu Krosing
On E, 2005-05-16 at 19:22 +0200, Dennis Bjorklund wrote:
 On Mon, 16 May 2005, Tom Lane wrote:
 
   How come we don't set SO_KEEPALIVE in libpq? 
   Is there any reason why we wouldn't want it on?
  
  Is there any reason we *would* want it on?  The server-side keepalive
  should be sufficient to get whatever useful impact it might have.
 
 Wouldn't the client also want to know that the server is not there
 anymore? I talked to Gaetano Mendola (I think, but you never know on irc
 :-) and he had some clients that had been hanging around for 3 days after
 the server had been down and later up again (stuck in recv).

stuck in recv is symptom of a reconnect bug when libpq first tries to
test for a SSL connection but the connect has already gone away.
(search for [HACKERS] oldish libpq bug still in RC2 in lists)
Tom fixed it in no time once I showed him where to look and provided a
test case. It should be fixed in 8.0.

I don't know if the fix was backported to older libpq versions as well.

 Server-side keepalive is enough for the server to clean up when clients
 disapears, but this do nothing to help clients detect that the server is
 gone. So I don't see what server side keepalive has to do with it.

-- 
Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Best way to scan on-disk bitmaps

2005-05-16 Thread Bruce Momjian

If people have GIST TODOs, please post them.

---

Teodor Sigaev wrote:
 About page splitting algorithm in GiST in multikey case. For the beginning, 
 page 
 is splitted by calling pickSplit method of key of first column (pickSplit 
 method 
 is defined for opclass and it is a user function), then it try to find equal 
 values  of first column in left and right pages ( gist.c lines 1264-1901 ). 
 If 
 it is, then GiST core will try to resort tuples with first equal keys between 
 left and right pages using penalty method for second and higher column's key. 
 If 
 it's not, it leave pages untouched. But unions for parent page of second and 
 higher column's keys will be formed.
 
 So, if index is defined as 'using gist (a,b,c)' then, in principle, GiST 
 index 
 can speed up queries like 'a=V1 and c=V2'.  But it will not usable for 
 queries 
 ( b=V3 and c=V2 ). By the way, instead of '=' operation may be used other 
 operations. Number of supported operations by GiST is indefinite unlike, for 
 example, btree which supported only five: , =, =, =, .
 
 
 
 -- 
 Teodor Sigaev   E-mail: [EMAIL PROTECTED]
 WWW: http://www.sigaev.ru/
 
 ---(end of broadcast)---
 TIP 9: the planner will ignore your desire to choose an index scan if your
   joining column's datatypes do not match
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] pgFoundry

2005-05-16 Thread Bruce Momjian

What projects have big roadmaps?  The only one I can think of is
Mozilla, and we all know how well that worked (aka Firefox).  In fact,
you could argue that the Mozilla focus on the roadmap blinded them to
focusing on user needs, and made the obsolete.

I have modifed the TODO HTML so the completed items are in italics.

---

Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  I don't think anybody is arguing for a radical change in culture - 
  certainly I would not be so presumptuous after only a couple of years 
  :-)  But a roadmap could be useful in many ways. It need not tie anybody 
  down, if positioned right, but can help people to see where things are 
  going, and where the gaps are. This could in a sense be as simple as 
  prioritising the TODO list.
 
 I think that even getting that done would turn into a flamew^H^H^H^Hhuge
 distraction.  The way things really work around here is that individual
 developers have their own priorities and they work on what seems most
 important to them at the time.  (In some cases those priorities may be
 set by their companies more than by the individuals, but that's
 irrelevant from the community's perspective.)  ISTM any sort of
 project-wide prioritization would be either (1) meaningless or (2) a
 guaranteed-to-fail attempt to assert control over other contributors.
 
 But the TODO list could certainly be made more informative without
 getting into that swamp.  I don't think it does very well at conveying
 the relative sizes of the work items, nor the extent to which there is
 consensus about how particular problems ought to be solved (the fact
 that something is on TODO does not necessarily mean that all the major
 contributors have bought into it...).  And of course you're right that
 it tells nothing at all about whether progress is currently being made
 on a given item.  The markers indicating that someone has expressed
 interest in an item don't mean they are actively doing anything with it.
 
 The real difficulty here is exactly what Lamar noted: who's going to do
 the work?  Bruce seems to be swamped already, so we'd need a new
 volunteer to maintain a more useful TODO list, and there doesn't seem
 to be anyone who wants to do it and has the depth of familiarity with
 the project to do a good job of it.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 8: explain analyze is your friend
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Best way to scan on-disk bitmaps

2005-05-16 Thread Christopher Kings-Lynne

Bruce Momjian wrote:
If people have GIST TODOs, please post them.
Concurrency :)
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [HACKERS] SQL99 hierarchical queries stalled

2005-05-16 Thread David Fetter
On Mon, May 16, 2005 at 03:09:18PM -0700, [EMAIL PROTECTED] wrote:
 David--
 
 My boss has given me approval to put up to 8 hours per week of
 SourceLabs' time in on the SQL99 hierarchical query implementation.

That's great! :)

 (I'm free, of course, to supplement this with whatever of my own
 time I can spare.)  I'm willing to take on the work.  What's the
 next step?

I suppose the first thing would be to look over the patches I
mentioned and the SQL:2003 specification, then put together a
preliminary patch and send it to -hackers.

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] SO_KEEPALIVE

2005-05-16 Thread Tom Lane
Hannu Krosing [EMAIL PROTECTED] writes:
 On E, 2005-05-16 at 19:22 +0200, Dennis Bjorklund wrote:
 Wouldn't the client also want to know that the server is not there
 anymore? I talked to Gaetano Mendola (I think, but you never know on irc
 :-) and he had some clients that had been hanging around for 3 days after
 the server had been down and later up again (stuck in recv).

 stuck in recv is symptom of a reconnect bug when libpq first tries to
 test for a SSL connection but the connect has already gone away.
 (search for [HACKERS] oldish libpq bug still in RC2 in lists)
 Tom fixed it in no time once I showed him where to look and provided a
 test case. It should be fixed in 8.0.

 I don't know if the fix was backported to older libpq versions as well.

It was not ... but I'm not convinced that that bug explains Gaetano's
problem.  If you'll recall, that bug caused libpq to get into a tight
loop chewing CPU.  It should be pretty easy to tell the difference
between that and sitting idle because there is nothing happening.

On the other hand, it seems to me a client-side SO_KEEPALIVE would only
be interesting for completely passive clients (perhaps one that sits
waiting for NOTIFY messages?)  A normal client will try to issue some
kind of database command once in awhile, and as soon as that happens,
there is a reasonably short timeout before connection failure is reported.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Best way to scan on-disk bitmaps

2005-05-16 Thread Alvaro Herrera
On Tue, May 17, 2005 at 09:39:20AM +0800, Christopher Kings-Lynne wrote:
 
 
 Bruce Momjian wrote:
 If people have GIST TODOs, please post them.
 
 Concurrency :)

And WAL support.

-- 
Alvaro Herrera (alvherre[a]surnet.cl)
No necesitamos banderas
 No reconocemos fronteras  (Jorge González)

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Best way to scan on-disk bitmaps

2005-05-16 Thread Bruce Momjian
Alvaro Herrera wrote:
 On Tue, May 17, 2005 at 09:39:20AM +0800, Christopher Kings-Lynne wrote:
  
  
  Bruce Momjian wrote:
  If people have GIST TODOs, please post them.
  
  Concurrency :)
 
 And WAL support.

Already there:

* Add WAL index reliability improvement to non-btree indexes

and this too:

* Add concurrency to GIST

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] SQL99 hierarchical queries stalled

2005-05-16 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 What's the next step?

 I suppose the first thing would be to look over the patches I
 mentioned and the SQL:2003 specification, then put together a
 preliminary patch and send it to -hackers.

You can get useful feedback long before having anything that looks
like a patch --- and I'd encourage you to do so.  Send us design
notes, rough data structures, etc.

Quite honestly, hierarchical queries aren't the easiest thing in the
world and wouldn't be my recommendation of the first ... or even the
second ... backend hacking project for a new posthacker to tackle.
If that's where you feel you must start, OK, but try to get as much
feedback as soon as you can, sooner not later.

I seem to recall some discussion of how to do this in the past;
have you trolled the pghackers archives?

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] SO_KEEPALIVE

2005-05-16 Thread Dennis Bjorklund
On Mon, 16 May 2005, Tom Lane wrote:

 On the other hand, it seems to me a client-side SO_KEEPALIVE would only
 be interesting for completely passive clients (perhaps one that sits
 waiting for NOTIFY messages?)  A normal client will try to issue some
 kind of database command once in awhile

At least some of the clients was psql.

-- 
/Dennis Björklund


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Returning the name of a primary key

2005-05-16 Thread Jim C. Nasby
On Mon, May 16, 2005 at 02:35:03PM -0400, Alvaro Herrera wrote:
 On Mon, May 16, 2005 at 01:25:46PM -0500, Juan Pablo Espino wrote:
 
  I need to write a function that retrieve the name of at least one
  table primary key, if it exists.  The only argument passed to the
  function is the table name.  I have thought something like this:
 
 Why mess around with a C function, if you can do it in straight SQL?
 You can write a SQL function if need be.

http://lnk.nu/cvs.pgfoundry.org/2op.sql is an example of how to get the
info in pure SQL.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Learning curves and such (was Re: [HACKERS] pgFoundry)

2005-05-16 Thread Tom Lane
Lamar Owen [EMAIL PROTECTED] writes:
 To put it much more bluntly: PostgreSQL development (both the process
 and the codebase) has one of the steepest learning curves around,

The backend hacking curve is certainly steep, but I wonder whether the
problem isn't largely one of people biting off more than they can chew.

I got my start by hacking some things in libpq, which is way more
self-contained than any aspect of the backend; and then started hacking
relatively small backend stuff.  I think the reason I know as much as
I do today is that I've always been willing to investigate minor bugs.
No one of them was all *that* exciting, but over time I've had the
opportunity to study a lot of the backend code.  Nearby you can watch
Neil Conway bootstrapping himself by doing minor code beautification
projects --- again not that exciting in itself, but useful, and in any
case the *real* reason he's doing it is to learn the backend code in
general.  (Right, Neil?)

As against that I notice some new arrivals proposing to add deductive
reasoning to Postgres:
http://archives.postgresql.org/pgsql-hackers/2005-05/msg01045.php
or implement SQL99 hierarchical queries:
http://archives.postgresql.org/pgsql-hackers/2005-05/msg01089.php

I might be wrong, but I'll bet lunch that neither of those projects will
come to anything.  You can't run before you learn to crawl.

Maybe what we need is some documentation about how to get started
as a Postgres hacker --- what to read, what sort of things to tackle
for your first hack, etc.  I think the people who have been successful
around here are the ones who have managed to figure out the syllabus
by themselves ... but surely we could try to teach those who come
after.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: Learning curves and such (was Re: [HACKERS] pgFoundry)

2005-05-16 Thread Josh Berkus
Lamar,

  To put it much more bluntly: PostgreSQL development (both the process
  and the codebase) has one of the steepest learning curves around,

You haven't looked at the OpenOffice.org code.  wince

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: Learning curves and such (was Re: [HACKERS] pgFoundry)

2005-05-16 Thread Jim C. Nasby
On Tue, May 17, 2005 at 01:32:03AM -0400, Tom Lane wrote:
 Maybe what we need is some documentation about how to get started
 as a Postgres hacker --- what to read, what sort of things to tackle
 for your first hack, etc.  I think the people who have been successful
 around here are the ones who have managed to figure out the syllabus
 by themselves ... but surely we could try to teach those who come
 after.

Didn't such a document exist at one point? I seem to remember reading
about it on the old website...

I think it might also be valuable to have a list of things that are good
'starter projects'. Maybe some indication of TODO items that are
simpler. Possibly a list of bugs, too.

It would probably also be useful to point out ways people can help that
don't involve hacking C code (or at least not much). Things like docs,
the website, advocacy, performance testing, etc.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly