Re: [HACKERS] Need a mentor, and a project.

2009-12-16 Thread Gurjeet Singh
2009/12/16 decibel deci...@decibel.org

 On Dec 11, 2009, at 8:44 PM, Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
  Ashish wrote:
  I am thinking about starting with the following TODO item:
  -- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated
  and actual row counts differ by a specified percentage.
 
  I even have a sample patch you can use as a start, attached.
 
  IMO the hard part of the TODO item is to design a useful user interface
  for highlighting specific EXPLAIN entries (and NOTICE messages probably
  ain't it either).  Getting the numbers is trivial.

 What about prefixing explain output with line numbers? NOTICEs (or whatever
 mechanism) could then reference the line numbers.


+1
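
For context, the number-crunching Tom calls trivial really is a one-liner; a minimal
sketch of the percentage test (hypothetical helper, not the patch Bruce attached):

#include <stdio.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper (illustration only): does the estimated row count
 * differ from the actual row count by more than threshold_pct percent?
 * The larger of the two is used as the base, so over- and under-estimates
 * are flagged symmetrically. */
static bool
rowcount_estimate_off(double estimated, double actual, double threshold_pct)
{
    double base = fmax(fmax(estimated, actual), 1.0);   /* avoid division by zero */

    return fabs(estimated - actual) / base * 100.0 > threshold_pct;
}

int
main(void)
{
    printf("%d\n", rowcount_estimate_off(100.0, 1523.0, 50.0)); /* prints 1 */
    printf("%d\n", rowcount_estimate_off(100.0, 120.0, 50.0));  /* prints 0 */
    return 0;
}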

-- 
Let's call it Postgres

EnterpriseDB  http://www.enterprisedb.com

gurjeet[.sin...@enterprisedb.com

singh.gurj...@{ gmail | hotmail | indiatimes | yahoo }.com
Twitter: singh_gurjeet
Skype: singh_gurjeet

Mail sent from my BlackLaptop device


Re: [HACKERS] Hot Standby and prepared transactions

2009-12-16 Thread Simon Riggs
On Wed, 2009-12-16 at 09:10 +0200, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  Investigating how that could come about, it looks like there is some
  fairly strange stuff going on here. StandbyRecoverPreparedTransactions()
  is never called at all.
 
 I told you so:
 http://archives.postgresql.org/message-id/4b260b5e.3010...@enterprisedb.com

OK, I didn't understand that. Removed.


  * Skip assert when doing SubTransSetParent in
  RecoverPreparedTransactions() when in Hot Standby mode because some
  subxids will have been marked already
 
 Seems reasonable. Let's still Assert that the parent is the same as last
 time.

Fix committed to git.

---

That fixes or explains all known issues, from me. Are there any other
things you know about that I haven't responded to? Do you think we have
addressed every issue, except deferred items?

I will be looking to commit to CVS later today; waiting on any
objections.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby and prepared transactions

2009-12-16 Thread Hiroyuki Yamada

That fixes or explains all known issues, from me. Are there any other
things you know about that I haven't responded to? Do you think we have
addressed every issue, except deferred items?

I will be looking to commit to CVS later today; waiting on any
objections.


Is the following problem reported or fixed?

-
1. configure with --enable-cassert option, then make, make install
2. initdb, enable WAL archiving
3. run the server
4. run pgbench -i, with scaling factor 10 or more
5. server dies with the following backtrace

(gdb) backtrace
#0  0x009e17a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00a22815 in raise () from /lib/tls/libc.so.6
#2  0x00a24279 in abort () from /lib/tls/libc.so.6
#3  0x082dbf98 in ExceptionalCondition (conditionName=0x84201d4
"!(lock->nGranted == 1)", errorType=0x8308dd4 "FailedAssertion",
fileName=0x8420fb2 "lock.c", lineNumber=2296) at assert.c:57
#4  0x08231127 in GetRunningTransactionLocks (nlocks=0x0) at lock.c:2296
#5  0x0822c110 in LogStandbySnapshot (oldestActiveXid=0x0, nextXid=0x0) at 
standby.c:578
#6  0x080cc13f in CreateCheckPoint (flags=32) at xlog.c:6826
#7  0x08204cf6 in BackgroundWriterMain () at bgwriter.c:490
#8  0x080ec291 in AuxiliaryProcessMain (argc=2, argv=0xbff25cc4) at 
bootstrap.c:413
#9  0x0820b0af in StartChildProcess (type=Variable type is not available.
) at postmaster.c:4218
#10 0x0820c722 in reaper (postgres_signal_arg=17) at postmaster.c:2322
#11 signal handler called
#12 0x009e17a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#13 0x00abcbbd in ___newselect_nocancel () from /lib/tls/libc.so.6
#14 0x0820b2b8 in ServerLoop () at postmaster.c:1360
#15 0x0820d59e in PostmasterMain (argc=3, argv=0x8579860) at postmaster.c:1065
#16 0x081b78f8 in main (argc=3, argv=0x8579860) at main.c:188
-

Also, is the problem reported in 
http://archives.postgresql.org/pgsql-hackers/2009-12/msg01324.php
fixed or deferred?


regards,


--
  Hiroyuki YAMADA
  Kokolink Corporation
  yam...@kokolink.net

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby and prepared transactions

2009-12-16 Thread Simon Riggs
On Wed, 2009-12-16 at 18:08 +0900, Hiroyuki Yamada wrote:
 That fixes or explains all known issues, from me. Are there any other
 things you know about that I haven't responded to? Do you think we have
 addressed every issue, except deferred items?
 
 I will be looking to commit to CVS later today; waiting on any
 objections.
 
 
 Is the following problem reported or fixed?

That is fixed, as of a couple of days ago. Thanks for your vigilance.

 Also, is the problem reported in 
 http://archives.postgresql.org/pgsql-hackers/2009-12/msg01324.php
 fixed or deferred ?

Wasn't aware of it. Am now. Will respond on that thread.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
   foreach p2_member in unnest(p2) loop
 p1 := array(select period_except(p1_member, p2_member)
 from unnest(p1) p1_member);
   end loop;

 But maybe it can be done in a single SQL command.

Yeah, as soon as you have LATERAL, I think. Without it there's no way to
compose SRFs in SQL, AFAIK.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Albe Laurenz
Kevin Grittner wrote:
 Just to make those who care aware of it, here is Michael Cahill's
 Doctoral Thesis based on implementing Serializable Snapshot
 Isolation in InnoDB using a refined version of the techniques
 previously used in the Berkeley DB (and previously discussed on this
 list):
  
 http://hdl.handle.net/2123/5353
  
 Seriously, this post is just for the benefit of those who may be
 interested in following these developments -- I don't have the
 inclination or energy for another round of debate on the topic just
 now.  :-/

I understand that, and thank you for the information.
Although it may have seemed that I was out to shoot the idea down,
I am interested in the topic. I guess my way of understanding something
is trying to find holes in it...

I read into the text, and I was particularly interested in how he solved
the problem of phantom reads.

Quote:
   The problem [of phantom reads] was identified in (Eswaran et al., 1976),
   but the general purpose predicate locking solution suggested there
   has not been widely adopted because of the difficulty in testing mutual
   satisfiability of predicates.

   Instead, locking DBMS implementations commonly use algorithms based on
   next-key locking. In these, a range of key space is protected against
   concurrent insertion or deletion by acquiring a shared lock on the next
   row in order, as a scan is made to check whether rows match a predicate.
   The scan might be through the data records or through an index.

   Inserts and deletes follow the same protocol, obtaining an exclusive
   lock on the row after the one being inserted or deleted. The result
   of this locking protocol is that a range scan prevents concurrent
   inserts or delete within the range of the scan, and vice versa.

That sounds like it should actually work.
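
To make the protocol concrete, here is a small self-contained sketch (my own
illustration, not code from the thesis or from any real DBMS) of which keys a
range scan and an insert would lock under next-key locking; the point is that
both end up locking the first key beyond the scanned range, so they conflict:

#include <stdio.h>

/* Next-key locking sketch (illustration only).  Keys are kept in sorted
 * order, as they would be in a B-tree index. */
static const int keys[] = {5, 10, 20, 30, 40};
static const int nkeys = 5;

/* A range scan over [lo, hi] share-locks every matching key plus the
 * first key beyond hi (the "next key"), which guards the gap. */
static void
range_scan_locks(int lo, int hi)
{
    int i;

    printf("scan [%d, %d] S-locks:", lo, hi);
    for (i = 0; i < nkeys; i++)
    {
        if (keys[i] > hi)
        {
            printf(" %d (next key)", keys[i]);
            break;
        }
        if (keys[i] >= lo)
            printf(" %d", keys[i]);
    }
    printf("\n");
}

/* An insert of a new key X-locks the first existing key greater than it,
 * i.e. the same "next key" a conflicting range scan would have locked. */
static void
insert_lock(int newkey)
{
    int i;

    for (i = 0; i < nkeys; i++)
    {
        if (keys[i] > newkey)
        {
            printf("insert %d X-locks: %d (next key)\n", newkey, keys[i]);
            return;
        }
    }
    printf("insert %d X-locks: end-of-index marker\n", newkey);
}

int
main(void)
{
    range_scan_locks(10, 25);   /* S-locks 10, 20 and the next key 30 */
    insert_lock(25);            /* X-locks 30 -> conflicts with the scan */
    return 0;
}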

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming replication and non-blocking I/O

2009-12-16 Thread Heikki Linnakangas
Fujii Masao wrote:
 On Tue, Dec 15, 2009 at 3:47 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Tom Lane wrote:
 The very, very large practical problem with this is that if you decide
 to change the behavior at any time, the only way to be sure that the WAL
 receiver is using the right libpq version is to perform a soname major
 version bump.  The transformations done by libpq will essentially become
 part of its ABI, and not a very visible part at that.
 Not having to change the libpq API would certainly be a big advantage.
 
 Done; I replaced PQgetXLogData and PQputXLogRecPtr with PQgetCopyData and
 PQputCopyData.

Great! The logical next step is to move the handling of TimelineID and
system identifier out of libpq as well.


I'm thinking of refactoring the protocol along these lines:

0. Begin by connecting to the master just like a normal backend does. We
don't necessarily need the new ProtocolVersion code either, though it's
probably still a good idea to reject connections to older server versions.

1. Get the system identifier of the master.

Slave -> Master: Query message, with a query string like
GET_SYSTEM_IDENTIFIER

Master -> Slave: RowDescription, DataRow, CommandComplete, and
ReadyForQuery messages. The system identifier is returned in the DataRow
message.

This is identical to what happens when a query is executed against a
normal backend using the simple query protocol, so walsender can use
PQexec() for this.

2. Another query exchange like above, for timeline ID. (or these two
steps can be joined into one query, to eliminate one round-trip).

3. Request a backup history file, if needed:

Slave -> Master: Query message, with a query string like
GET_BACKUP_HISTORY_FILE XXX where XXX is XLogRecPtr or file name.

Master -> Slave: RowDescription, DataRow, CommandComplete and
ReadyForQuery messages as usual. The file contents are returned in the
DataRow message.


4. Start replication

Slave -> Master: Query message, with query string START REPLICATION:
XXX, where XXX is the RecPtr of the starting point.

Master -> Slave: CopyOutResponse followed by a continuous stream of
CopyData messages with WAL contents.


This minimizes the changes to the protocol and libpq, with a clear way
of extending by adding new commands. Similar to what you did a long time
ago, connecting as an actual backend at first and then switching to
walsender mode after running a few queries, but this would all be
handled in a separate loop in walsender instead of running as a
full-blown backend. We'll still need small changes to libpq to allow
sending messages back to the server in COPY_IN mode (maybe add a new
COPY_IN_OUT mode for that).
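
For illustration, the walreceiver side of steps 1 and 4 could then be written
with nothing but stock libpq calls; a rough sketch (the command strings and the
conninfo/RecPtr literals are only the proposal above, not an existing server
interface):

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("host=master dbname=postgres"); /* illustrative */
    PGresult   *res;
    char       *buf;
    int         len;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        exit(1);
    }

    /* Step 1: a plain query, answered like any SELECT result */
    res = PQexec(conn, "GET_SYSTEM_IDENTIFIER");
    if (PQresultStatus(res) == PGRES_TUPLES_OK)
        printf("system identifier: %s\n", PQgetvalue(res, 0, 0));
    PQclear(res);

    /* Step 4: switch to a CopyOut stream of WAL data */
    res = PQexec(conn, "START REPLICATION: 0/12345678");
    if (PQresultStatus(res) != PGRES_COPY_OUT)
    {
        fprintf(stderr, "could not start streaming: %s", PQerrorMessage(conn));
        exit(1);
    }
    PQclear(res);

    /* Each CopyData message carries a chunk of WAL */
    while ((len = PQgetCopyData(conn, &buf, 0)) > 0)
    {
        printf("received %d bytes of WAL\n", len);
        PQfreemem(buf);
    }

    PQfinish(conn);
    return 0;
}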

Thoughts?

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ECPG patch N+1, fix auto-prepare

2009-12-16 Thread Michael Meskes
 OK, here's another approach. output_statement()'s interface
 is kept as the original, and not this function decides which

I still think this could be solved more easily.

 value it uses. I also introduced
 static char *ecpg_statement_type_name[]
 for the textual names of the ECPGst_* symbols to keep the
 preprocessed code readable, and minimize the impact on the
 regression tests. So output_statement() always emits
 ECPGst_* symbols in the preprocessed code instead of
 ECPGst_normal/prepnormal and numeric value for the
 other two cases. This way only 7 regression tests' source
 has changed instead of 45... There are fewer
 1 -> ECPGst_execute and
 2 -> ECPGst_exec_immediate
 changes than
 ECPGst_normal -> 0
 changes would have been if I had chosen to emit the numeric value.
 
 Is it acceptable?

Yes sure.

I changed some small parts of your patch (see above) and will commit in a few
minutes. Just running regression tests.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ: 179140304, AIM/Yahoo/Skype: michaelmeskes, Jabber: mes...@jabber.org
VfL Borussia! Forca Barca! Go SF 49ers! Use: Debian GNU/Linux, PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming replication and non-blocking I/O

2009-12-16 Thread Greg Stark
I'm interested in abstracting out features of replication from libpq too. It
would be nice if we could implement different communication bus modules.

For example if you have dozens of replicas you may want to use something
like spread to distribute the records using multicast.

Sorry for top posting -- I haven't yet figured out how not to in this
client.



Re: [HACKERS] An example of bugs for Hot Standby

2009-12-16 Thread Simon Riggs
On Tue, 2009-12-15 at 20:25 +0900, Hiroyuki Yamada wrote:
 Hot Standby node can freeze when startup process calls LockBufferForCleanup().
 This bug can be reproduced by the following procedure.

Interesting. Looks like this can happen, which is a shame cos I just
removed the wait checking code after not ever having seen a wait.

Thanks for the report.

Must-fix item for HS.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby and prepared transactions

2009-12-16 Thread Hiroyuki Yamada

On Wed, 2009-12-16 at 18:08 +0900, Hiroyuki Yamada wrote:
 That fixes or explains all known issues, from me. Are there any other
 things you know about that I haven't responded to? Do you think we have
 addressed every issue, except deferred items?
 
 I will be looking to commit to CVS later today; waiting on any
 objections.
 
 
 Is the following problem reported or fixed?

That is fixed, as of a couple of days ago. Thanks for your vigilance.


I tested a somewhat older patch (the RC patch posted to this mailing list). Sorry for
annoying you.


By the way, reading LogStandbySnapshot() and GetRunningTransactionLocks()
raised the following questions.

 * There is a window between gathering lock information in
   GetRunningTransactionLocks() and writing WAL in LogAccessExclusiveLocks().
 * In the current lock redo algorithm, locks are released when the transaction
   holding the lock commits or aborts.

... then what happens if a transaction holding an ACCESS EXCLUSIVE lock
    commits in that window?

Similarly,

 * There is a window between writing the COMMIT WAL record in
   RecordTransactionCommit() and releasing locks in ResourceOwnerRelease().

... then what happens when GetRunningTransactionLocks() gathers ACCESS EXCLUSIVE
    locks whose holder has already written its COMMIT WAL record?


Is there any chance of the standby being left holding locks for which no COMMIT WAL
record ever arrives to release them?


regards,

--
  Hiroyuki YAMADA
  Kokolink Corporation
  yam...@kokolink.net

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Nicolas Barbier
2009/12/16 Albe Laurenz laurenz.a...@wien.gv.at:

 Quote:
   The problem [of phantom reads] was identified in (Eswaran et al., 1976),
   but the general purpose predicate locking solution suggested there
   has not been widely adopted because of the difficulty in testing mutual
   satisfiability of predicates.

   Instead, locking DBMS implementations commonly use algorithms based on
   next-key locking. In these, a range of key space is protected against
   concurrent insertion or deletion by acquiring a shared lock on the next
   row in order, as a scan is made to check whether rows match a predicate.
   The scan might be through the data records or through an index.

   Inserts and deletes follow the same protocol, obtaining an exclusive
   lock on the row after the one being inserted or deleted. The result
   of this locking protocol is that a range scan prevents concurrent
   inserts or delete within the range of the scan, and vice versa.

 That sounds like it should actually work.

That boils down to 2PL, using a granularity that is somewhere between
table locks and single-row locks (note that the latter doesn't
correctly enforce serializability, hence something more coarse which
also locks not-yet-existing rows is needed).

Disadvantages:

1. Unstable latency: Depending on whether indexes or table scans are
used (i.e., the plan), other transactions may be blocked for long
durations or not.
2. Unstable susceptibility to deadlocks: Idem; it is possible that
once the planner starts to use another kind of scan, your
transactions start to deadlock.

It seems that the proposed SIREAD method fixes at least (1), because
there is no additional blocking involved. I am not sure whether the
serialization failures that it may cause are dependent on the plan
used. If so, that would be very similar to (2).

Nicolas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Florian Weimer
* Albe Laurenz:

 That sounds like it should actually work.

If you have got an index, yes.  It seems to me that it would make
locking behavior dependent on your query plan, too.

BTW, PostgreSQL could raise a different error when a unique constraint
violation is detected which involves a row which is not visible at the
current snapshot.  At least in my limited experience, that would allow
applications to recover more easily if small transactions fail
(similar to what you have to do on deadlock).  Right now (well, at
least with 8.3, haven't checked 8.4 yet), it's not possible to tell a
unique constraint violation caused by a phantom from an application
bug.  (We currently fake this by retrying a fixed number of times
and bailing out if the error returned by PostgreSQL looks like a
unique constraint violation.)
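
A minimal sketch of such a retry loop over libpq (SQLSTATE 23505 is
unique_violation; the query, retry limit and helper name are illustrative, not
our actual application code):

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

#define MAX_RETRIES 3

/* Retry an INSERT a fixed number of times when the failure looks like a
 * unique-constraint violation (SQLSTATE 23505), as described above.
 * Illustrative sketch only. */
static int
insert_with_retry(PGconn *conn, const char *sql)
{
    int attempt;

    for (attempt = 0; attempt < MAX_RETRIES; attempt++)
    {
        PGresult   *res = PQexec(conn, sql);
        const char *sqlstate;

        if (PQresultStatus(res) == PGRES_COMMAND_OK)
        {
            PQclear(res);
            return 0;               /* success */
        }

        sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);

        /* 23505 = unique_violation: could be a phantom, so retry */
        if (sqlstate == NULL || strcmp(sqlstate, "23505") != 0)
        {
            PQclear(res);
            return -1;              /* some other error: give up at once */
        }
        PQclear(res);
    }
    return -1;                      /* still failing: assume application bug */
}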

-- 
Florian Weimer        fwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ECPG patch N+1, fix auto-prepare

2009-12-16 Thread Boszormenyi Zoltan
Michael Meskes wrote:
 OK, here's another approach. output_statement()'s interface
 is kept as the original, and not this function decides which
 

 I still think this could be solved more easily.
   

Thanks very much for committing it.

But I don't understand your change. My code was:

=
ecpg.addons:
ECPG: stmtDeleteStmt block
ECPG: stmtInsertStmt block
ECPG: stmtSelectStmt block
ECPG: stmtUpdateStmt block
{ output_statement($1, 1, auto_prepare ? ECPGst_prepnormal :
ECPGst_normal); }
=
output.c:
static char *ecpg_statement_type_name[] = {
    "ECPGst_normal",
    "ECPGst_execute",
    "ECPGst_exec_immediate",
    "ECPGst_prepnormal"
};

void
output_statement(char *stmt, int whenever_mode, enum ECPG_statement_type st)
{

    fprintf(yyout, "{ ECPGdo(__LINE__, %d, %d, %s, %d, ", compat,
            force_indicator, connection ? connection : "NULL", questionmarks);
    if (st == ECPGst_normal || st == ECPGst_prepnormal)
    {
        fprintf(yyout, "%s, \"", ecpg_statement_type_name[st]);
        output_escaped_str(stmt, false);
        fputs("\", ", yyout);
    }
    else
        fprintf(yyout, "%s, %s, ", ecpg_statement_type_name[st],
                stmt);
=

So the ECPGst_normal vs. prepnormal is decided at the caller
and output_statement() is simplified a bit vs the original.

Your code is:

=
ecpg.addons:
ECPG: stmtDeleteStmt block
ECPG: stmtInsertStmt block
ECPG: stmtSelectStmt block
ECPG: stmtUpdateStmt block
{ output_statement($1, 1, ECPGst_prepnormal); }
=
output.c:
static char *ecpg_statement_type_name[] = {
    "ECPGst_normal",
    "ECPGst_execute",
    "ECPGst_exec_immediate",
    "ECPGst_prepnormal"
};

void
output_statement(char *stmt, int whenever_mode, enum ECPG_statement_type st)
{
    fprintf(yyout, "{ ECPGdo(__LINE__, %d, %d, %s, %d, ", compat,
            force_indicator, connection ? connection : "NULL", questionmarks);
    if (st == ECPGst_execute || st == ECPGst_exec_immediate)
    {
        fprintf(yyout, "%s, %s, ", ecpg_statement_type_name[st],
                stmt);
    }
    else
    {
        if (st == ECPGst_prepnormal && auto_prepare)
            fputs("ECPGst_prepnormal, \"", yyout);
        else
            fputs("ECPGst_normal, \"", yyout);

        output_escaped_str(stmt, false);
        fputs("\", ", yyout);
    }
=

Your code in ecpg.addons calls output_statement()
unconditionally with ECPGst_prepnormal and
output_statement() decides what to do with the
auto_prepare global variable. Your code doesn't
seem more readable than mine, but does the same
with the currently existing callers.

 value it uses. I also introduced
 static char *ecpg_statement_type_name[]
 for the textual names of the ECPGst_* symbols to keep the
 preprocessed code readable, and minimize the impact on the
 regression tests. So output_statement() always emits
 ECPGst_* symbols in the preprocessed code instead of
 ECPGst_normal/prepnormal and numeric value for the
 other two cases. This way only 7 regression tests' source
 has changed instead of 45... There are fewer
 1 -> ECPGst_execute and
 2 -> ECPGst_exec_immediate
 changes than
 ECPGst_normal -> 0
 changes would have been if I had chosen to emit the numeric value.

 Is it acceptable?
 

 Yes sure.

 I changed some small parts of your patch (see above) and will commit in a few
 minutes. Just running regression tests.
   

Okay, I have to rebase my SQLDA and DESCRIBE
patches again, since the regression tests' results may
have changed because of this patch. I will post them soon.

Again, thanks for committing it.

Best regards,
Zoltán Böszörményi

-- 
Bible has answers for everything. Proof:
But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil. (Matthew 5:37) - basics of digital technology.
May your kingdom come - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
http://www.postgresql.at/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby and prepared transactions

2009-12-16 Thread Heikki Linnakangas
Hiroyuki Yamada wrote:
 By the way, reading LogStandbySnapshot() and GetRunningTransactionLocks()
 raised following questions.
 
  * There is a window between gathering lock information in
    GetRunningTransactionLocks() and writing WAL in LogAccessExclusiveLocks().
  * In the current lock redo algorithm, locks are released when the transaction
    holding the lock commits or aborts.
 
 ... then what happens if a transaction holding an ACCESS EXCLUSIVE lock
     commits in that window?

Hmm, when replaying the locks record, we should be ignoring entries
belonging to already-committed transactions. But I don't actually see
that in the code. That needs to be fixed, I think.

 Similarly,
 
  * There is a window between writing the COMMIT WAL record in
    RecordTransactionCommit() and releasing locks in ResourceOwnerRelease().
 
 ... then what happens when GetRunningTransactionLocks() gathers ACCESS EXCLUSIVE
     locks whose holder has already written its COMMIT WAL record?

This is handled in the standby in the same manner as the above case.

 Is there any chance of the standby being left holding locks for which no
 COMMIT WAL record ever arrives to release them?

Yes. In normal operation, there should eventually be a commit or abort
WAL record for every transaction. But as a safety valve, when a
running-xacts record is replayed, we mark as aborted all transactions
that are older than the oldest still-running XID according to the
running-xacts record, and release their locks. See StandbyReleaseOldLocks().

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Albe Laurenz
Nicolas Barbier wrote:
 Quote:
[...]

 That sounds like it should actually work.
 
 That boils down to 2PL, using a granularity that is somewhere between
 table locks and single-row locks (note that the latter doesn't
 correctly enforce serializability, hence something more coarse which
 also locks not-yet-existing rows is needed).

That's how I understood it too.

 Disadvantages:
 
 1. Unstable latency: Depending on whether indexes or table scans are
 used (i.e., the plan), other transactions may be blocked for long
 durations or not.
 2. Unstable susceptibility to deadlocks: Idem; it is possible that
 once the planner starts to use another kind of scan, your
 transactions start to deadlock.
 
 It seems that the proposed SIREAD method fixes at least (1), because
 there is no additional blocking involved. I am not sure whether the
 serialization failures that it may cause are dependent on the plan
 used. If so, that would be very similar to (2).

Well, I guess that you have to pay somehow for serializability - there
will be more locks and more lock management.

I did not think of that, but it is really unpleasant if your transactions
suddenly start receiving serialization errors because the plan has been
changed. And the thesis says that the tests did not reveal too many
false positives, so maybe it is not that bad.

But maybe that's a price worth paying for serializability?

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ECPG patch N+1, fix auto-prepare

2009-12-16 Thread Michael Meskes
On Wed, Dec 16, 2009 at 11:54:41AM +0100, Boszormenyi Zoltan wrote:
 Your code in ecpg.addons calls output_statement()
 unconditionally with ECPGst_prepnormal and
 output_statement() decides what to do with the
 auto_prepare global variable. Your code doesn't
 seem more readable than mine, but does the same
 with the currently existing callers.

It had better do the same. :-)

Maybe finding it simpler this way just comes from me being used to the way this
part of the source code works. It's essentially the same, I only moved that one
auto_prepare test out of the parser.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ: 179140304, AIM/Yahoo/Skype: michaelmeskes, Jabber: mes...@jabber.org
VfL Borussia! Forca Barca! Go SF 49ers! Use: Debian GNU/Linux, PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby and prepared transactions

2009-12-16 Thread Simon Riggs
On Wed, 2009-12-16 at 19:35 +0900, Hiroyuki Yamada wrote:
 Sorry for annoying you.

Not at all! Good to get fresh eyes on this.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby and prepared transactions

2009-12-16 Thread Simon Riggs
On Wed, 2009-12-16 at 19:35 +0900, Hiroyuki Yamada wrote:

  * There is a window between gathering lock information in
    GetRunningTransactionLocks() and writing WAL in LogAccessExclusiveLocks().
  * In the current lock redo algorithm, locks are released when the transaction
    holding the lock commits or aborts.
 
 ... then what happens if a transaction holding an ACCESS EXCLUSIVE lock
     commits in that window?

Yes, was a problem in that code. Fixed in git.

We were doing it for prepared transactions but not for normal xacts.
I will look again at that code.

Thanks very much for reading the code. Any more?!?

 Similarly,
 
  * There is a window between writing the COMMIT WAL record in
    RecordTransactionCommit() and releasing locks in ResourceOwnerRelease().
 
 ... then what happens when GetRunningTransactionLocks() gathers ACCESS EXCLUSIVE
     locks whose holder has already written its COMMIT WAL record?

Same issue, so fixed as well. Thanks to Heikki for making that the same
code path, so we have nothing to do for that.

 Is there any chance of the standby being left holding locks for which no
 COMMIT WAL record ever arrives to release them?

No.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread tomas
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, Dec 15, 2009 at 04:16:28PM +0100, Nicolas Barbier wrote:

[...]

 whatever
 
 and
 
 same whatever as before + the character with the lowest value in
 lexicographical ordering.
 
 I don't think it is possible to get anything in between those two strings.

Yes, that was basically Andrew's argument. I got carried away too
much by the similarity between strings and (decimal, base N) fractions,
forgetting that the latter have lots of unwritten zeros at the end...

But hey, 'twas nice to learn that strings aren't as boring as I first
thought...

Regards
- -- tomás
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFLKN9cBcgs9XrR2kYRAl4LAJ9rHs/mlR3+j+79YOUtNUTCY0JOEwCZAROn
WsIQoT8nCbgCOaDWraH7jVk=
=vjV5
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread tomas
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, Dec 15, 2009 at 11:49:19AM -0800, David Fetter wrote:
 On Tue, Dec 15, 2009 at 11:31:05AM -0800, Scott Bailey wrote:
  Jeff Davis wrote:
  On Tue, 2009-12-15 at 10:19 -0500, Tom Lane wrote:
  
  Would it be OK if we handled float timestamp ranges as continuous
  and int64 timestamps discrete?
 
 That sounds like a recipe for disaster.  Whatever timestamp ranges
 are, float and int64 should be treated the same way so as not to get
 surprises due to implementation details.

This alone would practically preclude discrete -- int and float would
behave quite differently (float's granules growing towards the edges
or having to choose a bigger granule for float than for int in the first
place).
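
The growing-granule effect is easy to see with C doubles (which is what float
timestamps are); a standalone illustration (magnitudes chosen arbitrarily):

#include <stdio.h>
#include <math.h>

/* Show the gap between a double value and the next representable double.
 * With float timestamps (seconds since the epoch as a double), the
 * "granule" grows with the distance from the epoch, unlike int64
 * microseconds, whose granule is a constant 1 usec. */
int
main(void)
{
    double seconds[] = {0.0, 1e6, 1e9, 1e12};   /* illustrative magnitudes */
    int i;

    for (i = 0; i < 4; i++)
    {
        double t = seconds[i];

        printf("around %.0f s: granule = %g s\n",
               t, nextafter(t, HUGE_VAL) - t);
    }
    return 0;
}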

[...]

 FWIW, I think it would be a good idea to treat timestamps as
 continuous in all cases.

This would come as a corollary of the above.

Regards
- -- tomás
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFLKODXBcgs9XrR2kYRAlpLAJ9nO5f0SHwX8A4CjTn6c/xyZdim1ACdGHTq
Fwn5ygKvCDFGadufOYPGrfA=
=ivCP
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] An example of bugs for Hot Standby

2009-12-16 Thread Simon Riggs
On Wed, 2009-12-16 at 10:33 +, Simon Riggs wrote:
 On Tue, 2009-12-15 at 20:25 +0900, Hiroyuki Yamada wrote:
  Hot Standby node can freeze when startup process calls 
  LockBufferForCleanup().
  This bug can be reproduced by the following procedure.
 
 Interesting. Looks like this can happen, which is a shame cos I just
 removed the wait checking code after not ever having seen a wait.
 
 Thanks for the report.
 
 Must-fix item for HS.

So this deadlock can happen at two places:

1. When a relation lock waits behind an AccessExclusiveLock and then
Startup runs LockBufferForCleanup()

2. When Startup is a pin count waiter and a lock acquire begins to wait
on a relation lock

So we must put in direct deadlock detection in both places. We can't use
the normal deadlock detector because in case (1) the backend might
already have exceeded deadlock_timeout.

Proposal:

Make Startup wait on a well-known semaphore rather than on its
proc->sem. This means we can skip the spinlock check in
ProcSendSignal(). 

For (1) if Startup runs LockBufferForCleanup and can't get cleanup lock
then it marks itself waiting. It then checks for any lock waiters. If
there are > 0 lock waiters then it waits for up to max_standby_delay and
then aborts all current lock waiters, none of whom would ever wake if we
continue waiting.

For (2), if a normal backend goes into a lock wait in HS then it will
check to see whether Startup is waiting; if so, throw ERROR. This can happen
immediately because if Startup is already waiting then to wait for the
lock would cause deadlock.
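
A rough sketch of the check in (2), with every name made up purely for
illustration (this is just the shape of the test, not actual HS code):

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

/* Would live in shared memory: set by Startup while it waits in
 * LockBufferForCleanup(), cleared once it obtains the cleanup lock.
 * (Hypothetical name.) */
static volatile bool StartupWaitingForCleanupLock = false;

/* Called just before a normal backend would block on a relation lock.
 * If Startup is already waiting for a buffer pin, blocking here could
 * complete the deadlock cycle, so error out immediately instead. */
static void
check_for_startup_deadlock(void)
{
    if (StartupWaitingForCleanupLock)
    {
        fprintf(stderr, "ERROR: canceling statement to avoid deadlock with recovery\n");
        exit(1);
    }
}

int
main(void)
{
    StartupWaitingForCleanupLock = true;    /* pretend Startup is stuck */
    check_for_startup_deadlock();           /* would abort this backend */
    return 0;
}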

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Marko Kreen
On 12/16/09, Kurt Harriman harri...@acm.org wrote:
 [Please ignore the previous incomplete version of this reply, which I
  sent by mistake.  Sorry for the list noise.]

  On 12/15/2009 2:09 PM, Marko Kreen wrote:
  
   Oh.  Ok then.  Force-inline seems better fix as we may want to use
   it for other reasons too (converting big macros).
  
   So it seems like a good moment to solve it for gcc too.

  Is ordinary inlining not sufficient?

  If deluxe inlining is something that might be wanted in the
  future, but isn't needed right now, perhaps we should wait
  until then.

It allows the effect __forceinline has on MSVC to also be observed on
GCC.  It should not hurt anything, unless we go overboard with
inline usage.

But yeah, it can be omitted.  I was just wondering about the code structure,
to make it possible to turn on force-inline for gcc too.

   Your worry ii) can be ignored, managing to compile on such
   compilers is already overachievement.

  I think so too.  With your opinion added to mine, do we constitute a
  consensus of the pg community?  Someone might object that a sample of
  two individuals is insufficiently representative of the whole, but
  away with the pedants: let us not quibble over trifles.

Note that we are not voting here, we are on a technical discussion list.
Interested people can voice counter-arguments; if we can overturn them,
we can proceed...

   The question is now what should we put into configure and what into
   headers.

  PostgreSQL seems mostly to follow the GNU tradition of using
  autoconf rather than thickets of #ifdefs to discover platform
  characteristics such as supported features, variant spellings of
  keywords and identifiers, and the like.  This often makes it easier
  to support new platforms, assuming the autoconf mechanism is
  understood.

  Autoconf facilitates testing directly for the needed features:
  generally a better approach than hard-coding knowledge of the
  peculiarities of particular compilers, compiler release levels,
  instruction set architectures, etc.  For example, instead of
  writing #ifdefs to decide, if this is gcc, and the version is
  high enough, then __funcname__ should work, it's better to let
  autoconf actually try to compile a program using __funcname__.
  That way PostgreSQL can keep up with the evolution of improved
  and new compilers without continually updating an #ifdef-encoded
  knowledge base of compiler capabilities.

Well, yeah, except on one platform we are mostly bypassing the
autoconf logic and have a lot of duplicate logic.

But I'm not opposed to putting it into configure, assuming the
final keyword is still called inline.

   Simplest would be to have plain AC_C_INLINE in configure
   and then in header (c.h?):
  
   #ifdef _MSC_VER
   #undef inline
   #define inline __forceinline
   #else
   #ifdef __GNUC__
   #undef inline
   #define inline inline __attribute__((always_inline))
   #endif
   #endif
  
   (Not compile tested :)

  This would force every inline.  Why do we want that?

The compiler is allowed to uninline a plain inline function
if it feels like it.  The gcc uses a guesstimated instruction
count (before inlining) to decide on that issue.

ATM it's not a problem, but e.g. if we want to turn more complex
macros into inlines (heap_getattr()), it may go over the limit and get
uninlined.
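
As a concrete illustration of that kind of conversion (a generic example, not
the real heap_getattr()):

#include <stdio.h>

/* Macro version: always expanded, no type checking, and arguments may be
 * evaluated more than once. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Plain "static inline": gcc may still decide not to inline it if its
 * pre-inlining instruction-count estimate is too high; adding
 * __attribute__((always_inline)) would override that heuristic. */
static inline int
square_inline(int x)
{
    return x * x;
}

int
main(void)
{
    int i = 3;

    printf("%d %d\n", SQUARE_MACRO(i), square_inline(i));
    return 0;
}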

  For gcc, I think the __attribute__ has to come after the function's
  parameter list, rather than before the return type.

No.


-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Marko Kreen
On 12/16/09, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Dec 15, 2009 at 10:34 PM, Kurt Harriman harri...@acm.org wrote:
   Your worry ii) can be ignored, managing to compile on such
   compilers is already overachievement.
  
   I think so too.  With your opinion added to mine, do we constitute a
   consensus of the pg community?  Someone might object that a sample of
   two individuals is insufficiently representative of the whole, but
   away with the pedants: let us not quibble over trifles.


 I haven't completely followed this thread, but I think there has been
  some discussion of making changes to inline that would cause
  regressions for people using old, crappy compilers, and I think we
  should avoid doing that unless there is some compelling benefit.  I'm
  not sure what that benefit would be - I don't think cleaner code is
  enough.

Seems you have not followed the thread...

Hypothetical old, crappy compilers would still work, only AC_C_INLINE
would turn static inline into plain static, so hypothetically
they would get some warnings about unused functions.

As this is all hypothetical, I don't see why that should stop us
cleaning our code?

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea - new aggregates median, listagg

2009-12-16 Thread Robert Haas
On Wed, Dec 16, 2009 at 1:36 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 b) function listagg (it is an analogue of group_concat from MySQL) - it
 should simplify report generation and some other tasks

 This is redundant, as it's equivalent to array_to_string(array_agg()).

 when I implemented it in orafce, almost all of the code was parameter
 checking. The implementation is trivial, because the important part of the work
 was done with the array_agg support. It is a wrapper over StringInfo.


 If it's done as syntactic sugar over that, it's fine.  If it's a
 separate implementation, it's a bad idea.  Probably best as an
 optional module of some kind

 I am not sure if implementation as syntactic sugar is best. It needs
 parser changes.

I don't think this is important enough to change the parser for it.  I
don't see a concrete proposal for syntax here, but it seems like  the
standard function call syntax should be adequate.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 On 12/16/09, Kurt Harriman harri...@acm.org wrote:
 For gcc, I think the __attribute__ has to come after the function's
 parameter list, rather than before the return type.

 No.

[ squint... ]  That's nowhere documented that I can find: all the
examples in the gcc docs show __attribute__ after the parameters.
It does seem to work, but should we rely on it?

The bigger problem though is that not all versions of gcc understand
always_inline:

$ gcc -Wall check.c
check.c:3: warning: `always_inline' attribute directive ignored

which I think is sufficient reason to put an end to this sub-thread.
We have no particular need for force-inline semantics anyway, as
long as the compiler behaves reasonably for unreferenced inlines,
which gcc always has.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Tom Lane
Dimitri Fontaine dfonta...@hi-media.com writes:
 Tom Lane t...@sss.pgh.pa.us writes:
  foreach p2_member in unnest(p2) loop
  p1 := array(select period_except(p1_member, p2_member)
  from unnest(p1) p1_member);
  end loop;
 
 But maybe it can be done in a single SQL command.

 Yeah, as soon as you have LATERAL, I think. Without it there's no way to
 compose SRF in SQL, AFAIK.

Hm, how would you do it with LATERAL?  The problem is not so much
composition as the need for a variable number of rounds of composition.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Robert Haas
On Wed, Dec 16, 2009 at 9:30 AM, Marko Kreen mark...@gmail.com wrote:
 On 12/16/09, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Dec 15, 2009 at 10:34 PM, Kurt Harriman harri...@acm.org wrote:
   Your worry ii) can be ignored, managing to compile on such
   compilers is already overachievement.
  
   I think so too.  With your opinion added to mine, do we constitute a
   consensus of the pg community?  Someone might object that a sample of
   two individuals is insufficiently representative of the whole, but
   away with the pedants: let us not quibble over trifles.


 I haven't completely followed this thread, but I think there has been
  some discussion of making changes to inline that would cause
  regressions for people using old, crappy compilers, and I think we
  should avoid doing that unless there is some compelling benefit.  I'm
  not sure what that benefit would be - I don't think cleaner code is
  enough.

 Seems you have not followed the thread...

 Hypothetical old, crappy compilers would still work, only AC_C_INLINE
 would turn static inline into plain static, so hypothetically
 they would get some warnings about unused functions.

 As this is all hypothetical, I don't see why that should stop us
 cleaning our code?

I don't think that hypothetical is the right word.  There is no
question that such compilers exist.  What seems to me to be relevant
is whether anyone is still using them.  Maybe with sufficient research
you could demonstrate that there are no platforms we support where
these are still in common use - e.g. the oldest version of AIX
compiler that has this problem is version X, but PostgreSQL is only
supported on versions Y and higher, where Y > X.  But I don't see that
any attempt has been made to do that research, or at least I haven't
seen any attempt to go through our supported platforms and discuss the
situation for each one.

I'm not really sure there's enough benefit in this project to warrant
that effort, but that's up to you to decide.  However, I am pretty
confident that there is going to be opposition to degrading
performance or causing lots of compiler warnings on older platforms
that are still supported.  In fact, such opposition has already been
registered on this thread and you can find it here:

http://archives.postgresql.org/pgsql-hackers/2009-12/msg00239.php

Our standard for accepting patches is that (1) they have a
demonstrated benefit and (2) they don't break anything.  Leaving aside
the question of whether this patch makes anything worse, it would
really improve the case for this patch if someone could enumerate the
supported platforms and compilers where it makes things better.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast or immediate shutdown

2009-12-16 Thread Peter Eisentraut
On tis, 2009-12-15 at 17:19 +, Simon Riggs wrote:
 running with log_checkpoints = on
 
 pg_ctl -D foo -m fast stop
 
 log says
 
 LOG:  received fast shutdown request
 LOG:  aborting any active transactions
 LOG:  shutting down
 LOG:  restartpoint starting: shutdown immediate
 
 Some of us know that the immediate word refers to the restartpoint
 request, though that word causes conceptual conflict with the shutdown
 mode, which was fast, not immediate.
 
 Think we need to change the wording of this
 
 LOG:  restartpoint starting: shutdown immediate
 
 so it is clearer what we mean

What *do* we mean?  And why are we logging it?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 Hypothetical old, crappy compilers would still work, only AC_C_INLINE
 would turn static inline into plain static, so hypothetically
 they would get some warnings about unused functions.

 As this is all hypothetical, I don't see why that should stop us
 cleaning our code?

There's nothing hypothetical about it --- I still regularly check
that the code builds on an old HP compiler that doesn't have inline.

I remind you that the project policy is to not require any compiler
features not found in C89.  If you can exploit inline on more compilers
than now, fine, but assuming that everything has got it is not OK.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea - new aggregates median, listagg

2009-12-16 Thread Kevin Grittner
Pavel Stehule pavel.steh...@gmail.com wrote:
 2009/12/15 David Fetter da...@fetter.org:
 
 http://www.simple-talk.com/sql/t-sql-programming/median-workbench/
 
 In this article there are two medians - statistical and financial. I
 am for both. But only one can be named median.
 
Well, since the statistical median requires that you specify whether
you want the left or right median (in case there are an even number
of values), you could either have two median functions, one of which
would take an extra parameter for this, or you could have median,
left_median, and right_median functions.
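 
For illustration, on already-sorted input the two medians differ only when the
number of values is even (a standalone sketch, not a proposed aggregate
implementation):

#include <stdio.h>

/* Left and right medians of a sorted array: for odd n they coincide,
 * for even n the left median is the lower of the two middle values and
 * the right median the higher one. */
static double left_median(const double *v, int n)  { return v[(n - 1) / 2]; }
static double right_median(const double *v, int n) { return v[n / 2]; }

int
main(void)
{
    double odd[]  = {1, 2, 3, 4, 5};
    double even[] = {1, 2, 3, 4};

    printf("odd:  left=%g right=%g\n", left_median(odd, 5), right_median(odd, 5));
    printf("even: left=%g right=%g\n", left_median(even, 4), right_median(even, 4));
    return 0;
}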
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Marko Kreen
On 12/16/09, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
   On 12/16/09, Kurt Harriman harri...@acm.org wrote:

  For gcc, I think the __attribute__ has to come after the function's
   parameter list, rather than before the return type.

   No.


 [ squint... ]  That's nowhere documented that I can find: all the
  examples in the gcc docs show __attribute__ after the parameters.
  It does seem to work, but should we rely on it?

Heh.  At least in 3.0, 3.1 and 4.2 docs there is:

  __attribute__((noreturn)) void d0 (void),
  __attribute__((format(printf, 1, 2))) d1 (const char *, ...),
  d2 (void);

describing that an __attribute__ before the declaration list applies to all
declarations, but an __attribute__ after a declaration applies to only that
declaration.  That sort of also explains why all the examples have the
__attribute__ afterwards.

  http://gcc.gnu.org/onlinedocs/gcc-3.1.1/gcc/Attribute-Syntax.html

But I dare not pick out sentences from Function-Attributes.html
that describe that...

  The bigger problem though is that not all versions of gcc understand
  always_inline:

  $ gcc -Wall check.c
  check.c:3: warning: `always_inline' attribute directive ignored

Oh, another argument against logic in headers.

  which I think is sufficient reason to put an end to this sub-thread.
  We have no particular need for force-inline semantics anyway, as
  long as the compiler behaves reasonably for unreferenced inlines,
  which gcc always has.

Ok.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Robert Haas
On Wed, Dec 16, 2009 at 4:52 AM, Albe Laurenz laurenz.a...@wien.gv.at wrote:
 Kevin Grittner wrote:
 Just to make those who care aware of it, here is Michael Cahill's
 Doctoral Thesis based on implementing Serializable Snapshot
 Isolation in InnoDB using a refined version of the techniques
 previously used in the Berkeley DB (and previously discussed on this
 list):

 http://hdl.handle.net/2123/5353

 Seriously, this post is just for the benefit of those who may be
 interested in following these developments -- I don't have the
 inclination or energy for another round of debate on the topic just
 now.  :-/

 I understand that, and thank you for the information.
 Although it may have seemed that I was out to shoot the idea down,
 I am interested in the topic. I guess my way of understanding something
 is trying to find holes in it...

 I read into the text, and I was particularly interested how he solved
 the problem of phantom reads.

 Quote:
   The problem [of phantom reads] was identified in (Eswaran et al., 1976),
   but the general purpose predicate locking solution suggested there
   has not been widely adopted because of the difficulty in testing mutual
   satisfiability of predicates.

   Instead, locking DBMS implementations commonly use algorithms based on
   next-key locking. In these, a range of key space is protected against
   concurrent insertion or deletion by acquiring a shared lock on the next
   row in order, as a scan is made to check whether rows match a predicate.
   The scan might be through the data records or through an index.

   Inserts and deletes follow the same protocol, obtaining an exclusive
   lock on the row after the one being inserted or deleted. The result
   of this locking protocol is that a range scan prevents concurrent
   inserts or delete within the range of the scan, and vice versa.

 That sounds like it should actually work.

Only if you can guarantee that the database will access the rows using
some particular index.  If it gets to the data some other way it might
accidentally circumvent the lock.  That's kind of a killer in terms of
making this work for PostgreSQL.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea - new aggregates median, listagg

2009-12-16 Thread Thom Brown
2009/12/15 Pavel Stehule pavel.steh...@gmail.com

 Hello

 I am looking at a new feature - the ORDER clause in aggregates - and I think
 we are able to effectively implement some non-standard, but well
 known aggregates.

 a) function median - it is a relatively frequent request - with a usually
 slow implementation

 b) function listagg (it is an analogue of group_concat from MySQL) - it
 should simplify report generation and some other tasks

 What is your opinion? Do you like to see these functions in core?



I'm probably missing the point here, but when I originally saw MySQL's
group_concat function, I found it odd that it featured ordering
functionality.  Shouldn't the order be determined by the query itself?
Otherwise it's almost as if it's separating the relationship between the
result column and the result set.

Thom


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Andres Freund
Moin,

On Wednesday 16 December 2009 16:24:42 Robert Haas wrote:
Inserts and deletes follow the same protocol, obtaining an exclusive
lock on the row after the one being inserted or deleted. The result
of this locking protocol is that a range scan prevents concurrent
inserts or delete within the range of the scan, and vice versa.
 
  That sounds like it should actually work.
 
 Only if you can guarantee that the database will access the rows using
 some particular index.  If it gets to the data some other way it might
 accidentally circumvent the lock.  That's kind of a killer in terms of
 making this work for PostgreSQL.
Isn't the whole topic only relevant for write access? There you have to
access the index anyway.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Robert Haas
On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de wrote:
 On Wednesday 16 December 2009 16:24:42 Robert Haas wrote:
    Inserts and deletes follow the same protocol, obtaining an exclusive
    lock on the row after the one being inserted or deleted. The result
    of this locking protocol is that a range scan prevents concurrent
    inserts or delete within the range of the scan, and vice versa.
 
  That sounds like it should actually work.

 Only if you can guarantee that the database will access the rows using
 some particular index.  If it gets to the data some other way it might
 accidentally circumvent the lock.  That's kind of a killer in terms of
 making this work for PostgreSQL.
 Isnt the whole topic only relevant for writing access? There you have to
 access the index anyway.

Yeah, I guess you have to insert the new tuple.  I guess while you
were at it you might check whether the next tuple is locked...

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Marko Kreen
On 12/16/09, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
   Hypothetical old, crappy compilers would still work, only AC_C_INLINE
   would turn static inline into plain static, so hypothetically
   they would get some warnings about unused functions.

   As this is all hypothetical, I don't see why that should stop us
   cleaning our code?

  There's nothing hypothetical about it --- I still regularly check
  that the code builds on an old HP compiler that doesn't have inline.

OK, good.  Thus far the only argument was that historically they have existed,
which does not sound like a good enough reason to worry about them.  If
somebody is actually testing and caring about such compilers, they need to be
taken more seriously.

  I remind you that the project policy is to not require any compiler
  features not found in C89.  If you can exploit inline on more compilers
  than now, fine, but assuming that everything has got it is not OK.

Note - my more advanced proposal (instead of duplicate macros, let 'static
inline' functions fall back to being plain 'static') would still support
non-inline compilers, but with a potentially small performance hit.

So the plain-C89 compilers would be downgraded to second-class
targets, not worth getting max performance out of them.

Is this OK?

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Kevin Grittner
Albe Laurenz laurenz.a...@wien.gv.at wrote:
 
 Although it may have seemed that I was out to shoot the idea down,
 I am interested in the topic. I guess my way of understanding
 something is trying to find holes in it...
 
No problem.  That's how ideas are explored and improved.  The brick
wall was that there seemed to be overwhelming opposition to any new
technique which causes serialization failures under conditions where
the cause isn't blindingly obvious, regardless of the resulting
improvements in data reliability.
 
 I was particularly interested how he solved the problem of phantom
 reads.
 
 That sounds like it should actually work.
 
Well, that's certainly not the novel part -- those techniques were
described and proven theoretically decades ago and have been in use
by many popular products for almost as long.  It's that non-blocking
SIREAD lock which is the fun part.  ;-)
 
I just wish I had time to read all the documents referenced in the
footnotes
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Kevin Grittner
Nicolas Barbier nicolas.barb...@gmail.com wrote:
 
 I am not sure whether the serialization failures that it may cause
 are dependent on the plan used.
 
They are.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea - new aggregates median, listagg

2009-12-16 Thread Pavel Stehule
2009/12/16 Thom Brown thombr...@gmail.com:
 2009/12/15 Pavel Stehule pavel.steh...@gmail.com

 Hello

 I am looking at the new feature - ORDER clause in aggregate - and I think
 we are able to effectively implement some non-standard, but well-known
 aggregates.

 a) function median - it is a relatively frequent request - usually with a
 slow implementation

 b) function listagg (it is an analogy of group_concat from MySQL) - it
 should simplify report generation and some other tasks

 What is your opinion? Do you like to see these functions in core?



 I'm probably missing the point here, but when I originally saw MySQL's
 group_concat function, I found it odd that it featured ordering
 functionality.  Shouldn't the order be determined by the query itself?
 Otherwise it's almost as if it's separating the relationship between the
 result column and the resultset.


Aggregates such as group_concat or listagg are not typical SQL aggregates.
With these aggregates we are able to do some reports at the SQL level
without stored procedures. As far as I know, order is determined only for
non-hash aggregates - and you cannot specify the method of aggregation, so
the possibility to specify ORDER is important. But that feature isn't
related to this proposal. It was committed yesterday - so you can
look at the discussion about this feature.

Regards
Pavel Stehule

 Thom



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fast or immediate shutdown

2009-12-16 Thread Simon Riggs
On Wed, 2009-12-16 at 17:04 +0200, Peter Eisentraut wrote:
 On tis, 2009-12-15 at 17:19 +, Simon Riggs wrote:
  running with log_checkpoints = on
  
  pg_ctl -D foo -m fast stop
  
  log says
  
  LOG:  received fast shutdown request
  LOG:  aborting any active transactions
  LOG:  shutting down
  LOG:  restartpoint starting: shutdown immediate
  
  Some of us know that the immediate word refers to the restartpoint
  request, though that word causes conceptual conflict with the shutdown
  mode, which was fast, not immediate.
  
  Think we need to change the wording of this
  
  LOG:  restartpoint starting: shutdown immediate
  
  so it is clearer what we mean
 
 What *do* we mean?  And why are we logging it?

The words after the colon refer to options sent to RequestCheckpoint and
it is logged because we asked for it by doing log_checkpoints = on.

I suggest we say smoothed when the checkpoint option is !immediate. So we
will remove the word immediate from the message.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PostgreSQL project policy compendium

2009-12-16 Thread Kurt Harriman

On 12/16/2009 7:10 AM, Tom Lane wrote:

the project policy is to not require any compiler features not found

 in C89.

Is there somewhere a compendium of such policies which
fledgling hackers should consult to avoid embarrassment?

Regards,
... kurt


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] recovery is stuck when children are not processing SIGQUIT from previous crash

2009-12-16 Thread Peter Eisentraut
Here is a set of patches to address this issue.

The first one is a small refactoring of the signal setting portability
business.

The second one fixes the SIGQUIT handler inadvertently unblocking
SIGQUIT within itself.

The third one installs an alarm so that if the ereport() call in
quickdie() doesn't finish after 60 seconds, it skips it and finishes up.
The precise logic of this could be debated, but it more or less appears
to get the job done.
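
For reference, the general shape of the technique, boiled down to a
stand-alone C sketch (illustrative only -- the timeout value and these
function names are not taken from the patch):

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static sigjmp_buf timeout_env;

static void timeout_handler(int signo)
{
    (void) signo;
    siglongjmp(timeout_env, 1);         /* abandon the slow call */
}

/* Call a possibly-hanging function, but give up after 'seconds' seconds. */
static void call_with_timeout(void (*func)(void), unsigned seconds)
{
    struct sigaction sa = {0};

    sa.sa_handler = timeout_handler;
    sigaction(SIGALRM, &sa, NULL);

    if (sigsetjmp(timeout_env, 1) == 0)
    {
        alarm(seconds);                 /* arm the timeout */
        func();                         /* e.g. the message-sending step */
        alarm(0);                       /* completed in time: disarm */
    }
    else
        fprintf(stderr, "gave up after %u seconds\n", seconds);
}

static void slow_step(void)
{
    sleep(5);                           /* stand-in for a call that may hang */
    puts("slow step finished");
}

int main(void)
{
    call_with_timeout(slow_step, 2);    /* the patch reportedly uses 60 */
    return 0;
}
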
From d70d637ffe60dd3c73690a2640e84ad6e2cdbb50 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut pet...@postgresql.org
Date: Wed, 16 Dec 2009 14:41:10 +0200
Subject: [PATCH 1/3] If there is no sigdelset(), define it as a macro

Removes some duplicate code that recreated the identical workaround.
---
 src/backend/postmaster/bgwriter.c  |4 
 src/backend/postmaster/walwriter.c |4 
 src/backend/tcop/postgres.c|4 
 src/include/libpq/pqsignal.h   |6 --
 4 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index a2c32a7..81c0d36 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -244,11 +244,7 @@ BackgroundWriterMain(void)
 	pqsignal(SIGWINCH, SIG_DFL);
 
 	/* We allow SIGQUIT (quickdie) at all times */
-#ifdef HAVE_SIGPROCMASK
 	sigdelset(&BlockSig, SIGQUIT);
-#else
-	BlockSig &= ~(sigmask(SIGQUIT));
-#endif
 
 	/*
 	 * Initialize so that first time-driven event happens at the correct time.
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index 90b8cfe..7d95696 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -126,11 +126,7 @@ WalWriterMain(void)
 	pqsignal(SIGWINCH, SIG_DFL);
 
 	/* We allow SIGQUIT (quickdie) at all times */
-#ifdef HAVE_SIGPROCMASK
 	sigdelset(&BlockSig, SIGQUIT);
-#else
-	BlockSig &= ~(sigmask(SIGQUIT));
-#endif
 
 	/*
 	 * Create a resource owner to keep track of our resources (not clear that
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index c985478..c7906bf 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3292,11 +3292,7 @@ PostgresMain(int argc, char *argv[], const char *username)
 	if (IsUnderPostmaster)
 	{
 		/* We allow SIGQUIT (quickdie) at all times */
-#ifdef HAVE_SIGPROCMASK
 		sigdelset(&BlockSig, SIGQUIT);
-#else
-		BlockSig &= ~(sigmask(SIGQUIT));
-#endif
 	}
 
 	PG_SETMASK(&BlockSig);		/* block everything except SIGQUIT */
diff --git a/src/include/libpq/pqsignal.h b/src/include/libpq/pqsignal.h
index d8e791c..82414d6 100644
--- a/src/include/libpq/pqsignal.h
+++ b/src/include/libpq/pqsignal.h
@@ -26,7 +26,7 @@ extern sigset_t UnBlockSig,
 			StartupBlockSig;
 
 #define PG_SETMASK(mask)	sigprocmask(SIG_SETMASK, mask, NULL)
-#else
+#else /* not HAVE_SIGPROCMASK */
 extern int	UnBlockSig,
 			BlockSig,
 			StartupBlockSig;
@@ -37,7 +37,9 @@ extern int	UnBlockSig,
 #define PG_SETMASK(mask)		pqsigsetmask(*((int*)(mask)))
 int			pqsigsetmask(int mask);
 #endif
-#endif
+
+#define sigdelset(set, signum)	(*(set) &= ~(sigmask(signum)))
+#endif /* not HAVE_SIGPROCMASK */
 
 typedef void (*pqsigfunc) (int);
 
-- 
1.6.5

From 96a3bc5fe46e226ff179a729baa58efc16c0aaba Mon Sep 17 00:00:00 2001
From: Peter Eisentraut pet...@postgresql.org
Date: Wed, 16 Dec 2009 16:07:12 +0200
Subject: [PATCH 2/3] Don't unblock SIGQUIT in the SIGQUIT handler

---
 src/backend/tcop/postgres.c  |1 +
 src/include/libpq/pqsignal.h |1 +
 2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index c7906bf..b2fb501 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2538,6 +2538,7 @@ drop_unnamed_stmt(void)
 void
 quickdie(SIGNAL_ARGS)
 {
+	sigaddset(&BlockSig, SIGQUIT); /* prevent nested calls */
 	PG_SETMASK(&BlockSig);
 
 	/*
diff --git a/src/include/libpq/pqsignal.h b/src/include/libpq/pqsignal.h
index 82414d6..4659ae2 100644
--- a/src/include/libpq/pqsignal.h
+++ b/src/include/libpq/pqsignal.h
@@ -38,6 +38,7 @@ extern int	UnBlockSig,
 int			pqsigsetmask(int mask);
 #endif
 
+#define sigaddset(set, signum)	(*(set) |= (sigmask(signum)))
 #define sigdelset(set, signum)	(*(set) &= ~(sigmask(signum)))
 #endif /* not HAVE_SIGPROCMASK */
 
-- 
1.6.5

From bebb95abe7a55173cab0558da3373d6a3631465b Mon Sep 17 00:00:00 2001
From: Peter Eisentraut pet...@postgresql.org
Date: Wed, 16 Dec 2009 17:19:14 +0200
Subject: [PATCH 3/3] Time out the ereport() call in quickdie() after 60 seconds

---
 src/backend/tcop/postgres.c |   25 +
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b2fb501..ab6805a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -191,6 +191,7 @@ static bool IsTransactionExitStmtList(List *parseTrees);
 static bool IsTransactionStmtList(List *parseTrees);
 

Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 So the plain-C89 compilers would be downgraded to second-class
 targets, not worth getting max performance out of them.

Hm?  Failing to inline is already a performance hit, which is why
Kurt got interested in this in the first place.

I think you're way overthinking this.  Where we started was just
a proposal to try to expand the set of inline-ing compilers beyond
gcc only.  I don't see why we need to do anything but that.  The
code is fine as-is except for the control #ifdefs.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL project policy compendium

2009-12-16 Thread Peter Eisentraut
On ons, 2009-12-16 at 07:44 -0800, Kurt Harriman wrote: 
 On 12/16/2009 7:10 AM, Tom Lane wrote:
  the project policy is to not require any compiler features not found
   in C89.
 
 Is there somewhere a compendium of such policies which
 fledgling hackers should consult to avoid embarrassment?

The installation instructions say that you need a C89 compiler.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL project policy compendium

2009-12-16 Thread Robert Haas
On Wed, Dec 16, 2009 at 10:44 AM, Kurt Harriman harri...@acm.org wrote:
 On 12/16/2009 7:10 AM, Tom Lane wrote:

 the project policy is to not require any compiler features not found

 in C89.

 Is there somewhere a compendium of such policies which
 fledgling hackers should consult to avoid embarrassment?

Well the best way to avoid embarrassment is to just not be
embarrassed.  I don't think anyone will think ill of you for
suggesting something that turns out not to be where the project wants
to go, as long as you're not too dogmatic about it.

This particular point is mentioned in the documentation:

http://www.postgresql.org/docs/8.4/static/install-requirements.html

The rest of chapter 15 is worth a read, too.

I do think your basic point is well-taken, though.  There are a lot of
details about how we do things that I have had to learn through
osmosis, and maybe if I can figure out what they all are, I'll try to
write them down somewhere, if someone else hasn't done it already.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] recovery is stuck when children are not processing SIGQUIT from previous crash

2009-12-16 Thread Alvaro Herrera
Peter Eisentraut escribió:
 Here is a set of patches to address this issue.
 
 The first one is a small refactoring of the signal setting portability
 business.

This one looks like should be applied immediately to get some buildfarm
coverage (and alpha3)



-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL project policy compendium

2009-12-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 16, 2009 at 10:44 AM, Kurt Harriman harri...@acm.org wrote:
 Is there somewhere a compendium of such policies which
 fledgling hackers should consult to avoid embarrassment?

 I do think your basic point is well-taken, though.  There are a lot of
 details about how we do things that I have had to learn through
 osmosis, and maybe if I can figure out what they all are, I'll try to
 write them down somewhere, if someone else hasn't done it already.

Yeah, this is something that is difficult for the senior hackers to
undertake because we've forgotten what it was we had to learn the hard
way ;-).  Maybe some of the newer project members could collaborate
on scribbling down such stuff (wiki page?).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Alvaro Herrera
Tom Lane escribió:

 I think you're way overthinking this.  Where we started was just
 a proposal to try to expand the set of inline-ing compilers beyond
 gcc only.  I don't see why we need to do anything but that.  The
 code is fine as-is except for the control #ifdefs.

IIRC Kurt was also on about getting rid of some ugly macros that could
instead be coded as inline functions (fastgetattr for example)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea - new aggregates median, listagg

2009-12-16 Thread Scott Bailey

Thom Brown wrote:
2009/12/15 Pavel Stehule pavel.steh...@gmail.com 


Hello

I am looking at the new feature - ORDER clause in aggregate - and I think
we are able to effectively implement some non-standard, but well-known
aggregates.

a) function median - it is a relatively frequent request - usually with a
slow implementation

b) function listagg (it is an analogy of group_concat from MySQL) - it
should simplify report generation and some other tasks

What is your opinion? Do you like to see these functions in core?



I'm probably missing the point here, but when I originally saw MySQL's 
group_concat function, I found it odd that it featured ordering 
functionality.  Shouldn't the order be determined by the query itself?  
Otherwise it's almost as if it's separating the relationship between the 
result column and the resultset.


For xmlagg in particular, it is quite useful to be able to order the 
results. And sorting the query doesn't work for ordering the agg unless 
you do it in a subquery. Oracle has this functionality and it is quite 
handy. It would be nice to see listagg with the option to order as well.


Scott

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 On 12/15/09, Kurt Harriman harri...@acm.org wrote:
 Attached is a revised patch, offered for the 2010-01 commitfest.
 It's also available in my git repository in the submitted branch:
 
 http://git.postgresql.org/gitweb?p=users/harriman/share.git;a=shortlog;h=refs/heads/submitted

 -1.  The PG_INLINE is ugly.

FWIW, I think the patch is largely OK, except for the autoconf hackery
which I'm not the best-qualified person to opine on.  I would only
suggest that the cleanest coding would be

#ifdef USE_INLINE

static inline foo(...) ...

#else

... non-inline definition of foo

#endif

ie, go ahead and rely on autoconf's definition (if any) of inline
and add a policy symbol USE_INLINE to determine whether to use it.
The proposed PG_INLINE coding conflates the symbol needed in the code
with the policy choice.
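
For instance (purely illustrative, with a made-up accessor; not proposed
code):

/* header-style sketch, assuming configure defines USE_INLINE when it is
 * safe to rely on the compiler's "inline" keyword (names are made up) */

typedef struct MyList { int length; } MyList;

#ifdef USE_INLINE

static inline int
my_list_length(const MyList *l)
{
    return l ? l->length : 0;
}

#else                           /* non-inline fallback, same behaviour */

extern int my_list_length(const MyList *l);
/* ...with the body kept once in a .c file, e.g.:
 * int my_list_length(const MyList *l) { return l ? l->length : 0; } */

#endif   /* USE_INLINE */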

Another possibility would be to call the policy symbol HAVE_INLINE,
but that (a) risks collision with a name defined by autoconf built-in
macros, and (b) looks like it merely indicates whether the compiler
*has* inline, not that we have made a choice about how to use it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Marko Kreen
On 12/16/09, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:

  So the plain-C89 compilers would be downgraded to second-class
   targets, not worth getting max performance out of them.


  Hm?  Failing to inline is already a performance hit, which is why
  Kurt got interested in this in the first place.

  I think you're way overthinking this.  Where we started was just
  a proposal to try to expand the set of inline-ing compilers beyond
  gcc only.  I don't see why we need to do anything but that.  The
  code is fine as-is except for the control #ifdefs.

My proposal is basically about allowing more widespread use of
static inline.  That is - static inline does not need to be
paired with an equivalent macro.

But if C89 compilers are still the project's primary target, then this
cannot be allowed.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] recovery is stuck when children are not processing SIGQUIT from previous crash

2009-12-16 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Here is a set of patches to address this issue.
 The first one is a small refactoring of the signal setting portability
 business.

OK

 The second one fixes the SIGQUIT handler inadvertently unblocking
 SIGQUIT within itself.

OK

 The third one installs an alarm so that if the ereport() call in
 quickdie() doesn't finish after 60 seconds, it skips it and finishes up.
 The precise logic of this could be debated, but it more or less appears
 to get the job done.

I'm not too happy with depending on alarm(), which according to the
pgbench sources is not portable to Windows.  The postmaster does
something equivalent using enable_sig_alarm(); can we use that?

A different line of thought is that this still doesn't fix the problem
because you're depending on a fair amount of libc infrastructure that
might not be too reentrant (and the same objection could probably be
mounted against enable_sig_alarm).  It would be better for the
postmaster to be in charge of enforcing the timeout, and have it issue
SIGKILL against children that don't die fast enough to suit it.
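
Something along these lines, in plain POSIX terms (a sketch only; the names
and the timeout value are made up):

#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait up to 'limit' seconds for 'child' to exit; escalate to SIGKILL after. */
static void reap_with_deadline(pid_t child, unsigned limit)
{
    unsigned waited = 0;

    kill(child, SIGQUIT);                   /* ask it to die quickly */
    for (;;)
    {
        int status;
        pid_t done = waitpid(child, &status, WNOHANG);

        if (done == child || done < 0)
            return;                         /* child is gone (or lost) */
        if (waited >= limit)
        {
            fprintf(stderr, "child %d ignored SIGQUIT, sending SIGKILL\n",
                    (int) child);
            kill(child, SIGKILL);
            waitpid(child, &status, 0);     /* now it cannot linger */
            return;
        }
        sleep(1);
        waited++;
    }
}

int main(void)
{
    pid_t pid = fork();

    if (pid == 0)
    {
        signal(SIGQUIT, SIG_IGN);           /* simulate a stuck backend */
        pause();
        _exit(0);
    }
    reap_with_deadline(pid, 3);             /* a real limit might be 60s */
    return 0;
}

In a real server the wait would be event-driven rather than a sleep loop, but
the escalation logic is the same idea.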

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane escribió:
 I think you're way overthinking this.  Where we started was just
 a proposal to try to expand the set of inline-ing compilers beyond
 gcc only.  I don't see why we need to do anything but that.  The
 code is fine as-is except for the control #ifdefs.

 IIRC Kurt was also on about getting rid of some ugly macros that could
 instead be coded as inline functions (fastgetattr for example)

I'd just bounce that as useless activity.  If they are macros now,
and work, the only possible effects of changing them are negative.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL project policy compendium

2009-12-16 Thread Greg Smith

Kurt Harriman wrote:

On 12/16/2009 7:10 AM, Tom Lane wrote:

the project policy is to not require any compiler features not found

 in C89.

Is there somewhere a compendium of such policies which
fledgling hackers should consult to avoid embarrassment?


The list of suggestions at http://wiki.postgresql.org/wiki/Developer_FAQ 
and http://wiki.postgresql.org/wiki/Submitting_a_Patch are what we've 
got now.  There's a number of us who try to record any interesting 
discussion bits like this one into one of those, to shorten the future 
learning curve for others.


--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Jeff Davis
On Sun, 2009-12-13 at 23:49 -0800, Scott Bailey wrote:
 So basically I have an anyrange pseudo type with the functions prev, 
 next, last, etc defined. So instead of hard coding range types, we would 
 allow the user to define their own range types. Basically if we are able 
 to determine the previous and next values of the base types we'd be able 
 to define a range type. I'm envisioning in a manner much like defining 
 an enum type.

After an off-list discussion with Scott, I think there may be a solution
here that works for everyone if we don't try so hard to unify the
implementation of discrete and continuous ranges. The API should be very
similar, of course, but the implementation doesn't need to be.

Continuous ranges absolutely require the following information: start,
end, and inclusivity information.

But discrete ranges can instead be stored by counting the number of
granules from the start point. For instance, it could be stored as:
start, num_granules.

That has a lot of benefits for discrete ranges of time. First of all, it
allows the algebra to work reasonably well for the days and months
part of the interval, so we can allow a granule of 1 day/week/month/year
for a timestamp range. For output of the range, we can then just
multiply the granule by the number of granules, and add that to the
start time; thus avoiding the incremental addition problem with date
math. I think this works reasonably well for timestamp/date ranges --
let me know if there is a problem here (aside from timestamptz, which I
address below).

Secondly, in the case of a timestamp range, we can use 7 bytes for
storing the number of granules rather than another full 8-byte
timestamp, leaving one byte for flags to represent NULL boundaries,
infinite boundaries, etc. For timestamps that would still mean that an
interval could be 2000 years long with '1 microsecond' granularity. For
dates, 3 bytes is sufficient for a date range 45000 years long with
granules of '1 day'. That means that we can get back down to a 16 byte
representation for timestamp ranges, or 8 byte representation for date
ranges. There are a few details, like infinite ranges, but those can be
pretty easily solved with flags as well.
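
To make the byte counting concrete, here is a rough C sketch of what such a
packed value could look like (purely illustrative; not a proposed on-disk
format):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* 16-byte discrete timestamp range: 8-byte start (microseconds), a 56-bit
 * granule count, and one flag byte for things like infinite/null bounds. */
typedef struct DiscreteTsRange
{
    int64_t  start;             /* anchor, e.g. microseconds since an epoch */
    uint8_t  flags;             /* e.g. 0x01 = open-ended on the right */
    uint8_t  count[7];          /* number of granules, little-endian 56-bit */
} DiscreteTsRange;

static void range_set_count(DiscreteTsRange *r, uint64_t n)
{
    int i;
    for (i = 0; i < 7; i++)
        r->count[i] = (uint8_t) (n >> (8 * i));
}

static uint64_t range_get_count(const DiscreteTsRange *r)
{
    uint64_t n = 0;
    int i;
    for (i = 0; i < 7; i++)
        n |= (uint64_t) r->count[i] << (8 * i);
    return n;
}

int main(void)
{
    DiscreteTsRange r;
    memset(&r, 0, sizeof r);
    r.start = 0;                         /* some anchor timestamp */
    range_set_count(&r, 86400000000ULL); /* one day of 1-microsecond granules */
    printf("sizeof = %zu, count = %llu\n", sizeof(DiscreteTsRange),
           (unsigned long long) range_get_count(&r));
    return 0;
}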

There's one problem, and that's for timestamptz ranges with intervals
that include days and months. Timezone adjustments are just not
well-defined for that kind of granule (nor would it be particularly
useful even if it magically worked), so this would have to be blocked
somehow. I think that's a special case, and we could provide the user
with a nice error message telling the user to use a date or timestamp
range instead.

So, the idea is to default to a continuous range type, but if the user
supplies a granule, prior and next functions, and other necessary
details, then it becomes a discrete range type.

 * continuous ranges can still have everything that everyone wants, 
   including flags to indicate special values.
 * discrete range granule is specified explicitly, so it's not an 
   implementation detail
 * discrete ranges can have a compact representation
 * discrete ranges would still have room for flags to indicate special 
   values
 
Comments?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Robert Haas
On Wed, Dec 16, 2009 at 12:31 PM, Jeff Davis pg...@j-davis.com wrote:
 There's one problem, and that's for timestamptz ranges with intervals
 that include days and months. Timezone adjustments are just not
 well-defined for that kind of granule (nor would it be particularly
 useful even if it magically worked), so this would have to be blocked
 somehow. I think that's a special case, and we could provide the user
 with a nice error message telling the user to use a date or timestamp
 range instead.

This seems like a fairly special-purpose type.  You'd be targeting it
at people who are very concerned with storing large numbers of these
(so they really care about space consumption) but for some reason
don't need to mix days and months (actually, the current interval
representation stores days, months, and seconds separately).  I
certainly think this might be useful to some people but it doesn't
really sounds like a general range type facility, since it seems to
involve some hacks that are fairly datatype-specific.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 [ hacky special-case representation for discrete timestamp ranges ]

I'm still not exactly clear on what the use-case is for discrete
timestamp ranges, and I wonder how many people are going to be happy
with a representation that can't handle a range that's open-ended
on the left.

 So, the idea is to default to a continuous range type, but if the user
 supplies a granule, prior and next functions, and other necessary
 details, then it becomes a discrete range type.

Huh?  You're not going to be able to have a special case data
representation for one or two data types at the same time as you have a
function-based datatype-independent concept of a parameterized range
type.  Well, maybe you could have special code paths for just date and
timestamp but it'd be horrid.

More importantly, the notion of a representation granule is still 100%
wishful thinking for any inexact-representation datatype, which is going
to be a severe crimp in getting this accepted for timestamp, let alone
defining it in a way that would allow users to try to apply it to
floats.  Float timestamps might not be the default case anymore but they
are still supported.

I think you should let go of the feeling that you have to shave bytes
off the storage format.  You're creating a whole lot of work for
yourself and a whole lot of user-visible corner cases in return for
what ultimately isn't much.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Scott Bailey

Jeff Davis wrote:

On Sun, 2009-12-13 at 23:49 -0800, Scott Bailey wrote:
So basically I have an anyrange pseudo type with the functions prev, 
next, last, etc defined. So instead of hard coding range types, we would 
allow the user to define their own range types. Basically if we are able 
to determine the previous and next values of the base types we'd be able 
to define a range type. I'm envisioning in a manner much like defining 
an enum type.


After an off-list discussion with Scott, I think there may be a solution
here that works for everyone if we don't try so hard to unify the
implementation of discrete and continuous ranges. The API should be very
similar, of course, but the implementation doesn't need to be.

Continuous ranges absolutely require the following information: start,
end, and inclusivity information.

But discrete ranges can instead be stored by counting the number of
granules from the start point. For instance, it could be stored as:
start, num_granules.

That has a lot of benefits for discrete ranges of time. First of all, it
allows the algebra to work reasonably well for the days and months
part of the interval, so we can allow a granule of 1 day/week/month/year
for a timestamp range. For output of the range, we can then just
multiply the granule by the number of granules, and add that to the
start time; thus avoiding the incremental addition problem with date
math. I think this works reasonably well for timestamp/date ranges --
let me know if there is a problem here (aside from timestamptz, which I
address below).

Secondly, in the case of a timestamp range, we can use 7 bytes for
storing the number of granules rather than another full 8-byte
timestamp, leaving one byte for flags to represent NULL boundaries,
infinite boundaries, etc. For timestamps that would still mean that an
interval could be 2000 years long with '1 microsecond' granularity. For
dates, 3 bytes is sufficient for a date range 45000 years long with
granules of '1 day'. That means that we can get back down to a 16 byte
representation for timestamp ranges, or 8 byte representation for date
ranges. There are a few details, like infinite ranges, but those can be
pretty easily solved with flags as well.

There's one problem, and that's for timestamptz ranges with intervals
that include days and months. Timezone adjustments are just not
well-defined for that kind of granule (nor would it be particularly
useful even if it magically worked), so this would have to be blocked
somehow. I think that's a special case, and we could provide the user
with a nice error message telling the user to use a date or timestamp
range instead.

So, the idea is to default to a continuous range type, but if the user
supplies a granule, prior and next functions, and other necessary
details, then it becomes a discrete range type.

 * continuous ranges can still have everything that everyone wants, 
   including flags to indicate special values.
 * discrete range granule is specified explicitly, so it's not an 
   implementation detail

 * discrete ranges can have a compact representation
 * discrete ranges would still have room for flags to indicate special 
   values
 
Comments?


As I pointed out off-list, I think the granularity for a timestamp range 
should be limited to hours and smaller. Anything larger is asking for 
trouble. And quite honestly, if they wanted day granularity, they should 
use a date range. Also, I think the granule should be the same type as is 
returned when subtracting two values of the subtype. So the granule of a 
date range should be int, not interval. And if a user wanted something with 
month granularity, perhaps an enum range of 'MM' would be better.


Quite honestly the following 3 cases would probably meet 99% of need:
CREATE TYPE period AS RANGE(timestamptz(0), interval '1 s');
CREATE TYPE period AS RANGE(timestamptz(3), interval '1 ms');
CREATE TYPE period AS RANGE(timestamptz, interval '1 us');

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Alvaro Herrera
Robert Haas escribió:
 On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de wrote:
  On Wednesday 16 December 2009 16:24:42 Robert Haas wrote:
     Inserts and deletes follow the same protocol, obtaining an exclusive
     lock on the row after the one being inserted or deleted. The result
     of this locking protocol is that a range scan prevents concurrent
     inserts or delete within the range of the scan, and vice versa.
  
   That sounds like it should actually work.
 
  Only if you can guarantee that the database will access the rows using
  some particular index.  If it gets to the data some other way it might
  accidentally circumvent the lock.  That's kind of a killer in terms of
  making this work for PostgreSQL.
  Isn't the whole topic only relevant for writing access? There you have to
  access the index anyway.
 
 Yeah, I guess you have to insert the new tuple.  I guess while you
 were at it you might check whether the next tuple is locked...

So you'd have to disable HOT updates when true serializability was
active?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Scott Bailey

Tom Lane wrote:

Jeff Davis pg...@j-davis.com writes:

[ hacky special-case representation for discrete timestamp ranges ]


I'm still not exactly clear on what the use-case is for discrete
timestamp ranges, and I wonder how many people are going to be happy
with a representation that can't handle a range that's open-ended
on the left.


They wouldn't. But the timestamp data would be the anchor, not 
necessarily the start point. As long as we allow ranges unbounded on both 
ends we'd be OK.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Robert Haas
On Wed, Dec 16, 2009 at 1:14 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Robert Haas escribió:
 On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de wrote:
  On Wednesday 16 December 2009 16:24:42 Robert Haas wrote:
     Inserts and deletes follow the same protocol, obtaining an exclusive
     lock on the row after the one being inserted or deleted. The result
     of this locking protocol is that a range scan prevents concurrent
     inserts or delete within the range of the scan, and vice versa.
  
   That sounds like it should actually work.
 
  Only if you can guarantee that the database will access the rows using
  some particular index.  If it gets to the data some other way it might
  accidentally circumvent the lock.  That's kind of a killer in terms of
  making this work for PostgreSQL.
  Isn't the whole topic only relevant for writing access? There you have to
  access the index anyway.

 Yeah, I guess you have to insert the new tuple.  I guess while you
 were at it you might check whether the next tuple is locked...

 So you'd have to disable HOT updates when true serializability was
 active?

I thought about that, but I don't think so.   HOT only applies to
updates, and predicate locking only applies to inserts.  Unless I have
my head in the sand?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Robert Haas
On Wed, Dec 16, 2009 at 1:25 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Dec 16, 2009 at 1:14 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 Robert Haas escribió:
 On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de wrote:
  On Wednesday 16 December 2009 16:24:42 Robert Haas wrote:
     Inserts and deletes follow the same protocol, obtaining an exclusive
     lock on the row after the one being inserted or deleted. The result
     of this locking protocol is that a range scan prevents concurrent
     inserts or delete within the range of the scan, and vice versa.
  
   That sounds like it should actually work.
 
  Only if you can guarantee that the database will access the rows using
  some particular index.  If it gets to the data some other way it might
  accidentally circumvent the lock.  That's kind of a killer in terms of
  making this work for PostgreSQL.
  Isn't the whole topic only relevant for writing access? There you have to
  access the index anyway.

 Yeah, I guess you have to insert the new tuple.  I guess while you
 were at it you might check whether the next tuple is locked...

 So you'd have to disable HOT updates when true serializability was
 active?

 I thought about that, but I don't think so.   HOT only applies to
 updates, and predicate locking only applies to inserts.  Unless I have
 my head in the sand?

Err, no, wait.  Predicate locking can apply to updates, but since HOT
updates never update an indexed column, I think we might still be OK?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Boszormenyi Zoltan
Robert Haas írta:
 On Wed, Dec 16, 2009 at 1:25 PM, Robert Haas robertmh...@gmail.com wrote:
   
 On Wed, Dec 16, 2009 at 1:14 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 
 Robert Haas escribió:
   
 On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de wrote:
 
 On Wednesday 16 December 2009 16:24:42 Robert Haas wrote:
   
   Inserts and deletes follow the same protocol, obtaining an exclusive
   lock on the row after the one being inserted or deleted. The result
   of this locking protocol is that a range scan prevents concurrent
   inserts or delete within the range of the scan, and vice versa.

 That sounds like it should actually work.
   
 Only if you can guarantee that the database will access the rows using
 some particular index.  If it gets to the data some other way it might
 accidentally circumvent the lock.  That's kind of a killer in terms of
 making this work for PostgreSQL.
 
 Isn't the whole topic only relevant for writing access? There you have to
 access the index anyway.
   
 Yeah, I guess you have to insert the new tuple.  I guess while you
 were at it you might check whether the next tuple is locked...
 
 So you'd have to disable HOT updates when true serializability was
 active?
   
 I thought about that, but I don't think so.   HOT only applies to
 updates, and predicate locking only applies to inserts.  Unless I have
 my head in the sand?
 

 Err, no, wait.  Predicate locking can apply to updates, but since HOT
 updates never update an indexed column, I think we might still be OK?
   

A predicate can include columns from an index plus others.
Am I missing something?

 ...Robert

   


-- 
Bible has answers for everything. Proof:
But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil. (Matthew 5:37) - basics of digital technology.
May your kingdom come - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Kevin Grittner
Alvaro Herrera alvhe...@commandprompt.com wrote:
 
 So you'd have to disable HOT updates when true serializability was
 active?
 
I wouldn't think so; but someone familiar with HOT logic could
probably determine whether the unmodified algorithm could be used by
reviewing the simplifying assumptions near the bottom of page 42,
and the page about Generalizing to other database engines (section
4.8).  I think those portions might stand on their own without
reading the rest of the document.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Scott Bailey
Ok, silly question here. But how do you determine the length of a 
continuous range? By definition length of [a, b) and (a, b] = b-a. But 
what about (a,b) and [a,b]? Are we saying that because they are 
continuous, the difference between the values included in the range and 
those excluded is so infinitesimally small as not to matter? Thus 
length (a,b) == length [a,b] == length [a,b)? And if that is the case, 
does the inclusiveness of the range really even matter?


And can anyone point me to a reference for working with continuous 
ranges? Google just insists that I spelled contiguous wrong.


Scott

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Tom Lane
Scott Bailey arta...@comcast.net writes:
 As I pointed out off-list, I think the granularity for timestamp range 
 should be limited to hours and smaller. Anything larger is asking for 
 trouble. And quite honestly if they wanted day granularity, they should 
 use date range.

I'm still not real clear on what the expected use-case is for this.
You're evidently envisioning applications where the allowed form of
an interval is constrained, but in the cases I can think of, the
constraints are a lot more convoluted than what you're proposing.
For example, if you're trying to do classroom scheduling, it might be
useful to constrain the periods to start and end on hour boundaries
--- but the next thing you'll want is to have it know that the next
slot after 5pm Friday is 8am Monday.  Except on holidays.  And then
there's the fact that my alma mater starts most hour-long classes on
the half hour.

I think that wiring such constraints into the low-level datatype is
doomed to failure.  What you'd be better off with is a function that
finds the next period given a current period and some suitable
representation of the off-limits intervals.  The argument for having
granularity wired into the datatype seems to boil down to just space
savings.  I don't find that compelling enough to justify code
contortions and user-visible restrictions on functionality.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] XLogInsert

2009-12-16 Thread Gurjeet Singh
2009/12/15 Greg Smith g...@2ndquadrant.com

 Jaime Casanova wrote:

 So in this extreme case avg tps is just 6 transactions better


 Great job trying to find the spot where the code worked better.  I'm not so
 sure I trust pgbench results where the TPS was so low though.  Which leads
 us right back to exactly how Jeff measured his original results.

 As I said already, I think we need more insight into Jeff's performance
 report, a way to replicate that test, to look a bit at the latency as
 reported by the updated LWLock patch that Pierre submitted.  Tweaking your
 test to give more useful results is a nice second opinion on top of that.
  But we're out of time for now, so this patch is getting returned with
 feedback.  I encourage Jeff to resubmit the same patch or a better one with
 a little more data on performance measurements to our final 8.5 CommitFest
 in hopes we can confirm this an improvement worth committing.



Last week I worked on a FUSE-based filesystem, which I call BlackholeFS. It's
similar to /dev/null, but for directories. Basically it simply returns
success for all the writes, but doesn't do any writes on the files under it.
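
The core of it is tiny; roughly something like this under the FUSE high-level
API (a simplified sketch, not the exact BlackholeFS code):

/* compile with: gcc -Wall blackhole.c `pkg-config fuse --cflags --libs` */
#define FUSE_USE_VERSION 26
#include <fuse.h>

/* Pretend every write succeeded in full, but never touch any backing store. */
static int blackhole_write(const char *path, const char *buf, size_t size,
                           off_t offset, struct fuse_file_info *fi)
{
    (void) path; (void) buf; (void) offset; (void) fi;
    return (int) size;          /* "all bytes written", data discarded */
}

static struct fuse_operations blackhole_ops = {
    .write = blackhole_write,
    /* a real filesystem also needs getattr/open/create etc.; omitted here */
};

int main(int argc, char *argv[])
{
    return fuse_main(argc, argv, &blackhole_ops, NULL);
}

Mounted over pg_xlog, every WAL write would "succeed" instantly, which is
exactly what makes it interesting (and dangerous) for this kind of test.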

Would moving the pg_xlog/ (and possibly table data too) to such a filesystem
exercise this patch better?

Best regards,
-- 
Lets call it Postgres

EnterpriseDB  http://www.enterprisedb.com

gurjeet[.sin...@enterprisedb.com

singh.gurj...@{ gmail | hotmail | indiatimes | yahoo }.com
Twitter: singh_gurjeet
Skype: singh_gurjeet

Mail sent from my BlackLaptop device


Re: [HACKERS] XLogInsert

2009-12-16 Thread Andres Freund
On Wednesday 16 December 2009 20:07:07 Gurjeet Singh wrote:
 2009/12/15 Greg Smith g...@2ndquadrant.com
 
  Jaime Casanova wrote:
  So in this extreme case avg tps is just 6 transactions better
 
  Great job trying to find the spot where the code worked better.  I'm not
  so sure I trust pgbench results where the TPS was so low though.  Which
  leads us right back to exactly how Jeff measured his original results.
 
  As I said already, I think we need more insight into Jeff's performance
  report, a way to replicate that test, to look a bit at the latency as
  reported by the updated LWLock patch that Pierre submitted.  Tweaking
  your test to give more useful results is a nice second opinion on top of
  that. But we're out of time for now, so this patch is getting returned
  with feedback.  I encourage Jeff to resubmit the same patch or a better
  one with a little more data on performance measurements to our final 8.5
  CommitFest in hopes we can confirm this an improvement worth committing.
 
 Last week I worked on a FUSE-based filesystem, which I call BlackholeFS.
  It's similar to /dev/null, but for directories. Basically it simply returns
  success for all the writes, but doesn't do any writes on the files under
  it.
I doubt that it will be faster than a tmpfs - the additional context switches 
et al. will probably already hurt.
If you constrain checkpoint_segments to something sensible it shouldn't use 
too much memory.


Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Jeff Davis
On Wed, 2009-12-16 at 12:42 -0500, Robert Haas wrote:
 On Wed, Dec 16, 2009 at 12:31 PM, Jeff Davis pg...@j-davis.com wrote:
  There's one problem, and that's for timestamptz ranges with intervals
  that include days and months. Timezone adjustments are just not
  well-defined for that kind of granule (nor would it be particularly
  useful even if it magically worked), so this would have to be blocked
  somehow. I think that's a special case, and we could provide the user
  with a nice error message telling the user to use a date or timestamp
  range instead.
 
 This seems like a fairly special-purpose type.  You'd be targeting it
 at people who are very concerned with storing large numbers of these
 (so they really care about space consumption) but for some reason
 don't need to mix days and months (actually, the current interval
 representation stores days, months, and seconds separately).  I
 certainly think this might be useful to some people but it doesn't
 really sounds like a general range type facility, since it seems to
 involve some hacks that are fairly datatype-specific.

My statement should have read days or months. In other words, you
can't have a timestamptz range with a granularity of '3 days'. But if
that's your granularity, as Scott says, you should be using a date
range, not a timestamptz range.

Timestamptz ranges are only really useful when you have a granularity
measured in seconds (or some fraction or multiple thereof). Otherwise,
the timezone adjustment doesn't make any sense.

So this isn't a case of limited functionality, just that we need to
inform the user that a timestamptz range with granularity '1 day' or '1
month' makes no sense.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PATCH: Add hstore_to_json()

2009-12-16 Thread David E. Wheeler
I just realized that this was easy to do, and despite my complete lack of C 
skillz was able to throw this together in a couple of hours. It might be handy 
to some, though the possible downsides are:

* No json_to_hstore().
* Leads to requests for hstore_to_yaml(), hstore_to_xml(), etc.
* Andrew Gierth said “no” when I suggested it.

But it's kind of handy, too. Thoughts?

Best,

David




hstore_to_json.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pinging Dano

2009-12-16 Thread Markus Wanner

Hi,

Could the main editor named Dano of the wiki page about Parallel 
Query Execution 
(http://wiki.postgresql.org/wiki/Parallel_Query_Execution) please speak up?


Is there any code or patch available ATM? What discussion with Tom and 
Simon is that page referring to?


Regards

Markus Wanner


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Jeff Davis
On Wed, 2009-12-16 at 12:50 -0500, Tom Lane wrote:
 I'm still not exactly clear on what the use-case is for discrete
 timestamp ranges, and I wonder how many people are going to be happy
 with a representation that can't handle a range that's open-ended
 on the left.

Huh? We're miscommunicating somewhere. Discrete ranges are values, and
those values can be displayed a number of different ways. That's one of
the biggest advantages.

The very same discrete range can be displayed as open-open, closed-open,
open-closed, or closed-closed. There are edge cases, like how infinity
is never closed, and overflow conditions. But generally speaking, you
have more options for presenting a discrete range than a continuous
range.

The range [5, 7) is equal to the set {5, 6} and equal to the ranges:
(4,7), (4,6], [5,7), and [5,6]. One application can insert it as [5,7)
and another can read it as (4,6].

That's the use case: the application's preferences don't have to match.
It's OK to mix various representation preferences, because you can
convert between them. The on disk format happens to hint at one
particular canonical form, but doesn't enforce that on anyone.
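
As a toy illustration with integer ranges (hypothetical helper code, not
part of any patch), converting any of those spellings to the canonical
closed-open form only needs next():

#include <stdbool.h>
#include <stdio.h>

/* A discrete integer range with explicit inclusivity flags. */
typedef struct IntRange
{
    int  lower, upper;
    bool lower_inc, upper_inc;
} IntRange;

/* Normalize to the closed-open form [lower, upper), using next() = +1. */
static IntRange canonicalize(IntRange r)
{
    if (!r.lower_inc) { r.lower++; r.lower_inc = true; }   /* (4,..  -> [5,.. */
    if (r.upper_inc)  { r.upper++; r.upper_inc = false; }  /* ..,6]  -> ..,7) */
    return r;
}

int main(void)
{
    IntRange a = {4, 6, false, true};   /* (4,6] */
    IntRange b = {5, 7, true, false};   /* [5,7) */
    IntRange ca = canonicalize(a), cb = canonicalize(b);

    printf("[%d,%d) equals [%d,%d)? %s\n", ca.lower, ca.upper,
           cb.lower, cb.upper,
           (ca.lower == cb.lower && ca.upper == cb.upper) ? "yes" : "no");
    return 0;
}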

 Huh?  You're not going to be able to have a special case data
 representation for one or two data types at the same time as you have a
 function-based datatype-independent concept of a parameterized range
 type.  Well, maybe you could have special code paths for just date and
 timestamp but it'd be horrid.

They aren't supposed to be exactly the same API, I said that from the
start. There are API differences between continuous and discrete ranges,
and we shouldn't ignore them.

One important difference is that (barring overflow conditions and
special values) prior, first, last, and next are defined for all
discrete range values, but not for all continuous range values.

For instance, the discrete range [5,7) has prior=4, first=5, last=6,
next=7.

Whereas the continuous range [5,7) has prior=undef, first=5, last=undef,
next=7.

We could define one API that treats discrete and continuous ranges
differently. But you'll never be able to transform a continuous range to
a different representation, while you can do so with a discrete range.

 More importantly, the notion of a representation granule is still 100%
 wishful thinking for any inexact-representation datatype, which is going
 to be a severe crimp in getting this accepted for timestamp, let alone
 defining it in a way that would allow users to try to apply it to
 floats.  Float timestamps might not be the default case anymore but they
 are still supported.

If the only objection is that a superuser can confuse the system by
poorly defining a range type on a non-default build, I think that
objection can be overcome.

 I think you should let go of the feeling that you have to shave bytes
 off the storage format.  You're creating a whole lot of work for
 yourself and a whole lot of user-visible corner cases in return for
 what ultimately isn't much.

This isn't just to shave bytes. It's also because I like the semantics
of discrete ranges for many cases.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PATCH: Spurious 22 in hstore.sgml

2009-12-16 Thread David E. Wheeler
*** a/doc/src/sgml/hstore.sgml
--- b/doc/src/sgml/hstore.sgml
***
*** 278,284 
  <entry>get <type>hstore</>'s keys as a set</entry>
  <entry><literal>skeys('a=&gt;1,b=&gt;2')</literal></entry>
  <entry>
! 22<programlisting>
  a
  b
  </programlisting></entry>
--- 278,284 ----
  <entry>get <type>hstore</>'s keys as a set</entry>
  <entry><literal>skeys('a=&gt;1,b=&gt;2')</literal></entry>
  <entry>
! <programlisting>
  a
  b
  </programlisting></entry>


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Remove gcc dependency in definition of inline functions

2009-12-16 Thread Peter Eisentraut
On ons, 2009-12-16 at 10:49 -0500, Tom Lane wrote:
 Marko Kreen mark...@gmail.com writes:
  So the plain-C89 compilers would be downgraded to second-class
  targets, not worth getting max performance out of them.
 
 Hm?  Failing to inline is already a performance hit, which is why
 Kurt got interested in this in the first place.
 
 I think you're way overthinking this.  Where we started was just
 a proposal to try to expand the set of inline-ing compilers beyond
 gcc only.  I don't see why we need to do anything but that.  The
 code is fine as-is except for the control #ifdefs.

I think the ifdefs should just be HAVE_INLINE && !MSVC, right?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Spurious 22 in hstore.sgml

2009-12-16 Thread Magnus Hagander
On Wed, Dec 16, 2009 at 20:34, David E. Wheeler da...@kineticode.com wrote:
 *** a/doc/src/sgml/hstore.sgml
 --- b/doc/src/sgml/hstore.sgml

Heh, interesting. That clearly shouldn't be there. Applied.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Alvaro Herrera
to...@tuxteam.de wrote:

 (and as Andrew Dunstan pointed out off-list: I was wrong with my bold
 assertion that one can squeeze infinitely many (arbitrary length)
 strings between two given. This is not always the case).

Of course you can do that if you assume lexicographical order, or any
other arbitrary order.  The interesting point is whether there exists
some ordering on which this does not happen.  And in fact there is:
order strings by length first, and then lexicographically.  If you do
this then you have next() and prev() for any given string.  If you use
ASCII only, you have a.next = b, and so on.

There is the argument that some languages do not sort lexicographically
but this is also beside the point -- you only need to find *some* way
to sort the characters in the alphabet.  If you dictate that in your
ordering á comes before à and both after a, and all of them before
b, then you know that a.next = á and á.next = à and à.next = b.  (Note
that I have also dictated that there is no other character that sorts
after a and before b, which is perfectly possible because the alphabet
is fixed for any given language.  You could use the complete list of
characters coded in a given set of Unicode planes, or even all extant
planes, to obtain the same result).
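
To make that concrete, here is a rough sketch of next() under the "order
by length, then lexicographically" rule, restricted to lowercase ASCII
a-z for simplicity (an illustration only, not a proposal):

    create function next_string(s text) returns text as $$
    declare
        alphabet text := 'abcdefghijklmnopqrstuvwxyz';
        i int := length(s);
        c text;
    begin
        while i > 0 loop
            c := substr(s, i, 1);
            if c <> 'z' then
                -- bump position i, reset everything after it to 'a'
                return substr(s, 1, i - 1)
                    || substr(alphabet, strpos(alphabet, c) + 1, 1)
                    || repeat('a', length(s) - i);
            end if;
            i := i - 1;
        end loop;
        -- s was all 'z's (or empty): the next string is the shortest longer one
        return repeat('a', length(s) + 1);
    end;
    $$ language plpgsql immutable strict;

so next_string('a') = 'b', next_string('z') = 'aa', next_string('az') = 'ba',
and so on.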

Defining strings with this ordering means you can have some useful
ranges like [a-z], but then you cannot meaningfully use ranges for
things like [aardvark - zulu] --- note that in this particular example,
the order is reversed, because zulu comes before aardvark, which is
probably not what you want.  (And zzz.next = aaaa, for what it's worth.)

In short, I think that while it is possible to define ranges of strings,
it is not as useful as one would like.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[HACKERS] Does verbose Need to be Reserved?

2009-12-16 Thread David E. Wheeler
Hey All,

I was just getting a new version of pgTAP ready for release, and while testing 
it on HEAD, I got this error:

+ psql:pgtap.sql:5789: ERROR:  syntax error at end of input
+ LINE 28: IF verbose THEN RETURN NEXT diag(tests[i] ||...
+^

I asked on IRC, and Andrew “RhodiumToad” Gierth pointed out that it became a 
reserved word at some point. I'm fine to rename my variable, but Andrew and I 
were wondering if it's really necessary for verbose to be reserved, since 
it's not in the spec.

Thanks,

David




Re: [HACKERS] Range types

2009-12-16 Thread Martijn van Oosterhout
On Tue, Dec 15, 2009 at 04:29:26PM -0800, Jeff Davis wrote:
 On Tue, 2009-12-15 at 18:06 -0600, decibel wrote:
  Now that varlena's don't have an enormous fixed overhead, perhaps it's
  worth looking at using them. Obviously some operations would be
  slower, but for your stated examples of auditing and history, I
  suspect that you're not going to notice the overhead that much.
 
 For most varvarlena types, you only get stuck with the full alignment
 burden if you get unlucky. In this case, we're moving from 16 bytes to
 17, which really means 24 bytes with alignment. Try creating two tables:
 
   create table foo(i int8, t1 timestamp, t2 timestamp);
   create table bar(i int8, c char, t1 timestamp, t2 timestamp);

But a period type will take just one or two more bytes if you don't
require alignment. Alignment on a varlena type seems silly anyway,
since you'll be aligning the header byte rather than the content.

In the implementation you may need to copy the content before
processing to satisfy the alignment of the contained type, but that's
just a SMOP.
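
As a quick illustration of the padding Jeff describes (a rough check
only -- exact numbers depend on platform and tuple-header details):

    create temp table foo (i int8, t1 timestamp, t2 timestamp);
    create temp table bar (i int8, c char, t1 timestamp, t2 timestamp);
    insert into foo values (1, now(), now());
    insert into bar values (1, 'x', now(), now());
    select pg_column_size(f.*) as foo_row, pg_column_size(b.*) as bar_row
      from foo f, bar b;

On a typical 64-bit build the bar rows come out about 8 bytes wider than
the foo rows, not 1, because the single extra byte pushes t1 back onto an
8-byte boundary.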

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [HACKERS] Does verbose Need to be Reserved?

2009-12-16 Thread Andrew Gierth
 David == David E Wheeler da...@kineticode.com writes:

 David Hey All,
 David I was just getting a new version of pgTAP ready for release, and while 
testing it on HEAD, I got this error:

 David + psql:pgtap.sql:5789: ERROR:  syntax error at end of input
 David + LINE 28: IF verbose THEN RETURN NEXT diag(tests[i] 
||...
 David +^

 David I asked on IRC, and Andrew “RhodiumToad” Gierth pointed out
 David that it became a reserved word at some point. I'm fine to
 David rename my variable, but Andew and I were wondering if it's
 David really necessary for verbose to be reserved, since it's not
 David in the spec.

Looking at it more closely, this is likely to be fallout from the
plpgsql lexer/parser changes; it probably worked before only because
plpgsql was doing its own thing rather than using the main lexer.

VERBOSE has been reserved all along in order to distinguish 
'vacuum verbose;' from 'vacuum tablename;' and so on; but it's still an
interesting pitfall for plpgsql users.

-- 
Andrew (irc:RhodiumToad)



Re: [HACKERS] Range types

2009-12-16 Thread Martijn van Oosterhout
On Wed, Dec 16, 2009 at 10:57:19AM -0800, Scott Bailey wrote:
 Ok, silly question here. But how do you determine the length of a  
 continuous range? By definition length of [a, b) and (a, b] = b-a. But  
 what about (a,b) and [a,b]? Are we saying that because they are  
 continuous, the difference between values included in the range and  
 those excluded are so infinitesimally small so as not to matter? Thus  
 length (a,b) == length [a,b] == length [a,b)? And if that is the case,  
 does the inclusiveness of the range really even matter?

Short answer: Yes

Longer answer: You need to decide on your definition of length, and what
you usually use is the (Lebesgue) measure. And yes, the difference between
the two is a set of so-called measure 0 and thus has no effect on the length.

Note the measure has to be taken considering the intervals as intervals
on the real line. The integers by themselves have measure zero (they are
countable). So for the length of a set of integers you might use the
count of its elements instead.
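
Spelled out, writing m() for the (Lebesgue) measure of a set of reals:

    m([a,b]) = b - a,   m({a}) = m({b}) = 0
    m((a,b)) = m([a,b]) - m({a}) - m({b}) = b - a

so [a,b], [a,b), (a,b] and (a,b) all have the same length b - a, whereas
for a discrete range over the integers you would count elements instead:
|{a, a+1, ..., b}| = b - a + 1.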

http://planetmath.org/encyclopedia/ProofThatTheOuterLebesgueMeasureOfAnIntervalIsItsLength.html
http://en.wikipedia.org/wiki/Outer_measure

As for continuous, the way you use it above is not one I recognise.
There are contiguous sets, but they are something else.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Robert Haas
On Wed, Dec 16, 2009 at 1:29 PM, Boszormenyi Zoltan z...@cybertec.at wrote:
 Robert Haas wrote:
 On Wed, Dec 16, 2009 at 1:25 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Dec 16, 2009 at 1:14 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:

 Robert Haas escribió:

 On Wed, Dec 16, 2009 at 10:29 AM, Andres Freund and...@anarazel.de 
 wrote:

 On Wednesday 16 December 2009 16:24:42 Robert Haas wrote:

   Inserts and deletes follow the same protocol, obtaining an exclusive
   lock on the row after the one being inserted or deleted. The result
   of this locking protocol is that a range scan prevents concurrent
   inserts or delete within the range of the scan, and vice versa.

 That sounds like it should actually work.

 Only if you can guarantee that the database will access the rows using
 some particular index.  If it gets to the data some other way it might
 accidentally circumvent the lock.  That's kind of a killer in terms of
 making this work for PostgreSQL.

  Isn't the whole topic only relevant for write access? There you have to
 access the index anyway.

 Yeah, I guess you have to insert the new tuple.  I guess while you
 were at it you might check whether the next tuple is locked...

 So you'd have to disable HOT updates when true serializability was
 active?

 I thought about that, but I don't think so.   HOT only applies to
 updates, and predicate locking only applies to inserts.  Unless I have
 my head in the sand?


 Err, no, wait.  Predicate locking can apply to updates, but since HOT
 updates never update an indexed column, I think we might still be OK?


 A predicate can include columns from an index plus others.
 Am I missing something?

Hmm, interesting point.  In that case you couldn't use the index to
enforce predicate locking under MVCC without disabling HOT.  But there
will be other cases where that wouldn't help anyway - a predicate
could also include unindexed columns exclusively.  For those, the
traditional approach (not the one discussed in this paper) probably
requires locking against any heap insert, or checking each new heap
insert against the constraint, or... something.

...Robert



Re: [HACKERS] Range types

2009-12-16 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 In short, I think that while it is possible to define ranges of strings,
 it is not as useful as one would like.

Note it is not the *range* that is the problem, it is the assumption
that there's a unique next string.  There's no unique next in the
reals or rationals either, but we have no problem considering intervals
over those sets.

regards, tom lane



Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Robert Haas
On Wed, Dec 16, 2009 at 1:30 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Alvaro Herrera alvhe...@commandprompt.com wrote:

 So you'd have to disable HOT updates when true serializability was
 active?

 I wouldn't think so; but someone familiar with HOT logic could
 probably determine whether the unmodified algorithm could be used by
 reviewing the simplifying assumptions near the bottom of page 42,
 and the page about Generalizing to other database engines (section
 4.8).  I think those portions might stand on their own without
 reading the rest of the document.

This thread veered off into a discussion of the traditional technique,
rather than the one in the paper.  I think that's the part Alvaro was
responding to.

...Robert



Re: [HACKERS] Does verbose Need to be Reserved?

2009-12-16 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 I asked on IRC, and Andrew “RhodiumToad” Gierth pointed out that
 it became a reserved word at some point.

Some point would have been around the time VACUUM VERBOSE got
invented, ie January 1997 according to the CVS logs.  We can't unreserve
it until you're ready to break that syntax (is VERBOSE a table name
or a flag for a database-wide vacuum?).  Don't hold your breath.

regards, tom lane



Re: [HACKERS] Range types

2009-12-16 Thread Jeff Davis
On Wed, 2009-12-16 at 13:59 -0500, Tom Lane wrote:
 The argument for having
 granularity wired into the datatype seems to boil down to just space
 savings.  I don't find that compelling enough to justify code
 contortions and user-visible restrictions on functionality.

The argument (at least from me) is that discrete ranges have better
semantics. The counterargument was that the granularity of a timestamp
is an implementation detail. So I countered by making it explicit.

Space savings is not crucial, but it would be frustrating to needlessly
waste space.

I still have not seen an answer to the problem of changing the
representation of a continuous range. If you have the continuous range
[5, 10], you're pretty much stuck with that representation, even if the
application is expecting things in the form [ ).

From an application's standpoint, you probably want to get the
information about a range as separate columns (as opposed to parsing a
string). So an application expecting data in [) format might use a query
like:

 select ..., first(myperiod), next(myperiod) from mytable;

That gives the application complete information about the range. You can
even make a view over a table like that to make it even more transparent
to the application. It's not entirely unreasonable that many such
applications exist; there are many presentations and tutorials that have
been telling people to use a start and end column, and assume that
the start is inclusive and the end is exclusive.
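
For example, such a view might look like this (a sketch only -- the table,
its id column, and the first()/next() accessors are all hypothetical here):

    create view mytable_halfopen as
        select id,
               first(myperiod) as period_start,  -- inclusive
               next(myperiod)  as period_end     -- exclusive
          from mytable;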

If there is some other application that expects data in (] format, you
just use the query:

  select ..., prior(myperiod), last(myperiod) from mytable;

With discrete ranges, that all just works (barring overflow or special
values).

With continuous ranges, first() or next() might fail on some values that
were produced by some other application. Really, for continuous ranges,
you'd need to have a query like:

  select ..., start(myperiod), start_inclusive(myperiod),
end(myperiod), end_inclusive(myperiod) from mytable;

in order to have all of the information. And that means that the
application needs a full implementation of a range type to understand
the inclusivity and produce a correct result.

And to further make the case for allowing user-defined discrete ranges,
what about ip4r? That's a discrete range, and it's user-defined. And
that probably means other useful discrete ranges will be invented, as
well.

If we want to say that there is no discrete TIMESTAMP range by default,
and that the superuser has to define it, that's one thing. But if we say
that the only acceptable base types for discrete ranges will be
hard-wired, that's way too restrictive. If nothing else, it makes some
system types special, which is something we have not done in many other places.

Regards,
Jeff Davis




Re: [HACKERS] Range types

2009-12-16 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  In short, I think that while it is possible to define ranges of strings,
  it is not as useful as one would like.
 
 Note it is not the *range* that is the problem, it is the assumption
 that there's a unique next string.  There's no unique next in the
 reals or rationals either, but we have no problem considering intervals
 over those sets.

Yeah, agreed.  It's easy (I think) to define more useful ranges of
strings if you don't insist on having next.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Range types

2009-12-16 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 But a period type will take just one or two more bytes if you don't
 require alignment. Alignment on a varlena type seems silly anyway,
 since you'll be aligning the header byte rather than the content.

You might still end up paying the alignment overhead after the field,
of course.  But avoiding embedding the alignment in the type itself
seems worth doing.

One idea that might be interesting to consider in this regard is
force-packing varlena range values.  Consider a text range('a', 'b').
The datums are likely to come in with 4-byte headers requiring
alignment.  If we have the smarts to force them to 1-byte header form
inside the varlena range value, not only do we save bytes right there,
but we don't have to expend cycles to copy them somewhere to re-align
them before we can pass them to the datatype-specific functions.

regards, tom lane



Re: [HACKERS] Range types

2009-12-16 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 On Wed, 2009-12-16 at 12:50 -0500, Tom Lane wrote:
 I'm still not exactly clear on what the use-case is for discrete
 timestamp ranges, and I wonder how many people are going to be happy
 with a representation that can't handle a range that's open-ended
 on the left.

 Huh? We're miscommunicating somewhere.

Yeah, apparently.  By open-ended I meant -infinity left bound, or null
left bound if you prefer.  Not sure if there's a better term.

regards, tom lane



Re: [HACKERS] Range types

2009-12-16 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 On Wed, 2009-12-16 at 13:59 -0500, Tom Lane wrote:
 The argument for having
 granularity wired into the datatype seems to boil down to just space
 savings.  I don't find that compelling enough to justify code
 contortions and user-visible restrictions on functionality.

 The argument (at least from me) is that discrete ranges have better
 semantics. The counterargument was that the granularity of a timestamp
 is an implementation detail. So I countered by making it explicit.

Making it explicit doesn't fix the fact that you can't rely on the
arithmetic to be exact.

 I still have not seen an answer to the problem of changing the
 representation of a continuous range. If you have the continuous range
 [5, 10], you're pretty much stuck with that representation, even if the
 application is expecting things in the form [ ).

That is not our problem.  It's the application's problem if it can't
handle the concept.  You might as well be complaining that type numeric
is broken because it can represent values that will fail to fit into
float8 when some application tries to force them into that form.

 And to further make the case for allowing user-defined discrete ranges,
 what about ip4r?

What about it?  I don't have a problem with the concept that next() is
well defined for some datatypes.  I just have a problem with the concept
that it's well-defined for timestamps.  It's not, and I don't believe
that forcing it to have some definition is useful in the real world
(which, as a rule, isn't going to fit the simplifying assumptions you
have to make to make it even sort-of work).

regards, tom lane



Re: [HACKERS] Range types

2009-12-16 Thread Martijn van Oosterhout
On Wed, Dec 16, 2009 at 03:57:44PM -0500, Tom Lane wrote:
 Jeff Davis pg...@j-davis.com writes:
  I still have not seen an answer to the problem of changing the
  representation of a continuous range. If you have the continuous range
  [5, 10], you're pretty much stuck with that representation, even if the
  application is expecting things in the form [ ).
 
 That is not our problem.  It's the application's problem if it can't
 handle the concept.  You might as well be complaining that type numeric
 is broken because it can represent values that will fail to fit into
 float8 when some application tries to force them into that form.

However, it does seem reasonable to allow people to restrict, either by
typmod or a check constraint, the kinds of values that can be stored in
a particular column. Then an application can decide which way they want
their intervals to work and have the database enforce it.

(Intermediate values may become a different kind, just as long as the
result being stored is of the right kind.)

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-16 Thread Kevin Grittner
Robert, please forgive a couple of editorial inserts to your statement
-- I hope it clarifies.  If I've distorted your meaning, feel free
to straighten me out.   :-)
 
Robert Haas robertmh...@gmail.com wrote:
 
 This thread veered off into a discussion of the traditional
 [predicate locking] technique, rather than the [serializable] one
 in the paper.  I think that's the part Alvaro was responding to.
 
If you're right about Alvaro's concern -- my rough understanding is
that HOT creates a linked list of tuples which are mutations of one
another without altering any value which is part of an index.  If
that's anywhere near a correct understanding, I can't see how there
would be a problem with using the described locking techniques with
any tuple on the list.
 
As an aside, the thesis mentions smart use of multiple locking
granularities only under the future work section.  I can't see an
implementation being considered production quality without that, as
without it there would be no way to constrain the space required to
track the locks.  But there is no shortage of literature on how to
do that, so I view such discussions as more appropriate to low-level
implementation discussions should we ever get serious about using
the techniques which are the main thrust of Dr. Cahill's thesis.
 
If anyone is interested in reviewing recent literature on these
techniques, Dr. Cahill seemed to like (Hellerstein et al., 2007), to
the point where I may well track it down when I'm done pondering the
work which I referenced at the start of this thread.
 
-Kevin



Re: [HACKERS] Range types

2009-12-16 Thread Jeff Davis
On Wed, 2009-12-16 at 15:46 -0500, Tom Lane wrote:
  Huh? We're miscommunicating somewhere.
 
 Yeah, apparently.  By open-ended I meant -infinity left bound, or null
 left bound if you prefer.  Not sure if there's a better term.

But my proposal allowed both of those things with various flag settings
(because it has a flag byte in the latest proposal). I said so
explicitly.

Regards,
Jeff Davis




Re: [HACKERS] Range types

2009-12-16 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 However, it does seem reasonable to allow people to restrict, either by
 typmod or a check constraint the kinds of values that can be stored in
 a particular column. Then an application can decide which way they want
 their intervals to work and have the database enforce it.

Sure --- the range datatype should absolutely provide inquiry functions
that let you determine all the properties of a range, so something like
CHECK (is_open_on_right(col)) would work for that.  I'm of the opinion
that we must not usurp typmod for range behavior --- the right thing is
to pass that through to the contained type, just as we do with arrays.
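
For instance (a sketch only -- the period type and the inquiry-function
name are hypothetical at this point):

    create table reservation (
        room   int,
        during period,
        check (is_open_on_right(during))  -- enforce the '[ )' convention
    );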

(Note that a range over timestamp(0) would eliminate at least some of
the platform dependencies we've been arguing about.  I'm still quite
dubious that next timestamp is anything except evidence that you've
misformulated your problem, though.)

regards, tom lane



Re: [HACKERS] Range types

2009-12-16 Thread Jeff Davis
On Wed, 2009-12-16 at 15:57 -0500, Tom Lane wrote:
 Making it explicit doesn't fix the fact that you can't rely on the
 arithmetic to be exact.

Can't rely on what arithmetic to be exact? Int64 timestamps should
clearly work for granules of 1 second.
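
(For what it's worth, a whole-second granule is already expressible
through typmod, and with integer datetimes it is exact:

    select '2009-12-16 12:34:56.789'::timestamp(0);
    -- 2009-12-16 12:34:57, rounded to the one-second granule

so this is hardly an exotic way to model time.)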

If the administrator can choose a timestamp format that can't accurately
represent whole seconds, that's not an argument that modeling based on
whole seconds is worthless. We can restrict float timestamps for ranges
as a special case, if we're that worried about people misusing them.

  I still have not seen an answer to the problem of changing the
  representation of a continuous range. If you have the continuous range
  [5, 10], you're pretty much stuck with that representation, even if the
  application is expecting things in the form [ ).
 
 That is not our problem.  It's the application's problem if it can't
 handle the concept.  You might as well be complaining that type numeric
 is broken because it can represent values that will fail to fit into
 float8 when some application tries to force them into that form.

...except that we support float8. So if applications like to work with
float8, we let them.

You're arguing that we should not support discrete time ranges (or even
allow such a type to be defined by a superuser) even though applications
and users may happen to model time that way.

Who are we to argue that all of those people are so wrong that we won't
even support their type? Especially when they may have just finished a
couple books on the subject[1][2] which told them to model it that way?

  And to further make the case for allowing user-defined discrete ranges,
  what about ip4r?
 
 What about it?  I don't have a problem with the concept that next() is
 well defined for some datatypes.

Ok, so we'll allow users to specify user-defined types for discrete
ranges? How should that be specified, and how will that differ from
earlier proposals?

Earlier you proposed that we hard-wire the set of types that are allowed
to be used for discrete ranges:

http://archives.postgresql.org/pgsql-hackers/2009-12/msg01278.php

Regards,
Jeff Davis

[1] Temporal Data and the Relational Model by C.J. Date, et al.
[2] Developing Time-Oriented Database Applications in SQL by Richard
Snodgrass.




Re: [HACKERS] Range types

2009-12-16 Thread Scott Bailey

Tom Lane wrote:

 Martijn van Oosterhout klep...@svana.org writes:

  However, it does seem reasonable to allow people to restrict, either by
  typmod or a check constraint the kinds of values that can be stored in
  a particular column. Then an application can decide which way they want
  their intervals to work and have the database enforce it.

 Sure --- the range datatype should absolutely provide inquiry functions
 that let you determine all the properties of a range, so something like
 CHECK (is_open_on_right(col)) would work for that.  I'm of the opinion
 that we must not usurp typmod for range behavior --- the right thing is
 to pass that through to the contained type, just as we do with arrays.

 (Note that a range over timestamp(0) would eliminate at least some of
 the platform dependencies we've been arguing about.  I'm still quite
 dubious that next timestamp is anything except evidence that you've
 misformulated your problem, though.)

 regards, tom lane


Well our work is based on over 15 years of temporal research (not by us) 
and numerous books from Snodgrass, Date and Celko; as well as partial 
implementations in other databases. So it's not like we took a blue pill
this weekend and woke up with this hare-brained idea.


I understand your concern. But I think the objections are based more on
implementation details of float timestamps than on the concept itself.




[HACKERS] determine snapshot after obtaining locks for first statement

2009-12-16 Thread Kevin Grittner
The Cahill thesis mentions an interesting optimization -- they defer
determination of the snapshot until after any locks required for the
first statement have been acquired.  Where the first statement was,
for example, an UPDATE, this reduced re-reads or rollbacks in the
face of concurrent modifications.
 
Does PostgreSQL currently do this?  If not, would it make sense?
 
-Kevin



Re: [HACKERS] Largeobject Access Controls and pg_migrator

2009-12-16 Thread Bruce Momjian
Takahiro Itagaki wrote:
 
 KaiGai Kohei kai...@kaigai.gr.jp wrote:
 
  Can SELECT lo_create(16385); help this situation?
 
 SELECT lo_create(loid) FROM (SELECT DISTINCT loid FROM pg_largeobject) AS t
 
 would work for pg_migrator.
 
  I'm not clear whether we also check that pg_largeobject has chunks with
  the same LOID on lo_create(). In regular operation, it should never happen.
 
 I think the omission is a reasonable optimization.

Thanks, I have updated pg_migrator to use your suggested method.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Range types

2009-12-16 Thread Jeff Davis
On Wed, 2009-12-16 at 13:59 -0500, Tom Lane wrote:
 For example, if you're trying to do classroom scheduling, it might be
 useful to constrain the periods to start and end on hour boundaries
 --- but the next thing you'll want is to have it know that the next
 slot after 5pm Friday is 8am Monday.  Except on holidays.  And then
 there's the fact that my alma mater starts most hour-long classes on
 the half hour.

Data types are only a first-level constraint -- a domain of reasonable
values. The class isn't going to start on a fraction-of-a-minute
boundary, so it would be reasonable to reject those values early.

I never suggested that next() should be such a highly business-dependent
function as you suggest above (skipping holidays, etc); it should just
return the next value in the domain (if it's discrete).

Surely you wouldn't suggest that the ipv4 data type's next() function
should skip over addresses that aren't in a valid subnet on your
network. But you seem to think those make useful discrete ranges.
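
(The built-in inet type, for what it's worth, already behaves that way:
its successor is simply the address plus one,

    select '192.168.0.255'::inet + 1;   -- 192.168.1.0

and it neither knows nor cares about your subnet layout.)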

Regards,
Jeff Davis




Re: [HACKERS] Largeobject Access Controls (r2460)

2009-12-16 Thread Robert Haas
On Thu, Dec 10, 2009 at 10:41 PM, Takahiro Itagaki
itagaki.takah...@oss.ntt.co.jp wrote:

 KaiGai Kohei kai...@ak.jp.nec.com wrote:

 What's your opinion about:
   long desc: When turned on, privilege checks on large objects perform with
              backward compatibility as 8.4.x or earlier releases.

 I updated the description as your suggest.

 Applied with minor editorialization,
 mainly around tab-completion support in psql.

The documentation in this patch needs work.

...Robert


