Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 So I propose moving the TablespaceCreateDbspace() call to mdcreate(),
 removing the redundant check from smgrcreate(), and deleting that
 portion of the comment which says this is in the wrong place.

While I don't care for having smgr.c call tablespace.c, moving the call to
md.c instead is surely not an improvement :-(.  The problem here is that
we'd like the tablespace code to be above the smgr code, not below.
Calling it from md.c makes the layer inversion worse not better.

 You could argue that perhaps md.c isn't the right place either, but it
 certainly makes more sense than smgr.c, and I'd argue it's exactly
 right.

On what grounds pray tell?

 A related, interesting question is whether there's any purpose to the
 smgr layer at all.

Not a lot anymore, I think.  This has been ranted about before, but
I think the Berkeley guys bet on the wrong horse when they put an API
layer here.  The facilities that might once have been usefully switched
between at this level are now all down inside kernel device drivers.
I could see merging smgr.c and md.c entirely.  But they'd still be
appropriately placed below, not above, the tablespace commands.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence

2010-08-20 Thread Steven Schlansker

On Aug 19, 2010, at 3:24 PM, Tom Lane wrote:
 Steven Schlansker ste...@trumpet.io writes:
 
 I'm not at all experienced with character encodings so I could
 be totally off base, but isn't it wrong to ever call isspace(0x85), 
 whatever the result may be, given that the actual character is 0xCF85?
 (U+03C5, GREEK SMALL LETTER UPSILON)
 
 We generally assume that in server-safe encodings, the ctype.h functions
 will behave sanely on any single-byte value.  You can argue the wisdom
 of that, but deciding to change that policy would be a rather massive
 code change; I'm not excited about going that direction.

Fair enough.  I presume there are no server-safe encodings for which
a multibyte sequence 0x XX20 would be valid - which would break anyway
(as the second byte looks like a real space)

 You need a setlocale() call, else the program acts as though it's in C
 locale regardless of environment.

Sigh.  I hate C sometimes. :-p

Anyway, it looks like this is actually a BSD bug which got copy +
pasted into Apple's Darwin source -

http://lists.freebsd.org/pipermail/freebsd-i18n/2007-September/000157.html

I have a couple of contacts at Apple so I'll see if there's any interest in
backporting a fix, but I wouldn't hope for it to happen quickly if at all...

Thanks for taking a look into fixing this, I hope you guys can reach
consensus on how to get it fixed :)

Best,
Steven Schlansker

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence

2010-08-20 Thread Steven Schlansker
On Aug 19, 2010, at 2:35 PM, Tom Lane wrote:

 Steven Schlansker ste...@trumpet.io writes:
 I'm having a rather annoying problem - a particular string is causing the 
 Postgres COPY functionality to lose a byte, causing data corruption in 
 backups and transferred data.
 
 I was able to reproduce this on my own Mac.  Some tracing shows that the
 problem is that isspace(0x85) returns true when in locale en_US.utf-8.
 This causes array_in to drop the final byte of the array element string,
 thinking that it's insignificant whitespace.

The 0x85 seems to be the second byte of a multibyte UTF-8
sequence.  I'm not at all experienced with character encodings so I could
be totally off base, but isn't it wrong to ever call isspace(0x85), 
whatever the result may be, given that the actual character is 0xCF85?
(U+03C5, GREEK SMALL LETTER UPSILON)


  I believe that you must
 not have produced the data file data.copy on a Mac, or at least not in
 that locale setting, because array_out should have double-quoted the
 array element given that behavior of isspace().

Correct, it was produced on a Linux machine.  That said, the charset
there was also UTF-8.

 
 Now, it's probably less than sane for isspace() to be behaving that way,
 since in a UTF8-based locale 0x85 can't be regarded as a valid character
 code at all.  But I'm not hopeful about the results of filing a bug with
 Apple, because their UTF8-based locales have a lot of other bu^H^Hdubious
 behaviors too, which they appear not to care much about.

I actually can't reproduce that behavior here:

#include ctype.h
#include stdio.h
int main() {
printf(%d\n, isspace(0x85));
return 0;
}

[ste...@xxx:~]% gcc -o test test.c
[ste...@xxx:~]% ./test
0
[ste...@xxx:~]% locale
LANG=en_US.utf-8
LC_COLLATE=en_US.utf-8
LC_CTYPE=en_US.utf-8
LC_MESSAGES=en_US.utf-8
LC_MONETARY=en_US.utf-8
LC_NUMERIC=en_US.utf-8
LC_TIME=en_US.utf-8
LC_ALL=
[ste...@xxx:~]% uname -a
Darwin xxx.local 10.4.0 Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 
2010; root:xnu-1504.7.4~1/RELEASE_I386 i386 i386


Thanks much for your help,
Steven Schlansker


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-20 Thread tomas
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, Aug 19, 2010 at 08:36:09PM +0300, Heikki Linnakangas wrote:

[...]

 Hmm, will need to think about a suitable API for that. The nice thing would 
 be that we could implement it using pselect() where available. (And 
 reliable - the Linux select() man page says that glibc's pselect() is 
 emulated using select(), and suffers from the very same race condition 
 pselect() was invented to solve. How awful is that!?)

It is indeed. It seems, though, that from Linux kernel 2.6.16 and glibc
2.4 on, things look better [1]. As a reference, Debian stable (not known
to adventure too far into the present ;-) is libc 2.7 on kernel 2.6.26.

Of course, enterprise GNU/Linux distros are said to be even more
conservative...

[1] http://lwn.net/Articles/176911/

Regards
- -- tomás
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFMbiauBcgs9XrR2kYRAhiwAJ41f29jSIy409epTH0eJRXW17oByACeIkRo
CRg2BCw8tn3PkdnNR1i/MCY=
=GVMT
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding deadlocks ...

2010-08-20 Thread Josh Berkus
On 8/19/10 3:51 PM, Josh Berkus wrote:
 Kevin,
 
 This one is for you:
 
 Two sessions, in transaction:
 
 Process A Process B
 
 update session where id = X;
   update order where orderid = 5;
 update order where orderid = 5;
   update order where orderid = 5;
 ... deadlock error.

Johto on IRC pointed out I left something out of the above: session is
referenced in an FK by orders, and session = X is related to orderid = 5.

 
 It seems like we ought to be able to avoid a deadlock in this case;
 there's a clear precedence of who grabbed the order row first.  Does
 your serializability patch address the above situation at all?
 


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding deadlocks ...

2010-08-20 Thread Thom Brown
On 20 August 2010 09:39, Josh Berkus j...@agliodbs.com wrote:
 On 8/19/10 3:51 PM, Josh Berkus wrote:
 Kevin,

 This one is for you:

 Two sessions, in transaction:

 Process A             Process B

 update session where id = X;
                       update order where orderid = 5;
 update order where orderid = 5;
                       update order where orderid = 5;
 ... deadlock error.

 Johto on IRC pointed out I left something out of the above: session is
 referenced in an FK by orders, and session = X is related to orderid = 5.


I was wondering what that had to do with anything.

-- 
Thom Brown
Registered Linux user: #516935

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding deadlocks ...

2010-08-20 Thread Marko Tiikkaja

On 2010-08-20 11:39 AM +0300, Josh Berkus wrote:

On 8/19/10 3:51 PM, Josh Berkus wrote:

Two sessions, in transaction:

Process A   Process B

update session where id = X;
update order where orderid = 5;
update order where orderid = 5;
update order where orderid = 5;
... deadlock error.


Johto on IRC pointed out I left something out of the above: session is
referenced in an FK by orders, and session = X is related to orderid = 5.


Right, that would result in a deadlock.  I think truly serializable 
transactions still need to SELECT FOR SHARE here for foreign keys to 
work, no?



Regards,
Marko Tiikkaja

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 01:01, Quan Zongliang quanzongli...@gmail.com wrote:
 Because Windows's CreateService has serial start-type:
 SERVICE_AUTO_START
 SERVICE_BOOT_START
 SERVICE_DEMAND_START
 SERVICE_DISABLED
 SERVICE_SYSTEM_START

 Although all of them are not useful for pg service.
 I think it is better to use enum.

I don't see us ever using anything other than auto or demand. The
others aren't for regular services, except for disabled. And
adding a disabled service makes no sense :-) So I'm with Alvaro, I
think it's a good idea to simplify that.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] SQLSTATE of notice PGresult

2010-08-20 Thread Dmitriy Igrishin
Hey all,

Accordingly to the documentation of libpq, SQLSTATE code field is not
localizable, and
is always present.. But it seems, in some cases it isn't. E.g.

  /* the main code */
  PGresult* res = Pg::PQexec(conn, select 1);
  Oid id = PQparamtype(res, 1);

  /* the notice receiver */
  void myNoticeReceiver(void *arg, const PGresult *res)
{
  /* Presents - NOTICE */
  const char* severity = Pg::PQresultErrorField(res, PG_DIAG_SEVERITY);

  /* NOT presents - NULL. Why not 0 ? */
  const char* sqlstate = Pg::PQresultErrorField(res, PG_DIAG_SQLSTATE);

  /* Presents - parameter number 1 is out of range 0..-1 */
  const char* primary = Pg::PQresultErrorField(res,
PG_DIAG_MESSAGE_PRIMARY);
}

So, SQLSTATE field is not always presents.

Regards,
Dmitriy


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 09:49, Max Bowsher m...@f2s.com wrote:
 On 19/08/10 10:35, Magnus Hagander wrote:
 On Thu, Aug 19, 2010 at 07:00, Michael Haggerty mhag...@alum.mit.edu wrote:
 Magnus Hagander wrote:
 Is there some way to make cvs2git work this way, and just not bother
 even trying to create merge commits, or is that fundamentally
 impossible and we need to look at another tool?

 The good news: (I just reminded myself/realized that) Max Bowsher has
 already implemented pretty much exactly what you want in the cvs2svn
 trunk version, including noting in the commit messages any cherry-picks
 that are not reflected in the repo ancestry.

 Ah, that's great.

 I should mention that the way it notes this is to reference commits by
 their timestamp, author and initial line of log message - it does this
 because cvs2git doesn't know the commit sha ever - that doesn't appear
 until the stream is fed through git fast-import. I did briefly raise the
 idea of augmenting the fast-import process to support substituting
 fast-import marks to shas in log messages, but didn't get time to take
 it beyond an idea.

 The bad news: It is broken [1].  But I don't think it should be too much
 work to fix it.

 That's less great of course, but it gives hope!

 Thanks for your continued efforts!

 I've just made a commit to cvs2svn trunk. I hope this should now be fixed.


Great. I will download and test the trunk version soon. I'm currently
running a test using cvs2svn and then git-svn clone from that - but
it's insanely slow (been going for 30+ hours now, and probably has
8-10 hours more to go)...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Why assignment before return?

2010-08-20 Thread Magnus Hagander
This code-pattern appears many times in pgstatfuncs.c:

Datum
pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
{
Oid relid = PG_GETARG_OID(0);
int64   result;
PgStat_StatTabEntry *tabentry;

if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
result = 0;
else
result = (int64) (tabentry-blocks_fetched);

PG_RETURN_INT64(result);
}


Why do we assign this to result and then return, why not just:
if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
PG_RETURN_INT64(0);
else
PG_RETURN_INT64(tabentry-blocks_fetched);


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 13:50, Max Bowsher m...@f2s.com wrote:
 On 20/08/10 12:02, Magnus Hagander wrote:
 On Fri, Aug 20, 2010 at 09:49, Max Bowsher m...@f2s.com wrote:
 On 19/08/10 10:35, Magnus Hagander wrote:
 On Thu, Aug 19, 2010 at 07:00, Michael Haggerty mhag...@alum.mit.edu 
 wrote:
 Magnus Hagander wrote:
 Is there some way to make cvs2git work this way, and just not bother
 even trying to create merge commits, or is that fundamentally
 impossible and we need to look at another tool?

 The good news: (I just reminded myself/realized that) Max Bowsher has
 already implemented pretty much exactly what you want in the cvs2svn
 trunk version, including noting in the commit messages any cherry-picks
 that are not reflected in the repo ancestry.

 Ah, that's great.

 I should mention that the way it notes this is to reference commits by
 their timestamp, author and initial line of log message - it does this
 because cvs2git doesn't know the commit sha ever - that doesn't appear
 until the stream is fed through git fast-import. I did briefly raise the
 idea of augmenting the fast-import process to support substituting
 fast-import marks to shas in log messages, but didn't get time to take
 it beyond an idea.

 The bad news: It is broken [1].  But I don't think it should be too much
 work to fix it.

 That's less great of course, but it gives hope!

 Thanks for your continued efforts!

 I've just made a commit to cvs2svn trunk. I hope this should now be fixed.


 Great. I will download and test the trunk version soon. I'm currently
 running a test using cvs2svn and then git-svn clone from that - but
 it's insanely slow (been going for 30+ hours now, and probably has
 8-10 hours more to go)...

 Uh, you are? Why do it that way?

Trying other possible options, in case this one doesn't work out :-) I
figured I might try something while you guys were working on a fix -
didn't expect the fix to show up quite so quickly :)


 The thing I fixed pertains to the direct use of cvs2git, and will have
 no effect on executions of cvs2svn.

Right. I started this one yesterday...


 I have run cvs2git on the pgsql module of your CVS locally (is that the
 right thing to convert?) if you'd like to compare notes on specific
 parts of the conversion.

Correct, that's the one. Can you put your repo up somewhere so we can
look at it? Then I don't have to wait for my process to finish :D

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why assignment before return?

2010-08-20 Thread Thom Brown
On 20 August 2010 12:46, Magnus Hagander mag...@hagander.net wrote:
 This code-pattern appears many times in pgstatfuncs.c:

 Datum
 pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
 {
        Oid                     relid = PG_GETARG_OID(0);
        int64           result;
        PgStat_StatTabEntry *tabentry;

        if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
                result = 0;
        else
                result = (int64) (tabentry-blocks_fetched);

        PG_RETURN_INT64(result);
 }


 Why do we assign this to result and then return, why not just:
        if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
                PG_RETURN_INT64(0);
        else
                PG_RETURN_INT64(tabentry-blocks_fetched);


 --

And then drop the int64result; declaration as a result.

-- 
Thom Brown
Registered Linux user: #516935

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding deadlocks ...

2010-08-20 Thread Kevin Grittner
Josh Berkus  wrote:
 
 Two sessions, in transaction:

 Process AProcess B

 update session where id = X;
 update order where orderid = 5;
 update order where orderid = 5;
 update order where orderid = 5;
 ... deadlock error.

 Johto on IRC pointed out I left something out of the above:
 session is referenced in an FK by orders, and session = X is
 related to orderid = 5.
 
The patch I'm offering implements the SSI techniques published by
Michael Cahill, et al.  Those techniques basically allow the current
snapshot isolation to run as it currently does, but monitors for
read/write conflicts to generate a new type of serialization failure
when a cycle becomes possible which could create an anomaly.  There
are no read/write conflict cycles in your example, so it would behave
just as REPEATABLE READ and SERIALIZABLE now behave -- you get a
deadlock which rolls back one of the transactions.
 
I don't see how SSI can be modified to generate some other form of
serialization failure here, but I'm always open to suggestions.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Greg Stark
On Fri, Aug 20, 2010 at 3:43 AM, Robert Haas robertmh...@gmail.com wrote:
 A related, interesting question is whether there's any purpose to the
 smgr layer at all.  Would we accept a patch that implemented an
 alternative smgr layer, perhaps on a per-tablespace basis?

I definitely want to keep it.

I think we could usefully do an application-level raid implementation.
It would be useful for people running on machines where they don't
have administrative access on the machine. In particular I'm picturing
shared cluster machines that run other jobs and can't be reconfigured
specifically for the database. Adding per-tablespace behaviour would
make the argument a lot stronger too since it's not so easy to set up
different stripe sizes per table if you're using OS level raid.

I also have various crazy plans to experiment with network-based
storage and had intended to use smgr to do so. At google we have a
bunch of different storage technologies and they're all
application-level network services. You can always implement a fuse
module that calls back up to a daemon which acts as the client but
that doesn't make me feel any happier about it.

And I know EDB has their infinicache thing using memcached -- I don't
recall if it uses the smgr layer but it would certainly be a natural
place to hook it in.

I guess my point here is that regardless of whether we plan on
accepting any such patches in core it's a very handy hook for third
parties to extend postgres with. It would be nice if we had some of
those modules in contrib to keep us honest with the api but even as it
stands I think it's useful.




-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Vaccum and analyze counters in pgstat

2010-08-20 Thread Magnus Hagander
Attached is a patch that adds columns to pg_stat_*_tables for number
of [auto]vacuum and [auto]analyze runs on a table, completing the
current one that just had the last time these ran. It's particularly
useful to see how much autovac is doing on the tables, but I included
the counts of regular vacuum and analyze as well for completeness.

Comments?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


stat_vacuum_counters.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Heikki Linnakangas

On 20/08/10 15:45, Greg Stark wrote:

On Fri, Aug 20, 2010 at 3:43 AM, Robert Haasrobertmh...@gmail.com  wrote:

A related, interesting question is whether there's any purpose to the
smgr layer at all.  Would we accept a patch that implemented an
alternative smgr layer, perhaps on a per-tablespace basis?


I definitely want to keep it.

I think we could usefully do an application-level raid implementation.
It would be useful for people running on machines where they don't
have administrative access on the machine. In particular I'm picturing
shared cluster machines that run other jobs and can't be reconfigured
specifically for the database. Adding per-tablespace behaviour would
make the argument a lot stronger too since it's not so easy to set up
different stripe sizes per table if you're using OS level raid.

I also have various crazy plans to experiment with network-based
storage and had intended to use smgr to do so. At google we have a
bunch of different storage technologies and they're all
application-level network services. You can always implement a fuse
module that calls back up to a daemon which acts as the client but
that doesn't make me feel any happier about it.


I think you would be better off implementing that somewhere in md.c, or 
even file.c. The smgr layer has been dead for such a long time that it 
would take a significant amount of work to make it usable again. The 
whole infrastructure to handle fsyncs() at checkpoints, for example, is 
in md.c, so you'd need to rewrite all that.



And I know EDB has their infinicache thing using memcached -- I don't
recall if it uses the smgr layer but it would certainly be a natural
place to hook it in.


FWIW, it doesn't. It's in bufmgr, it needs some co-operation from the 
buffer cache.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 14:11, Max Bowsher m...@f2s.com wrote:
 On 20/08/10 12:55, Magnus Hagander wrote:
 On Fri, Aug 20, 2010 at 13:50, Max Bowsher m...@f2s.com wrote:
 I have run cvs2git on the pgsql module of your CVS locally (is that the
 right thing to convert?) if you'd like to compare notes on specific
 parts of the conversion.

 Correct, that's the one. Can you put your repo up somewhere so we can
 look at it? Then I don't have to wait for my process to finish :D

 Placed at http://red-bean.com/~maxb/pgsql-test.git - about 230MB -
 sorry, dumb transport only, but hopefully that's not an issue for this
 use case.

It does. I've pushed up a mirror to
http://git.postgresql.org/gitweb?p=git-migration-test.git;a=summary -
that one is a lot faster to work with for me at least.

I'm also going to run my branch-verification script on it to see that
it deosn't mess any of that up - that one takes a few hours to run
(mainly the fault of the cvs we compare to :D) - I'll get back to you
when it's done.

For other who test this - it's obviously missing the author name
mapping, but that's a minor thing.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why assignment before return?

2010-08-20 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 This code-pattern appears many times in pgstatfuncs.c:
 Datum
 pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
 {
   Oid relid = PG_GETARG_OID(0);
   int64   result;
   PgStat_StatTabEntry *tabentry;

   if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
   result = 0;
   else
   result = (int64) (tabentry-blocks_fetched);

   PG_RETURN_INT64(result);
 }


I see nothing wrong with that style.  Reducing it as you propose
probably wouldn't change the emitted code at all, and what it would
do is reduce flexibility.  For instance, if we ever needed to add
additional operations just before the RETURN (releasing a lock on
the tabentry, perhaps) we'd just have to undo the improvement.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-20 Thread Heikki Linnakangas

On 19/08/10 20:59, Tom Lane wrote:

Offhand I'd suggest something like

SetSleepInterrupt() -- called by signal handlers, writes pipe
ClearSleepInterrupt()   -- called by sleep-and-do-something loops, clears pipe

pg_usleep() itself remains the same, but it is now guaranteed to return
immediately if SetSleepInterrupt is called, or has been called since the
last ClearSleepInterrupt.


Hmm, we have pg_usleep() calls in some fairly low-level functions, like 
mdunlink() and s_lock(). If someone has called SetSleepInterrupt(), we 
don't want those pg_usleep()s to return immediately. And pg_usleep() is 
used in some client code too. I think we need a separate sleep function 
for this.


Another idea is to not use unix signals at all, but ProcSendSignal() and 
ProcWaitForSignal(). We would not need the signal handler at all. 
Walsender would use ProcWaitForSignal() instead of pg_usleep() and 
backends that want to wake it up would use ProcSendSignal(). The problem 
is that there is currently no way to specify a timeout, but I presume 
the underlying semaphore operations have that capability, and we could 
expose it.


Actually ProcSendSignal()/ProcWaitForSignal() won't work as is, because 
walsender doesn't have a PGPROC entry, but you could easily build a 
similar mechanism, using PGSemaphoreLock/Unlock like 
ProcSendSignal()/WaitForSignal() does.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why assignment before return?

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 15:10, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 This code-pattern appears many times in pgstatfuncs.c:
 Datum
 pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
 {
       Oid                     relid = PG_GETARG_OID(0);
       int64           result;
       PgStat_StatTabEntry *tabentry;

       if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
               result = 0;
       else
               result = (int64) (tabentry-blocks_fetched);

       PG_RETURN_INT64(result);
 }


 I see nothing wrong with that style.  Reducing it as you propose
 probably wouldn't change the emitted code at all, and what it would
 do is reduce flexibility.  For instance, if we ever needed to add
 additional operations just before the RETURN (releasing a lock on
 the tabentry, perhaps) we'd just have to undo the improvement.

I'm not saying it's wrong, I'm just trying to figure out why it's
there since I wanted to add other functions and it looked.. Odd. I'll
change my new functions to look like this for consistency, but I was
curious if there was some specific reason why it was better to do it
this way.

I see your answer as no, not really any reason, but also not worth
changing, which is fine by me :-)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-20 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Hmm, we have pg_usleep() calls in some fairly low-level functions, like 
 mdunlink() and s_lock(). If someone has called SetSleepInterrupt(), we 
 don't want those pg_usleep()s to return immediately. And pg_usleep() is 
 used in some client code too. I think we need a separate sleep function 
 for this.

Well, we'd need some careful thought about which sleeps need what, but I
don't necessarily have an objection to a separate interruptable sleep
function.

 Another idea is to not use unix signals at all, but ProcSendSignal() and 
 ProcWaitForSignal(). We would not need the signal handler at all. 
 Walsender would use ProcWaitForSignal() instead of pg_usleep() and 
 backends that want to wake it up would use ProcSendSignal().

You keep on proposing solutions that only work for walsender :-(.
Most of the other places where we have pg_usleep actually do want
a timed sleep, I believe.  It's also unclear that we can always expect
ProcSendSignal to be usable --- for example, stuff like SIGHUP would
be sent by processes that might not be connected to shared memory.

 The problem 
 is that there is currently no way to specify a timeout, but I presume 
 the underlying semaphore operations have that capability, and we could 
 expose it.

They don't, or at least the semop-based implementation doesn't.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why assignment before return?

2010-08-20 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 I see your answer as no, not really any reason, but also not worth
 changing, which is fine by me :-)

Yeah, that's a fair summary.  If it had been coded the other way
to start with, I'd also say it wasn't worth changing, at least
not until such time as we actually needed to.

In the meantime, any added functions of the same ilk should definitely
be made to look like the existing ones.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Robert Haas
On Thu, Aug 19, 2010 at 11:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 While I don't care for having smgr.c call tablespace.c, moving the call to
 md.c instead is surely not an improvement :-(.  The problem here is that
 we'd like the tablespace code to be above the smgr code, not below.
 Calling it from md.c makes the layer inversion worse not better.

 You could argue that perhaps md.c isn't the right place either, but it
 certainly makes more sense than smgr.c, and I'd argue it's exactly
 right.

 On what grounds pray tell?

If smgr wants to even have the pretense of being an abstraction layer,
it can't very well know about the underlying file system structure.
But there's no getting around the fact that md.c has to know that
stuff; it has to create and write those files.  There is, perhaps, a
layer inversion problem here, but if anything I think it's that the
functionality of tablespace.c spans everything from the SQL layer all
the way down to pushing bits in the filesystem.  But that's not really
the fault of smgr.c/md.c.  Perhaps tablespace.c shouldn't assume
anything about the underlying filesystem representation and that
knowledge should be moved somewhere under src/backend/storage, but I
don't see how it makes sense for the smgr layer to include assumptions
about what filesystem abstraction md.c happens to implement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 15:04, Magnus Hagander mag...@hagander.net wrote:
 On Fri, Aug 20, 2010 at 14:11, Max Bowsher m...@f2s.com wrote:
 On 20/08/10 12:55, Magnus Hagander wrote:
 On Fri, Aug 20, 2010 at 13:50, Max Bowsher m...@f2s.com wrote:
 I have run cvs2git on the pgsql module of your CVS locally (is that the
 right thing to convert?) if you'd like to compare notes on specific
 parts of the conversion.

 Correct, that's the one. Can you put your repo up somewhere so we can
 look at it? Then I don't have to wait for my process to finish :D

 Placed at http://red-bean.com/~maxb/pgsql-test.git - about 230MB -
 sorry, dumb transport only, but hopefully that's not an issue for this
 use case.

 It does. I've pushed up a mirror to
 http://git.postgresql.org/gitweb?p=git-migration-test.git;a=summary -
 that one is a lot faster to work with for me at least.

 I'm also going to run my branch-verification script on it to see that
 it deosn't mess any of that up - that one takes a few hours to run
 (mainly the fault of the cvs we compare to :D) - I'll get back to you
 when it's done.

That turned out to be a non-starter, since that clone doesn't have
expanded keywords. I'll run a new conversion with the same options
file used last time, and we can work off that.

I believe Robert had some comments/questions as well :-)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Robert Haas
On Fri, Aug 20, 2010 at 8:45 AM, Greg Stark gsst...@mit.edu wrote:
 On Fri, Aug 20, 2010 at 3:43 AM, Robert Haas robertmh...@gmail.com wrote:
 A related, interesting question is whether there's any purpose to the
 smgr layer at all.  Would we accept a patch that implemented an
 alternative smgr layer, perhaps on a per-tablespace basis?

 I definitely want to keep it.

 I think we could usefully do an application-level raid implementation.
 It would be useful for people running on machines where they don't
 have administrative access on the machine. In particular I'm picturing
 shared cluster machines that run other jobs and can't be reconfigured
 specifically for the database. Adding per-tablespace behaviour would
 make the argument a lot stronger too since it's not so easy to set up
 different stripe sizes per table if you're using OS level raid.

That would actually be kind of cool.

 I also have various crazy plans to experiment with network-based
 storage and had intended to use smgr to do so. At google we have a
 bunch of different storage technologies and they're all
 application-level network services. You can always implement a fuse
 module that calls back up to a daemon which acts as the client but
 that doesn't make me feel any happier about it.

Me either.

I really like the idea of trying to use network-based storage in some
way.  Gigabit Ethernet is a big I/O channel.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why assignment before return?

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 15:27, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 I see your answer as no, not really any reason, but also not worth
 changing, which is fine by me :-)

 Yeah, that's a fair summary.  If it had been coded the other way
 to start with, I'd also say it wasn't worth changing, at least
 not until such time as we actually needed to.

 In the meantime, any added functions of the same ilk should definitely
 be made to look like the existing ones.

Yeah. I notice there are some functions that are not following this
pattern, but most are, so I'll adjust my patch with this.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 ...  Perhaps tablespace.c shouldn't assume
 anything about the underlying filesystem representation and that
 knowledge should be moved somewhere under src/backend/storage, but I
 don't see how it makes sense for the smgr layer to include assumptions
 about what filesystem abstraction md.c happens to implement.

Well, the other approach we could take is to move the tablespace.c
filesystem-whacking code into md.c, expose it via a new smgr API, and
have commands/tablespace.c call that.  I wouldn't have a layering
problem with a design like that, and as you say it's probably cleaner
than what's there.  But having something in smgr calling something in
/commands is Just Wrong.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding deadlocks ...

2010-08-20 Thread Kevin Grittner
I wrote:
 
 I don't see how SSI can be modified to generate some other form of
 serialization failure here, but I'm always open to suggestions.
 
Actually, after thinking about it a bit more, the UPDATE statements
*do* read the rows before writing, so a naive implementation would
see write skew in Josh's example and generate a rollback before
things got far enough to cause a deadlock.  In fact, a few months
ago the implementation probably would have done so, before we
implemented the optimization mentioned in section 3.7.3 of Cahill's
doctoral thesis[1].
 
The reasons for implementing that change were:
 
(1) It avoids getting an SIREAD lock on a row if that row has been
updated by the transaction.  I believe that in the PostgreSQL
implementation we even avoid taking the SIREAD lock when we're in a
scan from an UPDATE or DELETE statement, but I'd have to dig into
the code to confirm.
 
(2) Because of (1) and the removal of an SIREAD lock on a row is
later updated, the shared memory structures used for tracking SIREAD
locks can be somewhat smaller and access to them will be a bit
faster.
 
(3) I *think* that having the additional SIREAD locks would tend to
increase the false positive rate, although I'd need to spend some
time working through that to be sure.
 
So, the question would be: does this optimization from the paper
actually improve performance because of the above points more than
the savings which would accrue from catching the conflict in Josh's
example before it gets to the point of deadlock?  I can add that to
the list of things to check once we have a good set of benchmarks.
 
-Kevin
 
[1] http://hdl.handle.net/2123/5353


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Robert Haas
On Fri, Aug 20, 2010 at 9:28 AM, Magnus Hagander mag...@hagander.net wrote:
 I believe Robert had some comments/questions as well :-)

What Magnus means is that I'm a grumpy old developer who complains
about everything.

Anyway, what I noticed was that we're getting stuff like this:

http://git.postgresql.org/gitweb?p=git-migration-test.git;a=commit;h=586b324c255a4316d72a5757566ebe6e630df47e

commit 586b324c255a4316d72a5757566ebe6e630df47e
Author: cvs2git 
Date:   Thu May 13 16:39:49 2010 +

This commit was manufactured by cvs2svn to create branch 'REL8_4_STABLE'.

Cherrypick from master 2010-05-13 16:39:43 UTC adunstan 'Abandon the use of
src/pl/plperl/plperl_opmask.pl

We're not getting that on EVERY back-patch, just on some of them.  I
really just want to turn this code to detect merges and cherry-picks
OFF altogether, so that we get the original committer and commit
message instead off the above.  It's much easier to read if you're
browsing the back-branch history, and it's probably easier to match up
commits across branches, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Deadlock bug

2010-08-20 Thread Joel Jacobson
(Magnus and pghackers, I've included you in this email, since it appears to
be PostgreSQL bug. The example below is general, and not specific to Glue
Finance database model. Feel free to share it with anyone.)

I've just tried to replicate the deadlock in 8.4.4 and 9.0b4.
Same problem in both versions.
8.4.4 example: http://screencast.com/t/ZTBlMTBmNTc

 start of comments, specific to Glue Finance database 

(1) Orders.SessionID is not really necessary, we only store it to log what
session created which order. We never use this information, it is merely
saved for logging purposes.
Dropping the foreign key...
orders_sessionid_fkey FOREIGN KEY (sessionid) REFERENCES
sessions(sessionid)
...would mean we risk data integrity problems if the session would be
deleted (which it never is), even if it would be deleted, we wouldn't really
care since it just for logging purposes.

(2) Setting Orders.Heartbeat to now() on each
intended-to-be-most-of-the-times-read-only-until-something-happends-request
(aka Get_Server_Request) is of course a huge performance hit, as it require
a row exclusive lock, meaning such requests cannot be performed in
parallell.
We will therefore remove the Orders.Heartbeat column entirely.

(3) Making sure Orders is always locked first, before obtaining the Sessions
lock, would like you suggest also solve the problem, but requires a larger
rewrite of probably a lot of functions.
Removing the foreign key means we don't have to rewrite the functions.

(4) Fix the PostgreSQL bug.

(1) would effectively solve the deadlock issue, but not the performance
issue, we should therefore do (2) as well.
 end of comments, specific to Glue Finance database 

I think this clearly looks like a bug in PostgreSQL because of the following
observations:

Below are comments to the screencast at http://screencast.com/t/NTk2Y2VhMW

The following example is not specific for Glue Finance database.
Attached, please find the text file with the queries and simple example
schema.

1. Process 1 executes UPDATE A SET Col1 = 1 WHERE AID = 1;.
We can see it obtains two RowExclusiveLocks on relations a_pkey and a.
This is the expected result.

2. Process 2 then executes UPDATE B SET Col2 = 1 WHERE BID = 2;.
We can see it obtains two RowExclusiveLocks on relations b_pkey and b.
I don't know if this is expected, since the row in B references the row in A
being updated by process 1.
Because of the foreign key, shouldn't some kind of share lock on A be
obtained by process 2, or some other kind of lock?

3. Process 1 tries to execute UPDATE B SET Col2 = 1 WHERE BID = 2; and
will of course have to wait, because process 2 already has a
RowExclusiveLock on the same row in table B.

Process 1 is now waiting...

4. Now, in the other SQL prompt (process 2), we take a look at the vLocks
view.
Unexpected observations:
a) both processes have been granted a RowExclusiveLock on table B. How can
both be granted a RowExclusiveLock on the same table? Since the table only
contains one row, it must be a lock on the same row, which should be
impossible, right?
b) process 1 (which is currently waiting) has been granted a lock of type
tuple, page 0, tuple 1, mode ExclusiveLock on table B. I don't know what
a tuple lock is, but what surprises me is process 1 being granted the
lock, and not process 2 (since process 2 updated B before 1).

Now, while process 1 is waiting, let's execute the same query in process 2:

5. Process 2 tries to execute UPDATE B SET Col2 = 1 WHERE BID = 2; which
is exactly the same query as in step 2 above.
Since process 2 already hold a granted RowExclusiveLock on the row in table
B it tries to update, I think this query should be executed instantly
without any problem. Instead, it causes a deadlock in process 2, allowing
process 1 to commit. Very strange.

Could this have any other explanation than a bug (or perhaps feature) in
postgres?

-- 
Best regards,

Joel Jacobson
Glue Finance

E: j...@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden


deadlock.sql
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Dave Page
On Fri, Aug 20, 2010 at 2:36 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Aug 20, 2010 at 9:28 AM, Magnus Hagander mag...@hagander.net wrote:
 I believe Robert had some comments/questions as well :-)

 What Magnus means is that I'm a grumpy old developer who complains
 about everything.

Don't put yourself down. You're not that old :-p

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Vaccum and analyze counters in pgstat

2010-08-20 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 Attached is a patch that adds columns to pg_stat_*_tables for number
 of [auto]vacuum and [auto]analyze runs on a table, completing the
 current one that just had the last time these ran. It's particularly
 useful to see how much autovac is doing on the tables, but I included
 the counts of regular vacuum and analyze as well for completeness.

 Comments?

Looks reasonably sane in a quick read-through.

 --- 117,125 
  is a subsystem that supports collection and reporting of information 
 about
  server activity.  Presently, the collector can count accesses to tables
  and indexes in both disk-block and individual-row terms.  It also tracks
 !the total number of rows in each table, and information about vacuum and
 !analyze for each table.  It can also count calls to user-defined 
 functions
 !and the total time spent in each one.
 /para

information about vacuum and analyze actions might read better.

 --- 318,325 
 entrySimilar to structnamepg_stat_all_tables/, but counts 
 actions
 taken so far within the current transaction (which are 
 emphasisnot/
 yet included in structnamepg_stat_all_tables/ and related views).
 !   The columns for numbers of live and dead rows and vacuum and
 !   analyze values are not present in this view./entry
/row

Likewise values - actions here.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Heikki Linnakangas

On 20/08/10 16:30, Robert Haas wrote:

I really like the idea of trying to use network-based storage in some
way.  Gigabit Ethernet is a big I/O channel.


NFS?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Robert Haas
On Fri, Aug 20, 2010 at 9:51 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 20/08/10 16:30, Robert Haas wrote:

 I really like the idea of trying to use network-based storage in some
 way.  Gigabit Ethernet is a big I/O channel.

 NFS?

I don't particularly trust NFS to be either reliable or performant for
database use.  Do you?  And what if you want additional functionality,
like sharding or mirroring?  ISTM that something built around a custom
protocol that mimics exactly what we need from the smgr layer would be
a lot easier to set up and a lot easier to be confident in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-20 Thread Heikki Linnakangas

On 20/08/10 16:24, Tom Lane wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:

Hmm, we have pg_usleep() calls in some fairly low-level functions, like
mdunlink() and s_lock(). If someone has called SetSleepInterrupt(), we
don't want those pg_usleep()s to return immediately. And pg_usleep() is
used in some client code too. I think we need a separate sleep function
for this.


Well, we'd need some careful thought about which sleeps need what, but I
don't necessarily have an objection to a separate interruptable sleep
function.


If we have to, we could also support multiple interrupts with multiple 
self-pipes, so that you can choose at pg_usleep() which ones to wake up on.



Another idea is to not use unix signals at all, but ProcSendSignal() and
ProcWaitForSignal(). We would not need the signal handler at all.
Walsender would use ProcWaitForSignal() instead of pg_usleep() and
backends that want to wake it up would use ProcSendSignal().


You keep on proposing solutions that only work for walsender :-(.


Well yes, the other places where we use pg_usleep() are not really a 
problem as is. If as a side-effect we can make them respond more quickly 
to signals, with small changes, that's good, but walsender is the only 
one that's performance critical.


That said, a select() based solution is my current favorite.


Most of the other places where we have pg_usleep actually do want
a timed sleep, I believe.  It's also unclear that we can always expect
ProcSendSignal to be usable --- for example, stuff like SIGHUP would
be sent by processes that might not be connected to shared memory.


The problem
is that there is currently no way to specify a timeout, but I presume
the underlying semaphore operations have that capability, and we could
expose it.


They don't, or at least the semop-based implementation doesn't.


There's semtimedop(). I don't know how portable it is, but it seems to 
exist at least on Linux, Solaris, HPUX and AIX. On what platforms do we 
use sysv semaphores?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock bug

2010-08-20 Thread Tom Lane
Joel Jacobson j...@gluefinance.com writes:
 a) both processes have been granted a RowExclusiveLock on table B. How can
 both be granted a RowExclusiveLock on the same table? Since the table only
 contains one row, it must be a lock on the same row, which should be
 impossible, right?

This complaint seems to be based on a complete misunderstanding of what
RowExclusiveLock is.  Please see
http://www.postgresql.org/docs/8.4/static/explicit-locking.html

RowExclusiveLock on a table is just a type of lock on a *table*.
It is not taken on any particular row, and it does not prevent other
processes from also taking RowExclusiveLock on the same table.  (As
the docs note, the names of the lock modes aren't terribly mnemonic.)

There will also be row-level locks (either shared or exclusive) on
specific rows, but those generally aren't visible in pg_locks because
of implementation restrictions.

 b) process 1 (which is currently waiting) has been granted a lock of type
 tuple, page 0, tuple 1, mode ExclusiveLock on table B. I don't know what
 a tuple lock is, but what surprises me is process 1 being granted the
 lock, and not process 2 (since process 2 updated B before 1).

Well, what that really means is that process 1 is waiting to acquire
exclusive row-level lock on that row.  Process 2 has got that lock,
but you can't see that in pg_locks.  What you can see is a transient
heavyweight lock that is taken out while waiting.  IIRC the main
reason for doing that is to ensure that the heavyweight lock manager
can resolve any conflicts that might come from multiple processes
trying to acquire the same row-level lock.

 5. Process 2 tries to execute UPDATE B SET Col2 = 1 WHERE BID = 2; which
 is exactly the same query as in step 2 above.
 Since process 2 already hold a granted RowExclusiveLock on the row in table
 B it tries to update, I think this query should be executed instantly
 without any problem. Instead, it causes a deadlock in process 2, allowing
 process 1 to commit. Very strange.

It does go through without any deadlock, *if* there is no foreign key
involved.  You didn't tell us exactly what the FK relationship is, but
I suspect the reason for the deadlock is that one process is trying to
update a row that references some row already updated by the other.
That will require a row-level share lock on the referenced row, so you
can get a deadlock.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Heikki Linnakangas

On 20/08/10 17:01, Robert Haas wrote:

On Fri, Aug 20, 2010 at 9:51 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

On 20/08/10 16:30, Robert Haas wrote:


I really like the idea of trying to use network-based storage in some
way.  Gigabit Ethernet is a big I/O channel.


NFS?


I don't particularly trust NFS to be either reliable or performant for
database use.  Do you?


Depends on the implementation, I guess, but the point is that there's a 
bazillion network-based filesystems with different tradeoffs out there 
already. It seems unlikely that you could outperform them with something 
built into PostgreSQL.


To put it other way, if you built network-based storage into PostgreSQL, 
what PostgreSQL-specific knowledge could you take advanage of to make it 
more performant/reliable? If there isn't any, I don't see the point.



--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)

2010-08-20 Thread Tom Lane
[ It's way past time to change the thread title ]

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 20/08/10 16:24, Tom Lane wrote:
 You keep on proposing solutions that only work for walsender :-(.

 Well yes, the other places where we use pg_usleep() are not really a 
 problem as is.

Well, yes they are.  They cause unnecessary process wakeups and thereby
consume cycles even when the database is idle.  See for example a
longstanding complaint here:
https://bugzilla.redhat.com/show_bug.cgi?id=252129

If we're going to go to the trouble of having a mechanism like this,
I'd like it to fix that problem so I can close out that bug.

 There's semtimedop(). I don't know how portable it is, but it seems to 
 exist at least on Linux, Solaris, HPUX and AIX.

It's not on my HPUX, and I don't see it in the Single Unix Spec.

 On what platforms do we use sysv semaphores?

AFAIK, everything except Windows and extremely old versions of OS X.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock bug

2010-08-20 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 You didn't tell us exactly what the FK relationship is
 
The original post has an attachment with a self-contained example,
starting with table creation.
 
 I suspect the reason for the deadlock is that one process is
 trying to update a row that references some row already updated by
 the other.
 
The surprising thing is that a particular row is (using the
identifiers from the attachment):
 
Process 2 updates a particular row without blocking.
Process 1 updates the same row, which blocks.
Process 2 updates the same row again (with *exactly* the same UPDATE
statement), which fails with a deadlock.
 
I'm not sure I consider that a bug, but it moves the needle on the
astonishment meter.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Robert Haas
On Fri, Aug 20, 2010 at 10:20 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 20/08/10 17:01, Robert Haas wrote:

 On Fri, Aug 20, 2010 at 9:51 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com  wrote:

 On 20/08/10 16:30, Robert Haas wrote:

 I really like the idea of trying to use network-based storage in some
 way.  Gigabit Ethernet is a big I/O channel.

 NFS?

 I don't particularly trust NFS to be either reliable or performant for
 database use.  Do you?

 Depends on the implementation, I guess, but the point is that there's a
 bazillion network-based filesystems with different tradeoffs out there
 already. It seems unlikely that you could outperform them with something
 built into PostgreSQL.

 To put it other way, if you built network-based storage into PostgreSQL,
 what PostgreSQL-specific knowledge could you take advanage of to make it
 more performant/reliable? If there isn't any, I don't see the point.

PostgreSQL-specific knowledge?  Probably not.  But:

- Setting up NFS is very easy to do wrong.  I bet if you find 100
people who are running PG over NFS, 80 of them have a wrong setting
somewhere that's compromising their data integrity.
- NFS, like all other solutions in this area, is platform-specific,
and thus not available everywhere.
- We don't need a general-purpose network file system - we need
something very specific, which should therefore be able to be done in
a more lightweight fashion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] security hook on authorization

2010-08-20 Thread Robert Haas
2010/8/19 KaiGai Kohei kai...@ak.jp.nec.com:
 (2010/08/20 11:45), Robert Haas wrote:
 2010/8/19 KaiGai Koheikai...@ak.jp.nec.com:
 I also plan to add a security hook on authorization time.
 It shall allow external security providers to set up credential of
 the authenticated clients.

 Please note that it is not intended to control authentication process.
 It is typically checked based on a pair of username and password.
 What I want to discuss is things after success of this authentication
 steps.

  From viewpoint of SE-PostgreSQL, it uses getpeercon(3) which obtains
 a security label of the peer process, so it does not need to consider
 database username. But we can easily assume other security mechanism
 which assigns a certain label based on the authenticated database user
 such as Oracle Label Security.

 So, I think this hook should be also invoked on the code path of
 SET SESSION AUTHORIZATION, not only database login time, although
 SE-PostgreSQL ignores this case.

 So, I think SetSessionUserId() is a candidate to put this hook which is
 entirely called from both of the code path.
 This routine is to assign credential of the default database privilege
 mechanism, so it seems to me it is a good point where external security
 provider also assigns its credential of the authenticated database user.

 How is this different from what we rejected before?

 It made clear the purpose of this hook.

 I also intended to use the previous hook for authorization purpose,
 but it was deployed just after initialize_acl() without no argument.
 It might be suitable for SE-PostgreSQL, because it does not depend on
 authenticated database user, but might be too specific.

 The new hook shall be invoked on two code paths (database login and
 SET SESSION AUTHORIZATION). It allows upcoming security module which
 may assign client's credential based on the database user to utilize
 this hook also.

I think our standard criteria for the inclusion of hooks is that you
must demonstrate that the hook can be used to do something interesting
that couldn't be done without the hook.  So far I'm unconvinced.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Max Bowsher
On 20/08/10 12:55, Magnus Hagander wrote:
 On Fri, Aug 20, 2010 at 13:50, Max Bowsher m...@f2s.com wrote:
 I have run cvs2git on the pgsql module of your CVS locally (is that the
 right thing to convert?) if you'd like to compare notes on specific
 parts of the conversion.
 
 Correct, that's the one. Can you put your repo up somewhere so we can
 look at it? Then I don't have to wait for my process to finish :D

Placed at http://red-bean.com/~maxb/pgsql-test.git - about 230MB -
sorry, dumb transport only, but hopefully that's not an issue for this
use case.

Max.



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Max Bowsher
On 19/08/10 10:35, Magnus Hagander wrote:
 On Thu, Aug 19, 2010 at 07:00, Michael Haggerty mhag...@alum.mit.edu wrote:
 Magnus Hagander wrote:
 Is there some way to make cvs2git work this way, and just not bother
 even trying to create merge commits, or is that fundamentally
 impossible and we need to look at another tool?

 The good news: (I just reminded myself/realized that) Max Bowsher has
 already implemented pretty much exactly what you want in the cvs2svn
 trunk version, including noting in the commit messages any cherry-picks
 that are not reflected in the repo ancestry.
 
 Ah, that's great.

I should mention that the way it notes this is to reference commits by
their timestamp, author and initial line of log message - it does this
because cvs2git doesn't know the commit sha ever - that doesn't appear
until the stream is fed through git fast-import. I did briefly raise the
idea of augmenting the fast-import process to support substituting
fast-import marks to shas in log messages, but didn't get time to take
it beyond an idea.

 The bad news: It is broken [1].  But I don't think it should be too much
 work to fix it.
 
 That's less great of course, but it gives hope!
 
 Thanks for your continued efforts!

I've just made a commit to cvs2svn trunk. I hope this should now be fixed.

Max.




signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Max Bowsher
On 20/08/10 12:02, Magnus Hagander wrote:
 On Fri, Aug 20, 2010 at 09:49, Max Bowsher m...@f2s.com wrote:
 On 19/08/10 10:35, Magnus Hagander wrote:
 On Thu, Aug 19, 2010 at 07:00, Michael Haggerty mhag...@alum.mit.edu 
 wrote:
 Magnus Hagander wrote:
 Is there some way to make cvs2git work this way, and just not bother
 even trying to create merge commits, or is that fundamentally
 impossible and we need to look at another tool?

 The good news: (I just reminded myself/realized that) Max Bowsher has
 already implemented pretty much exactly what you want in the cvs2svn
 trunk version, including noting in the commit messages any cherry-picks
 that are not reflected in the repo ancestry.

 Ah, that's great.

 I should mention that the way it notes this is to reference commits by
 their timestamp, author and initial line of log message - it does this
 because cvs2git doesn't know the commit sha ever - that doesn't appear
 until the stream is fed through git fast-import. I did briefly raise the
 idea of augmenting the fast-import process to support substituting
 fast-import marks to shas in log messages, but didn't get time to take
 it beyond an idea.

 The bad news: It is broken [1].  But I don't think it should be too much
 work to fix it.

 That's less great of course, but it gives hope!

 Thanks for your continued efforts!

 I've just made a commit to cvs2svn trunk. I hope this should now be fixed.
 
 
 Great. I will download and test the trunk version soon. I'm currently
 running a test using cvs2svn and then git-svn clone from that - but
 it's insanely slow (been going for 30+ hours now, and probably has
 8-10 hours more to go)...

Uh, you are? Why do it that way?

The thing I fixed pertains to the direct use of cvs2git, and will have
no effect on executions of cvs2svn.

I have run cvs2git on the pgsql module of your CVS locally (is that the
right thing to convert?) if you'd like to compare notes on specific
parts of the conversion.

Max.



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] SQLSTATE of notice PGresult

2010-08-20 Thread Euler Taveira de Oliveira
Dmitriy Igrishin escreveu:
   /* NOT presents - NULL. Why not 0 ? */
   const char* sqlstate = Pg::PQresultErrorField(res, PG_DIAG_SQLSTATE);
 
That's because the protocol doesn't set error field when the command
succeeded. IMHO it's an oversight (the documentation is correct but the code
is not) and should be correct because the spec enforces it.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding deadlocks ...

2010-08-20 Thread Kevin Grittner
Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote:
 
 I think truly serializable transactions still need to SELECT FOR
 SHARE here for foreign keys to work, no?
 
That depends on how you look at it.  The SSI patch that Dan and I
have been working on doesn't attempt to change the implementation
techniques for foreign keys, because SSI only enforces integrity
among serializable transactions -- and we want foreign keys to be
enforced regardless of the transaction isolation levels used.
 
When writing queries which will be run at the serializable isolation
level, if you are only concerned with anomalies from interaction
with other serializable transactions, you *never* have to explicitly
code SELECT FOR SHARE or SELECT FOR UPDATE, nor do you ever need to
explicitly request a lock; so from that perspective the answer to
the question is No.  Under the covers, PostgreSQL will continue to
use existing techniques for enforcing referential integrity defined
by foreign keys; so from that perspective the answer to the question
is Yes.
 
Hopefully that made sense
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock bug

2010-08-20 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 The surprising thing is that a particular row is (using the
 identifiers from the attachment):
 
 Process 2 updates a particular row without blocking.
 Process 1 updates the same row, which blocks.
 Process 2 updates the same row again (with *exactly* the same UPDATE
 statement), which fails with a deadlock.
 
 I'm not sure I consider that a bug, but it moves the needle on the
 astonishment meter.

OK, I looked a bit closer.  The first update in process 2 is changing
a row in B that has an FK reference to an already-modified row in A.
The only reason that doesn't block is that we optimize away taking a
sharelock on the referenced row if the update doesn't change the FK
column(s), as this doesn't.  However, the *second* update doesn't
get the benefit of that optimization, as per this comment in trigger.c:

 * There is one exception when updating FK tables: if the
 * updated row was inserted by our own transaction and the
 * FK is deferred, we still need to fire the trigger. This
 * is because our UPDATE will invalidate the INSERT so the
 * end-of-transaction INSERT RI trigger will not do
 * anything, so we have to do the check for the UPDATE
 * anyway.

So it goes and waits for sharelock on the A row, and then you have a
deadlock because process 1 has exclusive lock on that row and is already
blocked waiting for process 2.

The Glue guys aren't the first to complain of this behavior, so it'd
be nice to improve it.

If we knew that the already-updated row was one for which we'd been able
to optimize away the FK check, then we could do so again on the second
update (assuming it still didn't change the FK columns), but I don't see
any practical way to know that.  We only have our hands on the current
update's old and new tuples, not on previous versions; and there's no
convenient way to find the previous version because the update ctid
links run the other way.

[ thinks for awhile... ]  Conceivably we could get around this by
programming the ON INSERT trigger to chase forward to the latest live
row version, rather than just doing nothing when the initially inserted
row has been outdated.  It'd be a pretty ticklish thing to get right,
though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding deadlocks ...

2010-08-20 Thread Marko Tiikkaja

On 2010-08-20 6:19 PM, Kevin Grittner wrote:

Marko Tiikkajamarko.tiikk...@cs.helsinki.fi  wrote:


I think truly serializable transactions still need to SELECT FOR
SHARE here for foreign keys to work, no?


That depends on how you look at it.  The SSI patch that Dan and I
have been working on doesn't attempt to change the implementation
techniques for foreign keys, because SSI only enforces integrity
among serializable transactions -- and we want foreign keys to be
enforced regardless of the transaction isolation levels used.


Exactly.


Regards,
Marko Tiikkaja

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock bug

2010-08-20 Thread Joel Jacobson
Hm, in my example, there are no INSERTs in the two conflicting transactions?
The suggestion on adding an ON INSERT trigger would have no effect as far as
I can see.
The comment from trigger.c is also about INSERT, can't see how it affects
us.

I don't understand exactly why this deadlock occurs, but the one thing I
cannot understand is why process 2 is not allowed to update the same row,
which it has already updated in the same transaction.

In general, if a transaction has a write row lock (or what ever it is
called in postgres), i.e., exclusive right to modify the row in the table,
shouldn't that same transaction always be allowed to update the same row in
a later stage? I understand the foreign key is the reason for the conflict,
but process 2 doesn't attempt to modify the foreign key data, it only does
update on table B.

It just doesn't make sense to abort process 2 with a deadlock in my example.

(If it helps, we would be willing to assign a bounty prize to anyone taking
on the task to solve this problem.)

Best regards,

Joel Jacobson
Glue Finance


2010/8/20 Tom Lane t...@sss.pgh.pa.us

 Kevin Grittner kevin.gritt...@wicourts.gov writes:
  The surprising thing is that a particular row is (using the
  identifiers from the attachment):

  Process 2 updates a particular row without blocking.
  Process 1 updates the same row, which blocks.
  Process 2 updates the same row again (with *exactly* the same UPDATE
  statement), which fails with a deadlock.

  I'm not sure I consider that a bug, but it moves the needle on the
  astonishment meter.

 OK, I looked a bit closer.  The first update in process 2 is changing
 a row in B that has an FK reference to an already-modified row in A.
 The only reason that doesn't block is that we optimize away taking a
 sharelock on the referenced row if the update doesn't change the FK
 column(s), as this doesn't.  However, the *second* update doesn't
 get the benefit of that optimization, as per this comment in trigger.c:

 * There is one exception when updating FK tables: if
 the
 * updated row was inserted by our own transaction and
 the
 * FK is deferred, we still need to fire the trigger.
 This
 * is because our UPDATE will invalidate the INSERT so
 the
 * end-of-transaction INSERT RI trigger will not do
 * anything, so we have to do the check for the UPDATE
 * anyway.

 So it goes and waits for sharelock on the A row, and then you have a
 deadlock because process 1 has exclusive lock on that row and is already
 blocked waiting for process 2.

 The Glue guys aren't the first to complain of this behavior, so it'd
 be nice to improve it.

 If we knew that the already-updated row was one for which we'd been able
 to optimize away the FK check, then we could do so again on the second
 update (assuming it still didn't change the FK columns), but I don't see
 any practical way to know that.  We only have our hands on the current
 update's old and new tuples, not on previous versions; and there's no
 convenient way to find the previous version because the update ctid
 links run the other way.

 [ thinks for awhile... ]  Conceivably we could get around this by
 programming the ON INSERT trigger to chase forward to the latest live
 row version, rather than just doing nothing when the initially inserted
 row has been outdated.  It'd be a pretty ticklish thing to get right,
 though.

regards, tom lane




-- 
Best regards,

Joel Jacobson
Glue Finance

E: j...@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Joshua D. Drake
On Fri, 2010-08-20 at 09:36 -0400, Robert Haas wrote:
 On Fri, Aug 20, 2010 at 9:28 AM, Magnus Hagander mag...@hagander.net wrote:
  I believe Robert had some comments/questions as well :-)
 
 What Magnus means is that I'm a grumpy old developer who complains
 about everything.

+1

JD

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Joshua D. Drake
On Fri, 2010-08-20 at 14:47 +0100, Dave Page wrote:
 On Fri, Aug 20, 2010 at 2:36 PM, Robert Haas robertmh...@gmail.com wrote:
  On Fri, Aug 20, 2010 at 9:28 AM, Magnus Hagander mag...@hagander.net 
  wrote:
  I believe Robert had some comments/questions as well :-)
 
  What Magnus means is that I'm a grumpy old developer who complains
  about everything.
 
 Don't put yourself down. You're not that old :-p

Does that mean he is only going to get worse?

;)

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding deadlocks ...

2010-08-20 Thread Greg Stark
On Thu, Aug 19, 2010 at 11:51 PM, Josh Berkus j...@agliodbs.com wrote:
 update session where id = X;
                        update order where orderid = 5;
 update order where orderid = 5;

So i think this will already deadlock.

A has a exclusive-lock on sessionX and is waiting on order5. B has
an exclusive lock on order5 and is waiting on a share-lock on
sessionx

                        update order where orderid = 5;
 ... deadlock error.


Do you actually get a prompt here to type this command?

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding deadlocks ...

2010-08-20 Thread Kevin Grittner
Greg Stark gsst...@mit.edu wrote:
 Josh Berkus j...@agliodbs.com wrote:
 update session where id = X;
update order where orderid = 5;
 update order where orderid = 5;
 
 So i think this will already deadlock.
 
 A has a exclusive-lock on sessionX and is waiting on order5. B
 has an exclusive lock on order5 and is waiting on a share-lock
 on sessionx
 
No, see Tom's explanation on the related Deadlock bug thread:
 
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01464.php
 
update order where orderid = 5;
 ... deadlock error.
 
 Do you actually get a prompt here to type this command?
 
Yes.  The attachment at the start of the other thread makes it easy
to confirm:
 
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01447.php
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Alvaro Herrera
Excerpts from Tom Lane's message of jue ago 19 23:35:06 -0400 2010:
 Robert Haas robertmh...@gmail.com writes:
  So I propose moving the TablespaceCreateDbspace() call to mdcreate(),
  removing the redundant check from smgrcreate(), and deleting that
  portion of the comment which says this is in the wrong place.
 
 While I don't care for having smgr.c call tablespace.c, moving the call to
 md.c instead is surely not an improvement :-(.  The problem here is that
 we'd like the tablespace code to be above the smgr code, not below.
 Calling it from md.c makes the layer inversion worse not better.

I proposed moving that code to storage.c some time ago but was shot down
for reasons I don't remember, and didn't press further.  Perhaps the
idea of moving it all to smgr.c/md.c for tablespace.c to call makes more
sense; but if that doesn't happen, I still think that storage.c is a
better place.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Tom Lane
Max Bowsher m...@f2s.com writes:
 The history that cvs2svn is aiming to represent here is this:

 1) At the time of creation of the REL8_4_STABLE branch, plperl_opmask.pl
 did *not* exist.

 2) Later, it was added to trunk.

 3) Then, someone retroactively added the branch tag to the file, marking
 it as included in the REL8_4_STABLE branch. [This corresponds to the git
 changeset that Robert is questioning]

Uh, no.  We have never retroactively added anything to any branch.
I don't know enough about the innards of CVS to know what its internal
representation of this sort of thing is, but all that actually happened
here was a cvs add and a cvs commit in REL8_4_STABLE long after the
branch occurred.  We would like the git history to look like that too.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 19:28, Tom Lane t...@sss.pgh.pa.us wrote:
 Max Bowsher m...@f2s.com writes:
 The history that cvs2svn is aiming to represent here is this:

 1) At the time of creation of the REL8_4_STABLE branch, plperl_opmask.pl
 did *not* exist.

 2) Later, it was added to trunk.

 3) Then, someone retroactively added the branch tag to the file, marking
 it as included in the REL8_4_STABLE branch. [This corresponds to the git
 changeset that Robert is questioning]

 Uh, no.  We have never retroactively added anything to any branch.
 I don't know enough about the innards of CVS to know what its internal
 representation of this sort of thing is, but all that actually happened
 here was a cvs add and a cvs commit in REL8_4_STABLE long after the
 branch occurred.  We would like the git history to look like that too.

Yeah.

In fact, is the only thing that's wrong here the commit message?
Because it's probably trivial to just patch that away.. Hmm, but i
guess we'd like to hav ethe actual commit message and not just another
fixed one..

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small smgrcreate cleanup patch

2010-08-20 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Tom Lane's message of jue ago 19 23:35:06 -0400 2010:
 While I don't care for having smgr.c call tablespace.c, moving the call to
 md.c instead is surely not an improvement :-(.  The problem here is that
 we'd like the tablespace code to be above the smgr code, not below.
 Calling it from md.c makes the layer inversion worse not better.

 I proposed moving that code to storage.c some time ago but was shot down
 for reasons I don't remember, and didn't press further.  Perhaps the
 idea of moving it all to smgr.c/md.c for tablespace.c to call makes more
 sense; but if that doesn't happen, I still think that storage.c is a
 better place.

Hmm.  I've never been terribly happy with storage.c: it doesn't have any
clear place in the layering, and looks like it's mostly code that has
been ripped out of smgr.c for no very defensible reason, and then moved
into catalog/ for even less reason.  Maybe we should back up and rethink
the organization of all of smgr.c/md.c/storage.c.

The real underlying problem here is that there's so much stuff in
commands/ that is freely violating the smgr abstraction level by poking
at the filesystem directly.  It's impossible to have a nice layering,
or even what you could call an abstraction, as long as things stay like
that.  I guess it got that way because smgr.c was so useless that nobody
bothered to factor it in when thinking about how to implement
tablespaces and suchlike.  But if we're going to do anything at all in
the way of cleaning up these interfaces, we need to start with a
redesign that re-establishes some clear division of labor.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 19:41, Max Bowsher m...@f2s.com wrote:
 On 20/08/10 18:30, Magnus Hagander wrote:
 On Fri, Aug 20, 2010 at 19:28, Tom Lane t...@sss.pgh.pa.us wrote:
 Max Bowsher m...@f2s.com writes:
 The history that cvs2svn is aiming to represent here is this:

 1) At the time of creation of the REL8_4_STABLE branch, plperl_opmask.pl
 did *not* exist.

 2) Later, it was added to trunk.

 3) Then, someone retroactively added the branch tag to the file, marking
 it as included in the REL8_4_STABLE branch. [This corresponds to the git
 changeset that Robert is questioning]

 Uh, no.  We have never retroactively added anything to any branch.
 I don't know enough about the innards of CVS to know what its internal
 representation of this sort of thing is, but all that actually happened
 here was a cvs add and a cvs commit in REL8_4_STABLE long after the
 branch occurred.  We would like the git history to look like that too.

 Yeah.

 In fact, is the only thing that's wrong here the commit message?
 Because it's probably trivial to just patch that away.. Hmm, but i
 guess we'd like to hav ethe actual commit message and not just another
 fixed one..

 There is no actual commit message - the entire changeset is
 synthesized by cvs2git to represent the addition of a branch tag to the
 file - i.e. the logical equivalent of a cvs tag -b, which has no
 commit message.

There is a commit message on the trunk when the file was added there.
Is there any chance of being able to lift that message off trunk and
just copy it into the branch, instead of saying this is a cherry-pick
of this commit over here?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 19:56, Max Bowsher m...@f2s.com wrote:
 On 20/08/10 18:43, Magnus Hagander wrote:
 On Fri, Aug 20, 2010 at 19:41, Max Bowsher m...@f2s.com wrote:
 On 20/08/10 18:30, Magnus Hagander wrote:
 On Fri, Aug 20, 2010 at 19:28, Tom Lane t...@sss.pgh.pa.us wrote:
 Max Bowsher m...@f2s.com writes:
 The history that cvs2svn is aiming to represent here is this:

 1) At the time of creation of the REL8_4_STABLE branch, plperl_opmask.pl
 did *not* exist.

 2) Later, it was added to trunk.

 3) Then, someone retroactively added the branch tag to the file, marking
 it as included in the REL8_4_STABLE branch. [This corresponds to the git
 changeset that Robert is questioning]

 Uh, no.  We have never retroactively added anything to any branch.
 I don't know enough about the innards of CVS to know what its internal
 representation of this sort of thing is, but all that actually happened
 here was a cvs add and a cvs commit in REL8_4_STABLE long after the
 branch occurred.  We would like the git history to look like that too.

 Yeah.

 In fact, is the only thing that's wrong here the commit message?
 Because it's probably trivial to just patch that away.. Hmm, but i
 guess we'd like to hav ethe actual commit message and not just another
 fixed one..

 There is no actual commit message - the entire changeset is
 synthesized by cvs2git to represent the addition of a branch tag to the
 file - i.e. the logical equivalent of a cvs tag -b, which has no
 commit message.

 There is a commit message on the trunk when the file was added there.
 Is there any chance of being able to lift that message off trunk and
 just copy it into the branch, instead of saying this is a cherry-pick
 of this commit over here?

 It wouldn't be accurate to do so, because the synthetic commit is not
 copying the entire change, only registering the addition of (in this
 case) one file which happens to be part of the changeset. Please note
 that there is a changeset on the branch immediately following the
 synthetic one under discussion which contains the 'real' commit message
 used when committing to the branch.

Hmm. Good point.

I wonder if we should try to locate these places and fix them with git
filter-branch or rebase -i after the fact, with history rewriting.

There seem to be 57 of them.

Searching for those, I also found a bunch with the comment Sprouted
from branch. What do those mean?


 My guess at this point is that there may be a (very old?) version of cvs
 which, when adding a file to a branch, actually misrecorded the file as
 having existed on the branch from the moment it was first added to trunk
 - this would explain this anomaly.

Well, the one Robert pointed to is a very recent commit. Not sure if
it uses the client version or the server version - the version on
cvs.postgresql.org is:

[...@cvs ~]$ cvs --version

Concurrent Versions System (CVS) 1.11.17-FreeBSD (client/server)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Version Numbering

2010-08-20 Thread David E. Wheeler
Hackers,

A while ago, I asked if .0 releases could be versioned with three digits 
instead of two. That is, it would be 8.4.0 instead of 8.4. This is to make 
the format consistent with maintenance releases (8.4.1, etc.). I thought this 
was generally agreed upon, but maybe not, because I just went to build the 
latest 9.0 beta and saw that the version number is 9.0beta4.

Would it be possible to *always* use three integers? So the next release would 
be 9.0.0beta5 or 9.0.0rc1? In addition to being more consistent, it also 
means that PostgreSQL would be adhering to Semantic Versioning 
(http://semver.org/), which is a very simple format that's internally 
consistent. I'm planning to require semantic versioning for PGXN, and it'd be 
nice if the core could do the same thing (it will make it nicer for specifying 
dependencies on core contrib modules, for example).

Thanks,

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Parallel pg_restore versus dependencies

2010-08-20 Thread Tom Lane
I've been poking into bug #5626,
http://archives.postgresql.org/pgsql-bugs/2010-08/msg00291.php

What's basically going on here is:
1. User tried to suppress the public schema from the restore list.
2. Since almost everything in the dump depends on the public schema,
pg_restore skipped over most of it looking for something it could
restore.  It soon hit a TABLE DATA item from another schema, triggering
the switch into actual parallel restore mode.
3. Eventually, it found the public schema, which SortTocFromFile had
pushed to the end of the TOC list.  At that point it recognized that
it shouldn't actually emit the item, so it didn't, but it did mark
the dependencies satisfied.
4. Now the floodgates are open to try to restore all the DDL items
in the public schema.  But we're trying to do it in parallel.
Because pg_dump is exceedingly cavalier about marking DDL items with
their full dependencies, things soon go pear-shaped: in the reported
bug, we tried to do two interdependent DDL ops in parallel, and when
trying to duplicate the bug using the regression database, I
consistently got failures from restoring a view that depended on
not-yet-restored functions.

It'd probably be nice if the dependency data were more complete for
DDL items, but getting that right is a long-term project, and in
any case pg_restore can't really rely on it to be there in existing
dump files.  Right now that data is only really trustable for
SECTION_DATA and SECTION_POST_DATA items, and we have to rely on
the dump ordering for PRE_DATA items.

I think we can patch it up for now by doing two things:

* Tweak SortTocFromFile so that items not-to-be-restored end up at the
*head* of the re-ordered TOC list, not the tail.  This won't actually
make any difference in net runtime, but what it will do is ensure that
we scan those items and mark their dependencies as satisfied before
anything starts to happen for real.  Thus omitting an item won't
result in unexpected departures from the commanded restore order.

* In restore_toc_entries_parallel, don't exit the serial restore mode
and start parallel restoring until we reach a TOC item that is both
DATA/POST_DATA *and* marked to be restored.  This will prevent any
not-to-be-restored DATA/POST_DATA items at the list head from triggering
a premature switch into parallel restore mode.

It will still be the case that you can break it with an unwise choice of
restore order from a -L file, but at least it won't fail because of
hidden implementation behaviors.

In HEAD and perhaps 9.0, we could make things more robust by only
putting DATA/POST_DATA items into the parallel-restore lists in the
first place, and forcing all PRE_DATA items to be done in the initial
serial restore loop.  However this would amount to ignoring the
commanded -L order to a greater extent than strictly necessary.
I'm not entirely sure if that's a good idea or not.  Should we try to
honor the -L order even when it's not very safe?

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Glue] [HACKERS] Deadlock bug

2010-08-20 Thread Josh Berkus
On 8/20/10 7:18 AM, Tom Lane wrote:
 It does go through without any deadlock, *if* there is no foreign key
 involved.  You didn't tell us exactly what the FK relationship is, but
 I suspect the reason for the deadlock is that one process is trying to
 update a row that references some row already updated by the other.
 That will require a row-level share lock on the referenced row, so you
 can get a deadlock.

That's correct. This is the generic example I was talking about earlier
on -hackers.  I'm not certain it's a bug per spec; I wanted to talk
through with Kevin what we *should* be doing in this situation.

This is one example of a set of user-hostile FK-related deadlock
behavior we have.  I'm just not certain it's logically possible to
improve it.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Tom Lane
Max Bowsher m...@f2s.com writes:
 My guess at this point is that there may be a (very old?) version of cvs
 which, when adding a file to a branch, actually misrecorded the file as
 having existed on the branch from the moment it was first added to trunk
 - this would explain this anomaly.

I have no idea what version of CVS is running on our master server.
I have noticed that it sometimes generates its own synthetic commit
messages for cases related to this, for example these events on HEAD:

2010-05-13 12:40  adunstan

* src/pl/plperl/sql/plperlu_plperl.sql: file plperlu_plperl.sql was
initially added on branch REL8_4_STABLE.

2010-05-13 12:40  adunstan

* src/pl/plperl/expected/plperlu_plperl.out: file
plperlu_plperl.out was initially added on branch REL8_4_STABLE.

I don't see one of these for plperl_opmask.pl in particular, so there
may be more than one anomaly involved.

However, the bottom line here is that we don't want the history that
cvs2git is preparing for these events, because it doesn't correspond to
what we did.  Whether this is the most faithful representation of the
CVS history is academic; it simply is not reality.  What we would like
is for the history to look like the file got added to the branch as of
the first commit that touched it on that branch.  That is reality, as
it appears from our neck of the woods anyway.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread David Fetter
On Fri, Aug 20, 2010 at 11:12:56AM -0700, David Wheeler wrote:
 Hackers,
 
 A while ago, I asked if .0 releases could be versioned with three
 digits instead of two. That is, it would be 8.4.0 instead of
 8.4. This is to make the format consistent with maintenance
 releases (8.4.1, etc.). I thought this was generally agreed upon,
 but maybe not, because I just went to build the latest 9.0 beta and
 saw that the version number is 9.0beta4.
 
 Would it be possible to *always* use three integers? So the next
 release would be 9.0.0beta5 or 9.0.0rc1? In addition to being
 more consistent, it also means that PostgreSQL would be adhering to
 Semantic Versioning (http://semver.org/), which is a very simple
 format that's internally consistent. I'm planning to require
 semantic versioning for PGXN, and it'd be nice if the core could do
 the same thing (it will make it nicer for specifying dependencies on
 core contrib modules, for example).

+1 for three-number versions...well, until we really see the light and
go to two-number versions.  8.3 and 8.4 are different enough that they
shouldn't even mildly appear the same, for example.

Cheers,
David (Oh, how silly!  You actually want Frobozz 3.1.4.1.5.2.6, not 
3.1.4.1.5.2.5!).
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding deadlocks ...

2010-08-20 Thread Josh Berkus
On 8/20/10 8:23 AM, Marko Tiikkaja wrote:
 On 2010-08-20 6:19 PM, Kevin Grittner wrote:
 Marko Tiikkajamarko.tiikk...@cs.helsinki.fi  wrote:

 I think truly serializable transactions still need to SELECT FOR
 SHARE here for foreign keys to work, no?

 That depends on how you look at it.  The SSI patch that Dan and I
 have been working on doesn't attempt to change the implementation
 techniques for foreign keys, because SSI only enforces integrity
 among serializable transactions -- and we want foreign keys to be
 enforced regardless of the transaction isolation levels used.

Ok, then that's not a fix for this particular problem.  This case is a
good example, though, of showing how deadlocks are the most expensive
type of serialization failure, and thus models which avoid deadlocks (in
favor of other kinds of blocking and/or serialization errors) are desirable.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread David E. Wheeler
On Aug 20, 2010, at 11:34 AM, David Fetter wrote:

 +1 for three-number versions...well, until we really see the light and
 go to two-number versions.  8.3 and 8.4 are different enough that they
 shouldn't even mildly appear the same, for example.

No idea what you mean by that, but generally it's a bad idea to switch from 
dotted-integer version numbers and numeric version numbers. See Perl (Quel 
désastre!).

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Glue] [HACKERS] Deadlock bug

2010-08-20 Thread Joel Jacobson
In my example,

Process 1:Process 2:
BEGIN;
SELECT pg_backend_pid();
BEGIN;
SELECT
pg_backend_pid();
UPDATE A SET Col1 = 1 WHERE AID = 1;
SELECT * FROM vLocks WHERE PID IN (2165,2157);
UPDATE B SET Col2 =
1 WHERE BID = 2;
SELECT * FROM vLocks
WHERE PID IN (2165,2157);
UPDATE B SET Col2 = 1 WHERE BID = 2;
SELECT * FROM vLocks WHERE PID IN (2165,2157);
UPDATE B SET Col2 =
1 WHERE BID = 2;
SELECT * FROM vLocks
WHERE PID IN (2165,2157);

Process 2 is aborted due to deadlock, while process 1 is allowed to commit.

If the locking logic would be modified to allow process 2 to go through, and
instead abort process 1, I understand some other scenarios would of course
be affected, where the situation would be handled in a less optimal way.

Is there any example of scenarios where it is optimal to handle this kind of
locking situation in this way?

I am totally fine living with a feature, which is a problem in some cases,
and something good in other cases, as long as the good cases are more common
than the problem cases.

Another question, Tom referred to a comment in
src/backend/command/trigger.c.
My example does not contain any triggers, nor insert commands. Is the
trigger.c-comment still relevant or is it a misunderstanding?

2010/8/20 Josh Berkus j...@agliodbs.com

 On 8/20/10 7:18 AM, Tom Lane wrote:
  It does go through without any deadlock, *if* there is no foreign key
  involved.  You didn't tell us exactly what the FK relationship is, but
  I suspect the reason for the deadlock is that one process is trying to
  update a row that references some row already updated by the other.
  That will require a row-level share lock on the referenced row, so you
  can get a deadlock.

 That's correct. This is the generic example I was talking about earlier
 on -hackers.  I'm not certain it's a bug per spec; I wanted to talk
 through with Kevin what we *should* be doing in this situation.

 This is one example of a set of user-hostile FK-related deadlock
 behavior we have.  I'm just not certain it's logically possible to
 improve it.

 --
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com




-- 
Best regards,

Joel Jacobson
Glue Finance

E: j...@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden


Re: [HACKERS] Version Numbering

2010-08-20 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 A while ago, I asked if .0 releases could be versioned with three
 digits instead of two. That is, it would be 8.4.0 instead of 8.4.

We've been doing that for some time, no?  A quick look at the CVS
history shows that 8.0.0 and up were tagged that way.

 This is to make the format consistent with maintenance releases (8.4.1, 
 etc.). I thought this was generally agreed upon, but maybe not, because I 
 just went to build the latest 9.0 beta and saw that the version number is 
 9.0beta4.

.0 is for releases, not betas.  I see no need for an extra number in
beta versions.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread David E. Wheeler

On Aug 20, 2010, at 11:40 AM, Tom Lane wrote:

 David E. Wheeler da...@kineticode.com writes:
 A while ago, I asked if .0 releases could be versioned with three
 digits instead of two. That is, it would be 8.4.0 instead of 8.4.
 
 We've been doing that for some time, no?  A quick look at the CVS
 history shows that 8.0.0 and up were tagged that way.

Ah, good for the final release.

 This is to make the format consistent with maintenance releases (8.4.1, 
 etc.). I thought this was generally agreed upon, but maybe not, because I 
 just went to build the latest 9.0 beta and saw that the version number is 
 9.0beta4.
 
 .0 is for releases, not betas.  I see no need for an extra number in
 beta versions.

Again, it means the format would be consistent. Always three integers. Nice 
thing about Semantic Versions is that if you append any ASCII string to the 
third integer, it automatically means less than that integer.

Best,

David



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Glue] [HACKERS] Deadlock bug

2010-08-20 Thread Josh Berkus

 Another question, Tom referred to a comment in
 src/backend/command/trigger.c.
 My example does not contain any triggers, nor insert commands. Is the
 trigger.c-comment still relevant or is it a misunderstanding?

It's relevant for how the FKs are handled.


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread David Fetter
On Fri, Aug 20, 2010 at 11:36:55AM -0700, David Wheeler wrote:
 On Aug 20, 2010, at 11:34 AM, David Fetter wrote:
 
  +1 for three-number versions...well, until we really see the light
  and go to two-number versions.  8.3 and 8.4 are different enough
  that they shouldn't even mildly appear the same, for example.
 
 No idea what you mean by that, but generally it's a bad idea to
 switch from dotted-integer version numbers and numeric version
 numbers. See Perl (Quel désastre!).

I'm thinking that after 9.0, the first release of the next major
version should be 10.0, and the one after that, 11.0, etc., etc.

The current system give people the completely false impression that
7.0 and 7.4 are somehow similar.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread David E. Wheeler
On Aug 20, 2010, at 11:47 AM, David Fetter wrote:

 No idea what you mean by that, but generally it's a bad idea to
 switch from dotted-integer version numbers and numeric version
 numbers. See Perl (Quel désastre!).
 
 I'm thinking that after 9.0, the first release of the next major
 version should be 10.0, and the one after that, 11.0, etc., etc.

Oh. Good luck with that. I disagree, FWIW.

 The current system give people the completely false impression that
 7.0 and 7.4 are somehow similar.

On what planet?

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock bug

2010-08-20 Thread Tom Lane
Joel Jacobson j...@gluefinance.com writes:
 I don't understand exactly why this deadlock occurs, but the one thing I
 cannot understand is why process 2 is not allowed to update the same row,
 which it has already updated in the same transaction.

It *is* allowed to, and in fact has already done so.  The problem is
that it now needs a sharelock on the referenced row in order to ensure
that the FK constraint remains satisfied, ie, nobody deletes the
referenced row before we commit the update.  In the general case where
the referencing row is new (or has a new FK value) in the current
transaction, such a lock is necessary for correctness.  Your case would
work if we could optimize away the FK check, but with only a limited
view of what's happened in the current transaction, it's not always
possible to optimize away the check.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 In fact, is the only thing that's wrong here the commit message?
 Because it's probably trivial to just patch that away.. Hmm, but i
 guess we'd like to hav ethe actual commit message and not just another
 fixed one.. 

If I understand Max's statements correctly, there is an observable
problem in the actual git history, not just the commit log entries:
it will believe that a file added on a branch had been there since
the branch forked off, not just as of the time it got added.

Now, I would think that your tests of file contents as of the various
release tags should have caught extra files, so maybe I'm
misunderstanding.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 20:52, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 In fact, is the only thing that's wrong here the commit message?
 Because it's probably trivial to just patch that away.. Hmm, but i
 guess we'd like to hav ethe actual commit message and not just another
 fixed one..

 If I understand Max's statements correctly, there is an observable
 problem in the actual git history, not just the commit log entries:
 it will believe that a file added on a branch had been there since
 the branch forked off, not just as of the time it got added.

 Now, I would think that your tests of file contents as of the various
 release tags should have caught extra files, so maybe I'm
 misunderstanding.

I haven't been able to complete that test on the repo converted by the
new version yet, because the repo Max prepared for us had the keyword
problem. The other process is still running.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Glue] Deadlock bug

2010-08-20 Thread Greg Stark
On Fri, Aug 20, 2010 at 7:38 PM, Joel Jacobson j...@gluefinance.com wrote:
 If the locking logic would be modified to allow process 2 to go through, and
 instead abort process 1, I understand some other scenarios would of course
 be affected, where the situation would be handled in a less optimal way.

Which process dies when there's a deadlock is pretty much arbitary.
The first process to notice the deadlock will just throw an error
itself.  Which one notices first depends on the timing of when the
blocking locks were taken.

If the second process to get stuck blocks before the first process
checks then the first process will notice first. If it does other
stuff first then the first process will check, not find a deadlock and
go back to sleep. Then the deadlock won't be detected until the second
process checks.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Glue] Deadlock bug

2010-08-20 Thread Joel Jacobson
OK. Thanks for the explanation. It's probably the case in general, but in
all of my tests (10), process 2 always aborts. I don't think it is
arbitrary in this example, or could it be?

2010/8/20 Greg Stark gsst...@mit.edu

 On Fri, Aug 20, 2010 at 7:38 PM, Joel Jacobson j...@gluefinance.com
 wrote:
  If the locking logic would be modified to allow process 2 to go through,
 and
  instead abort process 1, I understand some other scenarios would of
 course
  be affected, where the situation would be handled in a less optimal way.

 Which process dies when there's a deadlock is pretty much arbitary.
 The first process to notice the deadlock will just throw an error
 itself.  Which one notices first depends on the timing of when the
 blocking locks were taken.

 If the second process to get stuck blocks before the first process
 checks then the first process will notice first. If it does other
 stuff first then the first process will check, not find a deadlock and
 go back to sleep. Then the deadlock won't be detected until the second
 process checks.

 --
 greg




-- 
Best regards,

Joel Jacobson
Glue Finance

E: j...@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden


Re: [HACKERS] Version Numbering

2010-08-20 Thread Greg Stark
On Fri, Aug 20, 2010 at 7:34 PM, David Fetter da...@fetter.org wrote:
 +1 for three-number versions...well, until we really see the light and
 go to two-number versions.  8.3 and 8.4 are different enough that they
 shouldn't even mildly appear the same, for example.

You realize if we did that 9.0 would be version 18?

 David (Oh, how silly!  You actually want Frobozz 3.1.4.1.5.2.6, not 
 3.1.4.1.5.2.5!).

So eventually you end up with the same problem. Oh, you wanted version
117 not 116!




-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Glue] [HACKERS] Deadlock bug

2010-08-20 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:
 
 That's correct. This is the generic example I was talking about
 earlier on -hackers.  I'm not certain it's a bug per spec; I
 wanted to talk through with Kevin what we *should* be doing in
 this situation.
 
I'm certainly happy to address what impact the SSI patch will have
on such behavior, and I've been known to have opinions on related
issues, but I don't know if I can carry the weight you seem to be
suggesting with that statement.  ;-)
 
[gamely doing my best...]
 
In general, the spec defines levels less strict than serializable
(and also serializable in spec versions before 1999) in terms of
anomalies which are prohibited, with the database being allowed to
block and/or generate serialization failures as needed to prevent
the anomalies.  In the 1999 version and later there is the
additional requirement that behavior of concurrent serializable
transactions which successfully commit be consistent with *some*
serial execution of those transactions.
 
I don't see anything in PostgreSQL's current behavior on the
particular example you raised which isn't compliant with the spec,
even if it is surprising.  (Well, with the exception of the SQLSTATE
used for deadlocks, which in my opinion should be '40001'.)
 
 This is one example of a set of user-hostile FK-related deadlock
 behavior we have.  I'm just not certain it's logically possible to
 improve it.
 
If there are a lot of user-hostile behaviors there, it might be
worth looking at the possibility of bending the SSI techniques to
that end, although I think it would be a mistake to burden the
initial patch with that.  Off the top of my head, I think it would
require extending much of the SSI behavior to most of the DML
execution on tables which participate in FK relationships,
regardless of transaction isolation level.  I'm not sure if that's
even feasible -- if it is, someone would need to work out a solid
theoretical basis for why and how it would work.  It might be that
the only way SSI could cover FK relationships is if there was a
database or cluster option to make all transactions fully
serializable.  (NOTE: If there were, *I* would use it, since it
would guarantee that I could rely upon any business rules enforced
by database triggers, which I would consider a valuable guarantee.)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock bug

2010-08-20 Thread Joel Jacobson
Process 1 updates A in its transaction, which is still going on when process
2 updates B, requiring a sharelock on A, which it is granted. But when
process 2 does its second update of B, also of course requiring a sharelock
on A, it is not granted.

I fully agree it must obtain a sharelock on the FK, but I cannot understand
why it is granted it the first time, but not the second time?

2010/8/20 Tom Lane t...@sss.pgh.pa.us

 Joel Jacobson j...@gluefinance.com writes:
  I don't understand exactly why this deadlock occurs, but the one thing I
  cannot understand is why process 2 is not allowed to update the same row,
  which it has already updated in the same transaction.

 It *is* allowed to, and in fact has already done so.  The problem is
 that it now needs a sharelock on the referenced row in order to ensure
 that the FK constraint remains satisfied, ie, nobody deletes the
 referenced row before we commit the update.  In the general case where
 the referencing row is new (or has a new FK value) in the current
 transaction, such a lock is necessary for correctness.  Your case would
 work if we could optimize away the FK check, but with only a limited
 view of what's happened in the current transaction, it's not always
 possible to optimize away the check.

regards, tom lane




-- 
Best regards,

Joel Jacobson
Glue Finance

E: j...@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden


Re: [HACKERS] Version Numbering

2010-08-20 Thread Greg Stark
On Fri, Aug 20, 2010 at 7:42 PM, David E. Wheeler da...@kineticode.com wrote:
 Again, it means the format would be consistent. Always three integers. Nice 
 thing about Semantic Versions is that if you append any ASCII string to the 
 third integer, it automatically means less than that integer.


So I count three integers in both 9.0rc1 and 9.0beta4


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread David E. Wheeler
On Aug 20, 2010, at 12:02 PM, Greg Stark wrote:

 Again, it means the format would be consistent. Always three integers. Nice 
 thing about Semantic Versions is that if you append any ASCII string to the 
 third integer, it automatically means less than that integer.
 
 
 So I count three integers in both 9.0rc1 and 9.0beta4

No, I mean 9.0.0beta4. If we were to adopt the Semantic Versioning spec, one 
would *always* use X.Y.Z, with optional ASCII characters appended to Z to add 
meaning (including less than unadorned Z).

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock bug

2010-08-20 Thread Tom Lane
Joel Jacobson j...@gluefinance.com writes:
 I fully agree it must obtain a sharelock on the FK, but I cannot understand
 why it is granted it the first time, but not the second time?

It *isn't* granted it the first time, because it doesn't try to acquire
it the first time.  That FK check gets optimized away, while the second
one doesn't.  Please reread what I said before.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 On Aug 20, 2010, at 12:02 PM, Greg Stark wrote:
 So I count three integers in both 9.0rc1 and 9.0beta4

 No, I mean 9.0.0beta4. If we were to adopt the Semantic Versioning spec, one 
 would *always* use X.Y.Z, with optional ASCII characters appended to Z to add 
 meaning (including less than unadorned Z).

Well, I for one will fiercely resist adopting any such standard, because
it's directly opposite to the way that RPM will sort such version numbers.
Apparently whoever wrote Semantic Versioning didn't bother to inquire
into existing practice.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread David E. Wheeler
On Aug 20, 2010, at 12:15 PM, Tom Lane wrote:

 No, I mean 9.0.0beta4. If we were to adopt the Semantic Versioning spec, one 
 would *always* use X.Y.Z, with optional ASCII characters appended to Z to 
 add meaning (including less than unadorned Z).
 
 Well, I for one will fiercely resist adopting any such standard, because
 it's directly opposite to the way that RPM will sort such version numbers.

Which is how?

 Apparently whoever wrote Semantic Versioning didn't bother to inquire
 into existing practice.

Tom Preston-Warner of GitHub fame.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock bug

2010-08-20 Thread Josh Berkus

 It *is* allowed to, and in fact has already done so.  The problem is
 that it now needs a sharelock on the referenced row in order to ensure
 that the FK constraint remains satisfied, ie, nobody deletes the
 referenced row before we commit the update.  In the general case where
 the referencing row is new (or has a new FK value) in the current
 transaction, such a lock is necessary for correctness.  Your case would
 work if we could optimize away the FK check, but with only a limited
 view of what's happened in the current transaction, it's not always
 possible to optimize away the check.

Hmmm.  It seems to me that we'd need a sharelock on the referenced row
both times.  Is the below sequence missing something?

process 1   process 1 locks process 2   process 2 locks

update session; exclusive lock
session row;
update orders;  exclusive lock
orders row;
share lock
session row;
update orders;  exclusive lock
requested orders row
(blocks);
share lock session row; 
update orders;  exclusive lock
orders row;
share lock
session row;

(in this example, there is an fk orders.sessionid -- session.id )

It certainly seems that process 2 is acquiring exactly the same locks
twice, since the referenced value is never being changed.  So why does
it need a share lock the 2nd time and not the first?  Or is the
sharelock in the first cycle being optimized away improperly?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread Devrim GÜNDÜZ

+1 for Tom's post.

--
Devrim GÜNDÜZ
PostgreSQL DBA @ Akinon/Markafoni, Red Hat Certified Engineer
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz

20.Ağu.2010 tarihinde 21:40 saatinde, Tom Lane t...@sss.pgh.pa.us  
şunları yazdı:



David E. Wheeler da...@kineticode.com writes:

A while ago, I asked if .0 releases could be versioned with three
digits instead of two. That is, it would be 8.4.0 instead of 8.4.


We've been doing that for some time, no?  A quick look at the CVS
history shows that 8.0.0 and up were tagged that way.

This is to make the format consistent with maintenance releases  
(8.4.1, etc.). I thought this was generally agreed upon, but  
maybe not, because I just went to build the latest 9.0 beta and saw  
that the version number is 9.0beta4.


.0 is for releases, not betas.  I see no need for an extra number in
beta versions.

   regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock bug

2010-08-20 Thread Joel Jacobson
Optimized away, check, OK, but why? Because there is no new data in the FK
(table A) at the point of the first update of table B in process 2? But when
process 1 updates A, the FK B-A points to new data, which leads to process
2 tries to acquire a sharelock, which is not granted due to the update of A?

2010/8/20 Tom Lane t...@sss.pgh.pa.us

 Joel Jacobson j...@gluefinance.com writes:
  I fully agree it must obtain a sharelock on the FK, but I cannot
 understand
  why it is granted it the first time, but not the second time?

 It *isn't* granted it the first time, because it doesn't try to acquire
 it the first time.  That FK check gets optimized away, while the second
 one doesn't.  Please reread what I said before.

regards, tom lane




-- 
Best regards,

Joel Jacobson
Glue Finance

E: j...@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden


Re: [HACKERS] Version Numbering

2010-08-20 Thread Devrim GÜNDÜZ
20.Ağu.2010 tarihinde 21:47 saatinde, David Fetter da...@fetter.org  
şunları yazdı:

The current system give people the completely false impression that
7.0 and 7.4 are somehow similar.


Well, I do find PostgreSQL versioning policy very good, which is  
pretty much similar to Linux. For me, 7.x are similar. Remember why we  
jumped from 7.5 to 8.0 or from 8.5 to 9.0.


Cheers,
--
Devrim GÜNDÜZ
PostgreSQL DBA @ Akinon/Markafoni, Red Hat Certified Engineer
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git: uh-oh

2010-08-20 Thread Magnus Hagander
On Fri, Aug 20, 2010 at 21:22, Max Bowsher m...@f2s.com wrote:
 On 20/08/10 19:54, Magnus Hagander wrote:
 On Fri, Aug 20, 2010 at 20:52, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 In fact, is the only thing that's wrong here the commit message?
 Because it's probably trivial to just patch that away.. Hmm, but i
 guess we'd like to hav ethe actual commit message and not just another
 fixed one..

 If I understand Max's statements correctly, there is an observable
 problem in the actual git history, not just the commit log entries:
 it will believe that a file added on a branch had been there since
 the branch forked off, not just as of the time it got added.

 Not since the branch forked off, but rather it will believe the file
 added to the branch from the moment it was added to trunk - the issue is
 actually in the cvs repository too - were you to ask CVS for the state
 of the branch at the relevant time, you'd see the extra file there too.

 In the specific case we've been looking at so far, the file is only
 appearing less than a minute prematurely.

Yeah, that's because in our backpatching we generally do them at the
same time, so cvs2cl will pick it up. E.g. you modify all the branches
and have a script commit to them all with the same commit message.


 Now, I would think that your tests of file contents as of the various
 release tags should have caught extra files, so maybe I'm
 misunderstanding.

 I haven't been able to complete that test on the repo converted by the
 new version yet, because the repo Max prepared for us had the keyword
 problem. The other process is still running.

 Would it help at all for you to send me the options file and related
 file so I can produce a repository converted as you are expecting?

In fact, the conversion *just* finished. I'm running the comparison
script now. It's at least looking reasonably right - no changes in
REL6_4. It'll take a while for it to finish on the rest... This, in
fact, means that it's doing better than version 2.3.0 with regards to
the small issues with had with vendor branches as well, which is good
news (see other threads in the archives).

That said, the options file is certainly not secret. I've sent the one
used for 2.3.0 before, here's the one I used for trunk (trunk of
cvs2git).

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


cvs2git-trunk.options
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread David E. Wheeler
On Aug 20, 2010, at 12:21 PM, Devrim GÜNDÜZ wrote:

 +1 for Tom's post.
 
 20.Ağu.2010 tarihinde 21:40 saatinde, Tom Lane t...@sss.pgh.pa.us şunları 
 yazdı:
 
 .0 is for releases, not betas.  I see no need for an extra number in
 beta versions.

Yes, well, it's still implicit, isn't it?

David



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence

2010-08-20 Thread Tom Lane
Tatsuo Ishii is...@sraoss.co.jp writes:
 We generally assume that in server-safe encodings, the ctype.h functions
 will behave sanely on any single-byte value.

 I think this wisedom is only true for C locale.  I'm not surprised
 all that it does not work with non C locales.

 From array_funcs.c:

   while (isspace((unsigned char) *p))
   p++;

 IMO this should be something like:

   while (isspace((unsigned char) *p))
   p += pg_mblen(p);

I don't think that's likely to help at all.  The risk is that isspace
will do something not-sane with a fragment of a character.  If it's not
coded to guard against that, it's just as likely to give wrong results
for the leading byte as for non-leading bytes.  (In the case at hand,
I think the underlying problem is that it imagines what it's given is
a Unicode code point, not a byte of a UTF8 string.  There apparently
aren't any code points in the range U+00C0 - U+00FF for which isspace
is true, but that's not true for isalpha for example.)

If we were going to try to code around this, we'd need to change all
these loops to look something like

while ((isascii((unsigned char) *p) ||
pg_database_encoding_max_length() == 1) 
   isspace((unsigned char) *p))
p += pg_mblen(p);   // or p++, it wouldn't matter

However, given the limited number of platforms where this is an issue
and the fact that it is an acknowledged bug on those platforms,
I'm not eager to go there.

In any case, no matter whether we changed that or not, we'd still have
the problem that it's a bad idea to have any locale-dependent behavior
in array_in; and the behavior *would* still be locale-dependent, at
least in single-byte encodings.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence

2010-08-20 Thread Tom Lane
Steven Schlansker ste...@trumpet.io writes:
 On Aug 19, 2010, at 3:24 PM, Tom Lane wrote:
 We generally assume that in server-safe encodings, the ctype.h functions
 will behave sanely on any single-byte value.  You can argue the wisdom
 of that, but deciding to change that policy would be a rather massive
 code change; I'm not excited about going that direction.

 Fair enough.  I presume there are no server-safe encodings for which
 a multibyte sequence 0x XX20 would be valid - which would break anyway
 (as the second byte looks like a real space)

Right: our definition of a server-safe encoding is precisely that no
byte of a multibyte character looks like ASCII, ie all bytes have their
high bit set.  We're essentially assuming that the ctype.h functions
will all return false for any byte with the high bit set, if the
selected encoding is multibyte.

 Anyway, it looks like this is actually a BSD bug which got copy +
 pasted into Apple's Darwin source -
 http://lists.freebsd.org/pipermail/freebsd-i18n/2007-September/000157.html

Interesting.  So the BSD people did fix it upstream?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread Stephen Frost
* David E. Wheeler (da...@kineticode.com) wrote:
 On Aug 20, 2010, at 12:21 PM, Devrim GÜNDÜZ wrote:
 
  +1 for Tom's post.
  
  20.Ağu.2010 tarihinde 21:40 saatinde, Tom Lane t...@sss.pgh.pa.us şunları 
  yazdı:
  
  .0 is for releases, not betas.  I see no need for an extra number in
  beta versions.
 
 Yes, well, it's still implicit, isn't it?

It's still useless garbage..  Sorry, I'm w/ Tom on this one.

THanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Deadlock bug

2010-08-20 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Hmmm.  It seems to me that we'd need a sharelock on the referenced row
 both times.

No, we don't.  The first update knows that it's updating a pre-existing
referencing row and not changing the FK value.  If someone were to try
to delete the referenced row, they would see the original version of the
referencing row as good and hence fail their FK deletion check.

The case where we need a sharelock is for insertion of a new referencing
row.  It's to prevent the race condition where somebody deletes the
referenced row and thinks it's OK because he doesn't see the new
referencing row yet.

In principle we don't need to sharelock the referencing row in either
update in this example, since the original row version is still there.
The problem is to know that, given the limited amount of information
available when performing the second update.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread Kevin Grittner
David E. Wheeler da...@kineticode.com wrote:
 
 .0 is for releases, not betas.  I see no need for an extra
 number in beta versions.
 
 Yes, well, it's still implicit, isn't it?
 
Not the way I read it.  If we had a development cycle which resulted
in 8.4.5beta4, then you would have a point.  We don't.
 
Now, if you wanted to argue that it would be better to use 9.0.beta4
than 9.0beta4, that might be defensible.  I think I like that
better; but I'm not inclined to think the difference is worth the
pain of changing an established convention.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Glue] Deadlock bug

2010-08-20 Thread Greg Stark
On Fri, Aug 20, 2010 at 7:57 PM, Joel Jacobson j...@gluefinance.com wrote:
 OK. Thanks for the explanation. It's probably the case in general, but in
 all of my tests (10), process 2 always aborts. I don't think it is
 arbitrary in this example, or could it be?

Well, note the part where I said if it does other stuff first. It's
arbitrary in that it depends on the timing in ways that aren't
obvious. If you're doing the same thing every time you'll trigger the
same arbitrary behavious.
-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread Josh Berkus

 Yes, well, it's still implicit, isn't it?

But the last .0 in 9.0.0 is the patch level, effectively.  This makes
that .0 inappropriate for betas; the beta number is the patch level,
i.e. 9.0.beta4.  It doesn't make any sense to have a 9.0.0beta4, since
we're never going to have a 9.0.2beta4.

The betas are pre-.0.  Maybe we should have 9.0.(-3) instead. Or 8.9.97?
 ;-)

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Version Numbering

2010-08-20 Thread Devrim GÜNDÜZ


20.Ağu.2010 tarihinde 23:03 saatinde, Josh Berkus j...@agliodbs.com  
şunları yazdı:


The betas are pre-.0.  Maybe we should have 9.0.(-3) instead. Or  
8.9.97?

;-)


This is pretty much what Fedora does actually :-)

--
Devrim GÜNDÜZ
PostgreSQL DBA @ Akinon/Markafoni, Red Hat Certified Engineer
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >