[HACKERS] Re: [COMMITTERS] pgsql: Add ERROR msg for GLOBAL/LOCAL TEMP is not yet implemented

2012-06-10 Thread Simon Riggs
On 9 June 2012 17:19, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 9 June 2012 16:46, Tom Lane t...@sss.pgh.pa.us wrote:
 I don't believe there was actual consensus for this change,

 It was hardly a subject of marked disagreement.

 It was hardly a subject of discussion, as yet.

 Personally I'm pretty doubtful about suddenly starting to throw errors
 for syntax we've accepted without complaint for over nine years, on
 merely the grounds that we *might* get around to making it do something
 different in the future.  You yourself have complained loudly about
 compatibility breaks that were considerably better founded than this.

 Possibly a NOTICE or WARNING (with some other text than this) would be
 a better choice for warning people that a compatibility break might
 be coming.

OK, I will revert pending further discussion and agreement.

The reason for action was simply to close an unobtrusive open item,
but it's clear it wasn't.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Visual Studio 2012 RC

2012-06-10 Thread Dave Page
On Sunday, June 10, 2012, Brar Piening wrote:

 Magnus Hagander wrote:

 I don't have too much hope for them actually changing it - they seem
 hell-bent on forcing everybody into metro, and this seems to be yet another
 way to do that. But we can always hope...


 Looks like they've learnt their lesson...

 http://blogs.msdn.com/b/visualstudio/archive/2012/06/08/visual-studio-express-2012-for-windows-desktop.aspx


Yeah, though what I didn't realise was that 2012 won't target XP (and
2k3?). Urgh.


-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-10 Thread Kohei KaiGai
Sorry for my late reply.

2012/6/6 Florian Pflug f...@phlo.org:
 On Jun5, 2012, at 22:33 , Kohei KaiGai wrote:
 2012/6/5 Florian Pflug f...@phlo.org:
 I can live with any behaviour, as long as it doesn't depend on details
 of the query plan. My vote would be for always using the role which was
 active at statement creation time (i.e. at PREPARE/DECLARE time) for
 RLS purposes though. That way, we could treat the role as being IMMUTABLE
 I think.

 I have the same opinion on the point that the active role should be
 consistent, but differ a bit in the details. I think it should be bound
 at the beginning of execution, not at plan creation time, just as the
 existing permission checks are applied in InitPlan() or its callees.

 From a consistency POV, I'd agree. But binding the active role after
 planning means giving up a lot of query optimization opportunities. It
 will prevent the role from participating in constant folding, which
 translates to potentially enormous loss of performance. Take the following
 policy function as an example

  (role = 'some_role' AND expensive_function(row)) OR
  (role = 'other_role' AND row.field = 'some_value')

 and assume that there's an index on 'field'. Now, if role is known to be
 'other_role', the planner can eliminate the expensive_function() from the
 plan, and satisfy row.field = 'some_value' with an index scan. Without
 that knowledge, you'll probably get a sequential scan.

 I do value consistency, but in this case the benefit of not being
 consistent with other privilege checks outweighs the drawbacks I think.
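Florian's constant-folding point can be illustrated with a toy standalone C model (purely hypothetical code; the names just mirror the example policy above, this is not PostgreSQL planner code): when the role is known at plan time, the expensive branch disappears entirely.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for an expensive policy predicate */
static int expensive_calls = 0;
static bool expensive_function(int row) { expensive_calls++; return row % 7 == 0; }

/* Policy with the role bound only at execution time: both branches
 * stay in the "plan", so expensive_function() may have to run. */
static bool policy_runtime(const char *role, int row_field, int row)
{
    return (strcmp(role, "some_role") == 0 && expensive_function(row)) ||
           (strcmp(role, "other_role") == 0 && row_field == 42);
}

/* The same policy after constant folding with role = 'other_role'
 * known at plan time: only the cheap, index-friendly test remains. */
static bool policy_folded_other_role(int row_field)
{
    return row_field == 42;
}
```

The folded form is exactly the residual WHERE clause an optimizer could satisfy with an index scan on 'field'.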

I don't deny the possibility of special optimizations using information
determined at the planning stage, in the near future. However, I doubt
that the first version of this patch should include all the desirable
features at once.

 It seems to me the cause of the problem is that current_user operates
 outside of the snapshot controls. According to the definition of a stable
 function, its result should not change during a particular scan.

 Yeah, well, ISTM that our definition of STABLE doesn't really take functions
 whose return value depends on GUCs into account.

 Probably, has_XXX_privilege() should work with a reliable data source that
 is safe from user-id switching. But it is tough work to make the whole of
 the GUC stuff snapshot-aware…

 We don't need that for every GUC, though. I'd say its sufficient to somehow
 provide the policy function with a stable value of the active role (whatever
 active ends up meaning). If we do that, we can simply document that making
 the policy function depend on other GUCs is a seriously bad idea.

 We *do* need that stable active role value though, since we cannot very
 well advise against using current_role in the policy function without
 providing an alternative…

I have no preference about the implementation details of this. As long as
we have an actually stable user-id by the v9.3 release, it is sufficient
for me.

 To reiterate, my point is that we won't get away with zero planner changes
 even without RLSBYPASS. In fact, we'll be paying *most* of the complexity
 costs of RLSBYPASS whether we actually add that feature or not. Which makes
 not adding it look like a pretty bad deal to me.

 It seems to me 95% of our opinion is common, except for a few detailed
 points. I don't dislike the idea of an RLSBYPASS permission, but I'd like
 to investigate this idea with more time, before v9.3 is closed.
 At least, nobody opposed allowing superusers to bypass RLS policy.
 So, that can be a minimal starting point, can't it? The RLS patch will
 take hundreds of lines due to syntax support and query-rewriting stuff.
 It is always a good idea to focus on the core functionality at the
 starting point.

 Absolutely. But at the same time, it's important to check that the design
 allows the additional features to be added later easily. In the case of
 RLS, I'm worried that by decreeing that it's the role at execution time,
 not planning time, that counts, we're painting ourselves into a corner.
 I view RLSBYPASS as a good sanity check for that. Another one is the
 GRANT-with-filter-function idea. Both seem to fall into place quite
 naturally if handled while planning - you simply have to decide which
 additional WHERE clause to add, if any. Without any negative performance
 effects from policies which don't apply to the current role, if the
 optimizer does its job. Getting the same level of performance when
 policies are added unconditionally seems quite impossible. You might be
 able to make up for some losses by caching STABLE values (which amounts
 to another constant-folding pass), but you won't ever be able to make up
 for a lost index-scan opportunity or other optimizations which change the
 structure of the whole query plan.

If we have a mechanism to check permissions at the planning stage, it should
behave as if it worked at the executor stage, as the existing permission
mechanism does. Thus, there is no inconsistency even if 

Re: [HACKERS] Visual Studio 2012 RC

2012-06-10 Thread Magnus Hagander
On Sun, Jun 10, 2012 at 3:16 AM, Brar Piening b...@gmx.de wrote:
 Magnus Hagander wrote:

 I don't have too much hope for them actually changing it - they seem
 hell-bent on forcing everybody into metro, and this seems to be yet another
 way to do that. But we can always hope...


 Looks like they've learnt their lesson...

 http://blogs.msdn.com/b/visualstudio/archive/2012/06/08/visual-studio-express-2012-for-windows-desktop.aspx

Yeah. Didn't expect that to happen, but happy to see that it did! :-)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)

2012-06-10 Thread Kohei KaiGai
2012/6/8 Simon Riggs si...@2ndquadrant.com:
 On 25 April 2012 10:40, Kohei KaiGai kai...@kaigai.gr.jp wrote:

 I tried to implement a patch according to the idea. It allows extensions
 to register an entry point of the self-managed daemon processes,
 then postmaster start and stop them according to the normal manner.

 The patch needs much work yet, but has many good ideas.

 There doesn't seem to be a place where we pass the parameter to say
 which one of the multiple daemons a particular process should become.
 It would be helpful for testing to make the example module call 2
 daemons each with slightly different characteristics or parameters, so
 we can test the full function of the patch.

This patch intended that a daemon be registered multiple times under
different names such as auth-counter-1 or auth-counter-2.
But I agree with the suggestion: taking a parameter to identify each
daemon makes the interface better than the original one.

 I think its essential that we allow these processes to execute SQL, so
 we must correctly initialise them as backends and set up signalling.
 Which also means we need a parameter to limit the number of such
 processes.

It should be controllable with a flag to RegisterExtraDaemon().
It helps to reduce code duplication in the case when extra daemons
execute SQL, but some other use cases may not need SQL execution.

 Also, I prefer to call these bgworker processes, which is more similar
 to auto vacuum worker and bgwriter naming. That also gives a clue as
 to how to set up signalling etc..

 I don't think we should allow these processes to override sighup and
 sigterm. Signal handling should be pretty standard, just as it is with
 normal backends.

Hmm. CHECK_FOR_INTERRUPTS() might be sufficient to handle
signaling behavior according to the standard.
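The "standard signal handling" being discussed usually boils down to handlers that only set a flag, which the worker's main loop checks at safe points - the moral equivalent of CHECK_FOR_INTERRUPTS(). A hedged standalone sketch (the names are invented for illustration, not the patch's actual API):

```c
#include <signal.h>
#include <stdbool.h>

/* Handler only sets a flag; all real work happens in the main loop. */
static volatile sig_atomic_t got_sigterm = false;

static void
worker_sigterm_handler(int signo)
{
    (void) signo;
    got_sigterm = true;
}

/* Run one unit of work per iteration until asked to exit;
 * returns the number of iterations completed. */
static int
worker_main_loop(int max_iterations)
{
    int done = 0;

    while (!got_sigterm && done < max_iterations)
    {
        /* ... do one chunk of the daemon's work here ... */
        done++;
    }
    return done;
}
```

A real bgworker would install the handler with pqsignal()/sigaction() at startup and also wake on a latch rather than busy-looping.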

 I have a prototype that has some of these characteristics, so I see
 our work as complementary.

 At present, I don't think this patch would be committable in CF1, but
 I'd like to make faster progress with it than that. Do you want to
 work on this more, or would you like me to merge our prototypes into a
 more likely candidate?

I'm not in favor of duplicating similar efforts. If possible, could you
merge some ideas from my patch into your prototype?

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp



Re: [HACKERS] incorrect handling of the timeout in pg_receivexlog

2012-06-10 Thread Magnus Hagander
On Wed, Jun 6, 2012 at 8:10 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jun 5, 2012 at 11:42 PM, Magnus Hagander mag...@hagander.net wrote:
 Works for me. We still need a (reworked) patch, though, right? We just
 move where the conversion between seconds and milliseconds happens?

 Attached is the updated version of the patch.

Thanks.


 I definitely don't think we need subsecond granularity in the user
 facing number. Even a second is pretty short.

 Yep.

 (We do need to retain the ability to set it to 0 = off of course).

 Yep, a value of zero disables the status updates, and the patch adds
 that explanation into the document of pg_basebackup and pg_receivexlog.


Applied, with some small modifications. For example, you don't need a
frontend-specific copy of #define's that are in the backend, since
those don't require linking to the backend, just the #include.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] pg_receivexlog and feedback message

2012-06-10 Thread Magnus Hagander
On Thu, Jun 7, 2012 at 12:40 PM, Magnus Hagander mag...@hagander.net wrote:
 On Thursday, June 7, 2012, Fujii Masao wrote:

 On Thu, Jun 7, 2012 at 6:25 PM, Magnus Hagander mag...@hagander.net
 wrote:
  On Thursday, June 7, 2012, Fujii Masao wrote:
 
  On Thu, Jun 7, 2012 at 5:05 AM, Magnus Hagander mag...@hagander.net
  wrote:
   On Wed, Jun 6, 2012 at 8:26 PM, Fujii Masao masao.fu...@gmail.com
   wrote:
   On Tue, Jun 5, 2012 at 11:44 PM, Magnus Hagander
   mag...@hagander.net
   wrote:
   On Tue, Jun 5, 2012 at 4:42 PM, Fujii Masao masao.fu...@gmail.com
   wrote:
   On Tue, Jun 5, 2012 at 9:53 PM, Magnus Hagander
   mag...@hagander.net
   wrote:
   Right now, pg_receivexlog sets:
                           replymsg->write = InvalidXLogRecPtr;
                           replymsg->flush = InvalidXLogRecPtr;
                           replymsg->apply = InvalidXLogRecPtr;
  
    when it sends its status updates.
  
    I'm thinking it should set replymsg->write = blockpos instead.
  
   Why? That way you can see in pg_stat_replication what has
   actually
   been received by pg_receivexlog - not just what we last sent.
   This
   can
   be useful in combination with an archive_command that can block
   WAL
   recycling until it has been saved to the standby. And it would be
   useful as a general monitoring thing as well.
  
    I think the original reason was that it shouldn't interfere with
    synchronous replication - but it does take away a fairly useful
    use case...
  
    I think that not only replymsg->write but also ->flush should be set
    to blockpos in pg_receivexlog. Which allows pg_receivexlog to behave
    as a synchronous standby, so we can write WAL to both local and
    remote synchronously. I believe there are some use cases for
    synchronous pg_receivexlog.
  
    pg_receivexlog doesn't currently fsync() after every write. It only
    fsync():s complete files. So we'd need to set ->flush only at the
    end of a segment, right?
  
   Yes.
  
    Currently the status update is sent at each status interval. In sync
    replication, a transaction has to wait for a while even after
    pg_receivexlog has written or flushed the WAL data.
  
   So we should add new option which specifies whether pg_receivexlog
   sends the status packet back as soon as it writes or flushes the WAL
   data, like the walreceiver does?
  
   That might be useful, but I think that's 9.3 material at this point.
 
  Fair enough. That's new feature rather than a bugfix.
 
   But I think we can get setting the write location in as a bugfix.
 
  Also set the flush location? Sending the flush location back seems
  helpful when using pg_receivexlog for WAL archiving purpose. By
  seeing the flush location we can ensure that WAL file has been archived
  durably (IOW, WAL file has been flushed in remote archive area).
 
 
  You can do that with the write location as well, as long as you round it
 You mean to prevent pg_receivexlog from sending back the end of WAL file
 as the write location *before* it completes the WAL file? If so, yes. But
 why do you want to keep the flush location invalid?


 No. pg_receivexlog sends back the correct write location. Whoever does the
 check (through pg_stat_replication) rounds down, so it only counts it once
 pg_receivexlog has acknowledged receiving the whole file.

 I'm not against doing the flush location as well, I'm just worried about
 feature-creep :-) But let's see how big a change that would turn out to
 be...

How about this?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


receivelog_flushpos.patch
Description: Binary data



Re: [HACKERS] New Postgres committer: Kevin Grittner

2012-06-10 Thread Magnus Hagander
On Fri, Jun 8, 2012 at 1:04 AM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 7 June 2012 23:40, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 7 June 2012 23:15, Tom Lane t...@sss.pgh.pa.us wrote:
 Please join me in welcoming him aboard.

 Congratulations, Kevin.

 Idle thought for the web team: Now might be a good time to take down
 the blurb on .org in which Kevin describes the mailing list support as
 amazing.  :-)

Should we, though?

We have quotes from e.g. Simon in the list as well - just because he's
a committer doesn't make it any less valid, does it?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



[HACKERS] Boyer-Moore, no less

2012-06-10 Thread Erik Rijkers
Searching for Boyer-Moore in the manual, one only finds the following line
(in the 8.4 release notes):

Improve the performance of text_position() and related functions by using
Boyer-Moore-Horspool searching (David Rowley)

I have two questions:

1. Where in postgres is the Boyer-Moore algorithm used?  (What happened to 
text_position() anyway?)

2. Shouldn't there be, for 'advocacy' reasons, a mention of this well-known
algorithm in the manual proper?  (I can make a doc-patch if needed)


Thanks,


Erik Rijkers




Re: [HACKERS] pg_basebackup --xlog compatibility break

2012-06-10 Thread Magnus Hagander
On Mon, Jun 4, 2012 at 2:14 PM, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2012-05-29 at 22:31 +0200, Magnus Hagander wrote:
 Yeah, good arguments all around, I agree too :-) Next question is -
 suggestions for naming of said parameter?

 --xlog-method=something?  And/or -Xsomething, which would automatically
 enable -x?

How's this?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


basebackup_xlog_param.patch
Description: Binary data



Re: [HACKERS] Boyer-Moore, no less

2012-06-10 Thread Heikki Linnakangas

On 10.06.2012 14:28, Erik Rijkers wrote:

Searching for Boyer-Moore in the manual, one only finds the following line
(in the 8.4 release notes):

Improve the performance of text_position() and related functions by using
Boyer-Moore-Horspool searching (David Rowley)

I have two questions:

1. Where in postgres is the Boyer-Moore algorithm used?  (What happened to 
text_position() anyway?)


text_position() is in src/backend/utils/adt/varlena.c. It was split into 
a few subroutines, which together implement the Boyer-Moore algorithm. 
The internal text_position() function backs the SQL-style 
POSITION(substring IN string) syntax. In hindsight, the release notes 
should've spelled out POSITION(substring IN string), rather than just 
assuming that people know that text_position() refers to that.
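For readers curious what the algorithm actually does, the Horspool variant can be sketched in a few lines of standalone C (a hypothetical illustration, not the actual varlena.c code): build a skip table from the needle, then slide a window across the haystack, jumping ahead on a mismatch by the skip value of the window's last character.

```c
#include <stddef.h>
#include <string.h>

/* Return index of first occurrence of needle in haystack, or -1.
 * Boyer-Moore-Horspool: on a mismatch, shift by how far the last
 * character of the current window sits from the needle's end. */
static int
bmh_search(const char *haystack, size_t hlen,
           const char *needle, size_t nlen)
{
    size_t skip[256];
    size_t i, pos;

    if (nlen == 0)
        return 0;
    if (hlen < nlen)
        return -1;

    /* Default shift: the whole needle length */
    for (i = 0; i < 256; i++)
        skip[i] = nlen;
    /* Characters occurring in the needle (except the last) shift less */
    for (i = 0; i + 1 < nlen; i++)
        skip[(unsigned char) needle[i]] = nlen - 1 - i;

    pos = 0;
    while (pos + nlen <= hlen)
    {
        if (memcmp(haystack + pos, needle, nlen) == 0)
            return (int) pos;
        pos += skip[(unsigned char) haystack[pos + nlen - 1]];
    }
    return -1;
}
```

The payoff is that long stretches of the haystack are skipped without being examined at all, which is why it beats the naive scan for longer needles.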



2. Shouldn't there be, for 'advocacy' reasons, a mention of this well-known 
algorithm in the
manual proper?  (I can make a doc-patch if needed)


Doesn't seem necessary to me. I don't see any natural place to put such 
a notice in the docs. I think people just rightly assume that the 
built-in functions are implemented efficiently, and there's no need to go
into details.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Ability to listen on two unix sockets

2012-06-10 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On lör, 2012-06-09 at 18:26 -0400, Tom Lane wrote:
 That's not actually quite the same thing as what I suggest above.
 Currently, unix_socket_directory *overrides* the compiled-in choice.
 I'm suggesting that it would be better to invent a list that is *added
 to* the compiled-in choice.  If we think it would be best to still be
 able to override that, then I'd vote for keeping unix_socket_directory
 as is, and then adding a list named something like
 secondary_socket_directories.   But if we just turn
 unix_socket_directory into a list, I think the lack of separation
 between primary and secondary directories will be confusing.

 By that logic, any list-valued parameter should be split into a primary
 and secondary setting.

Well, no: the key point here is that there will be one directory that is
special because it's the one baked into libpq.  I agree that for the
purposes of the backend in isolation, we might as well just have a list.
What's less clear is whether, when considering the backend+client
ecosystem as a whole, the special status of the configure-time socket
directory ought to be reflected in the way we set up the GUCs.  I have
to admit that I'm not totally sold on either approach.

regards, tom lane



Re: [HACKERS] pg_basebackup --xlog compatibility break

2012-06-10 Thread Fujii Masao
On Sun, Jun 10, 2012 at 8:43 PM, Magnus Hagander mag...@hagander.net wrote:
 On Mon, Jun 4, 2012 at 2:14 PM, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2012-05-29 at 22:31 +0200, Magnus Hagander wrote:
 Yeah, good arguments all around, I agree too :-) Next question is -
 suggestions for naming of said parameter?

 --xlog-method=something?  And/or -Xsomething, which would automatically
 enable -x?

 How's this?

Looks good to me.

Regards,

-- 
Fujii Masao



[HACKERS] Something weird happening in the buildfarm

2012-06-10 Thread Tom Lane
Although HEAD builds seem to still be happy, all back branches have
been failing with git errors for the last six hours or so.  Who broke
what?

regards, tom lane



Re: [HACKERS] [pgsql-www] Something weird happening in the buildfarm

2012-06-10 Thread Magnus Hagander
On Sun, Jun 10, 2012 at 3:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Although HEAD builds seem to still be happy, all back branches have
 been failing with git errors for the last six hours or so.  Who broke
 what?

We had a server failure on the box that runs git.postgresql.org, so it
was unavailable for a few hours (I think a bit over 3 hours). It has
been back up for a while now, though, so they should be going green
now... I think the reason it's showing red on so many back branches
still is that they just don't build as often. The last tries were almost
5 hours ago, which is around the time it came back up.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Ability to listen on two unix sockets

2012-06-10 Thread Robert Haas
On Sun, Jun 10, 2012 at 8:36 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Eisentraut pete...@gmx.net writes:
 On lör, 2012-06-09 at 18:26 -0400, Tom Lane wrote:
 That's not actually quite the same thing as what I suggest above.
 Currently, unix_socket_directory *overrides* the compiled-in choice.
 I'm suggesting that it would be better to invent a list that is *added
 to* the compiled-in choice.  If we think it would be best to still be
 able to override that, then I'd vote for keeping unix_socket_directory
 as is, and then adding a list named something like
 secondary_socket_directories.   But if we just turn
 unix_socket_directory into a list, I think the lack of separation
 between primary and secondary directories will be confusing.

 By that logic, any list-valued parameter should be split into a primary
 and secondary setting.

 Well, no: the key point here is that there will be one directory that is
 special because it's the one baked into libpq.  I agree that for the
 purposes of the backend in isolation, we might as well just have a list.
 What's less clear is whether, when considering the backend+client
 ecosystem as a whole, the special status of the configure-time socket
 directory ought to be reflected in the way we set up the GUCs.  I have
 to admit that I'm not totally sold on either approach.

I think we should consider this in the context of allowing both
additional UNIX sockets and additional TCP ports.  In the case of TCP
ports, it's clearly no good to turn port into a list, because one
port number has to be primary, since it goes into the PID file and
also affects the naming of any UNIX sockets created.  If we add
secondary_socket_dirs, I think we will also need secondary_ports.  One
idea might be to have one new GUC that serves both purposes:

additional_sockets = '12345, /foo'

I'm sure there are other ways to skin the cat as well.
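One way to make such a combined GUC parseable is to classify each comma-separated entry syntactically: all digits means a TCP port, a leading '/' means a Unix socket directory. A speculative standalone sketch of that classification (additional_sockets is a proposal in this thread, not an actual PostgreSQL parameter):

```c
#include <ctype.h>
#include <stdbool.h>

/* Classify one entry of a hypothetical additional_sockets list.
 * All-digits => TCP port; leading '/' => Unix socket directory.
 * Returns true and sets *is_port on success, false on a malformed entry. */
static bool
classify_socket_entry(const char *entry, bool *is_port)
{
    /* skip leading whitespace left over from "12345, /foo" style lists */
    while (*entry && isspace((unsigned char) *entry))
        entry++;
    if (*entry == '\0')
        return false;
    if (*entry == '/')
    {
        *is_port = false;   /* socket directory */
        return true;
    }
    for (const char *p = entry; *p; p++)
        if (!isdigit((unsigned char) *p))
            return false;   /* neither a path nor a pure number */
    *is_port = true;        /* TCP port */
    return true;
}
```

The ambiguity cost is low because a port can never start with '/' and a directory must, so the two namespaces don't collide.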

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] log_newpage header comment

2012-06-10 Thread Robert Haas
On Sat, Jun 9, 2012 at 1:43 AM, Amit kapila amit.kap...@huawei.com wrote:
On further review, I think that we ought to make MarkBufferDirty() the
caller's job, because sometimes we may need to xlog only if
XLogIsNeeded(), but the buffer's got to get marked dirty either way.

 In the case where XLOG is not required, wouldn't it fsync the data?
 So in that case even MarkBufferDirty() would also not be required.

Uh... no.  The whole point of doing things in shared buffers is that
you don't have to write and fsync the buffers immediately.  Instead,
buffer eviction handles that stuff for you.

So I think the new function should just do step 5 - emit XLOG and set
LSN/TLI.

 In the API log_newpage_buffer(), since the buffer already contains the page
 to be logged, can't it be assumed that the page will be initialized, with
 no need to check PageIsNew before setting the LSN/TLI?

I don't see why it's any different from log_newpage() in that regard.
That data is initialized before being written, as well, but someone
contemplated the possible need to write a page of all zeros.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] log_newpage header comment

2012-06-10 Thread Robert Haas
On Sat, Jun 9, 2012 at 10:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Whee, testing is fun.  Second try.

 I'm concerned by the fact that neither the original nor the new code
 bother to test whether the relation is WAL-loggable.  It may be that
 ginbuildempty cannot be invoked for temp tables, but it still seems
 like an oversight waiting to bite you.  I'd be happier if there were
 a RelationNeedsWAL test here.  Come to think of it, the other
 foobuildempty functions aren't checking this either.

That's a good thing, because if they were, it would be wrong.
RelationNeedsWAL() tells you whether the main fork is WAL-logged, but
the buildempty routines exist for the purpose of constructing the init
fork, which should always be WAL-logged.

 A related point is that most of the other existing calls to log_newpage
 are covered by if (XLogIsNeeded()) tests.  It's not immediately
 obvious why these two shouldn't be.  After some reflection I think
 that's correct, but probably the comments for log_newpage and
 log_newpage_buffer need to explain the different WAL-is-needed tests
 that apply to the two usages.

 (I'm also thinking that the XLogIsNeeded macro is very poorly named,
 as the situations it should be used in are far more narrow than the
 macro name suggests.  Haven't consumed enough caffeine to think of
 a better name, though.)

XLogIsNeeded() is, indeed, not a very good name: it's telling us
whether wal_level > minimal, and thus whether there might be a
standby.  When log_newpage() is used to rebuild a relation, we use WAL
bypass whenever possible, since the whole relation will be removed if
the transaction rolls back, but we must still log it if a standby
exists.  In contrast, the init forks need to make it to the standby
regardless.  I'm not sure that I agree that it's the job of
log_newpage() to explain to people calling it on what things they must
conditionalize WAL generation: after all, none of this is unique to
new-page records.  If we emit some other record while creating an
index on the primary, we can skip that when !XLogIsNeeded(), too.  Any
operation we perform on any relation fork other than the init fork can
be conditionalized on RelationNeedsWAL(), not just new-page records.
The charter of the function is just to avoid duplicating the
mumbo-jumbo that's required to emit the record.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_receivexlog and feedback message

2012-06-10 Thread Fujii Masao
On Sun, Jun 10, 2012 at 7:55 PM, Magnus Hagander mag...@hagander.net wrote:
 How about this?

+   /*
+    * Set flushed position to the last byte in the previous
+    * file. Per above we know that xrecoff%XLOG_SEG_SIZE=0
+    */
+   flushedpos = blockpos;
+   if (flushedpos.xrecoff == 0)
+   {
+       flushedpos.xlogid--;
+       flushedpos.xrecoff = XLogFileSize-1;
+   }
+   else
+       flushedpos.xrecoff--;

flushedpos.xrecoff doesn't need to be decremented by one.
If xrecoff % XLOG_SEG_SIZE == 0, the position should be the last
byte of the previous (i.e., flushed) WAL file.
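The wraparound arithmetic the hunk performs can be isolated in a standalone sketch (illustration only: RecPtr mimics the 9.2-era two-part XLogRecPtr, and FILE_SIZE is a stand-in constant, not the real XLogFileSize):

```c
#include <stdint.h>

/* 9.2-era style two-part WAL position (illustration only) */
typedef struct
{
    uint32_t xlogid;    /* log file number */
    uint32_t xrecoff;   /* byte offset within the log file */
} RecPtr;

/* Assumed bytes per logical log file; the real XLogFileSize is
 * derived from XLOG_SEG_SIZE and segments-per-file. */
#define FILE_SIZE 0xFF000000u

/* Step one byte backwards, borrowing from xlogid when xrecoff is 0 --
 * the same wraparound the patch hunk performs for flushedpos. */
static RecPtr
prev_byte(RecPtr pos)
{
    if (pos.xrecoff == 0)
    {
        pos.xlogid--;
        pos.xrecoff = FILE_SIZE - 1;
    }
    else
        pos.xrecoff--;
    return pos;
}
```

Fujii's review point is about whether the extra decrement is needed at all when xrecoff lands exactly on a segment boundary, not about the borrow logic itself.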

Regards,

-- 
Fujii Masao



Re: [HACKERS] Re: [COMMITTERS] pgsql: Add ERROR msg for GLOBAL/LOCAL TEMP is not yet implemented

2012-06-10 Thread Kevin Grittner
 Tom Lane  wrote:
 Kevin Grittner  writes:
 Because the current support for temporary tables is relatively
 similar to the standard's description of LOCAL TEMPORARY TABLES,
 but nothing at all like the standard's description of GLOBAL
 TEMPORARY TABLES.
 
 Um ... did you read the spec before stating that?
 
Well, I did, but not recently.  Clearly I should not have trusted my
memory.
 
A fresh review brought it all back to me.  When I was reviewing the
standard (in 1998) to decide what to implement for the SQL parser in
the Wisconsin Courts framework I was designing, I decided to
effectively consider any compound statement block (delimited by
BEGIN/END) in a trigger or stored procedure to be equivalent to a
module in terms of features such as temporary tables -- because we
didn't have a concept of modules and wanted to borrow features from
the standard which were defined in terms of modules.  So over the
years I muddled what was in the standard with implementation details
of our framework. I apologize for the confusion.
 
So there are three types of temporary tables defined in the standard,
and the PostgreSQL implementation doesn't look like any of them.  The
bad thing is that PostgreSQL supports syntax for two of them without
matching the standard semantics.  :-(
 
CREATE TEMPORARY TABLE (without LOCAL or GLOBAL) would be an
extension which could not be confused with standard syntax, so
arguably refusing to accept those would be the right thing to do from
a standards compliance perspective -- it would be a legitimate
PostgreSQL extension that way, but the breakage of working code which
would result from suddenly doing that could never be considered
acceptable.
 
What to do about all this?
 
Unless we expect to implement modules in the next release or two,
perhaps we should address the LOCAL noise-word in the docs, with a
note that its usage is non-standard and discouraged since it might
conflict with standard usage in future releases.  Since we've had an
actual patch submitted for implementing GLOBAL temporary tables,
perhaps that merits a run-time warning in 9.2 (also supported by a
warning in the docs).
 
I think we've dug ourselves into a hole by supporting standard syntax
with non-standard semantics.  As the saying goes, when you find
yourself in that position, the first thing to do is to stop digging.
 
-Kevin



Re: [HACKERS] Temporary tables under hot standby

2012-06-10 Thread Noah Misch
On Fri, Jun 08, 2012 at 01:26:20PM -0400, Robert Haas wrote:
 On Sun, Apr 29, 2012 at 4:02 PM, Noah Misch n...@leadboat.com wrote:
  On Tue, Apr 24, 2012 at 11:55:15PM -0400, Noah Misch wrote:
  Concerning everyone's favorite topic, how to name the new type of table, I
  liked Tom's proposal[1] to make CREATE TEMP TABLE retain current behavior 
  and
  have CREATE GLOBAL TEMP TABLE and/or CREATE LOCAL TEMP TABLE request the 
  new
  SQL-standard variety.  (I'd vote for using CREATE GLOBAL and retaining 
  CREATE
  LOCAL for future expansion.)  As he mentions, to get there, we'd ideally 
  start
  by producing a warning instead of silently accepting GLOBAL as a noise 
  word.
  Should we put such a warning into 9.2?
 
  Here is the change I'd make.
 
 This is listed on the open items list.
 
 I haven't ever heard anyone propose to redefine CREATE LOCAL TEMP
 TABLE to mean anything different than CREATE TEMP TABLE, so I'm
 disinclined to warn about that.

From a documentation perspective, it will be awkward to explain (or decline to
explain) that both GLOBAL TEMPORARY and LOCAL TEMPORARY are standard syntaxes
with non-standard behavior, only one of which emits a warning.  That unduly
telegraphs a prediction about which one will change first/ever.  Maybe that's
nonetheless the right pragmatic answer.

 I would be more open to warning people about CREATE GLOBAL TEMP TABLE
 - frankly, it's pretty wonky that we allow that but treat GLOBAL as a
 noise word in this first place.  But I'm a little disinclined to have
 the message speculate about what might happen in future versions of
 PostgreSQL.  Such predictions don't have a very good track record of
 being accurate.

I feel the predictions in question (This may specify different semantics in
future versions of PostgreSQL. and This usage is deprecated and may specify
standard-compliant behavior in the future.) were broad enough to mitigate
this concern.  If we ever do change the interpretation of this syntax, to what
could it be other than the standard behavior?  We're not likely to introduce a
different but still-nonstandard behavior for this standard syntax.

I wrote the verbiage that way for the benefit of users encountering the new
warning.  They might reasonably ask, Why did the PostgreSQL developers create
this work for me?  No objection to removing the errhint, but I think the
documentation wording should stay.


Concerning whether to make this a WARNING or an ERROR, does anyone still
object to WARNING?

Thanks,
nm



Re: [HACKERS] Time for pgindent run?

2012-06-10 Thread Noah Misch
On Wed, Jun 06, 2012 at 09:40:45PM -0400, Robert Haas wrote:
 On Tue, Jun 5, 2012 at 10:25 AM, Bruce Momjian br...@momjian.us wrote:
  On Tue, Jun 05, 2012 at 10:21:14AM -0400, Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   Is everyone ready for me to run pgindent?  We are nearing the first
   commit-fest (June 15) and will have to branch the git tree soon.
 
  Also, we should do the pgindent run well before the commitfest, so that
  authors of pending patches have time to rebase their patches in case
  pgindent changes the code they are patching ...
 
  Ah, good point.  That will affect commit-fest patches.  We could run it
  only on the 9.3 branch, but that makes double-patching very hard.
 
  Is everyone good for a pgindent run this week?
 
 The sooner the better.

+1

Barring a pgindent commit in the next ~2 days, I think we should branch and
run it independently on master, after the 2012-06 CF, and on REL9_2_STABLE.



[HACKERS] Streaming-only Remastering

2012-06-10 Thread Joshua Berkus
So currently we have a major limitation in binary replication, where it is not 
possible to remaster your system (that is, designate the most caught-up 
standby as the new master) based on streaming replication only.  This is a 
major limitation because the requirement to copy physical logs over scp (or 
similar methods), manage and expire them more than doubles the administrative 
overhead of managing replication.  This becomes even more of a problem if 
you're doing cascading replication.

Therefore I think this is a high priority for 9.3.

As far as I can tell, the change required for remastering over streaming is 
relatively small; we just need to add a new record type to the streaming 
protocol, and then start writing the timeline change to that.  Are there other 
steps required which I'm not seeing?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco



Re: [HACKERS] Time for pgindent run?

2012-06-10 Thread Bruce Momjian
On Sun, Jun 10, 2012 at 01:47:10PM -0400, Noah Misch wrote:
 On Wed, Jun 06, 2012 at 09:40:45PM -0400, Robert Haas wrote:
  On Tue, Jun 5, 2012 at 10:25 AM, Bruce Momjian br...@momjian.us wrote:
   On Tue, Jun 05, 2012 at 10:21:14AM -0400, Tom Lane wrote:
   Bruce Momjian br...@momjian.us writes:
Is everyone ready for me to run pgindent?  We are nearing the first
commit-fest (June 15) and will have to branch the git tree soon.
  
   Also, we should do the pgindent run well before the commitfest, so that
   authors of pending patches have time to rebase their patches in case
   pgindent changes the code they are patching ...
  
   Ah, good point.  That will affect commit-fest patches.  We could run it
   only on the 9.3 branch, but that makes double-patching very hard.
  
   Is everyone good for a pgindent run this week?
  
  The sooner the better.
 
 +1
 
 Barring a pgindent commit in the next ~2 days, I think we should branch and
 run it independently on master, after the 2012-06 CF, and on REL9_2_STABLE.

OK, running it now.  Sorry.  I am at a conference and forgot about this.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Time for pgindent run?

2012-06-10 Thread Bruce Momjian
On Sun, Jun 10, 2012 at 02:58:03PM -0400, Bruce Momjian wrote:
 On Sun, Jun 10, 2012 at 01:47:10PM -0400, Noah Misch wrote:
  On Wed, Jun 06, 2012 at 09:40:45PM -0400, Robert Haas wrote:
   On Tue, Jun 5, 2012 at 10:25 AM, Bruce Momjian br...@momjian.us wrote:
On Tue, Jun 05, 2012 at 10:21:14AM -0400, Tom Lane wrote:
Bruce Momjian br...@momjian.us writes:
 Is everyone ready for me to run pgindent?  We are nearing the first
 commit-fest (June 15) and will have to branch the git tree soon.
   
Also, we should do the pgindent run well before the commitfest, so that
authors of pending patches have time to rebase their patches in case
pgindent changes the code they are patching ...
   
Ah, good point.  That will affect commit-fest patches.  We could run it
only on the 9.3 branch, but that makes double-patching very hard.
   
Is everyone good for a pgindent run this week?
   
   The sooner the better.
  
  +1
  
  Barring a pgindent commit in the next ~2 days, I think we should branch and
  run it independently on master, after the 2012-06 CF, and on REL9_2_STABLE.
 
 OK, running it now.  Sorry.  I am at a conference and forgot about this.

Done.  Sorry for the delay.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Streaming-only Remastering

2012-06-10 Thread Rob Wultsch
On Sun, Jun 10, 2012 at 11:47 AM, Joshua Berkus j...@agliodbs.com wrote:
 So currently we have a major limitation in binary replication, where it is 
 not possible to remaster your system (that is, designate the most caught-up 
 standby as the new master) based on streaming replication only.  This is a 
 major limitation because the requirement to copy physical logs over scp (or 
 similar methods), manage and expire them more than doubles the administrative 
 overhead of managing replication.  This becomes even more of a problem if 
 you're doing cascading replication.

 Therefore I think this is a high priority for 9.3.

 As far as I can tell, the change required for remastering over streaming is 
 relatively small; we just need to add a new record type to the streaming 
 protocol, and then start writing the timeline change to that.  Are there 
 other steps required which I'm not seeing?


Problem that may exist and is likely out of scope:
It is possible for a master with multiple slave servers to have slaves
which have not read all of the logs off of the master. It is annoying
to have to rebuild a replica because it was 1kb behind in reading logs
from the master. If the new master could deliver the last bit of the
old master's logs, that would be very nice.

-- 
Rob Wultsch
wult...@gmail.com



[HACKERS] unlink for DROPs after releasing locks (was Re: Should I implement DROP INDEX CONCURRENTLY?)

2012-06-10 Thread Noah Misch
On Wed, Aug 24, 2011 at 03:38:09PM -0400, Tom Lane wrote:
 Merlin Moncure mmonc...@gmail.com writes:
  On Wed, Aug 24, 2011 at 1:24 PM, Daniel Farina dan...@heroku.com wrote:
  At Heroku we use CREATE INDEX CONCURRENTLY with great success, but
  recently when frobbing around some indexes I realized that there is no
  equivalent for DROP INDEX, and this is a similar but lesser problem
  (as CREATE INDEX takes much longer), as DROP INDEX takes an ACCESS
  EXCLUSIVE lock on the parent table while doing the work to unlink
  files, which nominally one would think to be trivial, but I assure you
  it is not at times for even indexes that are a handful of gigabytes
  (let's say ~= a dozen).
 
  Are you sure that you are really waiting on the time to unlink the
  file?  there's other stuff going on in there like waiting for lock,
  plan invalidation, etc.  Point being, maybe the time consuming stuff
  can't really be deferred which would make the proposal moot.
 
 Assuming the issue really is the physical unlinks (which I agree I'd
 like to see some evidence for), I wonder whether the problem could be

I'd believe it.  With a cold cache (sudo sysctl -w vm.drop_caches=3), my
desktop takes 2.6s to rm a 985 MiB file.

 addressed by moving smgrDoPendingDeletes() to after locks are released,
 instead of before, in CommitTransaction/AbortTransaction.  There does
 not seem to be any strong reason why we have to do that before lock
 release, since incoming potential users of a table should not be trying
 to access the old physical storage after that anyway.

Agreed.  We now have $OLD_SUBJECT, but this is a win independently.  I have
reviewed the code that runs between the old and new call sites, and I did not
identify a hazard of moving it as you describe.

nm
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 8f00186..8e4a455 100644
*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
***
*** 1944,1957  CommitTransaction(void)
 */
AtEOXact_Inval(true);
  
-   /*
-* Likewise, dropping of files deleted during the transaction is best done
-* after releasing relcache and buffer pins.  (This is not strictly
-* necessary during commit, since such pins should have been released
-* already, but this ordering is definitely critical during abort.)
-*/
-   smgrDoPendingDeletes(true);
- 
AtEOXact_MultiXact();
  
ResourceOwnerRelease(TopTransactionResourceOwner,
--- 1944,1949 
***
*** 1961,1966  CommitTransaction(void)
--- 1953,1969 
 RESOURCE_RELEASE_AFTER_LOCKS,
 true, true);
  
+   /*
+* Likewise, dropping of files deleted during the transaction is best done
+* after releasing relcache and buffer pins.  (This is not strictly
+* necessary during commit, since such pins should have been released
+* already, but this ordering is definitely critical during abort.)  Since
+* this may take many seconds, also delay until after releasing locks.
+* Other backends will observe the attendant catalog changes and not
+* attempt to access affected files.
+*/
+   smgrDoPendingDeletes(true);
+ 
/* Check we've released all catcache entries */
AtEOXact_CatCache(true);
  
***
*** 2354,2360  AbortTransaction(void)
AtEOXact_Buffers(false);
AtEOXact_RelationCache(false);
AtEOXact_Inval(false);
-   smgrDoPendingDeletes(false);
AtEOXact_MultiXact();
ResourceOwnerRelease(TopTransactionResourceOwner,
 RESOURCE_RELEASE_LOCKS,
--- 2357,2362 
***
*** 2362,2367  AbortTransaction(void)
--- 2364,2370 
ResourceOwnerRelease(TopTransactionResourceOwner,
 
RESOURCE_RELEASE_AFTER_LOCKS,
 false, true);
+   smgrDoPendingDeletes(false);
AtEOXact_CatCache(false);
  
AtEOXact_GUC(false, 1);
***
*** 4238,4250  AbortSubTransaction(void)
	AtEOSubXact_RelationCache(false, s->subTransactionId,
							  s->parent->subTransactionId);
AtEOSubXact_Inval(false);
-   AtSubAbort_smgr();
ResourceOwnerRelease(s->curTransactionOwner,
 RESOURCE_RELEASE_LOCKS,
 false, false);
ResourceOwnerRelease(s->curTransactionOwner,
 
RESOURCE_RELEASE_AFTER_LOCKS,

[HACKERS] Resource Owner reassign Locks

2012-06-10 Thread Jeff Janes
As discussed in several different email threads here and on the
performance list, when using pg_dump on a large number of objects, the
server has quadratic behavior in LockReassignCurrentOwner, where it
has to dig through the entire local lock table to push one or two
locks up from the portal being dropped to its parent.

The attached patch fixes that by remembering up to 10 local locks, and
pushing them up specifically rather than digging through the entire
lock table.  If the list overflows, then it reverts to the old
behavior of digging through the entire local lock table.

The same change was made to the case where the locks are being
released, rather than reassigned (i.e. subtransaction abort rather
than commit).  I have no evidence that digging through the local lock
table during bulk release was ever a bottleneck, but it seemed
inconsistent not to make that change as well.

When it needs to forget a lock, it searches backwards in the list of
remembered locks and then just moves the last lock into the place of the
one to be forgotten.
entire list down to close the gap, rather than using the
selection-sort-like method.  I don't understand why they do that.  If
Forgets are strictly LIFO, then it would not make a difference.  If
they are mostly LIFO but occasionally not, maybe the insertion method
would win over the selection method.  From what I can tell, Locks are
mostly released in bulk anyway at transaction end, and rarely released
explicitly.
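
The mechanism described above can be sketched roughly as follows. All names here are invented for the illustration (they are not the actual resowner.c symbols), and the real patch tracks LOCALLOCK entries rather than bare pointers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Remember up to MAX_RESOWNER_LOCKS locks per resource owner; if the
 * list overflows, mark it invalid so release/reassign falls back to
 * scanning the entire local lock table, as described above.
 */
#define MAX_RESOWNER_LOCKS 10

typedef struct
{
	void	   *locks[MAX_RESOWNER_LOCKS];	/* remembered lock pointers */
	int			nlocks;			/* valid entries, if !overflowed */
	bool		overflowed;
} ResourceOwnerSketch;

static void
remember_lock(ResourceOwnerSketch *owner, void *lock)
{
	if (owner->overflowed)
		return;					/* already reverted to full-table scans */
	if (owner->nlocks < MAX_RESOWNER_LOCKS)
		owner->locks[owner->nlocks++] = lock;
	else
		owner->overflowed = true;
}

/*
 * Forget one lock: search backwards (releases are usually LIFO) and
 * move the last entry into the vacated slot rather than sliding the
 * whole tail down -- the selection-sort-like method the mail mentions.
 */
static void
forget_lock(ResourceOwnerSketch *owner, void *lock)
{
	if (owner->overflowed)
		return;					/* caller must scan the lock table instead */
	for (int i = owner->nlocks - 1; i >= 0; i--)
	{
		if (owner->locks[i] == lock)
		{
			owner->locks[i] = owner->locks[--owner->nlocks];
			return;
		}
	}
}
```

Once the list overflows it stays invalid for the remainder of the resource owner's life, so a single oversized portal degrades gracefully to the old full-scan behavior rather than losing locks.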


This patch reduces the time needed to dump 20,000 simple empty tables
from 3m50.903s to 20.695s, and of course larger gains at larger
numbers.

dropdb test; createdb test
for f in `seq 1 1 400` ; do
  perl -le "print qq{create table foo\$_ (k integer , v integer);}
    foreach ($f..$f+)" | psql test > /dev/null
  time pg_dump test | wc -c
done

For degrading performance in other cases, I think the best test case
is pgbench -P (implemented in another patch in this commitfest)
which has a loop which pushes one or two locks up from a portal to the
parent (which already owns them, due to previous rounds of the same
loop) very frequently.  There might be a performance degradation of
0.5% or so, but it is less than the run to run variation.  I plan to
run some longer test to get a better estimate.  If there is a
degradation in that range, how important is that?

I wanted to test a real worst case, which would be a malicious
ordering of lock releases (take 10 locks, release the first lock
taken, then release the other 9 in reverse order), but with the
absence of UNLOCK TABLE command, I can't figure out how to engineer
that.

Cheers,

Jeff


resowner_lock_v1.patch
Description: Binary data



Re: [HACKERS] Temporary tables under hot standby

2012-06-10 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 On Fri, Jun 08, 2012 at 01:26:20PM -0400, Robert Haas wrote:
 I haven't ever heard anyone propose to redefine CREATE LOCAL TEMP
 TABLE to mean anything different than CREATE TEMP TABLE, so I'm
 disinclined to warn about that.

 From a documentation perspective, it will be awkward to explain (or decline to
 explain) that both GLOBAL TEMPORARY and LOCAL TEMPORARY are standard syntaxes
 with non-standard behavior, only one of which emits a warning.

Yeah.  If we're going to touch this at all, I think we should warn about
both, because they are both being interpreted in a non-standards-compliant
fashion.  It's possible that different message texts would be
appropriate, though.

If we create the infrastructure necessary to make GLOBAL TEMP
standards-compliant, it would not be totally unreasonable (IMO) to make
LOCAL TEMP act like GLOBAL TEMP.  It would still be non-compliant, but
closer than it is today.  Moreover, if you argue that the whole session
is one SQL module, it could actually be seen as compliant, in a subsetty
kind of way.  (Or so I think; but I've not read the relevant parts of
the spec very recently either.)

regards, tom lane



Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2012-06-10 Thread Noah Misch
On Wed, Mar 28, 2012 at 04:25:06PM +0200, Marco Nenciarini wrote:
 Il giorno lun, 19/03/2012 alle 18.41 +0100, Marco Nenciarini ha scritto:
  
  Attached is v5, which should address all the remaining issues.
 
 Please find attached v6 of the EACH Foreign Key patch. From v5 only
 cosmetic changes to the documentation were made.

This has bitrotted; please refresh.

Also, please evaluate Peter's feedback:
http://archives.postgresql.org/message-id/1333693277.32606.9.ca...@vanquo.pezone.net

Thanks,
nm



Re: [HACKERS] Ability to listen on two unix sockets

2012-06-10 Thread Peter Eisentraut
On sön, 2012-06-10 at 09:41 -0400, Robert Haas wrote:
 I think we should consider this in the context of allowing both
 additional UNIX sockets and additional TCP ports.  In the case of TCP
 ports, it's clearly no good to turn port into a list, because one
 port number has to be primary, since it goes into the PID file

Not necessarily.  The port number in the PID file is only used for
pg_ctl to contact the server, and for that it only needs some port, not
the primary one.  Also, we write the first listen_address into the PID
file for the same reason.  So if you think there should be a primary
port, then there should also be a primary listen_addresses.

 and also affects the naming of any UNIX sockets created.

Why would that matter?  If you configure M ports and N Unix socket
locations, you get M*N actual sockets created.

 If we add
 secondary_socket_dirs, I think we will also need secondary_ports.  One
 idea might be to have one new GUC that serves both purposes:
 
 additional_sockets = '12345, /foo'

I was getting around to that, although I don't follow the specific
syntax you have chosen here.

I would like something where you set host and port together, so you can
listen on port 5432 on localhost, and port 5433 on *, for example.  So
maybe

listen_addresses = localhost:5432,*:5433

Web servers usually allow that sort of thing, but if you dig deep there,
the configuration settings and their interactions can get pretty
complicated.  For example, you can usually set a default port and then
override it in the listen_addresses equivalent.





Re: [HACKERS] Ability to listen on two unix sockets

2012-06-10 Thread Robert Haas
On Sun, Jun 10, 2012 at 5:06 PM, Peter Eisentraut pete...@gmx.net wrote:
 On sön, 2012-06-10 at 09:41 -0400, Robert Haas wrote:
 I think we should consider this in the context of allowing both
 additional UNIX sockets and additional TCP ports.  In the case of TCP
 ports, it's clearly no good to turn port into a list, because one
 port number has to be primary, since it goes into the PID file

 Not necessarily.  The port number in the PID file is only used for
 pg_ctl to contact the server, and for that it only needs some port, not
 the primary one.  Also, we write the first listen_address into the PID
 file for the same reason.  So if you think there should be a primary
 port, then there should also be a primary listen_addresses.

Fair enough, as far as this part goes, but...

 and also affects the naming of any UNIX sockets created.

 Why would that matter?  If you configure M ports and N Unix socket
 locations, you get M*N actual sockets created.

...I *seriously* doubt that this is the behavior anyone wants.
Creating M sockets per directory seems patently silly.

 If we add
 secondary_socket_dirs, I think we will also need secondary_ports.  One
 idea might be to have one new GUC that serves both purposes:

 additional_sockets = '12345, /foo'

 I was getting around to that, although I don't follow the specific
 syntax you have chosen here.

I was thinking that each element could be of the form /path or port.
But I guess it should really be /path or host:port.

 I would like something where you set host and port together, so you can
 listen on port 5432 on localhost, and port 5433 on *, for example.  So
 maybe

 listen_addresses = localhost:5432,*:5433

 Web servers usually allow that sort of thing, but if you dig deep there,
 the configuration settings and their interactions can get pretty
 complicated.  For example, you can usually set a default port and then
 override it in the listen_addresses equivalent.

That seems like the obvious syntax, but I'm fuzzy on the details.  We
could let 'port' continue to mean the default port, and then
listen_addresses can contain either unadorned addresses (in which case
we bind to that address using the default port) or address:port
designators (in which case we bind to the given address and port).
But then suppose port = 1234 and listen_addresses = '5.5.5.5:6789'.
Presumably the UNIX socket is still /tmp/.s.PGSQL.1234, but then what
ends up in the lock file?  PostmasterMain's current algorithm for
figuring that out would write 5.5.5.5 for the host and 1234 for the
port, which seems like nonsense.  Such confusion is why I thought we
might fall back on listing all the additional listen locations in a
new, separate GUC.  But perhaps there is a way to make it work.
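
One possible shape for the element classification being debated, as a hedged sketch (not actual postmaster code; the function names are invented). It follows the libpq-style leading-slash convention mentioned above and splits host:port at the last colon; note that a bare numeric element such as "12345" still parses as a host here, which is exactly the kind of ambiguity that makes a single mixed list tricky:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum
{
	LISTEN_UNIX_PATH,			/* element begins with '/' */
	LISTEN_TCP					/* anything else: host[:port] */
} ListenKind;

/* libpq-style convention: a leading slash marks a socket directory. */
static ListenKind
classify_listen_element(const char *elem)
{
	return (elem[0] == '/') ? LISTEN_UNIX_PATH : LISTEN_TCP;
}

/*
 * Split "host:port" at the last colon; fall back to defport when no
 * colon is present (so "localhost" and "*" keep the default port).
 */
static void
split_host_port(const char *elem, char *host, size_t hostlen,
				int *port, int defport)
{
	const char *colon = strrchr(elem, ':');

	if (colon != NULL)
	{
		size_t		n = (size_t) (colon - elem);

		if (n >= hostlen)
			n = hostlen - 1;
		memcpy(host, elem, n);
		host[n] = '\0';
		*port = atoi(colon + 1);
	}
	else
	{
		snprintf(host, hostlen, "%s", elem);
		*port = defport;
	}
}
```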

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] unlink for DROPs after releasing locks (was Re: Should I implement DROP INDEX CONCURRENTLY?)

2012-06-10 Thread Robert Haas
On Sun, Jun 10, 2012 at 4:19 PM, Noah Misch n...@leadboat.com wrote:
 On Wed, Aug 24, 2011 at 03:38:09PM -0400, Tom Lane wrote:
 Merlin Moncure mmonc...@gmail.com writes:
  On Wed, Aug 24, 2011 at 1:24 PM, Daniel Farina dan...@heroku.com wrote:
  At Heroku we use CREATE INDEX CONCURRENTLY with great success, but
  recently when frobbing around some indexes I realized that there is no
  equivalent for DROP INDEX, and this is a similar but lesser problem
  (as CREATE INDEX takes much longer), as DROP INDEX takes an ACCESS
  EXCLUSIVE lock on the parent table while doing the work to unlink
  files, which nominally one would think to be trivial, but I assure you
  it is not at times for even indexes that are a handful of gigabytes
  (let's say ~= a dozen).

  Are you sure that you are really waiting on the time to unlink the
  file?  there's other stuff going on in there like waiting for lock,
  plan invalidation, etc.  Point being, maybe the time consuming stuff
  can't really be deferred which would make the proposal moot.

 Assuming the issue really is the physical unlinks (which I agree I'd
 like to see some evidence for), I wonder whether the problem could be

 I'd believe it.  With a cold cache (sudo sysctl -w vm.drop_caches=3), my
 desktop takes 2.6s to rm a 985 MiB file.

 addressed by moving smgrDoPendingDeletes() to after locks are released,
 instead of before, in CommitTransaction/AbortTransaction.  There does
 not seem to be any strong reason why we have to do that before lock
 release, since incoming potential users of a table should not be trying
 to access the old physical storage after that anyway.

 Agreed.  We now have $OLD_SUBJECT, but this is a win independently.  I have
 reviewed the code that runs between the old and new call sites, and I did not
 identify a hazard of moving it as you describe.

I looked at this when we last discussed it and didn't see a problem
either, so I tend to agree that we ought to go ahead and do this,
unless someone else sees a problem.  Holding locks for even slightly
less time is a good idea, and if it turns out to be substantially less
time in some cases, then we win more.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Ability to listen on two unix sockets

2012-06-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Jun 10, 2012 at 5:06 PM, Peter Eisentraut pete...@gmx.net wrote:
 On sön, 2012-06-10 at 09:41 -0400, Robert Haas wrote:
 If we add
 secondary_socket_dirs, I think we will also need secondary_ports.  One
 idea might be to have one new GUC that serves both purposes:

 additional_sockets = '12345, /foo'

 I was getting around to that, although I don't follow the specific
 syntax you have chosen here.

 I was thinking that each element could be of the form /path or port.
 But I guess it should really be /path or host:port.

I'm uncomfortable with the potential for ambiguity there.  I think we'd
be much better off having two lists, one for TCP addresses and one for
Unix socket directories.

I'm unconvinced that allowing multiple port numbers is worth the
amount of confusion it will cause.  In particular, we've traditionally
used the port number as part of the key for resources such as shared
memory.  I think we'd want the number used for that purpose to be what
is written into the lock file ... but then what if the postmaster is not
actually listening on *any* actual socket with that number?  pg_ctl will
not be happy.

regards, tom lane




Re: [HACKERS] unlink for DROPs after releasing locks (was Re: Should I implement DROP INDEX CONCURRENTLY?)

2012-06-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Jun 10, 2012 at 4:19 PM, Noah Misch n...@leadboat.com wrote:
 Agreed.  We now have $OLD_SUBJECT, but this is a win independently.  I have
 reviewed the code that runs between the old and new call sites, and I did not
 identify a hazard of moving it as you describe.

 I looked at this when we last discussed it and didn't see a problem
 either, so I tend to agree that we ought to go ahead and do this,

+1, as long as you mean in 9.3 not 9.2.  I don't see any risk either,
but the time for taking new risks in 9.2 is past.

Noah, please add this patch to the upcoming CF, if you didn't already.

regards, tom lane



Re: [HACKERS] Ability to listen on two unix sockets

2012-06-10 Thread Robert Haas
On Sun, Jun 10, 2012 at 5:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sun, Jun 10, 2012 at 5:06 PM, Peter Eisentraut pete...@gmx.net wrote:
 On sön, 2012-06-10 at 09:41 -0400, Robert Haas wrote:
 If we add
 secondary_socket_dirs, I think we will also need secondary_ports.  One
 idea might be to have one new GUC that serves both purposes:

 additional_sockets = '12345, /foo'

 I was getting around to that, although I don't follow the specific
 syntax you have chosen here.

 I was thinking that each element could be of the form /path or port.
 But I guess it should really be /path or host:port.

 I'm uncomfortable with the potential for ambiguity there.  I think we'd
 be much better off having two lists, one for TCP addresses and one for
 Unix socket directories.

I suggested it mostly because we already use that convention in libpq:
leading slash = pathname.

 I'm unconvinced that allowing multiple port numbers is worth the
 amount of confusion it will cause.  In particular, we've traditionally
 used the port number as part of the key for resources such as shared
 memory.  I think we'd want the number used for that purpose to be what
 is written into the lock file ... but then what if the postmaster is not
 actually listening on *any* actual socket with that number?  pg_ctl will
 not be happy.

Well, that's why I shied away from the syntax Peter is proposing.  I
think if we leave the semantics of listen_addresses and port alone,
and just add a new GUC for extra places to listen, there's no problem.
 If you look at the patch I posted upthread, you'll see that I set
things up so that we'll still fail if the primary port can't be
listened on; once we've established that we can listen there, we'll
try to set up the other sockets as well.  I think that's a pretty sane
way to allow this (which a number of people, not only me, seem to
support) without creating surprising semantics.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Backends stalled in 'startup' state: index corruption

2012-06-10 Thread Jeff Frost
On May 26, 2012, at 9:17 AM, Tom Lane wrote:

 Would you guys please try this in the problem databases:
 
 select a.ctid, c.relname
 from pg_attribute a join pg_class c on a.attrelid=c.oid
 where c.relnamespace=11 and c.relkind in ('r','i')
 order by 1 desc;
 
 If you see any block numbers above about 20 then maybe the triggering
 condition is a row relocation after all.


Sorry for such a long delay on the reply.  Took a while to get the data 
directory moved elsewhere:

select a.ctid, c.relname
from pg_attribute a join pg_class c on a.attrelid=c.oid
where c.relnamespace=11 and c.relkind in ('r','i')
order by 1 desc;

  ctid   | relname 
-+-
 (18,31) | pg_extension_name_index
 (18,30) | pg_extension_oid_index
 (18,29) | pg_seclabel_object_index
 (18,28) | pg_seclabel_object_index
 (18,27) | pg_seclabel_object_index
 (18,26) | pg_seclabel_object_index

 As the next step, I'd suggest verifying that the stall is reproducible
 if you remove pg_internal.init (and that it's not there otherwise), and
 then strace'ing the single incoming connection to see what it's doing.

It does take a little while, but not nearly as long as the stalls we were 
seeing before.  The pg_internal.init is 108k in case that's an interesting data 
point.

Any other tests you'd like me to run on that bad data dir?

Also, thus far, the newly initdb'd system continues to hum along just fine.  
It's also been upgraded to 9.1.4, so if it was the rebuilding of 
pg_internal.init, then your fix should keep it happy.

Re: [HACKERS] [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3

2012-06-10 Thread Alvaro Herrera

Excerpts from Bruce Momjian's message of dom jun 10 15:20:34 -0400 2012:
 Run pgindent on 9.2 source tree in preparation for first 9.3
 commit-fest.

Hm, does this touch stuff that would also be modified by perltidy?  I
wonder if we should refrain from doing entab/detab on perl files and
instead have perltidy touch such code.

Perhaps the thing to do is ensure that perltidy also uses tabs instead
of spaces.
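If the goal is tab indentation from perltidy itself, its --entab-leading-whitespace option could be appended to the invocation the pgindent README documents (the -et=4 value below is an assumption, chosen to match a 4-column tab stop):

```
cd src/tools/msvc
perltidy -b -bl -nsfs -naws -l=100 -ole=unix -et=4 *.pl *.pm
```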

 src/tools/msvc/Install.pm   | 1056 ++-
 src/tools/msvc/MSBuildProject.pm|  325 +++---
 src/tools/msvc/Mkvcbuild.pm | 1278 ---
 src/tools/msvc/Project.pm   |  607 ++--
 src/tools/msvc/Solution.pm  |  980 +-
 src/tools/msvc/VCBuildProject.pm|  292 +++---
 src/tools/msvc/VSObjectFactory.pm   |  152 ++--
 src/tools/msvc/build.pl |   20 +-
 src/tools/msvc/builddoc.pl  |   32 +-
 src/tools/msvc/config_default.pl|   38 +-
 src/tools/msvc/gendef.pl|   66 +-
 src/tools/msvc/install.pl   |4 +-
 src/tools/msvc/pgbison.pl   |   14 +-
 src/tools/msvc/pgflex.pl|   60 +-
 src/tools/msvc/vcregress.pl |  400 

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3

2012-06-10 Thread Bruce Momjian
On Sun, Jun 10, 2012 at 08:55:13PM -0400, Alvaro Herrera wrote:
 
 Excerpts from Bruce Momjian's message of dom jun 10 15:20:34 -0400 2012:
  Run pgindent on 9.2 source tree in preparation for first 9.3
  commit-fest.
 
 Hm, does this touch stuff that would also be modified by perltidy?  I
 wonder if we should refrain from doing entab/detab on perl files and
 instead have perltidy touch such code.
 
  src/tools/msvc/Install.pm   | 1056 ++-
  src/tools/msvc/MSBuildProject.pm|  325 +++---
  src/tools/msvc/Mkvcbuild.pm | 1278 ---
  src/tools/msvc/Project.pm   |  607 ++--
  src/tools/msvc/Solution.pm  |  980 +-
  src/tools/msvc/VCBuildProject.pm|  292 +++---
  src/tools/msvc/VSObjectFactory.pm   |  152 ++--
  src/tools/msvc/build.pl |   20 +-
  src/tools/msvc/builddoc.pl  |   32 +-
  src/tools/msvc/config_default.pl|   38 +-
  src/tools/msvc/gendef.pl|   66 +-
  src/tools/msvc/install.pl   |4 +-
  src/tools/msvc/pgbison.pl   |   14 +-
  src/tools/msvc/pgflex.pl|   60 +-
  src/tools/msvc/vcregress.pl |  400 

The Perl files were modified by perltidy and not by pgindent, as
documented in the pgindent README:

9) Indent the Perl MSVC code:

cd src/tools/msvc
perltidy -b -bl -nsfs -naws -l=100 -ole=unix *.pl *.pm

 Perhaps the thing to do is ensure that perltidy also uses tabs instead
 of spaces.

If you would like 'entab' run on the Perl files, let me know.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] log_newpage header comment

2012-06-10 Thread Amit Kapila
Uh... no.  The whole point of doing things in shared buffers is that
you don't have to write and fsync the buffers immediately.  Instead,
buffer eviction handles that stuff for you.

So you mean to say that there exist operations where XLOG is not required,
even though they still mark the buffer as dirty for later eviction?

I don't see why it's any different from log_newpage() in that regard.
That data is initialized before being written, as well, but someone
contemplated the possible need to write a page of all zeros.

The comment above the code indicates that the page is uninitialized.
Does the code consider a page of all zeros to be uninitialized, or is the
comment inaccurate?

-Original Message-
From: Robert Haas [mailto:robertmh...@gmail.com] 
Sent: Sunday, June 10, 2012 7:14 PM
To: Amit kapila
Cc: Tom Lane; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] log_newpage header comment

On Sat, Jun 9, 2012 at 1:43 AM, Amit kapila amit.kap...@huawei.com wrote:
On further review, I think that we ought to make MarkBufferDirty() the
caller's job, because sometimes we may need to xlog only if
XLogIsNeeded(), but the buffer's got to get marked dirty either way.

 In the case where XLOG is not required, wouldn't it fsync the data?
 So in that case, even MarkBufferDirty() would not be required.

Uh... no.  The whole point of doing things in shared buffers is that
you don't have to write and fsync the buffers immediately.  Instead,
buffer eviction handles that stuff for you.

So I think the new function should just do step 5 - emit XLOG and set
LSN/TLI.

 In the API log_newpage_buffer(), since the buffer already contains the page to
 be logged, can't it be assumed that the page is initialized, so there is no
 need to check PageIsNew before setting the LSN/TLI?

I don't see why it's any different from log_newpage() in that regard.
That data is initialized before being written, as well, but someone
contemplated the possible need to write a page of all zeros.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers