Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-24 Thread Kerem Kat
I am looking into prepunion.c and analyze.c.

There is a catch in inserting subqueries for corresponding in the planner.
The parser expects to see an equal number of columns on both sides of the
UNION query. If there is corresponding, however, we cannot guarantee that.
Target columns, collations and types for the SetOperationStmt are
determined in the parser. If we pass the column number equality checks,
it is not clear how one would proceed with the targetlist generation
loop, which is a forboth over the two tables' columns.
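
For reference, the shape of that loop today is roughly the following (a
simplified sketch of the pattern in transformSetOperationTree(), not the
exact analyze.c code; with CORRESPONDING the pairing would have to be
driven by matching column names rather than by list position):

    ListCell   *ltl,
               *rtl;

    forboth(ltl, ltargetlist, rtl, rtargetlist)
    {
        TargetEntry *ltle = (TargetEntry *) lfirst(ltl);
        TargetEntry *rtle = (TargetEntry *) lfirst(rtl);

        /* resolve a common output column type for this pair */
        op->colTypes = lappend_oid(op->colTypes,
                                   select_common_type(pstate,
                                                      list_make2(ltle->expr,
                                                                 rtle->expr),
                                                      "UNION", NULL));
        /* ... collations and typmods are collected the same way ... */
    }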

One way would be to filter the columns in the parser anyway and insert
subqueries in the planner, but that leads to the previous problems of
column ordering and view-definition mess-up, and it would be too much
bloat, methinks.

I can guess what needs to be done in prepunion.c, but I need a pointer
for the parser side.

Tom Lane: Thanks for your description.


regards

Kerem KAT

On Fri, Sep 23, 2011 at 07:40, Tom Lane t...@sss.pgh.pa.us wrote:

 Kerem Kat kerem...@gmail.com writes:
  While testing I noticed that ordering is incorrect in my implementation.
  At first I thought that removing mismatched entries from ltargetlist and
  rtargetlist would be enough; it didn't seem to be, so I added rtargetlist
  sorting.

 I don't think you can get away with changing the targetlists of the
 UNION subqueries; you could break their semantics.  Consider for
 instance

select distinct a, b, c from t1
union corresponding
select b, c from t2;

 If you discard the A column from t1's output list then it will deliver a
 different set of rows than it should, because the DISTINCT is
 considering the wrong set of values.

 One possible way to fix that is to introduce a level of sub-select,
 as if the query had been written

select b, c from (select distinct a, b, c from t1) ss1
union
select b, c from (select b, c from t2) ss2;

 However, the real problem with either type of hackery is that these
 machinations will be visible in the parsed query, which means for
 example that a view defined as

create view v1 as
select distinct a, b, c from t1
union corresponding
select b, c from t2;

 would come out looking like the transformed version rather than the
 original when it's dumped, or even just examined with tools such as
 psql's \d+.  I think this is bad style.  It's certainly ugly to expose
 your implementation shortcuts to the user like that, and it also can
 cause problems down the road: if in the future we think of some better
 way to implement CORRESPONDING, we've lost the chance to do so for any
 stored views that got transformed this way.  (There are several places
 in Postgres now that take such shortcuts, and all of them were mistakes
 that we need to clean up someday, IMO.)

 So I think that as far as the parser is concerned, you just want to
 store the CORRESPONDING clause more or less as-is, and not do too much
 more than verify that it's valid.  The place to actually implement it is
 in the planner (see prepunion.c).  Possibly the add-a-level-of-subselect
 approach will work, but you want to do that querytree transformation at
 plan time not parse time.

regards, tom lane



Re: [HACKERS] CUDA Sorting

2011-09-24 Thread Hannu Krosing
On Mon, 2011-09-19 at 15:12 +0100, Greg Stark wrote:
 On Mon, Sep 19, 2011 at 1:11 PM, Vitor Reus vitor.r...@gmail.com wrote:
  Since I'm new to pgsql development, I replaced the code of pgsql
  qsort_arg to get used to the way postgres does the sort. The problem
  is that I can't use the qsort_arg_comparator comparator function on
  the GPU, so I need to implement my own. I didn't find out how to access
  the sorting key value data of the tuples in the Tuplesortstate or
  SortTuple structures. This part looks complicated because it seems the
  state holds the pointer for the scanner(?), but I didn't manage to
  access the values directly. Can anyone tell me how this works?



 With the GPU I'm curious to see how well
 it handles multiple processes contending for resources, it might be a
 flashy feature that gets lots of attention but might not really be
 very useful in practice. But it would be very interesting to see.

There are cases where concurrency may not be that important, such as some
specialized OLAP loads where you have to sort, for example to find a
median in large data sets.
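
On the upthread question of where the key values live: a rough sketch,
assuming the in-memory layout in tuplesort.c (an array of SortTuple
entries whose datum1/isnull1 fields cache the leading sort key), of how
the leading keys could be copied into a flat array for a GPU kernel.
This is only an illustration of the data structures, not working code
from any patch:

    /*
     * Sketch only: Tuplesortstate is private to tuplesort.c, so code
     * like this would have to live in that file.  Assumes the
     * memtuples/memtupcount fields and the cached leading key in
     * SortTuple (datum1/isnull1).
     */
    static void
    collect_leading_keys(Tuplesortstate *state, Datum *keys, bool *nulls)
    {
        int         i;

        for (i = 0; i < state->memtupcount; i++)
        {
            SortTuple  *stup = &state->memtuples[i];

            keys[i] = stup->datum1;   /* leading-column value or pointer */
            nulls[i] = stup->isnull1; /* NULLs need separate handling */
        }
    }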


-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/




Re: [HACKERS] CUDA Sorting

2011-09-24 Thread Hannu Krosing
On Mon, 2011-09-19 at 10:36 -0400, Greg Smith wrote:
 On 09/19/2011 10:12 AM, Greg Stark wrote:
  With the GPU I'm curious to see how well
  it handles multiple processes contending for resources, it might be a
  flashy feature that gets lots of attention but might not really be
  very useful in practice. But it would be very interesting to see.
 
 
 The main problem here is that the sort of hardware commonly used for 
 production database servers doesn't have a GPU serious enough to 
 support CUDA/OpenCL.  The very clear trend now is that all 
 systems other than gaming ones ship with motherboard graphics chipsets 
 more than powerful enough for any task but that.  I just checked the 5 
 most popular configurations of server I see my customers deploy 
 PostgreSQL onto (a mix of Dell and HP units), and you don't get a 
 serious GPU from any of them.
 
 Intel's next generation Ivy Bridge chipset, expected for the spring of 
 2012, is going to add support for OpenCL to the built-in motherboard 
 GPU.  We may eventually see that trickle into the server hardware side 
 of things too.
 
 I've never seen a PostgreSQL server capable of running CUDA, and I don't 
 expect that to change.

CUDA sorting could be beneficial on general server hardware if it can
run well on multiple CPUs in parallel. GPUs being in essence parallel
processors on fast shared memory, it may be that some CUDA algorithms
are a significant win even on ordinary RAM and lots of CPUs.

And then there is a non-graphics GPU available on EC2:

  Cluster GPU Quadruple Extra Large Instance

  22 GB of memory
  33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem”
   architecture)
  2 x NVIDIA Tesla “Fermi” M2050 GPUs
  1690 GB of instance storage
  64-bit platform
  I/O Performance: Very High (10 Gigabit Ethernet)
  API name: cg1.4xlarge

It costs $2.10 per hour, probably a lot less if you use the Spot
Instances.

 -- 
 Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
 PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
 
 





[HACKERS] TABLE tab completion

2011-09-24 Thread Magnus Hagander
TABLE tab completion in psql only completes to tables, not views, but
the TABLE command works fine for both tables and views (and also
sequences).

Seems we should just complete it to relations and not tables - or can
anyone see a particular reason why we shouldn't?

Trivial patch attached.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4f7df36..8515c38 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -636,7 +636,7 @@ static const pgsql_thing_t words_after_create[] = {
 	{"SCHEMA", Query_for_list_of_schemas},
 	{"SEQUENCE", NULL, &Query_for_list_of_sequences},
 	{"SERVER", Query_for_list_of_servers},
-	{"TABLE", NULL, &Query_for_list_of_tables},
+	{"TABLE", NULL, &Query_for_list_of_relations},
 	{"TABLESPACE", Query_for_list_of_tablespaces},
 	{"TEMP", NULL, NULL, THING_NO_DROP},		/* for CREATE TEMP TABLE ... */
 	{"TEMPLATE", Query_for_list_of_ts_templates, NULL, THING_NO_SHOW},



Re: [HACKERS] Large C files

2011-09-24 Thread Hannu Krosing
On Wed, 2011-09-07 at 01:22 +0200, Jan Urbański wrote:
 On 07/09/11 01:13, Peter Geoghegan wrote:
  On 6 September 2011 08:29, Peter Eisentraut pete...@gmx.net wrote:
  I was thinking about splitting up plpython.c, but it's not even on that
  list. ;-)
  
  IIRC the obesity of that file is something that Jan Urbański intends
  to fix, or is at least concerned about. Perhaps you should compare
  notes.
 
 Yeah, plpython.c could easily be split into submodules for builtin
 functions, SPI support, subtransaction support, traceback support, etc.
 
 If there is consensus that this should be done, I'll see if I can find
 time to submit a giant-diff-no-behaviour-changes patch for the next CF.

+1 from me.

Would make working with it much nicer.

 Cheers,
 Jan
 

-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/




Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present

2011-09-24 Thread Magnus Hagander
On Fri, Sep 23, 2011 at 16:44, Alvaro Herrera
alvhe...@commandprompt.com wrote:

 Excerpts from Magnus Hagander's message of Fri Sep 23 11:31:37 -0300 2011:

 On Fri, Sep 23, 2011 at 15:55, Alvaro Herrera
 alvhe...@commandprompt.com wrote:

  This seems strange to me.  Why not have a second option to let the user
  indicate the desired SSL verification?
 
  sslmode=disable/allow/prefer/require
  sslverify=none/ca-if-present/ca/full
 
  (ca-if-present being the current require sslmode behavior).
 
  We could then deprecate sslmode=verify and verify-full and have them be
  synonyms of sslmode=require and corresponding sslverify.

 Hmm. I agree that the other suggestion was a bit weird, but I'm not
 sure I like the multiple-options approach either. That's going to
 require redesign of all software that deals with it at all today :S

 Why?  They could continue to use the existing options; or switch to the
 new options if they wanted different behavior, as is the case of the OP.

I guess. I was mostly thinking in terms of anything that has a
connection dialog that looks anything like the one in pgAdmin, for
example - which will now suddenly need more than one dropdown box for
what really should be a simple setting. But I guess that can be
considered a UI thing, and we can just have said application map a single
dropdown to multiple options in the connection string.


 Maybe we should just update the docs and be done with it :-)

 That's another option, sure ... :-)

I've applied a docs fix for this now. We can keep discussing how to
make a more extensive fix in head :)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Simon Riggs
On Fri, Sep 23, 2011 at 5:51 PM, Josh Berkus j...@agliodbs.com wrote:
 Simon,

 There are many. Tools I can name include pgpool, 2warm, PITRtools, but
 there are also various tools from Sun, an IBM reseller I have
 forgotten the name of, OmniTI and various other backup software
 providers. Those are just the ones I can recall quickly. We've
 encouraged people to write software on top and they have done so.

 Actually, just to correct this list:
 * there are no tools from Sun

Just for the record, I sat through a one-hour presentation at the
PostgreSQL UK conference from a Sun employee describing the product
and its very clear use of PostgreSQL facilities. Josh definitely
wasn't at that presentation; many others here were.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Simon Riggs
On Fri, Sep 23, 2011 at 6:17 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Sep 23, 2011 at 12:51 PM, Josh Berkus j...@agliodbs.com wrote:
 I'm happy to make upgrades easier, but I want a path which eventually
 ends in recovery.conf going away.  It's a bad API, confuses our users,
 and is difficult to support and maintain.

 I agree.

 GUC = Grand Unified Configuration, but recovery.conf is an example of
 where it's not so unified after all.  We've already done a non-trivial
 amount of work to allow recovery.conf values to be specified without
 quotes, a random incompatibility with GUCs that resulted from having
 different parsing code for each file.  If that were the last issue,
 then maybe it wouldn't be worth worrying about, but it's not.  For
 example, it would be nice to have reload behavior on SIGHUP.
...
 I don't want us
 to have to implement such things separately for postgresql.conf and
 recovery.conf.

It was always my plan to do exactly the above, and there are code
comments from 2004 that say so. The time to replace it is now; I
welcome that day and have already agreed to it.

We all want everything quoted above; nothing there is under debate.


 And we
 keep talking about having an ALTER SYSTEM SET guc = value or SET
 PERMANENT guc = value command, and I think ALTER SYSTEM SET
 recovery_target_time = '...' would be pretty sweet.  I don't want us
 to have to implement such things separately for postgresql.conf and
 recovery.conf.

There is a reason why it doesn't work that way which you overlook.
Please start a separate thread if you wish to discuss that.


 Now, it's true that Simon's proposal (of having recovery.conf
 automatically included) if it exists doesn't necessarily preclude
 those things.  But it seems to me that it is adding a lot of
 complexity to core for a pretty minimal benefit to end-users, and that
 the semantics are not going to end up being very clean.  For example,
 now you potentially have the situation where recovery.conf has
 work_mem=128MB and postgresql.conf has work_mem=4MB, and now when you
 end recovery you've got to make sure that everyone picks up the new
 setting.  Now, in some sense you could say that's a feature addition,
 and I'm not going to deny that it might be useful to some people, but
 I think it's also going to require a fairly substantial convolution of
 the GUC machinery, and it's going to discourage people from moving
 away from recovery.conf.  And like Josh, I think that ought to be the
 long-term goal, for the reasons he states.

The semantics are clear: recovery.conf is read first, then
postgresql.conf. It's easy to implement (1 line of code) and easy to
understand.

So we can support the old and the new very, very easily and clearly.
Complexity? No, definitely not. Minimal benefit for end users?
Backwards compatibility isn't a minimal benefit; it's a major issue.

If you put things in two places, yes that causes problems. You can
already add the same parameter twice and cause exactly the same
problems.



 I don't want to go willy-nilly breaking third-party tools that work
 with PostgreSQL, but in this case I think that the reason there are so
 many tools in the first place is because what we're providing in core
 is not very good.  If we are unwilling to improve it for fear of
 breaking compatibility with the tools, then we are stuck.

No, there are many tools because there are many requirements. A
simple, open API has allowed our technology to be widely used. That
was by design not by chance.

Nobody is unwilling to improve it. The debate is about people being
unwilling to provide a simple, easy-to-understand backwards
compatibility feature - one whose absence breaks things for no reason and
which does not interfere with the proposed new features.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Large C files

2011-09-24 Thread Bruce Momjian
Robert Haas wrote:
 On Fri, Sep 9, 2011 at 11:28 PM, Peter Geoghegan pe...@2ndquadrant.com 
 wrote:
  It's very difficult or impossible to anticipate how effective the tool
  will be in practice, but when you consider that it works and does not
  produce false positives for the first 3 real-world cases tested, it
  seems reasonable to assume that it's at least worth having around. Tom
  recently said of a previous pgrminclude campaign in July 2006 that "It
  took us two weeks to mostly recover, but we were still dealing with
  some fallout in December." I think that makes the case for adding this
  tool or some refinement as a complement to pgrminclude in src/tools
  fairly compelling.
 
 I'm not opposed to adding something like this, but I think it needs to
 either be tied into the actual running of the script, or have a lot
 more documentation than it does now, or both.  I am possibly stupid,
 but I can't understand from reading the script (or, honestly, the
 thread) exactly what kind of pgrminclude-induced errors this is
 protecting against; but even if we clarify that, it seems like it
 would be a lot of work to run it manually on all the files that might
 be affected by a pgrminclude run, unless we can somehow automate that.
 
 I'm also curious to see how much more fallout we're going to see from
 that run.  We had a few glitches when it was first done, but it didn't
 seem like they were really all that bad.  It might be that we'd be
 better off running pgrminclude a lot *more* often (like once a cycle,
 or even after every CommitFest), because the scope of the changes
 would then be far smaller and we wouldn't be dealing with 5 years of
 accumulated cruft all at once; we'd also get a lot more experience
 with what works or does not work with the script, which might lead to
 improvements in that script on a less-than-geologic time scale.

Interesting idea.  pgrminclude has three main problems:

o  defines --- to prevent removing an include that is referenced only in an
#ifdef block that is not reached, it removes the #ifdef markers, though that
might cause the file not to compile, in which case the file is skipped.

o  ## expansion --- we found that CppAsString2() uses the CPP stringification
operator, which throws no error if the symbol does not exist (perhaps
through the removal of a define); see the sketch below.  This is the problem
we had with tablespace directory names, and the script now checks for
CppAsString2 and friends and skips the file.

o  imbalance of includes --- pgrminclude can remove includes that are
not required, but this can cause other files to need many more includes.
This is the include-imbalance problem Tom found.
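
To illustrate the second point, here is a minimal standalone sketch (the
macro definitions follow the style of c.h; CATALOG_VERSION_NO normally
comes from catversion.h):

    #include <stdio.h>

    #define CppAsString(identifier) #identifier
    #define CppAsString2(x) CppAsString(x)

    #define CATALOG_VERSION_NO 201105231

    int
    main(void)
    {
        /* Prints "PG_201105231" while the define is visible. */
        printf("PG_%s\n", CppAsString2(CATALOG_VERSION_NO));

        /*
         * If pgrminclude dropped the include supplying CATALOG_VERSION_NO,
         * this would still compile and silently print
         * "PG_CATALOG_VERSION_NO": the preprocessor stringifies the bare
         * symbol instead of raising an error.
         */
        return 0;
    }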

The submitted patch to compare object files only catches the second
problem we had.  

I think we determined that the best risk/reward is to run it every five
years.  The pgrminclude run removed 627 include references, which is
about 0.06% of our 1,077,878 lines of C code.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-24 Thread Martijn van Oosterhout
On Fri, Sep 23, 2011 at 04:22:09PM +0100, Greg Stark wrote:
 So you have two memory fetches which I guess I still imagine have to
 be initiated in the right order but they're both in flight at the same
 time. I have no idea how the memory controller works and I could
 easily imagine either one grabbing the value before the other.

That's easy. If one is in cache and the other isn't then the results
will come back out of order.

 I'm not even clear how other processors can reasonably avoid this. It
 must be fantastically expensive to keep track of which branch
 predictions depended on which registers and which memory fetches the
 value of those registers depended on. And then it would have  to
 somehow inform the memory controller of those old memory fetches that
 this new memory fetch is dependent on and have it ensure that the
 fetches are read in the right order?

I think memory accesses are also fantastically expensive, so it's worth
some effort to optimise that.

I found the Linux kernel document on this topic quite readable. I think
the main lesson here is that processors track data dependencies (other
than the Alpha, apparently), but not control dependencies.  So in the
example, the value of i is dependent on num_items, but not via any
calculation.  That control dependencies are not tracked makes some
sense, since branches depend on the flag bits, and just about any
calculation changes the flag bits, but most of the time these changes
are not used.

It's also not a question of knowing the address either, since the
first load, if any, will be *q->items, irrespective of the precise
value of num_items.  This address may be calculated long in advance.
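
To make the distinction concrete, a small sketch (the queue structure is
hypothetical, purely for illustration; it is not from any patch in this
thread):

    struct queue
    {
        int  num_items;
        int *items;             /* points to num_items entries */
    };

    /* Caller guarantees the queue is non-empty for the first part. */
    int
    read_first(struct queue *q)
    {
        /*
         * Data dependency: the address of the second load is computed
         * from the value of the first, so the CPU keeps them ordered
         * (Alpha being the documented exception).
         */
        int *p = q->items;          /* load 1 */
        int  via_data = p[0];       /* load 2: address derived from load 1 */

        /*
         * Control dependency: the first load only feeds a branch.  The
         * CPU may speculate past the branch and issue the second load
         * early, so a read barrier between the loads would be needed to
         * guarantee ordering.
         */
        int  via_control = 0;

        if (q->num_items > 0)       /* load used only in a comparison */
            via_control = q->items[0];  /* may be loaded speculatively */

        return via_data + via_control;
    }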

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] What Would You Like To Do?

2011-09-24 Thread Bruce Momjian
Joshua D. Drake wrote:
 
 On 09/13/2011 11:51 AM, Michael Nolan wrote:
 
 
  The ability to restore a table from a backup file to a different
  table
  name in the same database and schema.
 
 
  This can be done but agreed it is not intuitive.
 
 
  Can you elaborate on that a bit, please?  The only way I've been able to
  do it is to edit the dump file to change the table name.  That's not
  very practical with a several gigabyte dump file, even less so with one
  that is much larger.  If this capability already exists, is it documented?
 
 You use the -Fc method, extract the TOC and edit just the TOC (so you 
 don't have to edit a multi-gig file)

How does that work in practice?  You dump the TOC, edit it, restore the
TOC schema definition, then how do you restore the data to the renamed
table?

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] What Would You Like To Do?

2011-09-24 Thread Andrew Dunstan



On 09/24/2011 09:51 AM, Bruce Momjian wrote:

Joshua D. Drake wrote:

On 09/13/2011 11:51 AM, Michael Nolan wrote:


 The ability to restore a table from a backup file to a different
 table
 name in the same database and schema.


 This can be done but agreed it is not intuitive.


Can you elaborate on that a bit, please?  The only way I've been able to
do it is to edit the dump file to change the table name.  That's not
very practical with a several gigabyte dump file, even less so with one
that is much larger.  If this capability already exists, is it documented?

You use the -Fc method, extract the TOC and edit just the TOC (so you
don't have to edit a multi-gig file)

How does that work in practice?  You dump the TOC, edit it, restore the
TOC schema definition, then how do you restore the data to the renamed
table?




How do you extract the TOC at all? There are no tools for manipulating 
the TOC that I know of, and I'm not sure we should provide any. It's not 
documented, it's a purely internal artefact. The closest thing we have 
to being able to manipulate it is --list/--use-list, and those are 
useless for this purpose. So this method description does not compute 
for me either.


+1 for providing a way to restore an object to a different object name.

cheers

andrew





Re: [HACKERS] fix for pg_upgrade

2011-09-24 Thread Bruce Momjian
\panam wrote:
 Hi, just tried to upgrade from 9.0 to 9.1 and got this error during
 pg_upgrade :
 Mismatch of relation id: database xyz, old relid 465783, new relid 16494
 It seems I get this error on every table, as I got it on another table
 (which I did not need and deleted) before as well. Schemas seem to be
 migrated but the content is missing.
 
 I am using Windows 7 64-bit (both PG servers are 64-bit as well), everything
 on the same machine.

Sorry for the delay in replying.  It is odd that you got a mismatch of relids
because pg_upgrade is supposed to preserve all of those.  Can you run a
query to find out which table has relid 465783 in the old cluster?

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Range Types - symmetric

2011-09-24 Thread Bruce Momjian
Jeff Davis wrote:
 On Tue, 2011-09-13 at 12:34 -0400, Christopher Browne wrote:
   select int4range(5,2);
   ERROR:  range lower bound must be less than or equal to range upper bound
  
   Of course, I won't argue this is a bug, but I was wondering if it 
   wouldn't be handy to allow a
   'symmetric' mode in range construction, where, if the first of the pair 
   is higher than the second,
   they are automatically swapped, similar to SYMMETRIC in the BETWEEN 
   clause.
 
 ...
 
  If you have a computation that gets a backwards range, then it is
  more than possible that what you've got isn't an error of getting the
  range backwards, but rather the error that your data is
  overconstraining, and that you don't actually have a legitimate range.
 
 Agreed. On balance, it's just as likely that you miss an error as save a
 few keystrokes.
 
 I'll add that it would also cause a little confusion with inclusivity.
 What if you do: '[5,2)'::int4range? Is that really '[2,5)' or '(2,5]'?

Reminder:  BETWEEN supports the SYMMETRIC keyword, so there is
a precedent for this.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Tatsuo Ishii
 I'm not sure what you mean by "not deal with", but part of pgpool-II's
 functionality assumes that we can easily generate recovery.conf. If
 recovery.conf is integrated into postgresql.conf, we need to edit
 postgresql.conf, which is a little bit harder than generating
 recovery.conf, I think.
 
 Oh?  Clearly I've been abusing pgPool2 then.  Where's the code that
 generates that?

pgpool-II itself does not generate the file, but scripts for pgpool-II
generate it, as stated in the documentation coming with pgpool-II.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



[HACKERS] posix_fadvise in base backups

2011-09-24 Thread Magnus Hagander
Attached patch adds a simple call to posix_fadvise with
POSIX_FADV_DONTNEED on all the files being read when doing a base
backup, to help the kernel not to trash the filesystem cache.

Seems like a simple enough fix - in fact, I don't remember why I took
it out of the original patch :O

Any reason not to put this in? Is it even safe enough to put into 9.1
(probably not, but maybe?)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 4841095..54c4d13 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -781,6 +781,15 @@ sendFile(char *readfilename, char *tarfilename, struct stat * statbuf)
 		pq_putmessage('d', buf, pad);
 	}
 
+	/*
+	 * If we have posix_fadvise(), send a note to the kernel that we are not
+	 * going to need this data anytime soon, so that it can be discarded
+	 * from the filesystem cache.
+	 */
+#if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
+	(void) posix_fadvise(fileno(fp), 0, 0, POSIX_FADV_DONTNEED);
+#endif
+
 	FreeFile(fp);
 }
 



Re: [HACKERS] posix_fadvise in base backups

2011-09-24 Thread Andres Freund
Hi,

On Saturday, September 24, 2011 05:08:17 PM Magnus Hagander wrote:
 Attached patch adds a simple call to posix_fadvise with
 POSIX_FADV_DONTNEED on all the files being read when doing a base
 backup, to help the kernel not to trash the filesystem cache.
 
 Seems like a simple enough fix - in fact, I don't remember why I took
 it out of the original patch :O
 
 Any reason not to put this in? Is it even safe enough to put into 9.1
 (probably not, but maybe?)
Won't that possibly throw a formerly fully cached database out of the cache?

Andres



Re: [HACKERS] posix_fadvise in base backups

2011-09-24 Thread Magnus Hagander
On Sat, Sep 24, 2011 at 17:14, Andres Freund and...@anarazel.de wrote:
 Hi,

 On Saturday, September 24, 2011 05:08:17 PM Magnus Hagander wrote:
 Attached patch adds a simple call to posix_fadvise with
 POSIX_FADV_DONTNEED on all the files being read when doing a base
 backup, to help the kernel not to trash the filesystem cache.

 Seems like a simple enough fix - in fact, I don't remember why I took
 it out of the original patch :O

 Any reason not to put this in? Is it even safe enough to put into 9.1
 (probably not, but maybe?)
 Won't that possibly throw a formerly fully cached database out of the cache?

I was assuming the kernel was smart enough to read this as "*this*
process is not going to be using this file anymore", not "nobody in
the whole machine is going to use this file anymore". And the process
running the base backup is certainly not going to read it again.

But that's a good point - do you know if that is the case, or does it
mandate more testing?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Large C files

2011-09-24 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Robert Haas wrote:
 I'm also curious to see how much more fallout we're going to see from
 that run.  We had a few glitches when it was first done, but it didn't
 seem like they were really all that bad.  It might be that we'd be
 better off running pgrminclude a lot *more* often (like once a cycle,
 or even after every CommitFest), because the scope of the changes
 would then be far smaller and we wouldn't be dealing with 5 years of
 accumulated cruft all at once; we'd also get a lot more experience
 with what works or does not work with the script, which might lead to
 improvements in that script on a less-than-geologic time scale.

 Interesting idea.  pgrminclude has three main problems: [ snipped ]

Actually, I believe that the *main* problem with pgrminclude is that
it fails to account for combinations of build options other than those
that Bruce uses.  In the previous go-round, the reason we were still
squashing bugs months later is that it took that long for people to
notice and complain "hey, compiling with LOCK_DEBUG no longer works",
or various other odd build options that the buildfarm doesn't exercise.
I have 100% faith that we'll be squashing some bugs like that ... very
possibly, the exact same ones as five years ago ... over the next few
months.  Peter's proposed tool would catch issues like the CppAsString2
failure, but it's still only going to exercise the build options you're
testing with.

If we could get pgrminclude up to a similar level of reliability as
pgindent, I'd be for running it on every cycle.  But I'm not sure that
the current approach to it is capable even in theory of getting to "it
just works" reliability.  I'm also not impressed at all by the hack of
avoiding problems by excluding entire files from the processing ---
what's the point of having the tool then?

 I think we determined that the best risk/reward is to run it every five
 years.

Frankly, with the tool in its current state I'd rather not run it at
all, ever.  The value per man-hour expended is too low.  The mess it
made out of the xlog-related includes this time around makes me question
whether it's even a net benefit, regardless of whether it can be
guaranteed not to break things.  Fundamentally, there's a large
component of design judgment/taste in the question of which header files
should include which others, but this tool does not have any taste.

regards, tom lane



Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-24 Thread Tom Lane
Kerem Kat kerem...@gmail.com writes:
 There is a catch in inserting subqueries for corresponding in the planner.
 The parser expects to see an equal number of columns on both sides of the
 UNION query. If there is corresponding, however, we cannot guarantee that.

Well, you certainly need the parse analysis code to be aware of
CORRESPONDING's effects.  But I think you can confine the changes to
adjusting the computation of a SetOperationStmt's list of output column
types.  It might be a good idea to also add a list of output column
names to SetOperationStmt, and get rid of the logic that digs down into
the child queries when we need to know the output column names.

 Target columns, collations and types for the SetOperationStmt are
 determined in the parser. If we pass the column number equality checks,
 it is not clear how one would proceed with the targetlist generation
 loop, which is a forboth over the two tables' columns.

Obviously, that logic doesn't work at all for CORRESPONDING, so you'll
need to have a separate code path to deduce the output column list in
that case.

regards, tom lane



Re: [HACKERS] Large C files

2011-09-24 Thread Bruce Momjian
Tom Lane wrote:
 Actually, I believe that the *main* problem with pgrminclude is that
 it fails to account for combinations of build options other than those
 that Bruce uses.  In the previous go-round, the reason we were still
 squashing bugs months later is that it took that long for people to
 notice and complain hey, compiling with LOCK_DEBUG no longer works,
 or various other odd build options that the buildfarm doesn't exercise.
 I have 100% faith that we'll be squashing some bugs like that ... very
 possibly, the exact same ones as five years ago ... over the next few
 months.  Peter's proposed tool would catch issues like the CppAsString2

The new code removes #ifdef markers so all code is compiled, or the file
is skipped if it can't be compiled.  That should avoid this problem.

 failure, but it's still only going to exercise the build options you're
 testing with.
 
 If we could get pgrminclude up to a similar level of reliability as
 pgindent, I'd be for running it on every cycle.  But I'm not sure that
 the current approach to it is capable even in theory of getting to it
 just works reliability.  I'm also not impressed at all by the hack of
 avoiding problems by excluding entire files from the processing ---
 what's the point of having the tool then?

We remove the includes we safely can, and skip the others --- the
benefit is only for the files we can process.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Robert Haas
On Sat, Sep 24, 2011 at 9:02 AM, Simon Riggs si...@2ndquadrant.com wrote:
 The semantics are clear: recovery.conf is read first, then
 postgresql.conf. It's easy to implement (1 line of code) and easy to
 understand.

Eh, well, if you can implement it in one line of code, consider my
objection withdrawn.  I can't see how that would be possible, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-24 Thread Kerem Kat
On Sat, Sep 24, 2011 at 18:49, Tom Lane t...@sss.pgh.pa.us wrote:

 Kerem Kat kerem...@gmail.com writes:
  There is a catch in inserting subqueries for corresponding in the planner.
  The parser expects to see an equal number of columns on both sides of the
  UNION query. If there is corresponding, however, we cannot guarantee that.

 Well, you certainly need the parse analysis code to be aware of
 CORRESPONDING's effects.  But I think you can confine the changes to
 adjusting the computation of a SetOperationStmt's list of output column
 types.  It might be a good idea to also add a list of output column
 names to SetOperationStmt, and get rid of the logic that digs down into
 the child queries when we need to know the output column names.


In the parser, while analyzing the SetOperationStmt, larg and rarg need to be
transformed as subqueries. SetOperationStmt can have two fields representing
larg and rarg with their columns projected according to corresponding:
larg_corresponding and
rarg_corresponding.

The planner uses the _corresponding ones if the query is a corresponding
query; the view-definition generator uses larg and rarg, which represent
the query the user entered.

Comments?


  Target columns, collations and types for the SetOperationStmt are
  determined in the parser. If we pass the column number equality checks,
  it is not clear how one would proceed with the targetlist generation
  loop, which is a forboth over the two tables' columns.

 Obviously, that logic doesn't work at all for CORRESPONDING, so you'll
 need to have a separate code path to deduce the output column list in
 that case.


If the output column list is to be determined at that stage, it needs to
be filtered and ordered.
In that case, aren't we breaking the argument about not modifying the
user's query?

Note: I am new to this list; am I asking for too much detail?

regards,

Kerem KAT



Re: [HACKERS] posix_fadvise in base backups

2011-09-24 Thread Andres Freund
Hi,

On Saturday, September 24, 2011 05:16:48 PM Magnus Hagander wrote:
 On Sat, Sep 24, 2011 at 17:14, Andres Freund and...@anarazel.de wrote:
  On Saturday, September 24, 2011 05:08:17 PM Magnus Hagander wrote:
  Attached patch adds a simple call to posix_fadvise with
  POSIX_FADV_DONTNEED on all the files being read when doing a base
  backup, to help the kernel not to trash the filesystem cache.
  Seems like a simple enough fix - in fact, I don't remember why I took
  it out of the original patch :O
  Any reason not to put this in? Is it even safe enough to put into 9.1
  (probably not, but maybe?)
  Won't that possibly throw a formerly fully cached database out of the
  cache?
 I was assuming the kernel was smart enough to read this as *this*
 process is not going to be using this file anymore, not nobody in
 the whole machine is going to use this file anymore. And the process
 running the base backup is certainly not going to read it again.
 But that's a good point - do you know if that is the case, or does it
 mandate more testing?
I am pretty, but not totally, sure that the kernel does not track each process
that uses a page. For one, doing so would probably be prohibitively expensive. For
another, I am pretty (but not ...) sure that I restructured an application not
to fadvise(DONTNEED) memory that is also used in other processes.

Currently I can only think of two workarounds, both OS-specific:
- Use O_DIRECT for reading the base backup. It will be slow in fully cached
situations, but should work well enough in all others. One needs to be careful
about the usual O_DIRECT pitfalls (page size, alignment, etcetera).
- Use mmap/mincore() to gather whether data is in cache and restore that state
afterwards (see the sketch below).

Too bad that POSIX_FADV_NOREUSE is not really implemented.
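
A rough sketch of the mincore() half of that second idea, plain POSIX
only (error handling, feature-test macros and the restore step are left
out; on some platforms the vector argument is char * rather than
unsigned char *):

    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /*
     * Record which pages of an open file are currently resident in the
     * filesystem cache.  The low-order bit of each vec[] entry is set
     * for resident pages; restoring the state afterwards would mean
     * issuing DONTNEED only for the pages that were not resident before.
     */
    static unsigned char *
    snapshot_cache_state(int fd, size_t *npages)
    {
        struct stat st;
        size_t      pagesize = (size_t) sysconf(_SC_PAGESIZE);
        void       *map;
        unsigned char *vec;

        if (fstat(fd, &st) < 0 || st.st_size == 0)
            return NULL;

        *npages = ((size_t) st.st_size + pagesize - 1) / pagesize;
        vec = malloc(*npages);
        if (vec == NULL)
            return NULL;

        /* Map the file read-only just to be able to ask about residency. */
        map = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED || mincore(map, (size_t) st.st_size, vec) < 0)
        {
            free(vec);
            vec = NULL;
        }
        if (map != MAP_FAILED)
            munmap(map, (size_t) st.st_size);

        return vec;
    }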


Andres



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Robert Haas
On Fri, Sep 23, 2011 at 6:55 PM, Tatsuo Ishii is...@postgresql.org wrote:
 There are many. Tools I can name include pgpool, 2warm, PITRtools, but
 there are also various tools from Sun, an IBM reseller I have
 forgotten the name of, OmniTI and various other backup software
 providers. Those are just the ones I can recall quickly. We've
 encouraged people to write software on top and they have done so.

 Actually, just to correct this list:
 * there are no tools from Sun
 * pgPool2 does not deal with recovery.conf

 I'm not sure what you mean by "not deal with", but part of pgpool-II's
 functionality assumes that we can easily generate recovery.conf. If
 recovery.conf is integrated into postgresql.conf, we need to edit
 postgresql.conf, which is a little bit harder than generating
 recovery.conf, I think.

Since we haven't yet come up with a reasonable way of machine-editing
postgresql.conf, this seems like a fairly serious objection to getting
rid of recovery.conf.  I wonder if there's a way we can work around
that...

*thinks a little*

What if we modified pg_ctl to allow passing configuration parameters
through to postmaster, so you could do something like this:

pg_ctl start -c work_mem=8MB -c recovery_target_time='...'

Would that meet pgpool's needs?

(Sadly pg_ctl -c means something else right now, so we'd probably have
to pick another option name, but you get the idea.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-24 Thread Robert Haas
On Sat, Sep 24, 2011 at 9:45 AM, Martijn van Oosterhout
klep...@svana.org wrote:
 I think memory accesses are also fantastically expensive, so it's worth
 some effort to optimise that.

This is definitely true.

 I found the Linux kernel document on this topic quite readable. I think
 the main lesson here is that processors track data dependencies (other
 than the Alpha, apparently), but not control dependencies.  So in the
 example, the value of i is dependent on num_items, but not via any
 calculation.  That control dependencies are not tracked makes some
 sense, since branches depend on the flag bits, and just about any
 calculation changes the flag bits, but most of the time these changes
 are not used.

Oh, that's interesting.  So that implies that a read-barrier would be
needed here even on non-Alpha.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Large C files

2011-09-24 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 Actually, I believe that the *main* problem with pgrminclude is that
 it fails to account for combinations of build options other than those
 that Bruce uses.  In the previous go-round, the reason we were still
 squashing bugs months later is that it took that long for people to
  notice and complain "hey, compiling with LOCK_DEBUG no longer works",
 or various other odd build options that the buildfarm doesn't exercise.
 I have 100% faith that we'll be squashing some bugs like that ... very
 possibly, the exact same ones as five years ago ... over the next few
 months.  Peter's proposed tool would catch issues like the CppAsString2

 The new code removes #ifdef markers so all code is compiled, or the file
 is skipped if it can't be compiled.  That should avoid this problem.

It avoids it at a very large cost, namely skipping all the files where
it's not possible to compile each arm of every #if on the machine being
used.  I do not think that's a solution, just a band-aid; for instance,
won't it prevent include optimization in every file that contains even
one #ifdef WIN32?  Or what about files in which there are #if blocks
that each define the same function, constant table, etc?

The right solution would involve testing each #if block under the
conditions in which it was *meant* to be compiled.

regards, tom lane



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 ... The time to replace it is now and I
 welcome that day and have already agreed to it.

Okay, so you do agree that eventually we want to be rid of
recovery.conf?  I think everyone else agrees on that.  But if we are
going to remove recovery.conf eventually, what is the benefit of
postponing doing so?  The pain is going to be inflicted sooner or later,
and the longer we wait, the more third-party code there is likely to be
that expects it to exist.  If optionally reading it helped provide a
smoother transition, then maybe there would be some point.  But AFAICS
having a temporary third behavior will just make things even more
complicated, not less so, for third-party code that needs to cope with
multiple versions.

 The semantics are clear: recovery.conf is read first, then
 postgresql.conf. It's easy to implement (1 line of code) and easy to
 understand.

It's not clear to me why the override order should be that way; I'd have
expected the other way myself.  So this isn't as open-and-shut as you
think.

regards, tom lane



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Sep 23, 2011 at 6:55 PM, Tatsuo Ishii is...@postgresql.org wrote:
  I'm not sure what you mean by "not deal with", but part of pgpool-II's
  functionality assumes that we can easily generate recovery.conf. If
  recovery.conf is integrated into postgresql.conf, we need to edit
  postgresql.conf, which is a little bit harder than generating
  recovery.conf, I think.

 Since we haven't yet come up with a reasonable way of machine-editing
 postgresql.conf, this seems like a fairly serious objection to getting
 rid of recovery.conf.

I don't exactly buy this argument.  If postgresql.conf is hard to
machine-edit, why is recovery.conf any easier?

 What if we modified pg_ctl to allow passing configuration parameters
 through to postmaster,

You mean like pg_ctl -o?

regards, tom lane



Re: [HACKERS] Large C files

2011-09-24 Thread Peter Geoghegan
On 24 September 2011 16:41, Tom Lane t...@sss.pgh.pa.us wrote:
 Frankly, with the tool in its current state I'd rather not run it at
 all, ever.  The value per man-hour expended is too low.  The mess it
 made out of the xlog-related includes this time around makes me question
 whether it's even a net benefit, regardless of whether it can be
 guaranteed not to break things.  Fundamentally, there's a large
 component of design judgment/taste in the question of which header files
 should include which others, but this tool does not have any taste.

I agree. If this worked well in a semi-automated fashion, there'd be
some other open source tool already available for us to use. As far as
I know, there isn't. As we work around pgrminclude's bugs, its
benefits become increasingly small and hard to quantify.

If we're not going to use it, it should be removed from the tree.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-24 Thread Kerem Kat
On Sat, Sep 24, 2011 at 19:51, Tom Lane t...@sss.pgh.pa.us wrote:
 Kerem Kat kerem...@gmail.com writes:
 In the parser, while analyzing the SetOperationStmt, larg and rarg need to be
 transformed as subqueries. SetOperationStmt can have two fields representing
 larg and rarg with their columns projected according to corresponding:
 larg_corresponding and
 rarg_corresponding.

 Why?  CORRESPONDING at a given set-operation level doesn't affect either
 sub-query, so I don't see why you'd need a different representation for
 the sub-queries.


In the planner, to construct a subquery out of a SetOperationStmt or
RangeTblRef, a new RangeTblRef is needed.
To create a RangeTblRef, parser state is needed, and the planner assumes
that root->parse->rtable is not modified
after generating simple_rte_array.

SELECT a,b,c FROM t is larg
SELECT a,b FROM (SELECT a,b,c FROM t) is larg_corresponding
SELECT d,a,b FROM t is rarg
SELECT a,b FROM (SELECT d,a,b FROM t); is rarg_corresponding

In the planner, choose the _corresponding ones if the query has corresponding.

SELECT a,b FROM (SELECT a,b,c FROM t)
UNION
SELECT a,b FROM (SELECT d,a,b FROM t);



 Obviously, that logic doesn't work at all for CORRESPONDING, so you'll
 need to have a separate code path to deduce the output column list in
 that case.

 If the output column list is to be determined at that stage, it needs to
 be filtered and ordered.
 In that case, aren't we breaking the argument about not modifying the
 user's query?

 No.  All that you're doing is correctly computing the lists of the
 set-operation's output column types (and probably names too).  These are
 internal details that needn't be examined when printing the query, so
 they won't affect ruleutils.c.

 note: I am new to this list, am I asking too much detail?

 Well, I am beginning to wonder if you should choose a smaller project
 for your first venture into patching Postgres.



regards,

Kerem KAT



Re: [HACKERS] Large C files

2011-09-24 Thread Andrew Dunstan



On 09/24/2011 01:10 PM, Peter Geoghegan wrote:

On 24 September 2011 16:41, Tom Lane t...@sss.pgh.pa.us wrote:

Frankly, with the tool in its current state I'd rather not run it at
all, ever.  The value per man-hour expended is too low.  The mess it
made out of the xlog-related includes this time around makes me question
whether it's even a net benefit, regardless of whether it can be
guaranteed not to break things.  Fundamentally, there's a large
component of design judgment/taste in the question of which header files
should include which others, but this tool does not have any taste.

I agree. If this worked well in a semi-automated fashion, there'd be
some other open source tool already available for us to use. As far as
I know, there isn't. As we work around pgrminclude's bugs, its
benefits become increasingly small and hard to quantify.

If we're not going to use it, it should be removed from the tree.



Yeah, I've always been dubious about the actual benefit.  At best this 
can be seen as identifying some candidates for pruning, but as an 
automated tool I'm inclined to write it off as a failed experiment.


cheers

andrew



Re: [HACKERS] Large C files

2011-09-24 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Tom Lane wrote:
  Actually, I believe that the *main* problem with pgrminclude is that
  it fails to account for combinations of build options other than those
  that Bruce uses.  In the previous go-round, the reason we were still
  squashing bugs months later is that it took that long for people to
   notice and complain "hey, compiling with LOCK_DEBUG no longer works",
  or various other odd build options that the buildfarm doesn't exercise.
  I have 100% faith that we'll be squashing some bugs like that ... very
  possibly, the exact same ones as five years ago ... over the next few
  months.  Peter's proposed tool would catch issues like the CppAsString2
 
  The new code removes #ifdef markers so all code is compiled, or the file
  is skipped if it can't be compiled.  That should avoid this problem.
 
 It avoids it at a very large cost, namely skipping all the files where
 it's not possible to compile each arm of every #if on the machine being
 used.  I do not think that's a solution, just a band-aid; for instance,
 won't it prevent include optimization in every file that contains even
 one #ifdef WIN32?  Or what about files in which there are #if blocks
 that each define the same function, constant table, etc?
 
 The right solution would involve testing each #if block under the
 conditions in which it was *meant* to be compiled.

Right.  It is under the "better than nothing" category, which is better
than nothing (not running it).  ;-)

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-24 Thread Martijn van Oosterhout
On Sat, Sep 24, 2011 at 12:46:48PM -0400, Robert Haas wrote:
  I found the Linux kernel document on this topic quite readable. I think
  the main lesson here is that processors track data dependencies (other
  than the Alpha, apparently), but not control dependencies.  So in the
  example, the value of i is dependent on num_items, but not via any
  calculation.  That control dependencies are not tracked makes some
  sense, since branches depend on the flag bits, and just about any
  calculation changes the flag bits, but most of the time these changes
  are not used.
 
 Oh, that's interesting.  So that implies that a read-barrier would be
 needed here even on non-Alpha.

That is my understanding. At the source code level the address being
referenced is dependent on i, but at the assembly level it's possible i has
been optimised away altogether.

I think the relevant example is here:
http://www.mjmwired.net/kernel/Documentation/memory-barriers.txt (line 725)

Where A = q->items[0] and B = q->num_items.

There is no data dependency here, so inserting such a barrier won't
help. You need a normal read barrier.

OTOH, if the list already has an entry in it, the problem (probably)
goes away, although with loop unrolling you can't really be sure.
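
To make the pattern concrete, here is an illustrative sketch (not actual
Postgres code): the pg_read_barrier()/pg_write_barrier() names stand for the
barrier primitives being discussed in this thread, mapped to a GCC builtin
only so the example is self-contained.

/* assumption: map the barrier names to GCC's full barrier builtin */
#define pg_write_barrier()  __sync_synchronize()
#define pg_read_barrier()   __sync_synchronize()

typedef struct
{
    volatile int num_items;
    int          items[16];
} item_queue;

/* single writer: publish the item before advertising it via num_items */
static void
enqueue(item_queue *q, int value)
{
    q->items[q->num_items] = value;
    pg_write_barrier();
    q->num_items++;
}

/* reader: the if() gives only a control dependency, which does not order
 * the load of items[0] after the load of num_items */
static int
try_read_first(item_queue *q, int *value)
{
    if (q->num_items == 0)
        return 0;
    pg_read_barrier();          /* needed even on non-Alpha, as noted above */
    *value = q->items[0];
    return 1;
}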

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Robert Haas
On Sep 24, 2011, at 1:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I don't exactly buy this argument.  If postgresql.conf is hard to
 machine-edit, why is recovery.conf any easier?

Because you generally just write a brand-new file, without worrying about 
preserving existing settings. You aren't really editing at all, just writing.

 
 What if we modified pg_ctl to allow passing configuration parameters
 through to postmaster,
 
 You mean like pg_ctl -o?

Oh, cool. Yes, like that.

...Robert


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-24 Thread Jeff Janes
On Fri, Sep 9, 2011 at 2:51 AM, Marti Raudsepp ma...@juffo.org wrote:
 On Thu, Sep 8, 2011 at 03:22, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Sep 7, 2011 at 5:24 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 I remember we had bugs whereby an encoding conversion would fail,
 leading to elog trying to report this problem, but this attempt would
 also incur a conversion step, failing recursively until elog's stack got
 full.  I'm not saying this is impossible to solve, just something to
 keep in mind.

 Looking at elog.c, this only seems to apply to messages sent to the
 client from a backend connection. No conversion is done for log
 messages.

 Can we do something like: pass through ASCII characters unchanged, and
 output anything with the high-bit set as \x<hexdigit><hexdigit>?  That
 might be garbled in some cases, but the goal here is not perfection.
 We're just trying to give the admin (or PostgreSQL-guru-for-hire) a
 clue where to start looking for the problem.

 Or we might just replace them with '?'. This has the advantage of not
 expanding query length 4x if it does happen to be corrupted. The vast
 majority of queries are ASCII-only anyway.

Should this patch be reviewed as is, or will the substitution of
non-ASCII be implemented?

It seems like everyone agrees that this feature is wanted, but Tom is
still very much opposed to the general approach to implement it, as
being too dangerous.
Is it the reviewer's job to try to convince him otherwise?

Thanks,

Jeff



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Simon Riggs
On Sat, Sep 24, 2011 at 6:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 ... The time to replace it is now and I
 welcome that day and have already agreed to it.

 Okay, so you do agree that eventually we want to be rid of
 recovery.conf?  I think everyone else agrees on that.  But if we are
 going to remove recovery.conf eventually, what is the benefit of
 postponing doing so?

I am happy for us not to recommend the use of recovery.conf in the
future, and for the handling of the contents of recovery.conf to be
identical to the handling of postgresql.conf. The latter point has
always been the plan from day one.

(I should not have used it in my sentence, since doing so always
leads to confusion)

My joyous rush into agreeing to removal has since been replaced with
the cold reality that we must support backwards compatibility.
Emphasise must.


 The semantics are clear: recovery.conf is read first, then
 postgresql.conf. It's easy to implement (1 line of code) and easy to
 understand.

 It's not clear to me why the override order should be that way; I'd have
 expected the other way myself.  So this isn't as open-and-shut as you
 think.

I agree with you that recovery.conf as an override makes more sense,
though I am happy with either way.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Satisfy extension dependency by one of multiple extensions

2011-09-24 Thread Dimitri Fontaine
Yeb Havinga yebhavi...@gmail.com writes:
 We might want to have a system where an extension can declare that it
 provides capabilities, and then have another extension require those
 capabilities. That would be a neater solution to the case that there are
 multiple extensions that all provide the same capability.

+1

 Yes that would be neater. I can't think of anything else, however, than to
 add an 'extprovides' column to pg_extension, fill it with an explicit
 'provides' from the control file when present, or with extname otherwise,
 and use that column to check the 'requires' list at extension creation time.

That sounds like a good rough plan.

Then we need to think about maintenance down the road; some releases
from now we will need more features around the same topic.  The Debian
control file also has Conflicts and Replaces entries, so that using the
three of them you can handle a smooth upgrade even when an extension has
changed its name or has been superseded by a new one, which often has the
advantage of being maintained.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-24 Thread Robert Haas
On Sat, Sep 24, 2011 at 1:51 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 It seems like everyone agrees that this feature is wanted, but Tom is
 still very much opposed to the general approach to implement it, as
 being too dangerous.
 Is it the reviewer's job to try to convince him otherwise?

It's not the reviewer's job to convince Tom of anything in particular,
but I think it's helpful for them to state their opinion, whatever it
may be (agreeing with Tom, disagreeing with Tom, or whatever).

IMHO, the most compelling argument against the OP's approach made so
far is the encoding issue.  I was hoping someone (maybe the OP?) would
have an opinion on that, an idea how to work around it, or something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory

2011-09-24 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Basically, for every unlogged table, you get an empty _init fork, and
 for every index of an unlogged table, you get an _init fork
 initialized to an empty index.  The _init forks are copied over the
 main forks by the startup process before entering normal running.

Let's call that metadata.

 Well, we could certainly Decree From On High that the _init forks are
 all going to be stored under $PGDATA rather than in the tablespace
 directories.  That would make things simple.  Of course, it also means

And now you need to associate a non-volatile tablespace in which to store
the metadata of your volatile tablespace, where you want to store the
unlogged data.  And we already have a default tablespace, of course.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Joshua Berkus

 Since we haven't yet come up with a reasonable way of machine-editing
 postgresql.conf, this seems like a fairly serious objection to getting
 rid of recovery.conf.  I wonder if there's a way we can work around
 that...

Well, we *did* actually come up with a reasonable way, but it died under an 
avalanche of bikeshedding and 
we-must-do-everything-the-way-we-always-have-done.  I refer, of course, to 
the configuration directory patch, which was a fine solution, and would 
indeed take care of the recovery.conf issues as well had we implemented it.  We 
can *still* implement it, for 9.2.
 
 pg_ctl start -c work_mem=8MB -c recovery_target_time='...'

This wouldn't survive a restart, and isn't compatible with init scripts.



Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-24 Thread Marti Raudsepp
On Sat, Sep 24, 2011 at 22:23, Robert Haas robertmh...@gmail.com wrote:
 It's not the reviewer's job to convince Tom of anything in particular,
 but I think it's helpful for them to state their opinion, whatever it
 may be (agreeing with Tom, disagreeing with Tom, or whatever).

My opinion is that this can be made safe enough and I explained why I
think so here: http://archives.postgresql.org/pgsql-hackers/2011-09/msg00308.php

Launching another process to read 1kB out of shared memory and print
it to the log sounds like overkill. But if that's deemed necessary, I'm
willing to code it up too.

However, I now realize that it does make sense to write a separate,
simpler function for the crashed-backend case with no
vbeentry->st_changecount check loops, no checkUser, etc. That would be
more robust and easier to review.

I'll try to send a new patch implementing this in a few days.

 IMHO, the most compelling argument against the OP's approach made so
 far is the encoding issue.  I was hoping someone (maybe the OP?) would
 have an opinion on that, an idea how to work around it, or something.

I proposed replacing non-ASCII characters with '?' earlier. That would
be simpler to code, but obviously wouldn't preserve non-ASCII
characters in case the crash has anything to do with those. Since
nobody else weighed in on the '\x##' vs '?' choice, I didn't implement
it yet; but I will in my next submission.
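
For illustration, a minimal sketch of that substitution (the function name
and calling convention here are assumptions, not taken from the actual
patch): ASCII bytes are copied through unchanged, and any byte with the high
bit set becomes \xNN, so the logged string stays pure ASCII no matter how
corrupt the original encoding is.

#include <stdio.h>

static void
copy_query_ascii_safe(char *dst, size_t dstlen, const char *src)
{
    const unsigned char *p = (const unsigned char *) src;
    size_t  j = 0;

    if (dstlen == 0)
        return;

    for (; *p != '\0'; p++)
    {
        if (*p < 0x80)
        {
            if (j + 1 >= dstlen)
                break;          /* keep room for the terminator */
            dst[j++] = (char) *p;
        }
        else
        {
            if (j + 4 >= dstlen)
                break;
            /* the alternative proposal is simply dst[j++] = '?' */
            snprintf(dst + j, dstlen - j, "\\x%02X", (unsigned int) *p);
            j += 4;
        }
    }
    dst[j] = '\0';
}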

Regards,
Marti



Re: [HACKERS] Satisfy extension dependency by one of multiple extensions

2011-09-24 Thread Joshua Berkus
All,

  We might want to have a system where an extension can declare that it
  provides capabilities, and then have another extension require those
  capabilities. That would be a neater solution to the case that there are
  multiple extensions that all provide the same capability.
 
 +1

As a warning, this is the sort of thing which DEB and RPM have spent years 
implementing ... and still have problems with.  Not that we shouldn't do it, 
but we should be prepared for the amount of troubleshooting involved, which 
will be considerable.

--Josh Berkus





Re: [HACKERS] psql setenv command

2011-09-24 Thread Jeff Janes
On Thu, Sep 15, 2011 at 7:46 AM, Andrew Dunstan and...@dunslane.net wrote:
 On Thu, September 15, 2011 10:44 am, Andrew Dunstan wrote:

 As discussed, patch attached.



 this time with patch.

Hi Andrew,

A description of the \setenv command should show up in the output of \?.

Should there be a regression test for this?  I'm not sure how it would
work, as I don't see a cross-platform way to see what the variable is
set to.

Thanks,

Jeff



Re: [HACKERS] posix_fadvsise in base backups

2011-09-24 Thread Cédric Villemain
2011/9/24 Andres Freund and...@anarazel.de:
 Hi,

 On Saturday, September 24, 2011 05:16:48 PM Magnus Hagander wrote:
 On Sat, Sep 24, 2011 at 17:14, Andres Freund and...@anarazel.de wrote:
  On Saturday, September 24, 2011 05:08:17 PM Magnus Hagander wrote:
  Attached patch adds a simple call to posix_fadvise with
  POSIX_FADV_DONTNEED on all the files being read when doing a base
  backup, to help the kernel not to trash the filesystem cache.
  Seems like a simple enough fix - in fact, I don't remember why I took
  it out of the original patch :O
  Any reason not to put this in? Is it even safe enough to put into 9.1
  (probably not, but maybe?)
  Won't that possibly throw a formerly fully cached database out of the
  cache?
 I was assuming the kernel was smart enough to read this as "*this*
 process is not going to be using this file anymore", not "nobody in
 the whole machine is going to use this file anymore". And the process
 running the base backup is certainly not going to read it again.
 But that's a good point - do you know if that is the case, or does it
 mandate more testing?
 I am pretty but not totally sure that the kernel does not track each process
 that uses a page. For one, doing so would probably be prohibitively expensive. For
 another I am pretty (but not ...) sure that I restructured an application not
 to fadvise(DONTNEED) memory that is also used in other processes.

DONTNEED will remove pages from the cache. It may happen that it doesn't
(DONTNEED and WILLNEED are just hints, but DONTNEED is honored most of
the time).
You can check the mincore status of a page beforehand to decide whether
you need to remove it afterwards (this is what some modified dd variants do).
You can also use pgfincore before/after the base backup to recover the
previous state of the page cache.
There are some ideas floating around pgfincore to do seqscans (pg_dump)
with less impact on the page cache this way (probably possible with
ExecStart/Stop hooks).
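
For illustration, a rough sketch of that mincore-then-DONTNEED idea
(a hypothetical helper, assuming Linux-style mincore()/posix_fadvise()
semantics; not the proposed patch): remember which pages were already
resident before the backup reads the file, and afterwards drop only the
pages the backup itself pulled into the cache.

#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

static void
send_file_preserving_cache(const char *path)
{
    int          fd = open(path, O_RDONLY);
    struct stat  st;
    long         pagesize = sysconf(_SC_PAGESIZE);
    unsigned char *resident = NULL;
    void        *map = MAP_FAILED;
    size_t       npages = 0;

    if (fd < 0 || fstat(fd, &st) < 0 || st.st_size == 0)
        goto done;

    npages = ((size_t) st.st_size + pagesize - 1) / pagesize;
    resident = malloc(npages);
    map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);

    /* snapshot which pages are already in the OS page cache */
    if (resident == NULL || map == MAP_FAILED ||
        mincore(map, st.st_size, resident) != 0)
    {
        free(resident);
        resident = NULL;        /* fall back to dropping everything */
    }

    /* ... read the file and ship it to the backup client here ... */

    /* drop only the pages that were not cached before the backup */
    for (size_t i = 0; i < npages; i++)
    {
        if (resident == NULL || (resident[i] & 1) == 0)
            posix_fadvise(fd, (off_t) (i * pagesize), pagesize,
                          POSIX_FADV_DONTNEED);
    }

done:
    if (map != MAP_FAILED)
        munmap(map, st.st_size);
    free(resident);
    if (fd >= 0)
        close(fd);
}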


 Currently I can only think of two workarounds, both OS specific:
 - Use O_DIRECT for reading the base backup. It will be slow in fully cached
 situations, but should work well enough in all others. Need to be careful about
 the usual O_DIRECT pitfalls (page size, alignment, etcetera).
 - Use mmap/mincore() to determine whether data is in the cache and restore that state
 afterwards.

 Too bad that POSIX_FADV_NOREUSE is not really implemented.

yes.



 Andres





-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation



Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-24 Thread Noah Misch
On Fri, Sep 23, 2011 at 06:25:01PM -0400, Robert Haas wrote:
 On Mon, Sep 12, 2011 at 3:31 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
  Part-1 implements the corresponding SQL syntax: the security_barrier
  reloption of views, and the LEAKPROOF option on creation of functions,
  stored in the new pg_proc.proleakproof field.
 
 The way you have this implemented, we just blow away all view options
 whenever we do CREATE OR REPLACE VIEW.  Is that the behavior we want?
 If a security_barrier view gets accidentally turned into a
 non-security_barrier view, doesn't that create a security_hole?

I think CREATE OR REPLACE needs to keep meaning just that, never becoming
"replace some characteristics, merge others".  The consequence is less than
delightful here, but I don't have an idea that avoids this problem without
running afoul of some previously-raised design constraint.




Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load

2011-09-24 Thread Daniel Farina
On Fri, Sep 23, 2011 at 9:45 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Sep 23, 2011 at 11:43 AM, Aidan Van Dyk ai...@highrise.ca wrote:
 On Fri, Sep 23, 2011 at 4:41 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:

 Unfortunately, it's impossible, because the error message "Could not read
 from file pg_clog/0001 at offset 32768: Success" is shown (and startup
 aborted) before the "redo starts at" message arrives.

 It looks to me that pg_clog/0001 exists, but it is shorter than recovery
 expects. Which shouldn't happen, of course, because the start-backup
 checkpoint should flush all the clog that's needed by recovery to disk
 before the backup procedure begins to copy them.

 I think the point here is that recovery *never starts*.  Something in
 the standby startup is looking for a value in a clog block that
 recovery hadn't had a chance to replay (produce) yet.

 Ah.  I think you are right - Heikki made the same point.  Maybe some
 of the stuff that happens just after this comment:

        /*
         * Initialize for Hot Standby, if enabled. We won't let backends in
         * yet, not until we've reached the min recovery point specified in
         * control file and we've established a recovery snapshot from a
         * running-xacts WAL record.
         */


 ...actually needs to be postponed until after we've reached consistency?

We have a number of backups that are like this, and the problem is
entirely reproducible for those.  We always get around it by disabling
hot standby for a while (until consistency is reached).  I poked at
xlog.c a bit, and to me it seems entirely likely that StartupCLOG is
being called early -- way too early, or at least parts of it.
Presumably(?) it is being called so early in the hot standby path so
that the status of transactions can be known for the purposes of
querying, but it's happening before consistency is reached, ergo not
many invariants (outside of checkpointed things like pg_controldata)
are likely to hold...such as clog being the right size to locate the
transaction status of a page.

Anyway, sorry for dropping the ball on pushing that one; we've been
using this workaround for a while after taking a look at the mechanism
and deciding it was probably not a problem (except for a sound night's
sleep).  We've now seen this dozens of times.

-- 
fdr



Re: [HACKERS] posix_fadvsise in base backups

2011-09-24 Thread Greg Stark
On Sat, Sep 24, 2011 at 4:16 PM, Magnus Hagander mag...@hagander.net wrote:
 I was assuming the kernel was smart enough to read this as *this*
 process is not going to be using this file anymore, not nobody in
 the whole machine is going to use this file anymore. And the process
 running the base backup is certainly not going to read it again.

 But that's a good point - do you know if that is the case, or does it
 mandate more testing?

It's not the case on Linux. I used to use DONTNEED to flush pages from
cache before running a benchmark. I verified with mincore that the
pages were actually getting removed from cache. Sometimes there was
the occasional straggler but nearly all got flushed and after a second
or third pass the stragglers were gone too.

In case you're wondering, this was because using /proc/.../drop_caches
caused flaky benchmarks. My theory was that it was causing pages of
the executable to trigger page faults in the middle of the benchmark.



-- 
greg
