Re: [HACKERS] posix_fadvise in base backups

2011-09-24 Thread Greg Stark
On Sat, Sep 24, 2011 at 4:16 PM, Magnus Hagander  wrote:
> I was assuming the kernel was smart enough to read this as "*this*
> process is not going to be using this file anymore", not "nobody in
> the whole machine is going to use this file anymore". And the process
> running the base backup is certainly not going to read it again.
>
> But that's a good point - do you know if that is the case, or does it
> mandate more testing?

It's not the case on Linux. I used to use DONTNEED to flush pages from
cache before running a benchmark. I verified with mincore that the
pages were actually getting removed from cache. Sometimes there was
the occasional straggler but nearly all got flushed and after a second
or third pass the stragglers were gone too.

In case you're wondering, this was because using /proc/.../drop_caches
caused flaky benchmarks. My theory was that it was causing pages of
the executable to trigger page faults in the middle of the benchmark.
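Roughly speaking, the flush-and-verify step looks like the sketch below (written
from scratch here, not the actual script I used; "datafile" is a placeholder
path, error handling is minimal, and it is Linux-specific because of mincore()):

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void)
{
    int fd = open("datafile", O_RDONLY);
    struct stat st;

    if (fd < 0 || fstat(fd, &st) < 0 || st.st_size == 0)
        return 1;

    /* ask the kernel to drop the whole file from the page cache */
    (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

    /* map the file and ask which of its pages are still resident */
    long pagesize = sysconf(_SC_PAGESIZE);
    size_t npages = (st.st_size + pagesize - 1) / pagesize;
    unsigned char *vec = malloc(npages);
    void *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    size_t resident = 0;

    if (vec && p != MAP_FAILED && mincore(p, st.st_size, vec) == 0)
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;

    printf("%zu of %zu pages still in the page cache\n", resident, npages);
    return 0;
}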



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load

2011-09-24 Thread Daniel Farina
On Fri, Sep 23, 2011 at 9:45 AM, Robert Haas  wrote:
> On Fri, Sep 23, 2011 at 11:43 AM, Aidan Van Dyk  wrote:
>> On Fri, Sep 23, 2011 at 4:41 AM, Heikki Linnakangas
>>  wrote:
>>
 Unfortunately, it's impossible, because the error message "Could not read
 from file "pg_clog/0001" at offset 32768: Success" is shown (and startup
 aborted) before the "redo starts at" message ever appears.
>>>
>>> It looks to me like pg_clog/0001 exists, but it is shorter than recovery
>>> expects. Which shouldn't happen, of course, because the start-backup
>>> checkpoint should flush all the clog that's needed by recovery to disk
>>> before the backup procedure begins to copy them.
>>
>> I think the point here is that recovery *never starts*.  Something in
>> the standby startup is looking for a value in a clog block that
>> recovery hadn't had a chance to replay (produce) yet.
>
> Ah.  I think you are right - Heikki made the same point.  Maybe some
> of the stuff that happens just after this comment:
>
>        /*
>         * Initialize for Hot Standby, if enabled. We won't let backends in
>         * yet, not until we've reached the min recovery point specified in
>         * control file and we've established a recovery snapshot from a
>         * running-xacts WAL record.
>         */
>
>
> ...actually needs to be postponed until after we've reached consistency?

We have a number of backups that are like this, and the problem is
entirely reproducible for those.  We always get around it by disabling
hot standby for a while (until consistency is reached).  I poked at
xlog.c a bit, and to me it seems entirely likely that StartupCLOG is
being called early -- way too early, or at least parts of it.
Presumably(?) it is being called so early in the hot standby path so
that the status of transactions can be known for the purposes of
querying, but it's happening before consistency is reached, ergo not
many invariants (outside of checkpointed things like pg_controldata)
are likely to hold...such as clog being large enough to contain the
page holding a transaction's status.

Anyway, sorry for dropping the ball on pushing that one; we've been
using this workaround for a while after taking a look at the mechanism
and deciding it was probably not a problem (except for a sound night's
sleep).  We've now seen this dozens of times.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-24 Thread Noah Misch
On Fri, Sep 23, 2011 at 06:25:01PM -0400, Robert Haas wrote:
> On Mon, Sep 12, 2011 at 3:31 PM, Kohei KaiGai  wrote:
> > The Part-1 implements the corresponding SQL syntax: the "security_barrier"
> > reloption of views, and the "LEAKPROOF" option on function creation, which
> > is stored in the new pg_proc.proleakproof field.
> 
> The way you have this implemented, we just blow away all view options
> whenever we do CREATE OR REPLACE VIEW.  Is that the behavior we want?
> If a security_barrier view gets accidentally turned into a
> non-security_barrier view, doesn't that create a security_hole?

I think CREATE OR REPLACE needs to keep meaning just that, never becoming
"replace some characteristics, merge others".  The consequence is less than
delightful here, but I don't have an idea that avoids this problem without
running afoul of some previously-raised design constraint.




Re: [HACKERS] posix_fadvise in base backups

2011-09-24 Thread Cédric Villemain
2011/9/24 Andres Freund :
> Hi,
>
> On Saturday, September 24, 2011 05:16:48 PM Magnus Hagander wrote:
>> On Sat, Sep 24, 2011 at 17:14, Andres Freund  wrote:
>> > On Saturday, September 24, 2011 05:08:17 PM Magnus Hagander wrote:
>> >> Attached patch adds a simple call to posix_fadvise with
>> >> POSIX_FADV_DONTNEED on all the files being read when doing a base
>> >> backup, to help the kernel not to trash the filesystem cache.
>> >> Seems like a simple enough fix - in fact, I don't remember why I took
>> >> it out of the original patch :O
>> >> Any reason not to put this in? Is it even safe enough to put into 9.1
>> >> (probably not, but maybe?)
>> > Won't that possibly throw a formerly fully cached database out of the
>> > cache?
>> I was assuming the kernel was smart enough to read this as "*this*
>> process is not going to be using this file anymore", not "nobody in
>> the whole machine is going to use this file anymore". And the process
>> running the base backup is certainly not going to read it again.
>> But that's a good point - do you know if that is the case, or does it
>> mandate more testing?
> I am pretty but not totally sure that the kernel does not track each process
> that uses a page. For one doing so would probably be prohibitively expensive. For
> another I am pretty (but not ...) sure that I restructured an application not
> to fadvise(DONTNEED) memory that is also used in other processes.

DONTNEED will remove pages from cache. It may happen that it doesn't
(DONTNEED and WILLNEED are just advisory flags, but DONTNEED is honored
most of the time).
You can either read the mincore status of a page ahead of time to decide
whether you need to remove it afterwards (this is what some modified dd
implementations do).
You can also use pgfincore before/after the base backup to recover the
previous state of the page cache.
There are some ideas floating around pgfincore for doing seqscans (pg_dump)
with less impact on the page cache this way. (probably possible with
ExecStart/Stop hooks)
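
To illustrate the save/restore idea, here is a hand-written sketch (not
pgfincore code; error handling and batching of contiguous ranges into single
fadvise calls are left out):

#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/mman.h>

/* remember which pages of an open file are resident before the backup */
static unsigned char *
snapshot_residency(int fd, size_t len)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    size_t npages = (len + pagesz - 1) / pagesz;
    unsigned char *vec = calloc(npages, 1);
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

    if (vec && p != MAP_FAILED)
    {
        (void) mincore(p, len, vec);   /* bit 0 set => page was cached */
        munmap(p, len);
    }
    return vec;
}

/* after the backup: evict only the pages that were not cached before */
static void
evict_new_pages(int fd, size_t len, const unsigned char *before)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    size_t npages = (len + pagesz - 1) / pagesz;

    for (size_t i = 0; i < npages; i++)
        if (!(before[i] & 1))
            (void) posix_fadvise(fd, (off_t) (i * pagesz), pagesz,
                                 POSIX_FADV_DONTNEED);
}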

>
> Currently I can only think of two workarounds, both OS-specific:
> - Use O_DIRECT for reading the base backup. Will be slow in fully cached
> situations, but should work ok enough in all others. Need to be careful about
> the usual O_DIRECT pitfalls (pagesize, alignment etcetera).
> - use mmap/mincore() to gather whether data is in cache and restore that state
> afterwards.
>
> Too bad that POSIX_FADV_NOREUSE is not really implemented.

yes.

>
>
> Andres
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql setenv command

2011-09-24 Thread Jeff Janes
On Thu, Sep 15, 2011 at 7:46 AM, Andrew Dunstan  wrote:
> On Thu, September 15, 2011 10:44 am, Andrew Dunstan wrote:
>>
>> As discussed, patch attached.
>>
>
>
> this time with patch.

Hi Andrew,

A description of the \setenv command should show up in the output of \?.

Should there be a regression test for this?  I'm not sure how it would
work, as I don't see a cross-platform way to see what the variable is
set to.

Thanks,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Satisfy extension dependency by one of multiple extensions

2011-09-24 Thread Joshua Berkus
All,

> >> We might want to have a system where an extension can declare that
> >> it
> >> "provides" capabilites, and then have another extension "require"
> >> those
> >> capabilities. That would be a neater solution to the case that
> >> there are
> >> multiple extensions that all provide the same capability.
> 
> +1

As a warning, this is the sort of thing which DEB and RPM have spent years 
implementing ... and still have problems with.  Not that we shouldn't do it, 
but we should be prepared for the amount of troubleshooting involved, which 
will be considerable.

--Josh Berkus



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-24 Thread Marti Raudsepp
On Sat, Sep 24, 2011 at 22:23, Robert Haas  wrote:
> It's not the reviewer's job to convince Tom of anything in particular,
> but I think it's helpful for them to state their opinion, whatever it
> may be (agreeing with Tom, disagreeing with Tom, or whatever).

My opinion is that this can be made safe enough and I explained why I
think so here: http://archives.postgresql.org/pgsql-hackers/2011-09/msg00308.php

Launching another process to read 1kB out of shared memory and print
it to log sounds like overkill. But if that's deemed necessary, I'm
willing to code it up too.

However, I now realize that it does make sense to write a separate
simpler function for the crashed backend case with no
vbeentry->st_changecount check loops, no checkUser, etc. That would be
more robust and easier to review.

I'll try to send a new patch implementing this in a few days.

> IMHO, the most compelling argument against the OP's approach made so
> far is the encoding issue.  I was hoping someone (maybe the OP?) would
> have an opinion on that, an idea how to work around it, or something.

I proposed replacing non-ASCII characters with '?' earlier. That would
be simpler to code, but obviously wouldn't preserve non-ASCII
characters in case the crash has anything to do with those. Since
nobody else weighed in on the '\x##' vs '?' choice, I didn't implement
it yet; but I will in my next submission.
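
To be concrete, the '?' variant I have in mind is roughly the following
(names and details are illustrative only, not the actual patch):

#include <stddef.h>

/*
 * Copy the (possibly corrupt) activity string into dst, replacing every
 * byte that is not printable ASCII with '?', so the result is safe to
 * write to the log regardless of the backend's encoding.
 */
static void
ascii_safe_copy(char *dst, const char *src, size_t dstlen)
{
    size_t i;

    if (dstlen == 0)
        return;

    for (i = 0; i + 1 < dstlen && src[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char) src[i];

        dst[i] = (c >= 32 && c < 127) ? (char) c : '?';
    }
    dst[i] = '\0';
}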

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Joshua Berkus

> Since we haven't yet come up with a reasonable way of machine-editing
> postgresql.conf, this seems like a fairly serious objection to
> getting
> rid of recovery.conf.  I wonder if there's a way we can work around
> that...

Well, we *did* actually come up with a reasonable way, but it died under an 
avalanche of bikeshedding and 
"we-must-do-everything-the-way-we-always-have-done".  I refer, of course, to 
the "configuration directory" patch, which was a fine solution, and would 
indeed take care of the recovery.conf issues as well had we implemented it.  We 
can *still* implement it, for 9.2.
 
> pg_ctl start -c work_mem=8MB -c recovery_target_time='...'

This wouldn't survive a restart, and isn't compatible with init scripts.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory

2011-09-24 Thread Dimitri Fontaine
Robert Haas  writes:
> Basically, for every unlogged table, you get an empty _init fork, and
> for every index of an unlogged table, you get an _init fork
> initialized to an empty index.  The _init forks are copied over the
> main forks by the startup process before entering normal running.

Let's call that metadata.

> Well, we could certainly Decree From On High that the _init forks are
> all going to be stored under $PGDATA rather than in the tablespace
> directories.  That would make things simple.  Of course, it also means

And now you need to associate a non-volatile tablespace in which to store
the metadata of the volatile tablespace where you want to keep your
unlogged data.  And we already have a default tablespace, of course.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-24 Thread Robert Haas
On Sat, Sep 24, 2011 at 1:51 PM, Jeff Janes  wrote:
> It seems like everyone agrees that this feature is wanted, but Tom is
> still very much opposed to the general approach to implement it, as
> being too dangerous.
> Is it the reviewer's job to try to convince him otherwise?

It's not the reviewer's job to convince Tom of anything in particular,
but I think it's helpful for them to state their opinion, whatever it
may be (agreeing with Tom, disagreeing with Tom, or whatever).

IMHO, the most compelling argument against the OP's approach made so
far is the encoding issue.  I was hoping someone (maybe the OP?) would
have an opinion on that, an idea how to work around it, or something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Satisfy extension dependency by one of multiple extensions

2011-09-24 Thread Dimitri Fontaine
Yeb Havinga  writes:
>> We might want to have a system where an extension can declare that it
>> "provides" capabilites, and then have another extension "require" those
>> capabilities. That would be a neater solution to the case that there are
>> multiple extensions that all provide the same capability.

+1

> Yes that would be neater. I can't think of anything else, however, than to add
> 'extprovides' to pg_extension, fill it with an explicit 'provides' from the
> control file when present, or extname otherwise, and use that column to
> check the 'requires' list at extension creation time.

That sounds like a good rough plan.

Then we need to think about maintenance down the road; some releases
from now we will need more features around the same topic.  The Debian
control file also has Conflicts and Replaces entries, so that using the
three of them you can handle a smooth upgrade even when an extension has
changed its name or has been superseded by a new one (which often has the
advantage of still being maintained).

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Simon Riggs
On Sat, Sep 24, 2011 at 6:01 PM, Tom Lane  wrote:
> Simon Riggs  writes:
>> ... The time to replace it is now and I
>> welcome that day and have already agreed to it.
>
> Okay, so you do agree that eventually we want to be rid of
> recovery.conf?  I think everyone else agrees on that.  But if we are
> going to remove recovery.conf eventually, what is the benefit of
> postponing doing so?

I am happy that we don't recommend the use of recovery.conf in the
future, and that the handling of the contents of recovery.conf should
be identical to the handling of postgresql.conf. The latter point has
always been the plan from day one.

(I should not have used "it" in my sentence, since doing so always
leads to confusion)

My joyous rush into agreeing to removal has since been replaced with
the cold reality that we must support backwards compatibility.
Emphasise "must".


>> The semantics are clear: recovery.conf is read first, then
>> postgresql.conf. It's easy to implement (1 line of code) and easy to
>> understand.
>
> It's not clear to me why the override order should be that way; I'd have
> expected the other way myself.  So this isn't as open-and-shut as you
> think.

I agree with you that recovery.conf as an override makes more sense,
though I am happy either way.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query (activity string)

2011-09-24 Thread Jeff Janes
On Fri, Sep 9, 2011 at 2:51 AM, Marti Raudsepp  wrote:
> On Thu, Sep 8, 2011 at 03:22, Robert Haas  wrote:
>> On Wed, Sep 7, 2011 at 5:24 PM, Alvaro Herrera
>>  wrote:
>>> I remember we had bugs whereby an encoding conversion would fail,
>>> leading to elog trying to report this problem, but this attempt would
>>> also incur a conversion step, failing recursively until elog's stack got
>>> full.  I'm not saying this is impossible to solve, just something to
>>> keep in mind.
>
> Looking at elog.c, this only seems to apply to messages sent to the
> client from a backend connection. No conversion is done for log
> messages.
>
>> Can we do something like: pass through ASCII characters unchanged, and
>> output anything with the high-bit set as \x?  That
>> might be garbled in some cases, but the goal here is not perfection.
>> We're just trying to give the admin (or PostgreSQL-guru-for-hire) a
>> clue where to start looking for the problem.
>
> Or we might just replace them with '?'. This has the advantage of not
> expanding query length 4x if it does happen to be corrupted. The vast
> majority of queries are ASCII-only anyway.

Should this patch be reviewed as is, or will the substitution of
non-ASCII be implemented?

It seems like everyone agrees that this feature is wanted, but Tom is
still very much opposed to the general approach to implement it, as
being too dangerous.
Is it the reviewer's job to try to convince him otherwise?

Thanks,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Robert Haas
On Sep 24, 2011, at 1:04 PM, Tom Lane  wrote:
> I don't exactly buy this argument.  If postgresql.conf is hard to
> machine-edit, why is recovery.conf any easier?

Because you generally just write a brand-new file, without worrying about 
preserving existing settings. You aren't really editing at all, just writing.
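
For example, a failover tool typically emits the whole file in one go, along
the lines of (values made up):

standby_mode = 'on'
primary_conninfo = 'host=192.0.2.10 port=5432 user=replicator'
trigger_file = '/tmp/pgsql.trigger'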

> 
>> What if we modified pg_ctl to allow passing configuration parameters
>> through to postmaster,
> 
> You mean like pg_ctl -o?

Oh, cool. Yes, like that.

...Robert
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-24 Thread Martijn van Oosterhout
On Sat, Sep 24, 2011 at 12:46:48PM -0400, Robert Haas wrote:
> > I found the Linux kernel document on this topic quite readable. I think
> > the main lesson here is that processors track data dependencies (other
> > than the Alpha apparently), but not control dependencies.  So in the
> > example, the value of i is dependent on num_items, but not via any
> > calculation.  That control dependencies are not tracked makes some
> > sense, since branches depend on flag bits, and just about any
> > calculation changes the flag bits, but most of the time these changes
> > are not used.
> 
> Oh, that's interesting.  So that implies that a read-barrier would be
> needed here even on non-Alpha.

That is my understanding. At the source code level the address being
referenced is dependent on i, but at the assembly level it's possible i has
been optimised away altogether.

I think the relevent example is here:
http://www.mjmwired.net/kernel/Documentation/memory-barriers.txt (line 725)

Where A = q->items[0] and B = q->num_items.

There is no data dependency here, so inserting such a barrier won't
help. You need a normal read barrier.
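
In code form, the pattern at issue is roughly this (a sketch only;
read_barrier()/write_barrier() are placeholders for whatever primitives end up
being added, not existing macros):

typedef struct
{
    int num_items;
    int items[8];
} queue;

/* hypothetical barrier primitives -- placeholders, not existing macros */
extern void write_barrier(void);
extern void read_barrier(void);

void
writer(queue *q, int x)
{
    q->items[0] = x;
    write_barrier();            /* publish the item before the count */
    q->num_items = 1;
}

int
reader(queue *q)
{
    if (q->num_items > 0)       /* load B */
    {
        /*
         * Only a control dependency separates this branch from the load
         * below, and the CPU does not honour control dependencies, so an
         * explicit read barrier is needed here even on non-Alpha hardware.
         */
        read_barrier();
        return q->items[0];     /* load A */
    }
    return -1;                  /* nothing queued */
}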

OTOH, if the list already has an entry in it, the problem (probably)
goes away, although with loop unrolling you can't really be sure.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] Large C files

2011-09-24 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian  writes:
> > Tom Lane wrote:
> >> Actually, I believe that the *main* problem with pgrminclude is that
> >> it fails to account for combinations of build options other than those
> >> that Bruce uses.  In the previous go-round, the reason we were still
> >> squashing bugs months later is that it took that long for people to
> >> notice and complain "hey, compiling with LOCK_DEBUG no longer works",
> >> or various other odd build options that the buildfarm doesn't exercise.
> >> I have 100% faith that we'll be squashing some bugs like that ... very
> >> possibly, the exact same ones as five years ago ... over the next few
> >> months.  Peter's proposed tool would catch issues like the CppAsString2
> 
> > The new code removes #ifdef markers so all code is compiled, or the file
> > is skipped if it can't be compiled.  That should avoid this problem.
> 
> It avoids it at a very large cost, namely skipping all the files where
> it's not possible to compile each arm of every #if on the machine being
> used.  I do not think that's a solution, just a band-aid; for instance,
> won't it prevent include optimization in every file that contains even
> one #ifdef WIN32?  Or what about files in which there are #if blocks
> that each define the same function, constant table, etc?
> 
> The right solution would involve testing each #if block under the
> conditions in which it was *meant* to be compiled.

Right.  It is under the "better than nothing" category, which is better
than nothing (not running it).  ;-)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-24 Thread Andrew Dunstan



On 09/24/2011 01:10 PM, Peter Geoghegan wrote:

> On 24 September 2011 16:41, Tom Lane  wrote:
>> Frankly, with the tool in its current state I'd rather not run it at
>> all, ever.  The value per man-hour expended is too low.  The mess it
>> made out of the xlog-related includes this time around makes me question
>> whether it's even a net benefit, regardless of whether it can be
>> guaranteed not to break things.  Fundamentally, there's a large
>> component of design judgment/taste in the question of which header files
>> should include which others, but this tool does not have any taste.
>
> I agree. If this worked well in a semi-automated fashion, there'd be
> some other open source tool already available for us to use. As far as
> I know, there isn't. As we work around pgrminclude's bugs, its
> benefits become increasingly small and hard to quantify.
>
> If we're not going to use it, it should be removed from the tree.



Yeah, I've always been dubious about the actual benefit.  At best this 
can be seen as identifying some candidates for pruning, but as an 
automated tool I'm inclined to write it off as a failed experiment.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-24 Thread Kerem Kat
On Sat, Sep 24, 2011 at 19:51, Tom Lane  wrote:
> Kerem Kat  writes:
>> In the parser while analyzing SetOperationStmt, larg and rarg need to be
>> transformed as subqueries. SetOperationStmt can have two fields representing
>> larg and rarg with projected columns according to corresponding:
>> larg_corresponding,
>> rarg_corresponding.
>
> Why?  CORRESPONDING at a given set-operation level doesn't affect either
> sub-query, so I don't see why you'd need a different representation for
> the sub-queries.
>

In the planner, to construct a subquery out of a SetOperationStmt or
RangeTblRef, a new RangeTblRef is needed.
To create a RangeTblRef, parser state is needed, and the planner assumes
root->parse->rtable is not modified
after generating simple_rte_array.

SELECT a,b,c FROM t is larg
SELECT a,b FROM (SELECT a,b,c FROM t) is larg_corresponding
SELECT d,a,b FROM t is rarg
SELECT a,b FROM (SELECT d,a,b FROM t); is rarg_corresponding

In the planner, choose the _corresponding ones if the query has CORRESPONDING.

SELECT a,b FROM (SELECT a,b,c FROM t)
UNION
SELECT a,b FROM (SELECT d,a,b FROM t);



>>> Obviously, that logic doesn't work at all for CORRESPONDING, so you'll
>>> need to have a separate code path to deduce the output column list in
>>> that case.
>
>> If the output column list is to be determined at that stage, it needs to
>> be filtered and ordered.
>> In that case aren't we breaking the rule of not modifying the user's query?
>
> No.  All that you're doing is correctly computing the lists of the
> set-operation's output column types (and probably names too).  These are
> internal details that needn't be examined when printing the query, so
> they won't affect ruleutils.c.
>
>> note: I am new to this list, am I asking for too much detail?
>
> Well, I am beginning to wonder if you should choose a smaller project
> for your first venture into patching Postgres.
>


regards,

Kerem KAT

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-24 Thread Peter Geoghegan
On 24 September 2011 16:41, Tom Lane  wrote:
> Frankly, with the tool in its current state I'd rather not run it at
> all, ever.  The value per man-hour expended is too low.  The mess it
> made out of the xlog-related includes this time around makes me question
> whether it's even a net benefit, regardless of whether it can be
> guaranteed not to break things.  Fundamentally, there's a large
> component of design judgment/taste in the question of which header files
> should include which others, but this tool does not have any taste.

I agree. If this worked well in a semi-automated fashion, there'd be
some other open source tool already available for us to use. As far as
I know, there isn't. As we work around pgrminclude's bugs, its
benefits become increasingly small and hard to quantify.

If we're not going to use it, it should be removed from the tree.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Tom Lane
Robert Haas  writes:
> On Fri, Sep 23, 2011 at 6:55 PM, Tatsuo Ishii  wrote:
>> I'm not sure what you mean by "not deal with" but part of pgpool-II's
>> functionality assumes that we can easily generate recovery.conf. If
>> recovery.conf is integrated into postgresql.conf, we need to edit
>> postgresql.conf, which is a little bit harder than generating
>> recovery.conf, I think.

> Since we haven't yet come up with a reasonable way of machine-editing
> postgresql.conf, this seems like a fairly serious objection to getting
> rid of recovery.conf.

I don't exactly buy this argument.  If postgresql.conf is hard to
machine-edit, why is recovery.conf any easier?

> What if we modified pg_ctl to allow passing configuration parameters
> through to postmaster,

You mean like pg_ctl -o?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Tom Lane
Simon Riggs  writes:
> ... The time to replace it is now and I
> welcome that day and have already agreed to it.

Okay, so you do agree that eventually we want to be rid of
recovery.conf?  I think everyone else agrees on that.  But if we are
going to remove recovery.conf eventually, what is the benefit of
postponing doing so?  The pain is going to be inflicted sooner or later,
and the longer we wait, the more third-party code there is likely to be
that expects it to exist.  If optionally reading it helped provide a
smoother transition, then maybe there would be some point.  But AFAICS
having a temporary third behavior will just make things even more
complicated, not less so, for third-party code that needs to cope with
multiple versions.

> The semantics are clear: recovery.conf is read first, then
> postgresql.conf. It's easy to implement (1 line of code) and easy to
> understand.

It's not clear to me why the override order should be that way; I'd have
expected the other way myself.  So this isn't as open-and-shut as you
think.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-24 Thread Tom Lane
Bruce Momjian  writes:
> Tom Lane wrote:
>> Actually, I believe that the *main* problem with pgrminclude is that
>> it fails to account for combinations of build options other than those
>> that Bruce uses.  In the previous go-round, the reason we were still
>> squashing bugs months later is that it took that long for people to
>> notice and complain "hey, compiling with LOCK_DEBUG no longer works",
>> or various other odd build options that the buildfarm doesn't exercise.
>> I have 100% faith that we'll be squashing some bugs like that ... very
>> possibly, the exact same ones as five years ago ... over the next few
>> months.  Peter's proposed tool would catch issues like the CppAsString2

> The new code removes #ifdef markers so all code is compiled, or the file
> is skipped if it can't be compiled.  That should avoid this problem.

It avoids it at a very large cost, namely skipping all the files where
it's not possible to compile each arm of every #if on the machine being
used.  I do not think that's a solution, just a band-aid; for instance,
won't it prevent include optimization in every file that contains even
one #ifdef WIN32?  Or what about files in which there are #if blocks
that each define the same function, constant table, etc?

The right solution would involve testing each #if block under the
conditions in which it was *meant* to be compiled.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-24 Thread Robert Haas
On Sat, Sep 24, 2011 at 9:45 AM, Martijn van Oosterhout
 wrote:
> I think memory accesses are also fantastically expensive, so it's worth
> some effort to optimise that.

This is definitely true.

> I found the Linux kernel document on this topic quite readable. I think
> the main lesson here is that processors track data dependencies (other
> than the Alpha apparently), but not control dependencies.  So in the
> example, the value of i is dependent on num_items, but not via any
> calculation.  That control dependencies are not tracked makes some
> sense, since branches depend on flag bits, and just about any
> calculation changes the flag bits, but most of the time these changes
> are not used.

Oh, that's interesting.  So that implies that a read-barrier would be
needed here even on non-Alpha.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Robert Haas
On Fri, Sep 23, 2011 at 6:55 PM, Tatsuo Ishii  wrote:
>>> There are many. Tools I can name include pgpool, 2warm, PITRtools, but
>>> there are also various tools from Sun, an IBM reseller I have
>>> forgotten the name of, OmniTI and various other backup software
>>> providers. Those are just the ones I can recall quickly. We've
>>> encouraged people to write software on top and they have done so.
>>
>> Actually, just to correct this list:
>> * there are no tools from Sun
>> * pgPool2 does not deal with recovery.conf
>
> I'm not sure what you mean by "not deal with" but part of pgpool-II's
> functionality assumes that we can easily generate recovery.conf. If
> recovery.conf is integrated into postgresql.conf, we need to edit
> postgresql.conf, which is a little bit harder than generating
> recovery.conf, I think.

Since we haven't yet come up with a reasonable way of machine-editing
postgresql.conf, this seems like a fairly serious objection to getting
rid of recovery.conf.  I wonder if there's a way we can work around
that...

*thinks a little*

What if we modified pg_ctl to allow passing configuration parameters
through to postmaster, so you could do something like this:

pg_ctl start -c work_mem=8MB -c recovery_target_time='...'

Would that meet pgpool's needs?

(Sadly pg_ctl -c means something else right now, so we'd probably have
to pick another option name, but you get the idea.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] posix_fadvise in base backups

2011-09-24 Thread Andres Freund
Hi,

On Saturday, September 24, 2011 05:16:48 PM Magnus Hagander wrote:
> On Sat, Sep 24, 2011 at 17:14, Andres Freund  wrote:
> > On Saturday, September 24, 2011 05:08:17 PM Magnus Hagander wrote:
> >> Attached patch adds a simple call to posix_fadvise with
> >> POSIX_FADV_DONTNEED on all the files being read when doing a base
> >> backup, to help the kernel not to trash the filesystem cache.
> >> Seems like a simple enough fix - in fact, I don't remember why I took
> >> it out of the original patch :O
> >> Any reason not to put this in? Is it even safe enough to put into 9.1
> >> (probably not, but maybe?)
> > Won't that possibly throw a formerly fully cached database out of the
> > cache?
> I was assuming the kernel was smart enough to read this as "*this*
> process is not going to be using this file anymore", not "nobody in
> the whole machine is going to use this file anymore". And the process
> running the base backup is certainly not going to read it again.
> But that's a good point - do you know if that is the case, or does it
> mandate more testing?
I am pretty but not totally sure that the kernel does not track each process 
that uses a page. For one doing so would probably be prohibitively expensive. For 
another I am pretty (but not ...) sure that I restructured an application not 
to fadvise(DONTNEED) memory that is also used in other processes.

Currently I can only think of two workarounds, both OS-specific:
- Use O_DIRECT for reading the base backup. Will be slow in fully cached 
situations, but should work ok enough in all others. Need to be careful about 
the usual O_DIRECT pitfalls (pagesize, alignment etcetera).
- use mmap/mincore() to gather whether data is in cache and restore that state 
afterwards.

Too bad that POSIX_FADV_NOREUSE is not really implemented.


Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-24 Thread Kerem Kat
On Sat, Sep 24, 2011 at 18:49, Tom Lane  wrote:
>
> Kerem Kat  writes:
> > There is a catch inserting subqueries for corresponding in the planner.
> > The parser expects to see an equal number of columns on both sides of the
> > UNION query. If there is CORRESPONDING, however, we cannot guarantee that.
>
> Well, you certainly need the parse analysis code to be aware of
> CORRESPONDING's effects.  But I think you can confine the changes to
> adjusting the computation of a SetOperationStmt's list of output column
> types.  It might be a good idea to also add a list of output column
> names to SetOperationStmt, and get rid of the logic that digs down into
> the child queries when we need to know the output column names.
>

In the parser while analyzing SetOperationStmt, larg and rarg need to be
transformed as subqueries. SetOperationStmt can have two fields representing
larg and rarg with projected columns according to corresponding:
larg_corresponding,
rarg_corresponding.

The planner uses the _corresponding ones if the query is a CORRESPONDING
query; the view-definition generator uses larg and rarg, which represent
the query the user entered.

Comments?


> > Target columns, collations and types for the SetOperationStmt are
> > determined in the parser. If we pass the column number equality checks,
> > it is not clear how one would proceed with the targetlist generation
> > loop, which is a forboth over the two tables' columns.
>
> Obviously, that logic doesn't work at all for CORRESPONDING, so you'll
> need to have a separate code path to deduce the output column list in
> that case.
>

If the output column list is to be determined at that stage, it needs to
be filtered and ordered.
In that case aren't we breaking the rule of not modifying the user's query?

note: I am new to this list, am I asking for too much detail?

regards,

Kerem KAT

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Robert Haas
On Sat, Sep 24, 2011 at 9:02 AM, Simon Riggs  wrote:
> The semantics are clear: recovery.conf is read first, then
> postgresql.conf. It's easy to implement (1 line of code) and easy to
> understand.

Eh, well, if you can implement it in one line of code, consider my
objection withdrawn.  I can't see how that would be possible, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-24 Thread Bruce Momjian
Tom Lane wrote:
> Actually, I believe that the *main* problem with pgrminclude is that
> it fails to account for combinations of build options other than those
> that Bruce uses.  In the previous go-round, the reason we were still
> squashing bugs months later is that it took that long for people to
> notice and complain "hey, compiling with LOCK_DEBUG no longer works",
> or various other odd build options that the buildfarm doesn't exercise.
> I have 100% faith that we'll be squashing some bugs like that ... very
> possibly, the exact same ones as five years ago ... over the next few
> months.  Peter's proposed tool would catch issues like the CppAsString2

The new code removes #ifdef markers so all code is compiled, or the file
is skipped if it can't be compiled.  That should avoid this problem.

> failure, but it's still only going to exercise the build options you're
> testing with.
> 
> If we could get pgrminclude up to a similar level of reliability as
> pgindent, I'd be for running it on every cycle.  But I'm not sure that
> the current approach to it is capable even in theory of getting to "it
> just works" reliability.  I'm also not impressed at all by the hack of
> avoiding problems by excluding entire files from the processing ---
> what's the point of having the tool then?

We remove the includes we safely can, and skip the others --- the
benefit is only for the files we can process.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-24 Thread Tom Lane
Kerem Kat  writes:
> There is a catch inserting subqueries for corresponding in the planner.
> The parser expects to see an equal number of columns on both sides of the
> UNION query. If there is CORRESPONDING, however, we cannot guarantee that.

Well, you certainly need the parse analysis code to be aware of
CORRESPONDING's effects.  But I think you can confine the changes to
adjusting the computation of a SetOperationStmt's list of output column
types.  It might be a good idea to also add a list of output column
names to SetOperationStmt, and get rid of the logic that digs down into
the child queries when we need to know the output column names.

> Target columns, collations and types for the SetOperationStmt are
> determined in the parser. If we pass the column number equality checks,
> it is not clear how one would proceed with the targetlist generation
> loop, which is a forboth over the two tables' columns.

Obviously, that logic doesn't work at all for CORRESPONDING, so you'll
need to have a separate code path to deduce the output column list in
that case.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-24 Thread Tom Lane
Bruce Momjian  writes:
> Robert Haas wrote:
>> I'm also curious to see how much more fallout we're going to see from
>> that run.  We had a few glitches when it was first done, but it didn't
>> seem like they were really all that bad.  It might be that we'd be
>> better off running pgrminclude a lot *more* often (like once a cycle,
>> or even after every CommitFest), because the scope of the changes
>> would then be far smaller and we wouldn't be dealing with 5 years of
>> accumulated cruft all at once; we'd also get a lot more experience
>> with what works or does not work with the script, which might lead to
>> improvements in that script on a less-than-geologic time scale.

> Interesting idea.  pgrminclude has three main problems: [ snipped ]

Actually, I believe that the *main* problem with pgrminclude is that
it fails to account for combinations of build options other than those
that Bruce uses.  In the previous go-round, the reason we were still
squashing bugs months later is that it took that long for people to
notice and complain "hey, compiling with LOCK_DEBUG no longer works",
or various other odd build options that the buildfarm doesn't exercise.
I have 100% faith that we'll be squashing some bugs like that ... very
possibly, the exact same ones as five years ago ... over the next few
months.  Peter's proposed tool would catch issues like the CppAsString2
failure, but it's still only going to exercise the build options you're
testing with.

If we could get pgrminclude up to a similar level of reliability as
pgindent, I'd be for running it on every cycle.  But I'm not sure that
the current approach to it is capable even in theory of getting to "it
just works" reliability.  I'm also not impressed at all by the hack of
avoiding problems by excluding entire files from the processing ---
what's the point of having the tool then?

> I think we determined that the best risk/reward is to run it every five
> years.

Frankly, with the tool in its current state I'd rather not run it at
all, ever.  The value per man-hour expended is too low.  The mess it
made out of the xlog-related includes this time around makes me question
whether it's even a net benefit, regardless of whether it can be
guaranteed not to break things.  Fundamentally, there's a large
component of design judgment/taste in the question of which header files
should include which others, but this tool does not have any taste.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] posix_fadvise in base backups

2011-09-24 Thread Magnus Hagander
On Sat, Sep 24, 2011 at 17:14, Andres Freund  wrote:
> Hi,
>
> On Saturday, September 24, 2011 05:08:17 PM Magnus Hagander wrote:
>> Attached patch adds a simple call to posix_fadvise with
>> POSIX_FADV_DONTNEED on all the files being read when doing a base
>> backup, to help the kernel not to trash the filesystem cache.
>>
>> Seems like a simple enough fix - in fact, I don't remember why I took
>> it out of the original patch :O
>>
>> Any reason not to put this in? Is it even safe enough to put into 9.1
>> (probably not, but maybe?)
> Won't that possibly throw a formerly fully cached database out of the cache?

I was assuming the kernel was smart enough to read this as "*this*
process is not going to be using this file anymore", not "nobody in
the whole machine is going to use this file anymore". And the process
running the base backup is certainly not going to read it again.

But that's a good point - do you know if that is the case, or does it
mandate more testing?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] posix_fadvise in base backups

2011-09-24 Thread Andres Freund
Hi,

On Saturday, September 24, 2011 05:08:17 PM Magnus Hagander wrote:
> Attached patch adds a simple call to posix_fadvise with
> POSIX_FADV_DONTNEED on all the files being read when doing a base
> backup, to help the kernel not to trash the filesystem cache.
> 
> Seems like a simple enough fix - in fact, I don't remember why I took
> it out of the original patch :O
> 
> Any reason not to put this in? Is it even safe enough to put into 9.1
> (probably not, but maybe?)
Won't that possibly throw a formerly fully cached database out of the cache?

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] posix_fadvise in base backups

2011-09-24 Thread Magnus Hagander
Attached patch adds a simple call to posix_fadvise with
POSIX_FADV_DONTNEED on all the files being read when doing a base
backup, to help the kernel not to trash the filesystem cache.

Seems like a simple enough fix - in fact, I don't remember why I took
it out of the original patch :O

Any reason not to put this in? Is it even safe enough to put into 9.1
(probably not, but maybe?)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 4841095..54c4d13 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -781,6 +781,15 @@ sendFile(char *readfilename, char *tarfilename, struct stat * statbuf)
 		pq_putmessage('d', buf, pad);
 	}
 
+	/*
+	 * If we have posix_fadvise(), send a note to the kernel that we are not
+	 * going to need this data anytime soon, so that it can be discarded
+	 * from the filesystem cache.
+	 */
+#if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
+	(void) posix_fadvise(fileno(fp), 0, 0, POSIX_FADV_DONTNEED);
+#endif
+
 	FreeFile(fp);
 }
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Tatsuo Ishii
>> I'm not sure what you mean by "not deal with" but part of pgpool-II's
>> functionality assumes that we can easily generate recovery.conf. If
>> recovery.conf is integrated into postgresql.conf, we need to edit
>> postgresql.conf, which is a little bit harder than generating
>> recovery.conf, I think.
> 
> Oh?  Clearly I've been abusing pgPool2 then.  Where's the code that
> generates that?

pgpool-II itself does not generate the file, but scripts for pgpool-II
generate the file, as stated in the documentation coming with
pgpool-II.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - symmetric

2011-09-24 Thread Bruce Momjian
Jeff Davis wrote:
> On Tue, 2011-09-13 at 12:34 -0400, Christopher Browne wrote:
> > > select int4range(5,2);
> > > ERROR:  range lower bound must be less than or equal to range upper bound
> > >
> > > Of course, I won't argue this is a bug, but I was wondering if it
> > > wouldn't be handy to allow a 'symmetric' mode in range construction,
> > > where, if the first of the pair is higher than the second, they are
> > > automatically swapped, similar to SYMMETRIC in the BETWEEN clause.
> 
> ...
> 
> > If you have a computation that gets a "backwards" range, then it is
> > more than possible that what you've got isn't an error of getting the
> > range backwards, but rather the error that your data is
> > overconstraining, and that you don't actually have a legitimate range.
> 
> Agreed. On balance, it's just as likely that you miss an error as save a
> few keystrokes.
> 
> I'll add that it would also cause a little confusion with inclusivity.
> What if you do: '[5,2)'::int4range? Is that really '[2,5)' or '(2,5]'?

Reminder:  BETWEEN supports the SYMMETRIC keyword, so there is
a precedent for this.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fix for pg_upgrade

2011-09-24 Thread Bruce Momjian
\panam wrote:
> Hi, just tried to upgrade from 9.0 to 9.1 and got this error during
> pg_upgrade:
> Mismatch of relation id: database "xyz", old relid 465783, new relid 16494
> It seems I get this error on every table, as I got it on another table
> (which I did not need and deleted) before as well. Schemas seem to be
> migrated but the content is missing.
> 
> I am using Windows 7 64bit (both PG servers are 64 bit as well), everything
> on the same machine.

Sorry for the delay in replying.  It is odd you got a mismatch of relids,
because pg_upgrade is supposed to preserve all of those.  Can you run a
query to find out which table has relid 465783 in the old cluster?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What Would You Like To Do?

2011-09-24 Thread Andrew Dunstan



On 09/24/2011 09:51 AM, Bruce Momjian wrote:

> Joshua D. Drake wrote:
>> On 09/13/2011 11:51 AM, Michael Nolan wrote:
>>>
>>> The ability to restore a table from a backup file to a different
>>> table name in the same database and schema.
>>>
>>> This can be done but agreed it is not intuitive.
>>>
>>> Can you elaborate on that a bit, please?  The only way I've been able to
>>> do it is to edit the dump file to change the table name.  That's not
>>> very practical with a several gigabyte dump file, even less so with one
>>> that is much larger.  If this capability already exists, is it documented?
>>
>> You use the -Fc method, extract the TOC and edit just the TOC (so you
>> don't have to edit a multi-gig file)
>
> How does that work in practice?  You dump the TOC, edit it, restore the
> TOC schema definition, then how do you restore the data to the renamed
> table?




How do you extract the TOC at all? There are no tools for manipulating 
the TOC that I know of, and I'm not sure we should provide any. It's not 
documented, it's a purely internal artefact. The closest thing we have 
to being able to manipulate it is --list/--use-list, and those are 
useless for this purpose. So this method description does not compute 
for me either.


+1 for providing a way to restore an object to a different object name.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What Would You Like To Do?

2011-09-24 Thread Bruce Momjian
Joshua D. Drake wrote:
> 
> On 09/13/2011 11:51 AM, Michael Nolan wrote:
> 
> >
> > The ability to restore a table from a backup file to a different
> > table
> > name in the same database and schema.
> >
> >
> > This can be done but agreed it is not intuitive.
> >
> >
> > Can you elaborate on that a bit, please?  The only way I've been able to
> > do it is to edit the dump file to change the table name.  That's not
> > very practical with a several gigabyte dump file, even less so with one
> > that is much larger.  If this capability already exists, is it documented?
> 
> You use the -Fc method, extract the TOC and edit just the TOC (so you 
> don't have to edit a multi-gig file)

How does that work in practice?  You dump the TOC, edit it, restore the
TOC schema definition, then how do you restore the data to the renamed
table?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-24 Thread Martijn van Oosterhout
On Fri, Sep 23, 2011 at 04:22:09PM +0100, Greg Stark wrote:
> So you have two memory fetches which I guess I still imagine have to
> be initiated in the right order but they're both in flight at the same
> time. I have no idea how the memory controller works and I could
> easily imagine either one grabbing the value before the other.

That's easy. If one is in cache and the other isn't then the results
will come back out of order.

> I'm not even clear how other processors can reasonably avoid this. It
> must be fantastically expensive to keep track of which branch
> predictions depended on which registers and which memory fetches the
> value of those registers depended on. And then it would have  to
> somehow inform the memory controller of those old memory fetches that
> this new memory fetch is dependent on and have it ensure that the
> fetches are read the right order?

I think memory accesses are also fantastically expensive, so it's worth
some effort to optimise that.

I found the Linux kernel document on this topic quite readable. I think
the main lesson here is that processors track data dependencies (other
than the Alpha apparently), but not control dependencies.  So in the
example, the value of i is dependent on num_items, but not via any
calculation.  That control dependencies are not tracked makes some
sense, since branches depend on flag bits, and just about any
calculation changes the flag bits, but most of the time these changes
are not used.

It is also not a question of knowing the address either, since the
first load, if any, will be *q->items, irrespective of the precise
value of num_items.  This address may be calculated long in advance.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] Large C files

2011-09-24 Thread Bruce Momjian
Robert Haas wrote:
> On Fri, Sep 9, 2011 at 11:28 PM, Peter Geoghegan  
> wrote:
> > It's very difficult or impossible to anticipate how effective the tool
> > will be in practice, but when you consider that it works and does not
> > produce false positives for the first 3 real-world cases tested, it
> > seems reasonable to assume that it's at least worth having around. Tom
> > recently said of a previous pgrminclude campaign in July 2006 that "It
> > took us two weeks to mostly recover, but we were still dealing with
> > some fallout in December". I think that makes the case for adding this
> > tool or some refinement as a complement to pgrminclude in src/tools
> > fairly compelling.
> 
> I'm not opposed to adding something like this, but I think it needs to
> either be tied into the actual running of the script, or have a lot
> more documentation than it does now, or both.  I am possibly stupid,
> but I can't understand from reading the script (or, honestly, the
> thread) exactly what kind of pgrminclude-induced errors this is
> protecting against; but even if we clarify that, it seems like it
> would be a lot of work to run it manually on all the files that might
> be affected by a pgrminclude run, unless we can somehow automate that.
> 
> I'm also curious to see how much more fallout we're going to see from
> that run.  We had a few glitches when it was first done, but it didn't
> seem like they were really all that bad.  It might be that we'd be
> better off running pgrminclude a lot *more* often (like once a cycle,
> or even after every CommitFest), because the scope of the changes
> would then be far smaller and we wouldn't be dealing with 5 years of
> accumulated cruft all at once; we'd also get a lot more experience
> with what works or does not work with the script, which might lead to
> improvements in that script on a less-than-geologic time scale.

Interesting idea.  pgrminclude has three main problems:

o  defines --- to prevent removing an include that is referenced only in
an #ifdef block that is not compiled, the script strips the ifdef blocks;
that in turn might cause the file not to compile, in which case the file
is skipped.

o  CPP expansion --- we found that CppAsString2() relies on CPP
stringification, which throws no error if the symbol does not exist
(perhaps because the #define was lost when an include was removed).
This is the problem we had with tablespace directory names, and the
script now checks for CppAsString2 and friends and skips the file
(a minimal sketch of this failure mode appears below).

o  imbalance of includes --- pgrminclude can remove includes that are
not strictly required by a file, but that can force other files to add
many more includes of their own.  This is the include-imbalance problem
Tom found.

The submitted patch to compare object files only catches the second
problem we had.  
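
As a concrete illustration of that second failure mode, here is a
minimal, self-contained sketch.  Only the CppAsString/CppAsString2 macro
idiom mirrors what the backend uses; the symbol name here is made up.
If the header that was supposed to define the symbol has been removed,
the file still compiles, but the macro silently stringifies the bare
token instead of its value:

#include <stdio.h>

#define CppAsString(identifier) #identifier
#define CppAsString2(x) CppAsString(x)

/*
 * Some header would normally provide, say:
 *     #define TBLSPC_VERSION_DIR v92_dir
 * If pgrminclude removes that header, nothing fails at compile time...
 */

int
main(void)
{
    /* ...but this prints "TBLSPC_VERSION_DIR", not the intended value. */
    printf("%s\n", CppAsString2(TBLSPC_VERSION_DIR));
    return 0;
}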

I think we determined that the best risk/reward is to run it every five
years.  The pgrminclude run removed 627 include references, which is
about 0.06% of our 1,077,878 lines of C code.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Simon Riggs
On Fri, Sep 23, 2011 at 6:17 PM, Robert Haas  wrote:
> On Fri, Sep 23, 2011 at 12:51 PM, Josh Berkus  wrote:
>> I'm happy to make upgrades easier, but I want a path which eventually
>> ends in recovery.conf going away.  It's a bad API, confuses our users,
>> and is difficult to support and maintain.
>
> I agree.
>
> GUC = Grand Unified Configuration, but recovery.conf is an example of
> where it's not so unified after all.  We've already done a non-trivial
> amount of work to allow recovery.conf values to be specified without
> quotes, a random incompatibility with GUCs that resulted from having
> different parsing code for each file.  If that were the last issue,
> then maybe it wouldn't be worth worrying about, but it's not.  For
> example, it would be nice to have reload behavior on SIGHUP.
...
> I don't want us
> to have to implement such things separately for postgresql.conf and
> recovery.conf.

It was always my plan to do exactly the above, and there are code
comments from 2004 that say so. The time to replace it is now; I
welcome that and have already agreed to it.

We all want everything quoted above; none of it is under debate.


> And we
> keep talking about having an ALTER SYSTEM SET guc = value or SET
> PERMANENT guc = value command, and I think ALTER SYSTEM SET
> recovery_target_time = '...' would be pretty sweet.  I don't want us
> to have to implement such things separately for postgresql.conf and
> recovery.conf.

There is a reason why it doesn't work that way which you overlook.
Please start a separate thread if you wish to discuss that.


> Now, it's true that Simon's proposal (of having recovery.conf
> automatically included) if it exists doesn't necessarily preclude
> those things.  But it seems to me that it is adding a lot of
> complexity to core for a pretty minimal benefit to end-users, and that
> the semantics are not going to end up being very clean.  For example,
> now you potentially have the situation where recovery.conf has
> work_mem=128MB and postgresql.conf has work_mem=4MB, and now when you
> end recovery you've got to make sure that everyone picks up the new
> setting.  Now, in some sense you could say that's a feature addition,
> and I'm not going to deny that it might be useful to some people, but
> I think it's also going to require a fairly substantial convolution of
> the GUC machinery, and it's going to discourage people from moving
> away from recovery.conf.  And like Josh, I think that ought to be the
> long-term goal, for the reasons he states.

The semantics are clear: recovery.conf is read first, then
postgresql.conf. It's easy to implement (1 line of code) and easy to
understand.

So we can support the old and the new very, very easily and clearly.
"Complexity" - no definitely not. "Minimal benefit for end users" -
backwards compatibility isn't minimal benefit. It's a major issue.

If you put things in two places, yes that causes problems. You can
already add the same parameter twice and cause exactly the same
problems.



> I don't want to go willy-nilly breaking third-party tools that work
> with PostgreSQL, but in this case I think that the reason there are so
> many tools in the first place is because what we're providing in core
> is not very good.  If we are unwilling to improve it for fear of
> breaking compatibility with the tools, then we are stuck.

No, there are many tools because there are many requirements. A
simple, open API has allowed our technology to be widely used. That
was by design not by chance.

Nobody is unwilling to improve it. The debate is about people being
unwilling to provide a simple, easy-to-understand backwards-compatibility
feature; leaving it out breaks things for no reason, while providing it
does not interfere with the proposed new features.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Simon Riggs
On Fri, Sep 23, 2011 at 5:51 PM, Josh Berkus  wrote:
> Simon,
>
>> There are many. Tools I can name include pgpool, 2warm, PITRtools, but
>> there are also various tools from Sun, an IBM reseller I have
>> forgotten the name of, OmniTI and various other backup software
>> providers. Those are just the ones I can recall quickly. We've
>> encouraged people to write software on top and they have done so.
>
> Actually, just to correct this list:
> * there are no tools from Sun

Just for the record, I sat through a one-hour presentation at the
PostgreSQL UK conference from a Sun employee describing the product
and its very clear use of PostgreSQL facilities. Josh definitely
wasn't at that presentation; many others here were.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present

2011-09-24 Thread Magnus Hagander
On Fri, Sep 23, 2011 at 16:44, Alvaro Herrera
 wrote:
>
> Excerpts from Magnus Hagander's message of vie sep 23 11:31:37 -0300 2011:
>>
>> On Fri, Sep 23, 2011 at 15:55, Alvaro Herrera
>>  wrote:
>
>> > This seems strange to me.  Why not have a second option to let the user
>> > indicate the desired SSL verification?
>> >
>> > sslmode=disable/allow/prefer/require
>> > sslverify=none/ca-if-present/ca/full
>> >
>> > (ca-if-present being the current "require" sslmode behavior).
>> >
>> > We could then deprecate sslmode=verify and verify-full and have them be
>> > synonyms of sslmode=require and corresponding sslverify.
>>
>> Hmm. I agree that the other suggestion was a bit weird, but I'm not
>> sure I like the multiple-options approach either. That's going to
>> require redesign of all software that deals with it at all today :S
>
> Why?  They could continue to use the existing options; or switch to the
> new options if they wanted different behavior, as is the case of the OP.

I guess. I was mostly thinking in terms of anything that has a
connection dialog like the one in pgAdmin, for example - which would
now suddenly need more than one dropdown box for what really should be
a simple setting. But I guess that can be considered a UI thing, and
such an application could just map a single dropdown to multiple
options in the connection string.


>> Maybe we should just update the docs and be done with it :-)
>
> That's another option, sure ... :-)

I've applied a docs fix for this now. We can keep discussing how to
make a more extensive fix in head :)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-09-24 Thread Hannu Krosing
On Wed, 2011-09-07 at 01:22 +0200, Jan Urbański wrote:
> On 07/09/11 01:13, Peter Geoghegan wrote:
> > On 6 September 2011 08:29, Peter Eisentraut  wrote:
> >> I was thinking about splitting up plpython.c, but it's not even on that
> >> list. ;-)
> > 
> > IIRC the obesity of that file is something that Jan Urbański intends
> > to fix, or is at least concerned about. Perhaps you should compare
> > notes.
> 
> Yeah, plpython.c could easily be split into submodules for builtin
> functions, SPI support, subtransaction support, traceback support, etc.
> 
> If there is consensus that this should be done, I'll see if I can find
> time to submit a giant-diff-no-behaviour-changes patch for the next CF.

+1 from me.

Would make working with it much nicer.

> Cheers,
> Jan
> 

-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] TABLE tab completion

2011-09-24 Thread Magnus Hagander
TABLE tab completion in psql only completes to tables, not views, but
the TABLE command works fine for both tables and views (and also
sequences).

Seems we should just complete it to relations and not tables - or can
anyone see a particular reason why we shouldn't?

Trivial patch attached.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4f7df36..8515c38 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -636,7 +636,7 @@ static const pgsql_thing_t words_after_create[] = {
 	{"SCHEMA", Query_for_list_of_schemas},
 	{"SEQUENCE", NULL, &Query_for_list_of_sequences},
 	{"SERVER", Query_for_list_of_servers},
-	{"TABLE", NULL, &Query_for_list_of_tables},
+	{"TABLE", NULL, &Query_for_list_of_relations},
 	{"TABLESPACE", Query_for_list_of_tablespaces},
 	{"TEMP", NULL, NULL, THING_NO_DROP},		/* for CREATE TEMP TABLE ... */
 	{"TEMPLATE", Query_for_list_of_ts_templates, NULL, THING_NO_SHOW},

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CUDA Sorting

2011-09-24 Thread Hannu Krosing
On Mon, 2011-09-19 at 10:36 -0400, Greg Smith wrote:
> On 09/19/2011 10:12 AM, Greg Stark wrote:
> > With the GPU I'm curious to see how well
> > it handles multiple processes contending for resources, it might be a
> > flashy feature that gets lots of attention but might not really be
> > very useful in practice. But it would be very interesting to see.
> >
> 
> The main problem here is that the sort of hardware commonly used for 
> production database servers doesn't have any serious enough GPU to 
> support CUDA/OpenCL available.  The very clear trend now is that all 
> systems other than gaming ones ship with motherboard graphics chipsets 
> more than powerful enough for any task but that.  I just checked the 5 
> most popular configurations of server I see my customers deploy 
> PostgreSQL onto (a mix of Dell and HP units), and you don't get a 
> serious GPU from any of them.
> 
> Intel's next generation Ivy Bridge chipset, expected for the spring of 
> 2012, is going to add support for OpenCL to the built-in motherboard 
> GPU.  We may eventually see that trickle into the server hardware side 
> of things too.
> 
> I've never seen a PostgreSQL server capable of running CUDA, and I don't 
> expect that to change.

CUDA sorting could be beneficial on general server hardware if it can
run well on multiple CPUs in parallel. GPUs being in essence parallel
processors with fast shared memory, it may be that even on ordinary RAM
and lots of CPUs some CUDA algorithms are a significant win.

And then there is a non-graphics GPU available on EC2:

  Cluster GPU Quadruple Extra Large Instance

  22 GB of memory
  33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem”
   architecture)
  2 x NVIDIA Tesla “Fermi” M2050 GPUs
  1690 GB of instance storage
  64-bit platform
  I/O Performance: Very High (10 Gigabit Ethernet)
  API name: cg1.4xlarge

It costs $2.10 per hour, probably a lot less if you use the Spot
Instances.

> -- 
> Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
> 
> 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CUDA Sorting

2011-09-24 Thread Hannu Krosing
On Mon, 2011-09-19 at 15:12 +0100, Greg Stark wrote:
> On Mon, Sep 19, 2011 at 1:11 PM, Vitor Reus  wrote:
> > Since I'm new to pgsql development, I replaced the code of pgsql
> > qsort_arg to get used with the way postgres does the sort. The problem
> > is that I can't use the qsort_arg_comparator comparator function on
> > GPU, I need to implement my own. I didn't find out how to access the
> > sorting key value data of the tuples on the Tuplesortstate or
> > SortTuple structures. This part looks complicated because it seems the
> > state holds the pointer for the scanner(?), but I didn't managed to
> > access the values directly. Can anyone tell me how this works?



> With the GPU I'm curious to see how well
> it handles multiple processes contending for resources, it might be a
> flashy feature that gets lots of attention but might not really be
> very useful in practice. But it would be very interesting to see.

There are cases where concurrency may not be that important, like some
specialized OLAP workloads where you have to sort, for example when
finding a median in large data sets.


-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-24 Thread Kerem Kat
I am looking into prepunion.c and analyze.c.

There is a catch to inserting subqueries for CORRESPONDING in the
planner. The parser expects to see an equal number of columns on both
sides of the UNION query, but with CORRESPONDING we cannot guarantee
that. Target columns, collations and types for the SetOperationStmt are
determined in the parser. Even if we get past the column-count equality
checks, it is not clear how one would proceed with the targetlist
generation loop, which is a forboth over the two tables' columns.

One way would be to filter the columns in the parser anyway and insert
the subqueries in the planner, but that leads back to the earlier
problems of column ordering and messed-up view definitions, and it
would be too much bloat, methinks.

I can guess what needs to be done in prepunion.c, but I need a pointer
for the parser side.

Tom Lane: thanks for your description.


regards

Kerem KAT

On Fri, Sep 23, 2011 at 07:40, Tom Lane  wrote:

> Kerem Kat  writes:
> > While testing I noticed that ordering is incorrect in my implementation.
> > At first I thought that removing mismatched entries from ltargetlist and
> > rtargetlist would be enough; it didn't seem enough, so I added
> > rtargetlist sorting.
>
> I don't think you can get away with changing the targetlists of the
> UNION subqueries; you could break their semantics.  Consider for
> instance
>
>select distinct a, b, c from t1
>union corresponding
>select b, c from t2;
>
> If you discard the A column from t1's output list then it will deliver a
> different set of rows than it should, because the DISTINCT is
> considering the wrong set of values.
>
> One possible way to fix that is to introduce a level of sub-select,
> as if the query had been written
>
>select b, c from (select distinct a, b, c from t1) ss1
>union
>select b, c from (select b, c from t2) ss2;
>
> However, the real problem with either type of hackery is that these
> machinations will be visible in the parsed query, which means for
> example that a view defined as
>
>create view v1 as
>select distinct a, b, c from t1
>union corresponding
>select b, c from t2;
>
> would come out looking like the transformed version rather than the
> original when it's dumped, or even just examined with tools such as
> psql's \d+.  I think this is bad style.  It's certainly ugly to expose
> your implementation shortcuts to the user like that, and it also can
> cause problems down the road: if in the future we think of some better
> way to implement CORRESPONDING, we've lost the chance to do so for any
> stored views that got transformed this way.  (There are several places
> in Postgres now that take such shortcuts, and all of them were mistakes
> that we need to clean up someday, IMO.)
>
> So I think that as far as the parser is concerned, you just want to
> store the CORRESPONDING clause more or less as-is, and not do too much
> more than verify that it's valid.  The place to actually implement it is
> in the planner (see prepunion.c).  Possibly the add-a-level-of-subselect
> approach will work, but you want to do that querytree transformation at
> plan time not parse time.
>
>regards, tom lane
>