Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Andres Freund
On 2012-11-12 19:21:28 +, Simon Riggs wrote:
> On 10 September 2012 17:50, Tom Lane  wrote:
> 
> > The point of the proposal that I am making is to have a simple,
> > low-maintenance solution for people who need a single-application
> > database.  A compromise somewhere in the middle isn't likely to be an
> > improvement for anybody.  For instance, if you want to have additional
> > connections, you open up a whole collection of communication and
> > authentication issues, which potential users of a single-application
> > database don't want to cope with.
> 
> So the proposal is to implement a database that can't ever have 2 or
> more connections.
> ...
> It's almost impossible to purchase a CPU these days that doesn't have
> multiple cores, so the whole single-process architecture is just dead.
> Yes, we want Postgres installed everywhere, but this isn't the way to
> achieve that.
> 
> I agree we should allow a PostgreSQL installation to work for a single
> user, but I don't see that requires other changes. This idea will
> cause endless bugs, thinkos and severely waste our time. So without a
> much better justification, I don't think we should do this.

I personally think that a usable & scriptable --single mode is
justification enough, even if you don't agree with the other
goals. Having to wait for hours just to enter one more command because
--single doesn't support any scripts sucks, especially in recovery
situations.

I also don't think a single-backend without further child processes is
all that helpful - but I think this might be a very useful stepping
stone.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Inadequate thought about buffer locking during hot standby replay

2012-11-13 Thread Heikki Linnakangas

On 12.11.2012 22:53, Tom Lane wrote:
> Here's an updated patch that fixes the GIST replay functions as well as
> the other minor issues that were mentioned.  Barring objections, I'll
> set about back-patching this as far as 9.0.

Ok. It won't help all that much on 9.0, though.

> One thing that could use verification is my fix for
> gistRedoPageSplitRecord.  AFAICS, the first page listed in the WAL
> record is always the "original" page, and the ones following it are
> pages that were split off from it, and can (as yet) only be reached by
> following right-links from the "original" page.  As such, it should be
> okay to release locks on the non-first pages as soon as we've written
> them.  We have to hold lock on the original page though to avoid letting
> readers follow dangling right-links.  Also, the update of
> NSN/FOLLOW_RIGHT on the child page (if any) has to be done atomically
> with all this, so that has to be done before releasing the original-page
> lock as well.  Does that sound right?

Yep.
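
To spell out the ordering being agreed on, it is roughly the following
(a sketch only; the helper names are illustrative, not the actual redo
code):

    /* restore the pages; only the first ("original") page keeps its lock */
    for (i = 0; i < npages; i++)
    {
        Buffer      buf = restore_split_page(record, i);    /* hypothetical */

        MarkBufferDirty(buf);
        if (i == 0)
            firstbuf = buf;             /* readers reach the rest via its right-links */
        else
            UnlockReleaseBuffer(buf);   /* safe to release once written */
    }

    /* the child's NSN/FOLLOW_RIGHT update must precede the original-page unlock */
    if (BufferIsValid(childbuf))
        clear_follow_right(childbuf);   /* hypothetical */

    UnlockReleaseBuffer(firstbuf);      /* now right-links cannot dangle */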

- Heikki




Re: [HACKERS] Re: [HACKERS] Patch for MAP_HUGETLB for mmap() shared memory

2012-11-13 Thread Andres Freund
Hi CK,

On 2012-10-30 21:16:07 +0100, Christian Kruse wrote:
> index b4fcbaf..66ed10f 100644
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml

I think a short introduction or at least a reference on how to configure
hugepages would be a good thing.

>      <varlistentry id="guc-temp-buffers" xreflabel="temp_buffers">
>       <term><varname>temp_buffers</varname> (<type>integer</type>)</term>
>       <indexterm>
> diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
> index df06312..f9de239 100644
> --- a/src/backend/port/sysv_shmem.c
> +++ b/src/backend/port/sysv_shmem.c
> @@ -27,10 +27,14 @@
>  #ifdef HAVE_SYS_SHM_H
>  #include <sys/shm.h>
>  #endif
> +#ifdef MAP_HUGETLB
> +#include <dirent.h>
> +#endif

I think a central #define for the MAP_HUGETLB capability would be a good
idea, akin to HAVE_SYS_SHM_H.

E.g. this:
> --- a/src/backend/utils/misc/guc.c
> +++ b/src/backend/utils/misc/guc.c
> @@ -22,6 +22,7 @@
>  #include <float.h>
>  #include <math.h>
>  #include <limits.h>
> +#include <sys/mman.h>
>  #ifdef HAVE_SYSLOG
>  #include <syslog.h>
>  #endif

is unlikely to fly on windows.
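
What I have in mind is roughly the following - one capability symbol,
defined centrally (the symbol names here are placeholders, not an
existing convention):

    /* e.g. in a port-specific header, so individual .c files don't
     * need to include <sys/mman.h> just to test for the capability */
    #ifdef HAVE_SYS_MMAN_H
    #include <sys/mman.h>
    #ifdef MAP_HUGETLB
    #define HAVE_MAP_HUGETLB 1
    #endif
    #endif

Call sites (including guc.c) then test the one symbol and never touch
<sys/mman.h> directly.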


> +/*
> + *   static long InternalGetHugepageSize()
> + *
> + * Attempt to get a valid hugepage size from /sys/kernel/mm/hugepages/ by
> + * reading directory contents
> + * Will fail (return -1) if the directory could not be opened or no valid
> + * page sizes are available. Will return the biggest hugepage size on
> + * success.
> + *
> + */

The "biggest" remark is out of date.


> +static long
> +InternalGetHugepageSize()
> +{
> ...
> +		if ((smallest_size == -1 || size < smallest_size)
> +			&& InternalGetFreeHugepagesCount(ent->d_name) > 0)
> +		{
> +			smallest_size = size;
> +		}
> ...
> +
> +	if (smallest_size == -1)
> +	{
> +		ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
> +				(errmsg("Could not find a valid hugepage size"),
> +				 errhint("This error usually means that either CONFIG_HUGETLB_PAGE "
> +						 "is not in kernel or that your architecture does not "
> +						 "support hugepages or you did not configure hugepages")));
> +	}

I think differentiating the error message between no hugepages found and
InternalGetFreeHugepagesCount(ent->d_name) always being zero would be a
good idea. Failing this way if
InternalGetFreeHugepagesCount(ent->d_name) < 0 seems fine.
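
Roughly what I mean, with illustrative variable names and wording:

    if (!found_any_hugepage_size)
        ereport(elevel,
                (errmsg("could not find any hugepage sizes"),
                 errhint("The kernel may be built without CONFIG_HUGETLB_PAGE.")));
    else if (total_free_hugepages == 0)
        ereport(elevel,
                (errmsg("no free hugepages are available"),
                 errhint("Reserve some hugepages, e.g. via /proc/sys/vm/nr_hugepages.")));

with elevel computed as in your patch (DEBUG1 for "try", else WARNING).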


Greetings,

Andres Freund




Re: [HACKERS] Patch for MAP_HUGETLB for mmap() shared memory

2012-11-13 Thread Andres Freund
Oh, one more thing...

On 2012-10-30 21:16:07 +0100, Christian Kruse wrote:
> ok, I think I implemented all of the changes you requested. All but
> the ia64-dependent one; I have to do more research for that.

I vote for simply not caring about ia64.

This is:

> +#ifdef MAP_HUGETLB
> +#  ifdef __ia64__
> +#define PG_HUGETLB_BASE_ADDR (void *)(0x8000000000000000UL)
> +#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
> +#  else
> +#define PG_HUGETLB_BASE_ADDR (void *)(0x0UL)
> +#define PG_MAP_HUGETLB MAP_HUGETLB
> +#  endif
> +#else
> +#  define PG_MAP_HUGETLB 0
> +#endif

too much underdocumented craziness for a very minor platform. Should
somebody with the appropriate hardware want to submit an additional patch,
fine.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Enabling Checksums

2012-11-13 Thread Markus Wanner
On 11/13/2012 01:22 AM, Greg Smith wrote:
> Once you accept that eventually there need to be online conversion
> tools, there needs to be some easy way to distinguish which pages have
> been processed for several potential implementations.

Agreed. What I'm saying is that this identification doesn't need to be
as fine-grained as a per-page bit. A single "horizon" or "border" is
enough, given an ordering of relations (for example by OID) and an
ordering of pages within each relation (obvious).

> All of the table-based checksum enabling ideas ...

This is not really one - it doesn't allow per-table switching. It's just
meant to be a more compact way of representing which pages have been
checksummed and which have not.
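
A minimal sketch of that test, with illustrative names:

    typedef struct ChecksumHorizon
    {
        Oid         relid;      /* relation currently being converted */
        BlockNumber blkno;      /* next block to convert within it */
    } ChecksumHorizon;

    /* pages strictly behind the horizon in (OID, block) order are checksummed */
    static bool
    page_is_checksummed(const ChecksumHorizon *h, Oid relid, BlockNumber blkno)
    {
        return relid < h->relid ||
               (relid == h->relid && blkno < h->blkno);
    }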

> I'm thinking of this in some ways like the way creation of a new (but
> not yet valid) foreign key works.  Once that's active, new activity is
> immediately protected moving forward.  And eventually there's this
> cleanup step needed, one that you can inch forward over a few days.

I understand that. However, I question whether users really care. If
corruption is detected, the clever DBA tells his trainee to immediately
check the file and disk subsystems - no matter whether the corruption
was on old or new data.

You have a point in that pages with "newer" data are often more likely
to be re-read and thus checked, whereas the checksums written to pages
with old data might not be re-read any time soon. Starting to write
checksums from the end of the relation could mitigate this to some
extent, though.

Also keep in mind the "quietly corrupted after being checked once, but
still in the middle of checking a relation" case. Thus a single bit
doesn't really give us the guarantee you ask for. Sure, we can add more
than one bit. And yeah, if done properly, adding more bits exponentially
reduces the likelihood of a corruption inadvertently turning off
checksumming for a page.

All that said, I'm not opposed to using a few bits of the page header. I
wanted to outline an alternative that I think is viable and less intrusive.

> This is why I think any good solution to this problem needs to
> incorporate restartable conversion.

I fully agree to that.

Regards

Markus Wanner




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Simon Riggs
On 13 November 2012 06:14, Amit kapila  wrote:

>>I get the installability thang, very very much, I just don't see the
>>single-process thing as the only solution. At the very least an open-minded
>>analysis of the actual problem and ways of solving it is called for,
>>not just reaching for a close-to-hand solution.
>
> Another use case where I have seen it required is telecom billing apps.
> The telecom applications where this solution works need other maintenance
> connections as well.
> Some of the reasons for its use are performance and less maintenance
> overhead, and their data requirements are also not very high.
> So even if this solution doesn't meet all the requirements of a
> single-process solution (and I don't think it is written to address them
> all), can't we think of it as a first version and then, based on
> requirements, extend it to have other capabilities:
> a. a mechanism for other background processes (autovacuum, checkpoint,
> ..).
> b. more needs to be thought of..

Why would we spend time trying to put back something that is already
there? Why not simply avoid removing it in the first place?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Alvaro Herrera
Simon Riggs wrote:

> > So even if this solution doesn't meet all the requirements of a
> > single-process solution (and I don't think it is written to address them
> > all), can't we think of it as a first version and then, based on
> > requirements, extend it to have other capabilities:
> > a. a mechanism for other background processes (autovacuum,
> > checkpoint, ..).
> > b. more needs to be thought of..
> 
> Why would we spend time trying to put back something that is already
> there? Why not simply avoid removing it in the first place?

Actually, the whole point of this solution originally was just to serve
pg_upgrade's needs, so that it doesn't have to start a complete postmaster
environment only to turn off most of what the postmaster does, and with
enough protections to disallow everyone else from connecting.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Simon Riggs
On 13 November 2012 13:05, Alvaro Herrera  wrote:
> Simon Riggs wrote:
>
>> > So even if this solution doesn't meet all the requirements of a
>> > single-process solution (and I don't think it is written to address them
>> > all), can't we think of it as a first version and then, based on
>> > requirements, extend it to have other capabilities:
>> > a. a mechanism for other background processes (autovacuum,
>> > checkpoint, ..).
>> > b. more needs to be thought of..
>>
>> Why would we spend time trying to put back something that is already
>> there? Why not simply avoid removing it in the first place?
>
> Actually, the whole point of this solution originally was just to serve
> pg_upgrade's needs, so that it doesn't have to start a complete postmaster
> environment only to turn off most of what the postmaster does, and
> with enough protections to disallow everyone else from connecting.

I don't see anything that pg_upgrade is doing that causes the need to
support a special mode.

From other people's comments it's clear that "single user mode" is
desirable to many and *will* be widely deployed if we allow it. I
support the wish to allow a database server to be limited by
configuration to a single user. However, supporting a specifically
targeted mode that presents single user as an architectural
design/limitation is a regressive step that I am strongly opposed to.

The most popular relational database in the world is Microsoft Access,
not MySQL. Access appears desirable because it allows a single user to
create and use a database (which is very good). But all business
databases have a requirement for at least one of: high availability,
multi-user access or downstream processing in other parts of the
business. Businesses worldwide curse the difficulties caused by having
critical business data in desktop databases. And worldwide, there are
also many that don't understand the problems that disconnected data
causes because they can't see past the initial benefit.

The lessons from that are that it's OK to start with a database used by
a single person, but that database soon needs to allow access from
multiple users or automated agents. Many database systems support
embedded or single user mode as an architectural option. All of those
systems cause headaches in all of the businesses where they are used.
They also cause problems on small detached devices such as phones,
because even on very small systems there is a requirement for multiple
concurrently active processes each of which may need database access.

PostgreSQL was designed from the ground up as a multi-user database.
This is the very fact that puts us in a good position to become
pervasive. A single database system that works the same on all
devices, with useful replication to connect data together.

The embedded or single mode concept has long been on the "do not want"
list. I believe that is a completely rational and strongly desirable
thing. Supporting multiple architectures is extra work, and the
restrictive architecture bites people in the long term. The fact that
it's an "easy patch" is not a great argument for changing that
position, and in fact, it's not easy, since it comes with a request to
make it work on Windows (= extra work). The "easy" bit is not proven
since people are already starting to ask about bgwriter and
autovacuum.

In this release there is much work happening around providing
additional autonomous agents (bgworker) and other work around flexible
replication (BDR), all of which would be nullified by the introduction
and eventual wide usage of a restrictive new architecture.

Single user configuration option, yes. Architecturally limited special
version of PostgreSQL, no.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Inadequate thought about buffer locking during hot standby replay

2012-11-13 Thread Tom Lane
Heikki Linnakangas  writes:
> On 12.11.2012 22:53, Tom Lane wrote:
>> Here's an updated patch that fixes the GIST replay functions as well as
>> the other minor issues that were mentioned.  Barring objections, I'll
>> set about back-patching this as far as 9.0.

> Ok. It won't help all that much on 9.0, though.

Well, it won't help GIST much, but the actually-reported-from-the-field
case is in btree, and it does fix that.

It occurs to me that if we're sufficiently scared of this case, we could
probably hack the planner (in 9.0 only) to refuse to use GIST indexes
in hot-standby queries.  That cure might be worse than the disease though.

regards, tom lane




Re: [HACKERS] Inadequate thought about buffer locking during hot standby replay

2012-11-13 Thread Merlin Moncure
On Tue, Nov 13, 2012 at 9:03 AM, Tom Lane  wrote:
>> Ok. It won't help all that much on 9.0, though.
>
> Well, it won't help GIST much, but the actually-reported-from-the-field
> case is in btree, and it does fix that.
>
> It occurs to me that if we're sufficiently scared of this case, we could
> probably hack the planner (in 9.0 only) to refuse to use GIST indexes
> in hot-standby queries.  That cure might be worse than the disease though.

If anything, it should be documented.  If you do this kind of thing,
people will stop installing bugfix releases.

merlin




Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-11-13 Thread Fujii Masao
On Tue, Nov 13, 2012 at 1:06 PM, Amit kapila  wrote:
> On Monday, November 12, 2012 8:23 PM Fujii Masao wrote:
> On Fri, Nov 9, 2012 at 3:03 PM, Amit Kapila  wrote:
>> On Thursday, November 08, 2012 10:42 PM Fujii Masao wrote:
>>> On Thu, Nov 8, 2012 at 5:53 PM, Amit Kapila 
>>> wrote:
>>> > On Thursday, November 08, 2012 2:04 PM Heikki Linnakangas wrote:
>>> >> On 19.10.2012 14:42, Amit kapila wrote:
>>> >> > On Thursday, October 18, 2012 8:49 PM Fujii Masao wrote:
>>> >> >> Before implementing the timeout parameter, I think that it's better
>>> >> >> to change
>>> >> >> both pg_basebackup background process and pg_receivexlog so that
>
>>>> BTW, IIRC the walsender has no timeout mechanism during sending
>>>> backup data to pg_basebackup. So it's also useful to implement the
>>>> timeout mechanism for the walsender during backup.
>>
>>> Yes, it's useful, but for walsender the main problem is that it uses a blocking
>>> send call to send the data.
>>> I have tried using tcp_keepalive settings, but the send call doesn't come out
>>> in case of a network break.
>>> The only way I could get it out is to
>>> change the corresponding file /proc/sys/net/ipv4/tcp_retries2 by using
>>> the command
>>>    echo "8" > /proc/sys/net/ipv4/tcp_retries2
>>> As per the recommendation, its value should be at least 8 (equivalent to 100
>>> sec)
>>
>>> Do you have any idea, how it can be achieved?
>
>> What about using pq_putmessage_noblock()?
>
> I will try this, but do you know why, in the first place, the code uses
> blocking mode to send files?
> I am asking because I am a little worried about breaking any design
> decision behind making the sending of files blocking.

I'm afraid I don't know why. I guess that using non-blocking mode complicates
the code, so in the first version of pg_basebackup the blocking mode
was adopted.
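
For reference, the pattern the walsender uses when streaming WAL is
roughly the following (a sketch from memory; whether basebackup.c can
adopt it unchanged is exactly what would need checking):

    /* queue the message without blocking; the send buffer grows as needed */
    pq_putmessage_noblock('d', data, len);

    /* then flush opportunistically, sleeping until the socket is writable */
    while (pq_is_send_pending())
    {
        if (pq_flush_if_writable() == EOF)
            proc_exit(0);       /* connection lost */

        WaitLatchOrSocket(&MyWalSnd->latch,
                          WL_LATCH_SET | WL_SOCKET_WRITEABLE | WL_TIMEOUT,
                          MyProcPort->sock,
                          sleeptime);

        /* a timeout check against replication_timeout would go here */
    }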

Regards,

-- 
Fujii Masao




Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Dimitri Fontaine
Tom Lane  writes:
> OTOH I can't see trying to back-patch a solution like that.   If we want
> to fix this in the back branches (and note the complaint linked above is
> against 8.3), I think we have to do it as attached.
>
> Thoughts?

I've been using textin(record_out(NEW)) in generic partitioning
triggers, and you can find examples of that trick in the wiki, so I
think we have users of that in the field.

Please indeed do consider backpatching!

I don't have an opinion on the advisability of using a shorter memory
context; I feel that would need more involved analysis than my
brainpower of the moment allows me to consider.

Thanks,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Tom Lane
Dimitri Fontaine  writes:
> Tom Lane  writes:
>> Thoughts?

> I've been using textin(record_out(NEW)) in generic partitioning
> triggers, and you can find examples of that trick in the wiki, so I
> think we have users of that in the field.

I think explicit calls like that actually wouldn't be a problem,
since they'd be run in a per-tuple context anyway.  The cases that
are problematic are hard-coded I/O function calls.  I'm worried
about the ones like, say, plpgsql's built-in conversion operations.
We could probably fix printtup's usage with some confidence, but
there are a lot of other ones.

regards, tom lane




Re: [HACKERS] Index only scans wiki page

2012-11-13 Thread Robert Haas
On Mon, Nov 12, 2012 at 8:25 PM, Peter Geoghegan  wrote:
> On 13 November 2012 01:03, Jeff Janes  wrote:
>> https://wiki.postgresql.org/wiki/Index-only_scans
>>
>> This page is seriously out of date.  It suggests they are not yet
>> implemented, but only being talked about.
>
> Attached is a piece I wrote on the feature. That might form the basis
> of a new wiki page. Feel free to incorporate this material as you see
> fit.

I found this an interesting read.  As one of the people who worked on
the feature, I'm sort of curious whether people have any experience
yet with how this actually shakes out in the field.  Are you (or is
anyone) aware of positive/negative field experiences with this
feature?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Dimitri Fontaine
Tom Lane  writes:
> I think explicit calls like that actually wouldn't be a problem,
> since they'd be run in a per-tuple context anyway.  The cases that
> are problematic are hard-coded I/O function calls.  I'm worried
> about the ones like, say, plpgsql's built-in conversion operations.
> We could probably fix printtup's usage with some confidence, but
> there are a lot of other ones.

That's a good reason to get them into a shorter memory context, but
which one? Per-transaction, maybe? Shorter?

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] [PATCH] Patch to compute Max LSN of Data Pages

2012-11-13 Thread Fujii Masao
On Tue, Nov 13, 2012 at 1:23 PM, Amit kapila  wrote:
> On Monday, November 12, 2012 9:56 PM Alvaro Herrera wrote:
> Robert Haas wrote:
>> On Tue, Jul 31, 2012 at 8:09 AM, Amit kapila  wrote:
>
>>> >> I think I can see all of those things being potentially useful.  There
>>> >> are a couple of pending patches that will revise the WAL format
>>> >> slightly; not sure how much those are likely to interfere with any
>>> >> development you might do on (2) in the meantime.
>> >
>>> > Based on above conclusion, I have prepared a patch which implements 
>>> > Option-1
>>
>>> I wonder if we shouldn't make this a separate utility, rather than
>>> something that is part of pg_resetxlog.  Anyone have a thought on that
>>> topic?
>
>> That thought did cross my mind too.
>
> One of the reasons for keeping it with pg_resetxlog is that this was
> proposed as a solution for scenarios where the user's db has become
> corrupt and now he wants to start it. To do that, he can find the max LSN
> and set it using pg_resetxlog, which will avoid further corruption of the
> database after it has started.
> If we keep it a separate utility, then the user needs to first run this
> utility to find the max LSN and then use pg_resetxlog to achieve the same.
> I don't see a big problem in that, but maybe it would have been better if
> there were other use cases for it.

We might be able to use this utility to decide whether we need to take
a fresh backup from the master onto the standby, when starting the old
master as the new standby after failover.

When starting the new standby after failover, no data page in the standby
may be ahead of the master. Otherwise, the standby cannot catch up with the
master consistently. But the master might write a data page corresponding
to WAL which has not been replicated to the standby yet. So, if failover
happens before that WAL has been replicated, that data page in the old
master would be ahead of the new master (i.e., the old standby), and in
this case the backup is required. OTOH, if the maximum LSN of any data page
in the standby is less than the master's, the backup is not required.

Without this utility, it's difficult to calculate the maximum LSN of the
data pages, so basically we needed to take a backup when starting the
standby. In the future, thanks to this utility, we can calculate the
maximum LSN, and can skip a backup if that LSN is less than the master's
(i.e., the last applied LSN, IOW, the timeline switch LSN).
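
In pseudo-C, the decision rule would be something like this (names
illustrative; the first LSN comes from the proposed utility, the second
from the new master):

    if (max_data_page_lsn_of_old_master <= timeline_switch_lsn)
        start_old_master_as_standby();  /* hypothetical; no base backup needed */
    else
        take_fresh_base_backup();       /* hypothetical */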

Regards,

-- 
Fujii Masao




Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Tom Lane
Dimitri Fontaine  writes:
> Tom Lane  writes:
>> I think explicit calls like that actually wouldn't be a problem,
>> since they'd be run in a per-tuple context anyway.  The cases that
>> are problematic are hard-coded I/O function calls.  I'm worried
>> about the ones like, say, plpgsql's built-in conversion operations.
>> We could probably fix printtup's usage with some confidence, but
>> there are a lot of other ones.

> That's a good reason to get them into a shorter memory context, but
> which? per transaction maybe? shorter?

It would have to be per-tuple to do any good.  The existing behavior
is per-query and causes problems if lots of rows are output.  In plpgsql
it would be a function-call-lifespan leak.
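
At each such call site the pattern would be roughly (a sketch; the
context would realistically be created once and merely reset per tuple):

    MemoryContext oldcxt = MemoryContextSwitchTo(tmpcxt);
    char       *outstr = OutputFunctionCall(&flinfo, value);

    MemoryContextSwitchTo(oldcxt);
    /* ... use or copy outstr ... */
    MemoryContextReset(tmpcxt);     /* reclaims whatever record_out leaked */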

regards, tom lane




Re: [HACKERS] [BUGS] BUG #7656: PL/Perl SPI_freetuptable() segfault

2012-11-13 Thread Tom Lane
pgm...@joh.to writes:
> I have a reproducible segmentation fault in PL/Perl.  I have yet to narrow
> down the test case to something sensible, but I do have a backtrace:

> 219   while (context->firstchild != NULL)
> (gdb) bt
> #0  0x000104e90782 in MemoryContextDeleteChildren (context=0x102bd)
> at mcxt.c:219
> #1  0x000104e906a8 in MemoryContextDelete (context=0x102bd) at
> mcxt.c:174
> #2  0x000104bbefb5 in SPI_freetuptable (tuptable=0x7f9ae4289230) at
> spi.c:1003
> #3  0x00011ec9928b in plperl_spi_execute_fetch_result
> (tuptable=0x7f9ae4289230, processed=1, status=-6) at plperl.c:2900
> #4  0x00011ec98f27 in plperl_spi_exec (query=0x7f9ae4155f80
> "0x7f9ae3e3fe50", limit=-439796840) at plperl.c:2821
> #5  0x00011ec9b5f7 in XS__spi_exec_query (my_perl=0x7f9ae40cce00,
> cv=0x7f9ae4148e90) at SPI.c:69

> While trying to narrow down the test case I noticed what the problem was: I
> was calling spi_execute_query() instead of spi_execute_prepared().

Hm.  It looks like SPI_execute failed as expected (note the status
passed to plperl_spi_execute_fetch_result is -6 which is
SPI_ERROR_ARGUMENT), but it did not reset SPI_tuptable, which led to
plperl_spi_execute_fetch_result trying to call SPI_freetuptable on what
was probably an already-deleted tuple table.

One theory we could adopt on this is that this is
plperl_spi_execute_fetch_result's fault and it shouldn't be trying to
free a tuple table unless status > 0.
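
That first theory would amount to something like this in
plperl_spi_execute_fetch_result:

    /* only touch the tuple table if SPI_execute actually succeeded */
    if (status > 0 && tuptable != NULL)
        SPI_freetuptable(tuptable);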

Another theory we could adopt is that SPI functions that are capable of
setting SPI_tuptable ought to clear it at start, to ensure that they
return it as null on failure.

The latter seems like a "nicer" fix but I'm afraid it might have
unexpected side-effects.  It would certainly be a lot more invasive.

Thoughts?

regards, tom lane




Re: [HACKERS] [BUGS] BUG #7656: PL/Perl SPI_freetuptable() segfault

2012-11-13 Thread Andrew Dunstan


On 11/13/2012 12:17 PM, Tom Lane wrote:
> pgm...@joh.to writes:
>> I have a reproducible segmentation fault in PL/Perl.  I have yet to narrow
>> down the test case to something sensible, but I do have a backtrace:
>> 219 while (context->firstchild != NULL)
>> (gdb) bt
>> #0  0x000104e90782 in MemoryContextDeleteChildren (context=0x102bd)
>> at mcxt.c:219
>> #1  0x000104e906a8 in MemoryContextDelete (context=0x102bd) at
>> mcxt.c:174
>> #2  0x000104bbefb5 in SPI_freetuptable (tuptable=0x7f9ae4289230) at
>> spi.c:1003
>> #3  0x00011ec9928b in plperl_spi_execute_fetch_result
>> (tuptable=0x7f9ae4289230, processed=1, status=-6) at plperl.c:2900
>> #4  0x00011ec98f27 in plperl_spi_exec (query=0x7f9ae4155f80
>> "0x7f9ae3e3fe50", limit=-439796840) at plperl.c:2821
>> #5  0x00011ec9b5f7 in XS__spi_exec_query (my_perl=0x7f9ae40cce00,
>> cv=0x7f9ae4148e90) at SPI.c:69
>> While trying to narrow down the test case I noticed what the problem was: I
>> was calling spi_execute_query() instead of spi_execute_prepared().
>
> Hm.  It looks like SPI_execute failed as expected (note the status
> passed to plperl_spi_execute_fetch_result is -6 which is
> SPI_ERROR_ARGUMENT), but it did not reset SPI_tuptable, which led to
> plperl_spi_execute_fetch_result trying to call SPI_freetuptable on what
> was probably an already-deleted tuple table.
>
> One theory we could adopt on this is that this is
> plperl_spi_execute_fetch_result's fault and it shouldn't be trying to
> free a tuple table unless status > 0.
>
> Another theory we could adopt is that SPI functions that are capable of
> setting SPI_tuptable ought to clear it at start, to ensure that they
> return it as null on failure.
>
> The latter seems like a "nicer" fix but I'm afraid it might have
> unexpected side-effects.  It would certainly be a lot more invasive.



These aren't mutually exclusive, though, are they? It seems reasonable 
to do the minimal fix for the stable branches (looks like it's just a 
matter of moving the call up a couple of lines in plperl.c) and make the 
nicer fix just for the development branch.


cheers

andrew






Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Tom Lane
Simon Riggs  writes:
> The most popular relational database in the world is Microsoft Access,
> not MySQL. Access appears desirable because it allows a single user to
> create and use a database (which is very good). But all business
> databases have a requirement for at least one of: high availability,
> multi-user access or downstream processing in other parts of the
> business.

That's a mighty sweeping claim, which you haven't offered adequate
evidence for.  The fact of the matter is that there is *lots* of demand
for simple single-user databases, and what I'm proposing is at least a
first step towards getting there.

The main disadvantage of approaching this via the existing single-user
mode is that you won't have any autovacuum, bgwriter, etc, support.
But the flip side is that that lack of infrastructure is a positive
advantage for certain admittedly narrow use-cases, such as disaster
recovery and pg_upgrade.  So while I agree that this isn't the only
form of single-user mode that we'd like to support, I think it is *a*
form we'd like to support, and I don't see why you appear to be against
having it at all.

A more reasonable objection would be that we need to make sure that this
isn't foreclosing the option of having a multi-process environment with
a single user connection.  I don't see that it is, but it might be wise
to sketch exactly how that case would work before accepting this.

regards, tom lane




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Josh Berkus
On 11/12/12 7:59 PM, Amit kapila wrote:
> On Monday, November 12, 2012 12:07 PM Greg Smith wrote:
> On 11/9/12 11:59 PM, Amit kapila wrote:
> 
>>> Please let me know if there are any objections or problems in above method 
>>> of implementation,
>>> else I can go ahead to prepare the patch for the coming CF.
> 
>> It may be the case that the locking scheme Robert described is the best
>> approach here.  It seems kind of heavy to me though.  I suspect that
>> some more thinking about it might come up with something better.

So, here's the problem I'm seeing with having a single .auto file:  when
we write settings to a file, are we writing a *single* setting or *all
of a user's current settings*?

I was imagining writing single, specific settings, which inevitably
leads to one-setting-per-file, e.g.:

SET PERSISTENT work_mem = 256MB;

What Amit seems to be talking about is more EXPORT SETTINGS, where you
dump all current settings in the session to a file.  This seems likely
to produce accidental changes when the user writes out settings they've
forgotten they changed.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Tom Lane
Josh Berkus  writes:
> I was imagining writing single, specific settings, which inevitably
> leads to one-setting-per-file, e.g.:

> SET PERSISTENT work_mem = 256MB;

> What Amit seems to be talking about is more EXPORT SETTINGS, where you
> dump all current settings in the session to a file.  This seems likely
> to produce accidental changes when the user writes out settings they've
> forgotten they changed.

Yeah.  It also seems to be unnecessarily different from the existing
model of SET.  I'd go with one-setting-per-command.

regards, tom lane




Re: [HACKERS] Inadequate thought about buffer locking during hot standby replay

2012-11-13 Thread Gavin Flower

On 14/11/12 04:32, Merlin Moncure wrote:
> On Tue, Nov 13, 2012 at 9:03 AM, Tom Lane  wrote:
>>> Ok. It won't help all that much on 9.0, though.
>>
>> Well, it won't help GIST much, but the actually-reported-from-the-field
>> case is in btree, and it does fix that.
>>
>> It occurs to me that if we're sufficiently scared of this case, we could
>> probably hack the planner (in 9.0 only) to refuse to use GIST indexes
>> in hot-standby queries.  That cure might be worse than the disease though.
>
> If anything, it should be documented.  If you do this kind of thing,
> people will stop installing bugfix releases.
>
> merlin


How about displaying a warning when people try to use the 'feature', as 
well as documenting it?


Cheers,
Gavin





Re: [HACKERS] Inadequate thought about buffer locking during hot standby replay

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 10:32 AM, Merlin Moncure  wrote:
> On Tue, Nov 13, 2012 at 9:03 AM, Tom Lane  wrote:
>>> Ok. It won't help all that much on 9.0, though.
>>
>> Well, it won't help GIST much, but the actually-reported-from-the-field
>> case is in btree, and it does fix that.
>>
>> It occurs to me that if we're sufficiently scared of this case, we could
>> probably hack the planner (in 9.0 only) to refuse to use GIST indexes
>> in hot-standby queries.  That cure might be worse than the disease though.
>
> If anything, it should be documented.  If you do this kind of thing,
> people will stop installing bugfix releases.

Agreed.  I think doing that in a back-branch release would be
extremely user-hostile.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 1:16 PM, Tom Lane  wrote:
> Josh Berkus  writes:
>> I was imagining writing single, specific settings, which inevitably
>> leads to one-setting-per-file, e.g.:
>
>> SET PERSISTENT work_mem = 256MB;
>
>> What Amit seems to be talking about is more EXPORT SETTINGS, where you
>> dump all current settings in the session to a file.  This seems likely
>> to produce accidental changes when the user writes out settings they've
>> forgotten they changed.
>
> Yeah.  It also seems to be unnecessarily different from the existing
> model of SET.  I'd go with one-setting-per-command.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Robert Haas
On Mon, Nov 12, 2012 at 10:59 PM, Amit kapila  wrote:
> Is the above opinion only about locking, or also about the way to write the
> changed settings to a file, as mentioned in point 1 in the mail chain upthread?
> (Point-1: > 1. While writing the .auto file, it will always assume that the
> .auto file contains
>> all config parameters.
>>   Now as this .auto file is of fixed format and fixed record size, it can
>> directly write a given record to its particular position.)
> My thinking was that if we decide that the format and size of each
> configuration entry is fixed, a record can be written directly without doing
> anything for it in memory.

Uh, no, I don't think that's a good idea.  IMHO, what we should do is:

1. Read postgresql.conf.auto and remember all the settings we saw.  If
we see something funky like an include directive, barf.
2. Forget the value we remembered for the particular setting being
changed.  Instead, remember the user-supplied new value for that
parameter.
3. Write a new postgresql.conf.auto based on the information
remembered in steps 1 and 2.
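
A self-contained sketch of that read-remember-rewrite cycle (ignoring
locking, durability, comments, and quoting, all of which a real patch
would have to handle):

    #include <stdio.h>
    #include <string.h>

    #define MAX_SETTINGS 256

    static int
    set_auto_setting(const char *path, const char *name, const char *value)
    {
        char    names[MAX_SETTINGS][64];
        char    values[MAX_SETTINGS][256];
        char    line[512], tmp[512];
        int     n = 0, i, found = 0;
        FILE   *f = fopen(path, "r");

        if (f != NULL)
        {
            /* step 1: read and remember every "name = value" line */
            while (n < MAX_SETTINGS && fgets(line, sizeof(line), f))
            {
                char   *eq = strchr(line, '=');

                if (strncmp(line, "include", 7) == 0)
                {
                    fclose(f);
                    return -1;      /* something funky: barf */
                }
                if (eq == NULL)
                    continue;
                *eq = '\0';
                if (sscanf(line, "%63s", names[n]) != 1 ||
                    sscanf(eq + 1, " %255[^\n]", values[n]) != 1)
                    continue;
                n++;
            }
            fclose(f);
        }

        /* step 2: forget the old value, remember the user-supplied one */
        for (i = 0; i < n; i++)
            if (strcmp(names[i], name) == 0)
            {
                snprintf(values[i], sizeof(values[i]), "%s", value);
                found = 1;
            }
        if (!found && n < MAX_SETTINGS)
        {
            snprintf(names[n], sizeof(names[n]), "%s", name);
            snprintf(values[n], sizeof(values[n]), "%s", value);
            n++;
        }

        /* step 3: rewrite the whole file, atomically via rename() */
        snprintf(tmp, sizeof(tmp), "%s.tmp", path);
        if ((f = fopen(tmp, "w")) == NULL)
            return -1;
        for (i = 0; i < n; i++)
            fprintf(f, "%s = %s\n", names[i], values[i]);
        fclose(f);
        return rename(tmp, path);
    }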

Of course, if we go with one-file-per-setting, then this becomes even
simpler: just clobber the file for the single setting being updated -
creating it if it doesn't exist - and ignore all the rest.  I don't
personally favor that approach because I think it's clunky to
manage, but YMMV.

With either approach, it's worth noting that a RESET variant of this
could be useful - which would either remove the chosen setting from
postgresql.conf.auto, or remove the file containing the
automatically-set value for that setting.  I think my personal
favorite syntax is:

ALTER SYSTEM .. SET wunk = 'thunk';
ALTER SYSTEM .. RESET wunk;

But I'm OK with something else if there's consensus.  I don't
particularly like SET PERSISTENT because I think this is more like
ALTER DATABASE .. SET than it is like SET LOCAL, but IJWH.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 12:38 PM, Tom Lane  wrote:
> Simon Riggs  writes:
>> The most popular relational database in the world is Microsoft Access,
>> not MySQL. Access appears desirable because it allows a single user to
>> create and use a database (which is very good). But all business
>> databases have a requirement for at least one of: high availability,
>> multi-user access or downstream processing in other parts of the
>> business.
>
> That's a mighty sweeping claim, which you haven't offered adequate
> evidence for.  The fact of the matter is that there is *lots* of demand
> for simple single-user databases, and what I'm proposing is at least a
> first step towards getting there.
>
> The main disadvantage of approaching this via the existing single-user
> mode is that you won't have any autovacuum, bgwriter, etc, support.
> But the flip side is that that lack of infrastructure is a positive
> advantage for certain admittedly narrow use-cases, such as disaster
> recovery and pg_upgrade.  So while I agree that this isn't the only
> form of single-user mode that we'd like to support, I think it is *a*
> form we'd like to support, and I don't see why you appear to be against
> having it at all.
>
> A more reasonable objection would be that we need to make sure that this
> isn't foreclosing the option of having a multi-process environment with
> a single user connection.  I don't see that it is, but it might be wise
> to sketch exactly how that case would work before accepting this.

I'm not particularly excited about providing more single-user mode
options, but I think it's worth having this particular thing because
it makes pg_upgrade more robust.  Whether we do anything else is
something we can litigate when the time comes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Simon Riggs
On 13 November 2012 17:38, Tom Lane  wrote:
> Simon Riggs  writes:
>> The most popular relational database in the world is Microsoft Access,
>> not MySQL. Access appears desirable because it allows a single user to
>> create and use a database (which is very good). But all business
>> databases have a requirement for at least one of: high availability,
>> multi-user access or downstream processing in other parts of the
>> business.
>
> That's a mighty sweeping claim, which you haven't offered adequate
> evidence for.  The fact of the matter is that there is *lots* of demand
> for simple single-user databases, and what I'm proposing is at least a
> first step towards getting there.

I agree there is lots of demand for simple single-user databases and I
wish that too. What I don't agree with is something that casts that
requirement in stone by architecturally/permanently disallowing
secondary connections.

Evidence for claims:
* The whole Business Intelligence industry relies on being able to
re-purpose existing data, forming integrated webs of interconnecting
databases. All of that happens after the initial developers write the
first version of the database application.
* Everybody wants a remote backup, whether its for your mobile phone
contact list or your enterprise datastore.

People are migrating away from embedded databases in droves for these
very reasons.

> The main disadvantage of approaching this via the existing single-user
> mode is that you won't have any autovacuum, bgwriter, etc, support.
> But the flip side is that that lack of infrastructure is a positive
> advantage for certain admittedly narrow use-cases, such as disaster
> recovery and pg_upgrade.  So while I agree that this isn't the only
> form of single-user mode that we'd like to support, I think it is *a*
> form we'd like to support, and I don't see why you appear to be against
> having it at all.

I have no problem with people turning things off; what I reject is the idea
that we should encourage people to never be able to turn them back on.

> A more reasonable objection would be that we need to make sure that this
> isn't foreclosing the option of having a multi-process environment with
> a single user connection.  I don't see that it is, but it might be wise
> to sketch exactly how that case would work before accepting this.

Whatever we provide will become the norm. I don't have a problem with
you providing BOTH the proposed single user mode AND the multi-process
single user connection mode in this release. But if you provide just
one of them and it's the wrong one, we will be severely hampered in the
future.

Yes, I am very much against this project producing a new DBMS
architecture that works on top of PostgreSQL data files, yet prevents
maintenance, backup, replication and multi-user modes.

I see this decision as a critical point for this project, so please
consider this objection and where it comes from.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Variable length array element encoding…

2012-11-13 Thread Sean Chittenden
[ Not subscribed, please keep me in the CC list ]

Is there a standard idiom for encoding small variable-length data in an array? 
I wrote the varint extension[1] that encodes data using a variable-width 
encoding scheme[2] for signed and unsigned integers[3]. Right now the extension 
is mostly of use in skinny tables that have at least 4-5 columns, all of which 
are INT or INT8. If you have only 5 columns of INT8, you can save ~50% of 
your table space.

But, to get larger savings, it's necessary to bypass the tuple overhead by 
aggregating data into an array (i.e. aggregate all time series data for a 5min 
window of time into a single varuint[]).

The problem with that being, each varint takes 8 bytes in an array because of 
padding and alignment. Is there a way to prevent that, or, more realistically, 
are there standard ways of encoding this data into a BYTEA and then manually 
scanning and unpacking the data? Random access into the array isn't a concern. 
I was thinking about adding a BYTEA to varint[] cast, but am fishing for a 
better idea.
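
For concreteness, the kind of packing I have in mind is an LEB128-style
byte stream - 7 payload bits per byte, high bit as a continuation flag.
(My extension's actual wire format differs slightly; this is just the
shape of the idea.)

    #include <stdint.h>
    #include <stddef.h>

    /* pack v low bits first; returns the number of bytes written */
    static size_t
    varuint_pack(uint64_t v, unsigned char *buf)
    {
        size_t  n = 0;

        do
        {
            unsigned char b = v & 0x7F;

            v >>= 7;
            if (v)
                b |= 0x80;          /* more bytes follow */
            buf[n++] = b;
        } while (v);
        return n;
    }

    /* scan one value back out of the stream; returns bytes consumed */
    static size_t
    varuint_unpack(const unsigned char *buf, uint64_t *v)
    {
        size_t  n = 0;
        int     shift = 0;

        *v = 0;
        do
        {
            *v |= (uint64_t) (buf[n] & 0x7F) << shift;
            shift += 7;
        } while (buf[n++] & 0x80);
        return n;
    }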

Any hints or thoughts? Thanks in advance. -sc


[1] https://github.com/sean-/postgresql-varint

[2] SELECT varint64, pg_column_size(varint64) FROM varint64_table ORDER BY 
varint64 ASC;
       varint64       | pg_column_size 
----------------------+----------------
 -4611686018427387905 |             11
 -4611686018427387904 |             10
   -36028797018963969 |             10
   -36028797018963968 |              9
     -281474976710657 |              9
     -281474976710656 |              8
       -2199023255553 |              8
       -2199023255552 |              7
         -17179869185 |              7
         -17179869184 |              6
           -134217729 |              6
           -134217728 |              5
             -1048577 |              5
             -1048576 |              4
                -8193 |              4
                -8192 |              3
                  -65 |              3
                  -64 |              2
                   -1 |              2
                    0 |              2
                    1 |              2
                   63 |              2
                   64 |              3
                 8191 |              3
                 8192 |              4
              1048575 |              4
              1048576 |              5
            134217727 |              5
            134217728 |              6
          17179869183 |              6
          17179869184 |              7
        2199023255551 |              7
        2199023255552 |              8
      281474976710655 |              8
      281474976710656 |              9
    36028797018963967 |              9
    36028797018963968 |             10
  4611686018427387903 |             10
  4611686018427387904 |             11
(39 rows)

SELECT varuint64, pg_column_size(varuint64) FROM varuint64_table ORDER BY 
varuint64 ASC;
      varuint64      | pg_column_size 
---------------------+----------------
                   0 |              2
                 127 |              2
                 128 |              3
               16383 |              3
               16384 |              4
             2097151 |              4
             2097152 |              5
           268435455 |              5
           268435456 |              6
         34359738367 |              6
         34359738368 |              7
       4398046511103 |              7
       4398046511104 |              8
     562949953421311 |              8
     562949953421312 |              9
   72057594037927935 |              9
   72057594037927936 |             10
 9223372036854775807 |             10
(18 rows)

[3] I know the unsigned int only goes up to 2^63 atm; it will go to 2^64 once 
I get around to setting up a test methodology. Using INT8 internally was too 
convenient at the time.

--
Sean Chittenden
s...@chittenden.org





Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 12:18 AM, Tom Lane  wrote:
> I wonder though if we ought to think about running output functions in
> a short-lived memory context instead of the executor's main context.
> We've considered that before, I think, and it's always been the path
> of least resistance to fix the output functions instead --- but there
> will always be another leak I'm afraid.

Such is the lot of people who code in C.  I worry that the number of
memory contexts we're kicking around already is imposing a significant
distributed overhead on the system that is hard to measure but
nevertheless real, and that this will add to it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Tom Lane
Simon Riggs  writes:
> On 13 November 2012 17:38, Tom Lane  wrote:
>> ...  The fact of the matter is that there is *lots* of demand
>> for simple single-user databases, and what I'm proposing is at least a
>> first step towards getting there.

> I agree there is lots of demand for simple single-user databases and I
> wish that too. What I don't agree with is something that casts that
> requirement in stone by architecturally/permanently disallowing
> secondary connections.

If you want secondary connections, then I think you want a postmaster.
We already have umpteen ways to limit who can connect (for example,
putting the socket in a directory with limited access rights), and in
that sort of situation I don't see why you'd really want a database
that is only accessible when the "main" client is running.

The case that this patch is meant to address is one where there is only
one client application, period, and you'd rather that the database
starts and stops automatically with that application instead of needing
any management complexity.  Now we can debate whether we want only one
process or multiple processes underneath the client application, but
I think the restriction to one client connection is a key *feature*
not a bug, precisely because it removes a whole bunch of user-visible
complexity that we cannot escape otherwise.

> People are migrating away from embedded databases in droves for these
> very reasons.

[ shrug... ]  If they don't want an embedded database, they won't want
this either, but there are still plenty of people left who do want an
embedded database.  We've never had an adequate offering for those
people before.  If we ratchet up the management complexity of "single
user" mode then it still won't be an adequate offering for them.

> I see this decision as a critical point for this project, so please
> consider this objection and where it comes from.

I think this is nonsense.  It's not critical; it's a very small patch
that provides a feature of interest to a limited audience.  And I don't
believe it's foreclosing providing other operating modes later, unless
maybe people feel this is "almost good enough" and lose motivation to
work on those other operating modes.  But if that happens, then I'd say
the demand for the other modes isn't as high as you think.

regards, tom lane




Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Tom Lane
Robert Haas  writes:
> On Tue, Nov 13, 2012 at 12:18 AM, Tom Lane  wrote:
>> I wonder though if we ought to think about running output functions in
>> a short-lived memory context instead of the executor's main context.
>> We've considered that before, I think, and it's always been the path
>> of least resistance to fix the output functions instead --- but there
>> will always be another leak I'm afraid.

> Such is the lot of people who code in C.  I worry that the number of
> memory contexts we're kicking around already is imposing a significant
> distributed overhead on the system that is hard to measure but
> nevertheless real, and that this will add to it.

Yeah, perhaps.  I'd like to think that a MemoryContextReset is cheaper
than a bunch of retail pfree's, but it's hard to prove anything without
actually coding and testing it --- and on modern machines, effects like
cache locality could swamp pure instruction-count gains anyway.

Anyway, I committed the narrow fix for the moment.

regards, tom lane




Re: [HACKERS] Enabling Checksums

2012-11-13 Thread Robert Haas
On Sun, Nov 11, 2012 at 5:52 PM, Jeff Davis  wrote:
> Per-database does sound easier than per-table. I'd have to think about
> how that would affect shared catalogs though.
>
> For now, I'm leaning toward an offline utility to turn checksums on or
> off, called pg_checksums. It could do so lazily (just flip a switch to
> "enabling" in pg_control), or it could do so eagerly and turn it into a
> fully-protected instance.
>
> For the first patch, it might just be an initdb-time option for
> simplicity.

It'd be pretty easy to write a pg_checksums utility to turn checksums
on/off on a database that is shut down, since the hard part of all of
this is to change the state while the database is running.  But I
think even that doesn't need to be part of the first patch.  A small
patch that gets committed is better than a big one that doesn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Christopher Browne
Preface: I think there's some great commentary here, and find myself
agreeing pretty whole-heartedly.

On Tue, Nov 13, 2012 at 2:45 PM, Simon Riggs  wrote:

> On 13 November 2012 17:38, Tom Lane  wrote:
> > Simon Riggs  writes:
> >> The most popular relational database in the world is Microsoft Access,
> >> not MySQL. Access appears desirable because it allows a single user to
> >> create and use a database (which is very good). But all business
> >> databases have a requirement for at least one of: high availability,
> >> multi-user access or downstream processing in other parts of the
> >> business.
> >
> > That's a mighty sweeping claim, which you haven't offered adequate
> > evidence for.  The fact of the matter is that there is *lots* of demand
> > for simple single-user databases, and what I'm proposing is at least a
> > first step towards getting there.
>
> I agree there is lots of demand for simple single-user databases and I
> wish that too. What I don't agree with is something that casts that
> requirement in stone by architecturally/permanently disallowing
> secondary connections.
>
> Evidence for claims:
> * The whole Business Intelligence industry relies on being able to
> re-purpose existing data, forming integrated webs of interconnecting
> databases. All of that happens after the initial developers write the
> first version of the database application.
> * Everybody wants a remote backup, whether its for your mobile phone
> contact list or your enterprise datastore.
>
> People are migrating away from embedded databases in droves for these
> very reasons.
>

There seems to be a continuum of different sorts of scenarios of
more-to-less concurrency that are desirable for some different reasons.
From most-to-least, I can see:

1 - Obviously, there's the case that Postgres is eminently good at, of
  supporting many users concurrently using a database.  We love that,
  let's not break it :-).

2 - We have found it useful to have some extra work processes that do some
  useful internal things, such as vacuuming, forcing background writes,
  collecting statistics.  And an online backup requires having a second
  process.

3 - People doing embedded systems find it attractive to attach all the data
  to the singular user running the system.  Witness the *heavy* deployment
  of SQLite on Android and iOS.  People make an assumption that this is
  a single-process thing, but I am inclined to be a bit skeptical.  What
  they *do* know is that it's convenient to not spawn extra processes and
  do IPC.  That's not quite the same thing as it being a certainty that
  they definitely want not to have more than one process.

4 - There are times when there *is* certainty about not wanting there to be
  more than one process.  When running pg_upgrade, or, at certain times,
  when doing streaming replication node status switches, one might have
  that certainty.  Or when reindexing system tables, which needs single
  user mode.

For us to conflate the 3rd and 4th items seems like a mistake to me.


> > The main disadvantage of approaching this via the existing single-user
> > mode is that you won't have any autovacuum, bgwriter, etc, support.
> > But the flip side is that that lack of infrastructure is a positive
> > advantage for certain admittedly narrow use-cases, such as disaster
> > recovery and pg_upgrade.  So while I agree that this isn't the only
> > form of single-user mode that we'd like to support, I think it is *a*
> > form we'd like to support, and I don't see why you appear to be against
> > having it at all.
>
> I have no problem with people turning things off, I reject the idea
> that we should encourage people to never be able to turn them back on.
>

Yep.  That seems like conflating #2 with #4.

It's mighty attractive to have a forcible "single process mode" to add
safety to certain activities.

I think we need a sharper knife, though, so we don't ablate off stuff like
#2, just because someone imagined that "Must Have Single Process!!!" was
the right doctrine.


> > A more reasonable objection would be that we need to make sure that this
> > isn't foreclosing the option of having a multi-process environment with
> > a single user connection.  I don't see that it is, but it might be wise
> > to sketch exactly how that case would work before accepting this.
>
> Whatever we provide will become the norm. I don't have a problem with
> you providing BOTH the proposed single user mode AND the multi-process
> single user connection mode in this release. But if you provide just
> one of them and its the wrong one, we will be severely hampered in the
> future.
>
> Yes, I am very much against this project producing a new DBMS
> architecture that works on top of PostgreSQL data files, yet prevents
> maintenance, backup, replication and multi-user modes.
>
> I see this decision as a critical point for this project, so please
> consider this objection and where it comes from.
>

I don't think we're necessarily *hu

Re: [HACKERS] Enabling Checksums

2012-11-13 Thread Robert Haas
On Mon, Nov 12, 2012 at 4:44 AM, Craig Ringer  wrote:
> That'll make it hard for VACUUM, hint-bit setting, etc to
> opportunistically checksum pages whenever they're doing a page write anyway.
>
> Is it absurd to suggest using another bitmap, like the FSM or visibility
> map, to store information on page checksumming while checksumming is
> enabled but incomplete? As a much smaller file, the bitmap could itself
> be very quickly generated in one pass when checksumming is enabled, with
> its starting state showing no pages having checksums.

Hmm... what if we took this a step further and actually stored the
checksums in a separate relation fork?  That would make it pretty
simple to support enabling/disabling checksums for particular
relations.  It would also allow us to have a wider checksum, like 32
or 64 bits rather than 16.  I'm not scoffing at a 16-bit checksum,
because even that's enough to catch a very high percentage of errors,
but it wouldn't be terrible to be able to support a wider one, either.
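
As a rough illustration of the width question (an editorial sketch, not
PostgreSQL's actual checksum code): a Fletcher-style sum naturally yields
32 bits, and a 16-bit value can be had by folding the same computation,
trading detection strength for header space.

#include <stddef.h>
#include <stdint.h>

#define BLCKSZ 8192

/* Fletcher-32 over one 8K page: two running sums modulo 65535. */
static uint32_t
checksum32(const uint8_t *page)
{
    uint32_t    sum1 = 0;
    uint32_t    sum2 = 0;
    size_t      i;

    for (i = 0; i < BLCKSZ; i += 2)
    {
        /* naive per-step reduction; real code would batch it */
        sum1 = (sum1 + (page[i] | (uint32_t) page[i + 1] << 8)) % 65535;
        sum2 = (sum2 + sum1) % 65535;
    }
    return (sum2 << 16) | sum1;
}

/* A 16-bit checksum can simply fold the wider one. */
static uint16_t
checksum16(const uint8_t *page)
{
    uint32_t    c = checksum32(page);

    return (uint16_t) (c ^ (c >> 16));
}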

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 3:21 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Nov 13, 2012 at 12:18 AM, Tom Lane  wrote:
>>> I wonder though if we ought to think about running output functions in
>>> a short-lived memory context instead of the executor's main context.
>>> We've considered that before, I think, and it's always been the path
>>> of least resistance to fix the output functions instead --- but there
>>> will always be another leak I'm afraid.
>
>> Such is the lot of people who code in C.  I worry that the number of
>> memory contexts we're kicking around already is imposing a significant
>> distributed overhead on the system that is hard to measure but
>> nevertheless real, and that this will add to it.
>
> Yeah, perhaps.  I'd like to think that a MemoryContextReset is cheaper
> than a bunch of retail pfree's, but it's hard to prove anything without
> actually coding and testing it --- and on modern machines, effects like
> cache locality could swamp pure instruction-count gains anyway.

Yeah.  The thing that concerns me is that I think we have a pretty
decent number of memory contexts where the expected number of
allocations is very small ... and we have the context *just in case*
we do more than that in certain instances.  I've seen profiles where
the setup/teardown costs of memory contexts are significant ... which
doesn't mean that those examples would perform better with fewer
memory contexts, but it's enough to make me pause for thought.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] [PATCH] Patch to compute Max LSN of Data Pages

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 11:46 AM, Fujii Masao  wrote:
> Without this utility, it's difficult to calculate the maximum LSN of a
> data page, so basically we needed to take a backup when starting the
> standby. In the future, thanks to this utility, we can calculate the
> maximum LSN, and can skip a backup if that LSN is less than the master's
> (i.e., last applied LSN, IOW, the timeline switch LSN).

Doesn't the minimum recovery point give us that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Enabling Checksums

2012-11-13 Thread Tom Lane
Robert Haas  writes:
> Hmm... what if we took this a step further and actually stored the
> checksums in a separate relation fork?  That would make it pretty
> simple to support enabling/disabling checksums for particular
> relations.  It would also allow us to have a wider checksum, like 32
> or 64 bits rather than 16.  I'm not scoffing at a 16-bit checksum,
> because even that's enough to catch a very high percentage of errors,
> but it wouldn't be terrible to be able to support a wider one, either.

What happens when you get an I/O failure on the checksum fork?  Assuming
you're using 8K pages there, that would mean you can no longer verify
the integrity of between one and four thousand pages of data.

Not to mention the race condition problems associated with trying to be
sure the checksum updates hit the disk at the same time as the data-page
updates.

I think you really have to store the checksums *with* the data they're
supposedly protecting.

regards, tom lane


Re: [HACKERS] foreign key locks

2012-11-13 Thread Alvaro Herrera
Noah Misch wrote:
> On Wed, Oct 31, 2012 at 05:22:10PM -0300, Alvaro Herrera wrote:

> > Not really sure about the proposed syntax, but yes clearly we need some
> > other syntax to mean "FOR NON KEY UPDATE".  I would rather keep FOR
> > UPDATE to mean what I currently call FOR KEY UPDATE.  More proposals for
> > the other (weaker) lock level welcome (but if you love FOR NON KEY
> > UPDATE, please chime in too)
> 
> Agree on having "FOR UPDATE" without any "FOR KEY UPDATE" synonym.  For the
> weaker lock, I mildly preferred the proposal of "FOR NO KEY UPDATE".  NON KEY
> captures the idea better in English, but NO is close enough and already part
> of the SQL lexicon.

This is the proposal I like best; however there is an asymmetry, because
the locking options now are

FOR KEY SHARE
FOR SHARE
FOR NO KEY UPDATE
FOR UPDATE

I used to have comments such as

/* Currently, SELECT ... FOR [KEY] UPDATE/SHARE requires UPDATE privileges */
#define ACL_SELECT_FOR_UPDATE   ACL_UPDATE

but now they are slightly incorrect because the NO is not illustrated.
I guess I could use SELECT ... FOR [NO KEY] UPDATE/SHARE but this leaves
out the "FOR KEY SHARE" case (and can be thought to introduce a FOR NO
KEY SHARE case).  And getting much more verbose than that is probably
not warranted.  In some places I would like to use a phrase like "the
locking clause", but I'm not sure that it's clear enough.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


[HACKERS] Process waiting for ExclusiveLock on INSERT

2012-11-13 Thread Jehan-Guillaume de Rorthais

Hello,

I had a surprising question today from a customer about exclusive
locks on INSERT. His log file shows something like:


==>
  LOG:  process 1881 still waiting for ExclusiveLock on extension of
relation 168318652 of database 58429185 after 1000.065 ms
  STATEMENT:  INSERT INTO  (...cols...) VALUES (...values...)
<==


While asking on #postgresql and investigating the code, I think I have
a theory. When the system is I/O bound and backends are doing shared
buffer cleanup, at some point they have to extend a relation with new
pages, which requires an exclusive lock on the relation to forbid
anyone else from extending it at the same time. At that point, if
multiple backends try to extend the relation, one wins and all the
others wait for the lock, leading to messages in the log file when
log_lock_waits is enabled.

This lock comes from src/backend/access/heap/hio.c:432:
==>
  /*
   * Have to extend the relation.
   *
   * We have to use a lock to ensure no one else is extending the rel at the
   * same time, else we will both try to initialize the same new page.  We
   * can skip locking for new or temp relations, however, since no one else
   * could be accessing them.
   */
  needLock = !RELATION_IS_LOCAL(relation);

  if (needLock)
      LockRelationForExtension(relation, ExclusiveLock);
<==


Is this theory correct, or should this issue be discussed a bit more on
this list?

PFA a small bash script, with a small PostgreSQL configuration in
comments, that reproduces this behavior very often on my laptop, e.g.:


==>
  2012-11-13 23:15:51 CET [23137]: [1-1] user=postgres,db=test LOG:
process 23137 still waiting for ExclusiveLock on extension of relation
28118 of database 28115 after 100.086 ms
  2012-11-13 23:15:51 CET [23137]: [2-1] user=postgres,db=test
STATEMENT:  insert into test select i, md5(i::text) from
generate_series(1,1000) AS i
<==

Regards,
-- 
Jehan-Guillaume de Rorthais
http://www.dalibo.com


test_exclusivelock_on_insert.sh
Description: application/shellscript


Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Dimitri Fontaine
Tom Lane  writes:
>> I agree there is lots of demand for simple single-user databases and I
>> wish that too. What I don't agree with is something that casts that
>> requirement in stone by architecturally/permanently disallowing
>> secondary connections.
>
> If you want secondary connections, then I think you want a postmaster.

I would agree. I think you're both talking past each other, and that
what Simon is worried about (though I haven't asked him about that
before sending this email) is how to change the application setup to
switch from single-user mode to multi-user mode.

IIRC the way to implement single user mode in your application is quite
low-level with this patch, so switching to multi-user mode is not about
just changing the connection string, or is it?

> The case that this patch is meant to address is one where there is only
> one client application, period, and you'd rather that the database
> starts and stops automatically with that application instead of needing
> any management complexity.  Now we can debate whether we want only one
> process or multiple processes underneath the client application, but
> I think the restriction to one client connection is a key *feature*
> not a bug, precisely because it removes a whole bunch of user-visible
> complexity that we cannot escape otherwise.

Well, I think your patch would be easier to accept as is if it were
documented only as a psql-friendly single-user mode. I would really
welcome that.

> embedded database.  We've never had an adequate offering for those
> people before.  If we ratchet up the management complexity of "single
> user" mode then it still won't be an adequate offering for them.

Now, if we're talking about single-user mode as in an embedded database,
I really do think this patch should include a solution to run online
maintenance, logical and physical backups, replication, archiving, and
all the production-grade features you expect from PostgreSQL.

And then I understand Simon's POV about code complexity and bgworkers,
for example, which will *need* to be taken care of in that solution.

> I think this is nonsense.  It's not critical; it's a very small patch
> that provides a feature of interest to a limited audience.  And I don't

Yes, it's providing full psql capabilities where we only had that bizarre
postgres --single interface. Maybe it will make initdb and debugging
easier too.

> believe it's foreclosing providing other operating modes later, unless
> maybe people feel this is "almost good enough" and lose motivation to
> work on those other operating modes.  But if that happens, then I'd say
> the demand for the other modes isn't as high as you think.

Again, my concern on that point after reading Simon's comments is only
about the production procedure you have to follow to switch your
application from single user mode to multi user mode.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


Re: [HACKERS] Index only scans wiki page

2012-11-13 Thread Peter Geoghegan
On 13 November 2012 16:37, Robert Haas  wrote:
> I found this an interesting read.  As one of the people who worked on
> the feature, I'm sort of curious whether people have any experience
> yet with how this actually shakes out in the field.  Are you (or is
> anyone) aware of positive/negative field experiences with this
> feature?

Unfortunately, I don't think that I have any original insight about
the problems with index-only scans in the field right now.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> Yeah.  The thing that concerns me is that I think we have a pretty
> decent number of memory contexts where the expected number of
> allocations is very small ... and we have the context *just in case*
> we do more than that in certain instances.  I've seen profiles where
> the setup/teardown costs of memory contexts are significant ... which
> doesn't mean that those examples would perform better with fewer
> memory contexts, but it's enough to make me pause for thought.

So, for my 2c, I'm on the other side of this, personally.  We have
memory contexts for more-or-less exactly this issue.  It's one of the
great things about PG - it's resilient and very unlikely to have large or
bad memory leaks in general, much of which can, IMV, be attributed to
our use of memory contexts.

Thanks,

Stephen


[HACKERS] Add contrib module functions to docs' function index

2012-11-13 Thread Craig Ringer
Hi all

Andreas Heiduk on -bugs suggested that we add the functions provided by
contrib modules to the function index in the docs, so it's easier to go
from, say,  "what the heck is idx(...)" to finding it in the intarray
contrib module.

This seems like a good idea and I'd like to pop it in the TODO until I
get time to check it out, flagged as a minor/newbie-friendly problem.
Any objections?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Add contrib module functions to docs' function index

2012-11-13 Thread David Johnston
> -----Original Message-----
> From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
> ow...@postgresql.org] On Behalf Of Craig Ringer
> Sent: Tuesday, November 13, 2012 6:15 PM
> To: PostgreSQL Hackers
> Subject: [HACKERS] Add contrib module functions to docs' function index
> 
> Hi all
> 
> Andreas Heiduk on -bugs suggested that we add the functions provided by
> contrib modules to the function index in the docs, so it's easier to go
> from, say, "what the heck is idx(...)" to finding it in the intarray
> contrib module.
> 
> This seems like a good idea and I'd like to pop it in the TODO until I
> get time to check it out, flagged as a minor/newbie-friendly problem.
> Any objections?
> 

For clarity, does this proposal refer to Chapter 9 of the documentation,
the "Index", or both?

If modifying Chapter 9, the function and operator tables should be
extended to include a "source" column with values of "base" or
"contrib: <module name>" or something similar.

As to the desirability of such a change, I concur that it would be a nice
usability enhancement to consider beyond just updating the actual "Index".

David J.


Re: [HACKERS] Add contrib module functions to docs' function index

2012-11-13 Thread Craig Ringer
On 11/14/2012 07:56 AM, David Johnston wrote:
>
> For clarity, does this proposal refer to Chapter 9 of the documentation,
> the "Index", or both?
>
> If modifying Chapter 9, the function and operator tables should be
> extended to include a "source" column with values of "base" or
> "contrib: <module name>" or something similar.
>
> As to the desirability of such a change, I concur that it would be a nice
> usability enhancement to consider beyond just updating the actual "Index".
Sorry I was unclear.

I'm talking about making sure that contrib module functions (and
settings) appear in the documentation index
(http://www.postgresql.org/docs/current/static/bookindex.html) so it's
easy to find a function by name whether it's in core or contrib. This is
what I want to add to the TODO.

Separately, it might also be nice to add the contrib functions to the
section 9 tables with an extra column showing their origin, but that's
less clearly a good thing. Even if there's a column saying "intarray"
for intarray functions in the array functions list, people will still
try to use them without loading the extension and get confused when
they're not found. It'll also bloat the listings of core functions.
Rather than do that, I'd probably prefer to add a note to relevant
sections. For example, in array functions I'd want to add "Additional
functions that operate only on arrays of integers are available in the
intarray extension".

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Further pg_upgrade analysis for many tables

2012-11-13 Thread Bruce Momjian
On Tue, Nov 13, 2012 at 07:03:51PM -0500, Bruce Momjian wrote:
> I am attaching an updated pg_upgrade patch, which I believe is ready for
> application for 9.3.

Correction: here is the proper patch.  The previously posted version
had pending merges from the master branch.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/file.c b/contrib/pg_upgrade/file.c
new file mode 100644
index a5d92c6..d8cd8f5
*** a/contrib/pg_upgrade/file.c
--- b/contrib/pg_upgrade/file.c
*************** copy_file(const char *srcfile, const cha
*** 221,281 ****
  #endif
  
  
- /*
-  * load_directory()
-  *
-  * Read all the file names in the specified directory, and return them as
-  * an array of "char *" pointers.  The array address is returned in
-  * *namelist, and the function result is the count of file names.
-  *
-  * To free the result data, free each (char *) array member, then free the
-  * namelist array itself.
-  */
- int
- load_directory(const char *dirname, char ***namelist)
- {
- 	DIR		   *dirdesc;
- 	struct dirent *direntry;
- 	int			count = 0;
- 	int			allocsize = 64;		/* initial array size */
- 
- 	*namelist = (char **) pg_malloc(allocsize * sizeof(char *));
- 
- 	if ((dirdesc = opendir(dirname)) == NULL)
- 		pg_log(PG_FATAL, "could not open directory \"%s\": %s\n",
- 			   dirname, getErrorText(errno));
- 
- 	while (errno = 0, (direntry = readdir(dirdesc)) != NULL)
- 	{
- 		if (count >= allocsize)
- 		{
- 			allocsize *= 2;
- 			*namelist = (char **)
- 		pg_realloc(*namelist, allocsize * sizeof(char *));
- 		}
- 
- 		(*namelist)[count++] = pg_strdup(direntry->d_name);
- 	}
- 
- #ifdef WIN32
- 	/*
- 	 * This fix is in mingw cvs (runtime/mingwex/dirent.c rev 1.4), but not in
- 	 * released version
- 	 */
- 	if (GetLastError() == ERROR_NO_MORE_FILES)
- 		errno = 0;
- #endif
- 
- 	if (errno)
- 		pg_log(PG_FATAL, "could not read directory \"%s\": %s\n",
- 			   dirname, getErrorText(errno));
- 
- 	closedir(dirdesc);
- 
- 	return count;
- }
- 
- 
  void
  check_hard_link(void)
  {
--- 221,226 ----
diff --git a/contrib/pg_upgrade/pg_upgrade.h b/contrib/pg_upgrade/pg_upgrade.h
new file mode 100644
index 3058343..f35ce75
*** a/contrib/pg_upgrade/pg_upgrade.h
--- b/contrib/pg_upgrade/pg_upgrade.h
***************
*** 7,13 ****
  
  #include 
  #include <unistd.h>
  #include <assert.h>
- #include <dirent.h>
  #include <sys/stat.h>
  #include <sys/time.h>
--- 7,12 
*** const char *setupPageConverter(pageCnvCt
*** 366,372 
  typedef void *pageCnvCtx;
  #endif
  
- int			load_directory(const char *dirname, char ***namelist);
  const char *copyAndUpdateFile(pageCnvCtx *pageConverter, const char *src,
    const char *dst, bool force);
  const char *linkAndUpdateFile(pageCnvCtx *pageConverter, const char *src,
--- 365,370 ----
diff --git a/contrib/pg_upgrade/relfilenode.c b/contrib/pg_upgrade/relfilenode.c
new file mode 100644
index 33a867f..d763ba7
*** a/contrib/pg_upgrade/relfilenode.c
--- b/contrib/pg_upgrade/relfilenode.c
***************
*** 17,25 ****
  
  static void transfer_single_new_db(pageCnvCtx *pageConverter,
  	   FileNameMap *maps, int size);
! static void transfer_relfile(pageCnvCtx *pageConverter,
!  const char *fromfile, const char *tofile,
!  const char *nspname, const char *relname);
  
  
  /*
--- 17,24 ----
  
  static void transfer_single_new_db(pageCnvCtx *pageConverter,
  	   FileNameMap *maps, int size);
! static void transfer_relfile(pageCnvCtx *pageConverter, FileNameMap *map,
! 			 const char *suffix);
  
  
  /*
*************** static void
*** 131,185 ****
  transfer_single_new_db(pageCnvCtx *pageConverter,
  	   FileNameMap *maps, int size)
  {
- 	char		old_dir[MAXPGPATH];
- 	char		file_pattern[MAXPGPATH];
- 	char		**namelist = NULL;
- 	int			numFiles = 0;
  	int			mapnum;
! 	int			fileno;
! 	bool		vm_crashsafe_change = false;
! 
! 	old_dir[0] = '\0';
! 
! 	/* Do not copy non-crashsafe vm files for binaries that assume crashsafety */
  	if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_CRASHSAFE_CAT_VER &&
  		new_cluster.controldata.cat_ver >= VISIBILITY_MAP_CRASHSAFE_CAT_VER)
! 		vm_crashsafe_change = true;
  
  	for (mapnum = 0; mapnum < size; mapnum++)
  	{
! 		char		old_file[MAXPGPATH];
! 		char		new_file[MAXPGPATH];
! 
! 		/* Changed tablespaces?  Need a new directory scan? */
! 		if (strcmp(maps[mapnum].old_dir, old_dir) != 0)
! 		{
! 			if (numFiles > 0)
! 			{
! for (fileno = 0; fileno < numFiles; fileno++)
! 	pg_free(namelist[fileno]);
! pg_free(namelist);
! 			}
! 
! 			snprintf(old_dir, sizeof(old_dir), "%s", maps[mapnum].old_dir);
! 			numFiles = load_directory(old_dir, &namelist);
! 		}
! 
! 		/* Copying files might take some time, so give feedback. */
! 
! 		snprintf(old_file, sizeof(old_file), "%s/%u", maps[mapnum].old_dir,
!  maps[mapnum].old_relfilenode);
! 		snprintf(new_file, sizeof(new_file), "%s/%u", maps[mapnum].new_dir,

Re: [HACKERS] Doc patch, index search_path where it's used to secure functions

2012-11-13 Thread Peter Eisentraut
On Fri, 2012-09-28 at 12:17 -0500, Karl O. Pinc wrote:
> On 09/28/2012 11:28:39 AM, Karl O. Pinc wrote:
> 
> > Doc patch, index search_path where it's used to secure functions.
> > search_path-securing.patch
> 
> Second version.  Should be indexing the concept, not the run-time 
> setting.

Well, I'm not sure.  We currently have three index entries on the topic:

search path
search_path
search_path configuration parameter

I think I'd put them all under search_path.



Re: [HACKERS] Doc patch, further describe and-mask nature of the permission system

2012-11-13 Thread Peter Eisentraut
On Sat, 2012-09-29 at 01:16 -0500, Karl O. Pinc wrote:
> This patch makes some sweeping statements.

Unfortunately, they are wrong.  What you term the additive nature is
really only a special case in the relationship between table and column
privileges.  Schema and database privileges are completely separate
things.



Re: [HACKERS] Further pg_upgrade analysis for many tables

2012-11-13 Thread Ants Aasma
On Wed, Nov 14, 2012 at 2:03 AM, Bruce Momjian  wrote:
> At 64k I see pg_upgrade taking 12% of the duration time, if I subtract
> out the dump/restore times.

My percentage numbers only included CPU time and I used SSD storage.
For the most part there was no IO wait to speak of, but it's
completely expected that thousands of link calls are not free.

>> Postgres time itself breaks down with 10% for shutdown checkpoint and
>> 90% for regular running, consisting of 16% parsing, 13% analyze, 20%
>> plan, 30% execute, 11% commit (AtEOXact_RelationCache) and 6% network.
>
> That SVG graph was quite impressive.

I used perf and Gprof2Dot for this. I will probably do a blog post on
how to generate these graphs. It's much more useful for me than a
plain flat profile as I don't know by heart which functions are called
by which.

>> It looks to me that most benefit could be had from introducing more
>> parallelism. Are there any large roadblocks to pipelining the dump and
>> restore to have them happen in parallel?
>
> I talked to Andrew Dunstan about parallelization in pg_restore.  First,
> we currently use pg_dumpall, which isn't in the custom format required
> for parallel restore, but even if we changed to custom format, create
> table isn't done in parallel, only create index/check constraints,
> trigger creation, etc.  Not sure it is worth pursuing this just for
> pg_upgrade.

I agree that parallel restore for schemas is a hard problem. But I
didn't mean parallelism within the restore, I meant that we could
start both postmasters and pipe the output from dump directly to
restore. This way the times for dumping and restoring can overlap.
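
A sketch of that pipelining idea, for illustration only (the ports and
options are stand-ins, and real code would live inside pg_upgrade with
proper error handling rather than shelling out like this):

#include <stdio.h>

int
main(void)
{
    FILE       *dump;
    FILE       *restore;
    char        buf[8192];
    size_t      n;

    /* Old cluster on 5432, new cluster on 5433: both made-up ports. */
    dump = popen("pg_dumpall --schema-only -p 5432", "r");
    restore = popen("psql -p 5433 -d postgres", "w");
    if (dump == NULL || restore == NULL)
        return 1;

    /* Copy bytes as they appear, so dump and restore run concurrently. */
    while ((n = fread(buf, 1, sizeof(buf), dump)) > 0)
        fwrite(buf, 1, n, restore);

    pclose(dump);
    return pclose(restore) == 0 ? 0 : 1;
}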

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


[HACKERS] recursive view syntax

2012-11-13 Thread Peter Eisentraut
I noticed we don't implement the recursive view syntax, even though it's
part of the standard SQL feature set for recursive queries.  Here is a
patch to add that.  It basically converts

CREATE RECURSIVE VIEW name (columns) AS SELECT ...;

to

CREATE VIEW name AS WITH RECURSIVE name (columns) AS (SELECT ...) SELECT
columns FROM name;

diff --git a/doc/src/sgml/ref/create_view.sgml b/doc/src/sgml/ref/create_view.sgml
index 838bf48..c13f3ec 100644
--- a/doc/src/sgml/ref/create_view.sgml
+++ b/doc/src/sgml/ref/create_view.sgml
@@ -21,7 +21,7 @@
 
  
 
-CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] VIEW name [ ( column_name [, ...] ) ]
+CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] [ RECURSIVE ] VIEW name [ ( column_name [, ...] ) ]
 [ WITH ( view_option_name [= view_option_value] [, ... ] ) ]
 AS query
 
@@ -81,6 +81,23 @@ Parameters

 

+RECURSIVE
+
+ 
+  Creates a recursive view.  The syntax
+
+CREATE RECURSIVE VIEW name (columns) AS SELECT ...;
+
+  is equivalent to
+
+CREATE VIEW name AS WITH RECURSIVE name (columns) AS (SELECT ...) SELECT columns FROM name;
+
+  A view column list must be specified for a recursive view.
+ 
+
+   
+
+   
 name
 
  
@@ -191,6 +208,16 @@ Examples
* was used to create the view, columns added later to
the table will not be part of the view.
   
+
+  
+   Create a recursive view consisting of the numbers from 1 to 100:
+
+CREATE RECURSIVE VIEW nums_1_100 (n) AS
+VALUES (1)
+UNION ALL
+SELECT n+1 FROM nums_1_100 WHERE n < 100;
+
+  
  
 
  
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e4ff76e..159096a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -164,6 +164,7 @@ static void SplitColQualList(List *qualList,
 static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
+static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
 %}
 
@@ -7834,6 +7835,30 @@ ViewStmt: CREATE OptTemp VIEW qualified_name opt_column_list opt_reloptions
 	n->options = $8;
 	$$ = (Node *) n;
 }
+		| CREATE OptTemp RECURSIVE VIEW qualified_name '(' columnList ')' opt_reloptions
+AS SelectStmt
+{
+	ViewStmt *n = makeNode(ViewStmt);
+	n->view = $5;
+	n->view->relpersistence = $2;
+	n->aliases = $7;
+	n->query = makeRecursiveViewSelect(n->view->relname, n->aliases, $11);
+	n->replace = false;
+	n->options = $9;
+	$$ = (Node *) n;
+}
+		| CREATE OR REPLACE OptTemp RECURSIVE VIEW qualified_name '(' columnList ')' opt_reloptions
+AS SelectStmt
+{
+	ViewStmt *n = makeNode(ViewStmt);
+	n->view = $7;
+	n->view->relpersistence = $4;
+	n->aliases = $9;
+	n->query = makeRecursiveViewSelect(n->view->relname, n->aliases, $13);
+	n->replace = true;
+	n->options = $11;
+	$$ = (Node *) n;
+}
 		;
 
 opt_check_option:
@@ -13541,6 +13566,60 @@ processCASbits(int cas_bits, int location, const char *constrType,
 	}
 }
 
+/*--
+ * Recursive view transformation
+ *
+ * Convert
+ *
+ * CREATE RECURSIVE VIEW relname (aliases) AS query
+ *
+ * to
+ *
+ * CREATE VIEW relname (aliases) AS
+ * WITH RECURSIVE relname (aliases) AS (query)
+ * SELECT aliases FROM relname
+ *
+ * Actually, just the WITH ... part, which is then inserted into the original
+ * view definition as the query.
+ * --
+ */
+static Node *
+makeRecursiveViewSelect(char *relname, List *aliases, Node *query)
+{
+	SelectStmt *s = makeNode(SelectStmt);
+	WithClause *w = makeNode(WithClause);
+	CommonTableExpr *cte = makeNode(CommonTableExpr);
+	List	   *tl = NIL;
+	ListCell   *lc;
+
+	cte->ctename = relname;
+	cte->aliascolnames = aliases;
+	cte->ctequery = query;
+	cte->location = -1;
+
+	w->recursive = true;
+	w->ctes = list_make1(cte);
+	w->location = -1;
+
+	foreach (lc, aliases)
+	{
+		ResTarget *rt = makeNode(ResTarget);
+
+		rt->name = NULL;
+		rt->indirection = NIL;
+		rt->val = makeColumnRef(strVal(lfirst(lc)), NIL, -1, 0);
+		rt->location = -1;
+
+		tl = lappend(tl, rt);
+	}
+
+	s->targetList = tl;
+	s->fromClause = list_make1(makeRangeVar(NULL, relname, -1));
+	s->withClause = w;
+
+	return (Node *) s;
+}
+
 /* parser_init()
  * Initialize to parse one query string
  */
diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out
index b98ca63..272118f 100644
--- a/src/test/regress/expected/with.out
+++ b/src/test/regress/expected/with.out
@@ -49,6 +49,36 @@ SELECT * FROM t;
  5
 (5 rows)
 
+-- recursive view
+CREATE RECURSIVE VIEW nums (n) AS
+VALUES (1)
+UNION ALL
+SELECT n+1 FROM nums WHERE n < 5;
+SELECT * FROM nums;
+ n 
+---
+ 1
+ 2
+ 3
+ 4
+ 5
+(5 rows)
+
+CREATE OR REPLACE RECURSIVE VIEW nums (n) AS
+VALUES (1)
+UNION ALL
+SELECT n+1 FROM nums WHERE n < 6;
+SELECT * FROM nums;
+ n 
+---
+ 1
+ 2

Re: [HACKERS] recursive view syntax

2012-11-13 Thread Tom Lane
Peter Eisentraut  writes:
> I noticed we don't implement the recursive view syntax, even though it's
> part of the standard SQL feature set for recursive queries.  Here is a
> patch to add that.

Can't you simplify that by using "SELECT * FROM name"?

regards, tom lane


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Amit Kapila
On Wednesday, November 14, 2012 12:25 AM Robert Haas wrote:
> On Mon, Nov 12, 2012 at 10:59 PM, Amit kapila 
> wrote:
> > Is the above opinion only about locking, or also about the way to write
> > the changed things to a file, as mentioned in point 1 upthread?
> > (Point-1:
> >> 1. While writing the .auto file, it will always assume that the .auto
> >>    file contains all config parameters.  Now, as this .auto file is of
> >>    fixed format and fixed record size, it can directly write a given
> >>    record to its particular position.)
> > What I was thinking was that if we can decide that the format and size
> > of each configuration entry is fixed, it can be directly written without
> > doing anything for it in memory.
> 
> Uh, no, I don't think that's a good idea.  IMHO, what we should do is:
> 
> 1. Read postgresql.conf.auto and remember all the settings we saw.  If
> we see something funky like an include directive, barf.
> 2. Forget the value we remembered for the particular setting being
> changed.  Instead, remember the user-supplied new value for that
> parameter.
> 3. Write a new postgresql.conf.auto based on the information
> remembered in steps 1 and 2.

I am okay with implementing it the above way because, as per my
understanding, this is very similar to what I mentioned in my initial
proposal (point 5 in the algorithm of ALTER SYSTEM SET ...):
http://archives.postgresql.org/pgsql-hackers/2012-10/msg01509.php
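
For concreteness, a minimal sketch of those three steps in plain C (an
illustration, not the actual patch: the file name, the name = 'value'
line format, and the function are all assumptions, and real code would
do much more validation):

#include <stdio.h>
#include <string.h>

static int
rewrite_auto_conf(const char *path, const char *name, const char *value)
{
    FILE       *in = fopen(path, "r");
    FILE       *out = fopen("postgresql.conf.auto.tmp", "w");
    char        line[1024];
    size_t      namelen = strlen(name);

    if (in == NULL || out == NULL)
    {
        if (in)
            fclose(in);
        if (out)
            fclose(out);
        return -1;
    }

    /* Steps 1 and 2: keep every line except the setting being changed. */
    while (fgets(line, sizeof(line), in) != NULL)
    {
        if (strncmp(line, "include", 7) == 0)
        {
            fclose(in);
            fclose(out);
            return -1;      /* something funky: barf, per step 1 */
        }
        if (strncmp(line, name, namelen) == 0 &&
            (line[namelen] == ' ' || line[namelen] == '='))
            continue;       /* forget the old value */
        fputs(line, out);   /* remember everything else verbatim */
    }
    fprintf(out, "%s = '%s'\n", name, value);   /* the new value */

    fclose(in);
    fclose(out);

    /* Step 3: swap the rewritten file into place. */
    return rename("postgresql.conf.auto.tmp", path);
}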

However, as Greg has now suggested exploring the GUC concept as well, I
would like to check the feasibility of that method.

The only reason I mentioned the fixed-format, fixed-record-size concept is
that during previous discussions about writing the file with GUCs, the
question came up of whether it is possible to write the file without
reading it in the current session.
(-- It seems to me that we ought to be able to rewrite a machine-generated
configuration file without loading those values into the current session.)
On second thought, it seems to me that what you meant by the above comment
was without loading into session-specific GUCs.

 
> Of course, if we go with one-file-per-setting, then this becomes even
> simpler: just clobber the file for the single setting being updated -
> creating it if it doesn't exist - and ignore all the rest.  I don't
> personally favor that approach because I think it's clunky to
> manage, but YMMV.


> With either approach, it's worth noting that a RESET variant of this
> could be useful - which would either remove the chosen setting from
> postgresql.conf.auto, or remove the file containing the
> automatically-set value for that setting.  I think my personal
> favorite syntax is:
> 
> ALTER SYSTEM .. SET wunk = 'thunk';
> ALTER SYSTEM .. RESET wunk;
> 
> But I'm OK with something else if there's consensus.  I don't
> particularly like SET PERSISTENT because I think this is more like
> ALTER DATABASE .. SET than it is like SET LOCAL, but IJWH.

I think there are multiple ways to do this: one is ALTER SYSTEM ..., the
other is to provide it through a built-in function.
For the first version, maybe I will go with the built-in function
approach; then, if there is consensus to provide it through ALTER SYSTEM,
we can change it.
One advantage I see in your suggestion is that a way to provide RESET
will be better with ALTER SYSTEM than with a built-in function.  To
achieve the same through a built-in, I think one way is to provide a
separate function.

With Regards,
Amit Kapila.



Re: [HACKERS] Doc patch, index search_path where it's used to secure functions

2012-11-13 Thread Karl O. Pinc
On 11/13/2012 08:46:19 PM, Peter Eisentraut wrote:
> 
> Well, I'm not sure.  We currently have three index entries on the
> topic:
> 
> search path
> search_path
> search_path configuration parameter
> 
> I think I'd put them all under search_path.

Ok.

I think you are right that they need to all be
under one index entry.

It might make sense to have a "search path"
entry that says "see search_path", but since
the two entries would be right next to each
other in the index this seems overkill.

I'm going to send this in as a single patch
that fixes all the search path related
index entries:

  search_path-index.patch

(replaces search_path-normalize.patch
  and search_path-securing_v2.patch)

This new patch also changes the
search path index in doc/src/sgml/func.sgml.
Perhaps I just don't understand the logic
of the indexing in that file, but the
search path index there seems to point to the
entire file/nowhere useful.  Since there are 2 places
in the file that are concerned with
the search path, I've removed the "global"
index into the file and added new
index entries.  This also moved the
"schema, current" index target.  
(Since that too involves search path.)

I've also added a  entry to "schema, current"
to the top-level "search_path" index entry.

If you want this broken up into smaller
patches let me know.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 88cea3d..3bad24c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4769,7 +4769,8 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
  
   search_path (string)
   
-   search_path configuration parameter
+   search_path
+   configuration parameter
   
   pathfor schemas
   
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index d6e5d64..99ebd10 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1698,7 +1698,8 @@ CREATE TABLE public.products ( ... );
The Schema Search Path
 

-search path
+search_path
+schema, current

 

@@ -1735,7 +1736,8 @@ CREATE TABLE public.products ( ... );

 

-search_path
+search_path
+runtime variable

 

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index f8f63d8..dc6ca1e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12944,16 +12944,6 @@ postgres=# SELECT * FROM unnest2(ARRAY[[1,2],[3,4]]);

 

-schema
-current
-   
-
-   
-search path
-current
-   
-
-   
 session_user

 
@@ -12980,6 +12970,11 @@ postgres=# SELECT * FROM unnest2(ARRAY[[1,2],[3,4]]);
 the current user is the effective user.

 
+   
+schema
+current
+   
+

 current_schema returns the name of the schema that is
 first in the search path (or a null value if the search path is
@@ -13493,6 +13488,13 @@ SELECT has_function_privilege('joeuser', 'myfunc(int, text)', 'execute');
 are immediately available without doing SET ROLE.

 
+   
+search_path
+
+  object visibility given an unqualified name
+
+   
+
   
 shows functions that
determine whether a certain object is visible in the
diff --git a/doc/src/sgml/ref/create_function.sgml b/doc/src/sgml/ref/create_function.sgml
index 4336e4b..cb5ee91 100644
--- a/doc/src/sgml/ref/create_function.sgml
+++ b/doc/src/sgml/ref/create_function.sgml
@@ -673,6 +673,11 @@ SELECT * FROM dup(42);
  
   Writing SECURITY DEFINER Functions Safely
 
+   
+  search_path
+  use in securing functions
+   
+

 Because a SECURITY DEFINER function is executed
 with the privileges of the user that created it, care is needed to


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Amit Kapila
On Tuesday, November 13, 2012 11:43 PM Josh Berkus wrote:
> On 11/12/12 7:59 PM, Amit kapila wrote:
> > On Monday, November 12, 2012 12:07 PM Greg Smith wrote:
> > On 11/9/12 11:59 PM, Amit kapila wrote:
> >
> >>> Please let me know if there are any objections or problems in above
> method of implementation,
> >>> else I can go ahead to prepare the patch for the coming CF.
> >
> >> It may be the case that the locking scheme Robert described is the
> best
> >> approach here.  It seems kind of heavy to me though.  I suspect that
> >> some more thinking about it might come up with something better.
> 
> So, here's the problem I'm seeing with having a single .auto file:  when
> we write settings to a file, are we writing a *single* setting or *all
> of a user's current settings*?

Single setting.
 
> I was imagining writing single, specific settings, which inevitably
> leads to one-setting-per-file, e.g.:
> 
> SET PERSISTENT work_mem = 256MB;

Yes, from the beginning what I was discussing was the setting of a single
config parameter, as in your example.
However, it can be done with one file for all variables as well.
I have already mentioned 2 ways of doing it: one is a fixed-format,
fixed-size file; the other is similar to what Robert has detailed
in his mail (http://archives.postgresql.org/pgsql-hackers/2012-11/msg00572.php).



> What Amit seems to be talking about is more EXPORT SETTINGS, where you
> dump all current settings in the session to a file.  This seems likely
> to produce accidental changes when the user writes out settings they've
> forgotten they changed.

I think maybe I was not clear enough in my previous mails, but for sure
whatever I am talking about is never related to
"dump all current settings in the session to a file".
In fact, both my ideas (fixed-format file, initial proposal) were not to
touch or check the current session parameters.
There is only one approach (seeing whether we can write the file from the
GUC structures) that talks about writing multiple parameters.

With Regards,
Amit Kapila.



Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Martijn van Oosterhout
On Tue, Nov 13, 2012 at 05:50:08PM -0500, Stephen Frost wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
> > Yeah.  The thing that concerns me is that I think we have a pretty
> > decent number of memory contexts where the expected number of
> > allocations is very small ... and we have the context *just in case*
> > we do more than that in certain instances.  I've seen profiles where
> > the setup/teardown costs of memory contexts are significant ... which
> > doesn't mean that those examples would perform better with fewer
> > memory contexts, but it's enough to make me pause for thought.
> 
> So, for my 2c, I'm on the other side of this, personally.  We have
> memory contexts for more-or-less exactly this issue.  It's one of the
> great things about PG- it's resiliant and very unlikely to have large or
> bad memory leaks in general, much of which can, imv, be attributed to
> our use of memory contexts.

If the problem is that we create memory context overhead which is not
necessary in many cases, perhaps we can reduce the overhead somehow. 
IIRC we have a separate function for resetting a context and freeing it
entirely.  If there was a quick test we could do such that resetting a
context did nothing unless at least (say) 16k had been allocated, that
might reduce the cost for many very small allocations.
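
Here is a toy sketch of that idea (not PostgreSQL code): a context that
counts the bytes handed out since its last reset and, on reset, only
returns blocks to malloc() once that count crosses the threshold; below
it, the blocks are merely rewound for reuse. Alignment and error handling
are ignored for brevity.

#include <stdlib.h>

#define RESET_FREE_THRESHOLD (16 * 1024)

typedef struct Block
{
    struct Block *next;
    size_t      size;
    size_t      used;
} Block;                        /* payload follows the header */

typedef struct Context
{
    Block      *blocks;         /* newest first */
    size_t      since_reset;    /* bytes handed out since last reset */
} Context;

static void *
ctx_alloc(Context *cxt, size_t size)
{
    Block      *b = cxt->blocks;

    if (b == NULL || b->used + size > b->size)
    {
        size_t      bsize = size > 8192 ? size : 8192;

        b = malloc(sizeof(Block) + bsize);
        if (b == NULL)
            return NULL;
        b->next = cxt->blocks;
        b->size = bsize;
        b->used = 0;
        cxt->blocks = b;
    }
    b->used += size;
    cxt->since_reset += size;
    return (char *) (b + 1) + b->used - size;
}

static void
ctx_reset(Context *cxt)
{
    if (cxt->since_reset < RESET_FREE_THRESHOLD)
    {
        /* Cheap path: rewind blocks in place, no free() traffic. */
        Block      *b;

        for (b = cxt->blocks; b != NULL; b = b->next)
            b->used = 0;
    }
    else
    {
        /* Expensive path: give everything back to the allocator. */
        Block      *b = cxt->blocks;

        while (b != NULL)
        {
            Block      *next = b->next;

            free(b);
            b = next;
        }
        cxt->blocks = NULL;
    }
    cxt->since_reset = 0;
}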

Of course, unless someone comes up with a way to measure the cost this
is all handwaving, but it might be a nice project for someone interested
in learning to hack postgres.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer

