Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Fujii Masao
On Fri, Sep 16, 2011 at 9:25 PM, Peter Eisentraut  wrote:
> On fre, 2011-09-16 at 11:54 +0900, Fujii Masao wrote:
>> #1
>> Use an empty recovery.ready file to enter archive recovery. recovery.conf
>> is not read automatically. All recovery parameters are expected to be
>> specified in postgresql.conf. If you must specify them in recovery.conf,
>> you need to add "include 'recovery.conf'" to postgresql.conf. But note
>> that recovery.conf will not be renamed to recovery.done at the
>> end of recovery. This is what the patch I've posted does. This is the
>> simplest approach, but it might confuse people who use tools that
>> depend on recovery.conf.
>
> A small variant to this:  When you are actually doing recovery from a
> backup, having a recovery trigger and a recovery done file is obviously
> quite helpful and necessary for safety.  But when you're setting up a
> replication slave, it adds extra complexity for the user.  The
> approximate goal ought to be to be able to do
>
> pg_basebackup -h master -D there
> postgres -D there --standby-mode=on --primary-conninfo=master
>
> without the need to touch any obscure recovery trigger files.
>
> So perhaps recovery.{trigger,ready} should not be needed if, say,
> standby_mode=on.

Or what about making pg_basebackup automatically create a
recovery trigger file?
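Under option #1, the manual steps that suggestion would automate might look
like this (an illustrative shell sketch only; recovery.ready is the proposed
trigger file, not an existing convention, and the paths are made up):

    pg_basebackup -h master -D /var/lib/pgsql/standby
    touch /var/lib/pgsql/standby/recovery.ready    # hypothetical trigger file
    # (recovery parameters would live in postgresql.conf under this proposal)
    pg_ctl -D /var/lib/pgsql/standby start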

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Back-branch releases upcoming this week

2011-09-20 Thread Dave Page
On Tue, Sep 20, 2011 at 12:37 AM, Tom Lane  wrote:
> As has been mentioned a couple times, we're well overdue for updates of
> the back branches.  Seems like time to get that done, so we'll be
> wrapping 8.2.x and up this Thursday for release Monday the 26th.

8.2 and up, including 9.1.1? I'm not sure our QA guys will be able to cope
with verifying so many individual installers in that timeframe -
8.2 to 9.0 is hard enough to do in one go.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Simon Riggs
On Fri, Sep 16, 2011 at 2:46 PM, Euler Taveira de Oliveira
 wrote:
> On 15-09-2011 23:54, Fujii Masao wrote:
>>
>> #1
>> Use an empty recovery.ready file to enter archive recovery. recovery.conf
>> is not read automatically. All recovery parameters are expected to be
>> specified in postgresql.conf. If you must specify them in recovery.conf,
>> you need to add "include 'recovery.conf'" to postgresql.conf. But note
>> that recovery.conf will not be renamed to recovery.done at the
>> end of recovery. This is what the patch I've posted does. This is the
>> simplest approach, but it might confuse people who use tools that
>> depend on recovery.conf.
>>
> More or less +1. We don't need two config files; just one: postgresql.conf.
> Just turn all recovery.conf parameters into GUCs. As already said, the
> recovery.conf settings are not different from archive settings; we just need
> a way to trigger the recovery. And that trigger could be pulled by a GUC
> (standby_mode) or a file (say recovery -> recovery.done). Also,
> recovery.done could be filled with recovery information, just for the DBA's
> records. standby_mode does not create any file; it just triggers the recovery
> (as it will be used mainly for replication purposes).

I sympathise with this view, to an extent.

If people want to put all parameters in one file, they can do so. So +1 to that.

Should they be forced to adopt that new capability by us deliberately
breaking their existing setups? No. So -1 to that.

If we do an automatic include of recovery.conf first, then follow by
reading postgresql.conf, we will preserve the old behaviour as well as
allow the new.
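
To illustrate, a unified setup might then be just a postgresql.conf fragment
like this (a sketch only -- these are the existing recovery.conf parameter
names, and none of this is committed behaviour):

    # hypothetical postgresql.conf fragment under this proposal
    standby_mode = on
    primary_conninfo = 'host=master port=5432 user=replicator'
    restore_command = 'cp /path/to/archive/%f %p'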

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Simon Riggs
On Fri, Sep 16, 2011 at 3:54 AM, Fujii Masao  wrote:
> On Thu, Sep 15, 2011 at 11:37 PM, Tom Lane  wrote:
>> This seems like it's already predetermining the outcome of the argument
>> about recovery.conf.  Mind you, I'm not unhappy with this choice, but
>> it's hardly implementing only behavior that's not being debated.
>>
>> If we're satisfied with not treating recovery-only parameters differently
>> from run-of-the-mill GUCs, this is fine.
>
> Okay, we need to reach a consensus about the treatment of
> recovery.conf.
>
> We have three choices.


What we should focus on are these requirements:

1. Don't break software that relies on the existing behaviour

2. Allow parameters to be reloaded at SIGHUP

3. Allow recovery parameters to be handled the same way as other GUCs

4. We need to retain recovery.conf/.done style behaviour to mark the end
of archive recovery


Making an automatic include makes (2) and (3) work OK, without
breaking (1). I haven't seen another solution that doesn't break (1).

Rename of .conf to .done allows us to maintain working code for
existing solutions (there are *many* out there...)

We should say that the automatic include only works during recovery,
so if someone puts recovery.conf back then we ignore it.

If we treat recovery.conf as being read *first*, then any parameter
mentioned twice (i.e. mentioned again in postgresql.conf) will
override the setting in recovery.conf and we have no issues.
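
As a sketch of that precedence (illustrative only, not committed behaviour),
a duplicated setting would resolve to the postgresql.conf value:

    # recovery.conf (implicitly included first, only during recovery)
    restore_command = 'cp /path/to/archive/%f %p'
    standby_mode = on

    # postgresql.conf (read afterwards; a duplicate here wins)
    standby_mode = off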

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Simon Riggs
On Fri, Sep 16, 2011 at 1:13 PM, Peter Eisentraut  wrote:
> On fre, 2011-09-16 at 01:32 -0400, Tom Lane wrote:
>> As far as the other issues go, I think there is actually a
>> prerequisite
>> discussion to be had here, which is whether we are turning the
>> recovery
>> parameters into plain old GUCs or not.  If they are plain old GUCs,
>> then
>> they will presumably still have their values when we are *not* doing
>> recovery.  That leads to a couple of questions:
>> * will seeing these values present in pg_settings confuse anybody?
>
> How so?  We add or change the available parameters all the time.
>
>> * can the values be changed when not in recovery, if so what happens,
>>   and again will that confuse anybody?
>
> Should be similar to archive_command and archive_mode.  You can still
> see and change archive_command when archive_mode is off.


I do think special handling of some kind would be useful here. We
could reset them as well, if we wished, but that is probably more
trouble than it is worth.

Perhaps we need a new SCOPE attribute on pg_settings to show whether
the parameter applies in recovery, in normal running, or in both.
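
For comparison, this is how the analogous archive settings already surface
today (a sketch only -- the proposed scope attribute does not exist, so the
query below just shows the existing columns):

    SELECT name, setting, context
      FROM pg_settings
     WHERE name IN ('archive_mode', 'archive_command');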


>> * is there any security hazard from ordinary users being able to see
>>   what settings had been used?
>
> Again, not much different from the archive_* settings.  They are, after
> all, almost the same in the opposite direction.


There is a potential security hole if people hardcode passwords into
primary_conninfo. As long as we document not to do that, we're OK.
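
For example, one way to keep the password out of the conninfo string is to
let it come from the standby's password file instead (a sketch only, assuming
the walreceiver connection goes through the normal libpq password lookup):

    # primary_conninfo without a password:
    primary_conninfo = 'host=master port=5432 user=replicator'

    # ~/.pgpass on the standby (chmod 0600), wildcarding the database field:
    master:5432:*:replicator:s3cret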


-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] File not found error on creating collation

2011-09-20 Thread Thom Brown
On 20 September 2011 05:20, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Sep 19, 2011 at 10:04 PM, Tom Lane  wrote:
>>> We could possibly add a HINT suggesting that the locale isn't installed,
>>> but I don't see that we could offer any useful generic advice about how
>>> to install it.  I'm also worried about how to phrase the hint to cover
>>> some other obvious possibilities, like "you fat-fingered the locale
>>> name".
>
>> Maybe something like this?
>
>> HINT: The operating system was unable to find any locale data for the
>> locale name you specified.
>
> Hmm, that's not bad.  We could probably even call it errdetail, since
> it's not so much a hint as explaining what the SUS spec states that the
> ENOENT error code means here.
>
> In the nitpick department, s/was unable to/could not/ per our usual
> message style guidelines.  Otherwise seems good.

Sounds good to me.  If this is to be the errdetail, does that mean
you'd be keeping the original message intact?  The problem with the
actual error message is that it might cause the user to think along
the lines of "Am I supposed to put a fully qualified path in this
parameter?".
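
For instance (illustrative only -- the ERROR line is roughly what the server
emits today, and the DETAIL line is the wording proposed in this thread, not
a committed message):

    CREATE COLLATION german (locale = 'de_XX.utf8');
    ERROR:  could not create locale "de_XX.utf8": No such file or directory
    DETAIL:  The operating system could not find any locale data for the locale name you specified.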

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Simon Riggs
On Fri, Sep 16, 2011 at 2:38 AM, Fujii Masao  wrote:
> On Fri, Sep 16, 2011 at 7:53 AM, Simon Riggs  wrote:
>> This patch splits bgwriter into 2 processes: checkpointer and
>> bgwriter, seeking to avoid contentious changes. Additional changes are
>> expected in this release to build upon these changes for both new
>> processes, though this patch stands on its own as both a performance
>> vehicle and in some ways a refactoring to simplify the code.
>
> I like this idea to simplify the code. How much performance gain can we
> expect by this patch?

On heavily I/O bound systems, this is likely to make a noticeable
difference, since bgwriter reduces I/O in user processes.

The overhead of sending signals between processes is much less than I
had previously thought, so I expect no problems there, even on highly
loaded systems.


>> The current patch has a bug at shutdown that I've not located yet, but it
>> seems likely to be a simple error. That is mainly because for personal reasons
>> I've not been able to work on the patch recently. I expect to be able
>> to fix that later in the CF.
>
> You seem to have forgotten to include checkpointer.c and .h in the patch.

I can confirm this error. I'll repost the full patch later in the week when I
have more time.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Heikki Linnakangas

On 20.09.2011 10:48, Simon Riggs wrote:
> On Fri, Sep 16, 2011 at 2:38 AM, Fujii Masao  wrote:
>> On Fri, Sep 16, 2011 at 7:53 AM, Simon Riggs  wrote:
>>> This patch splits bgwriter into 2 processes: checkpointer and
>>> bgwriter, seeking to avoid contentious changes. Additional changes are
>>> expected in this release to build upon these changes for both new
>>> processes, though this patch stands on its own as both a performance
>>> vehicle and in some ways a refactoring to simplify the code.
>>
>> I like this idea to simplify the code. How much performance gain can we
>> expect by this patch?
>
> On heavily I/O bound systems, this is likely to make a noticeable
> difference, since bgwriter reduces I/O in user processes.

Hmm. If the system is I/O bound, it doesn't matter which process
performs the I/O. It's still the same amount of I/O in total, and in an
I/O bound system, that's what determines the overall throughput.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Simon Riggs
On Tue, Sep 20, 2011 at 9:06 AM, Heikki Linnakangas
 wrote:
> On 20.09.2011 10:48, Simon Riggs wrote:
>>
>> On Fri, Sep 16, 2011 at 2:38 AM, Fujii Masao
>>  wrote:
>>>
>>> On Fri, Sep 16, 2011 at 7:53 AM, Simon Riggs
>>>  wrote:

>>>> This patch splits bgwriter into 2 processes: checkpointer and
>>>> bgwriter, seeking to avoid contentious changes. Additional changes are
>>>> expected in this release to build upon these changes for both new
>>>> processes, though this patch stands on its own as both a performance
>>>> vehicle and in some ways a refactoring to simplify the code.
>>>
>>> I like this idea to simplify the code. How much performance gain can we
>>> expect by this patch?
>>
>> On heavily I/O bound systems, this is likely to make a noticeable
>> difference, since bgwriter reduces I/O in user processes.
>
> Hmm. If the system is I/O bound, it doesn't matter which process performs
> the I/O. It's still the same amount of I/O in total, and in an I/O bound
> system, that's what determines the overall throughput.

That's true, but not relevant.

The bgwriter avoids I/O, if it is operating correctly. This patch
ensures it continues to operate even during heavy checkpoints. So it
helps avoid extra I/O during a period of very high I/O activity.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Heikki Linnakangas

On 20.09.2011 11:18, Simon Riggs wrote:
> The bgwriter avoids I/O, if it is operating correctly. This patch
> ensures it continues to operate even during heavy checkpoints. So it
> helps avoid extra I/O during a period of very high I/O activity.

I don't see what difference it makes which process does the I/O. If a
write() by the checkpointer process blocks, any write()s by the separate
bgwriter process at that time will block too. If the I/O is not
saturated, and the checkpoint write()s don't block, then even without
this patch, the bgwriter process can handle its usual bgwriter duties
during checkpoint just fine. (And if the I/O is not saturated, it's not
an I/O bound system anyway.)


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] CUDA Sorting

2011-09-20 Thread Florian Pflug
On Sep19, 2011, at 19:46 , Stephen Frost wrote:
> I agree that it'd be interesting to do, but I share Lord Stark's
> feelings about the challenges and lack of potential gain - it's a very
> small set of queries that would benefit from this.  You need to be
> working with enough data to make the cost of transferring it all over to
> the GPU worthwhile, just for starters...

I wonder if anyone has ever tried to employ a GPU for more low-level
tasks. Things like sorting or hashing are hard to move to the
GPU in postgres because, in the general case, they involve essentially
arbitrary user-defined functions. But couldn't, for example, the WAL CRC
computation be moved to a GPU? Or, to get really crazy, even the search
for the optimal join order (only for a large number of joins though,
i.e. where we currently switch to a genetic algorithm)?

best regards,
Florian Pflug




Re: [HACKERS] WIP: Collecting statistics on CSV file data

2011-09-20 Thread Shigeru Hanada
Hi Fujita-san,

(2011/09/12 19:40), Etsuro Fujita wrote:
> Hi there,
> 
> To enable file_fdw to estimate costs of scanning a CSV file more
> accurately, I would like to propose a new FDW callback routine,
> AnalyzeForeignTable, which allows the ANALYZE command to collect
> statistics on a foreign table, and a corresponding file_fdw function,
> fileAnalyzeForeignTable. Attached is my WIP patch.


I think this is a very nice feature; with it, the planner would be able to
create smarter plans for queries which use foreign tables.
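
For context, the intended usage would presumably look something like the
following (a sketch only; the server, table and file names are made up, and
ANALYZE only collects statistics for a foreign table once the proposed
AnalyzeForeignTable support is in place):

    CREATE EXTENSION file_fdw;
    CREATE SERVER csv_server FOREIGN DATA WRAPPER file_fdw;
    CREATE FOREIGN TABLE measurements (id int, reading numeric)
        SERVER csv_server
        OPTIONS (filename '/path/to/measurements.csv', format 'csv');
    ANALYZE measurements;   -- would go through fileAnalyzeForeignTable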

I took a look at the patch and found that it couldn't be applied
cleanly against HEAD.  Please rebase your patch against the current HEAD
of the master branch, rather than 9.1beta1.

The wiki pages below would be helpful for you.
  http://wiki.postgresql.org/wiki/Submitting_a_Patch
  http://wiki.postgresql.org/wiki/Creating_Clean_Patches
  http://wiki.postgresql.org/wiki/Reviewing_a_Patch

And it would be easier to use git to follow changes made by other
developers in the master branch.
   http://wiki.postgresql.org/wiki/Working_with_Git

Regards,
-- 
Shigeru Hanada



[HACKERS] MicrOLAP Database Designer with PostgreSQL 9.1 support is out!

2011-09-20 Thread Pavel Golub
Hello.

Database Designer for PostgreSQL is an easy-to-use CASE tool which works
natively under the Windows OS family and on Linux under Wine/WineHQ.

This release introduces new functionality as well as several bug fixes:
support for PostgreSQL 9.1, a new Create HTML Report feature, support for
unlogged tables, and improved Database Generation and Modification dialogs.

You're welcome to download the Database Designer for PostgreSQL 1.8.1 right now 
at:
http://microolap.com/products/database/postgresql-designer/download/

Full changelog:

[+] Name patterns for newly created objects supported
[+] Object selection added to Create HTML Report dialog
[+] Support for unlogged tables using the UNLOGGED option added
[*] Additional checks added for dependent composite types during model loading
[*] Improved processing of field deletion with referenced foreign keys in the Modify engine
[*] Syntax highlighting improved for base types
[-] "Access violation on Privilege Manager Add button click" bug fixed
[-] "Cannot cast type text[] to text error during Reverse Engineering on pre-8.2 servers" bug fixed
[-] "EInvalidCast error occurs during deletion of domain in Domain & UDT Manager" bug fixed
[-] "Relation 'pg_users' does not exist during Reverse Engineering on pre-8.1 servers" bug fixed
[-] "Some sequences may not be generated using Generate Database dialog" bug fixed
[-] "Stamp colors are not saved" bug fixed
[-] "Stored Routine Editor disallows non-Latin characters in routine name" bug fixed

Please don't hesitate to ask any questions or report bugs with our Support 
Ticketing system available at
http://www.microolap.com/support/




-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com




Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Simon Riggs
On Tue, Sep 20, 2011 at 10:03 AM, Heikki Linnakangas
 wrote:
> On 20.09.2011 11:18, Simon Riggs wrote:
>>
>> The bgwriter avoids I/O, if it is operating correctly. This patch
>> ensures it continues to operate even during heavy checkpoints. So it
>> helps avoid extra I/O during a period of very high I/O activity.
>
> I don't see what difference it makes which process does the I/O. If a
> write() by checkpointer process blocks, any write()s by the separate
> bgwriter process at that time will block too. If the I/O is not saturated,
> and the checkpoint write()s don't block, then even without this patch, the
> bgwriter process can handle its usual bgwriter duties during checkpoint just
> fine. (And if the I/O is not saturated, it's not an I/O bound system
> anyway.)

Whatever value you assign to the bgwriter, then this patch makes sure
that happens during heavy fsyncs.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Grouping Sets

2011-09-20 Thread David Rinaldi
Since it seems that you have spent some considerable time investigating and
producing a working concept, what would your best guess time estimate be,
assuming the requisite skills/talent/will in (planner/executor/etc.), to
have a solid working module put together? Are we looking at something like
40 hours or more like 5000 hours, in your estimate? 

Thanks. 

--
Regards

David 

-Original Message-
From: Pavel Stehule [mailto:pavel.steh...@gmail.com] 
Sent: Monday, September 19, 2011 10:45 PM
To: edwbro...@gmail.com
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Grouping Sets

Hello

2011/9/20 David Rinaldi :
> Paul,
>
> I was able to apply the patch to 9.0.4 and so far it looks good.  My Oracle
> results match. Nice.
>
> But, when trying to calculate some percentages and control some rounding,
> the results are coming back as null for some reason.  I have tried casting,
> to_char, etc. to try to get them to show up.. no love ensued. I was wondering
> if you have any idea what could be happening. I have attached some test
> results based on the grouping sets wiki. One of the examples just uses
> group by, as a sanity check.  Any ideas or help would be much appreciated.
>

Sorry, I don't have any useful idea. This work was a proof of concept, so it
is probable that there will be some corner-case issues :(.

This feature needs more love and some more significant changes in the
planner and executor.
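
For reference, the ROLLUP(name, place) query from the quoted mail can be
cross-checked against a plain UNION ALL rewrite, which computes the same
groupings without the patch (a sketch only, using the cars2 table defined
below):

    SELECT name, place, sum(sales) AS sales, sum(cost) AS cost
      FROM cars2 GROUP BY name, place
    UNION ALL
    SELECT name, NULL, sum(sales), sum(cost)
      FROM cars2 GROUP BY name
    UNION ALL
    SELECT NULL, NULL, sum(sales), sum(cost)
      FROM cars2;

If the derived ratio columns come back non-NULL here but NULL through the
patch, that points at the grouping-sets executor rather than at the
expressions themselves.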

Regards

Pavel


>
> CREATE TABLE cars2 (name CHARACTER VARYING , place CHARACTER VARYING ,
> sales real, cost real );
>
> insert into cars2 values('skoda', 'czech rep.', 1, 8000);
> insert into cars2 values('skoda', 'germany', 5000, 6000);
> insert into cars2 values('bmw', 'czech rep.', 6000, 4000);
> insert into cars2 values('bmw', 'germany', 18000, 15000);
> insert into cars2 values('opel', 'czech rep.', 7000, 5000);
> insert into cars2 values('opel', 'germany', 7000, 5000);
>
> --grouping sets test--
>
> select name, place,
> sum(sales) as sales,
> sum(cost) as cost,
> sum(cost) / sum(sales) as cost_sales_ratio,
> (sum(cost) / sum(sales)) * 100 as cost_sales_ratio_per,
> round(((sum(cost) / sum(sales)) * 100),0) as cost_sales_ratio_per_rnd
> from cars2 group by rollup(name, place);
>
>
> name    place   sales   cost    cost_sales_ratio    cost_sales_ratio_per    cost_sales_ratio_per_rnd
> bmw     czech rep.      6000    4000    0.6667  (null)  (null)
> skoda   germany 5000    6000    1.2     (null)  (null)
> opel    czech rep.      7000    5000    0.7143  (null)  (null)
> opel    germany 7000    5000    0.7143  (null)  (null)
> skoda   czech rep.      1   8000    0.8     (null)  (null)
> bmw     germany 18000   15000   0.8333  (null)  (null)
> bmw     (null)  24000   19000   0.7917  (null)  (null)
> skoda   (null)  15000   14000   0.9333  (null)  (null)
> opel    (null)  14000   1   0.7143  (null)  (null)
> (null)  (null)  53000   43000   0.8113  (null)  (null)
>
> --group by sanity test--
>
> select name, place,
> sum(sales) as sales,
> sum(cost) as cost,
> sum(cost) / sum(sales) as cost_sales_ratio,
> (sum(cost) / sum(sales)) * 100 as cost_sales_ratio_per,
> round(((sum(cost) / sum(sales)) * 100),0) as cost_sales_ratio_per_rnd
> from cars2 group by name, place;
>
> name    place   sales   cost    cost_sales_ratio    cost_sales_ratio_per    cost_sales_ratio_per_rnd
> bmw     czech rep.      6000    4000    0.6667  66.6667 67
> skoda   germany 5000    6000    1.2     120     120
> opel    czech rep.      7000    5000    0.7143  71.4286 71
> opel    germany 7000    5000    0.7143  71.4286 71
> skoda   czech rep.      1   8000    0.8     80      80
> bmw     germany 18000   15000   0.8333  83. 83
>
>
>
> Thanks
>
>
>
>
> --
> Regards
>
> David
>
>
> -Original Message-
> From: Pavel Stehule [mailto:pavel.steh...@gmail.com]
> Sent: Sunday, September 18, 2011 2:34 PM
> To: David Rinaldi
> Subject: Re: [HACKERS] Grouping Sets
>
> Hello
>
> The last patch should apply to 8.4 or 9.0 - you should try it. I
> worked with the development version.
>
> http://archives.postgresql.org/pgsql-hackers/2010-08/msg00647.php
>
> Regards
>
> Pavel Stehule
>
> 2011/9/18 David Rinaldi :
>> Hi,
>>
>> I tried to apply the Grouping Sets Patch to 8.4, but received several
>> "Hunk failed" messages; does anyone know if the failing hunks can be applied
>> manually?  Or what version they were applied to specifically?
>>
>> --
>> Regards
>>
>> David
>>
>
>




Re: [HACKERS] Grouping Sets

2011-09-20 Thread Pavel Stehule
Hello

2011/9/20 David Rinaldi :
> Since it seems that you have spent some considerable time investigating and
> producing a working concept, what would your best guess time estimate be,
> assuming the requisite skills/talent/will in (planner/executor/etc.), to
> have a solid working module put together? Are we looking at something like
> 40 hours or more like 5000 hours, in your estimate?
>

It depends on your knowledge of pg internals and your motivation :). I
think it can be less than 40 hours to eliminate these issues
and another 40 hours for some finalisation.

If I remember well, I had a prototype after one week of hacking, and I
am not a strong programmer.

Regards

Pavel Stehule

> Thanks.
>
> --
> Regards
>
> David
>
> -Original Message-
> From: Pavel Stehule [mailto:pavel.steh...@gmail.com]
> Sent: Monday, September 19, 2011 10:45 PM
> To: edwbro...@gmail.com
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Grouping Sets
>
> Hello
>
> 2011/9/20 David Rinaldi :
>> Paul,
>>
>> I was able to apply the patch to 9.0.4 and so far it looks good.  My Oracle
>> results match. Nice.
>>
>> But, when trying to calculate some percentages and control some rounding,
>> the results are coming back as null for some reason.  I have tried casting,
>> to_char, etc. to try to get them to show up.. no love ensued. I was wondering
>> if you have any idea what could be happening. I have attached some test
>> results based on the grouping sets wiki. One of the examples just uses
>> group by, as a sanity check.  Any ideas or help would be much appreciated.
>>
>
> Sorry, I don't have any useful idea. This work was a proof of concept, so it
> is probable that there will be some corner-case issues :(.
>
> This feature needs more love and some more significant changes in the
> planner and executor.
>
> Regards
>
> Pavel
>
>
>>
>> CREATE TABLE cars2 (name CHARACTER VARYING , place CHARACTER VARYING ,
>> sales real, cost real );
>>
>> insert into cars2 values('skoda', 'czech rep.', 1, 8000);
>> insert into cars2 values('skoda', 'germany', 5000, 6000);
>> insert into cars2 values('bmw', 'czech rep.', 6000, 4000);
>> insert into cars2 values('bmw', 'germany', 18000, 15000);
>> insert into cars2 values('opel', 'czech rep.', 7000, 5000);
>> insert into cars2 values('opel', 'germany', 7000, 5000);
>>
>> --grouping sets test--
>>
>> select name, place,
>> sum(sales) as sales,
>> sum(cost) as cost,
>> sum(cost) / sum(sales) as cost_sales_ratio,
>> (sum(cost) / sum(sales)) * 100 as cost_sales_ratio_per,
>> round(((sum(cost) / sum(sales)) * 100),0) as cost_sales_ratio_per_rnd
>> from cars2 group by rollup(name, place);
>>
>>
>> name    place   sales   cost    cost_sales_ratio    cost_sales_ratio_per    cost_sales_ratio_per_rnd
>> bmw     czech rep.      6000    4000    0.6667  (null)  (null)
>> skoda   germany 5000    6000    1.2     (null)  (null)
>> opel    czech rep.      7000    5000    0.7143  (null)  (null)
>> opel    germany 7000    5000    0.7143  (null)  (null)
>> skoda   czech rep.      1   8000    0.8     (null)  (null)
>> bmw     germany 18000   15000   0.8333  (null)  (null)
>> bmw     (null)  24000   19000   0.7917  (null)  (null)
>> skoda   (null)  15000   14000   0.9333  (null)  (null)
>> opel    (null)  14000   1   0.7143  (null)  (null)
>> (null)  (null)  53000   43000   0.8113  (null)  (null)
>>
>> --group by sanity test--
>>
>> select name, place,
>> sum(sales) as sales,
>> sum(cost) as cost,
>> sum(cost) / sum(sales) as cost_sales_ratio,
>> (sum(cost) / sum(sales)) * 100 as cost_sales_ratio_per,
>> round(((sum(cost) / sum(sales)) * 100),0) as cost_sales_ratio_per_rnd
>> from cars2 group by name, place;
>>
>> name    place   sales   cost    cost_sales_ratio    cost_sales_ratio_per    cost_sales_ratio_per_rnd
>> bmw     czech rep.      6000    4000    0.6667  66.6667 67
>> skoda   germany 5000    6000    1.2     120     120
>> opel    czech rep.      7000    5000    0.7143  71.4286 71
>> opel    germany 7000    5000    0.7143  71.4286 71
>> skoda   czech rep.      1   8000    0.8     80      80
>> bmw     germany 18000   15000   0.8333  83. 83
>>
>>
>>
>> Thanks
>>
>>
>>
>>
>> --
>> Regards
>>
>> David
>>
>>
>> -Original Message-
>> From: Pavel Stehule [mailto:pavel.steh...@gmail.com]
>> Sent: Sunday, September 18, 2011 2:34 PM
>> To: David Rinaldi
>> Subject: Re: [HACKERS] Grouping Sets
>>
>> Hello
>>
>> The last patch should apply to 8.4 or 9.0 - you should try it. I
>> worked with the development version.
>>
>> http://archives.postgresql.org/pgsql-hackers/2010-08/msg00647.php
>>
>> Regards
>>
>> Pavel Stehule
>>
>> 2011/9/18 David Rinaldi :
>>> Hi,
>>>
>>> I tried to apply the Grouping Sets Patch to 8.4, but received several
>>> "Hunk failed" messages; does anyone know if the failing hunks can be applied
>>> manually?  Or what version they were applied to specifically?
>>>
>>> --
>>> Regards
>>>
>>> David
>>>
>>
>>
>
>


Re: [HACKERS] Back-branch releases upcoming this week

2011-09-20 Thread Andrew Dunstan



On 09/20/2011 02:46 AM, Devrim GÜNDÜZ wrote:
> On Tue, 2011-09-20 at 01:37 -0400, Tom Lane wrote:
>> As has been mentioned a couple times, we're well overdue for updates
>> of the back branches.  Seems like time to get that done, so we'll be
>> wrapping 8.2.x and up this Thursday for release Monday the 26th.
>
> Can we also specify a final release version for 8.2? This set will be
> 8.2.21, and I propose to EOL 8.2 as of 8.2.22.

I don't see why we should deviate from the policy at . In
all probability, that means there will be one more release for 8.2 after
this, but I don't think we need to determine that now.


cheers

andrew



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Greg Stark
On Tue, Sep 20, 2011 at 11:03 AM, Simon Riggs  wrote:
>> I don't see what difference it makes which process does the I/O. If a
>> write() by checkpointer process blocks, any write()s by the separate
>> bgwriter process at that time will block too. If the I/O is not saturated,
>> and the checkpoint write()s don't block, then even without this patch, the
>> bgwriter process can handle its usual bgwriter duties during checkpoint just
>> fine. (And if the I/O is not saturated, it's not an I/O bound system
>> anyway.)
>
> Whatever value you assign to the bgwriter, then this patch makes sure
> that happens during heavy fsyncs.

I think his point is that it doesn't, because if the heavy fsyncs cause
the system to be I/O bound then the bgwriter will just block issuing
the writes instead of the fsyncs.

I'm not actually convinced. Writes will only block if the kernel
decides to block. We don't really know how the kernel makes this
decision but it's entirely possible that having pending physical i/o
issued due to an fsync doesn't influence the decision if there is
still a reasonable number of dirty pages in the buffer cache.  In a
sense, "I/O bound" means different things for write and fsync. Or to
put it another way fsync is latency sensitive but write is only
bandwidth sensitive.

All that said my question is which way is the code more legible and
easier to follow?



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Heikki Linnakangas

On 20.09.2011 16:29, Greg Stark wrote:
> On Tue, Sep 20, 2011 at 11:03 AM, Simon Riggs  wrote:
>>> I don't see what difference it makes which process does the I/O. If a
>>> write() by checkpointer process blocks, any write()s by the separate
>>> bgwriter process at that time will block too. If the I/O is not saturated,
>>> and the checkpoint write()s don't block, then even without this patch, the
>>> bgwriter process can handle its usual bgwriter duties during checkpoint just
>>> fine. (And if the I/O is not saturated, it's not an I/O bound system
>>> anyway.)
>>
>> Whatever value you assign to the bgwriter, then this patch makes sure
>> that happens during heavy fsyncs.
>
> I think his point is that it doesn't, because if the heavy fsyncs cause
> the system to be I/O bound then the bgwriter will just block issuing
> the writes instead of the fsyncs.
>
> I'm not actually convinced. Writes will only block if the kernel
> decides to block. We don't really know how the kernel makes this
> decision but it's entirely possible that having pending physical i/o
> issued due to an fsync doesn't influence the decision if there is
> still a reasonable number of dirty pages in the buffer cache.  In a
> sense, "I/O bound" means different things for write and fsync. Or to
> put it another way fsync is latency sensitive but write is only
> bandwidth sensitive.

Yeah, I was thinking of write()s, not fsyncs. I agree this might have
some effect during the fsync phase.

> All that said my question is which way is the code more legible and
> easier to follow?

Hear hear. If we're going to give the bgwriter more responsibilities,
this might make sense even if it has no effect on performance.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Magnus Hagander
On Tue, Sep 20, 2011 at 15:35, Heikki Linnakangas
 wrote:
> On 20.09.2011 16:29, Greg Stark wrote:
>>
>> On Tue, Sep 20, 2011 at 11:03 AM, Simon Riggs
>>  wrote:

>>>> I don't see what difference it makes which process does the I/O. If a
>>>> write() by checkpointer process blocks, any write()s by the separate
>>>> bgwriter process at that time will block too. If the I/O is not saturated,
>>>> and the checkpoint write()s don't block, then even without this patch, the
>>>> bgwriter process can handle its usual bgwriter duties during checkpoint just
>>>> fine. (And if the I/O is not saturated, it's not an I/O bound system
>>>> anyway.)
>>>
>>> Whatever value you assign to the bgwriter, then this patch makes sure
>>> that happens during heavy fsyncs.
>>
>> I think his point is that it doesn't because if the heavy fsyncs cause
>> the system to be i/o bound it then bgwriter will just block issuing
>> the writes instead of the fsyncs.
>>
>> I'm not actually convinced. Writes will only block if the kernel
>> decides to block. We don't really know how the kernel makes this
>> decision but it's entirely possible that having pending physical i/o
>> issued due to an fsync doesn't influence the decision if there is
>> still a reasonable number of dirty pages in the buffer cache.  In a
>> sense, "I/O bound" means different things for write and fsync. Or to
>> put it another way fsync is latency sensitive but write is only
>> bandwidth sensitive.
>
> Yeah, I was thinking of write()s, not fsyncs. I agree this might have some
> effect during fsync phase.
>
>> All that said my question is which way is the code more legible and
>> easier to follow?
>
> Hear hear. If we're going to give the bgwriter more responsibilities, this
> might make sense even if it has no effect on performance.

Isn't there also the advantage that work put in two different
processes can use two different CPU cores? Or is that likely to never
ever come into play here?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-20 Thread Peter Eisentraut
On sön, 2011-09-18 at 12:43 -0400, Tom Lane wrote:
> Peter Eisentraut  writes:
> > On sön, 2011-09-18 at 09:45 -0500, Dave Page wrote:
> >> That is much more reasonable, though unfortunately not what was said.
> >> Regardless, I stand by my main point that such a representative should
> >> be communicating with the project regularly. Having a rep who works
> >> outside the project is of no use at all. 
> 
> > Well, the point of this thread is, how can she communicate?
> 
> +1 for a closed mailing list.  It's a bit annoying to have to do such
> a thing, but it's not like we haven't got other closed lists for
> appropriate purposes.

Well, that much we've already decided a few years ago.  The blocking
issues are: (1) do we have enough interest, and (2) where to put it (I'm
looking at you, pgfoundry).

> I guess the real question is, exactly what will be the requirements
> for joining?

As far as I'm concerned, anyone who is known in the community and has
a plausible interest can join.  The requirement is that we share this
material with "colleagues" for consultation, as opposed to posting it on
the public internet.





Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-20 Thread Alvaro Herrera

Excerpts from Peter Eisentraut's message of mar sep 20 10:51:51 -0300 2011:
> On sön, 2011-09-18 at 12:43 -0400, Tom Lane wrote:
> > Peter Eisentraut  writes:
> > > On sön, 2011-09-18 at 09:45 -0500, Dave Page wrote:
> > >> That is much more reasonable, though unfortunately not what was said.
> > >> Regardless, I stand by my main point that such a representative should
> > >> be communicating with the project regularly. Having a rep who works
> > >> outside the project is of no use at all. 
> > 
> > > Well, the point of this thread is, how can she communicate?
> > 
> > +1 for a closed mailing list.  It's a bit annoying to have to do such
> > a thing, but it's not like we haven't got other closed lists for
> > appropriate purposes.
> 
> Well, that much we've already decided a few years ago.  The blocking
> issues are: (1) do we have enough interest, and (2) where to put it (I'm
> looking at you, pgfoundry).

I don't see why we wouldn't put it in @postgresql.org.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] WIP: Collecting statistics on CSV file data

2011-09-20 Thread Marti Raudsepp
2011/9/12 Etsuro Fujita :
> This is called when ANALYZE command is executed. (ANALYZE
> command should be executed because autovacuum does not analyze foreign
> tables.)

This is a good idea.

However, if adding these statistics requires an explicit ANALYZE
command, then we should also have a command for resetting the
collected statistics -- to get it back into the un-analyzed state.

Currently it looks like the only way to reset statistics is to tamper
with catalogs directly, or recreate the foreign table.

Regards,
Marti



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Heikki Linnakangas

On 20.09.2011 16:49, Magnus Hagander wrote:
> Isn't there also the advantage of that work put in two different
> processes can use two different CPU cores? Or is that likely to never
> ever come in play here?

You would need one helluva I/O system to saturate even a single CPU,
just by doing write+fsync.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Back-branch releases upcoming this week

2011-09-20 Thread Tom Lane
Andrew Dunstan  writes:
> On 09/20/2011 02:46 AM, Devrim GÜNDÜZ wrote:
>> Can we also specify a final release version for 8.2? This set will be
>> 8.2.21, and I propose to EOL 8.2 as of 8.2.22.

> I don't see why we should deviate from the policy at 
> . In 
> all probability, that means there will be one more release for 8.2 after 
> this, but I don't think we need to determine that now.

Per that policy, there will certainly be at least one more 8.2.x
release, namely the first one "after December 2011".  There could be
more than one, if we are pressed into making a set of releases between
now and December.

I don't think we've yet decided what the policy means if a release
happens during the stated calendar month, which seems rather likely
this time around in view of our historical record of doing updates
roughly quarterly.  Should we settle that detail now?  That is,
does "after December" really mean "in or after December", or did we
really mean "after"?

regards, tom lane



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Cédric Villemain
2011/9/20 Heikki Linnakangas :
> On 20.09.2011 16:49, Magnus Hagander wrote:
>>
>> Isn't there also the advantage of that work put in two different
>> processes can use two different CPU cores? Or is that likely to never
>> ever come in play here?
>
> You would need one helluva I/O system to saturate even a single CPU, just by
> doing write+fsync.

Magnus's point is valid. There is possible throttling done by
Linux per node, per process/task.
Since ..2.6.37 (32 ?), I believe, there is more temptation to have
per-cgroup io/sec limits, and there exists some promising work
done to have better IO bandwidth throttling per process.

IMO, splitting the type of IO workload per process allows
administrators to have more control over the IO limits they want to have
(and it may help the kernel(s) to have a better strategy ?)
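
For example, with the cgroup blkio controller something along these lines is
possible (a sketch only; the cgroup name is made up, and the mount path and
device major:minor numbers are system-specific):

    mkdir /sys/fs/cgroup/blkio/pg_checkpointer
    # cap writes for that group to ~10 MB/s on device 8:0
    echo "8:0 10485760" > /sys/fs/cgroup/blkio/pg_checkpointer/blkio.throttle.write_bps_device
    # move the checkpointer process into the group
    echo $CHECKPOINTER_PID > /sys/fs/cgroup/blkio/pg_checkpointer/tasks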

>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>



-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation



Re: [HACKERS] Back-branch releases upcoming this week

2011-09-20 Thread Andrew Dunstan



On 09/20/2011 10:28 AM, Tom Lane wrote:
> I don't think we've yet decided what the policy means if a release
> happens during the stated calendar month, which seems rather likely
> this time around in view of our historical record of doing updates
> roughly quarterly.  Should we settle that detail now?  That is,
> does "after December" really mean "in or after December", or did we
> really mean "after"?

If we really want to get that specific, let's just say that the EOL date
is at the end of the designated month.


cheers

andrew



Re: [HACKERS] Back-branch releases upcoming this week

2011-09-20 Thread Dave Page
2011/9/20 Andrew Dunstan :
>
>
> On 09/20/2011 10:28 AM, Tom Lane wrote:
>>
>> I don't think we've yet decided what the policy means if a release
>> happens during the stated calendar month, which seems rather likely
>> this time around in view of our historical record of doing updates
>> roughly quarterly.  Should we settle that detail now?  That is,
>> does "after December" really mean "in or after December", or did we
>> really mean "after"?
>>
>>
>
> If we really want to get that specific, let's just say that the EOL date is
> at the end of the designated month.

+1


-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Back-branch releases upcoming this week

2011-09-20 Thread Tom Lane
Dave Page  writes:
> 2011/9/20 Andrew Dunstan :
>> On 09/20/2011 10:28 AM, Tom Lane wrote:
>>> does "after December" really mean "in or after December", or did we
>>> really mean "after"?

>> If we really want to get that specific, let's just say that the EOL date is
>> at the end of the designated month.

> +1

OK, so "after" really means "after", ie, the last 8.2.x release will be
timestamped 2012-something.  Fine with me.

regards, tom lane



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Heikki Linnakangas

On 20.09.2011 17:31, Cédric Villemain wrote:
> 2011/9/20 Heikki Linnakangas:
>> On 20.09.2011 16:49, Magnus Hagander wrote:
>>> Isn't there also the advantage of that work put in two different
>>> processes can use two different CPU cores? Or is that likely to never
>>> ever come in play here?
>>
>> You would need one helluva I/O system to saturate even a single CPU, just by
>> doing write+fsync.
>
> Magnus's point is valid. There is possible throttling done by
> Linux per node, per process/task.
> Since ..2.6.37 (32 ?), I believe, there is more temptation to have
> per-cgroup io/sec limits, and there exists some promising work
> done to have better IO bandwidth throttling per process.
>
> IMO, splitting the type of IO workload per process allows
> administrators to have more control over the IO limits they want to have
> (and it may help the kernel(s) to have a better strategy ?)

That is a separate issue from being able to use different CPU cores. But
cool! I didn't know Linux can do that nowadays. That could be highly
useful, if you can put e.g. autovacuum in a different cgroup from regular
backends.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Back-branch releases upcoming this week

2011-09-20 Thread Dave Page
2011/9/20 Tom Lane :
> Dave Page  writes:
>> 2011/9/20 Andrew Dunstan :
>>> On 09/20/2011 10:28 AM, Tom Lane wrote:
>>>> does "after December" really mean "in or after December", or did we
>>>> really mean "after"?
>
>>> If we really want to get that specific, let's just say that the EOL date is
>>> at the end of the designated month.
>
>> +1
>
> OK, so "after" really means "after", ie, the last 8.2.x release will be
> timestamped 2012-something.  Fine with me.

It's unfortunate, but it seems to me it's the only interpretation that
doesn't risk taking someone by surprise.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Robert Haas
On Tue, Sep 20, 2011 at 9:35 AM, Heikki Linnakangas
 wrote:
>> All that said my question is which way is the code more legible and
>> easier to follow?
>
> Hear hear. If we're going to give the bgwriter more responsibilities, this
> might make sense even if it has no effect on performance.

I agree.  I don't think this change needs to be justified on
performance grounds; there are enough collateral benefits to make it
worthwhile.  If the checkpoint process handles all the stuff with
highly variable latency (i.e. fsyncs), then the background writer work
will happen more regularly and predictably.  The code will also be
simpler, which I think will open up opportunities for additional
optimizations such as (perhaps) making the background writer only wake
up when there are dirty buffers to write, which ties in to
longstanding concerns about power consumption.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Back-branch releases upcoming this week

2011-09-20 Thread Tom Lane
Dave Page  writes:
> On Tue, Sep 20, 2011 at 12:37 AM, Tom Lane  wrote:
>> As has been mentioned a couple times, we're well overdue for updates of
>> the back branches.  Seems like time to get that done, so we'll be
>> wrapping 8.2.x and up this Thursday for release Monday the 26th.

> 8.2 up, including 9.1.1? I'm not sure our QA guys will be able to cope
> with verification of so many individual installers in that timeframe -
> 8.2 - 9.0 is hard enough to do in one go.

Well, all the pre-9.1 branches are definitely badly in need of updates.
9.1 maybe could go without at this point, but we do have one crasher bug
and one serious memory leak fixed there, neither new in 9.1.  I'd just
as soon not establish a precedent for not releasing the same fixes at
the same time in all branches.

How about we wrap them all, but you could let your team slip the 9.1
update for a day or so if they need more time?  It's certainly less
critical than the older branches.

regards, tom lane



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Marti Raudsepp
On Fri, Sep 16, 2011 at 01:53, Simon Riggs  wrote:
> This patch splits bgwriter into 2 processes: checkpointer and
> bgwriter, seeking to avoid contentious changes. Additional changes are
> expected in this release to build upon these changes for both new
> processes, though this patch stands on its own as both a performance
>> vehicle and in some ways a refactoring to simplify the code.

While you're already splitting up bgwriter, could there be any benefit
to spawning a separate bgwriter process for each tablespace?

If your database has one tablespace on a fast I/O system and another
on a slow one, the slow tablespace would also bog down background
writing for the fast tablespace. But I don't know whether that's
really a problem or not.

Regards,
Marti



Re: [HACKERS] Back-branch releases upcoming this week

2011-09-20 Thread Dave Page
On Tue, Sep 20, 2011 at 3:57 PM, Tom Lane  wrote:
> Dave Page  writes:
>> On Tue, Sep 20, 2011 at 12:37 AM, Tom Lane  wrote:
>>> As has been mentioned a couple times, we're well overdue for updates of
>>> the back branches.  Seems like time to get that done, so we'll be
>>> wrapping 8.2.x and up this Thursday for release Monday the 26th.
>
>> 8.2 up, including 9.1.1? I'm not sure our QA guys will be able to cope
>> with verification of so many individual installers in that timeframe -
>> 8.2 - 9.0 is hard enough to do in one go.
>
> Well, all the pre-9.1 branches are definitely badly in need of updates.
> 9.1 maybe could go without at this point, but we do have one crasher bug
> and one serious memory leak fixed there, neither new in 9.1.  I'd just
> as soon not establish a precedent for not releasing the same fixes at
> the same time in all branches.
>
> How about we wrap them all, but you could let your team slip the 9.1
> update for a day or so if they need more time?  It's certainly less
> critical than the older branches.

OK, well we can push the installers much more quickly over the CDN anyway.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP: Collecting statistics on CSV file data

2011-09-20 Thread Tom Lane
Marti Raudsepp  writes:
> 2011/9/12 Etsuro Fujita :
>> This is called when ANALYZE command is executed. (ANALYZE
>> command should be executed because autovacuum does not analyze foreign
>> tables.)

> This is a good idea.

> However, if adding these statistics requires an explicit ANALYZE
> command, then we should also have a command for resetting the
> collected statistics -- to get it back into the un-analyzed state.

Uh, why?  There is no UNANALYZE operation for ordinary tables, and
I've never heard anyone ask for one.

If you're desperate you could manually delete the relevant rows in
pg_statistic, a solution that would presumably work for foreign tables
too.
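
For instance, a rough sketch of that manual approach (superuser only, and
generally not recommended since it bypasses the usual stats machinery; the
table name is made up):

    DELETE FROM pg_statistic
     WHERE starelid = 'my_foreign_table'::regclass;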

Probably a more interesting question is why we wouldn't change
autovacuum so that it calls this automatically for foreign tables.

(Note: I'm unconvinced that there's a use-case for this in the case of
"real" foreign tables on a remote server --- it seems likely that the
wrapper ought to ask the remote server for its current stats, instead.
But it's clearly useful for non-server-backed sources such as file_fdw.)

regards, tom lane



Re: [HACKERS] WIP: Collecting statistics on CSV file data

2011-09-20 Thread Euler Taveira de Oliveira

On 20-09-2011 11:12, Marti Raudsepp wrote:
> 2011/9/12 Etsuro Fujita:
>> This is called when ANALYZE command is executed. (ANALYZE
>> command should be executed because autovacuum does not analyze foreign
>> tables.)
>
> This is a good idea.
>
> However, if adding these statistics requires an explicit ANALYZE
> command, then we should also have a command for resetting the
> collected statistics -- to get it back into the un-analyzed state.

Why would you want this? If the stats aren't up to date, run ANALYZE
periodically. Remember that it is part of the DBA maintenance tasks [1].

[1] http://www.postgresql.org/docs/current/static/maintenance.html


--
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] File not found error on creating collation

2011-09-20 Thread Tom Lane
Thom Brown  writes:
> On 20 September 2011 05:20, Tom Lane  wrote:
>> Robert Haas  writes:
>>> Maybe something like this?
>>> HINT: The operating system was unable to find any locale data for the
>>> locale name you specified.

>> Hmm, that's not bad.  We could probably even call it errdetail, since
>> it's not so much a hint as explaining what the SUS spec states that the
>> ENOENT error code means here.

> Sounds good to me.  If this is to be the errdetail, does that mean
> you'd be keeping the original message in tact?  The problem with the
> actual error message is that it might cause the user to think along
> the lines of "Am I supposed to put a fully qualified path in this
> parameter?".

[ shrug... ] And who's to say that that's wrong?  We have no knowledge
of the OS's conventions for naming locales.  There's a limit to how
friendly we can make this message without turning it into something
that's actively misleading for platforms we failed to consider.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-20 Thread Robert Haas
On Tue, Sep 20, 2011 at 11:01 AM, Marti Raudsepp  wrote:
> On Fri, Sep 16, 2011 at 01:53, Simon Riggs  wrote:
>> This patch splits bgwriter into 2 processes: checkpointer and
>> bgwriter, seeking to avoid contentious changes. Additional changes are
>> expected in this release to build upon these changes for both new
>> processes, though this patch stands on its own as both a performance
>> vehicle and in some ways a refactoring to simplify the code.
>
> While you're already splitting up bgwriter, could there be any benefit
> to spawning a separate bgwriter process for each tablespace?
>
> If your database has one tablespace on a fast I/O system and another
> on a slow one, the slow tablespace would also bog down background
> writing for the fast tablespace. But I don't know whether that's
> really a problem or not.

I doubt it.  Most of the time the writes are going to be absorbed by
the OS write cache anyway.

I think there's probably more performance to be squeezed out of the
background writer, but maybe not that exact thing, and in any case it
seems like material for a separate patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Tom Lane
Simon Riggs  writes:
> I sympathise with this view, to an extent.

> If people want to put all parameters in one file, they can do so. So +1 to 
> that.

> Should they be forced to adopt that new capability by us deliberately
> breaking their existing setups? No. So -1 to that.

> If we do an automatic include of recovery.conf first, then follow by
> reading postgresql,conf then we will preserve the old as well as
> allowing the new.

I don't buy this argument at all.  I don't believe that recovery.conf is
part of anyone's automated processes at all, let alone to an extent that
they won't be able to cope with a change to rationalize the file layout.
And most especially I don't buy that someone who does want to keep using
it couldn't cope with adding an "include" to postgresql.conf manually.

If we're going to move these parameters into postgresql.conf, we should
just do that and remove all mention of recovery.conf.  Anything else
will generate much more confusion than benefit.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Josh Berkus

> I don't buy this argument at all.  I don't believe that recovery.conf is
> part of anyone's automated processes at all, let alone to an extent that
> they won't be able to cope with a change to rationalize the file layout.
> And most especially I don't buy that someone who does want to keep using
> it couldn't cope with adding an "include" to postgresql.conf manually.

Speaking as someone who has a lot of client admin scripts written around
recovery.conf, we will be *thrilled* to update those scripts in order to
simplify the configuration file situation.  Having a third conf file (or
fourth, or fifth, depending on which auth features you're using) already
adds unwarranted complexity to automation scripts, and each config file
we get rid of makes those scripts easier to maintain and troubleshoot.

For that matter, I'd love to consolidate pg_hba and pg_ident into a
single file.  Any chance?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] File not found error on creating collation

2011-09-20 Thread Tom Lane
Thom Brown  writes:
> [ unhelpful reporting of ENOENT from newlocale() ]

BTW, on examining the code I note that we're doing something else that
promotes the confusion of "bad locale name" with "bad file name": we're
using errcode_for_file_access() to select the SQLSTATE.  If we don't
believe that ENOENT should be taken at face value then this is pointless
(none of the other spec-defined error codes for newlocale() are
particularly sensible as file access errors).  I propose just reporting
ERRCODE_INVALID_PARAMETER_VALUE instead.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] File not found error on creating collation

2011-09-20 Thread Thom Brown
On 20 September 2011 17:45, Tom Lane  wrote:
> Thom Brown  writes:
>> [ unhelpful reporting of ENOENT from newlocale() ]
>
> BTW, on examining the code I note that we're doing something else that
> promotes the confusion of "bad locale name" with "bad file name": we're
> using errcode_for_file_access() to select the SQLSTATE.  If we don't
> believe that ENOENT should be taken at face value then this is pointless
> (none of the other spec-defined error codes for newlocale() are
> particularly sensible as file access errors).  I propose just reporting
> ERRCODE_INVALID_PARAMETER_VALUE instead.

*nods*

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Josh Berkus
All,

First, if we're going to change behavior, I assert that we should stop
calling stuff "recovery" and either call it "replica" or "standby".  Our
use of the word "recovery" confuses users; it is historical in nature
and requires an understanding of PostgreSQL internals to know why it's
called that.  It's also inconsistent with our use of the word "standby"
everywhere else.

Second, I haven't seen a response to this:

> Do we want a trigger file to enable recovery, or one to *disable*
> recovery?  Or both?

I'll go further and say that we only want one trigger file by default,
one which either enables or disables recovery.  I'll further suggest
that we:

a) have a standby.on file which puts the server in replica/recovery mode
if it's detected on startup, and
b) that we poll for the standby.on file going away as a trigger to stop
recovery and bring up the server in master mode, and
c) that pg_basebackup automatically create a standby.on file.

> Perhaps we need a new SCOPE attribute on pg_settings to show whether
> the parameter applies in recovery, in normal or both.

An overhaul of the category tree would also do that.  I've been putting
off an overhaul/cleanup of categories for pg_settings, maybe it's about
time.

> There is a potential security hole if people hardcode passwords into
> primary_conninfo. As long as we document not to do that, we're OK.

Yeah, I'd almost be inclined to actively prohibit this, but that would
draw user complaints.  We'll have to be satisfied with a doc plus a comment.

Speaking of which, .pgpass could be better documented as an option for
handling this sort of thing.  Will take a stab, eventually.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Robert Haas
On Tue, Sep 20, 2011 at 1:01 PM, Josh Berkus  wrote:
> I'll go further and say that we only want one trigger file by default,
> one which either enables or disables recovery.  I'll further suggest
> that we:
>
> a) have a standby.on file which puts the server in replica/recovery mode
> if it's detected on startup, and
> b) that we poll for the standby.on file going away as a trigger to stop
> recovery and bring up the server in master mode, and
> c) that pg_basebackup automatically create a standby.on file.

It seems a bit confusing to me to have a file that takes effect only
at startup when created but anytime when removed.

I think one of the insufficiently-lauded 9.1 features is Fujii Masao's
"pg_ctl promote".   Now THAT is a good interface.  Unlike
trigger_file, it doesn't require any advance preparation, or monkeying
with files on disk.  You just tell it to promote, and it does.  Sweet.
 Now it turns out that it uses a file to make that happen behind the
scenes, but who cares?  From the user's perspective It Just Works.

I like the idea of some kind of sentinel file that tells the server to
start up in recovery mode.  But instead of having the user remove it
to cause a promotion, I think the server should remove it when it does
promote.  That's more like what we've done in the past, and it ties in
very nicely with what pg_ctl promote already does.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Josh Berkus
On 9/20/11 10:09 AM, Robert Haas wrote:
> I like the idea of some kind of sentinel file that tells the server to
> start up in recovery mode.  But instead of having the user remove it
> to cause a promotion, I think the server should remove it when it does
> promote.  That's more like what we've done in the past, and it ties in
> very nicely with what pg_ctl promote already does.

Yes, I agree that that is a superior approach.

And then we could keep the trigger_file configuration parameter for
backwards compatibility, with intent to deprecate it after a couple
more Postgres versions.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] File not found error on creating collation

2011-09-20 Thread Tom Lane
Thom Brown  writes:
> On 20 September 2011 17:45, Tom Lane  wrote:
>> BTW, on examining the code I note that we're doing something else that
>> promotes the confusion of "bad locale name" with "bad file name": we're
>> using errcode_for_file_access() to select the SQLSTATE.  If we don't
>> believe that ENOENT should be taken at face value then this is pointless
>> (none of the other spec-defined error codes for newlocale() are
>> particularly sensible as file access errors).  I propose just reporting
>> ERRCODE_INVALID_PARAMETER_VALUE instead.

> *nods*

OK, done.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Tom Lane
Josh Berkus  writes:
> First, if we're going to change behavior, I assert that we should stop
> calling stuff "recovery" and either call it "replica" or "standby".  Our
> use of the word "recovery" confuses users; it is historical in nature
> and requires an understanding of PostgreSQL internals to know why it's
> called that.  It's also inconsistent with our use of the word "standby"
> everywhere else.

Are we all talking about the same thing?  In my mind recovery.conf is
for configuring a point-in-time archive recovery run.  It's got nothing
to do with either replication or standbys.  Perhaps part of our problem
here is overloading that case with standby behavior.

> Second, I haven't seen a response to this:

> Do we want a trigger file to enable recovery, or one to *disable*
> recovery?  Or both?

As far as the PITR scenario is concerned, only the former can possibly
make any sense; the latter would be downright dangerous.

>> There is a potential security hole if people hardcode passwords into
>> primary_conninfo. As long as we document not to do that, we're OK.

> Yeah, I'd almost be inclined to actively prohibit this, but that would
> draw user complaints.  We'll have to be satisfied with a doc plus a comment.

I think that marking the GUC as "only readable by superuser" is a
sufficient fix.
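
In guc.c terms that just means including GUC_SUPERUSER_ONLY in the flags
of the new entry, roughly like this (a sketch only, since the GUC does
not exist yet; the group, context and variable names are illustrative):

    {
        {"primary_conninfo", PGC_POSTMASTER, REPLICATION,
            gettext_noop("Connection string the standby uses to connect to the primary."),
            NULL,
            GUC_SUPERUSER_ONLY
        },
        &PrimaryConnInfo,
        "",
        NULL, NULL, NULL
    },

With that flag set, SHOW primary_conninfo fails for anyone who is not a
superuser, which hides any password embedded in the string.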

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Robert Haas
On Tue, Sep 20, 2011 at 1:30 PM, Tom Lane  wrote:
> Josh Berkus  writes:
>> First, if we're going to change behavior, I assert that we should stop
>> calling stuff "recovery" and either call it "replica" or "standby".  Our
>> use of the word "recovery" confuses users; it is historical in nature
>> and requires an understanding of PostgreSQL internals to know why it's
>> called that.  It's also inconsistent with our use of the word "standby"
>> everywhere else.
>
> Are we all talking about the same thing?  In my mind recovery.conf is
> for configuring a point-in-time archive recovery run.  It's got nothing
> to do with either replication or standbys.

Huh?  How else can you create a standby?  I do it by creating a
recovery.conf file that says:

standby_mode=on
primary_conninfo='whatever'

I wasn't aware that there is another method.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] heap_update temporary release of buffer lock

2011-09-20 Thread Alvaro Herrera

I notice that heap_update releases the buffer lock, after checking the
HeapTupleSatisfiesUpdate result, and before marking the tuple as updated,
to pin the visibility map page -- heapam.c lines 2638ff in master branch.

Is this not a bug?  I imagine that while this code releases the lock,
someone else could acquire it and grab a FOR SHARE lock on the tuple; if
the update later ends up modifying the tuple completely, this could
cause an FK to be broken, for example.

The other piece of that routine that releases the buffer lock, to toast
the tuple, is careful enough to set the Xmax to itself before releasing
the lock, which seems to me the right thing to do, because then the
prospective locker would have to wait until this transaction finishes
before being able to grab the lock.  Is this not necessary in the other
path?  If so, why?

The reason I care is because I need to do something to this code for the
FOR KEY SHARE stuff I'm working on (not yet sure what).

(I CC both Robert and Heikki because I don't remember whose work it was
on the VM stuff).

-- 
Álvaro Herrera 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] File not found error on creating collation

2011-09-20 Thread Thom Brown
On 20 September 2011 18:25, Tom Lane  wrote:
> Thom Brown  writes:
>> On 20 September 2011 17:45, Tom Lane  wrote:
>>> BTW, on examining the code I note that we're doing something else that
>>> promotes the confusion of "bad locale name" with "bad file name": we're
>>> using errcode_for_file_access() to select the SQLSTATE.  If we don't
>>> believe that ENOENT should be taken at face value then this is pointless
>>> (none of the other spec-defined error codes for newlocale() are
>>> particularly sensible as file access errors).  I propose just reporting
>>> ERRCODE_INVALID_PARAMETER_VALUE instead.
>
>> *nods*
>
> OK, done.

Thanks.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap_update temporary release of buffer lock

2011-09-20 Thread Heikki Linnakangas

On 20.09.2011 20:42, Alvaro Herrera wrote:

I notice that heap_update releases the buffer lock, after checking the
HeapTupleSatisfiesUpdate result, and before marking the tuple as updated,
to pin the visibility map page -- heapam.c lines 2638ff in master branch.

Is this not a bug?  I imagine that while this code releases the lock,
someone else could acquire it and grab a FOR SHARE lock on the tuple; if
the update later ends up modifying the tuple completely, this could
cause an FK to be broken, for example.


Yeah, I think you're right. Not only might someone grab a FOR SHARE lock 
on the tuple, but someone might even update it under your nose.



The other piece of that routine that releases the buffer lock, to toast
the tuple, is careful enough to set the Xmax to itself before releasing
the lock, which seems to me the right thing to do, because then the
prospective locker would have to wait until this transaction finishes
before being able to grab the lock.  Is this not necessary in the other
path?  If so, why?


Yeah, we could do the same when relocking to pin the VM page. Or just 
add a "goto l2" there to start over.


BTW, I think we're playing a bit fast and loose with that 
set-xmax-before-unlocking trick. We haven't WAL-logged anything at that 
point yet, so it's possible that while we're busy toasting the tuple, 
the page is flushed from shared buffers with the xmax and the infomask 
already updated. Now, the system crashes, and you get a torn page, so 
that the xmax is already updated, but the HEAP_XMAX_COMMITTED flag was 
*not* cleared, so it's still set. Oops. Highly unlikely in practice, 
because xmax and infomask are very close to each other, but it violates 
the principles of what we usually expect from the underlying system wrt. 
torn pages.



(I CC both Robert and Heikki because I don't remember whose work it was
on the VM stuff).


Fortunately, this race was in Robert's patch for 9.2, so this doesn't 
need to be back-patched.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Back-branch releases upcoming this week

2011-09-20 Thread Darren Duncan

Tom Lane wrote:

Dave Page  writes:

On Tue, Sep 20, 2011 at 12:37 AM, Tom Lane  wrote:

As has been mentioned a couple times, we're well overdue for updates of
the back branches.  Seems like time to get that done, so we'll be
wrapping 8.2.x and up this Thursday for release Monday the 26th.



8.2 up, including 9.1.1? I'm not sure our QA guys will be able to cope
with verification of so many individual installers in that timeframe -
8.2 - 9.0 is hard enough to do in one go.


Well, all the pre-9.1 branches are definitely badly in need of updates.
9.1 maybe could go without at this point, but we do have one crasher bug
and one serious memory leak fixed there, neither new in 9.1.  I'd just
as soon not establish a precedent for not releasing the same fixes at
the same time in all branches.

How about we wrap them all, but you could let your team slip the 9.1
update for a day or so if they need more time?  It's certainly less
critical than the older branches.


I would prefer that all branches have synchronized patch releases, as they
seem to have had in the past, and that the latest production branch (9.1.1 in
this case) is included even if its change set is smaller.

-- Darren Duncan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap_update temporary release of buffer lock

2011-09-20 Thread Robert Haas
On Tue, Sep 20, 2011 at 2:28 PM, Heikki Linnakangas
 wrote:
> On 20.09.2011 20:42, Alvaro Herrera wrote:
>> I notice that heap_update releases the buffer lock, after checking the
>> HeapTupleSatisfiesUpdate result, and before marking the tuple as updated,
>> to pin the visibility map page -- heapam.c lines 2638ff in master branch.
>>
>> Is this not a bug?  I imagine that while this code releases the lock,
>> someone else could acquire it and grab a FOR SHARE lock on the tuple; if
>> the update later ends up modifying the tuple completely, this could
>> cause an FK to be broken, for example.
>
> Yeah, I think you're right. Not only might someone grab a FOR SHARE lock on
> the tuple, but someone might even update it under your nose.

Yeah, I think he's right, too.  :-(

The easiest fix seems to be (as you suggest) to add "goto l2" after
reacquiring the lock.  Can we get away with (and is there any benefit
to) doing that only if xmax has changed?

> BTW, I think we're playing a bit fast and loose with that
> set-xmax-before-unlocking trick. We haven't WAL-logged anything at that
> point yet, so it's possible that while we're busy toasting the tuple, the
> page is flushed from shared buffers with the xmax and the infomask already
> updated. Now, the system crashes, and you get a torn page, so that the xmax
> is already updated, but the HEAP_XMAX_COMMITTED flag was *not* cleared, so
> it's still set. Oops. Highly unlikely in practice, because xmax and infomask
> are very close to each other, but it violates the principles of what we
> usually expect from the underlying system wrt. torn pages.

I think our usual assumption is that the disk might write in chunks as
small as 512 bytes.  IIUC, tuples are only guaranteed to have 8-byte
alignment, and xmax is at offset 4, while the infomask is at offset
20.  So that doesn't seem safe.  If we were doing this over again we
could rearrange things to put them in the same eight-byte aligned
chunk by  (i.e. swap t_xmax and t_field3, move t_infomask before
t_ctid), but it's a bit late for that now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Tom Lane
Robert Haas  writes:
> On Tue, Sep 20, 2011 at 1:30 PM, Tom Lane  wrote:
>> Are we all talking about the same thing?  In my mind recovery.conf is
>> for configuring a point-in-time archive recovery run.  It's got nothing
>> to do with either replication or standbys.

> Huh?  How else can you create a standby?  I do it by creating a
> recovery.conf file that says:
> standby_mode=on

The point I'm trying to make is that it seems like this discussion is
getting driven entirely by the standby case, without remembering that
recovery.conf was originally designed for, and is still used in,
a significantly different use-case.  Maybe we had better take two
steps back and think about the implications for the archive-recovery
case.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap_update temporary release of buffer lock

2011-09-20 Thread Alvaro Herrera

Excerpts from Robert Haas's message of mar sep 20 16:04:03 -0300 2011:
> On Tue, Sep 20, 2011 at 2:28 PM, Heikki Linnakangas
>  wrote:
> > On 20.09.2011 20:42, Alvaro Herrera wrote:
> >> I notice that heap_update releases the buffer lock, after checking the
> >> HeapTupleSatisfiesUpdate result, and before marking the tuple as updated,
> >> to pin the visibility map page -- heapam.c lines 2638ff in master branch.
> >>
> >> Is this not a bug?  I imagine that while this code releases the lock,
> >> someone else could acquire it and grab a FOR SHARE lock on the tuple; if
> >> the update later ends up modifying the tuple completely, this could
> >> cause an FK to be broken, for example.
> >
> > Yeah, I think you're right. Not only might someone grab a FOR SHARE lock on
> > the tuple, but someone might even update it under your nose.
> 
> Yeah, I think he's right, too.  :-(
> 
> The easiest fix seems to be (as you suggest) to add "goto l2" after
> reacquiring the lock.  Can we get away with (and is there any benefit
> to) doing that only if xmax has changed?

Hmm ... I think that works, and it would suit my purposes too.  Note
this means you have to recheck infomask too (otherwise consider that
IS_MULTI could be set the first time, and not set the second time, and
that makes the Xmax have a different meaning.)  OTOH if you just do it
always, it is simpler.
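
For concreteness, the conditional variant would look roughly like this
(xmax_old and infomask_old being hypothetical locals saved before the
lock is released; this is only a sketch, not tested code):

    xmax_old = HeapTupleHeaderGetXmax(oldtup.t_data);
    infomask_old = oldtup.t_data->t_infomask;
    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
    visibilitymap_pin(relation, block, &vmbuffer);
    LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
    if (!TransactionIdEquals(xmax_old, HeapTupleHeaderGetXmax(oldtup.t_data)) ||
        infomask_old != oldtup.t_data->t_infomask)
        goto l2;        /* tuple may have changed underneath us; recheck */

The unconditional version is the same minus the comparison, with the
goto done every time.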

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap_update temporary release of buffer lock

2011-09-20 Thread Tom Lane
Alvaro Herrera  writes:
> Excerpts from Robert Haas's message of mar sep 20 16:04:03 -0300 2011:
>>> On 20.09.2011 20:42, Alvaro Herrera wrote:
 I notice that heap_update releases the buffer lock, after checking the
 HeapTupleSatisfiesUpdate result, and before marking the tuple as updated,
 to pin the visibility map page -- heapam.c lines 2638ff in master branch.

>> The easiest fix seems to be (as you suggest) to add "goto l2" after
>> reacquiring the lock.  Can we get away with (and is there any benefit
>> to) doing that only if xmax has changed?

> Hmm ... I think that works, and it would suit my purposes too.  Note
> this means you have to recheck infomask too (otherwise consider that
> IS_MULTI could be set the first time, and not set the second time, and
> that makes the Xmax have a different meaning.)  OTOH if you just do it
> always, it is simpler.

Yeah, I think a "goto l2" is correct and sufficient.  As the comment
already notes, this need not be a high-performance path, so why spend
extra code (with extra risk of bugs)?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Josh Berkus

> The point I'm trying to make is that it seems like this discussion is
> getting driven entirely by the standby case, without remembering that
> recovery.conf was originally designed for, and is still used in,
> a significantly different use-case.  Maybe we had better take two
> steps back and think about the implications for the archive-recovery
> case.

I think we should take that into consideration, sure.  But it should not
be in the driver's seat for things like nomenclature.  Far more people
use replication than use PITR.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-20 Thread Robert Haas
On Tue, Sep 20, 2011 at 3:10 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Sep 20, 2011 at 1:30 PM, Tom Lane  wrote:
>>> Are we all talking about the same thing?  In my mind recovery.conf is
>>> for configuring a point-in-time archive recovery run.  It's got nothing
>>> to do with either replication or standbys.
>
>> Huh?  How else can you create a standby?  I do it by creating a
>> recovery.conf file that says:
>> standby_mode=on
>
> The point I'm trying to make is that it seems like this discussion is
> getting driven entirely by the standby case, without remembering that
> recovery.conf was originally designed for, and is still used in,
> a significantly different use-case.  Maybe we had better take two
> steps back and think about the implications for the archive-recovery
> case.

I'm not sure there really are any implications that are worth getting
excited about.  The problem is really one of naming; I'm reminded of
our recent discussions of the use of the term "relation".  The problem
is that the word "recovery" encompasses a number of very different
scenarios.  First, we have crash recovery.  Second, we have archive
recovery (which, confusingly enough, no longer requires an archive).
Archive recovery can be further subdivided by where the xlog files are
coming from (archive, streaming replication, both, or neither) and how
long we plan to stay in recovery (forever, if acting as a standby;
until end of WAL, if promoting a standby or recovering a hot backup;
or until the recovery target is reached, if performing a point in time
recovery).  I would guess that most of those are not what the typical
user thinks of as "recovery", but what else are we gonna call it?
Josh is arguing that we ought to use the term "replication", but it
seems to me that's just as misleading - maybe moreso, since "recovery"
is sufficiently a term of art to make you at least think about reading
the manual, whereas you know (or think you know) what replication is.

For now, I think we're best off not changing the terminology, and
confining the remit of this patch to (a) turning all of the existing
recovery.conf parameters into GUCs and (b) replacing recovery.conf
with a sentinel file (name TBD) to indicate that the
server is to start in recovery mode.  The naming isn't great but the
more we change at once the less chance of reaching agreement.  It
seems like we have pretty broad agreement on the basics here, so let's
start with that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CUDA Sorting

2011-09-20 Thread Nulik Nol
>
> I already did some benchmarks with GPU sorting (not in pgsql), and
> measured total sort times, copy bandwidth and energy usage, and got
> some exciting results:
Was that qsort implementation on the CPU cache-friendly and optimized for SSE?
To make a fair comparison you have to take the best CPU implementation
and compare it to the best GPU implementation; otherwise you are
comparing a full-throttle GPU against a lazy CPU.
Check this paper on how hash join was optimized 17x when SSE
instructions were used.
www.vldb.org/pvldb/2/vldb09-257.pdf

Regards


-- 
==
The power of zero is infinite

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PostgreSQL X/Open Socket / BSD Socket Issue on HP-UX

2011-09-20 Thread MUHAMMAD ASIF

Hi,

I faced a similar issue to the one discussed in
http://postgresql.1045698.n5.nabble.com/Fwd-DBD-Pg-on-HP-UX-11-31-64bit-td3305163.html
(man xopen_networking -
http://docstore.mik.ua/manuals/hp-ux/en/B2355-60130/xopen_networking.7.html):

... There are two ways to obtain X/Open Sockets functionality:
  * Method A is in compliance with X/Open compilation specification.
  * Method B slightly deviates from X/Open compilation specification.
    However, Method B allows a program to include both objects compiled
    to X/Open Sockets specification and objects compiled to BSD Sockets
    specification.
...

PostgreSQL supports X/Open Sockets. The Apache web server (2.2.15,
/opt/hpws22/apache) and Perl (5.8.8, /opt/perl_64) are BSD Socket
applications that ship by default with the OS. I tried Method B (it
provides wrapper _xpg_ socket functions that allow X/Open socket objects
and BSD socket objects to be used in the same binary) to build the
PostgreSQL 9.1 code. I LD_PRELOADed the generated libpq binary, and
without any other change both Perl and Apache now work fine with
PostgreSQL; it is easy to implement too. We just need to build the
source code with "-D_XOPEN_SOURCE=600 -D_HPUX_ALT_XOPEN_SOCKET_API" and
link the binary with libc. PFA patch. Thanks.

Best Regards,
Muhammad Asif Naeem

hp-ux_socket.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-20 Thread David Fetter
On Tue, Sep 20, 2011 at 04:51:51PM +0300, Peter Eisentraut wrote:
> On sön, 2011-09-18 at 12:43 -0400, Tom Lane wrote:
> > Peter Eisentraut  writes:
> > > On sön, 2011-09-18 at 09:45 -0500, Dave Page wrote:
> > >> That is much more reasonable, though unfortunately not what was said.
> > >> Regardless, I stand by my main point that such a representative should
> > >> be communicating with the project regularly. Having a rep who works
> > >> outside the project is of no use at all. 
> > 
> > > Well, the point of this thread is, how can she communicate?
> > 
> > +1 for a closed mailing list.  It's a bit annoying to have to do such
> > a thing, but it's not like we haven't got other closed lists for
> > appropriate purposes.
> 
> Well, that much we've already decided a few years ago.  The blocking
> issues are: (1) do we have enough interest, and (2) where to put it (I'm
> looking at you, pgfoundry).
> 
> > I guess the real question is, exactly what will be the requirements
> > for joining?
> 
> As as far as I'm concerned, anyone who is known in the community and has
> a plausible interest can join.  The requirement is that we share this
> material with "colleagues" for consultation, as opposed to posting it on
> the public internet.

I'd like to be on this list, and believe I qualify.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXPLAIN and nfiltered, take two

2011-09-20 Thread Tom Lane
Marko Tiikkaja  writes:
> The attached patch is the best I could come up with.  I considered 
> showing "Rows Removed by Foo: (never executed)" and omitting the line 
> altogether, but I didn't particularly like either of those options.  The 
> current patch simply displays "Rows Removed by Foo: 0".

I'm looking at this patch now.  I can't escape the conclusion that we
should put the counters into struct Instrumentation after all.  I know
I complained upthread that that would require arbitrary decisions about
how to use the counters in multi-filter nodes, but now I see there are
downsides to the way it's done here, too.  The basic problem I've got
with this is that the counters are being managed differently from any
other instrumentation counters, and that will bite us in the rear sooner
or later.  For example, if we needed a way to reset the counters for a
particular plan node, we'd be forced into some very ugly choices about
how to make that happen for counters that are in node-type-specific
structures.

So, unless there are objections, I'm going to adjust the patch to put
the counters back into struct Instrumentation.

I'm also thinking that it'd be wise to invent macros to encapsulate the
"if (node->ps.instrument) node->ps.instrument->counter += 1" sequences
that will be required.
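
Something like this, say (macro and counter names here are purely
illustrative, not a proposal for the final spelling):

    /* bump an instrumentation counter only when instrumentation is active */
    #define InstrCountFiltered(node, delta) \
        do { \
            if (((PlanState *) (node))->instrument) \
                ((PlanState *) (node))->instrument->nfiltered += (delta); \
        } while (0)

which would keep each call site in the nodeFoo.c files down to a single
line.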

One other point is that with or without macros, a large fraction of the
executor/nodeFoo.c files are going to need to #include instrument.h
if we put the counters into struct Instrumentation.  We could only avoid
that by putting the increments into out-of-line functions instead of
macros, which I don't care for from a performance standpoint.  So I'm
thinking we should just bite the bullet and #include instrument.h in
execnodes.h, which would also let us get rid of the "struct" hack that's
currently used to reference Instrumentation nodes there.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL X/Open Socket / BSD Socket Issue on HP-UX

2011-09-20 Thread Tom Lane
MUHAMMAD ASIF  writes:
> I faced a similar issue to the one discussed in
> http://postgresql.1045698.n5.nabble.com/Fwd-DBD-Pg-on-HP-UX-11-31-64bit-td3305163.html
> (man xopen_networking -
> http://docstore.mik.ua/manuals/hp-ux/en/B2355-60130/xopen_networking.7.html):
> ... There are two ways to obtain X/Open Sockets functionality:
>   * Method A is in compliance with X/Open compilation specification.
>   * Method B slightly deviates from X/Open compilation specification.
>     However, Method B allows a program to include both objects compiled
>     to X/Open Sockets specification and objects compiled to BSD Sockets
>     specification.
> ...
> PostgreSQL supports X/Open Sockets. The Apache web server (2.2.15,
> /opt/hpws22/apache) and Perl (5.8.8, /opt/perl_64) are BSD Socket
> applications that ship by default with the OS. I tried Method B (it
> provides wrapper _xpg_ socket functions that allow X/Open socket objects
> and BSD socket objects to be used in the same binary) to build the
> PostgreSQL 9.1 code. I LD_PRELOADed the generated libpq binary, and
> without any other change both Perl and Apache now work fine with
> PostgreSQL; it is easy to implement too. We just need to build the
> source code with "-D_XOPEN_SOURCE=600 -D_HPUX_ALT_XOPEN_SOCKET_API" and
> link the binary with libc. PFA patch. Thanks.

AFAICT, the proposed patch will break things on at least some versions
of HPUX.  You can't just arbitrarily remove the reference to -lxnet,
at least not without explaining to us why the existing comment about it
is wrong.  Likewise, removing -D_XOPEN_SOURCE_EXTENDED isn't
acceptable without a whole lot more supporting evidence than you've
provided.  (I'm fairly certain that the latter will break the build on
my old HPUX 10.20 box, for example.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-20 Thread Peter Geoghegan
On 20 September 2011 03:51, Tom Lane  wrote:
> Considering that -O2 is our standard optimization level, that
> observation seems to translate to "this patch will be useless in
> practice".  I think you had better investigate that aspect in some
> detail before spending more effort.

I don't think that the fact that that happens is at all significant at
this early stage, and it never even occurred to me that you'd think
that it might be. I was simply disclosing a quirk of this POC patch.
The workaround is probably to use a macro instead. For the benefit of
those that didn't follow the other threads, the macro-based qsort
implementation, which I found to perform significantly better than
regular qsort(), runs like this on my laptop when I built at -O2 with
GCC 4.6 just now:

C stdlib quick-sort time elapsed: 2.092451 seconds
Inline quick-sort time elapsed: 1.587651 seconds

Does *that* look attractive to you? I've attached source code of the
program that produced these figures, which has been ported to C from
C++.

When I #define LARGE_SIZE 1, here's what I see:

[peter@peter inline_compar_test]$ ./a.out
C stdlib quick-sort time elapsed: 23.659411 seconds
Inline quick-sort time elapsed: 18.470611 seconds

Here, sorting with the function pointer/stdlib version takes about
1.28 times as long. In the prior test (with the smaller LARGE_SIZE),
it took about 1.32 times as long. Fairly predictable, linear, and not
to be sniffed at.

The variance I'm seeing across runs is low - a couple of hundredths of
a second at most. This is a Fedora 15 " Intel(R) Core(TM) i5-2540M CPU
@ 2.60GHz" machine. I'm not sure right now why the inline quick-sort
is less of a win than on my old Fedora 14 desktop (where it was 3.24
Vs 2.01), but it's still a significant win. Perhaps others can build
this simple program and tell me what they come up with.
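
For anyone who doesn't want to unpack the tarball, the gist of the
comparison is a standalone program shaped roughly like this (a
simplified sketch, not the attached file itself; the array size and
pivot choice are arbitrary):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define N (10 * 1000 * 1000)

/* comparator called through a function pointer by stdlib qsort() */
static int
cmp_int(const void *a, const void *b)
{
    int     ia = *(const int *) a;
    int     ib = *(const int *) b;

    return (ia > ib) - (ia < ib);
}

/*
 * Quicksort specialized for int: the comparison is a plain expression
 * the compiler can inline, with no indirect call per comparison.
 */
static void
qsort_int(int *a, long lo, long hi)
{
    long    i = lo;
    long    j = hi;
    int     pivot = a[lo + (hi - lo) / 2];

    while (i <= j)
    {
        while (a[i] < pivot)
            i++;
        while (a[j] > pivot)
            j--;
        if (i <= j)
        {
            int     tmp = a[i];

            a[i] = a[j];
            a[j] = tmp;
            i++;
            j--;
        }
    }
    if (lo < j)
        qsort_int(a, lo, j);
    if (i < hi)
        qsort_int(a, i, hi);
}

int
main(void)
{
    int        *v1 = malloc(N * sizeof(int));
    int        *v2 = malloc(N * sizeof(int));
    clock_t     t;
    long        i;

    if (v1 == NULL || v2 == NULL)
        return 1;
    srand(42);
    for (i = 0; i < N; i++)
        v1[i] = rand();
    memcpy(v2, v1, N * sizeof(int));

    t = clock();
    qsort(v1, N, sizeof(int), cmp_int);
    printf("C stdlib quick-sort time elapsed: %f seconds\n",
           (double) (clock() - t) / CLOCKS_PER_SEC);

    t = clock();
    qsort_int(v2, 0, N - 1);
    printf("Inline quick-sort time elapsed: %f seconds\n",
           (double) (clock() - t) / CLOCKS_PER_SEC);

    free(v1);
    free(v2);
    return 0;
}

The attached program does the equivalent using the macro-based
implementation mentioned above.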

>> This performance patch differs from most in that it's difficult in
>> principle to imagine a performance regression occurring.
>
> Really?  N copies of the same code could lead to performance loss just
> due to code bloat (ie, less of a query's inner loops fitting in CPU
> cache).

I did consider that. Of course inlining has an overhead, and I'll be
testing that each instance of inlining has a net benefit. I just meant
that many other performance patches have an obvious worst case, and I
think that it is policy to focus on that case, but I can't think of
one here. Does anyone else have any ideas?

> Not to mention the clear regression in maintainability.  So
> I'm disinclined to consider this sort of change without a significantly
> bigger win than you're suggesting above

Sure, there'll be some sort of regression in maintainability - I think
that HOT had a clear regression in maintainability too. The important
questions are obviously "how big is the loss of maintainability?", and
"is it worth it?". We'll know more when this work is actually shaped
into a proper patch. Perhaps I should have waited until I had
something along those lines before making an announcement, but I
wanted community input as early as possible. I think that there's
plenty of tweaking that can be done to get additional performance
improvements - all I've done so far is demonstrate that those
improvements are real and worth thinking about, in the fastest
possible way, partly because you expressed skepticism of the benefits
of inlining comparators to Greg Stark in an earlier thread.

Performance and maintainability are often somewhat in tension, but we
cannot ignore performance. If this work can bring us an improvement in
performance approaching the isolated macro Vs qsort() function pointer
benchmark, that's a *big* win. Sorting integers and floats is very
common and important.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


inline_compar_test.tar.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXPLAIN and nfiltered, take two

2011-09-20 Thread Tom Lane
Marko Tiikkaja  writes:
>> The attached patch is the best I could come up with.  I considered 
>> showing "Rows Removed by Foo: (never executed)" and omitting the line 
>> altogether, but I didn't particularly like either of those options.  The 
>> current patch simply displays "Rows Removed by Foo: 0".

I ran into a couple more issues with this patch.

One is the handling of bitmapqualorig filtering (and correspondingly
indexqualorig, which the patch misses entirely).  These counts are
really quite a bit different from the other filter conditions we are
dealing with, because what they represent is not anything directly
user-controllable, but how lossy the indexscan is.  That is, we get a
count for any tuple that the index told us to visit but that turned out
to not actually satisfy the indexqual.  So the count is guaranteed zero
for any non-lossy indexscan, which includes most cases.  In view of
that, I find it useless and probably confusing to put out "Rows Removed
by Recheck Cond: 0" unless we're dealing with a lossy index.

Now the difficulty is that EXPLAIN doesn't really have any way to know
if the index is lossy, especially not if every such check luckily
happened to pass.

What I'm inclined to do is suppress the "rows removed" output, at least
in textual output format, unless it's nonzero.  But that sorta begs the
question of whether we shouldn't do that for all cases, not just index
recheck conditions.

Also, upthread it was argued that we shouldn't measure the effects of
joinqual filtering.  I don't buy this for a minute, especially not in
merge/hash joins, where a row thrown away by joinqual filtering is just
as expensive as one thrown away by otherqual filtering, and where you
can *not* determine how big the raw merge/hash join result is if you're
not told how much the joinqual removed.  I see the point about it not
being clear how to explain things for SEMI/ANTI join cases, but I think
we need to figure that out, not just punt.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Collecting statistics on CSV file data

2011-09-20 Thread David Fetter
On Tue, Sep 20, 2011 at 11:13:05AM -0400, Tom Lane wrote:
> Marti Raudsepp  writes:
> > 2011/9/12 Etsuro Fujita :
> >> This is called when ANALYZE command is executed. (ANALYZE
> >> command should be executed because autovacuum does not analyze foreign
> >> tables.)
> 
> > This is a good idea.
> 
> > However, if adding these statistics requires an explicit ANALYZE
> > command, then we should also have a command for resetting the
> > collected statistics -- to get it back into the un-analyzed state.
> 
> Uh, why?  There is no UNANALYZE operation for ordinary tables, and
> I've never heard anyone ask for one.
> 
> If you're desperate you could manually delete the relevant rows in
> pg_statistic, a solution that would presumably work for foreign tables
> too.
> 
> Probably a more interesting question is why we wouldn't change
> autovacuum so that it calls this automatically for foreign tables.

How about a per-table setting that tells autovacuum whether to do
this?  Come to think of it, all of per-FDW, per-remote and per-table
settings would be handy, so people could express things like, "all CSV
files except these three, all PostgreSQL connections on the
10.1.0.0/16 network, and these two tables in Oracle."

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Isolation tests still falling over routinely

2011-09-20 Thread Tom Lane
The buildfarm is still showing isolation test failures more days than
not, eg
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=pika&dt=2011-09-17%2012%3A43%3A11
and I've personally seen such failures when testing with
CLOBBER_CACHE_ALWAYS.  Could we please fix those tests to not have such
fragile timing assumptions?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Collecting statistics on CSV file data

2011-09-20 Thread Alvaro Herrera

Excerpts from David Fetter's message of mar sep 20 21:22:32 -0300 2011:
> On Tue, Sep 20, 2011 at 11:13:05AM -0400, Tom Lane wrote:

> > Probably a more interesting question is why we wouldn't change
> > autovacuum so that it calls this automatically for foreign tables.
> 
> How about a per-table setting that tells autovacuum whether to do
> this?

Seems reasonable.  Have autovacuum assume that foreign tables are not to
be analyzed, unless some reloption is set.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-20 Thread karavelov
- Quote from Peter Geoghegan (pe...@2ndquadrant.com), on 21.09.2011 at 02:53 -

> On 20 September 2011 03:51, Tom Lane  wrote:
>> Considering that -O2 is our standard optimization level, that
>> observation seems to translate to "this patch will be useless in
>> practice".  I think you had better investigate that aspect in some
>> detail before spending more effort.
> 
> I don't think that the fact that that happens is at all significant at
> this early stage, and it never even occurred to me that you'd think
> that it might be. I was simply disclosing a quirk of this POC patch.
> The workaround is probably to use a macro instead. For the benefit of
> those that didn't follow the other threads, the macro-based qsort
> implementation, which I found to perform significantly better than
> regular qsort(), runs like this on my laptop when I built at -O2 with
> GCC 4.6 just now:
> 
> C stdlib quick-sort time elapsed: 2.092451 seconds
> Inline quick-sort time elapsed: 1.587651 seconds
> 
> Does *that* look attractive to you? I've attached source code of the
> program that produced these figures, which has been ported to C from
> C++.
> 
> When I #define LARGE_SIZE 1, here's what I see:
> 
> [peter@peter inline_compar_test]$ ./a.out
> C stdlib quick-sort time elapsed: 23.659411 seconds
> Inline quick-sort time elapsed: 18.470611 seconds
> 
> Here, sorting with the function pointer/stdlib version takes about
> 1.28 times as long. In the prior test (with the smaller LARGE_SIZE),
> it took about 1.32 times as long. Fairly predictable, linear, and not
> to be sniffed at.
> 
> The variance I'm seeing across runs is low - a couple of hundredths of
> a second at most. This is a Fedora 15 " Intel(R) Core(TM) i5-2540M CPU
> @ 2.60GHz" machine. I'm not sure right now why the inline quick-sort
> is less of a win than on my old Fedora 14 desktop (where it was 3.24
> Vs 2.01), but it's still a significant win. Perhaps others can build
> this simple program and tell me what they come up with.
> 
Run it here.

Intel(R) Core(TM)2 Duo CPU E8200  @ 2.66GHz
gcc version 4.6.1 (Debian 4.6.1-10)

g++ -O2 qsort-inline-benchmark.c
./a.out 
C stdlib quick-sort time elapsed: 1.942686 seconds
Inline quick-sort time elapsed: 1.126508 seconds

With #define LARGE_SIZE 1

C stdlib quick-sort time elapsed: 22.158207 seconds
Inline quick-sort time elapsed: 12.861018 seconds

with g++ -O0
C stdlib quick-sort time elapsed: 2.736360 seconds
Inline quick-sort time elapsed: 2.045619 seconds

On server hardware:
Intel(R) Xeon(R) CPU   E5405  @ 2.00GHz
gcc version 4.4.5 (Debian 4.4.5-8)

./a.out 
C stdlib quick-sort time elapsed: 2.610150 seconds
Inline quick-sort time elapsed: 1.494198 seconds

All -O2 versions show a 42% speedup with the inlined qsort.
-O0 showed a 25% speedup.

Best regards

--
Luben Karavelov

Re: [HACKERS] Isolation tests still falling over routinely

2011-09-20 Thread Kevin Grittner
Tom Lane  wrote:
 
> The buildfarm is still showing isolation test failures more days
> than not, eg
> 
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=pika&dt=2011-09-17%2012%3A43%3A11
> and I've personally seen such failures when testing with
> CLOBBER_CACHE_ALWAYS. Could we please fix those tests to not have
> such fragile timing assumptions?
 
I went back over two months, and only found one failure related to an
SSI test, and that was because the machine ran out of disk space. 
There should never be any timing-related failures on the SSI tests,
as there is no blocking or deadlocking.
 
If you have seen any failures on isolation tests other than the fk-*
tests, I'd be very interested in details.
 
The rest are not related to SSI but test deadlock conditions related
to foreign keys.  I didn't have anything to do with these but to
provide alternate result files for REPEATABLE READ and SERIALIZABLE
isolation levels.  (I test the installcheck-world target and the
isolation tests in those modes frequently, and the fk-deadlock tests
were failing every time at those levels.)
 
If I remember right, Alvaro chose these timings to balance run time
against chance of failure.  Unless we want to remove these deadlock
handling tests or ignore failures (which both seem like bad ideas to
me), I think we need to bump the long timings by an order of
magnitude and just concede that those tests run for a while.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Isolation tests still falling over routinely

2011-09-20 Thread Alvaro Herrera

Excerpts from Kevin Grittner's message of mar sep 20 22:51:39 -0300 2011:

> If I remember right, Alvaro chose these timings to balance run time
> against chance of failure.  Unless we want to remove these deadlock
> handling tests or ignore failures (which both seem like bad ideas to
> me), I think we need to bump the long timings by an order of
> magnitude and just concede that those tests run for a while.

The main problem I have is that I haven't found a way to reproduce the
problems in my machine.  I was playing with modifying the way the error
messages are reported, but that ended up unfinished in a local branch.

I'll give it a go once more and see if I can commit it, so that the
buildfarm tells us whether it works or not.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXPLAIN and nfiltered, take two

2011-09-20 Thread Robert Haas
On Tue, Sep 20, 2011 at 8:19 PM, Tom Lane  wrote:
> Marko Tiikkaja  writes:
>>> The attached patch is the best I could come up with.  I considered
>>> showing "Rows Removed by Foo: (never executed)" and omitting the line
>>> altogether, but I didn't particularly like either of those options.  The
>>> current patch simply displays "Rows Removed by Foo: 0".
>
> I ran into a couple more issues with this patch.
>
> One is the handling of bitmapqualorig filtering (and correspondingly
> indexqualorig, which the patch misses entirely).  These counts are
> really quite a bit different from the other filter conditions we are
> dealing with, because what they represent is not anything directly
> user-controllable, but how lossy the indexscan is.  That is, we get a
> count for any tuple that the index told us to visit but that turned out
> to not actually satisfy the indexqual.  So the count is guaranteed zero
> for any non-lossy indexscan, which includes most cases.  In view of
> that, I find it useless and probably confusing to put out "Rows Removed
> by Recheck Cond: 0" unless we're dealing with a lossy index.

I don't really see the point of this.  I think printing it always is
both clear and appropriate; it would be even nicer if we also had a
line for "Rows Rechecked".

I am slightly worried that this additional information is going to
make the output too verbose.  But if that turns out to be the problem,
I think the solution is to add another option to control whether this
information is emitted - that's why we have a flexible options syntax
in the first place - and not to start guessing which information the
user will think is boring or confusing in any particular case.

I think we are getting to the point where EXPLAIN is complex enough
that it should really have its own chapter in the documentation.  The
existing treatment in the SQL reference page is a good start, but it
doesn't really do the topic justice.  And being able to document what
all of these things mean would, I think, ease the problem of trying to
make everything 100% self-documenting.

> Also, upthread it was argued that we shouldn't measure the effects of
> joinqual filtering.  I don't buy this for a minute, especially not in
> merge/hash joins, where a row thrown away by joinqual filtering is just
> as expensive as one thrown away by otherqual filtering, and where you
> can *not* determine how big the raw merge/hash join result is if you're
> not told how much the joinqual removed.  I see the point about it not
> being clear how to explain things for SEMI/ANTI join cases, but I think
> we need to figure that out, not just punt.

Yep, I agree.  We should measure everything we possibly can.  I don't
have a clear idea how this ought to be represented, but leaving it out
doesn't seem like the right answer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Online base backup from the hot-standby

2011-09-20 Thread Fujii Masao
2011/9/13 Jun Ishiduka :
>
> Update patch.
>
> Changes:
>  * set 'on' full_page_writes by user (in document)
>  * read "FROM: XX" in backup_label (in xlog.c)
>  * check status when pg_stop_backup is executed (in xlog.c)

Thanks for updating the patch.

Before reviewing the patch, to encourage people to comment and
review the patch, I explain what this patch provides:

This patch provides the capability to take a base backup during recovery,
i.e., from the standby server. This is a very useful feature to offload the
expense of periodic backups from the master. The backup procedure is
similar to the one used during normal running, but slightly different:

1. Execute pg_start_backup on the standby. To execute a query on the
   standby, hot standby must be enabled.

2. Perform a file system backup on the standby.

3. Copy the pg_control file from the cluster directory on the standby to
the backup as follows:

cp $PGDATA/global/pg_control /mnt/server/backupdir/global

4. Execute pg_stop_backup on the standby.

The backup taken by the above procedure is available for an archive
recovery or standby server.
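
To make the above concrete, the four steps might look like this when run
on the standby (the label, the paths and the use of rsync/cp are only
examples; any file-level copy works):

# 1. Start the backup (hot standby must be accepting queries).
psql -c "SELECT pg_start_backup('standby_backup')"
# 2. File-system-level copy of the data directory.
rsync -a "$PGDATA"/ /mnt/server/backupdir/
# 3. Copy pg_control again, after the rest of the files.
cp "$PGDATA"/global/pg_control /mnt/server/backupdir/global/
# 4. Finish the backup.
psql -c "SELECT pg_stop_backup()"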

If the standby is promoted during a backup, pg_stop_backup() detects
the change of the server status and fails. The data backed up before the
promotion is invalid and not available for recovery.

Taking a backup from the standby by using pg_basebackup is still not
possible. But we can relax that restriction after applying this patch.

To take a base backup during recovery safely, several parameters
must be set properly. Hot standby must be enabled on the standby, i.e.,
wal_level and hot_standby must be enabled on the master and the standby,
respectively. FPW (full page writes) is required for a base backup,
so full_page_writes must be enabled on the master.
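
As a quick sanity check of those settings (host names are placeholders):

psql -h master  -c "SHOW wal_level"          # expect 'hot_standby'
psql -h master  -c "SHOW full_page_writes"   # expect 'on'
psql -h standby -c "SHOW hot_standby"        # expect 'on'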

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-20 Thread Peter Eisentraut
On tis, 2011-09-20 at 11:12 -0300, Alvaro Herrera wrote:
> > > +1 for a closed mailing list.  It's a bit annoying to have to do such
> > > a thing, but it's not like we haven't got other closed lists for
> > > appropriate purposes.
> > 
> > Well, that much we've already decided a few years ago.  The blocking
> > issues are: (1) do we have enough interest, and (2) where to put it (I'm
> > looking at you, pgfoundry).
> 
> I don't see why we wouldn't put it in @postgresql.org.

One nice thing about pgfoundry would be the document manager.  Also, at
least at some point in the past, a pgfoundry project was easier to
manage than getting anything done about a @postgresql.org mailing list.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Isolation tests still falling over routinely

2011-09-20 Thread Tom Lane
Alvaro Herrera  writes:
> The main problem I have is that I haven't found a way to reproduce the
> problems on my machine.

Try -DCLOBBER_CACHE_ALWAYS.
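
For example, with a source build, something along these lines:

./configure CFLAGS='-O0 -DCLOBBER_CACHE_ALWAYS'
make && make install
# ... then rerun the isolation suite, e.g.
make -C src/test/isolation installcheck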

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Join push-down for foreign tables

2011-09-20 Thread Michael Paquier
Hi,

I am interested in the development you are doing regarding join push-down
and FDW support for remote PostgreSQL servers.
Is there a way to get the pgsql_fdw you are providing here for stock 9.1?
I saw that the tarball you are providing here is adapted only to your patch.

Regards,

Michael

2011/9/14 Shigeru Hanada 

> Hi all,
>
> I'd like to propose $SUBJECT for further foreign query optimization.
> I've not finished development, but I'd appreciate it if I got someone's
> review on my WIP code and its design.
>
> Changes I made
> ==
>
> (1) Add foreign server OID to RelOptInfo
> I think it would be nice to know whether a join comes from one foreign
> server or not without digging into child nodes while considering paths
> for a query.  So I added a serverid field to RelOptInfo, which defaults to
> InvalidOid, and is set to the OID of the server if the node and all of its
> children are from the same foreign server.  This also avoids looking up the
> catalog for the foreign table entry to determine the FDW routine.
>
> (2) Add new planner node, ForeignJoinPath
> ForeignJoinPath derives JoinPath, like other Join nodes, and holds
> FdwPlan like ForeignPath node.
>
> This node is used to represent a pushed-down join between foreign tables
> and/or another foreign join in the early planning phase; all combinations
> such as table-table, table-join, join-table and join-join will be
> considered.  In addition, though they might generate the same internal
> (FDW-specific) plan, reversed combinations are considered; the planner
> generates two ForeignJoinPath nodes, one for (A & B) and one for (B & A).
>
> During creation of this node, the planner calls the new FDW handler
> function PlanForeignJoin to get a FdwPlan which includes the costs and the
> internal plan of a foreign join.  If an FDW can't (or doesn't want to)
> handle the join, it can simply return NULL; the planner then gives up on
> that optimization and considers the usual join methods such as nested loop
> and hash join.
>
> A subtree which has a ForeignJoin at its top is translated into a
> ForeignScan node while constructing the plan tree.  This behavior is
> different from other join path nodes such as NestPath and MergePath,
> because those have child plan nodes corresponding to their path nodes.
>
> (3) Add EXPLAIN support for foreign join (currently just for debugging)
> ForeignScan might not be a simple foreign table scan, so
> ExplainScanTarget() can't be used for it.  An idea I have is adding
> ExplainForeignScanTarget() to handle ForeignScan separately from other
> scan nodes.
>
> (4) Add new GUC parameter, enable_foreignjoin
> If this is off, the planner never generates a ForeignJoinPath.  In that
> case, foreign tables are joined with one of NestLoop, MergeJoin and HashJoin.
>
> Known issue
> ===
>
> I'm sorry, but the attached patch, join_pushdown_v1.patch, is WIP, so
> some kinds of queries currently fail.  Known failure patterns are:
>
> *) SELECT * FROM A JOIN B (...) doesn't work.  Specifying columns in the
> SELECT clause explicitly, like "SELECT A.col1, A.col2, ...", seems to work.
> *) ORDER BY causes an error if no column from the sort key's table is
> specified in the SELECT clause.
>
> There are probably still more problems...
>
> PG-wrapper as sample implementation
> ===
>
> pgsql_fdw-0.1.0.tar.gz is a WIP implementation of the PG-wrapper, which can
> (hopefully) handle both simple foreign table scans and multiple foreign
> joins.  You can build it by placing it in contrib/, or by using pgxs.  Note
> that it has some issues, such as a memory leak of PGresult.  I'm planning
> to propose this wrapper as a contrib module, but only after clearing up
> such issues.
>
> Regards,
> --
> Shigeru Hanada
>
>
-- 
Michael Paquier
http://michael.otacoo.com


Re: [HACKERS] Back-branch releases upcoming this week

2011-09-20 Thread Peter Eisentraut
On tis, 2011-09-20 at 10:28 -0400, Tom Lane wrote:
> I don't think we've yet decided what the policy means if a release
> happens during the stated calendar month, which seems rather likely
> this time around in view of our historical record of doing updates
> roughly quarterly.  Should we settle that detail now?  That is,
> does "after December" really mean "in or after December", or did we
> really mean "after"?

The policy states that we will support branches until that time, which
should reasonably include the whole time period stated.  That is, 8.2 is
supported until December 31.

But it does not say that we are obliged to make another release after
the EOL with all the patches that have accumulated between the previous
release and the EOL.  And it certainly does not say that we are obliged
to keep patching after EOL until that next release happens.  It does say
that that would "normally" happen, but it doesn't have to.  Previously,
we have argued that we should make another release because the previous
patching effort would otherwise have been wasted.  Maybe so.  But let's
keep this in perspective.  If we made another release on December 13, we
shouldn't have to keep patching after that, unless there is an
emergency.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Online base backup from the hot-standby

2011-09-20 Thread Magnus Hagander
On Wed, Sep 21, 2011 at 04:50, Fujii Masao  wrote:
> 2011/9/13 Jun Ishiduka :
>>
>> Update patch.
>>
>> Changes:
>>  * set 'on' full_page_writes by user (in document)
>>  * read "FROM: XX" in backup_label (in xlog.c)
>>  * check status when pg_stop_backup is executed (in xlog.c)
>
> Thanks for updating the patch.
>
> Before reviewing the patch, to encourage people to comment and
> review the patch, I explain what this patch provides:
>
> This patch provides the capability to take a base backup during recovery,
> i.e., from the standby server. This is a very useful feature to offload the
> expense of periodic backups from the master. The backup procedure is
> similar to the one used during normal running, but slightly different:
>
> 1. Execute pg_start_backup on the standby. To execute a query on the
>   standby, hot standby must be enabled.
>
> 2. Perform a file system backup on the standby.
>
> 3. Copy the pg_control file from the cluster directory on the standby to
>    the backup as follows:
>
>    cp $PGDATA/global/pg_control /mnt/server/backupdir/global

But this is done as part of step 2 already. I assume what this really
means is that the pg_control file must be the last file backed up?

(Since there are certainly a lot of other ways to do the backup than just
cp to a mounted directory...)


> 4. Execute pg_stop_backup on the standby.
>
> The backup taken by the above procedure is available for an archive
> recovery or standby server.
>
> If the standby is promoted during a backup, pg_stop_backup() detects
> the change of the server status and fails. The data backed up before the
> promotion is invalid and not available for recovery.
>
> Taking a backup from the standby by using pg_basebackup is still not
> possible. But we can relax that restriction after applying this patch.

I think that this is going to be very important, particularly given
the requirements on pt 3 above. (But yes, it certainly doesn't have to
be done as part of this patch, but it really should be the plan to
have this included in the same version)


>> To take a base backup during recovery safely, several parameters
> must be set properly. Hot standby must be enabled on the standby, i.e.,
> wal_level and hot_standby must be enabled on the master and the standby,
> respectively. FPW (full page writes) is required for a base backup,
> so full_page_writes must be enabled on the master.

Presumably pg_start_backup() will check this. And we'll somehow track
this before pg_stop_backup() as well? (for evil things such as
the user changing FPW from on to off and then back to on again during
a backup, which will make it look correct both during start and stop,
but incorrect in the middle - pg_stop_backup needs to fail in that
case as well)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Join push-down for foreign tables

2011-09-20 Thread Shigeru Hanada
Hi Michael,

(2011/09/21 12:52), Michael Paquier wrote:
> I am interested in the development you are doing regarding join push-down
> and FDW support for remote PostgreSQL servers.
> Is there a way to get the pgsql_fdw you are providing here for stock 9.1?
> I saw that the tarball you are providing here is adapted only to your patch.

As you say, the pgsql_fdw I posted requires my join-push-down patch.
But, at least in the current revision, using PG_VERSION_NUM would make it
compilable for both 9.1 and 9.2.  I'm not sure, though, that the changes
required during the 9.2 development cycle will stay small enough for this
workaround.

Anyway, I'm going to publish a recent pgsql_fdw for 9.1 on PGXN or
elsewhere, though it will take some (hopefully little) time.

Regards,
-- 
Shigeru Hanada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Online base backup from the hot-standby

2011-09-20 Thread Fujii Masao
On Wed, Sep 21, 2011 at 2:13 PM, Magnus Hagander  wrote:
> On Wed, Sep 21, 2011 at 04:50, Fujii Masao  wrote:
>> 3. Copy the pg_control file from the cluster directory on the standby to
>>    the backup as follows:
>>
>>    cp $PGDATA/global/pg_control /mnt/server/backupdir/global
>
> But this is done as part of step 2 already. I assume what this really
> means is that the pg_control file must be the last file backed up?

Yes.

When we perform an archive recovery from a backup taken during
normal processing, we get the backup end location from the backup-end
WAL record which was written by pg_stop_backup(). But since no WAL
writing is allowed during recovery, pg_stop_backup() on the standby
cannot write a backup-end WAL record. So, in his patch, instead of
a backup-end WAL record, the startup process uses the minimum
recovery point recorded in the pg_control file included in the
backup as the backup end location. BTW, the backup end location is
used to check whether recovery has reached a consistent state
(i.e., end-of-backup).

To use the minimum recovery point in pg_control as the backup end
location safely, pg_control must be backed up last. Otherwise, a data
page with a newer LSN than the minimum recovery point
might be included in the backup.
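
As an aside, you can check that value directly in the backup; pg_controldata
reports it as "Minimum recovery ending location":

pg_controldata /mnt/server/backupdir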

> (Since there are certainly a lot of other ways to do the backup than just
> cp to a mounted directory...)

Yes. The above command I described is just an example.

>> 4. Execute pg_stop_backup on the standby.
>>
>> The backup taken by the above procedure is available for an archive
>> recovery or standby server.
>>
>> If the standby is promoted during a backup, pg_stop_backup() detects
>> the change of the server status and fails. The data backed up before the
>> promotion is invalid and not available for recovery.
>>
>> Taking a backup from the standby by using pg_basebackup is still not
>> possible. But we can relax that restriction after applying this patch.
>
> I think that this is going to be very important, particularly given
> the requirements on pt 3 above. (But yes, it certainly doesn't have to
> be done as part of this patch, but it really should be the plan to
> have this included in the same version)

Agreed.

>> To take a base backup during recovery safely, several parameters
>> must be set properly. Hot standby must be enabled on the standby, i.e.,
>> wal_level and hot_standby must be enabled on the master and the standby,
>> respectively. FPW (full page writes) is required for a base backup,
>> so full_page_writes must be enabled on the master.
>
> Presumably pg_start_backup() will check this. And we'll somehow track
> this before pg_stop_backup() as well? (for evil things such as
> the user changing FPW from on to off and then back to on again during
> a backup, which will make it look correct both during start and stop,
> but incorrect in the middle - pg_stop_backup needs to fail in that
> case as well)

Right. As I suggested upthread, to address that problem, we need to log
the change of FPW on the master, and then check whether such a WAL
record has been replayed on the standby during the backup. If it has,
pg_stop_backup() should emit an error.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-20 Thread Heikki Linnakangas

On 21.09.2011 02:53, Peter Geoghegan wrote:

> C stdlib quick-sort time elapsed: 2.092451 seconds
> Inline quick-sort time elapsed: 1.587651 seconds
>
> Does *that* look attractive to you?


Not really, to be honest. That's a 25% speedup in pure qsorting speed. 
How much of a gain in a real query do you expect to get from that, in 
the best case? There's so many other sources of overhead that I'm afraid 
this will be lost in the noise. If you find a query that spends, say, 
50% of its time in qsort(), you will only get a 12.5% speedup on that 
query. And even 50% is really pushing it - I challenge you to find a 
query that spends any significant amount of time qsorting integers.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-20 Thread Simon Riggs
On Tue, Sep 20, 2011 at 3:51 AM, Tom Lane  wrote:

>> This performance patch differs from most in that it's difficult in
>> principle to imagine a performance regression occurring.
>
> Really?  N copies of the same code could lead to performance loss just
> due to code bloat (ie, less of a query's inner loops fitting in CPU
> cache).  Not to mention the clear regression in maintainability.  So
> I'm disinclined to consider this sort of change without a significantly
> bigger win than you're suggesting above (no, I don't even consider the
> -O0 number attractive, let alone what you're finding at -O2).

More copies of the code are somewhat annoying, but it's only 100 lines
of code in one module and we can easily have specific tests for each.
The extra code size is minor in comparison to the reams of code we add
elsewhere.

It's a surprisingly good win for such a common use case. Well done, Peter.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers