[HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Susanne Ebrecht

Hello all,

what I saw at the PHP unconference told me that I should ask everyone again.

I feel lonely. Believe me, it is a bad feeling when it seems that nobody is
interested in what you are doing.

For 4 years I have been the PostgreSQL representative on the SQL Standard
committee.

Whenever I offered to talk about my work in the SQL committee at community
events, a committee rejected it. This just showed me that nobody is really
interested in my work.

I have now learned that such an event committee is not always able to
estimate interest correctly.


The only two people who sometimes support me are David F. and Peter.

The next ISO meeting will be soon, and of course there are lots of drafts
that need decisions.

I am not allowed to share the drafts in public, because the drafts are
confidential.
But I am allowed to share the drafts with the group of people who are
supporting me.
Of course I am allowed to discuss the drafts with my folks before I give
my votes and comments.


Is it really only David and Peter who are interested?

I do not really want to believe it.

Isn't it possible to create a closed mailing list - a list that won't get
published - on which I can discuss SQL Standard stuff with the folks who
want to support me?

I am not afraid to make decisions on my own, but speaking for the whole
project without getting feedback is a worse feeling.

Usually, when I feel unsure how I should decide, I just bother Peter, but
I would prefer to have some more people backing me.

Susanne

--
Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Brendan Jurd
On 16 September 2011 16:24, Susanne Ebrecht susa...@2ndquadrant.com wrote:
 Isn't it possible to create a closed mailing list - a list that won't get
 published - on which
 I can discuss SQL Standard stuff with the folk who wants to support me?

 I don't fear to make decisions on my own - but speaking for the whole
 project without
 getting feedback - is a worse feeling.

 Usually, when I feel unsure how I should decide I just bother Peter - but I
 would prefer
 to have some more ppl in my background.

I for one would definitely be interested in reading such a list.

Cheers,
BJ



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-16 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 We have three choices. Which do you like the best?

I'm in favor of defining a separate, content-free trigger file to enable
archive recovery.  Not sure about the name recovery.ready, though ---
that makes it look like one of the WAL archive transfer trigger files,
which does not seem like a great analogy.  The pg_standby documentation
suggests names like foo.trigger for failover triggers, which is a bit
better analogy because something external to the database creates the
file.  What about recovery.trigger?

As far as the other issues go, I think there is actually a prerequisite
discussion to be had here, which is whether we are turning the recovery
parameters into plain old GUCs or not.  If they are plain old GUCs, then
they will presumably still have their values when we are *not* doing
recovery.  That leads to a couple of questions:
* will seeing these values present in pg_settings confuse anybody?
* can the values be changed when not in recovery, if so what happens,
  and again will that confuse anybody?
* is there any security hazard from ordinary users being able to see
  what settings had been used?

If these settings are to be plain old GUCs, then I think we should just
stick them in postgresql.conf and forget about recovery.conf (although
of course someone could include recovery.conf if he were bent on
storing them separately).  On the other hand, if we think they are *not*
plain old GUCs, maybe they shouldn't be in postgresql.conf.  But the
source file isn't the first thing to worry about in that case.

regards, tom lane



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Heikki Linnakangas

On 16.09.2011 09:24, Susanne Ebrecht wrote:

The next ISO meeting will be soon - and of course there are lots of
drafts that needs
decisions.

I am not allowed to share the drafts in public. Because the drafts are
confidential.


I think that's a big part of the problem. It's hard to get excited about 
something if you don't know what's happening.



But I am allowed to share the drafts with the group of ppl who are
supporting me.
Of course I am allowed to discuss the drafts with my folk before I will
give my votes and comments.


Even if you can't share drafts, would it be possible to give a summary 
of things that are being discussed in the committee? That way if there's 
people in the community with interests in particular topics, they could 
contact you and get involved.



Isn't it possible to create a closed mailing list - a list that won't
get published - on which
I can discuss SQL Standard stuff with the folk who wants to support me?


I could join such a mailing list if you create one, but it would 
probably be better if specific topics could be discussed on 
pgsql-hackers. Perhaps this is something you should bring up in the 
committee. I understand that the committee doesn't want to open its work 
to the whole world, but I also don't see why work-in-progress features 
couldn't be discussed in the public.


FWIW, I'm very glad you're on the committee! Even though I have no idea 
what's going on there, it gives me a warm feeling that there's someone 
on the committee who knows PostgreSQL. If someone proposes something 
that would hurt PostgreSQL, like syntax that we already use for 
something else, I know you're going to speak up.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Double sorting split patch

2011-09-16 Thread Heikki Linnakangas

On 15.09.2011 21:46, Alexander Korotkov wrote:

On Thu, Sep 15, 2011 at 7:27 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:


I've looked at the patch, and took a brief look at the paper - but I still
don't understand the algorithm. I just can't get my head around the concepts
of split pairs and left/right groups. Can you explain that in layman's
terms? Perhaps an example would help?


In short, the algorithm works as follows:
1) Each box can be projected onto an axis as an interval. Box (x1,y1)-(x2,y2)
is projected onto the X axis as the interval (x1,x2) and onto the Y axis as
the interval (y1,y2). In the first step we search for splits of those
intervals and select the best one.
2) In the second step the chosen split is converted back into terms of boxes
and ambiguities are resolved.

Let's look a little deeper at how the interval split search is performed by
considering an example. We have the intervals (0,1), (1,3), (2,3), (2,4). We
assume the intervals of the groups to be (0,a), (b,4). So a can be any upper
bound of an interval: {1,3,4}, and b can be any lower bound of an interval:
{0,1,2}.
We consider the following splits: each a with the greatest possible b
(0,1), (1,4)
(0,3), (2,4)
(0,4), (2,4)
and each b with the least possible a. In this example the splits will be:
(0,1), (0,4)
(0,1), (1,4)
(0,3), (2,4)
By removing the duplicates we get the following splits:
(0,1), (0,4)
(0,1), (1,4)
(0,3), (2,4)
(0,4), (2,4)

Ok, thanks, I understand that now.


The proposed algorithm finds these splits in a single pass over two arrays:
one sorted by the lower bound of the interval and another sorted by the upper
bound of the interval.


That looks awfully complicated. I don't understand how that works. I 
wonder if two passes would be simpler?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-16 Thread Peter Eisentraut
On Fri, 2011-09-16 at 01:32 -0400, Tom Lane wrote:
 As far as the other issues go, I think there is actually a
 prerequisite
 discussion to be had here, which is whether we are turning the
 recovery
 parameters into plain old GUCs or not.  If they are plain old GUCs,
 then
 they will presumably still have their values when we are *not* doing
 recovery.  That leads to a couple of questions:
 * will seeing these values present in pg_settings confuse anybody?

How so?  We add or change the available parameters all the time.

 * can the values be changed when not in recovery, if so what happens,
   and again will that confuse anybody?

Should be similar to archive_command and archive_mode.  You can still
see and change archive_command when archive_mode is off.

 * is there any security hazard from ordinary users being able to see
   what settings had been used? 

Again, not much different from the archive_* settings.  They are, after
all, almost the same in the opposite direction.





Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-16 Thread Peter Eisentraut
On Fri, 2011-09-16 at 11:54 +0900, Fujii Masao wrote:
 #1
 Use empty recovery.ready file to enter archive recovery. recovery.conf
 is not read automatically. All recovery parameters are expected to be
 specified in postgresql.conf. If you must specify them in recovery.conf,
 you need to add include 'recovery.conf' into postgresql.conf. But note
 that that recovery.conf will not be renamed to recovery.done at the
 end of recovery. This is what the patch I've posted does. This is
 simplest approach, but might confuse people who use the tools which
 depend on recovery.conf. 

A small variant to this:  When you are actually doing recovery from a
backup, having a recovery trigger and a recovery done file is obviously
quite helpful and necessary for safety.  But when you're setting up a
replication slave, it adds extra complexity for the user.  The
approximate goal ought to be to be able to do

pg_basebackup -h master -D there
postgres -D there --standby-mode=on --primary-conninfo=master

without the need to touch any obscure recovery trigger files.

So perhaps recovery.{trigger,ready} should not be needed if, say,
standby_mode=on.  But then what impact should the presence of
recovery.done have?




Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Dave Page
On Fri, Sep 16, 2011 at 7:24 AM, Susanne Ebrecht
susa...@2ndquadrant.com wrote:

 Since 4 years I am PostgreSQL representative in SQL Standard committee.

With respect, I believe you are on the committee as you were an
employee of MySQL. We've had a number of discussions both online and
at one of the more recent developer meetings, and even approved
funding around having (if I remember correctly) Peter or Simon
represent us on the committee.

 Always, when I suggested to talk about my work in the SQL committee on
 community
 events, a committee rejected it. This just showed me that nobody really has
 interests
 in my work.

 I now learned that such a event committee not always is able to estimate
 interests correct.

An event committee is not approving talks because the work is
important to the community - they are approving talks that will be of
interest to the conference audience. In the case of PG Conference
Europe which I suspect you are alluding to there were a significant
number of talks submitted that would be of far more interest and
benefit to our primary audience of end users.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Merlin Moncure
On Fri, Sep 16, 2011 at 1:24 AM, Susanne Ebrecht
susa...@2ndquadrant.com wrote:
 The next ISO meeting will be soon - and of course there are lots of drafts
 that needs
 decisions.

So, generally speaking, what kinds of things are going to be brought
up at the ISO meeting?  Is this an opportunity to get postgres special
syntax drafted into the sql standard?

merlin



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Susanne Ebrecht

Dave,

On 16.09.2011 14:33, Dave Page wrote:

On Fri, Sep 16, 2011 at 7:24 AM, Susanne Ebrecht
susa...@2ndquadrant.com  wrote:

Since 4 years I am PostgreSQL representative in SQL Standard committee.

With respect, I believe you are on the committee as you were an
employee of MySQL.


Nope. As a Sun employee, I was always responsible for taking care of 
PostgreSQL; others took care of MySQL.



An event committee is not approving talks because the work is
important to the community - they are approving talks that will be of
interest to the conference audience.


You hit exactly the point here: I had a different opinion of what a 
community event also is.

But it doesn't matter.

As I said, I just learned.

Susanne

--
Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com




Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Dave Page
On Fri, Sep 16, 2011 at 1:43 PM, Susanne Ebrecht
susa...@2ndquadrant.com wrote:

 Since 4 years I am PostgreSQL representative in SQL Standard committee.

 With respect, I believe you are on the committee as you were an
 employee of MySQL.

 Nope. As Sun employee - I always was responsible for taking care of
 Postgresql - taking care of MySQL others did.

My point remains - Sun were never in a position to say who represents
PostgreSQL.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Alvaro Herrera

Excerpts from Susanne Ebrecht's message of Fri Sep 16 03:24:58 -0300 2011:

 Isn't it possible to create a closed mailing list - a list that won't 
 get published - on which
 I can discuss SQL Standard stuff with the folk who wants to support me?

It's certainly possible to create a private mailing list to support this
idea.  How would the membership be approved, however, is not clear to
me.  Would we let only well-known names from other pgsql lists into it?

(I, for one, had no idea you were in the SQL committee.)

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Susanne Ebrecht

Heikki,

On 16.09.2011 08:49, Heikki Linnakangas wrote:


Even if you can't share drafts, would it be possible to give a summary 
of things that are being discussed in the committee? That way if 
there's people in the community with interests in particular topics, 
they could contact you and get involved.


Of course it is. I just did not want to spam hackers.

FWIW, I'm very glad you're on the committee! Even though I have no 
idea what's going on there, it gives me a warm feeling that there's 
someone on the committee who knows PostgreSQL. If someone proposes 
something that would hurt PostgreSQL, like syntax that we already use 
for something else, I know you're going to speak up.




Thanks for the bouquet. This comment made me feel better.

Susanne

--
Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com




Re: [HACKERS] fstat vs. lseek

2011-09-16 Thread Andrea Suisani

hi

On 08/08/2011 07:50 PM, Robert Haas wrote:

On Mon, Aug 8, 2011 at 1:31 PM, Andres Freund and...@anarazel.de  wrote:

If it's ok I will write a mail to lkml referencing this thread and your numbers
inline (with attribution obviously).


That would be great.  Please go ahead.


I've just stumbled across this thread on lkml [1],
"Improve lseek scalability v3",

and I thought I'd ping the pgsql-hackers list
just in case. More to the point, they're
asking whether there are any real workloads which care
[Make generic lseek lockless safe].

Maybe I've got it wrong, but it seems somewhat
related to what has been discussed here and
also in Robert Haas's "Linux and glibc Scalability"
blog post [2].

[cut]

Andrea

[1] https://lkml.org/lkml/2011/9/15/399
[2] http://rhaas.blogspot.com/2011/08/linux-and-glibc-scalability.html



Re: [HACKERS] fstat vs. lseek

2011-09-16 Thread Andres Freund
On Friday 16 Sep 2011 15:19:07 Andrea Suisani wrote:
 hi
 
 On 08/08/2011 07:50 PM, Robert Haas wrote:
  On Mon, Aug 8, 2011 at 1:31 PM, Andres Freundand...@anarazel.de  wrote:
  If its ok I will write a mail to lkml referencing this thread and your
  numbers inline (with attribution obviously).
  
  That would be great.  Please go ahead.
 
 I've just stumbled across this thread on lkml [1]
 Improve lseek scalability v3.
 
 and I thought to ping pgsql hackers list
 just in case, more to the point they're
 asking are there any real workloads which care
 [Make generic lseek lockless safe]
I wrote them a mail sometime ago (some weeks) regarding an earlier version of 
the patch... Can't find it right now though.

Andres



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-16 Thread Euler Taveira de Oliveira

On 15-09-2011 23:54, Fujii Masao wrote:

#1
Use empty recovery.ready file to enter archive recovery. recovery.conf
is not read automatically. All recovery parameters are expected to be
specified in postgresql.conf. If you must specify them in recovery.conf,
you need to add include 'recovery.conf' into postgresql.conf. But note
that that recovery.conf will not be renamed to recovery.done at the
end of recovery. This is what the patch I've posted does. This is
simplest approach, but might confuse people who use the tools which
depend on recovery.conf.

More or less +1. We don't need two config files, just one: postgresql.conf. 
Just turn all recovery.conf parameters into GUCs. As already said, the 
recovery.conf settings are no different from the archive settings; we just need a 
way to trigger the recovery. And that trigger could be pulled by a GUC 
(standby_mode) or a file (say recovery - recovery.done). Also, recovery.done 
could be filled with recovery information just for the DBA's records. standby_mode 
does not create any file; it just triggers the recovery (as it will be used 
mainly for replication purposes).
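
A rough sketch of the trigger logic proposed above, written in Python for
brevity rather than in the server's C. The function names, the `data_dir`
layout and the reading of "recovery - recovery.done" as a rename of the
trigger file at the end of recovery are assumptions made for illustration;
nothing here is existing PostgreSQL code.

```python
import os


def should_enter_recovery(data_dir, standby_mode=False, trigger_name="recovery"):
    """Recovery starts if standby_mode is on or the trigger file exists."""
    trigger = os.path.join(data_dir, trigger_name)
    return standby_mode or os.path.exists(trigger)


def finish_recovery(data_dir, trigger_name="recovery", note=""):
    """Rename the trigger file to <name>.done and optionally record a note.

    Returns False when there is no trigger file, which is the standby_mode
    case: recovery was triggered by the setting, so no file is touched.
    """
    trigger = os.path.join(data_dir, trigger_name)
    if not os.path.exists(trigger):
        return False
    done = trigger + ".done"
    os.replace(trigger, done)
    if note:
        # Recovery information kept purely for the DBA's records.
        with open(done, "a") as f:
            f.write(note + "\n")
    return True
```

The question raised elsewhere in the thread, what an existing recovery.done
should mean when standby_mode is on, is deliberately left open here.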



--
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Susanne Ebrecht

On 16.09.2011 14:47, Dave Page wrote:

My point remains - Sun were never in a position to say who represents
PostgreSQL.


Dave,

the procedure works differently. The country representation asks for you, 
because you represent your product on one side, but you also represent 
your country.

For example, ANSI invited Sun to send some experts.
If BSI wants your expertise, then they would ask you or your company 
(I don't know how BSI works internally).

Germany asked for my PostgreSQL expertise.

Of course Peter always was and still is backing me.

Finland just has no active group yet - afair Peter is working on that 
problem.


Susanne

--
Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com




Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Dave Page
On Fri, Sep 16, 2011 at 8:47 AM, Susanne Ebrecht
susa...@2ndquadrant.com wrote:
 On 16.09.2011 14:47, Dave Page wrote:

 My point remains - Sun were never in a position to say who represents
 PostgreSQL.

 Dave,

 the procedure works different. The country representation ask for you.
 Because you represent your product on one side - but you also represent your
 country.
 For example ANSI offered Sun to send some experts.
 If BSI wants your expertise then they would ask you or your company (don't
 know how BSI works internally).
 Germany ask for my PostgreSQL expertise.

 Of course Peter always was and still is in my background.

 Finland just has no active group yet - afair Peter is working on that
 problem.

You're missing my point completely. You say you represent PostgreSQL
on the SQL Committee (or German working group, but that's not the
point), yet the PostgreSQL hackers didn't know that, and were making
other plans less than 2 years ago. For me, a representative would have
been reporting back to us after each meeting, and discussing points to
raise before each meeting - not working in isolation, without the
knowledge of others.

I'd be glad to see us have representation, but I do not believe we
have had any yet, and whatever you have done so far (which may or may
not be good for us) really doesn't count because it hasn't involved
the project in any way.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Euler Taveira de Oliveira

On 16-09-2011 10:26, Susanne Ebrecht wrote:

On 16.09.2011 08:49, Heikki Linnakangas wrote:


Even if you can't share drafts, would it be possible to give a summary of
things that are being discussed in the committee? That way if there's people
in the community with interests in particular topics, they could contact you
and get involved.


Of course it is. I just not wanted to spam hackers.


But if it is community interest, of course it will bother no one here...


--
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] Initialization of ResultTupleSlot in AppendNode

2011-09-16 Thread Amit Kapila

 That also holds the plan's output tuple descriptor.  If you tried to
 remove it, I think the ExecAssignResultTypeFromTL call would crash.
 And if you removed *that*, upper nodes would get unhappy, cf
 ExecGetResultType.
Yes, this is true. However, upper nodes don't need it in all cases, so would
it be possible for ExecGetResultType to check whether ResultTupleSlot is NULL
and, if so, do something similar to ExecAssignResultTypeFromTL to return the
tuple descriptor?
This would save allocating a ResultTupleSlot every time for AppendNode.


-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Friday, September 16, 2011 4:24 AM
To: Amit Kapila
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Initialization of ResultTupleSlot in AppendNode 

Amit Kapila amit.kap...@huawei.com writes:
 I observed that during initialization of planstate for Append Node, we
 allocate ResultTupleSlot, however it is used only to send a NULL slot to
 indicate no more tuples. 

 Is it right or there is any other purpose of it?

That also holds the plan's output tuple descriptor.  If you tried to
remove it, I think the ExecAssignResultTypeFromTL call would crash.
And if you removed *that*, upper nodes would get unhappy, cf
ExecGetResultType.

The use as an end-of-scan signal seems a bit vestigial, since we
could just as well return NULL, but it doesn't really cost enough
to be worth changing ...

regards, tom lane




Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Susanne Ebrecht

On 16.09.2011 14:38, Merlin Moncure wrote:

So, generally speaking, what kinds of things are going to be brought
up at the ISO meeting?  Is this an opportunity to get postgres special
syntax drafted into the sql standard?


Yes and no.

You first need to convince your country and then, as country 
representative, you need to convince the other countries at ISO level.


There are country-based SQL standard groups in several countries. The 
well-known groups are ANSI for the USA, BSI for the UK, DIN for Germany, 
JTC for Japan and so on.

Within the country you usually represent your own product.

Usually the members of a country-based group are a mix of research folks 
(e.g. universities) and DB system companies located within the country. 
Which experts they let in and which not depends on the country-based group.

It is good here to be in a smaller country, because the groups are 
smaller and you can get more of your ideas up to ISO.


All the country-based groups together are ISO.

In ISO every country has just a single vote.
This means that even when your country proposes what you wanted, it 
could happen that enough countries are against it that your 
idea won't get into the standard.


Susanne

--
Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com




Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Susanne Ebrecht

On 16.09.2011 15:59, Dave Page wrote:

other plans less than 2 years ago. For me, a representative would have
been reporting back to us after each meeting, and discussing points to
raise before each meeting - not working in isolation, without the
knowledge of others.


Dave,

I did exactly this with Peter.
Afair, I was once told it is enough to report to Peter.
And as I said, David showed interest and we sometimes talk about it too.
I never wanted to bother hackers with all this stuff.

Susanne

--
Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com




[HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2011-09-16 Thread Tom Lane
While testing 9.1 RPMs on Fedora 15 (2.6.40 kernel), I notice
messages like these in the kernel log:

Sep 11 13:38:56 rhl kernel: [  415.308092] postgres (18040): 
/proc/18040/oom_adj is deprecated, please use /proc/18040/oom_score_adj instead.

These don't show up on every single PG process launch, but that probably
just indicates there's a rate-limiter in the kernel reporting mechanism.

So it looks like it behooves us to cater for oom_score_adj in the
future.  The simplest, least risky change that I can think of is to
copy-and-paste the relevant #ifdef code block in fork_process.c.
If we do that, then it would be up to the packager whether to #define
LINUX_OOM_ADJ or LINUX_OOM_SCORE_ADJ or both depending on the behavior
he wants.
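
As a rough illustration of the copy-and-paste approach, a second block guarded by its own symbol might look like the sketch below. This is not the actual fork_process.c code: the helper write_proc_value and the wrapper reset_oom_settings are hypothetical names, and each symbol is assumed to be defined by the packager as the integer value to write.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write an integer to a /proc file.
 * Returns 0 on success, -1 if the file could not be opened or written.
 */
static int
write_proc_value(const char *path, int value)
{
    FILE *fp = fopen(path, "w");
    int   ok;

    if (fp == NULL)
        return -1;
    ok = fprintf(fp, "%d\n", value) > 0;
    /* buffered output is flushed here, so write errors surface at fclose */
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Meant to run in the child right after fork(); failures are ignored. */
static void
reset_oom_settings(void)
{
#ifdef LINUX_OOM_ADJ
    /* legacy interface */
    (void) write_proc_value("/proc/self/oom_adj", LINUX_OOM_ADJ);
#endif
#ifdef LINUX_OOM_SCORE_ADJ
    /* newer interface; assumed to be selected independently at build time */
    (void) write_proc_value("/proc/self/oom_score_adj", LINUX_OOM_SCORE_ADJ);
#endif
}
```

With neither symbol defined, reset_oom_settings() compiles to an empty function, so a build for a platform without either file is unaffected.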

That would be good enough for my own purposes in building Fedora/RHEL
packages, since I can predict with confidence which kernel versions a
given build is likely to be used with.  I think probably the same
would be true for most other distro-specific builds.  Does anyone want
to argue for doing something more complicated, and if so what exactly?

regards, tom lane



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-16 Thread Dave Page
On Fri, Sep 16, 2011 at 9:49 AM, Susanne Ebrecht
susa...@2ndquadrant.com wrote:
 On 16.09.2011 15:59, Dave Page wrote:

 other plans less than 2 years ago. For me, a representative would have
 been reporting back to us after each meeting, and discussing points to
 raise before each meeting - not working in isolation, without the
 knowledge of others.

 Dave,

 I exactly did this with Peter.
 Afair, I once was told it is enough to report to Peter.
 And as I said - David showed interests and we sometimes talk about it too.
 I never wanted to bother hackers with all this stuff.

-hackers is exactly where we would discuss issues related to
development and design of PostgreSQL.



-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2011-09-16 Thread Greg Stark
On Fri, Sep 16, 2011 at 3:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Does anyone want
 to argue for doing something more complicated, and if so what exactly?


Well there's no harm trying to write to oom_score_adj and, if that
fails with ENOENT, trying to write to oom_adj.
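
A minimal sketch of such a run-time fallback, assuming a missing file is reported by open() as ENOENT. The function names and the path parameters are hypothetical; the caller has to supply a separate value for each file, since the two files use different scales.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write an integer to an existing file; never creates the file. */
static int
write_int(const char *path, int value)
{
    char buf[32];
    int  len = snprintf(buf, sizeof(buf), "%d\n", value);
    int  fd = open(path, O_WRONLY);     /* no O_CREAT on purpose */
    int  saved_errno;

    if (fd < 0)
        return -1;                      /* errno was set by open() */
    if (write(fd, buf, (size_t) len) != len)
    {
        saved_errno = errno;
        close(fd);
        errno = saved_errno;
        return -1;
    }
    return close(fd);
}

/*
 * Try the newer file first and fall back to the older one only when the
 * newer file does not exist.
 * Returns 1 if new_path was written, 0 if old_path was written, -1 otherwise.
 */
static int
set_oom_with_fallback(const char *new_path, int new_value,
                      const char *old_path, int old_value)
{
    if (write_int(new_path, new_value) == 0)
        return 1;
    if (errno != ENOENT)
        return -1;                      /* some other failure: do not fall back */
    return (write_int(old_path, old_value) == 0) ? 0 : -1;
}
```

Opening without O_CREAT means the code only ever probes for a file and never tries to create one under /proc.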

-- 
greg



Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2011-09-16 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 On Fri, Sep 16, 2011 at 3:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Does anyone want
 to argue for doing something more complicated, and if so what exactly?

 Well there's no harm trying to write to oom_score_adj and, if that
 fails with ENOENT, trying to write to oom_adj.

Well, (1) what value are you going to write (they need to be different
for the two files), and (2) the main point of the exercise, at present,
is to avoid kernel log messages.  I'm not sure that trying to create
random files under /proc isn't going to draw bleats in the kernel log.

regards, tom lane



Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2011-09-16 Thread Alvaro Herrera

Excerpts from Tom Lane's message of Fri Sep 16 13:37:46 -0300 2011:
 Greg Stark st...@mit.edu writes:
  On Fri, Sep 16, 2011 at 3:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Does anyone want
  to argue for doing something more complicated, and if so what exactly?
 
  Well there's no harm trying to write to oom_score_adj and, if that
  fails with ENOENT, trying to write to oom_adj.
 
 Well, (1) what value are you going to write (they need to be different
 for the two files), and (2) the main point of the exercise, at present,
 is to avoid kernel log messages.  I'm not sure that trying to create
 random files under /proc isn't going to draw bleats in the kernel log.

I guess the question is what semantics the new code has.  In the old
badness() world, child processes inherited the oom_adj value of their
parent.  The code in fork_process was used to reset the value back to 0
(meaning the kernel is allowed to kill this process on OOM), so that you
could set oom_adj in the start script for postmaster (to a value meaning
never kill this one), and the backends would see their values reset to
zero.

The new oom_score_adj has a scale of -1000 to +1000, with zero being
neutral and -1000 being never kill.  So what we want to do here in
most cases, it seems, is set the value to zero whether it's oom_adj or
oom_score_adj -- assuming the new code still causes child processes
to inherit the adj value from the parent.
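
To compare the two scales: the legacy oom_adj runs from -17 (never kill) to +15, while oom_score_adj runs from -1000 to +1000. The sketch below shows a linear rescaling of the kind the kernel is understood to apply when the legacy file is written. The function name is hypothetical, and the exact arithmetic is an assumption that should be checked against the kernel source rather than taken from this example.

```c
#define OOM_DISABLE        (-17)   /* legacy "never kill" value */
#define OOM_ADJUST_MAX     15      /* legacy maximum */
#define OOM_SCORE_ADJ_MAX  1000    /* maximum on the new scale */

/* Map a legacy oom_adj value onto the oom_score_adj scale. */
static int
oom_adj_to_score_adj(int oom_adj)
{
    /* special case so that the maximum stays attainable */
    if (oom_adj == OOM_ADJUST_MAX)
        return OOM_SCORE_ADJ_MAX;
    /* integer division truncates toward zero */
    return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}
```

Under this mapping zero stays zero and -17 becomes -1000, so "reset to neutral" and "never kill" keep their meaning on either scale; only the intermediate values differ.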

Now the problem is that we have defined the LINUX_OOM_ADJ symbol as
meaning the value we're going to write.  Maybe this wasn't the best
choice.  I mean, it's very flexible, but it doesn't seem to offer any
benefit over a plain boolean choice.

Is your proposal to create a new LINUX_OOM_SCORE_ADJ cpp symbol with the
same semantics?

The most thorough documentation on this option seems to be this commit:
https://github.com/mirrors/linux-2.6/commit/a63d83f427fbce97a6cea0db2e64b0eb8435cd10#include/linux/oom.h

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2011-09-16 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Now the problem is that we have defined the LINUX_OOM_ADJ symbol as
 meaning the value we're going to write.  Maybe this wasn't the best
 choice.  I mean, it's very flexible, but it doesn't seem to offer any
 benefit over a plain boolean choice.

 Is your proposal to create a new LINUX_OOM_SCORE_ADJ cpp symbol with the
 same semantics?

Yes, that's what I was thinking.  We could avoid that if we were going
to hard-wire a decision that zero is the thing to write, but I see no
reason to place such a restriction on users.  Who's to say they might
not want backends to adopt some other value?

regards, tom lane



Re: [HACKERS] Improve lseek scalability v3

2011-09-16 Thread Andres Freund
Hi,
On Friday 16 Sep 2011 17:36:20 Matthew Wilcox wrote:
 On Fri, Sep 16, 2011 at 04:16:49PM +0200, Andres Freund wrote:
  I sent an email containing benchmarks from Robert Haas regarding the
  Subject. Looking at lkml.org I can't see it right now, Will recheck when
  I am at home.
  
  He replaced lseek(SEEK_END) with fstat() and got speedups up to 8.7 times
  the lseek performance.
  The workload was 64 clients hammering postgres with a simple readonly
  workload (pgbench -S).
 Yay!  Data!

  For reference see the thread in the postgres archives which also links to
  performance data:
  http://archives.postgresql.org/message-id/CA+TgmoawRfpan35wzvgHkSJ0+i-W=vkjpknrxk2ktdr+hsa...@mail.gmail.com
 So both fstat and lseek do more work than postgres wants.  lseek modifies
 the file pointer while fstat copies all kinds of unnecessary information
 into userspace.  I imagine this is the source of the slowdown seen in
 the 1-client case.
Yes, that was my theory as well.

 I'd like to dig into the requirement for knowing the file size a little
 better.  According to the blog entry it's used for the query planner.
It's used for multiple things, one of which is the query planner.
The query planner needs to know how many tuples a table has to produce a
sensible plan. For that it has stats which tell it (1) how big the table is
and (2) how many tuples the table has. Those statistics are only updated
every now and then, though.
So it uses those old stats to check how many tuples are normally stored on a
page and then uses that to extrapolate the number of tuples from the current
number of pages (which is computed by lseek(SEEK_END) over the 1GB segments
of a table).
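
The extrapolation described above can be sketched roughly as follows. This is only an illustration of the arithmetic, not the planner's actual code: both function names are hypothetical, the page size is passed in as a parameter (8192 bytes is PostgreSQL's default block size), and the fallback for missing statistics is a simplification.

```c
/*
 * Extrapolate a current tuple count from stale statistics.
 *   stat_tuples, stat_pages : values recorded at the last statistics update
 *   cur_pages               : page count measured now from the file sizes
 */
static double
estimate_tuples(double stat_tuples, double stat_pages, double cur_pages)
{
    double density;

    if (stat_pages <= 0)
        return 0;               /* no usable statistics in this sketch */
    density = stat_tuples / stat_pages;     /* tuples per page, from old stats */
    return density * cur_pages;
}

/* Sum per-segment byte sizes (e.g. obtained via lseek) into a page count. */
static long
pages_from_segment_sizes(const long long *seg_bytes, int nsegs, int page_size)
{
    long pages = 0;
    int  i;

    for (i = 0; i < nsegs; i++)
        pages += (long) (seg_bytes[i] / page_size);
    return pages;
}
```

For example, statistics of 1000 tuples on 10 pages and a current size of 15 pages give an estimate of 1500 tuples.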

I am not sure how interested you are in the relevant postgres internals?

 Does the query planner need to know the exact number of bytes in the file,
 or is it after an order-of-magnitude?  Or to-the-nearest-gigabyte?
It depends on where the information is used. For some of the uses it needs to
be exact (the assumed size is rechecked after acquiring a lock preventing
extension); at other places I guess it would be OK if the accuracy got lower
with bigger files (those files won't ever get bigger than 1GB).
But I have a hard time seeing an implementation where the approximate size
would be faster to get than the exact file size.

Andres



Re: [HACKERS] Improve lseek scalability v3

2011-09-16 Thread Alvaro Herrera

Excerpts from Andres Freund's message of Fri Sep 16 14:27:33 -0300 2011:
 Hi,
 On Friday 16 Sep 2011 17:36:20 Matthew Wilcox wrote:

  Does the query planner need to know the exact number of bytes in the file,
  or is it after an order-of-magnitude?  Or to-the-nearest-gigabyte?
 It depends on where the information is used. For some of the uses it needs to 
 be exact (the assumed size is rechecked after acquiring a lock preventing 
 extension) at other places I guess it would be ok if the accuracy got lower 
 with bigger files (those files won't ever get bigger than 1GB).

One other thing we're interested in is portability.  I mean, even if
Linux were to introduce a new hypothetical syscall that was able to
return the file size at a ridiculously low cost, we probably wouldn't
use it because it'd be Linux-specific.  So an improvement of lseek()
seems to be the best option.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2011-09-16 Thread Alex Hunsaker
On Fri, Sep 16, 2011 at 10:37, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark st...@mit.edu writes:
 On Fri, Sep 16, 2011 at 3:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Does anyone want
 to argue for doing something more complicated, and if so what exactly?

 Well there's no harm trying to write to oom_score_adj and, if that
 fails with ENOENT, trying to write to oom_adj.

Yeah, I don't really like the idea of a compile-time option that is
kernel-version dependent... But I don't feel too strongly about it
either (all my kernels are new enough that they support
oom_score_adj).

I'll also note that on my system we are in the good company of sshd and chromium:
sshd (978): /proc/978/oom_adj is deprecated, please use /proc/978/oom_score_adj instead.
chromium-sandbo (1377): /proc/1375/oom_adj is deprecated, please use /proc/1375/oom_score_adj instead.

[ It's quite annoying that soon after we decided to stick
-DLINUX_OOM_ADJ in, they changed it.  Whatever happened to a stable
userspace API :-( ]



Re: [HACKERS] SSI heap_insert and page-level predicate locks

2011-09-16 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 On Wed, 2011-06-08 at 17:29 -0500, Kevin Grittner wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 AFAICS, the check for page lock is actually unnecessary.

 Absolutely correct.  Patch attached.

 I like the change, but the comment is slightly confusing.

I've committed this patch with comment rewording along the lines
suggested by Jeff.  I also moved the CheckForSerializableConflictIn call
to just before, instead of just after, the RelationGetBufferForTuple
call.  We no longer have to do it after, since we don't need to know
which buffer to pass, and it should buy some more low-level parallelism
to run the SSI checks while not holding exclusive lock on the eventual
target buffer.

regards, tom lane



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-16 Thread Joshua Berkus

 I'm in favor of defining a separate, content-free trigger file to enable
 archive recovery.  Not sure about the name recovery.ready, though --- that
 makes it look like one of the WAL archive transfer trigger files, which
 does not seem like a great analogy.  The pg_standby documentation suggests
 names like foo.trigger for failover triggers, which is a bit better
 analogy because something external to the database creates the file.
 What about recovery.trigger?

Do we want a trigger file to enable recovery, or one to *disable* recovery?
Or both?

Also, I might point out that we're really confusing our users by talking
about recovery all the time, if they're just using streaming replication.
Just sayin'.

 * will seeing these values present in pg_settings confuse anybody?

No.  pg_settings already has a couple dozen developer parameters which nobody 
not on this mailing list understands.  Adding the recovery parameters to it 
wouldn't confuse anyone further, and would have the advantage of making the 
recovery parameters available by monitoring query on a hot standby.

For that matter, I'd suggest that we add a read-only setting called in_recovery.

 * can the values be changed when not in recovery, if so what happens,
   and again will that confuse anybody?

Yes, and no.

 * is there any security hazard from ordinary users being able to see
   what settings had been used?

primary_conninfo could be a problem, since it's possible to set a password 
there.

--Josh



Re: [HACKERS] force_not_null option support for file_fdw

2011-09-16 Thread Tom Lane
Shigeru Hanada shigeru.han...@gmail.com writes:
 (2011/09/09 0:47), Kohei Kaigai wrote:
 makeString() does not copy the supplied string itself, so it is not
 preferable to reference NameStr(attr->attname) across ReleaseSysCache().

 Oops, fixed.
 [ I should check some of my projects for this issue... ]

I've committed this with some mostly-cosmetic revisions, notably

* use defGetBoolean, since this ought to be a plain boolean option
rather than having its own private idea of which spellings are accepted.

* get rid of the ORDER BY altogether in the regression test case ---
it seems a lot safer to assume that COPY will read the data in the
presented order than that text will be sorted in any particular way.

regards, tom lane



Re: [HACKERS] Improve lseek scalability v3

2011-09-16 Thread Andres Freund
On Friday, September 16, 2011 10:08:17 PM Benjamin LaHaise wrote:
 On Fri, Sep 16, 2011 at 07:27:33PM +0200, Andres Freund wrote:
  many tuples does the table have. Those statistics are only updated every
  now and then though.
  So it uses those old stats to check how many tuples are normally stored
  on a page and then uses that to extrapolate the number of tuples from
  the current nr of pages (which is computed by lseek(SEEK_END) over the
  1GB segments of a table).
  
  I am not sure how interested you are in the relevant postgres internals?
 
 For such tables, can't Postgres track the size of the file internally?  I'm
 assuming it's keeping file descriptors open on the tables it manages, in
 which case when it writes to a file to extend it, the internally stored
 size could be updated.  Not making a syscall at all would scale far better
 than even a modified lseek() will perform.
Yes, it tracks the fds internally. The problem is that postgres is process
based, so those tables are not reachable by all processes. It could start
tracking those in shared memory, but the synchronization overhead for that
would likely be more expensive than the syscall overhead. (Given that the
fd sets are possibly (and realistically) disjoint between the individual
backends, you would have to reserve enough shared memory for a fully separate
set of fds for each process, which would complicate efficient lookup.)

Also with fstat() instead of lseek() there was no bottleneck anymore, so I 
don't think the benefits would warrant that.

Greetings,

Andres



Re: [HACKERS] Improve lseek scalability v3

2011-09-16 Thread Andres Freund
On Friday, September 16, 2011 11:02:38 PM Andres Freund wrote:
 Also with fstat() instead of lseek() there was no bottleneck anymore, so I 
 don't think the benefits would warrant that.
At least that's what I observed on a 4 x 6 machine without the patch applied
(can't reboot it). That shouldn't be concurrency relevant, so...

Andres



Re: [HACKERS] Improve lseek scalability v3

2011-09-16 Thread Greg Stark
On Fri, Sep 16, 2011 at 9:08 PM, Benjamin LaHaise b...@kvack.org wrote:
 For such tables, can't Postgres track the size of the file internally?  I'm
 assuming it's keeping file descriptors open on the tables it manages, in
 which case when it writes to a file to extend it, the internally stored size
 could be updated.  Not making a syscall at all would scale far better than
 even a modified lseek() will perform.

There's no hardwired limit on how many tables you can have in a
database; it's not limited by the number of file descriptors. Postgres
would have to keep some kind of LRU for recently opened files and
their sizes or something like that. There would probably still be a
lot of lseeks/fstats going on.

Generally keeping a Postgres cached value for the size would then have
a reliability issue. It's much safer to have a single authoritative
value -- the actual length of the file -- than have the same value
stored in two locations and then need to worry about them getting out
of sync. If a write fails when extending the file due to a filesystem
running out of space then Postgres might not know how to update its
internal cached state accurately for example.

There's no question it could be done but it's not clear it would
necessarily be much faster than a lock-free lseek/fstat.

On Fri, Sep 16, 2011 at 6:27 PM, Andres Freund and...@anarazel.de wrote:
 It depends on where the information is used. For some of the uses it needs to
 be exact (the assumed size is rechecked after acquiring a lock preventing
 extension)

Fwiw this might give the wrong impression. I don't believe scans
acquire a lock preventing extension -- that is, another process can be
extending the file concurrently while the scan is proceeding. The scan
only locks out truncation (vacuum). Any blocks added by another process
are ignored by the scan because they can only contain records invisible
to that transaction. This does depend on the lseek/fstat being done after
the transaction snapshot is taken, which possibly amounts to rechecking
the value taken by the query planner, but they're really two independent
things.


-- 
greg



Re: [HACKERS] patch: plpgsql - remove unnecessary ccache search when a array variable is updated

2011-09-16 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 this patch significantly reduce a ccache searching.

I looked at this patch a little bit.  It's got a very serious problem:
it supposes that the parent of an ARRAYELEM datum must be a VAR datum,
which is not so.  As an example, it gets an Assert failure on this:


create table rtype (id int, ar text[]);

create or replace function foo() returns text[] language plpgsql as $$
declare
  r record;
begin
  r := row(12, '{foo,bar,baz}')::rtype;
  r.ar[2] := 'replace';
  return r.ar;
end$$;

select foo();


There is not any good place to keep the array element lookup data for
the non-VAR cases that is comparable to what you did for VAR.  I wasn't
exactly thrilled about adding another field to PLpgSQL_var anyway,
because it would go unused in the large majority of cases.

A possible solution is to use the ARRAYELEM datum itself to hold the
cached lookup data.  I'm not sure if it's worth having a level of
indirection as you do here; you might as well just drop the fields right
into PLpgSQL_arrayelem, because they'd be used in the vast majority of
cases.

Also, in order to deal with subscripting record fields, you'd better be
prepared for the possibility that the target array type changes from
time to time.  I'd envision this working similarly to what various
array-manipulating functions do: you remember the last input OID you
looked up, and whenever that changes, repeat the lookup steps.
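
The "remember the last OID and redo the lookup when it changes" pattern can be sketched in plain C as below. Everything here is a stand-in: Oid is a local typedef, expensive_lookup represents whatever catalog lookups the real code would perform, and the struct and field names are hypothetical rather than actual PLpgSQL_arrayelem fields.

```c
typedef unsigned int Oid;       /* local stand-in for PostgreSQL's Oid */

static int lookup_count = 0;    /* counts how often the slow path runs */

/* Hypothetical stand-in for an expensive catalog lookup. */
static int
expensive_lookup(Oid array_type)
{
    lookup_count++;
    /* illustrative result only: pretend type 25 is variable-length */
    return (array_type == 25) ? -1 : 4;
}

/* Cached lookup data kept alongside the datum that needs it. */
typedef struct ElemCache
{
    Oid last_type;              /* 0 means nothing is cached yet */
    int elem_len;               /* result of the last lookup */
} ElemCache;

/* Return the cached data, repeating the lookup only when the type changed. */
static int
get_elem_len(ElemCache *cache, Oid array_type)
{
    if (cache->last_type != array_type)
    {
        cache->elem_len = expensive_lookup(array_type);
        cache->last_type = array_type;
    }
    return cache->elem_len;
}
```

Repeated assignments with an unchanged array type then cost one integer comparison, while a record field whose type changes between calls simply triggers a fresh lookup.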

regards, tom lane



Re: [HACKERS] Improve lseek scalability v3

2011-09-16 Thread Andi Kleen
 One other thing we're interested in is portability.  I mean, even if
 Linux were to introduce a new hypothetical syscall that was able to
 return the file size at a ridiculously low cost, we probably wouldn't
 use it because it'd be Linux-specific.  So an improvement of lseek()
 seems to be the best option.

Fully agreed. It doesn't make any sense at all to implement special
syscalls just to work around a basic system call not scaling.

-Andi



Re: [HACKERS] Improve lseek scalability v3

2011-09-16 Thread Matthew Wilcox
On Fri, Sep 16, 2011 at 04:16:49PM +0200, Andres Freund wrote:
 I sent an email containing benchmarks from Robert Haas regarding the Subject. 
 Looking at lkml.org I can't see it right now, Will recheck when I am at home.
 
 He replaced lseek(SEEK_END) with fstat() and got speedups up to 8.7 times the 
 lseek performance.
 The workload was 64 clients hammering postgres with a simple readonly 
 workload 
 (pgbench -S).

Yay!  Data!

 For reference see the thread in the postgres archives which also links to
 performance data:
 http://archives.postgresql.org/message-id/CA+TgmoawRfpan35wzvgHkSJ0+i-W=vkjpknrxk2ktdr+hsa...@mail.gmail.com

So both fstat and lseek do more work than postgres wants.  lseek modifies
the file pointer while fstat copies all kinds of unnecessary information
into userspace.  I imagine this is the source of the slowdown seen in
the 1-client case.

There have been various proposals to make the amount of information returned
by fstat limited to the 'cheap' (for various definitions of 'cheap') fields.

I'd like to dig into the requirement for knowing the file size a little
better.  According to the blog entry it's used for the query planner.
Does the query planner need to know the exact number of bytes in the file,
or is it after an order-of-magnitude?  Or to-the-nearest-gigabyte?

-- 
Matthew Wilcox  Intel Open Source Technology Centre
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
