[HACKERS] only linestyle is NULL as default

2013-09-07 Thread Pavel Stehule
Hello

While reviewing the "psql and \pset without any arguments" patch, I noticed
that only popt->topt.line_style is initialized to NULL by default. All other
popt variables are non-NULL.

Can we fix this?

Regards

Pavel


[HACKERS] review: psql and pset without any arguments

2013-09-07 Thread Pavel Stehule
Hello

* the patch applies cleanly and compiles without warnings
* all regression tests pass
* no impact on dump, performance, or any current features
* no comments on programming style
* we want this feature - it is consistent with \set and it gives a quick
overview of psql printer settings

Issues:
1) it doesn't show linestyle when it is the default

I looked into this, and it is probably a small separate issue - the value is
initialized to NULL by default. But I dislike the empty output in this case:

else if (strcmp(param, "linestyle") == 0)
{
    if (!popt->topt.line_style)
        ;
    else
        printf(_("Line style is %s.\n"),
               get_line_style(&popt->topt)->name);
}

I propose a verbose result instead of nothing:

else if (strcmp(param, "linestyle") == 0)
{
    if (!popt->topt.line_style)
        printf(_("Line style is unset.\n"));
    else
        printf(_("Line style is %s.\n"),
               get_line_style(&popt->topt)->name);
}


2) there is only one open question
http://www.postgresql.org/message-id/b6f6fd62f2624c4c9916ac0175d56d880ce00...@jenmbs01.ad.intershop.net -
there is no clear relation between the output and the corresponding pset option.

I don't think Marc's proposal is ideal (these values are not variables)
- but maybe some enhanced output (only for this summary) would be better:

postgres=# \pset
Output format (format) is aligned.
Border style (border) is 1.
Expanded display (expanded) is off.
Null display (null) is "".
Field separator (fieldsep) is "|".
Tuples only (tuples_only) is off.
Title (title) is unset.
Table attributes (tableattr) unset.
Pager (pager) is used for long output.
Record separator (recordsep) is newline.

This expanded output should be used only for this summary (not when an option
is changed or an individual option value is queried)

Regards

Pavel Stehule


Re: [HACKERS] [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII

2013-09-07 Thread MauMau

Thank you for your opinions and ideas.

From: Tom Lane t...@sss.pgh.pa.us

Greg Stark st...@mit.edu writes:

What would be nicer would be to display the C define, EINVAL, EPERM, etc.
Afaik there's no portable way to do that though. I suppose we could just
have a small array or hash table of all the errors we know about and look
it up.


Yeah, I was just thinking the same thing.  We could do

switch (errno)
{
    case EINVAL: str = "EINVAL"; break;
    case ENOENT: str = "ENOENT"; break;
    ...
#ifdef EFOOBAR
    case EFOOBAR: str = "EFOOBAR"; break;
#endif
    ...

for all the common or even less-common names, and only fall back on
printing a numeric value if it's something really unusual.

But I still maintain that we should only do this if we can't get a useful
string out of strerror().


OK, I'll take this approach.  That is:

str = strerror(errnum);
if (str == NULL || *str == '\0' || *str == '?')
{
    switch (errnum)
    {
        case EINVAL: str = "errno=EINVAL"; break;
        case ENOENT: str = "errno=ENOENT"; break;
        ...
#ifdef EFOOBAR
        case EFOOBAR: str = "errno=EFOOBAR"; break;
#endif
        default:
            snprintf(errorstr_buf, sizeof(errorstr_buf),
                     _("operating system error %d"), errnum);
            str = errorstr_buf;
    }
}

The number of question marks probably depends on the original message, so I
won't strcmp() against "???".
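As an aside, the same fallback could be expressed table-driven instead of as a switch. The following is only an illustrative sketch, not code from any posted patch; the names errnames and errno_name are invented here:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Table-driven sketch of the errno-to-name fallback discussed above.
 * Only codes known at compile time are listed; anything else falls
 * back to a numeric rendering, as Tom suggested. */
struct errname
{
	int			code;
	const char *name;
};

static const struct errname errnames[] = {
	{EINVAL, "EINVAL"},
	{ENOENT, "ENOENT"},
	{EPERM, "EPERM"},
	{EACCES, "EACCES"},
};

static const char *
errno_name(int errnum, char *buf, size_t buflen)
{
	for (size_t i = 0; i < sizeof(errnames) / sizeof(errnames[0]); i++)
		if (errnames[i].code == errnum)
			return errnames[i].name;

	/* unknown code: render it numerically */
	snprintf(buf, buflen, "operating system error %d", errnum);
	return buf;
}
```

Whether a table or a switch is used makes little practical difference; the point is that the list of names must be maintained by hand either way.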



From: Tom Lane t...@sss.pgh.pa.us

There is certainly no way we'd risk back-patching something with as
many potential side-effects as fooling with libc's textdomain.


Agreed.  It would be better to avoid relying on undocumented behavior
(i.e. that strerror() uses libc.mo), if we can take another approach.




BTW: personally, I would say that what you're looking at is a glibc bug.
I always thought the contract of gettext was to return the ASCII version
if it fails to produce a translated version.  That might not be what the
end user really wants to see, but surely returning something like "???"
is completely useless to anybody.


I think so, too.  Under the same conditions, PostgreSQL built with Oracle
Studio on Solaris outputs correct Japanese for strerror(), and English is
output on Windows.  I'll contact the glibc team to ask for an improvement.




From: Tom Lane t...@sss.pgh.pa.us

I dislike that on grounds of readability and translatability; and
I'm also of the opinion that errno codes aren't really consistent
enough across platforms to be all that trustworthy for remote diagnostic
purposes.  I'm fine with printing the code if strerror fails to
produce anything useful --- but not if it succeeds.


I don't think this is a concern, because we would ask trouble reporters
about the operating system their database server runs on anyway.



From: Tom Lane t...@sss.pgh.pa.us

There isn't any way to cram this information
into the current usage of %m without doing damage to the readability and
translatability of the string.  Our style & translatability guidelines
specifically recommend against assembling messages out of fragments,
and also against sticking in parenthetical additions.


From: Andres Freund and...@2ndquadrant.com

If we'd add the errno inside %m processing, I don't see how it's
a problem for translation?


I agree with Andres.  I don't see any problem if we don't translate "errno=%d".


I'll submit a revised patch again next week.  However, I believe my original
approach is better, because it outputs a user-friendly Japanese message
instead of "errno=ENOENT".  Plus, outputting both the errno value and its
descriptive text is more useful, because the former is convenient for
OS/library experts and the latter is convenient for PostgreSQL users.  Any
better idea would be much appreciated.



Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ENABLE/DISABLE CONSTRAINT NAME

2013-09-07 Thread Michael Paquier
On Thu, Sep 5, 2013 at 12:27 PM,  wangs...@highgo.com.cn wrote:
 I have added the patch to the Server Features category
 (https://commitfest.postgresql.org/action/commitfest_view/open).
 Is this right? If not, please give me more advice, thanks!
Yes, this category is fine; don't worry.
-- 
Michael




Re: [HACKERS] Custom Plan node

2013-09-07 Thread Kohei KaiGai
2013/9/7 Kohei KaiGai kai...@kaigai.gr.jp:
 2013/9/7 Tom Lane t...@sss.pgh.pa.us:
 Robert Haas robertmh...@gmail.com writes:
 I find this a somewhat depressing response.  Didn't we discuss this
 exact design at the developer meeting in Ottawa?  I thought it sounded
 reasonable to you then, or at least I don't remember you panning it.

 What I recall saying is that I didn't see how the planner side of it would
 work ... and I still don't see that.  I'd be okay with committing
 executor-side fixes only if we had a vision of where we'd go on the
 planner side; but this patch doesn't offer any path forward there.

 The reason this patch sticks to the executor side is that we concluded
 in Ottawa not to touch the planner code from the beginning because
 of its complexity.
 I also agree that planner support for custom plans is helpful for
 constructing better execution plans; however, the feature makes sense even
 if it starts out as a way to rearrange a plan tree that has already been
 constructed.

 Anyway, let me investigate what kinds of APIs would need to be added for
 the planner stage as well.

Here is a rough idea for adding planner support for custom nodes, if we need
it from the beginning. Of course, it has not been considered in detail and
needs much polishing, but it may serve as a draft for implementing this
feature.

We can categorize plan node types into three groups: scan, join, and others.

Even though the planner tries various combinations of joins and scans
to minimize the estimated cost, we have fewer options for other node types
like T_Agg. It seems to me the other types are placed mostly
according to the query's form, so it is not a big problem even
if all we can do is manipulate the plan tree in planner_hook.
That is similar to what the proposed patch is doing.

So, let's focus on joins and scans. We need to give extensions a chance
to override a built-in path if they can offer a cheaper one.
That leads to an API that allows adding alternative paths while the built-in
machinery is constructing candidate paths. Once a path is added, the paths
can be compared according to their estimated costs.
For example, suppose a query joins foreign tables A and B
managed by the same postgres_fdw server; a remote join likely has a cheaper
cost than a local join. If an extension can handle this case
correctly, it may be able to add an alternative custom-join path with a
cheaper cost.
This path would then be transformed into a CustomJoin node that launches
a query fetching the result of A join B, joined remotely.
In this case, it does not matter whether the CustomJoin has underlying
ForeignScan nodes in its left/right subtrees, because the extension can
handle those details as it sees fit.

So, the following APIs may be needed for planner support, at least.

* API to add an alternative join path, in addition to built-in join logic.
* API to add an alternative scan path, in addition to built-in scan logic.
* API to construct CustomJoin according to the related path.
* API to construct CustomScan according to the related path.
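As a thought experiment, the "add an alternative join path" API could behave roughly like the self-contained mock below. Every type and function name here is invented for illustration; PostgreSQL's real planner structures (PlannerInfo, RelOptInfo, add_path, etc.) are far richer than this:

```c
#include <stddef.h>

/* Minimal mock: an extension registers a hook that may add an
 * alternative path while the core planner builds candidate join
 * paths; the cheapest candidate wins.  All names are invented. */
typedef struct Path
{
	const char *provider;
	double		total_cost;
} Path;

typedef struct PathList
{
	Path		items[8];
	int			n;
} PathList;

/* proposed: API to add an alternative join path */
typedef void (*add_join_path_hook_type) (PathList *candidates);
static add_join_path_hook_type add_join_path_hook = NULL;

static void
add_path(PathList *lst, Path p)
{
	lst->items[lst->n++] = p;
}

/* "core planner": build built-in paths, then give extensions a chance */
static Path
choose_cheapest_join_path(void)
{
	PathList	candidates = {.n = 0};
	Path		best;

	add_path(&candidates, (Path) {"builtin-nestloop", 1000.0});
	add_path(&candidates, (Path) {"builtin-hashjoin", 400.0});
	if (add_join_path_hook)
		add_join_path_hook(&candidates);	/* extension may add a path */

	best = candidates.items[0];
	for (int i = 1; i < candidates.n; i++)
		if (candidates.items[i].total_cost < best.total_cost)
			best = candidates.items[i];
	return best;
}

/* "extension side": e.g. a remote-join provider with a cheaper estimate */
static void
my_custom_join_provider(PathList *candidates)
{
	add_path(candidates, (Path) {"custom-remote-join", 150.0});
}
```

The essential property is that the extension only competes on cost; if its path is not cheaper, the built-in plan is chosen unchanged.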

Any comment please.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: [HACKERS] review: psql and pset without any arguments

2013-09-07 Thread Marc Mamin
 2) there is only one open question 
 http://www.postgresql.org/message-id/b6f6fd62f2624c4c9916ac0175d56d880ce00...@jenmbs01.ad.intershop.net
 - there is no clear relation between the output and the corresponding pset option.

 I don't think Marc's proposal is ideal (these values are not variables) -
 but maybe some enhanced output (only for this summary) would be better:

 postgres=# \pset
 Output format (format) is aligned.
 Border style (border) is 1.
...

+1

Many thanks for taking care of my suggestion.

best regards,

Marc Mamin



Re: [HACKERS] Custom Plan node

2013-09-07 Thread David Fetter
On Sat, Sep 07, 2013 at 02:49:54PM +0200, Kohei KaiGai wrote:
 So, the following APIs may be needed for planner support, at least.
 
 * API to add an alternative join path, in addition to built-in join logic.
 * API to add an alternative scan path, in addition to built-in scan logic.
 * API to construct CustomJoin according to the related path.
 * API to construct CustomScan according to the related path.
 
 Any comment please.

The broad outlines look great.

Do we have any way, at least conceptually, to consider the graph of
the cluster with edges weighted by network bandwidth and latency?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] strange IS NULL behaviour

2013-09-07 Thread Bruce Momjian
On Sat, Sep  7, 2013 at 07:42:55AM +0200, Andres Freund wrote:
 On 2013-09-06 23:07:04 -0400, Bruce Momjian wrote:
  On Fri, Sep  6, 2013 at 11:00:24PM -0400, Tom Lane wrote:
   Bruce Momjian br...@momjian.us writes:
On Thu, Sep  5, 2013 at 05:06:41PM -0400, Bruce Momjian wrote:
Another possible fix would be to avoid the IS NULL value optimizer
expansion if a ROW construct is inside a ROW().  I have attached a patch
that does this for review.
   
Having received no replies, do people prefer this version of the patch
that just punts nested ROW IS NULL testing to execQual.c?
   
   For some reason I read your previous message as saying you were willing to
   wait for considered reviews this time.  If not, I'll just write a blanket
   -1 for any version of this patch.
  
  Are you saying people will comment later?  I wasn't clear that was the
  plan.  I can certainly wait.
 
 You do realize mere mortals in the project frequently have to wait
 *months* to get comments on their patches? Not getting any for less than
 48h doesn't seem to be saying much.

My original problem report was November, 2012:

http://www.postgresql.org/message-id/50b3d11f.20...@2ndquadrant.com

and my patch to fix this was July 4.  Tom gave me a code snippet to test
PL/pgSQL's handling of record types.  I tested that and it looked OK,
but there are other places to test that I don't know about.

 Why don't you add the proposal to the commitfest?

This issue is so much larger than the patch's validity that I don't see
how that would work.

I obviously can't complete this, so I am adding a TODO item:

IS NULL testing of nested ROW() values is inconsistent

All my patches are in that thread in case someone who can complete this
item wants to take it over.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] Custom Plan node

2013-09-07 Thread Kohei KaiGai
2013/9/7 David Fetter da...@fetter.org:

 The broad outlines look great.

 Do we have any way, at least conceptually, to consider the graph of
 the cluster with edges weighted by network bandwidth and latency?

As postgres_fdw is now doing?
Its configuration allows adding a cost for connecting to the remote server as
startup cost, and also a cost for transferring data over the network,
multiplied by the estimated number of rows, according to per-server
configuration.
I think this is the responsibility of the custom plan provider, and it fully
depends on the nature of what it wants to provide.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: [HACKERS] review: psql and pset without any arguments

2013-09-07 Thread Gilles Darold
On 07/09/2013 10:02, Pavel Stehule wrote:
 Hello

 * the patch applies cleanly and compiles without warnings
 * all regression tests pass
 * no impact on dump, performance, or any current features
 * no comments on programming style
 * we want this feature - it is consistent with \set and it gives a
 quick overview of psql printer settings

 Issues:
 1) it doesn't show linestyle when it is the default

 I looked into this, and it is probably a small separate issue - the value
 is initialized to NULL by default. But I dislike the empty output in
 this case:

 else if (strcmp(param, "linestyle") == 0)
 {
     if (!popt->topt.line_style)
         ;
     else
         printf(_("Line style is %s.\n"),
                get_line_style(&popt->topt)->name);
 }

 I propose a verbose result instead of nothing:

 else if (strcmp(param, "linestyle") == 0)
 {
     if (!popt->topt.line_style)
         printf(_("Line style is unset.\n"));
     else
         printf(_("Line style is %s.\n"),
                get_line_style(&popt->topt)->name);
 }

+1 to show the information even if linestyle is not set, but by default
get_line_style() returns "ascii" when linestyle is not set. So maybe it is
better to rewrite it as follows:

else if (strcmp(param, "linestyle") == 0)
{
    printf(_("Line style is %s.\n"),
           get_line_style(&popt->topt)->name);
}

This will output:

Line style is ascii.

when linestyle is not set, or of course when it is set to ascii.

 2) there is only one open question
 http://www.postgresql.org/message-id/b6f6fd62f2624c4c9916ac0175d56d880ce00...@jenmbs01.ad.intershop.net
 - there is no clear relation between the output and the corresponding pset option.

 I don't think Marc's proposal is ideal (these values are not
 variables) - but maybe some enhanced output (only for this summary) would
 be better:

 postgres=# \pset
 Output format (format) is aligned.
 Border style (border) is 1.
 Expanded display (expanded) is off.
 Null display (null) is "".
 Field separator (fieldsep) is "|".
 Tuples only (tuples_only) is off.
 Title (title) is unset.
 Table attributes (tableattr) unset.
 Pager (pager) is used for long output.
 Record separator (recordsep) is newline.

 This expanded output should be used only for this summary (not when an
 option is changed or an individual option value is queried)

Yes, this could be a good accommodation, but I would really prefer not to
duplicate code and translations between this summary and the output shown when
these options are set. If we can print the same output messages using:

postgres=# \pset fieldsep '|'
Field separator (fieldsep) is "|".

it could be a good compromise.

Regards,

-- 
Gilles Darold
Database administrator
http://dalibo.com - http://dalibo.org





Re: [HACKERS] strange IS NULL behaviour

2013-09-07 Thread Bruce Momjian
On Sat, Sep  7, 2013 at 10:59:08AM -0400, Bruce Momjian wrote:
 My original problem report was November, 2012:
 
   http://www.postgresql.org/message-id/50b3d11f.20...@2ndquadrant.com
 
 and my patch to fix this was July 4.  Tom gave me a code snippet to test
 PL/pgSQL's handling of record types.  I tested that and it looked OK,
 but there are other places to test that I don't know about.

Correction --- my updates to Tom's suggestion were only posted on September
4. Anyway, it is on the TODO list now.





Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-07 Thread Bruce Momjian
On Sat, Sep  7, 2013 at 07:34:49AM +0200, Andres Freund wrote:
  The idea of using RecentGlobalXmin to see how much _work_ has happened
  since the last vacuum is interesting, but it doesn't handle read-only
  transactions;  I am not sure how they can be tracked.  You make a good
  point that 5 minutes passing is meaningless --- you really want to know
  how many transactions have completed.
 
 So, what I was pondering went in a slightly different direction:
 
 (lets ignore anti wraparound vacuum for now)
 
 Currently we trigger autovacuums by the assumed number of dead
 tuples. In the course of its action it usually will find that it cannot
 remove all dead rows and that it cannot mark everything as all
 visible. That's because the xmin horizon hasn't advanced far enough. We
 won't trigger another vacuum after that unless there are further dead
 tuples in the relation...
 One trick to overcome that problem - and the fact that we do not handle
 setting all-visible nicely for INSERT-only workloads - would be to trigger
 vacuum by the number of pages that are not marked all-visible in the vm.

 The problem there is that repeatedly scanning a relation that's only 50%
 visible, where the rest cannot be marked all visible because of a
 long-running pg_dump, obviously isn't a good idea. So we need something to
 notify us when there's work to be done. Using elapsed time seems like a
 bad idea because it doesn't adapt to changing workloads very well and
 doesn't work nicely for different relations.
 
 What I was thinking of was to keep track of the oldest xids on pages
 that cannot be marked all visible. I haven't thought about the
 statistics part much, but what if we binned the space between
 [RecentGlobalXmin, nextXid) into 10 bins and counted the number of
 pages falling into each bin? Then after the vacuum finished we could
 compute how far RecentGlobalXmin would have to progress to make another
 vacuum worthwhile, by counting the number of pages from the lowest bin
 upwards and using the bin's upper limit as the triggering xid.
 
 Now, we'd definitely need to amend that scheme by something that handles
 pages that are newly written to, but it seems something like that
 wouldn't be too hard to implement and would make autovacuum more useful.

That seems very complicated.  I think it would be enough to record the
current xid at the time of the vacuum, and when testing for later
vacuums, if that saved xid is earlier than RecentGlobalXmin and
there have been no inserts/updates/deletes, we know that all of
the pages can now be marked as all-visible.

What this doesn't handle is the insert case.  What we could do there is
to record the total free space map space, and if the FSM has not changed
since the last vacuum, we can vacuum even if inserts happened in that
period, because we assume the inserts are on new pages.  One problem
there is that the FSM is only updated if an insert will not fit on the
page.  We could record the table size and make sure the table size has
increased before we allow inserts to trigger a vm-set vacuum.

None of this is perfect, but it is better than what we have, and it
would eventually get the VM bits set.
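For what it's worth, the binning bookkeeping Andres describes above could be sketched roughly as follows. All names here are invented, xid wraparound is ignored, and the "newly written pages" problem is deliberately left out, so this only illustrates the arithmetic, not a real proposal:

```c
#include <stdint.h>

#define NBINS 10
typedef uint32_t TransactionId;

/* Bin the xid range [oldest_xmin, next_xid) into NBINS buckets, count
 * not-all-visible pages by the oldest xid holding them back, and
 * return the xid the horizon must pass before at least page_threshold
 * pages could be marked all-visible. */
static TransactionId
next_vacuum_trigger_xid(TransactionId oldest_xmin, TransactionId next_xid,
						const TransactionId *page_oldest_xid, int npages,
						int page_threshold)
{
	int			bins[NBINS] = {0};
	uint64_t	span = next_xid - oldest_xmin;	/* wraparound ignored */
	int			acc = 0;

	for (int i = 0; i < npages; i++)
	{
		int			b = (int) (((uint64_t) (page_oldest_xid[i] - oldest_xmin)
								* NBINS) / span);

		bins[b]++;
	}

	/* scan bins from the lowest; once enough pages would become
	 * vacuumable, the bin's upper xid bound is the trigger */
	for (int b = 0; b < NBINS; b++)
	{
		acc += bins[b];
		if (acc >= page_threshold)
			return oldest_xmin + (TransactionId) ((span * (b + 1)) / NBINS);
	}
	return next_xid;			/* not enough pages; no early trigger */
}
```

The histogram keeps the bookkeeping O(1) per page, at the cost of triggering slightly later than the exact per-page xids would allow (up to one bin's width).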





Re: [HACKERS] Re: Privileges for INFORMATION_SCHEMA.SCHEMATA (was Re: [DOCS] Small clarification in 34.41. schemata)

2013-09-07 Thread Bruce Momjian
On Thu, Jan 31, 2013 at 03:49:36PM -0500, Peter Eisentraut wrote:
 On 1/9/13 8:56 PM, Tom Lane wrote:
  However, it seems to me that this behavior is actually wrong for our
  purposes, as it represents a too-literal reading of the spec.  The SQL
  standard has no concept of privileges on schemas, only ownership.
  We do have privileges on schemas, so it seems to me that the consistent
  thing would be for this view to show any schema that you either own or
  have some privilege on.  That is the test should be more like 
  
  pg_has_role(n.nspowner, 'USAGE')
  OR has_schema_privilege(n.oid, 'CREATE, USAGE')
  
  As things stand, a non-superuser won't see public, pg_catalog,
  nor even information_schema itself in this view, which seems a
  tad silly.
 
 I agree it would make sense to change this.

Is this the patch you want applied?  Are the docs fine as they are?

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
new file mode 100644
index 95f267f..605bcbd
*** a/src/backend/catalog/information_schema.sql
--- b/src/backend/catalog/information_schema.sql
*** CREATE VIEW schemata AS
*** 1502,1508 ****
 CAST(null AS sql_identifier) AS default_character_set_name,
 CAST(null AS character_data) AS sql_path
  FROM pg_namespace n, pg_authid u
! WHERE n.nspowner = u.oid AND pg_has_role(n.nspowner, 'USAGE');
  
  GRANT SELECT ON schemata TO PUBLIC;
  
--- 1502,1509 ----
 CAST(null AS sql_identifier) AS default_character_set_name,
 CAST(null AS character_data) AS sql_path
  FROM pg_namespace n, pg_authid u
! WHERE n.nspowner = u.oid AND (pg_has_role(n.nspowner, 'USAGE') OR 
!   has_schema_privilege(n.oid, 'CREATE, USAGE'));
  
  GRANT SELECT ON schemata TO PUBLIC;
  



Re: [HACKERS] [RFC] overflow checks optimized away

2013-09-07 Thread Bruce Momjian
On Mon, Jul 15, 2013 at 06:19:27PM -0400, Alvaro Herrera wrote:
 Robert Haas escribió:
  On Mon, Jul 15, 2013 at 5:59 PM, Alvaro Herrera
  alvhe...@2ndquadrant.com wrote:
   Xi Wang escribió:
   Intel's icc and PathScale's pathcc compilers optimize away several
   overflow checks, since they consider signed integer overflow as
   undefined behavior.  This leads to a vulnerable binary.
  
   This thread died without reaching a conclusion.  Noah Misch, Robert Haas
   and Greg Stark each gave a +1 to the patches, but Tom Lane gave them a
   -inf; so they weren't applied.  However, I think everyone walked away
   with the feeling that Tom is wrong on this.
  
   Meanwhile Xi Wang and team published a paper:
   http://pdos.csail.mit.edu/~xi/papers/stack-sosp13.pdf
  
   Postgres is mentioned a number of times in this paper -- mainly to talk
   about the bugs we leave unfixed.
  
   It might prove useful to have these guys' STACK checker output
   available continuously, so that if we happen to introduce more bugs in
   the future, it alerts us about that.
  
  Yeah, I think we ought to apply those patches.  I don't suppose you
  have links handy?
 
 Sure, see this thread in the archives: first one is at
 http://www.postgresql.org/message-id/510100aa.4050...@gmail.com and you
 can see the others in the dropdown (though since the subjects are not
 shown, only date and author, it's a bit hard to follow.  The flat URL
 is useful.)

Should these patches be applied?





Re: [HACKERS] Custom Plan node

2013-09-07 Thread David Fetter
On Sat, Sep 07, 2013 at 05:21:31PM +0200, Kohei KaiGai wrote:
 2013/9/7 David Fetter da...@fetter.org:
  The broad outlines look great.
 
  Do we have any way, at least conceptually, to consider the graph
  of the cluster with edges weighted by network bandwidth and
  latency?
 
 As postgres_fdw is now doing?  Its configuration allows adding a cost
 for connecting to the remote server as startup cost, and also a cost
 to transfer data over the network, multiplied by the estimated number
 of rows, according to per-server configuration.  I think it is the
 responsibility of the custom plan provider, and fully depends on the
 nature of what it wants to provide.

Interesting :)

Sorry I was unclear.  I meant something that could take into account
the bandwidths and latencies throughout the graph, not just the
connections directly from the originating node.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] review: psql and pset without any arguments

2013-09-07 Thread Pavel Stehule
2013/9/7 Gilles Darold gilles.dar...@dalibo.com

 On 07/09/2013 10:02, Pavel Stehule wrote:
  Hello
 
  * patch is cleanly patchable and compilation is without warnings
  * all regression tests passed
  * no impact on dump, performance or any current features
  * no comments to programming style
  * we want this feature - it is consistent with \set and it gives a
  fast summary of psql printer settings
 
  Issues:
  1) it doesn't show linestyle when it is the default
 
  I looked there and it is probably a slightly different issue - this
  value is initialized to NULL by default. But I dislike an empty output
  in this case:
 
  else if (strcmp(param, "linestyle") == 0)
  {
      if (!popt->topt.line_style)
          ;
      else
          printf(_("Line style is %s.\n"),
                 get_line_style(&popt->topt)->name);
  }
 
  I propose a verbose result instead of nothing:
 
  else if (strcmp(param, "linestyle") == 0)
  {
      if (!popt->topt.line_style)
          printf(_("Line style is unset.\n"));
      else
          printf(_("Line style is %s.\n"),
                 get_line_style(&popt->topt)->name);
  }

 +1 to show the information even if linestyle is not set, but by default
 get_line_style() returns ascii when linestyle is not set. So maybe it is
 better to rewrite it as follows:

 else if (strcmp(param, "linestyle") == 0)
 {
     printf(_("Line style is %s.\n"),
            get_line_style(&popt->topt)->name);
 }

 This will output:

 Line style is ascii.

 when linestyle is not set or, of course, when it is set to ascii.

  2) there is only one open question
 
 http://www.postgresql.org/message-id/b6f6fd62f2624c4c9916ac0175d56d880ce00...@jenmbs01.ad.intershop.net
  - there is no clear relation between the output and some pset options.
 
  I don't think Marc's proposal is ideal (these values are not
  variables) - but maybe some enhanced output (only for this summary)
  can be better:
 
  postgres=# \pset
  Output format (format) is aligned.
  Border style (border) is 1.
  Expanded display (expanded) is off.
  Null display (null) is "".
  Field separator (fieldsep) is "|".
  Tuples only (tuples_only) is off.
  Title (title) is unset.
  Table attributes (tableattr) unset.
  Pager (pager) is used for long output.
  Record separator (recordsep) is newline.
 
  This expanded output should be used only for this summary (not when an
  option was changed or when an individual option value is requested)

 Yes, this could be a good accommodation, but I really prefer not to
 duplicate code and translations between this summary and the output when
 these options are set. If we can print the same output messages using:

 postgres=# \pset fieldsep '|'
 Field separator (fieldsep) is "|".

 it could be a good compromise.


ok

Pavel



 Regards,

 --
 Gilles Darold
 Administrateur de bases de données
 http://dalibo.com - http://dalibo.org




Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-07 Thread Andres Freund
Hi,

On 2013-09-07 12:50:59 -0400, Bruce Momjian wrote:
 That seems very complicated.  I think it would be enough to record the
 current xid at the time of the vacuum, and when testing for later
 vacuums, if that saved xid is earlier than the RecentGlobalXmin, and
 there have been no inserts/updates/deletes, we know that all of
 the pages can now be marked as allvisible.

But that would constantly trigger vacuums, or am I missing something?
Or what are you suggesting this xid be used for?

What I was talking about was how to evaluate the benefit of triggering
a VACUUM even if there's not a significant amount of new dead rows. If
we know that for a certain xmin horizon there are N pages that can
potentially be cleaned and marked all-visible, we have a chance of
making sensible decisions.
We could just use one bin (i.e. use one cutoff xid as you propose) and
count the number of pages that would be affected. But that would mean
we'd only trigger vacuums very irregularly if you have a workload with
several long-running transactions. When the oldest of a set of
long-running transactions finishes, you can possibly already clean up a
good bit, reducing the chance of further bloat. Otherwise you have to
wait for all of them to finish.

 What this doesn't handle is the insert case.  What we could do there is
 to record the total free space map size, and if the FSM has not changed
 since the last vacuum, we can even vacuum if inserts happened in that
 period, because we assume the inserts are on new pages.  One problem
 there is that the FSM is only updated if an insert will not fit on the
 page.  We could record the table size and make sure the table size has
 increased before we allow inserts to trigger a vm-set vacuum.

Not sure why that's better than just counting the number of pages that
have unset vm bits?
Note that you cannot rely on the FSM data to be correct all the time;
we can only use such tricks to trigger vacuums, not for the actual
operation of the vacuum.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [RFC] overflow checks optimized away

2013-09-07 Thread Greg Stark
 Should these patches be applied?

I have a copy of the program and was going to take care of this.


-- 
greg




Re: [HACKERS] [rfc] overhauling pgstat.stat

2013-09-07 Thread Tomas Vondra
On 5.9.2013 09:36, Atri Sharma wrote:
 On Thu, Sep 5, 2013 at 10:59 AM, Alvaro Herrera 
 alvhe...@2ndquadrant.com wrote:
 Satoshi Nagayasu wrote:
 
 But, for now, I think we should have a real index for the 
 statistics data because we already have several index storages, 
 and it will allow us to minimize read/write operations.
 
 BTW, what kind of index would be preferred for this purpose? 
 btree or hash?
 
 I find it hard to get excited about using the AM interface for
 this purpose.  To me it makes a lot more sense to have separate,
 much simpler code.  We don't need any transactionality, user
 defined types, user defined operators, or anything like that.
 
 +1.
 
 But, would not rewriting a lot of existing functionalities
 potentially lead to points of contention and/or much more effort?

Don't forget the stats are written only by the postmaster; all the
regular backends only read them (and eventually send updates back). But
yes, contention might be a problem, because there will have to be some
kind of locking that is not needed now, when the postmaster writes a
fresh copy into a new file.

But I think we need to implement something and then measure this.
Because even with the contention, the overall performance might
actually improve.

I'd vote to try a simple approach first - adding some simple array
'index' allowing fast access to particular records etc. At least that
was my plan. But feel free to implement something more advanced (e.g. a
BTree storage) and we can compare the results.

Tomas




Re: [HACKERS] [rfc] overhauling pgstat.stat

2013-09-07 Thread Tomas Vondra
On 5.9.2013 07:29, Alvaro Herrera wrote:
 Satoshi Nagayasu wrote:
 
 But, for now, I think we should have a real index for the 
 statistics data because we already have several index storages, and
 it will allow us to minimize read/write operations.
 
 BTW, what kind of index would be preferred for this purpose? btree
 or hash?
 
 I find it hard to get excited about using the AM interface for this 
 purpose.  To me it makes a lot more sense to have separate, much 
 simpler code.  We don't need any transactionality, user defined
 types, user defined operators, or anything like that.

+1 to these concerns

And I think using regular tables might actually cause more harm than
good. For example, let's say we have a large database with many objects
(which is the aim of this thread) and high activity - sessions
accessing objects, i.e. updating many rows in the stats tables.

Now, the stats table is likely to bloat thanks to the MVCC
copy-on-update. Not a good thing, and it might easily happen that the
price for maintenance of the table will be much higher than what we
saved.

There are probably other similar scenarios.

Tomas




Re: [HACKERS] [rfc] overhauling pgstat.stat

2013-09-07 Thread Atri Sharma


Sent from my iPad

On 08-Sep-2013, at 4:27, Tomas Vondra t...@fuzzy.cz wrote:

 On 5.9.2013 09:36, Atri Sharma wrote:
 On Thu, Sep 5, 2013 at 10:59 AM, Alvaro Herrera 
 alvhe...@2ndquadrant.com wrote:
 Satoshi Nagayasu wrote:
 
 But, for now, I think we should have a real index for the 
 statistics data because we already have several index storages, 
 and it will allow us to minimize read/write operations.
 
 BTW, what kind of index would be preferred for this purpose? 
 btree or hash?
 
 I find it hard to get excited about using the AM interface for
 this purpose.  To me it makes a lot more sense to have separate,
 much simpler code.  We don't need any transactionality, user
 defined types, user defined operators, or anything like that.
 
 +1.
 
 But, would not rewriting a lot of existing functionalities
 potentially lead to points of contention and/or much more effort?
 
 Don't forget the stats are written only by the postmaster, all the
 regular backends only read it (and eventually send updates back). But
 yes, contention might be a problem, because there will have to be some
 kind of locking that is not needed now when the postmaster is writing
 fresh copy into a new file.
 
 But I think we need to implement something and then measure this.
 Because even with the contention, the overall performance might
 actually improve.
 
 I'd vote to try a simple approach first - adding some simple array
 'index' allowing fast access to particular records etc. At least that
 was my plan. But feel free to implement something more advanced (e.g. a
 BTree storage) and we can compare the results.
 
 
+1 on the simple implementation and measure approach.

My focus here is to identify what kind of queries we expect to be
serving from the stats. I think someone mentioned range queries
upthread. I feel we should be looking at range trees as a secondary
index, if not the primary storage, for pgstat.

Regards,

Atri



Re: [HACKERS] [rfc] overhauling pgstat.stat

2013-09-07 Thread Atri Sharma


Sent from my iPad

On 08-Sep-2013, at 4:27, Tomas Vondra t...@fuzzy.cz wrote:

 [...]
 
 I'd vote to try a simple approach first - adding some simple array
 'index' allowing fast access to particular records etc. At least that
 was my plan. But feel free to implement something more advanced (e.g. a
 BTree storage) and we can compare the results.

Another thing I would want to explore is whether we could somehow
prioritise the more frequently accessed records to reduce their access
times even more. I am thinking along the lines of self-organising
lists. I am not sure if and how it would be possible to implement this
idea of self-organisation in a BTree or any other tree, though.

Regards,

Atri
