Re: [HACKERS] Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-06 Thread wangshuo

于 2013-09-06 01:41, Jeff Janes 回复:

On Wed, Sep 4, 2013 at 10:06 PM,  wangs...@highgo.com.cn wrote:

于 2013-09-04 23:41, Jeff Janes 回复:


On Tue, Sep 3, 2013 at 9:08 PM,  wangs...@highgo.com.cn wrote:


Hi, Hackers!

I find that it takes a long time when I increase the scale of a 
numeric

datatype.
By checking the code, I found that's because it needs to rewrite 
that

table's file.
After checking that table's data file, I found only parameter 
n_header

changed.
And, I found the data in that numeric field never changed.
So I thank It's not necessary to rewrite the table's file in this 
case.


Anyone has more idea about this, please come to talk about this!





Jeff Janes jeff.ja...@gmail.com wrote:


This was fixed in version 9.2.  You must be using an older version.

Cheers,

Jeff



Thanks for your reply.

To declare a column of type numeric use the syntax:
NUMERIC(precision, scale).
What I said is this scale,not yours.




Jeff Janes jeff.ja...@gmail.com wrote:
You're right, I had tested a change in precision, not in scale.  
Sorry.


In order to avoid the rewrite, the code would have to be changed to
look up the column definition and if it specifies the scale, then
ignore the per-row n_header, and look at the n_header only if the
column is NUMERIC with no precision or scale.  That should
conceptually be possible, but I don't know how hard it would be to
implement--it sounds pretty invasive to me.  Then if the column was
altered from NUMERIC with scale to be a plain NUMERIC, it would have
to rewrite the table to enforce the row-wise scale to match the old
column-wise scale.  Where as now that alter doesn't need a re-write.
I don't know if this would be an overall gain or not.

Cheers,

Jeff


I modified the code for this situation.I consider it very simple.
It will does not modify the table file, when the scale has been
increased exclusively.

I modified the code , as follow:

static bool ATColumnChangeRequiresRewrite(Node *expr, AttrNumber 
varattno, int32 oldtypemod, int32 newtypemod);



in function ATExecAlterColumnGenericOptions:

if (ATColumnChangeRequiresRewrite(transform, attnum, 
attTup-atttypmod, targettypmod))

tab-rewrite = true;

in the function ATColumnChangeRequiresRewrite:

else if (IsA(expr, FuncExpr))
{
int32   between = 0;

/*
 * Check whether funcresulttype == 1700 and 
funcid == 1703 when user modify datatype.
 * If true, then we know user modify the 
datatype numeric;

 * Then we go to get value 'between'.
 */
			if(((FuncExpr *) expr)-funcresulttype == 1700  ((FuncExpr *) 
expr)-funcid == 1703)

between = newtypemod - oldtypemod;

/*
 * If 'between' satisfy the following 
condition,
 * Then we know the scale of the numeric was 
increased.

 */
if(between  0  between  1001)
return false;
else
return true;
}

I packed a patch about this modification.

 Wang Shuo
 HighGo Software Co.,Ltd.
 September 6, 2013diff -uNr b/src/backend/commands/tablecmds.c a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c	2013-08-31 17:11:00.529744869 +0800
+++ a/src/backend/commands/tablecmds.c	2013-09-04 11:20:28.797652760 +0800
@@ -367,7 +367,7 @@
 	  AlteredTableInfo *tab, Relation rel,
 	  bool recurse, bool recursing,
 	  AlterTableCmd *cmd, LOCKMODE lockmode);
-static bool ATColumnChangeRequiresRewrite(Node *expr, AttrNumber varattno);
+static bool ATColumnChangeRequiresRewrite(Node *expr, AttrNumber varattno, int32 oldtypemod, int32 newtypemod);
 static void ATExecAlterColumnType(AlteredTableInfo *tab, Relation rel,
 	  AlterTableCmd *cmd, LOCKMODE lockmode);
 static void ATExecAlterColumnGenericOptions(Relation rel, const char *colName,
@@ -7480,7 +7480,7 @@
 		newval-expr = (Expr *) transform;
 
 		tab-newvals = lappend(tab-newvals, newval);
-		if (ATColumnChangeRequiresRewrite(transform, attnum))
+		if (ATColumnChangeRequiresRewrite(transform, attnum, attTup-atttypmod, targettypmod))
 			tab-rewrite = true;
 	}
 	else if (transform)
@@ -7530,7 +7530,7 @@
  * try to do that.
  */
 static bool
-ATColumnChangeRequiresRewrite(Node *expr, AttrNumber varattno)
+ATColumnChangeRequiresRewrite(Node *expr, AttrNumber varattno, int32 oldtypemod, int32 newtypemod)
 {
 	Assert(expr != NULL);
 
@@ -7549,6 +7549,18 @@
 return true;
 			expr = (Node *) d-arg;
 		}
+		else if (IsA(expr, FuncExpr))
+		{
+			int32	between = 0;
+
+			if(((FuncExpr *) expr)-funcresulttype == 1700  ((FuncExpr *) expr)-funcid == 1703)
+between = newtypemod - oldtypemod;
+
+			if(between  0  between  1001)
+	

Re: [HACKERS] get rid of SQL_ASCII?

2013-09-06 Thread Craig Ringer
On 09/05/2013 08:47 PM, Peter Eisentraut wrote:
 Other ideas?  Are there legitimate uses for SQL_ASCII?

IMO people who want SQL_ASCII should actually be storing everything in
`bytea`; that's a truer reflection of what they're actually storing,
retrieving, and working with and how they're doing it.

Unfortunately there'll be enough users of it around that I don't think
we can drop it.

What we SHOULD be doing is making it an explicit decision to use
SQL_ASCII, and NEVER creating a cluster or database with that encoding
by default. Ever. If we can't decide what the correct default encoding
is (say, if locale is C) we should error out unless a specific flag is
set.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Dimitri Fontaine
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 I'm not sure if we need to expose all these new maintenance actions as
 SQL commands.

I strongly think we should, if only for diagnostic purposes. Also to
adapt to some well defined workloads that the automatic system is not
designed to handle.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] regression tests

2013-09-06 Thread Jeremy Harris

Hi,

   I don't see the regression tests running any index-hash operations.
What am I missing?

--
Cheers,
Jeremy


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-06 Thread Cédric Villemain
Le jeudi 5 septembre 2013 17:14:37 Bruce Momjian a écrit :
 On Thu, Sep  5, 2013 at 06:14:33PM +0200, Magnus Hagander wrote:
   I have developed the attached patch which implements an auto-tuned
   effective_cache_size which is 4x the size of shared buffers.  I had to
   set effective_cache_size to its old 128MB default so the EXPLAIN
   regression tests would pass unchanged.
  
  That's not really autotuning though. ISTM that making the *default* 4
  x shared_buffers might make perfect sense, but do we really need to
  hijack the value of -1 for that? That might be useful for some time
  when we have actual autotuning, that somehow inspects the system and
  tunes it from there.
  
  I also don't think it should be called autotuning, when it's just a
  smarter default value.
  
  I like the feature, though, just not the packaging.
 
 That auto-tuning text came from the wal_buffer documentation, which
 does exactly this based on shared_buffers:
 
 The contents of the WAL buffers are written out to disk at every
 transaction commit, so extremely large values are unlikely to
 provide a significant benefit.  However, setting this value to at
 least a few megabytes can improve write performance on a busy
 -- server where many clients are committing at once.  The auto-tuning
---
 selected by the default setting of -1 should give reasonable
 results in most cases.
 
 I am fine with rewording and not using -1, but we should change the
 wal_buffer default and documentation too then.  I am not sure what other
 value than -1 to use?  0?  I figure if we ever get better auto-tuning,
 we would just remove this functionality and make it better.

I'm fine with a -1 for auto-tune or inteligent default: it means (for me) that 
you don't need to care about this parameter in most case.

A negative impact of the simpler multiplier might be that if suddendly someone 
reduce the shared_buffers size to fix some strange behavior, then he at the 
same 
needs to increase manualy the effective_cache_size (which remain the sum of the 
caches on the system, at least on a dedicated to PostgreSQL one).

IMHO it is easy to know exactly how much of the memory is (or can be) used 
for/by PostgreSQL, we can compute that and update effective_cache_size at 
regular point int time. (just an idea, I know there are arguments against that 
too)

Maybe the value for a 4x multiplier instead of 3x, is that the 
effective_cache_size usage can be larger than required. It's not a big trouble.
With all things around NUMA we maybe just need to revisit that area (memory 
access cost non linear, double-triple caching, ...) .
-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] get rid of SQL_ASCII?

2013-09-06 Thread Florian Weimer

On 09/06/2013 09:14 AM, Craig Ringer wrote:

On 09/05/2013 08:47 PM, Peter Eisentraut wrote:

Other ideas?  Are there legitimate uses for SQL_ASCII?


IMO people who want SQL_ASCII should actually be storing everything in
`bytea`; that's a truer reflection of what they're actually storing,
retrieving, and working with and how they're doing it.


Practically speaking, the escaping gets in the way, and there isn't full 
feature parity with TEXT.  Regular expression matching seems to be 
missing, for instance.


But apart from that, yes, BYTEA would be the more appropriate choice.

--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Analysis on backend-private memory usage (and a patch)

2013-09-06 Thread Greg Stark
On 4 Sep 2013 20:46, Heikki Linnakangas hlinnakan...@vmware.com wrote:


 One fairly simple thing we could do is to teach catcache.c to resize the
caches. Then we could make the initial size of all the syscaches much
smaller. At the moment, we use fairly caches for catalogs like pg_enum (256
entries) and pg_usermapping (128), even though most databases don't use
those features at all. If they could be resized on demand, we could easily
allocate them initially with just, say, 4 entries.

If most databases don't use the feature at all, tsparser, enums, etc, why
not start out with *no* cache and only build one when it's first needed?
This would also mean there's less overhead for implementing new features
that aren't universally used.


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Hannu Krosing
On 09/06/2013 09:23 AM, Dimitri Fontaine wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 I'm not sure if we need to expose all these new maintenance actions as
 SQL commands.
 I strongly think we should, if only for diagnostic purposes. 
It would be much easier and more flexible to expose them
as pg_*() function calls, not proper commands.
 Also to
 adapt to some well defined workloads that the automatic system is not
 designed to handle.
+1

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS]

2013-09-06 Thread Agustin Larreinegabe
Hi everyone:

I want to know if exists a postgres function or some easy way to know if a
PK in a table is already referenced in another table/tables.

e.g.
I want to delete a row but first I've got to change or delete in the
table/tables where is referenced, and I have many table where could be
referenced.


Re: [HACKERS]

2013-09-06 Thread Serge Fonville
Hi Agustin,


 I want to know if exists a postgres function or some easy way to know if
a PK in a table is already referenced  in another table/tables.
Unfortunately, this is not the correct mailing list.

Instead you should ask these types of questions on either NOVICE or GENERAL.
http://www.postgresql.org/community/lists/

HTH

Kind regards/met vriendelijke groet,

Serge Fonville

http://www.sergefonville.nl

Convince Microsoft!
Please don't get rid of the MCM and MCA
programshttps://connect.microsoft.com/SQLServer/feedback/details/417926/truncate-partition-of-partitioned-table


2013/9/6 Agustin Larreinegabe alarre...@gmail.com

 Hi everyone:

 I want to know if exists a postgres function or some easy way to know if a
 PK in a table is already referenced in another table/tables.

 e.g.
 I want to delete a row but first I've got to change or delete in the
 table/tables where is referenced, and I have many table where could be
 referenced.




Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Andres Freund
On 2013-09-06 01:22:36 -0400, Alvaro Herrera wrote:
 I think it's shortsighted to keep thinking of autovacuum as just a way
 to run VACUUM and ANALYZE.  We have already discussed work items that
 need to be done separately, such as truncating the last few empty pages
 on a relation that was vacuumed recently.  We also need to process a GIN
 index' pending insertion list; and with minmax indexes I will want to
 run summarization of heap page ranges.

Agreed.

 So maybe instead of trying to think of VM bit setting as part of vacuum,
 we could just keep stats about how many pages we might need to scan
 because of possibly needing to set the bit, and then doing that in
 autovacuum, independently from actually vacuuming the relation.

I am not sure I understand this though. What would be the point to go
and set all visible and not do the rest of the vacuuming work?

I think triggering vacuuming by scanning the visibility map for the
number of unset bits and use that as another trigger is a good idea. The
vm should ensure we're not doing superflous work.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Andres Freund
On 2013-09-06 13:38:56 +0200, Hannu Krosing wrote:
 On 09/06/2013 09:23 AM, Dimitri Fontaine wrote:
  Alvaro Herrera alvhe...@2ndquadrant.com writes:
  I'm not sure if we need to expose all these new maintenance actions as
  SQL commands.
  I strongly think we should, if only for diagnostic purposes. 
 It would be much easier and more flexible to expose them
 as pg_*() function calls, not proper commands.

I don't think that's as easy as you might imagine. For much of what's
done in that context you cannot be in a transaction, you even need to be
in a toplevel statement (since we internally
CommitTransactionCommand/StartTransactionCommand).

So those pg_* commands couldn't be called (except possibly via the
fastpath function call API ...) which might restrict their usefulnes a
teensy bit ;)

So, I think extending the options passed to VACUUM - since it can take
pretty generic options these days - is a more realistic path.

  Also to
  adapt to some well defined workloads that the automatic system is not
  designed to handle.
 +1

What would you like to expose individually?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Extend namespace of valid guc names

2013-09-06 Thread Andres Freund
On 2013-02-25 21:13:25 -0500, Tom Lane wrote:
  b) allow variables to start with a digit from the second level onwards.
 
 That seems like a seriously bad idea.  I note that SET does *not* allow
 this;

Hm. One thing about this is that we currently allow something silly as:
SET 1.1barblub = 3;

So I'd like to either restrict SET here or allow the same for guc-file.l
parsed GUCs. Any opinions?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII

2013-09-06 Thread MauMau

Hello,

I've been suffering from PostgreSQL's problems related to character encoding 
for some time.  I really wish to solve those problems, because they make 
troubleshooting difficult.  I'm going to propose fixes for them, and I would 
appreciate if you could help release the official patches as soon as 
possible.


The first issue is that the messages from strerror() become ??? in a 
typical locale/encoding combination.  I found this was reported in 2010, but 
it was not solved.


problem with glibc strerror messages translation (was: Could not open file 
pg_xlog/00010)

http://www.postgresql.org/message-id/87pqvezp3w@home.progtech.ru

The steps to reproduce the problem are:

$ export LANG=ja_JP.UTF-8
$ initdb -E UTF8 --no-locale --lc-messages=ja_JP
$ pg_ctl start
$ psql -d postgres -c CREATE TABLE a (col int)
$ psql -d postgres -c SELECT pg_relation_filepath('a')
... This outputs something like base/xxx/yyy
$ mv $PGDATA/base/xxx/yyy a
$ psql -d postgres -c SELECT * FROM a
... This outputs, in Japanese, a message meaning could not open file 
base/xxx/yyy: ???.


The problem is that strerror() returns ???, which hides the cause of the 
trouble.


The cause is that gettext() called by strerror() tries to convert UTF-8 
messages obtained from libc.mo to ASCII.  This is because postgres calls 
setlocale(LC_CTYPE, C) when it connects to the database.


Thus, I attached a patch (strerror_codeset.patch).  This simple patch just 
sets the codeset for libc catalog the same as postgres catalog.  As noted in 
the comment, I understand this is a kludge based on an undocumented fact 
(the catalog for strerror() is libc.mo), and may not work on all 
environments.  However, this will help many people who work in non-English 
regions.  Please just don't reject this because of implementation cleanness. 
If there is a better idea which can be implemented easily, I'd be happy to 
hear that.



I'm also attaching another patch, errno_str.patch, which adds the numeric 
value of errno to %m in ereport() like:


could not open file base/xxx/yyy: errno=2: No such file or directory

When talking with operating system experts, numeric errno values are 
sometimes more useful and easy to communicate than their corresponding 
strings.  This is a closely related but a separate proposal.


I want the first patch to be backported at least to 9.2.

Regards
MauMau


strerror_codeset.patch
Description: Binary data


errno_str.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] get rid of SQL_ASCII?

2013-09-06 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 What we SHOULD be doing is making it an explicit decision to use
 SQL_ASCII, and NEVER creating a cluster or database with that encoding
 by default. Ever. If we can't decide what the correct default encoding
 is (say, if locale is C) we should error out unless a specific flag is
 set.

There's a large undercurrent of I say it's bad for you in this thread,
with frankly nothing to back it up.  If we try to be as nanny-ish as
you're suggesting here, we'll just annoy users.

And just to push back on the specific point: SQL_ASCII *is* the correct
default encoding for C locale.  Both are agnostic about the meaning of
anything outside the 7-bit ASCII set, while not rejecting such data.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Extend namespace of valid guc names

2013-09-06 Thread Andres Freund
On 2013-09-06 10:13:23 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  That seems like a seriously bad idea.  I note that SET does *not* allow
  this;

  Hm. One thing about this is that we currently allow something silly as:
  SET 1.1barblub = 3;

  So I'd like to either restrict SET here or allow the same for guc-file.l
  parsed GUCs. Any opinions?

 Well, if you feel an absolute compulsion to make them consistent, I'd
 go with making SET disallow creation of variables with names the file
 parser wouldn't recognize.  But why is it such a bad thing if SET can
 do that?  The whole reason we allow SET to create new variables at all
 is that the universe of things you can have as session-local values is
 larger than the set of parameters that are allowed in postgresql.conf.
 So I'm missing why we need such a restriction.

Well, it's confusing for users, i.e. me. I've several times now
prototyped stuff that was supposed to be configurable in postgresql.conf
by either passing the options to postgres -c or by doing user level
SETs. Only to then later discover that what I've prototyped doesn't work
because the restrictions in postgresql.conf are way stricter.

Also, ALTER SYSTEM SET is going to need a similar restriction as well,
otherwise the server won't restart although the GUCs pass validation...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII

2013-09-06 Thread Tom Lane
MauMau maumau...@gmail.com writes:
 I've been suffering from PostgreSQL's problems related to character encoding 
 for some time.  I really wish to solve those problems, because they make 
 troubleshooting difficult.  I'm going to propose fixes for them, and I would 
 appreciate if you could help release the official patches as soon as 
 possible.

I don't find either of these patches to be a particularly good idea.
There is certainly no way we'd risk back-patching something with as
many potential side-effects as fooling with libc's textdomain.

I wonder though if we could attack the specific behavior you're
complaining of by testing to see if strerror() returned ???, and
substituting the numeric value for that, ie

 * Some strerror()s return an empty string for out-of-range errno. This 
is
 * ANSI C spec compliant, but not exactly useful.
 */
-   if (str == NULL || *str == '\0')
+   if (str == NULL || *str == '\0' || strcmp(str, ???) == 0)
{
snprintf(errorstr_buf, sizeof(errorstr_buf),
/*--

This would only work if glibc always returns that exact string for a
codeset translation failure, but a look into the glibc sources should
quickly confirm that.

BTW: personally, I would say that what you're looking at is a glibc bug.
I always thought the contract of gettext was to return the ASCII version
if it fails to produce a translated version.  That might not be what the
end user really wants to see, but surely returning something like ???
is completely useless to anybody.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Extend namespace of valid guc names

2013-09-06 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-02-25 21:13:25 -0500, Tom Lane wrote:
 b) allow variables to start with a digit from the second level onwards.

 That seems like a seriously bad idea.  I note that SET does *not* allow
 this;

 Hm. One thing about this is that we currently allow something silly as:
 SET 1.1barblub = 3;

 So I'd like to either restrict SET here or allow the same for guc-file.l
 parsed GUCs. Any opinions?

Well, if you feel an absolute compulsion to make them consistent, I'd
go with making SET disallow creation of variables with names the file
parser wouldn't recognize.  But why is it such a bad thing if SET can
do that?  The whole reason we allow SET to create new variables at all
is that the universe of things you can have as session-local values is
larger than the set of parameters that are allowed in postgresql.conf.
So I'm missing why we need such a restriction.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII

2013-09-06 Thread Andres Freund
On 2013-09-06 10:37:16 -0400, Tom Lane wrote:
 MauMau maumau...@gmail.com writes:
  I've been suffering from PostgreSQL's problems related to character 
  encoding 
  for some time.  I really wish to solve those problems, because they make 
  troubleshooting difficult.  I'm going to propose fixes for them, and I 
  would 
  appreciate if you could help release the official patches as soon as 
  possible.
 
 I don't find either of these patches to be a particularly good idea.
 There is certainly no way we'd risk back-patching something with as
 many potential side-effects as fooling with libc's textdomain.

I have no clue about the gettext stuff but I am in favor of including
the raw errno in strerror() messages (no backpatching tho). When doing
support it's a PITA to get translated strings for those. I can lookup
postgres' own translated messages in the source easy enough, but that
doesn't work all that well for OS supplied messages.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] regression tests

2013-09-06 Thread Robert Haas
On Fri, Sep 6, 2013 at 3:34 AM, Jeremy Harris j...@wizmail.org wrote:
I don't see the regression tests running any index-hash operations.
 What am I missing?

What's an index-hash operation?

...Robert


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] get rid of SQL_ASCII?

2013-09-06 Thread Robert Haas
On Fri, Sep 6, 2013 at 10:19 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 There's a large undercurrent of I say it's bad for you in this thread,
 with frankly nothing to back it up.  If we try to be as nanny-ish as
 you're suggesting here, we'll just annoy users.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Hannu Krosing
On 09/06/2013 03:12 PM, Andres Freund wrote:
 On 2013-09-06 13:38:56 +0200, Hannu Krosing wrote:
 On 09/06/2013 09:23 AM, Dimitri Fontaine wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 I'm not sure if we need to expose all these new maintenance actions as
 SQL commands.
 I strongly think we should, if only for diagnostic purposes. 
 It would be much easier and more flexible to expose them
 as pg_*() function calls, not proper commands.
 I don't think that's as easy as you might imagine. For much of what's
 done in that context you cannot be in a transaction, you even need to be
 in a toplevel statement (since we internally
 CommitTransactionCommand/StartTransactionCommand).

 So those pg_* commands couldn't be called (except possibly via the
 fastpath function call API ...) which might restrict their usefulnes a
 teensy bit ;)

 So, I think extending the options passed to VACUUM - since it can take
 pretty generic options these days - is a more realistic path.
Might be something convoluted like 

VACUUM indexname WITH (function = pg_cleanup_gin($1));

:)

 Also to
 adapt to some well defined workloads that the automatic system is not
 designed to handle.
 +1
 What would you like to expose individually?

 Greetings,

 Andres Freund




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII

2013-09-06 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I have no clue about the gettext stuff but I am in favor of including
 the raw errno in strerror() messages (no backpatching tho).

I dislike that on grounds of readability and translatability; and
I'm also of the opinion that errno codes aren't really consistent
enough across platforms to be all that trustworthy for remote diagnostic
purposes.  I'm fine with printing the code if strerror fails to
produce anything useful --- but not if it succeeds.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII

2013-09-06 Thread Andres Freund
On 2013-09-06 10:52:03 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I have no clue about the gettext stuff but I am in favor of including
  the raw errno in strerror() messages (no backpatching tho).
 
 I dislike that on grounds of readability and translatability; and
 I'm also of the opinion that errno codes aren't really consistent
 enough across platforms to be all that trustworthy for remote diagnostic
 purposes.

Well, it's easier to get access to mappings between errno and meaning of
foreign systems than to get access to their translations in my
experience.

If we'd add the errno inside %m processing, I don't see how it's
a problem for translation?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [HACKERS] Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-06 Thread Kevin Grittner
wangs...@highgo.com.cn wangs...@highgo.com.cn wrote:

 I modified the code for this situation.I consider it very simple.

 It will does not modify the table file, when the scale has been
 increased exclusively.

This patch would allow data in a column which was not consistent
with the column definition:

test=# create table n (val numeric(5,2));
CREATE TABLE
test=# insert into n values ('123.45');
INSERT 0 1
test=# select * from n;
  val  

 123.45
(1 row)

test=# alter table n alter column val type numeric(5,4);
ALTER TABLE
test=# select * from n;
  val  

 123.45
(1 row)

Without your patch the ALTER TABLE command gets this error (as it
should):

test=# alter table n alter column val type numeric(5,4);
ERROR:  numeric field overflow
DETAIL:  A field with precision 5, scale 4 must round to an absolute value less 
than 10^1.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] get rid of SQL_ASCII?

2013-09-06 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 There's a large undercurrent of I say it's bad for you in
 this thread, with frankly nothing to back it up.  If we try to
 be as nanny-ish as you're suggesting here, we'll just annoy
 users.

 +1.

+1

I can definitely see a place for an ASCII7 encoding which would
reject anything with the high bit set; but there is a clear place
for the current SQL_ASCII, too.  Eliminating it would be much pain
for no discernible gain.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII

2013-09-06 Thread Greg Stark
On Fri, Sep 6, 2013 at 3:57 PM, Andres Freund and...@2ndquadrant.comwrote:

 On 2013-09-06 10:52:03 -0400, Tom Lane wrote:
  Andres Freund and...@2ndquadrant.com writes:
   I have no clue about the gettext stuff but I am in favor of including
   the raw errno in strerror() messages (no backpatching tho).
 
  I dislike that on grounds of readability and translatability; and
  I'm also of the opinion that errno codes aren't really consistent
  enough across platforms to be all that trustworthy for remote diagnostic
  purposes.


Historically they weren't even the same on Linux acros architectures. This
was to support running native binaries from the incumbent platform (SunOS,
OSF, BSD) under emulation on each architecture. I don't see any evidence of
that any more but I'm not sure I'm looking in the right place.


 Well, it's easier to get access to mappings between errno and meaning of
 foreign systems than to get access to their translations in my
 experience.


That's definitely true. There are only a few possible platforms and it's
not hard to convert an errno to an error string on a given platform.
Converting a translated string in some language you can't read to an
untranslated string is another matter.

What would be nicer would be to display the C define, EINVAL, EPERM, etc.
Afaik there's no portable way to do that though. I suppose we could just
have a small array or hash table of all the errors we know about and look
it up.

-- 
greg


[HACKERS] Re: [HACKERS] Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-06 Thread Greg Stark
On Thu, Sep 5, 2013 at 8:53 PM, Alvaro Herrera alvhe...@2ndquadrant.comwrote:

 Greg Stark escribió:

  The main difficulty is that Postgres is very extensible. So to implement
  this you need to think bigger than NUMERIC. It should also be possible to
  alter a column from varchar(5) to varchar(10) for example (but not the
  other way around).

 We already allow that.  See commits
 8f9fe6edce358f7904e0db119416b4d1080a83aa and
 3cc0800829a6dda5347497337b0cf43848da4acf


Ah, nice. i missed that.

So the issue here is that NUMERIC has an additional concept of scale that
is buried in the values and that this scale is set based on the typmod that
was in effect when the value was stored. If you change the typmod on the
column it currently rescales all the values in the table? There's even a
comment to that effect on the commit you pointed at.

But I wonder if we could just declare that that's not what the scale typmod
does. That it's just a maximum scale but it's perfectly valid for NUMERIC
data with lower scales to be stored in a column than the typmod says. In a
way the current behaviour is like bpchar but it would be nice if it was
more like varchar


-- 
greg


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Andres Freund
On 2013-09-06 12:30:56 -0400, Bruce Momjian wrote:
  I am not sure I understand this though. What would be the point to go
  and set all visible and not do the rest of the vacuuming work?
 
  I think triggering vacuuming by scanning the visibility map for the
  number of unset bits and use that as another trigger is a good idea. The
  vm should ensure we're not doing superflous work.

 Yes, I think it might be hard to justify a separate VM-set-only scan of
 the table.  If you are already reading the table, and already checking
 to see if you can set the VM bit, I am not sure why you would not also
 remove old rows, especially since removing those rows might be necessary
 to allow setting VM bits.

Yep. Although adding the table back into the fsm will lead to it being
used for new writes again...

 Another problem I thought of is that while automatic vacuuming only
 happens with high update/delete load, index-only scans are best on
 mostly non-write tables, so we have bad behavior where the ideal case
 (static data) doesn't get vm-bits set, while update/delete has the
 vm-bits set, but then cleared as more update/deletes occur.

Well, older tables will get vacuumed due to vacuum_freeze_table_age. So
at some point they will get vacuumed and the vm bits will get set.

 The more I look at this the worse it appears.   How has this gone
 unaddressed for over a year?

It's been discussed several times including during the introduction of
the feature. I am a bit surprised about the panickey tone in this
thread.
Yes, we need to overhaul the way vacuum works (to reduce the frequency
of rewriting stuff repeatedly) and the way it's triggered (priorization,
more trigger conditions) but all these are known things and just need
somebody with time.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] regression tests

2013-09-06 Thread Jeremy Harris

On 06/09/13 15:44, Robert Haas wrote:

On Fri, Sep 6, 2013 at 3:34 AM, Jeremy Harris j...@wizmail.org wrote:

I don't see the regression tests running any index-hash operations.
What am I missing?


What's an index-hash operation?



Ones that hit tuplesort_begin_index_hash()
--
Cheers,
   Jeremy


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Bruce Momjian
On Fri, Sep  6, 2013 at 03:08:54PM +0200, Andres Freund wrote:
 On 2013-09-06 01:22:36 -0400, Alvaro Herrera wrote:
  I think it's shortsighted to keep thinking of autovacuum as just a way
  to run VACUUM and ANALYZE.  We have already discussed work items that
  need to be done separately, such as truncating the last few empty pages
  on a relation that was vacuumed recently.  We also need to process a GIN
  index' pending insertion list; and with minmax indexes I will want to
  run summarization of heap page ranges.
 
 Agreed.
 
  So maybe instead of trying to think of VM bit setting as part of vacuum,
  we could just keep stats about how many pages we might need to scan
  because of possibly needing to set the bit, and then doing that in
  autovacuum, independently from actually vacuuming the relation.
 
 I am not sure I understand this though. What would be the point to go
 and set all visible and not do the rest of the vacuuming work?
 
 I think triggering vacuuming by scanning the visibility map for the
 number of unset bits and use that as another trigger is a good idea. The
 vm should ensure we're not doing superflous work.

Yes, I think it might be hard to justify a separate VM-set-only scan of
the table.  If you are already reading the table, and already checking
to see if you can set the VM bit, I am not sure why you would not also
remove old rows, especially since removing those rows might be necessary
to allow setting VM bits.

Another problem I thought of is that while automatic vacuuming only
happens with high update/delete load, index-only scans are best on
mostly non-write tables, so we have bad behavior where the ideal case
(static data) doesn't get vm-bits set, while update/delete has the
vm-bits set, but then cleared as more update/deletes occur.

The more I look at this the worse it appears.   How has this gone
unaddressed for over a year?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-06 Thread Hannu Krosing
On 09/06/2013 07:57 PM, Robert Haas wrote:
 On Fri, Sep 6, 2013 at 12:34 PM, Greg Stark st...@mit.edu wrote:
 But I wonder if we could just declare that that's not what the scale typmod
 does. That it's just a maximum scale but it's perfectly valid for NUMERIC
 data with lower scales to be stored in a column than the typmod says. In a
 way the current behaviour is like bpchar but it would be nice if it was more
 like varchar
 Sure, but the point is that 5. is not the same as 5.000 today.  If
 you start whacking this around you'll be changing that behavior, I
 think.

So we already get it wrong by rewriting ?



-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-06 Thread Robert Haas
On Fri, Sep 6, 2013 at 12:34 PM, Greg Stark st...@mit.edu wrote:
 But I wonder if we could just declare that that's not what the scale typmod
 does. That it's just a maximum scale but it's perfectly valid for NUMERIC
 data with lower scales to be stored in a column than the typmod says. In a
 way the current behaviour is like bpchar but it would be nice if it was more
 like varchar

Sure, but the point is that 5. is not the same as 5.000 today.  If
you start whacking this around you'll be changing that behavior, I
think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamic shared memory

2013-09-06 Thread Jim Nasby

On 9/5/13 11:37 AM, Robert Haas wrote:

ISTM that at some point we'll want to look at putting top-level shared
memory into this system (ie: allowing dynamic resizing of GUCs that affect
shared memory size).

A lot of people want that, but being able to resize the shared memory
chunk itself is only the beginning of the problem.  So I wouldn't hold
my breath.


starts breathing again


Wouldn't it protect against a crash while writing the file? I realize the
odds of that are pretty remote, but AFAIK it wouldn't cost that much to
write a new file and do an atomic mv...

If there's an OS-level crash, we don't need the state file; the shared
memory will be gone anyway.  And if it's a PostgreSQL-level failure,
this game neither helps nor hurts.


Sure.  A messed-up backend can clobber the control segment just as it
can clobber anything else in shared memory.  There's really no way
around that problem.  If the control segment has been overwritten by a
memory stomp, we can't use it to clean up.  There's no way around that
problem except to not the control segment, which wouldn't be better.


Are we trying to protect against memory stomps when we restart after a
backend dies? I thought we were just trying to ensure that all shared data
structures were correct and consistent. If that's the case, then I was
thinking that by using a pointer that can be updated in a CPU-atomic fashion
we know we'd never end up with a corrupted entry that was in use; the
partial write would be to a slot with nothing pointing at it so it could be
safely reused.

When we restart after a backend dies, shared memory contents are
completely reset, from scratch.  This is true of both the fixed size
shared memory segment and of the dynamic shared memory control
segment.  The only difference is that, with the dynamic shared memory
control segment, we need to use the segment for cleanup before
throwing it out and starting over.  Extra caution is required because
we're examining memory that could hypothetically have been stomped on;
we must not let the postmaster do anything suicidal.


Not doing something suicidal is what I'm worried about (that and not cleaning 
up as well as possible).

The specific scenario I'm worried about is something like a PANIC in the middle 
of the snprintf call in dsm_write_state_file(). That would leave that file in a 
completely unknown state so who knows what would then happen on restart. ISTM 
that writing a temp file and then doing a filesystem mv would eliminate that 
issue.

Or is it safe to assume that the snprintf call will be atomic since we're just 
spitting out a long?
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Sure, but the point is that 5. is not the same as 5.000 today.  If
 you start whacking this around you'll be changing that behavior, I
 think.

Yeah.  And please note that no matter what the OP may think, a lot of
people *do* consider that there's a useful distinction between 5.000
and 5. --- it might indicate the number of significant digits in
a measurement, for example.  I do not see us abandoning that just to make
certain cases of ALTER TABLE faster.

There was some upthread discussion about somehow storing the scale info
at the column level rather than the individual-datum level.  If we could
do that, then it'd be possible to make this type of ALTER TABLE fast.
However, the work involved to do that seems enormously out of proportion
to the benefit, mainly because there just isn't any convenient way to
trace a Datum to its source column, even assuming it's got one.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Jim Nasby

On 9/6/13 2:13 PM, Bruce Momjian wrote:

On Fri, Sep  6, 2013 at 01:01:59PM -0400, Bruce Momjian wrote:

This December 2012 thread by Andrew Dunstan shows he wasn't aware that a
manual VACUUM was required for index-only scans.  That thread ended with
us realizing that pg_upgrade's ANALYZE runs will populate
pg_class.relallvisible.

What I didn't see in that thread is an analysis of what cases are going
to require manual vacuum, and I have seen no work in 9.3 to improve
that.  I don't even see it on the TODO list.


OK, let's start the discussion then.  I have added a TODO list:

Improve setting of visibility map bits for read-only and insert-only 
workloads

So, what should trigger an auto-vacuum vacuum for these workloads?
Rather than activity, which is what normally drives autovacuum, it is
lack of activity that should drive it, combined with a high VM cleared
bit percentage.

It seems we can use these statistics values:

 n_tup_ins   | bigint
 n_tup_upd   | bigint
 n_tup_del   | bigint
 n_tup_hot_upd   | bigint
 n_live_tup  | bigint
 n_dead_tup  | bigint
 n_mod_since_analyze | bigint
 last_vacuum | timestamp with time zone
 last_autovacuum | timestamp with time zone

Particilarly last_vacuum and last_autovacuum can tell us the last time
of vacuum.  If the n_tup_upd/n_tup_del counts are low, and the  VM set
bit count is low, it might need vacuuming, though inserts into existing
pages would complicate that.


Something else that might be useful to look at is if there are any FSM entries 
or not. True insert only shouldn't have any FSM.

That said, there's definitely another case to think about... tables that see 
update activity on newly inserted rows but not on older rows. A work queue that 
is not pruned would be an example of that:

INSERT new work item
UPDATE work item SET status = 'In process';
UPDATE work item SET completion = '50%';
UPDATE work item SET sattus = 'Complete, completion = '100%';

In this case I would expect most of the pages in the table (except the very 
end) to be all visible.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fix picksplit with nan values

2013-09-06 Thread Alexander Korotkov
Hackers,

PostGIS spotted that picksplit algorithm freezes in infinite loop when
dealing with nan values. I discovered same bug is present in core
opclasses. Attached patch fixes this issue interpreting nan as value
greater than infinity like btree comparison function does.
This patch contain copy of float8_cmp_internal rather than exposing it
from float.c, because it let compiler inline this function.

--
With best regards,
Alexander Korotkov.


picksplit-nan-fix-1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-06 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 But I wonder if we could just declare that that's not what the scale typmod
 does. That it's just a maximum scale but it's perfectly valid for NUMERIC
 data with lower scales to be stored in a column than the typmod says. In a
 way the current behaviour is like bpchar but it would be nice if it was
 more like varchar

BTW, note that if you want varying scale in a column, you can declare it
as unconstrained numeric.  So that case corresponds to text, whereas
as you rightly say, numeric(m,n) is more like bpchar(n).  It's true there
is nothing corresponding to varchar(n), but how much do you really need
that case?  The SQL standard didn't see fit to invent a variant of numeric
that worked that way, so they at least aren't buying it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [HACKERS] Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-06 Thread Jeff Janes
On Fri, Sep 6, 2013 at 9:34 AM, Greg Stark st...@mit.edu wrote:


 But I wonder if we could just declare that that's not what the scale typmod
 does. That it's just a maximum scale but it's perfectly valid for NUMERIC
 data with lower scales to be stored in a column than the typmod says. In a
 way the current behaviour is like bpchar but it would be nice if it was more
 like varchar

I agree that this makes more sense than what is currently done.  But
are we going to break backwards compatibility to achieve it?  Do the
standards specify a behavior here?

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Jim Nasby

On 9/5/13 8:29 PM, Gavin Flower wrote:

How about a 'VACUUM AFTER' command (part of the 'BEGIN' transaction syntax?) 
that would:

 1. only be valid in a transaction
 2. initiate a vacuum after the current transaction completed
 3. defer any vacuum triggered due to other criteria

If the transaction was rolled back: then if there was a pending vacuum, due to 
other reasons, it would then be actioned.

On normal transaction completion, then if there was a pending vacuum it would 
be combined with the one in the transaction.

Still would need some method of ensuring any pending vacuum was done if the 
transaction hung, or took too long.


I *really* like the idea of BEGIN VACUUM AFTER, but I suspect it would be of 
very limited usefulness if it didn't account for currently running transactions.

I'm thinking we add a vacuum_after_xid field somewhere (pg_class), and instead 
of attempting to vacuum inside the backend at commit time the transaction would 
set that field to it's XID unless the field already had a newer XID in it.

autovac would then add all tables where vacuum_after_xid  the oldest running 
transaction to it's priority list.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-06 Thread Alvaro Herrera
Jim Nasby escribió:

 I also have a vague memory of some features that would benefit from
 being able to have typemod info available at a tuple level in a table,
 not just for the entire table. Unfortunately I can't remember why we
 wanted that... (Alvaro, do you recall? I'm pretty sure it's something
 we'd discussed at some point.)

I don't remember anything concrete either.  Maybe it was the E.164 type
stuff, where we wanted auxiliary info to dictate what kind of pattern to
{match on input, apply on output} ?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Extend namespace of valid guc names

2013-09-06 Thread Hannu Krosing
On 09/06/2013 08:48 PM, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2013-09-06 10:13:23 -0400, Tom Lane wrote:
 Well, if you feel an absolute compulsion to make them consistent, I'd
 go with making SET disallow creation of variables with names the file
 parser wouldn't recognize.  But why is it such a bad thing if SET can
 do that?
 Also, ALTER SYSTEM SET is going to need a similar restriction as well,
 otherwise the server won't restart although the GUCs pass validation...
 Well, sure, but I would think that ALTER SYSTEM SET should be constrained
 to only set known GUCs, not invent new ones on the fly.
What's the reasoning behind this ?

I was assuming that ALTER SYSTEM SET would allow all GUCs which
do not require restart which includes all newly invented ones.

Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hstore: Query speedups with Gin index

2013-09-06 Thread Oleg Bartunov
Blake,

I think it's better to implement this patch as a separate opclass, so users
will have option to choose indexing.

Oleg


On Tue, Sep 3, 2013 at 6:24 PM, Blake Smith blakesmi...@gmail.com wrote:

 Thanks for the feedback everyone. I've attached the patch that we are now
 running in production to service our hstore include queries. We rebuilt the
 index to account for the on-disk incompatibility. I've submitted the patch
 to commitfest here:
 https://commitfest.postgresql.org/action/patch_view?id=1203

 Michael: I don't have a formal benchmark, but several of our worst queries
 went from 10-20 seconds per query down to 50-400 ms. These are numbers
 we've seen when testing real production queries against our production
 dataset with real world access patterns.
 Oleg: Thanks for your thoughts on this change. As for the spgist / gin
 work you're doing, is there anything you need help with or are you still in
 the research phase? I'd love to help get something more robust merged into
 mainline if you think there's collaborative work to be done (even if it's
 only user testing).

 Thanks,

 Blake




 On Wed, Aug 28, 2013 at 12:40 PM, Andres Freund and...@2ndquadrant.comwrote:

 On 2013-08-28 13:31:22 -0400, Bruce Momjian wrote:
  On Sun, Aug 25, 2013 at 10:11:50PM -0400, Tom Lane wrote:
   Michael Paquier michael.paqu...@gmail.com writes:
On Thu, Aug 22, 2013 at 11:55 PM, Blake Smith 
 blakesmi...@gmail.com wrote:
The combined entry is used to support contains (@) queries, and
 the key
only item is used to support key contains (?) queries. This
 change seems
to help especially with hstore keys that have high cardinalities.
 Downsides
of this change is that it requires an index rebuild, and the index
 will be
larger in size.
  
Index rebuild would be a problem only for minor releases,
  
   That's completely false; people have expected major releases to be
   on-disk-compatible for several years now.  While there probably will
 be
   future releases in which we are willing to break storage
 compatibility,
   a contrib module doesn't get to dictate that.
  
   What might be a practical solution, especially if this isn't always a
   win (which seems likely given the index-bloat risk), is to make hstore
   offer two different GIN index opclasses, one that works the
 traditional
   way and one that works this way.
  
   Another thing that needs to be taken into account here is Oleg and
   Teodor's in-progress work on extending hstore:
   https://www.pgcon.org/2013/schedule/events/518.en.html
   I'm not sure if this patch would conflict with that at all, but it
   needs to be considered.
 
  We can disallow in-place upgrades for clusters that use certain contrib
  modules --- we have done that in the past.

 But that really cannot be acceptable for hstore. The probably most
 widely used extension there is.

 Greetings,

 Andres Freund

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services




 --
 Blake Smith
 http://blakesmith.me
 @blakesmith


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




Re: [HACKERS] regression tests

2013-09-06 Thread Robert Haas
On Fri, Sep 6, 2013 at 1:34 PM, Jeremy Harris j...@wizmail.org wrote:
 What's an index-hash operation?

 Ones that hit tuplesort_begin_index_hash()

Oh.  Well, it looks to me like that function can only get called when
building a hash index.  Specifically, according to the comment in
hashbuild(), a hash index projected to be larger than shared_buffers.
The regression tests are generally designed to work on small amounts
of data since they need to run quickly, so this isn't too surprising.
Hash indexes are a somewhat underwhelming feature anyway, since btrees
typically perform as well or better, and since hash indexes are not
WAL-logged and therefore can be corrupted on a crash.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Extend namespace of valid guc names

2013-09-06 Thread Tom Lane
Hannu Krosing ha...@2ndquadrant.com writes:
 On 09/06/2013 08:48 PM, Tom Lane wrote:
 Well, sure, but I would think that ALTER SYSTEM SET should be constrained
 to only set known GUCs, not invent new ones on the fly.

 What's the reasoning behind this ?

If you don't know what a GUC is, you don't know what are valid values for
it, and thus you might write an illegal value into auto.conf (or whatever
we're calling it this week).  That could have consequences as bad as
failure to restart, should the DBA decide to preload the module defining
that GUC, which would then complain about the bad value during postmaster
start.

 I was assuming that ALTER SYSTEM SET would allow all GUCs which
 do not require restart which includes all newly invented ones.

I do not believe that the former need imply the latter, nor do I see a
strong use-case for allowing ALTER SYSTEM SET on session-local GUCs,
which is what any truly invented-on-the-fly GUCs would be.  The whole
business with session-local GUCs is pretty much a kluge anyway, which
we might want to retire or redefine someday; so I'd much prefer that
ALTER SYSTEM SET stayed out of it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix picksplit with nan values

2013-09-06 Thread Tom Lane
Alexander Korotkov aekorot...@gmail.com writes:
 PostGIS spotted that picksplit algorithm freezes in infinite loop when
 dealing with nan values. I discovered same bug is present in core
 opclasses. Attached patch fixes this issue interpreting nan as value
 greater than infinity like btree comparison function does.

Hm.  Good point, but it seems like some of these hunks are only taking
care of a subset of the possible combinations of input NaNs.  If you're
certain the other combinations are impossible, there should be code
comments explaining why.

BTW, as a stylistic matter, I think it sucks to write
  !float8_cmp_internal(x,y)
when what you mean is
  float8_cmp_internal(x,y) == 0
The ! syntax should pretty much only be used for boolean tests IMO.

I do recognize that there's a tradition of writing !ptr rather than
ptr == NULL, which I think is all right in most contexts, mainly
because returning a null pointer has an element of boolean yes-or-no-ness
to it.  When you're doing arithmetic comparisons, though, it's just
confusing.

I wrote another rant about this years ago in the context of complaining
about !strcmp tests; there was probably more detail in that, if you
care to look in the archives.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Andres Freund
On 2013-09-06 15:13:30 -0400, Bruce Momjian wrote:
 On Fri, Sep  6, 2013 at 01:01:59PM -0400, Bruce Momjian wrote:
  This December 2012 thread by Andrew Dunstan shows he wasn't aware that a
  manual VACUUM was required for index-only scans.  That thread ended with
  us realizing that pg_upgrade's ANALYZE runs will populate
  pg_class.relallvisible.
  
  What I didn't see in that thread is an analysis of what cases are going
  to require manual vacuum, and I have seen no work in 9.3 to improve
  that.  I don't even see it on the TODO list.
 
 OK, let's start the discussion then.  I have added a TODO list:
 
   Improve setting of visibility map bits for read-only and insert-only 
 workloads
 
 So, what should trigger an auto-vacuum vacuum for these workloads? 
 Rather than activity, which is what normally drives autovacuum, it is
 lack of activity that should drive it, combined with a high VM cleared
 bit percentage.
 
 It seems we can use these statistics values:
 
n_tup_ins   | bigint   
n_tup_upd   | bigint   
n_tup_del   | bigint   
n_tup_hot_upd   | bigint   
n_live_tup  | bigint   
n_dead_tup  | bigint   
n_mod_since_analyze | bigint   
last_vacuum | timestamp with time zone 
last_autovacuum | timestamp with time zone 
 
 Particilarly last_vacuum and last_autovacuum can tell us the last time
 of vacuum.  If the n_tup_upd/n_tup_del counts are low, and the  VM set
 bit count is low, it might need vacuuming, though inserts into existing
 pages would complicate that.

I wonder if we shouldn't trigger most vacuums (not analyze!) via unset
fsm bits. Perhaps combined with keeping track of RecentGlobalXmin to
make sure we're not repeatedly checking for work that cannot yet be
done.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Extend namespace of valid guc names

2013-09-06 Thread Andres Freund
On 2013-09-06 14:48:33 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-09-06 10:13:23 -0400, Tom Lane wrote:
  Well, if you feel an absolute compulsion to make them consistent, I'd
  go with making SET disallow creation of variables with names the file
  parser wouldn't recognize.  But why is it such a bad thing if SET can
  do that?
 
  Also, ALTER SYSTEM SET is going to need a similar restriction as well,
  otherwise the server won't restart although the GUCs pass validation...
 
 Well, sure, but I would think that ALTER SYSTEM SET should be constrained
 to only set known GUCs, not invent new ones on the fly.

Hm. That sounds inconvenient to me. Consider something like configuring
the system to use auto_explain henceforth.
ALTER SYSTEM SET shared_preload_libraries = 'auto_explain';
ALTER SYSTEM SET auto_explain.log_min_duration = 100;

It seems weird to forbid doing that and requiring a manual LOAD when we
don't do so for normal SETs. I can live with the restriction if we
decide it's a good idea, I just wouldn't appreciate it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Extend namespace of valid guc names

2013-09-06 Thread Robert Haas
On Fri, Sep 6, 2013 at 6:31 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-06 14:48:33 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-09-06 10:13:23 -0400, Tom Lane wrote:
  Well, if you feel an absolute compulsion to make them consistent, I'd
  go with making SET disallow creation of variables with names the file
  parser wouldn't recognize.  But why is it such a bad thing if SET can
  do that?

  Also, ALTER SYSTEM SET is going to need a similar restriction as well,
  otherwise the server won't restart although the GUCs pass validation...

 Well, sure, but I would think that ALTER SYSTEM SET should be constrained
 to only set known GUCs, not invent new ones on the fly.

 Hm. That sounds inconvenient to me. Consider something like configuring
 the system to use auto_explain henceforth.
 ALTER SYSTEM SET shared_preload_libraries = 'auto_explain';
 ALTER SYSTEM SET auto_explain.log_min_duration = 100;

 It seems weird to forbid doing that and requiring a manual LOAD when we
 don't do so for normal SETs. I can live with the restriction if we
 decide it's a good idea, I just wouldn't appreciate it.

I'm with Tom on this one: I think this will save more pain than it causes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Custom Plan node

2013-09-06 Thread Robert Haas
On Fri, Sep 6, 2013 at 4:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 The attached patch adds a new plan node type; CustomPlan that enables
 extensions to get control during query execution, via registered callbacks.

 TBH, I think this is really an exercise in building useless mechanism.
 I don't believe that any actually *interesting* new types of plan node can
 be inserted into a query plan without invasive changes to the planner, and
 so it's a bit pointless to set up hooks whereby you can avoid touching any
 source code in the executor.

I find this a somewhat depressing response.  Didn't we discuss this
exact design at the developer meeting in Ottawa?  I thought it sounded
reasonable to you then, or at least I don't remember you panning it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Valgrind Memcheck support

2013-09-06 Thread Greg Stark
On Sun, Jun 9, 2013 at 10:25 PM, Noah Misch n...@leadboat.com wrote:

 - Test recovery, such as by running a streaming replica under Memcheck
 while
   the primary runs make installcheck-world.


In general we need a lot more testing on the recovery code.


  - Memcheck has support for detecting leaks.  I have not explored that
 side at
   all, always passing --leak-check=no.  We could add support for freeing
   everything at process exit, thereby making the leak detection
 meaningful.


I think this is missing the type of leaks we actually care about. The way
palloc works we can be virtually certain that if we did that we wouldn't
have any leaks. All it would detect are the random one-off mallocs we know
very well are there.

The problems we've had with leaks in the past are invariably things
allocated at the wrong memory context. Things that can grow for every row
processed but are stored per-query or for every query processed but stored
per-sesson. To detect that will requires more of a heuristic where when a
child memory context is reset any parent context growth is logged.

-- 
greg


[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-06 Thread Robert Haas
On Fri, Sep 6, 2013 at 2:34 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
 On 09/06/2013 07:57 PM, Robert Haas wrote:
 On Fri, Sep 6, 2013 at 12:34 PM, Greg Stark st...@mit.edu wrote:
 But I wonder if we could just declare that that's not what the scale typmod
 does. That it's just a maximum scale but it's perfectly valid for NUMERIC
 data with lower scales to be stored in a column than the typmod says. In a
 way the current behaviour is like bpchar but it would be nice if it was more
 like varchar
 Sure, but the point is that 5. is not the same as 5.000 today.  If
 you start whacking this around you'll be changing that behavior, I
 think.

 So we already get it wrong by rewriting ?

Ah, no, I don't think so.  If you have 5.0 and lower
the scale, it'll truncate off some of those zeroes to make it fit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Custom Plan node

2013-09-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I find this a somewhat depressing response.  Didn't we discuss this
 exact design at the developer meeting in Ottawa?  I thought it sounded
 reasonable to you then, or at least I don't remember you panning it.

What I recall saying is that I didn't see how the planner side of it would
work ... and I still don't see that.  I'd be okay with committing
executor-side fixes only if we had a vision of where we'd go on the
planner side; but this patch doesn't offer any path forward there.

This is not unlike the FDW stuff, where getting a reasonable set of
planner APIs in place was by far the hardest part (and isn't really done
even yet, since you still can't do remote joins or remote aggregation in
any reasonable fashion).  But you can do simple stuff reasonably simply,
without reimplementing all of the planner along the way --- and I think
we should look for some equivalent level of usefulness from this before
we commit it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Valgrind Memcheck support

2013-09-06 Thread Noah Misch
On Wed, Aug 28, 2013 at 10:30:34PM -0400, Noah Misch wrote:
 On Wed, Aug 28, 2013 at 03:16:14PM +0200, Andres Freund wrote:
  I vote for just removing that VALGRIND_PRINTF - it doesn't give you
  anything you cannot easily achieve otherwise...

Done.

 I'd like to see a buildfarm member running make installcheck under Valgrind,
 so I'd like the code to fit the needs thereof without patching beyond
 pg_config_manual.h.  Perhaps having the buildfarm member do valgrind postgres
 --log-statement=all 2combined-logfile is good enough.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Andres Freund
On 2013-09-06 13:01:59 -0400, Bruce Momjian wrote:
 On Fri, Sep  6, 2013 at 06:36:47PM +0200, Andres Freund wrote:
  On 2013-09-06 12:30:56 -0400, Bruce Momjian wrote:
I am not sure I understand this though. What would be the point to go
and set all visible and not do the rest of the vacuuming work?
   
I think triggering vacuuming by scanning the visibility map for the
number of unset bits and use that as another trigger is a good idea. The
vm should ensure we're not doing superflous work.
  
   Yes, I think it might be hard to justify a separate VM-set-only scan of
   the table.  If you are already reading the table, and already checking
   to see if you can set the VM bit, I am not sure why you would not also
   remove old rows, especially since removing those rows might be necessary
   to allow setting VM bits.
  
  Yep. Although adding the table back into the fsm will lead to it being
  used for new writes again...
 
 You mean adding _pages_ back into the table's FSM?  Yes, that is going
 to cause those pages to get dirty, but it is better than expanding the
 table size.  I don't see why you would not update the FSM.

You're right, we should add them, I wasn't really questioning that. I
was, quietly so you couldn't hear it, wondering whether we should
priorize the target buffer selection differently.

   Another problem I thought of is that while automatic vacuuming only
   happens with high update/delete load, index-only scans are best on
   mostly non-write tables, so we have bad behavior where the ideal case
   (static data) doesn't get vm-bits set, while update/delete has the
   vm-bits set, but then cleared as more update/deletes occur.
  
  Well, older tables will get vacuumed due to vacuum_freeze_table_age. So
  at some point they will get vacuumed and the vm bits will get set.
 
 Hmm, good point.  That would help with an insert-only workload, as long
 as you can chew through 200M transactions.   That doesn't help with a
 read-only workload as we don't consume transction IDs for SELECT.

It's even 150mio. For the other workloads, its pretty common wisdom to
VACUUM after bulk data loading. I think we even document that.

   The more I look at this the worse it appears.   How has this gone
   unaddressed for over a year?
  
  It's been discussed several times including during the introduction of
  the feature. I am a bit surprised about the panickey tone in this
  thread.
 
 This December 2012 thread by Andrew Dunstan shows he wasn't aware that a
 manual VACUUM was required for index-only scans.  That thread ended with
 us realizing that pg_upgrade's ANALYZE runs will populate
 pg_class.relallvisible.

 What I didn't see in that thread is an analysis of what cases are going
 to require manual vacuum, and I have seen no work in 9.3 to improve
 that.  I don't even see it on the TODO list.

Yes, TODO maybe missing.

 It bothers me that we spent time developing index-only scans, but have
 significant workloads where it doesn't work, no efforts on improving it,
 and no documentation on manual workarounds.  I have not even seen
 discussion on how we are going to improve this.  I would like to have
 that discussion now.

It's not like the feature is useless in this case. You just need to
perform an extra operation to activate it. I am not saying we shouldn't
document it better, but it seriously worries me that a useful feature is
depicted as useless because it requires a manual VACUUM in some
circumstances.

  Yes, we need to overhaul the way vacuum works (to reduce the frequency
  of rewriting stuff repeatedly) and the way it's triggered (priorization,
  more trigger conditions) but all these are known things and just need
  somebody with time.

 Based on the work needed to improve this, I would have thought someone
 would have taken this on during 9.3 development.

There has been some discussion about it indirectly via the freezing
stuff. That also would require more advanced scheduling.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-06 Thread Jim Nasby

On 9/5/13 10:47 PM, Noah Misch wrote:

On Thu, Sep 05, 2013 at 10:41:25AM -0700, Jeff Janes wrote:

In order to avoid the rewrite, the code would have to be changed to
look up the column definition and if it specifies the scale, then
ignore the per-row n_header, and look at the n_header only if the
column is NUMERIC with no precision or scale.  That should
conceptually be possible, but I don't know how hard it would be to
implement--it sounds pretty invasive to me.  Then if the column was
altered from NUMERIC with scale to be a plain NUMERIC, it would have
to rewrite the table to enforce the row-wise scale to match the old
column-wise scale.  Where as now that alter doesn't need a re-write.
I don't know if this would be an overall gain or not.


Invasive indeed.  The type-supplementary data would need to reach essentially
everywhere we now convey a type OID.  Compare the invasiveness of adding
collation support.  However, this is not the first time it would have been
useful.  We currently store a type OID in every array and composite datum.
That's wasteful and would be unnecessary if we reliably marshalled similar
information to all the code needing it.  Given a few more use cases, the
effort would perhaps start to look credible relative to the benefits.


Aren't there cases where PL/pgsql gets hosed by this? Or even functions in 
general?

I also have a vague memory of some features that would benefit from being able 
to have typemod info available at a tuple level in a table, not just for the 
entire table. Unfortunately I can't remember why we wanted that... (Alvaro, do 
you recall? I'm pretty sure it's something we'd discussed at some point.)
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Bruce Momjian
On Fri, Sep  6, 2013 at 01:01:59PM -0400, Bruce Momjian wrote:
 This December 2012 thread by Andrew Dunstan shows he wasn't aware that a
 manual VACUUM was required for index-only scans.  That thread ended with
 us realizing that pg_upgrade's ANALYZE runs will populate
 pg_class.relallvisible.
 
 What I didn't see in that thread is an analysis of what cases are going
 to require manual vacuum, and I have seen no work in 9.3 to improve
 that.  I don't even see it on the TODO list.

OK, let's start the discussion then.  I have added a TODO list:

Improve setting of visibility map bits for read-only and insert-only 
workloads

So, what should trigger an auto-vacuum vacuum for these workloads? 
Rather than activity, which is what normally drives autovacuum, it is
lack of activity that should drive it, combined with a high VM cleared
bit percentage.

It seems we can use these statistics values:

 n_tup_ins   | bigint   
 n_tup_upd   | bigint   
 n_tup_del   | bigint   
 n_tup_hot_upd   | bigint   
 n_live_tup  | bigint   
 n_dead_tup  | bigint   
 n_mod_since_analyze | bigint   
 last_vacuum | timestamp with time zone 
 last_autovacuum | timestamp with time zone 

Particilarly last_vacuum and last_autovacuum can tell us the last time
of vacuum.  If the n_tup_upd/n_tup_del counts are low, and the  VM set
bit count is low, it might need vacuuming, though inserts into existing
pages would complicate that.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hstore: Query speedups with Gin index

2013-09-06 Thread Peter Eisentraut
On 9/5/13 2:42 PM, Blake Smith wrote:
 Thanks for checking the tests. I wasn't able to duplicate your test
 results. Did you run the hstore regression tests with the revised patch
 I attached in the thread?  Attached is the output I got with the latest
 patch applied.

See
http://pgci.eisentraut.org/jenkins/job/postgresql_commitfest_world/46/consoleFull

Perhaps you didn't build with --enable-cassert?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Bruce Momjian
On Sat, Sep  7, 2013 at 12:26:23AM +0200, Andres Freund wrote:
  So, what should trigger an auto-vacuum vacuum for these workloads? 
  Rather than activity, which is what normally drives autovacuum, it is
  lack of activity that should drive it, combined with a high VM cleared
  bit percentage.
  
  It seems we can use these statistics values:
  
   n_tup_ins   | bigint   
   n_tup_upd   | bigint   
   n_tup_del   | bigint   
   n_tup_hot_upd   | bigint   
   n_live_tup  | bigint   
   n_dead_tup  | bigint   
   n_mod_since_analyze | bigint   
   last_vacuum | timestamp with time zone 
   last_autovacuum | timestamp with time zone 
  
  Particilarly last_vacuum and last_autovacuum can tell us the last time
  of vacuum.  If the n_tup_upd/n_tup_del counts are low, and the  VM set
  bit count is low, it might need vacuuming, though inserts into existing
  pages would complicate that.
 
 I wonder if we shouldn't trigger most vacuums (not analyze!) via unset
 fsm bits. Perhaps combined with keeping track of RecentGlobalXmin to

Fsm bits?  FSM tracks the free space on each page.  How does that help?

 make sure we're not repeatedly checking for work that cannot yet be
 done.

The idea of using RecentGlobalXmin to see how much _work_ has happened
since the last vacuum is interesting, but it doesn't handle read-only
transactions;  I am not sure how they can be tracked.  You make a good
point that 5 minutes passing is meaningless --- you really want to know
how many transactions have completed.  Unfortunately, our virtual
transactions make that hard to compute.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Custom Plan node

2013-09-06 Thread Tom Lane
Kohei KaiGai kai...@kaigai.gr.jp writes:
 The attached patch adds a new plan node type; CustomPlan that enables
 extensions to get control during query execution, via registered callbacks.

TBH, I think this is really an exercise in building useless mechanism.
I don't believe that any actually *interesting* new types of plan node can
be inserted into a query plan without invasive changes to the planner, and
so it's a bit pointless to set up hooks whereby you can avoid touching any
source code in the executor.

 ... Extension will put its local code on the planner_hook
 to reference and manipulate PlannedStmt object.

That is hardly a credible design for doing anything interesting with
custom plans.  It's got exactly the same problem you are complaining about
for the executor, ie you have to replace the whole of the planner if you
try to do things that way.

One other point here is: if you need more than one kind of custom plan
node, how will you tell what's what?  I doubt you can completely eliminate
the need for IsA-style tests, especially in the planner area.  The sample
contrib module here already exposes the failure mode I'm worried about:
it falls down as soon as it sees a plan node type it doesn't know.  If you
could show me how this would work together with some other extension
that's also adding custom plan nodes of its own, then I might think you
had something.

In the same vein, the patch fails to provide credible behavior for
ExecSupportsMarkRestore, ExecMaterializesOutput, ExplainPreScanNode,
search_plan_tree, and probably some other places that need to know
about all possible plan node types.

Even if you'd covered every one of those bases, you've still only got
support for generic plan nodes having no particularly unique properties.
As an example of what I'm thinking about here, NestLoop, which might be
considered the most vanilla of all join plan nodes, actually has a lot of
specialized infrastructure in both the planner and the executor to support
its ability to pass outer-relation values into the inner-relation scan.
I think that as soon as you try to do anything of real interest with
custom plan nodes, you'll be finding you need special-purpose additions
that no set of generic hooks could reasonably cover.

In short, I don't understand or agree with this idea that major changes
should be implementable without touching any of the core code in any way.
This is open source --- if you need a modified version then modify it.
I used to build systems that needed hook-style extensibility because the
core code was burned into ROM; but that's not what we're dealing with
today, and I don't really see the argument for sacrificing readability
and performance by putting hooks everywhere, especially in places with
vague, ever-changing API contracts.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Bruce Momjian
On Fri, Sep  6, 2013 at 06:36:47PM +0200, Andres Freund wrote:
 On 2013-09-06 12:30:56 -0400, Bruce Momjian wrote:
   I am not sure I understand this though. What would be the point to go
   and set all visible and not do the rest of the vacuuming work?
  
   I think triggering vacuuming by scanning the visibility map for the
   number of unset bits and use that as another trigger is a good idea. The
   vm should ensure we're not doing superflous work.
 
  Yes, I think it might be hard to justify a separate VM-set-only scan of
  the table.  If you are already reading the table, and already checking
  to see if you can set the VM bit, I am not sure why you would not also
  remove old rows, especially since removing those rows might be necessary
  to allow setting VM bits.
 
 Yep. Although adding the table back into the fsm will lead to it being
 used for new writes again...

You mean adding _pages_ back into the table's FSM?  Yes, that is going
to cause those pages to get dirty, but it is better than expanding the
table size.  I don't see why you would not update the FSM.

  Another problem I thought of is that while automatic vacuuming only
  happens with high update/delete load, index-only scans are best on
  mostly non-write tables, so we have bad behavior where the ideal case
  (static data) doesn't get vm-bits set, while update/delete has the
  vm-bits set, but then cleared as more update/deletes occur.
 
 Well, older tables will get vacuumed due to vacuum_freeze_table_age. So
 at some point they will get vacuumed and the vm bits will get set.

Hmm, good point.  That would help with an insert-only workload, as long
as you can chew through 200M transactions.   That doesn't help with a
read-only workload as we don't consume transction IDs for SELECT.

  The more I look at this the worse it appears.   How has this gone
  unaddressed for over a year?
 
 It's been discussed several times including during the introduction of
 the feature. I am a bit surprised about the panickey tone in this
 thread.

This December 2012 thread by Andrew Dunstan shows he wasn't aware that a
manual VACUUM was required for index-only scans.  That thread ended with
us realizing that pg_upgrade's ANALYZE runs will populate
pg_class.relallvisible.

What I didn't see in that thread is an analysis of what cases are going
to require manual vacuum, and I have seen no work in 9.3 to improve
that.  I don't even see it on the TODO list.

It bothers me that we spent time developing index-only scans, but have
significant workloads where it doesn't work, no efforts on improving it,
and no documentation on manual workarounds.  I have not even seen
discussion on how we are going to improve this.  I would like to have
that discussion now.

 Yes, we need to overhaul the way vacuum works (to reduce the frequency
 of rewriting stuff repeatedly) and the way it's triggered (priorization,
 more trigger conditions) but all these are known things and just need
 somebody with time.

Based on the work needed to improve this, I would have thought someone
would have taken this on during 9.3 development.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.4] row level security

2013-09-06 Thread Peter Eisentraut
On Wed, 2013-08-28 at 13:56 +0200, Kohei KaiGai wrote:
 The attached patch fixed the portion I was pointed out, so its overall
 design has not been changed so much.

The documentation doesn't build:

openjade:catalogs.sgml:237:28:X: reference to non-existent ID 
CATALOG-PG-ROWLEVELSEC




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ECPG FETCH readahead

2013-09-06 Thread Peter Eisentraut
On Wed, 2013-09-04 at 10:06 +0200, Boszormenyi Zoltan wrote:
 2013-08-17 12:08 keltezéssel, Boszormenyi Zoltan írta:
 
  I have put the broken up patchset into a GIT tree of mine at GitHub:
  https://github.com/zboszor/ecpg-readahead/
  but the huge compressed patch is also attached for reference.
 
 I merged current PG GIT HEAD in the above tree and fixed a merge conflict
 caused by commit 673b527534893a4a8adb3cdef52fc645c13598ce
 
 The huge patch is attached for reference.

The documentation doesn't build:

openjade:ecpg.sgml:478:8:E: end tag for LITERAL omitted, but OMITTAG NO was 
specified
openjade:ecpg.sgml:477:40: start tag was here
openjade:ecpg.sgml:478:8:E: end tag for LITERAL omitted, but OMITTAG NO was 
specified
openjade:ecpg.sgml:477:20: start tag was here
openjade:ecpg.sgml:478:8:E: end tag for LITERAL omitted, but OMITTAG NO was 
specified
openjade:ecpg.sgml:473:81: start tag was here
openjade:ecpg.sgml:478:8:E: end tag for LITERAL omitted, but OMITTAG NO was 
specified
openjade:ecpg.sgml:473:56: start tag was here




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Extend namespace of valid guc names

2013-09-06 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-09-06 10:13:23 -0400, Tom Lane wrote:
 Well, if you feel an absolute compulsion to make them consistent, I'd
 go with making SET disallow creation of variables with names the file
 parser wouldn't recognize.  But why is it such a bad thing if SET can
 do that?

 Also, ALTER SYSTEM SET is going to need a similar restriction as well,
 otherwise the server won't restart although the GUCs pass validation...

Well, sure, but I would think that ALTER SYSTEM SET should be constrained
to only set known GUCs, not invent new ones on the fly.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII

2013-09-06 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 What would be nicer would be to display the C define, EINVAL, EPERM, etc.
 Afaik there's no portable way to do that though. I suppose we could just
 have a small array or hash table of all the errors we know about and look
 it up.

Yeah, I was just thinking the same thing.  We could do

switch (errno)
{
case EINVAL: str = EINVAL; break;
case ENOENT: str = ENOENT; break;
...
#ifdef EFOOBAR
case EFOOBAR: str = EFOOBAR; break;
#endif
...

for all the common or even less-common names, and only fall back on
printing a numeric value if it's something really unusual.

But I still maintain that we should only do this if we can't get a useful
string out of strerror().  There isn't any way to cram this information
into the current usage of %m without doing damage to the readability and
translatability of the string.  Our style  translatability guidelines
specifically recommend against assembling messages out of fragments,
and also against sticking in parenthetical additions.

I suppose we could think about inventing another error field rather
than damaging the readability of the primary message string, ie teach
elog that if %m is used it should emit an additional line along the lines
of
ERRNO:  EINVAL
However the cost of adding a new column to CSV log format might exceed its
value.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange IS NULL behaviour

2013-09-06 Thread Bruce Momjian
On Thu, Sep  5, 2013 at 05:06:41PM -0400, Bruce Momjian wrote:
 Another possible fix would be to avoid the IS NULL value optimizer
 expansion if a ROW construct is inside a ROW().  I have attached a patch
 that does this for review.

Having received no replies, do people perfer this version of the patch
that just punts nested ROW IS NULL testing to execQual.c?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange IS NULL behaviour

2013-09-06 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Thu, Sep  5, 2013 at 05:06:41PM -0400, Bruce Momjian wrote:
 Another possible fix would be to avoid the IS NULL value optimizer
 expansion if a ROW construct is inside a ROW().  I have attached a patch
 that does this for review.

 Having received no replies, do people perfer this version of the patch
 that just punts nested ROW IS NULL testing to execQual.c?

For some reason I read your previous message as saying you were willing to
wait for considered reviews this time.  If not, I'll just write a blanket
-1 for any version of this patch.

I don't think you've shown that this is more spec-compliant than what
we had before, nor that you've made all the relevant code (execQual,
eval_const_expressions, column NOT NULL constraints, plpgsql variable
NOT NULL constraints, maybe other places) mutually consistent.

I'm not a fan of incremental improvements in application-visible
semantics: if we change this repeatedly over several releases, that's an
application author's worst nightmare, because he'll have to try to work
with multiple different behaviors.  We need to change this *once* and get
it right.  You haven't proven that this is now right where it wasn't
before, and the patch is certainly not so obviously right that it should
go in without considered review.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange IS NULL behaviour

2013-09-06 Thread Bruce Momjian
On Fri, Sep  6, 2013 at 11:00:24PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Thu, Sep  5, 2013 at 05:06:41PM -0400, Bruce Momjian wrote:
  Another possible fix would be to avoid the IS NULL value optimizer
  expansion if a ROW construct is inside a ROW().  I have attached a patch
  that does this for review.
 
  Having received no replies, do people perfer this version of the patch
  that just punts nested ROW IS NULL testing to execQual.c?
 
 For some reason I read your previous message as saying you were willing to
 wait for considered reviews this time.  If not, I'll just write a blanket
 -1 for any version of this patch.

Are you saying people will comment later?  I wasn't clear that was the
plan.  I can certainly wait.

 I don't think you've shown that this is more spec-compliant than what
 we had before, nor that you've made all the relevant code (execQual,
 eval_const_expressions, column NOT NULL constraints, plpgsql variable
 NOT NULL constraints, maybe other places) mutually consistent.

I believe all the other places (execQual, plpgsql variables) all treat
embedded row in rows as non-null, but I don't even know how to test all
the place.  Can someone do that?

 I'm not a fan of incremental improvements in application-visible
 semantics: if we change this repeatedly over several releases, that's an
 application author's worst nightmare, because he'll have to try to work
 with multiple different behaviors.  We need to change this *once* and get
 it right.  You haven't proven that this is now right where it wasn't
 before, and the patch is certainly not so obviously right that it should
 go in without considered review.

Yes, we have to be sure to get this right.  However, I am not able to
test all the places you have mentioned, so unless someone else finds
this important enough to work on, I will just document it as a TODO and
close it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Extend namespace of valid guc names

2013-09-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Sep 6, 2013 at 6:31 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-06 14:48:33 -0400, Tom Lane wrote:
 Well, sure, but I would think that ALTER SYSTEM SET should be constrained
 to only set known GUCs, not invent new ones on the fly.

 Hm. That sounds inconvenient to me. Consider something like configuring
 the system to use auto_explain henceforth.
 ALTER SYSTEM SET shared_preload_libraries = 'auto_explain';
 ALTER SYSTEM SET auto_explain.log_min_duration = 100;

 I'm with Tom on this one: I think this will save more pain than it causes.

So far as that example goes, I'm not suggesting that ALTER SYSTEM SET
auto_explain.log_min_duration should be forbidden altogether.  I *am*
saying that it should only be allowed when auto_explain is loaded in the
current session, so that we can find out whether the proposed value is
allowed by the module that defines the GUC.

Of course, this is not completely bulletproof, since it will fail if the
defining module changes its mind from time to time about what are valid
values of the GUC :-(.  But promising to restart in the face of that kind
of inconsistency is hopeless.  On the other hand, not checking at all is
just asking for failures.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Custom Plan node

2013-09-06 Thread Kohei KaiGai
2013/9/7 Tom Lane t...@sss.pgh.pa.us:
 Robert Haas robertmh...@gmail.com writes:
 I find this a somewhat depressing response.  Didn't we discuss this
 exact design at the developer meeting in Ottawa?  I thought it sounded
 reasonable to you then, or at least I don't remember you panning it.

 What I recall saying is that I didn't see how the planner side of it would
 work ... and I still don't see that.  I'd be okay with committing
 executor-side fixes only if we had a vision of where we'd go on the
 planner side; but this patch doesn't offer any path forward there.

The reason why this patch stick on executor-side is we concluded
not to patch the planner code from the beginning in Ottawa because
of its complexity.
I'd also like to agree that planner support for custom plan is helpful
to construct better execution plan, however, it also make sense even
if this feature begins a functionality that offers a way to arrange a plan
tree being already constructed.

Anyway, let me investigate what's kind of APIs to be added for planner
stage also.

 This is not unlike the FDW stuff, where getting a reasonable set of
 planner APIs in place was by far the hardest part (and isn't really done
 even yet, since you still can't do remote joins or remote aggregation in
 any reasonable fashion).  But you can do simple stuff reasonably simply,
 without reimplementing all of the planner along the way --- and I think
 we should look for some equivalent level of usefulness from this before
 we commit it.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New statistics for WAL buffer dirty writes

2013-09-06 Thread Satoshi Nagayasu

Hi,

The revised patch for wal buffer statistics is attached.
A test script is also attached. Please take a look.

Regards,

(2013/07/19 7:49), Satoshi Nagayasu wrote:

Will revise and re-resubmit for the next CF.

Regards,

2013/07/19 1:06, Alvaro Herrera wrote:


What happened to this patch?  We were waiting on an updated version from
you.


Satoshi Nagayasu wrote:

(2012/12/10 3:06), Tomas Vondra wrote:

On 29.10.2012 04:58, Satoshi Nagayasu wrote:

2012/10/24 1:12, Alvaro Herrera wrote:

Satoshi Nagayasu escribi�:


With this patch, walwriter process and each backend process
would sum up dirty writes, and send it to the stat collector.
So, the value could be saved in the stat file, and could be
kept on restarting.

The statistics could be retreive with using
pg_stat_get_xlog_dirty_writes() function, and could be reset
with calling pg_stat_reset_shared('walwriter').

Now, I have one concern.

The reset time could be captured in
globalStats.stat_reset_timestamp,
but this value is the same with the bgwriter one.

So, once pg_stat_reset_shared('walwriter') is called,
stats_reset column in pg_stat_bgwriter does represent
the reset time for walwriter, not for bgwriter.

How should we handle this?  Should we split this value?
And should we have new system view for walwriter?


I think the answer to the two last questions is yes.  It doesn't
seem to
make sense, to me, to have a single reset timings for what are
effectively two separate things.

Please submit an updated patch to next CF.  I'm marking this one
returned with feedback.  Thanks.



I attached the latest one, which splits the reset_time
for bgwriter and walwriter, and provides new system view,
called pg_stat_walwriter, to show the dirty write counter
and the reset time.


I've done a quick review of the v4 patch:


Thanks for the review, and sorry for my delayed response.


1) applies fine on HEAD, compiles fine

2) make installcheck fails because of a difference in the 'rules'
test suite (there's a new view pg_stat_walwriter - see the
attached patch for a fixed version or expected/rules.out)


Ah, I forgot about the regression test. I will fix it. Thanks.


3) I do agree with Alvaro that using the same struct for two separate
components (bgwriter and walwriter) seems a bit awkward. For
example
you need to have two separate stat_reset fields, the reset code
becomes much more verbose (because you need to list individual
fields) etc.

So I'd vote to either split this into two structures or keeping it
as a single structure (although with two views on top of it).


Ok, I will split it into two structs, PgStat_BgWriterGlobalStats and
PgStat_WalWriterGlobalStats, and will modify PgStat_GlobalStats to
hold those two structs in the stat collector.


4) Are there any other fields that might be interesting? Right now
there's just dirty_writes but I guess there are other values.
E.g.
how much data was actually written etc.?


AFAIK, I think those numbers can be obtained by calling
pg_current_xlog_insert_location() or pg_current_xlog_location(),
but if we need it, I will add it.

Regards,








--
Satoshi Nagayasu sn...@uptime.jp
Uptime Technologies, LLC. http://www.uptime.jp
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 23ebc11..cdced7f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1878,6 +1878,13 @@ include 'filename'
 results in most cases.
/para
 
+   para
+When you see pg_stat_walwriter.dirty_write, which means number
+of buffer flushing at buffer full, is continuously increasing
+in your running server, you may need to enlarge this buffer
+size.
+   /para
+
   /listitem
  /varlistentry
 
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4ec6981..15d9202 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -278,6 +278,14 @@ postgres: replaceableuser/ replaceabledatabase/ 
replaceablehost/ re
  /row
 
  row
+  
entrystructnamepg_stat_walwriter/indextermprimarypg_stat_walwriter/primary/indexterm/entry
+  entryOne row only, showing statistics about the wal writer
+   process's activity. See xref linkend=pg-stat-walwriter-view
+   for details.
+ /entry
+ /row
+
+ row
   
entrystructnamepg_stat_database/indextermprimarypg_stat_database/primary/indexterm/entry
   entryOne row per database, showing database-wide statistics. See
xref linkend=pg-stat-database-view for details.
@@ -735,6 +743,39 @@ postgres: replaceableuser/ replaceabledatabase/ 
replaceablehost/ re
single row, containing global data for the cluster.
   /para
 
+  table id=pg-stat-walwriter-view xreflabel=pg_stat_walwriter
+   titlestructnamepg_stat_walwriter/structname View/title
+
+   tgroup cols=3
+thead
+row
+  entryColumn/entry
+  entryType/entry
+  entryDescription/entry
+ /row
+/thead
+
+tbody

Re: [HACKERS] [PERFORM] encouraging index-only scans

2013-09-06 Thread Andres Freund
On 2013-09-06 20:29:08 -0400, Bruce Momjian wrote:
 On Sat, Sep  7, 2013 at 12:26:23AM +0200, Andres Freund wrote:
  I wonder if we shouldn't trigger most vacuums (not analyze!) via unset
  fsm bits. Perhaps combined with keeping track of RecentGlobalXmin to
 
 Fsm bits?  FSM tracks the free space on each page.  How does that
 help?

Err. I was way too tired when I wrote that. vm bits.

  make sure we're not repeatedly checking for work that cannot yet be
  done.

 The idea of using RecentGlobalXmin to see how much _work_ has happened
 since the last vacuum is interesting, but it doesn't handle read-only
 transactions;  I am not sure how they can be tracked.  You make a good
 point that 5 minutes passing is meaningless --- you really want to know
 how many transactions have completed.

So, what I was pondering went slightly into a different direction:

(lets ignore anti wraparound vacuum for now)

Currently we trigger autovacuums by the assumed number of dead
tuples. In the course of it's action it usually will find that it cannot
remove all dead rows and that it cannot mark everything as all
visible. That's because the xmin horizon hasn't advanced far enough. We
won't trigger another vacuum after that unless there are further dead
tuples in the relation...
One trick if we want to overcome that problem and that we do not handle
setting all visible nicely for INSERT only workloads would be to trigger
vacuum by the amount of pages that are not marked all visible in the vm.

The problem there is that repeatedly scanning a relation that's only 50%
visible where the rest cannot be marked all visible because of a
longrunning pg_dump obivously isn't a good idea. So we need something to
notify us when there's work to be done. Using elapsed time seems like a
bad idea because it doesn't adapt to changing workloads very well and
doesn't work nicely for different relations.

What I was thinking of was to keep track of the oldest xids on pages
that cannot be marked all visible. I haven't thought about the
statistics part much, but what if we binned the space between
[RecentGlobalXmin, -nextXid) into 10 bins and counted the number of
pages falling into each bin. Then after the vacuum finished we could
compute how far RecentGlobalXmin would have to progress to make another
vacuum worthwile by counting the number of pages from the lowest bin
upwards and use the bin's upper limit as the triggering xid.

Now, we'd definitely need to amend that scheme by something that handles
pages that are newly written to, but it seems something like that
wouldn't be too hard to implement and would make autovacuum more useful.

 Unfortunately, our virtual transactions make that hard to compute.

I don't think they pose too much of a complexity. We basically only have
to care about PGXACT-xmin here and virtual transactions don't change
the handling of that ...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange IS NULL behaviour

2013-09-06 Thread Andres Freund
On 2013-09-06 23:07:04 -0400, Bruce Momjian wrote:
 On Fri, Sep  6, 2013 at 11:00:24PM -0400, Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   On Thu, Sep  5, 2013 at 05:06:41PM -0400, Bruce Momjian wrote:
   Another possible fix would be to avoid the IS NULL value optimizer
   expansion if a ROW construct is inside a ROW().  I have attached a patch
   that does this for review.
  
   Having received no replies, do people perfer this version of the patch
   that just punts nested ROW IS NULL testing to execQual.c?
  
  For some reason I read your previous message as saying you were willing to
  wait for considered reviews this time.  If not, I'll just write a blanket
  -1 for any version of this patch.
 
 Are you saying people will comment later?  I wasn't clear that was the
 plan.  I can certainly wait.

You do realize mere mortals in the project frequently have to wait
*months* to get comments on their patches? Not getting any for less than
48h doesn't seem to be saying much.

Why don't you add the proposal to the commitfest?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers