[HACKERS] proposal: enhanced stack trace for PL - print param args

2017-01-14 Thread Pavel Stehule
Hi

In the last discussion related to PL/pgSQL, a weakness of the PL/pgSQL
stack trace was mentioned: the function parameters are not printed.

CREATE OR REPLACE FUNCTION public.foo(a double precision, b double
precision)
 RETURNS double precision
 LANGUAGE plpgsql
AS $function$
begin
  return 100/a;
end;
$function$

Current:

postgres=# select foo(0, 100);
ERROR:  division by zero
CONTEXT:  PL/pgSQL function foo(double precision) line 3 at RETURN

Proposed result:
postgres=# select foo(0, 100);
ERROR:  division by zero
CONTEXT:  PL/pgSQL function foo(double precision) line 3 at RETURN
ARGUMENTS: a=0, b=100

* only function parameters are printed - no local variables
* the line of arguments should be limited in length - X chars?
* the content of each variable should be limited - X chars? - maybe 40 chars

This feature can have an impact on performance, so it should be explicitly
enabled with some GUC - like extra_back_trace or similar. Probably, before
any call, the function parameters and the related output functions should
be copied to a safe memory context. It can also increase pressure on the
error memory context and possibly on log size.
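(A sketch of the kind of thing meant, hypothetical and simplified - all
names here, such as stack_trace_cxt and argoutfuncs, are illustrative, not
from any patch: at call time, render each argument with its type's output
function into a context that survives until error reporting, truncating
each value to the proposed limit:)

    /* copy argument values as truncated strings into a safe context */
    MemoryContext oldcxt = MemoryContextSwitchTo(stack_trace_cxt);
    int           i;

    for (i = 0; i < nargs; i++)
    {
        /* datum -> text using the type's output function */
        char *val = OidOutputFunctionCall(argoutfuncs[i], argvalues[i]);

        argstrings[i] = pnstrdup(val, 40);   /* cap at 40 chars */
        pfree(val);
    }
    MemoryContextSwitchTo(oldcxt);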

Is there interest in this feature? Comments, notes?

Regards

Pavel


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-14 Thread Michael Paquier
On Sat, Jan 14, 2017 at 10:02 PM, Gilles Darold wrote:
> On 13/01/2017 at 14:09, Michael Paquier wrote:
>> On Fri, Jan 13, 2017 at 5:48 PM, Gilles Darold wrote:
>>> On 13/01/2017 at 05:26, Michael Paquier wrote:
>>>> Surely the temporary file of current_logfiles should not be included
>>>> in base backups (look at basebackup.c). Documentation needs to reflect
>>>> that as well. Also, should we have current_logfiles in a base backup?
>>>> I would think no.
>>> Done but can't find any documentation about the file exclusion, do you
>>> have a pointer?
>> protocol.sgml, in the protocol-replication part. You could search for
>> the paragraph that contains postmaster.opts. There is no real harm in
>> including current_logfiles in base backups, but that's really in the
>> same bag as postmaster.opts or postmaster.pid, particularly if the log
>> file name has a timestamp.
>
> Thanks for the pointer. After reading this part I think it only
> concerns files that are critical at restore time. I still think that the
> file need not be removed automatically from the backup. If it is
> restored it has strictly no impact at all; it will be removed or
> overwritten at startup. We can let the user choose whether to remove it
> from the backup or not; the file can be a help in finding the latest log
> file related to a backup.

OK, no problem for me. I can see that your patch does the right thing
to ignore the current_logfiles.tmp. Could you please update
t/010_pg_basebackup.pl and add this file to the list of files filled
with DONOTCOPY?

>>>> pg_current_logfile() can be run by *any* user. We had better revoke
>>>> its access in system_views.sql!
>>> Why? There is no special information returned by this function other
>>> than the path, and that can be retrieved using show log_directory.
>> log_directory could be an absolute path, and we surely don't want to
>> let normal users have a look at it. For example, "show log_directory"
>> can only be seen by superusers. As Stephen Frost has been for a couple
>> of months (years?) on a holy war path against superuser checks in
>> system functions, I think that revoking access to this function is
>> the best thing to do. It is also easier to restrict first and relax
>> later; the reverse is harder to justify from a compatibility point of
>> view.
>
> Right, I've appended a REVOKE to the function in
> backend/catalog/system_views.sql; patch v21 reflects this change.

Thanks. That looks good.

+   {
+   /* Logging collector is not enabled. We don't know where messages are
+* logged.  Remove outdated file holding the current log filenames.
+*/
+   unlink(LOG_METAINFO_DATAFILE);
return 0
This comment format is not postgres-like.
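(For reference, the usual project style puts the comment markers on their
own lines; the same comment reformatted, assuming the same wording:)

    /*
     * The logging collector is not enabled.  We don't know where messages
     * are logged.  Remove the outdated file holding the current log file
     * names.
     */
    unlink(LOG_METAINFO_DATAFILE);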

I think that's all I have for this patch.
-- 
Michael




Re: [HACKERS] Support for pg_receivexlog --format=plain|tar

2017-01-14 Thread Michael Paquier
On Sun, Jan 15, 2017 at 1:31 AM, Magnus Hagander  wrote:
>
>
> On Tue, Jan 10, 2017 at 3:01 AM, Michael Paquier wrote:
>> Based on some analysis, it is enough to look at the last 4 bytes of
>> the compressed file to get the size of the output data with a single
>> call to lseek() and then read(). So, as there is a simple way to do
>> things, and that's far cheaper than decompressing perhaps hundreds of
>> segments, I'd rather do it this way. Attached is the implementation.
>> This code is using 2 booleans for the 4 states of the file names: full
>> non-compressed, partial non-compressed, full compressed and partial
>> compressed. This keeps the last check of FindStreamingStart() simpler,
>> though it has become quite fashionable lately to use an enum for such
>> things :D
>
>
> I think you need to document that analysis at least in a code comment. I
> assume this is in reference to the ISIZE member of the gzip format?

Good question. I am not sure on this one.
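(For reference, the last four bytes of a gzip file are the ISIZE trailer
field defined by RFC 1952: the size of the uncompressed input modulo 2^32,
stored little-endian. A minimal sketch of such a lookup, as an illustration
of the idea rather than code from the patch:)

    #include <fcntl.h>
    #include <stdint.h>
    #include <unistd.h>

    /*
     * Return the uncompressed size recorded in a gzip file's trailer
     * (the ISIZE field, i.e. input size modulo 2^32), or -1 on error.
     */
    static int64_t
    gzip_uncompressed_size(const char *path)
    {
        unsigned char buf[4];
        int           fd = open(path, O_RDONLY);

        if (fd < 0)
            return -1;
        if (lseek(fd, -4, SEEK_END) < 0 || read(fd, buf, 4) != 4)
        {
            close(fd);
            return -1;
        }
        close(fd);
        /* little-endian 32-bit value */
        return (int64_t) buf[0] | ((int64_t) buf[1] << 8) |
               ((int64_t) buf[2] << 16) | ((int64_t) buf[3] << 24);
    }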

> I was going to say what happens if the file is corrupt and we get random
> junk data there, but as we only compare it to XlogSegSize that should be
> safe. But we might want to put a note in about that too?

Perhaps. I have made a better attempt at that in FindStreamingStart.

> Finally, I think we should make the error message clearly say "compressed
> segment file" - just to make things extra clear.

Sure.

>>> I found another problem with it -- it is completely broken in sync
>>> mode. You need to either forbid sync mode together with compression,
>>> or teach dir_sync() about it. The latter would sound better, but I
>>> wonder how much that's going to kill compression due to the small
>>> blocks? Is it a reasonable use-case?
>>
>> Hm. Looking at the docs I think that --compress defined with
>> --synchronous maps to the use of Z_SYNC_FLUSH with gzflush(). FWIW I
>> don't have a direct use case for it, but it is not a major effort to
>> add it, so done. There is no actual reason to forbid this combination
>> of options either.
>>
> Is it enough to just gzflush()? Don't we also need to still fsync()? I
> can't see any documentation that gzflush() does that, and a quick look at
> the code of zlib indicates it doesn't (but I didn't check in detail, so if
> you did please point me to it).

Hm. It looks like you are right. zlib goes down to _tr_flush_bits() to
flush some output, but that finishes only with put_byte(). As the fd is
opaque inside gzFile, it would be better to open() the file first and
then use gzdopen() to get the gzFile. Let's also use the existing fd
field to save it. Per the zlib documentation, gzclose() closes the
parent fd as well.
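(A minimal sketch of that pattern, as a hypothetical helper rather than the
patch's code: open() the file, keep the raw fd, hand it to gzdopen(), and
fsync() the fd after gzflush():)

    #include <fcntl.h>
    #include <unistd.h>
    #include <zlib.h>

    /*
     * Write through zlib while retaining the raw fd, so the data can be
     * forced to disk: gzflush() empties zlib's buffers into the file,
     * fsync() pushes them to storage.
     */
    static int
    write_compressed_synced(const char *path, const void *buf, unsigned len)
    {
        int     fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
        gzFile  gz;

        if (fd < 0)
            return -1;
        if ((gz = gzdopen(fd, "ab")) == NULL)
        {
            close(fd);
            return -1;
        }
        if (gzwrite(gz, buf, len) != (int) len ||
            gzflush(gz, Z_SYNC_FLUSH) != Z_OK ||
            fsync(fd) != 0)
        {
            gzclose(gz);
            return -1;
        }
        return gzclose(gz) == Z_OK ? 0 : -1;   /* also closes fd */
    }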
-- 
Michael


receivexlog-gzip-v5.patch
Description: application/stream



Re: [HACKERS] Parallel Index Scans

2017-01-14 Thread Amit Kapila
On Fri, Jan 13, 2017 at 7:58 PM, Anastasia Lubennikova wrote:
> 27.12.2016 17:33, Amit Kapila:
>
>> A similar problem has occurred while testing the "parallel index only
>> scan" patch, and Rafia has included the fix in her patch [1], which
>> ideally should be included in this patch, so I have copied the fix
>> from her patch.  Apart from that, I observed that a similar problem can
>> happen for backward scans, so I fixed that as well.
>
> I confirm that this problem is solved.
>
> But I'm trying to find the worst cases for this feature. And I suppose
> we should test parallel index scans with concurrent insertions. The more
> parallel readers we have, the higher the concurrency.
> I doubt that it can significantly decrease performance, because the
> number of parallel readers is not that big,
>
>> I am not sure if such a test is meaningful for this patch because
>> parallelism is generally used for large data reads, and in such cases
>> there are generally not many concurrent writes.
>
> I didn't find any case of noticeable performance degradation,
> so set status to "Ready for committer".
>

Thank you for the review!


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-14 Thread Tom Lane
Fabien COELHO  writes:
>> It ends up being about 30 fewer lines of code overall, despite there 
>> being four places that have to make ad-hoc RawStmt nodes.  On the whole 
>> I feel like this is cleaner,

> I agree: Better typing, more homogeneous code (PlannedStmt for all), 
> less ad-hoc checks to work around utility statements...

OK, pushed like that.

regards, tom lane




Re: [HACKERS] Packages: Again

2017-01-14 Thread Peter Geoghegan
On Fri, Jan 13, 2017 at 8:56 PM, Serge Rielau  wrote:
>> That's total nonsense.
>>
>> MERGE isn't UPSERT….
>
> Peter,
> you are misreading what I wrote. I did not allege that PostgreSQL did the
> wrong thing. And you are essentially confirming that there was debate and
> MERGE was deemed not to be what was wanted. So PG, with reason, went with
> something not in the standard.
>
> That is precisely my point!

I'm sorry for being so blunt. That was unnecessary. I thought that you
were citing that as a negative counterexample, rather than a neutral
or positive one.

Still, it's true that MERGE has very little overlap with UPSERT, both
as specified by the standard, and as implemented in practice by both
SQL Server and Oracle. The Oracle docs introduce MERGE with a
statement that is something along the lines of "MERGE is a way to
combine INSERT, UPDATE, and DELETE into one convenient DML statement".
MERGE is most compelling when performing bulk loading. That being the
case, in my mind MERGE remains something that we really haven't turned
our back on at all.

-- 
Peter Geoghegan




Re: [HACKERS] pageinspect: Hash index support

2017-01-14 Thread Mithun Cy
On Fri, Jan 13, 2017 at 12:49 AM, Jesper Pedersen wrote:
> Rebased, and removed the compile warn in hashfuncs.c

I did some testing and review of the patch. I did not see any major
issue, but there are a few minor cases for which I have some suggestions.

1. Source file: doc/src/sgml/pageinspect.sgml
The example given,
SELECT * FROM hash_page_type(get_raw_page('con_hash_index', 0));
can be changed to
SELECT hash_page_type(get_raw_page('con_hash_index', 0));

2. @verify_hash_page
I can easily produce this error right after a split, so there will be
pages that are not initialized before they are used. An error saying the
page is unexpected might therefore be slightly misleading; I think the
error message should be improved.
postgres=# SELECT hash_page_type(get_raw_page('i1', 12));
ERROR:  index table contains unexpected zero page

3. @verify_hash_page

(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("page is not a HASH page"),
+ errdetail("Expected %08x, got %08x.",

In the error message, "HASH" can be downcased to "hash". That makes it
consistent with other error messages such as "page is not a hash page of
expected type".


4. The condition is raw_page_size != BLCKSZ, but the error message says
"input page too small". I think the error message should be changed to
"invalid page size".

if (raw_page_size != BLCKSZ)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page too small"),
+ errdetail("Expected size %d, got %d",
+   BLCKSZ, raw_page_size)));
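(Presumably something along these lines:)

    if (raw_page_size != BLCKSZ)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("invalid page size"),
                 errdetail("Expected size %d, got %d.",
                           BLCKSZ, raw_page_size)));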


5. @hash_bitmap_info
The metabuf can be released once bitmapblkno is set; after that it is of
no use.
_hash_relbuf(indexRel, metabuf);

Also, is a write lock required here for the bitmap page?
mapbuf = _hash_getbuf(indexRel, bitmapblkno, HASH_WRITE, LH_BITMAP_PAGE);
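(Presumably a shared lock would suffice here, i.e. something like:)

    mapbuf = _hash_getbuf(indexRel, bitmapblkno, HASH_READ, LH_BITMAP_PAGE);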


6. You have renamed convert_ovflblkno_to_bitno to
_hash_ovflblkno_to_bitno. But unfortunately, you also moved the function
down, so the diff appears as if you have completely replaced it. I think
we should put it back in its old place.

7. I think it is not your bug, but probably a bug in the hash index
itself: the page flag is set to 66 (for the test below), so the page is
both LH_BUCKET_PAGE and LH_BITMAP_PAGE. Isn't this a bug in the hash
index?

I inserted 3000 records; the hash index is on an integer column.
select hasho_flag FROM hash_page_stats(get_raw_page('i1', 1));
 hasho_flag
------------
 66
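(For what it's worth, decoding 66 against the page-flag definitions in
src/include/access/hash.h, assuming the definitions below from the
development tree of that time, suggests the second bit set is not
LH_BITMAP_PAGE:)

    /* from src/include/access/hash.h */
    #define LH_OVERFLOW_PAGE                (1 << 0)   /* 0x01 */
    #define LH_BUCKET_PAGE                  (1 << 1)   /* 0x02 */
    #define LH_BITMAP_PAGE                  (1 << 2)   /* 0x04 */
    #define LH_META_PAGE                    (1 << 3)   /* 0x08 */
    #define LH_BUCKET_BEING_POPULATED       (1 << 4)   /* 0x10 */
    #define LH_BUCKET_BEING_SPLIT           (1 << 5)   /* 0x20 */
    #define LH_BUCKET_NEEDS_SPLIT_CLEANUP   (1 << 6)   /* 0x40 */

    /* 66 = 0x42 = LH_BUCKET_PAGE | LH_BUCKET_NEEDS_SPLIT_CLEANUP */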

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Support for pg_receivexlog --format=plain|tar

2017-01-14 Thread Magnus Hagander
On Tue, Jan 10, 2017 at 3:01 AM, Michael Paquier wrote:

> On Sat, Jan 7, 2017 at 8:19 PM, Magnus Hagander wrote:
> > On Sat, Jan 7, 2017 at 12:31 AM, Michael Paquier wrote:
> >> There is something I forgot. With this patch,
> >> FindStreamingStart()@pg_receivexlog.c is actually broken. In short it
> >> forgets to consider files that have been compressed at the last run of
> >> pg_receivexlog and will try to stream changes from the beginning. I
> >> can see that gzip -l provides this information... But I have yet to
> >> find something in zlib that allows a cheap lookup as startup of
> >> streaming should be fast. Looking at how gzip -l does it may be faster
> >> than looking at the docs.
> >
> > Do we really care though? As in, does startup of streaming have to be
> > *that* fast? Even gunzipping 16MB (worst case) doesn't exactly take a
> > long time. If your pg_receivexlog is restarting so often that it becomes
> > a problem, I think you already have another and much bigger problem on
> > your hands.
>
> Based on some analysis, it is enough to look at the last 4 bytes of
> the compressed file to get the size of the output data with a single
> call to lseek() and then read(). So, as there is a simple way to do
> things, and that's far cheaper than decompressing perhaps hundreds of
> segments, I'd rather do it this way. Attached is the implementation.
> This code is using 2 booleans for the 4 states of the file names: full
> non-compressed, partial non-compressed, full compressed and partial
> compressed. This keeps the last check of FindStreamingStart() simpler,
> though it has become quite fashionable lately to use an enum for such
> things :D
>

I think you need to document that analysis at least in a code comment. I
assume this is in reference to the ISIZE member of the gzip format?

I was going to say what happens if the file is corrupt and we get random
junk data there, but as we only compare it to XlogSegSize that should be
safe. But we might want to put a note in about that too?

Finally, I think we should make the error message clearly say "compressed
segment file" - just to make things extra clear.


> > I found another problem with it -- it is completely broken in sync
> > mode. You need to either forbid sync mode together with compression,
> > or teach dir_sync() about it. The latter would sound better, but I
> > wonder how much that's going to kill compression due to the small
> > blocks? Is it a reasonable use-case?
>
> Hm. Looking at the docs I think that --compress defined with
> --synchronous maps to the use of Z_SYNC_FLUSH with gzflush(). FWIW I
> don't have a direct use case for it, but it is not a major effort to
> add it, so done. There is no actual reason to forbid this combination
> of options either.
>
Is it enough to just gzflush()? Don't we also need to still fsync()? I
can't see any documentation that gzflush() does that, and a quick look at
the code of zlib indicates it doesn't (but I didn't check in detail, so if
you did please point me to it).


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Replication/backup defaults

2017-01-14 Thread Magnus Hagander
On Wed, Jan 11, 2017 at 2:09 AM, Michael Paquier wrote:

> On Wed, Jan 11, 2017 at 10:06 AM, David Steele wrote:
> > On 1/10/17 3:06 PM, Stephen Frost wrote:
> >> * Magnus Hagander (mag...@hagander.net) wrote:
> >>> On Tue, Jan 10, 2017 at 8:03 PM, Robert Haas wrote:
> >
> >>>> I may be outvoted, but I'm still not in favor of changing the
> >>>> default wal_level.  That caters only to people who lack sufficient
> >>>> foresight to know that they need a replica before the system becomes
> >>>> so critical that they can't bounce it to update the configuration.
> >>>
> >>> True. But the current level only caters to those people who run
> >>> large ETL jobs without doing any tuning on their system (at least
> >>> none that would require a restart), or another one of the fairly
> >>> specific workloads.
> >>>
> >>> And as I keep reiterating, it's not just about replicas, it's also
> >>> about the ability to make proper backups, which is a pretty
> >>> fundamental feature.
> >>>
> >>> I do think you are outvoted, yes :) At least that's the result of my
> >>> tallying up the people who have spoken out on the thread.
> >>
> >> I tend to agree with Magnus on this, being able to perform an online
> >> backup is pretty darn important.
> >
> > Agreed and +1.
>
> +1'ing.
>

I've pushed this.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-14 Thread Peter Eisentraut
On 1/11/17 11:20 AM, Ryan Murphy wrote:
> Thanks for the review Beena, I'm glad the patch is ready to go!

committed, thanks

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-14 Thread Gilles Darold
On 13/01/2017 at 14:09, Michael Paquier wrote:
> On Fri, Jan 13, 2017 at 5:48 PM, Gilles Darold wrote:
>> On 13/01/2017 at 05:26, Michael Paquier wrote:
>>> Surely the temporary file of current_logfiles should not be included
>>> in base backups (look at basebackup.c). Documentation needs to reflect
>>> that as well. Also, should we have current_logfiles in a base backup?
>>> I would think no.
>> Done but can't find any documentation about the file exclusion, do you
>> have a pointer?
> protocol.sgml, in the protocol-replication part. You could search for
> the paragraph that contains postmaster.opts. There is no real harm in
> including current_logfiles in base backups, but that's really in the
> same bag as postmaster.opts or postmaster.pid, particularly if the log
> file name has a timestamp.
Thanks for the pointer. After reading this part I think it only
concerns files that are critical at restore time. I still think that the
file need not be removed automatically from the backup. If it is
restored it has strictly no impact at all; it will be removed or
overwritten at startup. We can let the user choose whether to remove it
from the backup or not; the file can be a help in finding the latest log
file related to a backup.

>
>>> pg_current_logfile() can be run by *any* user. We had better revoke
>>> its access in system_views.sql!
>> Why? There is no special information returned by this function other
>> than the path, and that can be retrieved using show log_directory.
> log_directory could be an absolute path, and we surely don't want to
> let normal users have a look at it. For example, "show log_directory"
> can only be seen by superusers. As Stephen Frost has been for a couple
> of months (years?) on a holy war path against superuser checks in
> system functions, I think that revoking access to this function is
> the best thing to do. It is also easier to restrict first and relax
> later; the reverse is harder to justify from a compatibility point of
> view.

Right, I've appended a REVOKE to the function in
backend/catalog/system_views.sql; patch v21 reflects this change.

Thanks.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 30dd54c..393195f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4211,6 +4211,17 @@ SELECT * FROM parent WHERE key = 2400;
  server log
 
 
+
+ When logs are written to the file system, the current_logfiles file
+ includes the type, the location and the name of the logs currently in
+ use. This provides a convenient way to find the logs currently in use
+ by the instance.
+
+$ cat $PGDATA/current_logfiles
+stderr pg_log/postgresql-2017-01-13_085812.log
+
+
+
 
  Where To Log
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 10e3186..693669b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15441,6 +15441,13 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
   
 
   
+   pg_current_logfile(text)
+   text
+   Primary log file name, or log in the requested format,
+   currently in use by the logging collector
+  
+
+  
pg_my_temp_schema()
oid
OID of session's temporary schema, or 0 if none
@@ -15658,6 +15665,39 @@ SET search_path TO schema , schema, ..
 the time when the postmaster process re-read the configuration files.)

 
+
+pg_current_logfile
+   
+
+   
+Logging
+pg_current_logfile function
+   
+
+   
+pg_current_logfile returns, as text,
+the path of either the csv or stderr log file currently in use by the
+logging collector.  This is a path including the
+log_directory directory and the log file name.
+Log collection must be active or the return value
+is NULL.  When multiple log files exist, each in a
+different format, pg_current_logfile called
+without arguments returns the path of the file having the first format
+found in the ordered
+list: stderr, csvlog.
+NULL is returned when no log file has any of these
+formats.  To request a specific file format, supply,
+as text, either csvlog
+or stderr as the value of the optional parameter. The
+return value is NULL when the requested log format
+is not a configured log_destination.
+
+pg_current_logfile reflects the content of the
+current_logfiles file.  All caveats
+regarding current_logfiles' content are applicable
+to pg_current_logfile's return value.
+   
+

 pg_my_temp_schema

diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 5c52824..3ce2a7e 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -60,6 +60,38 @@ Item
  Subdirectory containing per-database subdirectories
 
 
+
+ 
+  
+   current_logfiles
+  
+  
+   Logging
+   current_logfiles file
+  
+  current_logfiles
+ 
+ 
+  
+  A file recording the log file(s) currently written to by the syslogger
+  and t

Re: [HACKERS] parallelize queries containing subplans

2017-01-14 Thread Amit Kapila
On Fri, Jan 13, 2017 at 7:13 PM, Tom Lane  wrote:
> Dilip Kumar  writes:
>> ERROR:  did not find '}' at end of input node at character 762
>

I could reproduce this error with a simple query like:
SELECT * FROM information_schema.role_usage_grants WHERE object_type
LIKE 'FOREIGN%' AND object_name IN ('s6', 'foo') ORDER BY 1, 2, 3, 4,
5;

The reason is that the stored rules have the old structure of Param.  See below:
{RELABELTYPE :arg {PARAM :paramkind 2 :paramid 1 :paramtype 11975
:paramtypmod -1 :paramcollid 100 :location -1}

The new field parallel_safe is not present in the above node.  If you
recreate a fresh database you won't see the above problem.  I have
observed a problem in the equal function, which I have fixed in the
attached patch, and I have added a test so that we can catch any problem
similar to the one you have reported.

> I've not looked at the patches, but just seeing this error message,
> this looks like somebody's fat-fingered the correspondence between
> outfuncs.c and readfuncs.c processing.
>

I think what we need is a catversion bump, as Param is part of stored rules.
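(For context, a field added to a node such as Param has to be mirrored in
all the node-support files, otherwise stored rules written before the
change fail to read back, as seen above. A sketch of the usual pattern,
assuming a boolean field named parallel_safe:)

    /* src/backend/nodes/outfuncs.c, in _outParam() */
    WRITE_BOOL_FIELD(parallel_safe);

    /* src/backend/nodes/readfuncs.c, in _readParam() */
    READ_BOOL_FIELD(parallel_safe);

    /* src/backend/nodes/copyfuncs.c, in _copyParam() */
    COPY_SCALAR_FIELD(parallel_safe);

    /* src/backend/nodes/equalfuncs.c, in _equalParam() */
    COMPARE_SCALAR_FIELD(parallel_safe);

The catversion bump (CATALOG_VERSION_NO in src/include/catalog/catversion.h)
then keeps a new binary from silently misreading rules stored by an older
one.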


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


pq_pushdown_correl_subplan_v2.patch
Description: Binary data
