Re: [HACKERS] contrib modules and relkind check

2017-03-06 Thread Amit Langote
Sorry about the absence on this thread.

On 2017/02/14 15:30, Michael Paquier wrote:
> On Tue, Feb 14, 2017 at 3:18 PM, Amit Langote wrote:
>>
>> Added more tests in pgstattuple and the new ones for pg_visibility,
>> although I may have overdone the latter.
> 
> A bonus idea is also to add tests for relkinds that work, with for
> example the creation of a table, inserting some data in it, vacuum it,
> and look at "SELECT count(*) > 0 FROM pg_visibility('foo'::regclass)".

I assume you meant only for pg_visibility.  Done in the attached (a pretty
basic test though).

>> In certain contexts where a subset of relkinds are allowed and others are
>> not or vice versa, partitioned tables are still referred to as "tables".
>> That's because we still use CREATE/DROP TABLE to create/drop them and
>> perhaps more to the point, their being partitioned is irrelevant.
>>
>> Examples of where partitioned tables are referred to as tables:
>>
>> [...]
>>
>> In other contexts, where a table's being partitioned is relevant, the
>> message is shown as "relation is/is not partitioned table".  Examples:
>>
>> [...]
> 
> Hm... It may be a good idea to be consistent on the whole system and
> refer to "partitioned table" as a table without storage and used as an
> entry point for partitions. The docs use this term in CREATE TABLE,
> and we would finish with messages like "not a table or a partitioned
> table". Extra thoughts are welcome here, the current inconsistencies
> would be confusing for users.

If we decide to go with some different approach, we'd not be doing it
here.  Maybe in the "partitioned tables and relfilenode" thread or a new one.

Thanks,
Amit
>From c7762d5bfd9363150cdb94030a47cef252697672 Mon Sep 17 00:00:00 2001
From: amit 
Date: Tue, 24 Jan 2017 13:22:17 +0900
Subject: [PATCH] Add relkind checks to certain contrib modules

Contrib functions such as pg_visibility and pg_relpages are only
applicable to certain relkinds; error out otherwise.

Also, RELKIND_PARTITIONED_TABLE was missing from pageinspect's and
pgstattuple's lists of unsupported relkinds.

Add tests for the newly added checks.
---
 contrib/pageinspect/expected/page.out|  5 ++
 contrib/pageinspect/rawpage.c|  5 ++
 contrib/pageinspect/sql/page.sql |  5 ++
 contrib/pg_visibility/.gitignore |  4 ++
 contrib/pg_visibility/Makefile   |  2 +
 contrib/pg_visibility/expected/pg_visibility.out | 85 
 contrib/pg_visibility/pg_visibility.c| 44 
 contrib/pg_visibility/sql/pg_visibility.sql  | 57 
 contrib/pgstattuple/expected/pgstattuple.out | 29 
 contrib/pgstattuple/pgstatindex.c| 30 +
 contrib/pgstattuple/pgstattuple.c|  3 +
 contrib/pgstattuple/sql/pgstattuple.sql  | 24 +++
 12 files changed, 279 insertions(+), 14 deletions(-)
 create mode 100644 contrib/pg_visibility/.gitignore
 create mode 100644 contrib/pg_visibility/expected/pg_visibility.out
 create mode 100644 contrib/pg_visibility/sql/pg_visibility.sql

diff --git a/contrib/pageinspect/expected/page.out b/contrib/pageinspect/expected/page.out
index 13964cd878..85b183db4d 100644
--- a/contrib/pageinspect/expected/page.out
+++ b/contrib/pageinspect/expected/page.out
@@ -71,3 +71,8 @@ SELECT * FROM fsm_page_contents(get_raw_page('test1', 'fsm', 0));
 (1 row)
 
 DROP TABLE test1;
+-- check that using any of these functions with a partitioned table would fail
+create table test_partitioned (a int) partition by range (a);
+select get_raw_page('test_partitioned', 0);
+ERROR:  cannot get raw page from partitioned table "test_partitioned"
+drop table test_partitioned;
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index 102f360c6f..1ccc3ff320 100644
--- a/contrib/pageinspect/rawpage.c
+++ b/contrib/pageinspect/rawpage.c
@@ -123,6 +123,11 @@ get_raw_page_internal(text *relname, ForkNumber forknum, BlockNumber blkno)
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
  errmsg("cannot get raw page from foreign table \"%s\"",
		RelationGetRelationName(rel))));
+	if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		ereport(ERROR,
+(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("cannot get raw page from partitioned table \"%s\"",
+		RelationGetRelationName(rel))));
 
 	/*
 	 * Reject attempts to read non-local temporary relations; we would be
diff --git a/contrib/pageinspect/sql/page.sql b/contrib/pageinspect/sql/page.sql
index 97eef9829a..fa3533f2af 100644
--- a/contrib/pageinspect/sql/page.sql
+++ b/contrib/pageinspect/sql/page.sql
@@ -28,3 +28,8 @@ SELECT tuple_data_split('test1'::regclass, t_data, t_infomask, t_infomask2, t_bi
 SELECT * FROM fsm_page_contents(get_raw_page('test1', 'fsm', 0));
 
 DROP TABLE test1;
+
+-- check that using any of these functions with a partitioned table would fail
+create table test_partitioned (a int) 

Re: [HACKERS] GUC for cleanup indexes threshold.

2017-03-06 Thread Masahiko Sawada
On Sat, Mar 4, 2017 at 4:37 AM, Peter Geoghegan  wrote:
> On Fri, Mar 3, 2017 at 8:13 AM, Masahiko Sawada  wrote:
>> Thank you for clarification. Let me check my understanding. IIUC,
>> skipping second index vacuum path (lazy_cleanup_index) can not be
>> cause of leaving page as half-dead state but could leave recyclable
>> pages that are not marked as a recyclable. But second one, it can be
>> reclaimed by next index vacuum because btvacuumpage calls
>> RecordFreeIndexPage for recyclable page. Am I missing something?
>
> I think that this is a good summary. You have not missed anything.
>
>> My first motivation for this patch is to skip the second index vacuum
>> path when vacuum skipped the whole table via the visibility map.
>
> Makes sense.
>
>> But as Robert
>> suggested on another thread, I changed it to have a threshold. If my
>> understanding is correct, we can have a threshold that specifies the
>> fraction of the scanned pages by vacuum. If it's set 0.1,
>> lazy_scan_heap can do the second vacuum index only when 10% of table
>> is scanned. IOW, if 90% of table pages is skipped, which means almost
>> of table has not changed since previous vacuum, we can skip the second
>> index vacuum.
>
> It sounds like a setting of 0.1 would leave us in a position where
> it's very unlikely that a VACUUM of indexes could fail to occur when
> autovacuum has been triggered in the common way, by the "vacuum
> threshold" having been exceeded. Does this feeling that I have about
> it seem accurate to you? Is that actually your intent? It's hard to
> really be sure, because there are so many confounding factors (e.g.
> HOT), but if that was 100% true, then I suppose there would
> theoretically be zero new risk (except, perhaps, of the "other type of
> bloat" that I described, which I am not very worried about).
>
> Please verify my understanding of your thought process: We don't have
> to freeze indexes at all, ever, so if we see index bloat as a separate
> problem, we also see that there is no need to *link* index needs to
> the need for freezing. XID burn rate is a very bad proxy for how
> bloated an index may be. Besides, we already have a separate trigger
> for the thing that *actually* matters to indexes (the vacuum threshold
> stuff).
>
> Maybe 0.0 is too low as a default for
> vacuum_cleanup_index_scale_factor, even though initially it seemed
> attractive to me for theoretical reasons. Something like 0.01 is still
> "practically zero", but removes the extreme sensitivity that would
> have with 0.0. So, 0.01 might make sense as a default for roughly the
> same reason that autovacuum_vacuum_threshold exists. (Maybe it should
> be more like autovacuum_vacuum_threshold, in that it's an absolute
> minimum number of heap blocks to trigger index clean-up.)

Agreed. Attached is the current design patch. I've also changed the default value to 0.01.

Two issues remain. The first is what you mentioned: that
pg_class.relfrozenxid/pg_database.datfrozenxid could be advanced past
opaque->btpo.xact. The second is to find out the impact on index types
other than btree. I'm investigating the second one now.
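
The skip logic discussed above can be sketched as follows (a simplified
Python model for illustration only, not the actual lazy_scan_heap code;
the function name is made up, and the 0.01 default mirrors the value
proposed in this thread):

```python
def should_run_index_cleanup(scanned_pages: int, total_pages: int,
                             scale_factor: float = 0.01) -> bool:
    """Simplified model of the proposed check: run the second index
    vacuum pass only when the fraction of heap pages actually scanned
    (i.e. not skipped via the visibility map) exceeds the threshold."""
    if total_pages == 0:
        return False
    return scanned_pages > scale_factor * total_pages

# With the 0.1 setting from the example above: if 90% of the table was
# skipped via the visibility map, the second index vacuum is skipped too.
print(should_run_index_cleanup(5, 100, scale_factor=0.1))   # False
print(should_run_index_cleanup(20, 100, scale_factor=0.1))  # True
```

A setting of 0.0 would make the check fire whenever any page at all was
scanned, which is the extreme sensitivity the 0.01 default avoids.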

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


skip_cleanup_indexdes_v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE command progress checker

2017-03-06 Thread Haribabu Kommi
On Tue, Mar 7, 2017 at 5:01 PM, Michael Paquier 
wrote:

>
> @@ -496,7 +499,6 @@ do_analyze_rel(Relation onerel, int options,
> VacuumParams *params,
> numrows = (*acquirefunc) (onerel, elevel,
>   rows, targrows,
>   &totalrows, &totaldeadrows);
> -
> /*
> Useless diff.
>
> + 
> +   ANALYZE is currently collecting the sample rows.
> +   The sample it reads is taken randomly.Its size depends on
> +   the default_statistics_target parameter value.
> + 
> This should use a <varname> markup for default_statistics_target.
>
> @@ -203,6 +204,8 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
> if (onerel->rd_rel->relkind == RELKIND_RELATION ||
> onerel->rd_rel->relkind == RELKIND_MATVIEW)
> {
> +   pgstat_progress_start_command(PROGRESS_COMMAND_ANALYZE,
> +   RelationGetRelid(onerel));
> It seems to me that the report should begin in do_analyze_rel().


some more comments,

+ /* Report compute heap stats phase */
+ pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
+ PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS);

The above analyze phase is updated inside a for loop; instead, just set
it once before the loop.

+ /* Report compute index stats phase */
+ pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
+ PROGRESS_ANALYZE_PHASE_COMPUTE_INDEX_STATS);

The above analyze phase is set irrespective of whether any index exists
on the table. It is better to set this phase and the index cleanup phase
only when there are indexes on the table.
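
The two points above amount to the following control flow, sketched here
in Python for illustration (this is not backend code; the phase names
mirror the patch's constants, and the index cleanup phase name is
assumed):

```python
def report_analyze_phases(has_indexes: bool, update_phase) -> None:
    """Illustrative ordering for ANALYZE progress reporting: set the
    compute-heap-stats phase once before the per-column loop (not inside
    it), and set the index-related phases only when the table actually
    has indexes."""
    update_phase("PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS")
    # ... the per-column statistics loop would run here, with no further
    # phase updates inside it ...
    if has_indexes:
        update_phase("PROGRESS_ANALYZE_PHASE_COMPUTE_INDEX_STATS")
        update_phase("PROGRESS_ANALYZE_PHASE_INDEX_CLEANUP")

phases = []
report_analyze_phases(has_indexes=False, update_phase=phases.append)
print(phases)  # ['PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS']
```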

+ /* Report total number of heap blocks and collectinf sample row phase*/

Typo? collecting?


+ /* Report total number of heap blocks and collectinf sample row phase*/
+ initprog_val[0] = PROGRESS_ANALYZE_PHASE_COLLECT_HEAP_SAMPLE_ROWS;
+ initprog_val[1] = totalblocks;
+ pgstat_progress_update_multi_param(2, initprog_index, initprog_val);

The acquire_sample_rows function is also called from
acquire_inherited_sample_rows, so setting the phase inside that function
will report wrong info.


+#define PROGRESS_ANALYZE_PHASE_COLLECT_INH_SAMPLE_ROWS 2

Why is there no code added for this phase? Any specific reason?


+/* Phases of analyze */

This could be written as follows for better understanding, and also to
match vacuum:

/* Phases of analyze (as advertised via PROGRESS_ANALYZE_PHASE) */


Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] WARNING: relcache reference leak: relation "p1" not closed

2017-03-06 Thread Amit Langote
On 2017/03/07 14:04, Tom Lane wrote:
> Amit Langote  writes:
>> Also, I found out that alter_table.sql mistakenly forgot to drop
>> partitioned table "p1".  Patch 0002 takes care of that.
> 
> While that might or might not have been intentional, I think it's an
> astoundingly bad idea to not leave any partitioned tables behind in
> the final state of the regression database.  Doing so would likely
> have meant that this particular bug evaded detection for much longer
> than it did.  Moreover, it would mean that the pg_upgrade test would
> have exactly no coverage of partitioned cases.

That's true.  Should have been apparent to me.

> Therefore, there should definitely be a partitioned table, hopefully with
> a less generic name than "p1", in the final regression DB state.  Whether
> this particular one from alter_table.sql is a good candidate, I dunno.
> But let's not drop it without adding a better-thought-out replacement.

OK, let's drop p1 in alter_table.sql.  I think a partitioned table created
in insert.sql is a good candidate to keep around after having it renamed,
which patch 0003 does.

Thanks,
Amit
>From 0fe3bcea9424133b8711bdcc23b1432cde4fc394 Mon Sep 17 00:00:00 2001
From: amit 
Date: Tue, 7 Mar 2017 10:26:16 +0900
Subject: [PATCH 1/3] Fix relcache ref leak in acquire_inherited_sample_rows

Add tests for vacuum, analyze, and truncate on partitioned tables,
which commit 3c3bb9933 should have included.
---
 src/backend/commands/analyze.c |  7 +--
 src/test/regress/expected/truncate.out |  6 ++
 src/test/regress/expected/vacuum.out   |  9 +
 src/test/regress/sql/truncate.sql  |  7 +++
 src/test/regress/sql/vacuum.sql| 10 ++
 5 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index a70c760341..b91df986c5 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -1360,11 +1360,14 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
 		else
 		{
 			/*
-			 * ignore, but release the lock on it.  could be a partitioned
-			 * table.
+			 * ignore, but release the lock on it.  don't try to unlock the
+			 * passed-in relation
 			 */
+			Assert(childrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
 			if (childrel != onerel)
 heap_close(childrel, AccessShareLock);
+			else
+heap_close(childrel, NoLock);
 			continue;
 		}
 
diff --git a/src/test/regress/expected/truncate.out b/src/test/regress/expected/truncate.out
index 5c5277e0f1..81612d8c88 100644
--- a/src/test/regress/expected/truncate.out
+++ b/src/test/regress/expected/truncate.out
@@ -420,3 +420,9 @@ SELECT nextval('truncate_a_id1'); -- fail, seq should have been dropped
 ERROR:  relation "truncate_a_id1" does not exist
 LINE 1: SELECT nextval('truncate_a_id1');
^
+-- partitioned table
+CREATE TABLE truncparted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE truncparted1 PARTITION OF truncparted FOR VALUES IN (1);
+INSERT INTO truncparted VALUES (1, 'a');
+TRUNCATE truncparted;
+DROP TABLE truncparted;
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9b604be4b6..6f68663087 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -82,3 +82,12 @@ VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
+-- partitioned table
+CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
+INSERT INTO vacparted VALUES (1, 'a');
+UPDATE vacparted SET b = 'b';
+VACUUM (ANALYZE) vacparted;
+VACUUM (FULL) vacparted;
+VACUUM (FREEZE) vacparted;
+DROP TABLE vacparted;
diff --git a/src/test/regress/sql/truncate.sql b/src/test/regress/sql/truncate.sql
index a3d6f5368f..d61eea1a42 100644
--- a/src/test/regress/sql/truncate.sql
+++ b/src/test/regress/sql/truncate.sql
@@ -215,3 +215,10 @@ SELECT * FROM truncate_a;
 DROP TABLE truncate_a;
 
 SELECT nextval('truncate_a_id1'); -- fail, seq should have been dropped
+
+-- partitioned table
+CREATE TABLE truncparted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE truncparted1 PARTITION OF truncparted FOR VALUES IN (1);
+INSERT INTO truncparted VALUES (1, 'a');
+TRUNCATE truncparted;
+DROP TABLE truncparted;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 7b819f654d..7c5fb04917 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -64,3 +64,13 @@ VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
 DROP TABLE vaccluster;
 DROP TABLE vactst;
+
+-- partitioned table
+CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
+INSERT INTO vacparted VALUES (1, 'a');
+UPDATE vacparted SET b = 'b';
+VACUUM (ANALYZE) vacparted;
+VACUUM (FULL) vacparted;
+VACUUM 

Re: [HACKERS] Parallel seq. plan is not coming against inheritance or partition table

2017-03-06 Thread Ashutosh Sharma
On Tue, Mar 7, 2017 at 11:18 AM, Amit Kapila  wrote:
> On Tue, Mar 7, 2017 at 10:22 AM, Ashutosh Sharma  
> wrote:
>>> I also think that commit didn't intend to change the behavior,
>>> however, the point is how sensible is it to keep such behavior after
>>> Parallel Append.  I am not sure if this is the right time to consider
>>> it or shall we wait till Parallel Append is committed.
>>>
 I think the problem here is that compute_parallel_worker() thinks it
 can use 0 as a sentinel value that means "ignore this", but it can't
 really, because a heap can actually contain 0 pages.  Here's a patch
 which does the following:

>>>
>>> - if (heap_pages < (BlockNumber) min_parallel_table_scan_size &&
>>> - index_pages < (BlockNumber) min_parallel_index_scan_size &&
>>> - rel->reloptkind == RELOPT_BASEREL)
>>> + if (rel->reloptkind == RELOPT_BASEREL &&
>>> + ((heap_pages >= 0 && heap_pages < min_parallel_table_scan_size) ||
>>> + (index_pages >= 0 && index_pages < min_parallel_index_scan_size)))
>>>   return 0;
>>>
>>> The purpose of considering both heap and index pages () for
>>> min_parallel_* is that even if one of them is bigger than threshold
>>> then we should try parallelism.
>>
>> Yes, But, this is only true for normal tables not for partitioned or
>> inheritance tables. I think for partition table, even if one heap page
>> exist, we go for parallelism.
>>
>>  This could be helpful for parallel
>>> index scans where we consider parallel workers based on both heap and
>>> index pages.  Is there a reason for you to change that condition as
>>> that is not even related to the problem being discussed?
>>>
>>
>> I think he has changed to allow parallelism for inheritance or partition
>> tables. For normal tables, it won't be touched until the below if-condition
>> is satisfied.
>>
>> if (heap_pages < (BlockNumber) min_parallel_table_scan_size &&
>> index_pages < (BlockNumber) min_parallel_index_scan_size &&
>> rel->reloptkind == RELOPT_BASEREL)
>> return 0;
>>
>
> AFAICS, the patch has changed the if-condition you are quoting.
>

Yes, it has slightly changed the if-condition, or I would say it has
corrected it. With the new if-condition in the patch, we first check
whether reloptkind is RELOPT_BASEREL before checking whether there are
enough heap or index pages to make parallelism worthwhile. For a
partition table, 'rel->reloptkind == RELOPT_BASEREL' fails, so the
further conditions aren't checked.

I think the most important thing the patch fixes is passing index_pages
as '-1' to compute_parallel_worker(), since
'min_parallel_index_scan_size' can be set to zero by the user.
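
The sentinel issue can be made concrete with a sketch that mirrors the
quoted patched if-condition (Python, schematic only; the real logic
lives in compute_parallel_worker() in C, and the min_* defaults here are
placeholders, not the server's actual values):

```python
def parallelism_disallowed(is_baserel: bool, heap_pages: int,
                           index_pages: int,
                           min_table_scan_size: int = 1024,
                           min_index_scan_size: int = 64) -> bool:
    """Mirror of the patched condition: -1 means "ignore this input",
    because 0 is a legitimate page count and cannot serve as a sentinel.
    Rels that are not plain baserels are never rejected here."""
    return (is_baserel and
            ((0 <= heap_pages < min_table_scan_size) or
             (0 <= index_pages < min_index_scan_size)))

# With heap_pages = -1 (ignore), a small or empty heap no longer
# suppresses parallelism at this check:
print(parallelism_disallowed(True, -1, -1))  # False
print(parallelism_disallowed(True, 10, -1))  # True: heap too small
```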

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com




Re: [HACKERS] ANALYZE command progress checker

2017-03-06 Thread Michael Paquier
On Mon, Mar 6, 2017 at 3:58 PM, Andres Freund  wrote:
> On 2017-03-03 15:33:15 -0500, David Steele wrote:
>> On 3/1/17 1:25 PM, Andres Freund wrote:
>> > On 2017-03-01 10:20:41 -0800, David Fetter wrote:
>> >> On Wed, Mar 01, 2017 at 09:45:40AM -0500, Peter Eisentraut wrote:
>> >>> On 2/28/17 04:24, vinayak wrote:
>>  The view provides the information of analyze command progress details as
>>  follows
>>  postgres=# \d pg_stat_progress_analyze
>> View "pg_catalog.pg_stat_progress_analyze"
>> >>>
>> >>> Hmm, do we want a separate "progress" system view for every kind of
>> >>> command?  What if someone comes up with a progress checker for CREATE
>> >>> INDEX, REINDEX, CLUSTER, etc.?
>> >
>> > I don't think that'd be that bad, otherwise the naming of the fields is
>> > complicated.  I guess the alternative (or do both?) would be to to have
>> > a pivoted table, but that'd painful to query.  Do you have a better idea?
>> >
>> >
>> >> Some kind of design for progress seems like a good plan.  Some ideas:
>> >>
>> >> - System view(s)
>> >>
>> >> This has the advantage of being shown to work at least to a PoC by
>> >> this patch, and is similar to extant system views like
>> >> pg_stat_activity in the sense of capturing a moment in time.
>> >>
>> >> - NOTIFY
>> >>
>> >> Event-driven model as opposed to a polling one.  This is
>> >> attractive on efficiency grounds, less so on reliability ones.
>> >>
>> >> - Something added to the wire protocol
>> >>
>> >> More specialized, limits the information to the session where the
>> >> command was issued
>> >>
>> >> - Other things not named here
>> >
>> > We now have a framework for this [1] (currently used by vacuum, but
>> > extensible). The question is about presentation.  I'm fairly sure that
>> > we shouldn't just add yet another framework, and I doubt that that's
>> > what's proposed by Peter.
>>
>> I think the idea of a general progress view is very valuable and there
>> are a ton of operations it could be used for:  full table scans, index
>> rebuilds, vacuum, copy, etc.
>> However, I feel that this proposal is not flexible enough and comes too
>> late in the release cycle to allow development into something that could
>> be committed.
>
> I'm not following. As I pointed out, we already have this framework?
> This patch is just a short one using that framework?
>
>
>> I propose we move this to the 2017-07 CF so the idea can be more fully
>> developed.
>
> I don't see that being warranted in this case, we're really not talking
> about something complicated:
>  doc/src/sgml/monitoring.sgml |  135 
> +++
>  src/backend/catalog/system_views.sql |   16 
>  src/backend/commands/analyze.c   |   34 
>  src/backend/utils/adt/pgstatfuncs.c  |2
>  src/include/commands/progress.h  |   13 +++
>  src/include/pgstat.h |3
>  src/test/regress/expected/rules.out  |   18 
>  7 files changed, 219 insertions(+), 2 deletions(-)
> excepting tests and docs, this is very little actual code.

Or 35 lines just for the backend portion; it is hard to get something smaller.

@@ -496,7 +499,6 @@ do_analyze_rel(Relation onerel, int options,
VacuumParams *params,
numrows = (*acquirefunc) (onerel, elevel,
  rows, targrows,
  &totalrows, &totaldeadrows);
-
/*
Useless diff.

+ 
+   ANALYZE is currently collecting the sample rows.
+   The sample it reads is taken randomly.Its size depends on
+   the default_statistics_target parameter value.
+ 
This should use a <varname> markup for default_statistics_target.

@@ -203,6 +204,8 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
if (onerel->rd_rel->relkind == RELKIND_RELATION ||
onerel->rd_rel->relkind == RELKIND_MATVIEW)
{
+   pgstat_progress_start_command(PROGRESS_COMMAND_ANALYZE,
+   RelationGetRelid(onerel));
It seems to me that the report should begin in do_analyze_rel().
-- 
Michael




Re: [HACKERS] Parallel seq. plan is not coming against inheritance or partition table

2017-03-06 Thread Amit Kapila
On Tue, Mar 7, 2017 at 10:22 AM, Ashutosh Sharma  wrote:
>> I also think that commit didn't intend to change the behavior,
>> however, the point is how sensible is it to keep such behavior after
>> Parallel Append.  I am not sure if this is the right time to consider
>> it or shall we wait till Parallel Append is committed.
>>
>>> I think the problem here is that compute_parallel_worker() thinks it
>>> can use 0 as a sentinel value that means "ignore this", but it can't
>>> really, because a heap can actually contain 0 pages.  Here's a patch
>>> which does the following:
>>>
>>
>> - if (heap_pages < (BlockNumber) min_parallel_table_scan_size &&
>> - index_pages < (BlockNumber) min_parallel_index_scan_size &&
>> - rel->reloptkind == RELOPT_BASEREL)
>> + if (rel->reloptkind == RELOPT_BASEREL &&
>> + ((heap_pages >= 0 && heap_pages < min_parallel_table_scan_size) ||
>> + (index_pages >= 0 && index_pages < min_parallel_index_scan_size)))
>>   return 0;
>>
>> The purpose of considering both heap and index pages () for
>> min_parallel_* is that even if one of them is bigger than threshold
>> then we should try parallelism.
>
> Yes, But, this is only true for normal tables not for partitioned or
> inheritance tables. I think for partition table, even if one heap page
> exist, we go for parallelism.
>
>  This could be helpful for parallel
>> index scans where we consider parallel workers based on both heap and
>> index pages.  Is there a reason for you to change that condition as
>> that is not even related to the problem being discussed?
>>
>
> I think he has changed to allow parallelism for inheritance or partition
> tables. For normal tables, it won't be touched until the below if-condition
> is satisfied.
>
> if (heap_pages < (BlockNumber) min_parallel_table_scan_size &&
> index_pages < (BlockNumber) min_parallel_index_scan_size &&
> rel->reloptkind == RELOPT_BASEREL)
> return 0;
>

AFAICS, the patch has changed the if-condition you are quoting.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] allow referring to functions without arguments when unique

2017-03-06 Thread Michael Paquier
On Fri, Mar 3, 2017 at 4:12 PM, Michael Paquier
 wrote:
> On Wed, Mar 1, 2017 at 11:50 AM, Peter Eisentraut
>  wrote:
>> This is the "grand finale" that goes on top of the "DROP FUNCTION of
>> multiple functions" patch set.  The purpose is to allow referring to
>> functions without having to spell out the argument list, when the
>> function name is unique.  This is especially useful when having to
>> operate on "business logic" functions with many many arguments.  It's an
>> SQL standard feature, and it applies everywhere a function is referred
>> to in the grammar.  We already have the lookup logic for the regproc
>> type, and thanks to the grand refactoring of the parser representation
>> of functions, this is quite a small patch.  There is a bit of
>> reduce/reduce parser mystery, to keep the reviewer entertained.
>
> ... Which would be nice.
>
>> (The
>> equivalent could be implemented for aggregates and operators, but I
>> haven't done that here yet.)
>
> OK. After a lookup, I am just seeing opfamily, opclass missing, so
> this patch is doing it as you describe.
>
> I have read through the code once, still I am waiting for the DROP
> FUNCTION patches to be committed before doing a real hands-on.
>
> @@ -7198,6 +7198,33 @@ function_with_argtypes:
> n->objargs = extractArgTypes($2);
> $$ = n;
> }
> This may not have arguments listed, so is function_with_argtypes really
> suitable?
>
> +   /*
> +* If no arguments were specified, the name must yield a unique candidate.
> +*/
> +   if (nargs == -1 && clist)
> +   {
> +   if (clist->next)
> +   ereport(ERROR,
> I would have used list_length here for clarity.
>
> --- a/src/backend/parser/parse_func.c
> +++ b/src/backend/parser/parse_func.c
> @@ -1914,6 +1914,21 @@ LookupFuncName(List *funcname, int nargs, const
> Oid *argtypes, bool noError)
> The comment at the top of LookupFuncName() needs a refresh. The caller
> can as well just use a function name without arguments.


+   /*
+* Because of reduce/reduce conflicts, we can't use func_name
+* below, but we can write it out the long way, which actually
+* allows more cases.
+*/
+   | type_func_name_keyword
+   {
+   ObjectWithArgs *n = makeNode(ObjectWithArgs);
+   n->objname = list_make1(makeString(pstrdup($1)));
+   n->args_unspecified = true;
+   $$ = n;
+   }
+   | ColId
+   {
+   ObjectWithArgs *n = makeNode(ObjectWithArgs);
+   n->objname = list_make1(makeString($1));
+   n->args_unspecified = true;
+   $$ = n;
+   }
+   | ColId indirection
+   {
+   ObjectWithArgs *n = makeNode(ObjectWithArgs);
+   n->objname = check_func_name(lcons(makeString($1), $2),
+ yyscanner);
+   n->args_unspecified = true;
+   $$ = n;
+   }
I have spent some time looking at this one. Another solution would be
to extend func_args to distinguish between an explicitly empty argument
list and no argument list at all. That would need an intermediate
structure for parsing, and it does not seem worth the cost, so I am fine
with this solution.

=# create schema popo;
CREATE SCHEMA
=# CREATE FUNCTION popo.dup2(int,int) RETURNS TABLE(f1 int, f2 text)

  AS $$ SELECT
$1, CAST($1 AS text) || ' is text' $$

 LANGUAGE SQL;
CREATE FUNCTION
=# CREATE FUNCTION dup2(int,int) RETURNS TABLE(f1 int, f2 text)

  AS $$ SELECT
$1, CAST($1 AS text) || ' is text' $$

 LANGUAGE SQL;
CREATE FUNCTION
=# set search_path to 'public,popo';
SET
Time: 0.463 ms
=# drop function dup2;
ERROR:  42883: function dup2() does not exist
LOCATION:  LookupFuncName, parse_func.c:1944
In this case I would have expected an error saying that the name is
ambiguous. FuncnameGetCandidates() returns an empty list.
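
The intended lookup rule can be sketched as follows (schematic Python,
not the parse_func.c code; the catalog representation and error strings
are made up for illustration):

```python
def lookup_func_name(name: str, nargs: int, catalog: list) -> tuple:
    """When nargs == -1 (no argument list was written), the bare name
    must identify exactly one candidate; otherwise match on the exact
    argument count as well. Catalog entries are (name, argtypes)."""
    if nargs == -1:
        candidates = [f for f in catalog if f[0] == name]
        if len(candidates) > 1:
            raise LookupError(f'function name "{name}" is not unique')
        if not candidates:
            raise LookupError(f'function {name}() does not exist')
        return candidates[0]
    matches = [f for f in catalog if f[0] == name and len(f[1]) == nargs]
    if not matches:
        raise LookupError(f'function {name} does not exist')
    return matches[0]

# Two dup2 functions visible at once, as in the session above:
catalog = [("dup2", ("int", "int")), ("dup2", ("int", "int"))]
```

With both dup2 functions visible, the no-argument form raises the "not
unique" error expected above, rather than "does not exist".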
-- 
Michael




Re: [HACKERS] [NOVICE] opr_charset rule in gram.y

2017-03-06 Thread Tom Lane
Neha Khatri  writes:
> I was going through the grammar rule for Character types in gram.y and
> found an optional sub-rule, "opt_charset"

This question seems quite off-topic for pgsql-novice, so I've redirected
it to pgsql-hackers.

> CharacterWithLength:  character '(' Iconst ')' opt_charset
> {
> if (($5 != NULL) && (strcmp($5, "sql_text") != 0))
> $1 = psprintf("%s_%s", $1, $5);
>
> $$ = SystemTypeName($1);

> I would like to understand how opt_charset would get used in real
> applications. I tried to make use of it but results in failure:

> postgres=# create table testchar (c1 char(5) character set utf8);
> ERROR:  type "pg_catalog.bpchar_utf8" does not exist
> LINE 1: create table testchar (c1 char(5) character set utf8);

> There does not seem to be any documentation available about this optional
> parameter in the documents for Character data type( at least I could not
> find any).

Some digging in the git history shows that opt_charset was introduced in
commit f10b63923 (in 1997!), and the current interpretation as
"typename_charsetname" appeared in commit 3e1beda2c.  Neither commit
adds any user documentation or even mentions fooling with character type
declarations in its commit message.  I think that this must have been
work-in-progress that Tom Lockhart left behind when he dropped out of
the project circa 2002.

Since there are no datatypes matching the "typename_charsetname" naming
pattern, and today we'd be unlikely to accept an implementation based on
that approach even if someone wrote one, my inclination is just to rip out
this code.  Maybe we could leave behind the no-op case that allows
"CHARACTER SET SQL_TEXT", but I don't really see the point, since we've
never documented supporting any such syntax.

regards, tom lane




[HACKERS] logical replication apply to run with sync commit off by default

2017-03-06 Thread Petr Jelinek
Hi,

there has been discussion on the logical replication initial copy thread
[1] about making apply work with synchronous commit off by default for
performance reasons, and adding an option to change that per
subscription.

Here I propose a patch to implement this: it adds a boolean column
subssynccommit to the pg_subscription catalog, which decides whether
synchronous_commit should be off or local for apply, and it adds
SYNCHRONOUS_COMMIT = boolean to the list of WITH options for CREATE and
ALTER SUBSCRIPTION. When nothing is specified, it defaults to false.

The patch is currently built on top of the copy patch, as there are
conflicts between the two and this helps a bit with testing the copy patch.

[1]
https://www.postgresql.org/message-id/CA+TgmoY7Lk2YKArcp4O=Qu=xoor8j71mad1oteojawmuje3...@mail.gmail.com

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


Add-option-to-modify-sync-commit-per-subscription.patch
Description: binary/octet-stream



Re: [HACKERS] increasing the default WAL segment size

2017-03-06 Thread Ashutosh Sharma
Hi,

I took a look at this patch. Overall, the patch looks good to me.
However, there are some review comments that I would like to share:

1. I think the macro 'PATH_MAX' used in the pg_waldump.c file is specific
to Linux. It needs to be changed to some constant value, or maybe
MAXPGPATH, in order to make it platform-independent.

2. As already mentioned by Jim and Kuntal upthread, you are trying to
detect the configured WAL segment size in the pg_waldump.c and
pg_standby.c files based on the size of a random WAL file, which
doesn't look like a good idea. But then, I think the only option we
have is to pass the location of the pg_control file to the pg_waldump
module along with the start and end WAL segments.
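For what it's worth, the size-based guess under discussion amounts to a
power-of-two sanity check, which could be sketched like this (a
hypothetical helper, not actual pg_waldump code; the minimum and maximum
bounds are assumptions) - and it also shows why the check cannot
distinguish a valid small segment from a truncated file:

```c
#include <stdbool.h>
#include <stdint.h>
#include <assert.h>

#define MIN_WAL_SEG_SIZE ((uint64_t) 1024 * 1024)        /* assumed 1MB floor */
#define MAX_WAL_SEG_SIZE ((uint64_t) 1024 * 1024 * 1024) /* assumed 1GB ceiling */

/*
 * Return true if "size" could plausibly be a configured WAL segment
 * size: a power of two within the allowed range.  Note that this is
 * exactly the check that cannot tell a valid 16MB segment apart from
 * a truncated file in a 64MB-segment cluster, which is why reading
 * pg_control is the more reliable approach.
 */
bool
is_plausible_wal_seg_size(uint64_t size)
{
	if (size < MIN_WAL_SEG_SIZE || size > MAX_WAL_SEG_SIZE)
		return false;
	return (size & (size - 1)) == 0;	/* power of two */
}
```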

3. When trying to compile '02-initdb-walsegsize-v2.patch' on Windows,
I got this warning message:

warning C4005: 'DEFAULT_XLOG_SEG_SIZE' : macro redefinition
c:\users\ashu\postgresql\src\include\pg_config_manual.h(20)

Apart from these, I don't have any other comments as of now. I am still
validating the patch on Windows. If I find any issues, I will update
the thread.

--
With Regards,
Ashutosh Sharma.
EnterpriseDB: http://www.enterprisedb.com

On Tue, Feb 28, 2017 at 10:36 AM, Beena Emerson  wrote:
>
>
> On Tue, Feb 28, 2017 at 9:45 AM, Jim Nasby  wrote:
>>
>> On 2/24/17 6:30 AM, Kuntal Ghosh wrote:
>>>
>>> * You're considering any WAL file whose size is a power of 2 as valid.
>>> Suppose the correct WAL seg size is 64MB. For some reason, the server
>>> generated a 16MB invalid WAL file (maybe it crashed while creating the
>>> WAL file). Your code seems to treat this as a valid file, which I think
>>> is incorrect. Do you agree with that?
>>
>>
>> Detecting correct WAL size based on the size of a random WAL file seems
>> like a really bad idea to me.
>>
>>
>> I also don't see the reason for #2... or is that how initdb writes out the
>> correct control file?
>
>
> The initdb module reads the size from the option provided and sets the
> environment variable. This variable is read in
> src/backend/access/transam/xlog.c and the ControlFile written.
> Unlike pg_resetwal and pg_rewind, pg_basebackup cannot access the control
> file; it only accesses the WAL folder.  So we get the XLogSegSize from
> the SHOW command using a replication connection.
> As Kuntal pointed out, I might need to set it from pg_receivewal.c as well.
>
> Thank you,
>
> Beena Emerson
>
> EnterpriseDB: https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company




Re: [HACKERS] Partitioned tables and relfilenode

2017-03-06 Thread Ashutosh Bapat
I see that all the changes by Amit and myself to what was earlier the 0003
patch are now part of the 0002 patch. The patch looks ready for committer.

Some comments about 0003 patch.

CREATE TABLE syntax seems to allow specifying reloptions for a
partitioned table, but extractRelOptions() and heap_reloptions() seem
to be ignoring those. Is that intentional? There are two callers of
extractRelOptions(): extract_autovac_opts() and
RelationParseRelOptions(). There's an assert in extract_autovac_opts()
preventing that function from getting called for a partitioned table.
RelationParseRelOptions() seems to filter the relations before calling
extractRelOptions(). I am wondering whether it's better to leave
extractRelOptions() as is and filter partitioned relations in
RelationParseRelOptions().

Do we need a similar condition modification in the other caller of
heap_create_init_fork(), i.e., ExecuteTruncate()?

On Mon, Mar 6, 2017 at 12:48 PM, Amit Langote
 wrote:
> Thanks for the review.
>
> On 2017/03/06 15:41, Michael Paquier wrote:
>> On Fri, Mar 3, 2017 at 10:02 AM, Amit Langote
>>  wrote:
>>> Thanks.  I noticed that 'and' is duplicated in a line added by the commit
>>> to analyze.sgml.  Attached 0001 fixes that.  0002 and 0003 same as the
>>> last version.
>>
>> /*
>> -* If all the children were temp tables, pretend it's a non-inheritance
>> -* situation.  The duplicate RTE we added for the parent table is
>> -* harmless, so we don't bother to get rid of it; ditto for the useless
>> -* PlanRowMark node.
>> +* If all the children were temp tables or if the parent is a partitioned
>> +* table without any leaf partitions, pretend it's a non-inheritance
>> +* situation.  The duplicate RTE for the parent table we added in the
>> +* non-partitioned table case is harmless, so we don't bother to get rid
>> +* of it; ditto for the useless PlanRowMark node.
>>  */
>> -   if (list_length(appinfos) < 2)
>> +   if (!has_child)
>> This comment is not completely correct. Children can be temp tables,
>> they just cannot be temp tables of other backends. It seems to me that
>> you could still keep this code simple and remove has_child..
>
> I updated the comment.  I recall having posted a patch for that once, but
> perhaps went unnoticed.
>
> About has_child, the other option is to make the minimum length of the
> appinfos list relkind-based, but the condition quickly becomes ugly. Do
> you have a suggestion?
>
>> @@ -932,7 +932,6 @@ extractRelOptions(HeapTuple tuple, TupleDesc tupdesc,
>> case RELKIND_RELATION:
>> case RELKIND_TOASTVALUE:
>> case RELKIND_MATVIEW:
>> -   case RELKIND_PARTITIONED_TABLE:
>> options = heap_reloptions(classForm->relkind, datum, false);
>> break;
>> Partitioned tables cannot have reloptions? What about all the
>> autovacuum_* parameters then? Those are mainly not storage-related.
>
> AFAIK, none of the heap reloptions will be applicable to partitioned table
> relations once we eliminate storage.
>
> About autovacuum_* parameters - we currently don't handle partitioned
> tables in autovacuum.c, because no statistics are reported for them. That
> is, relation_needs_vacanalyze() will never return true for dovacuum,
> doanalyze and wraparound if it is passed a RELKIND_PARTITIONED_TABLE
> relation.  That's something to be fixed separately though.  When we add
> autovacuum support for partitioned tables, we may want to add a new set of
> reloptions (new because partitioned tables still won't support all options
> returned by heap_reloptions()).  Am I missing something?
>
> Thanks,
> Amit



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] WARNING: relcache reference leak: relation "p1" not closed

2017-03-06 Thread Tom Lane
Amit Langote  writes:
> Also, I found out that alter_table.sql mistakenly forgot to drop
> partitioned table "p1".  Patch 0002 takes care of that.

While that might or might not have been intentional, I think it's an
astoundingly bad idea to not leave any partitioned tables behind in
the final state of the regression database.  Doing so would likely
have meant that this particular bug evaded detection for much longer
than it did.  Moreover, it would mean that the pg_upgrade test would
have exactly no coverage of partitioned cases.

Therefore, there should definitely be a partitioned table, hopefully with
a less generic name than "p1", in the final regression DB state.  Whether
this particular one from alter_table.sql is a good candidate, I dunno.
But let's not drop it without adding a better-thought-out replacement.

regards, tom lane




Re: [HACKERS] WARNING: relcache reference leak: relation "p1" not closed

2017-03-06 Thread Amit Langote
On 2017/03/07 10:49, Michael Paquier wrote:
> On Tue, Mar 7, 2017 at 10:37 AM, Amit Langote
>  wrote:
>> On 2017/03/07 7:28, Tom Lane wrote:
>>> Kevin Grittner  writes:
 With e434ad39ae7316bcf35fd578dd34ad7e1ff3c25f I did a `make world`,
 `make install-world`, a fresh default initdb, a start with default
 config, `make installcheck`, connected to the regression database
 with psql as the initial superuser, and ran:
>>>
 regression=# vacuum freeze analyze;
 WARNING:  relcache reference leak: relation "p1" not closed
 VACUUM
>>>
>>> p1 is a partitioned table.  (BTW, could I lobby for people not to use such
>>> generic, collision-prone names for tables that will be left behind after
>>> the regression tests?)  Also, I find that "vacuum analyze" is sufficient,
>>> or even just "analyze", or "analyze p1".  I think it's highly likely this
>>> was introduced by 3c3bb99330aa9b4c2f6258bfa0265d806bf365c3.  Certainly
>>> that failed to add appropriate regression test cases, or we would have
>>> noticed this already.
>>
>> That's right, sorry about that.  Attached patch fixes the relcache leak
>> and adds tests in vacuum.sql and truncate.sql.
> 
> (I was just poking at that)
>  if (childrel != onerel)
>  heap_close(childrel, AccessShareLock);
> +else
> +heap_close(childrel, NoLock);
>  continue;
> Shouldn't that be conditional on the relkind of childrel?

I think we could simply Assert that childrel is a partitioned table in
this whole block.  A child table could be a regular table, a materialized
view (?), a foreign table, or a partitioned table, the first three of
which are handled by the first two blocks.

Updated patch attached.

Also, I found out that alter_table.sql mistakenly forgot to drop
partitioned table "p1".  Patch 0002 takes care of that.

Thanks,
Amit
>From 0fe3bcea9424133b8711bdcc23b1432cde4fc394 Mon Sep 17 00:00:00 2001
From: amit 
Date: Tue, 7 Mar 2017 10:26:16 +0900
Subject: [PATCH 1/2] Fix relcache ref leak in acquire_inherited_sample_rows

Add tests for vacuum, analyze, truncate on partitioned table, as
3c3bb9933 should have.
---
 src/backend/commands/analyze.c |  7 +--
 src/test/regress/expected/truncate.out |  6 ++
 src/test/regress/expected/vacuum.out   |  9 +
 src/test/regress/sql/truncate.sql  |  7 +++
 src/test/regress/sql/vacuum.sql| 10 ++
 5 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index a70c760341..b91df986c5 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -1360,11 +1360,14 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
 		else
 		{
 			/*
-			 * ignore, but release the lock on it.  could be a partitioned
-			 * table.
+			 * ignore, but release the lock on it.  don't try to unlock the
+			 * passed-in relation
 			 */
+			Assert(childrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
 			if (childrel != onerel)
 heap_close(childrel, AccessShareLock);
+			else
+heap_close(childrel, NoLock);
 			continue;
 		}
 
diff --git a/src/test/regress/expected/truncate.out b/src/test/regress/expected/truncate.out
index 5c5277e0f1..81612d8c88 100644
--- a/src/test/regress/expected/truncate.out
+++ b/src/test/regress/expected/truncate.out
@@ -420,3 +420,9 @@ SELECT nextval('truncate_a_id1'); -- fail, seq should have been dropped
 ERROR:  relation "truncate_a_id1" does not exist
 LINE 1: SELECT nextval('truncate_a_id1');
^
+-- partitioned table
+CREATE TABLE truncparted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE truncparted1 PARTITION OF truncparted FOR VALUES IN (1);
+INSERT INTO truncparted VALUES (1, 'a');
+TRUNCATE truncparted;
+DROP TABLE truncparted;
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9b604be4b6..6f68663087 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -82,3 +82,12 @@ VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
+-- partitioned table
+CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
+INSERT INTO vacparted VALUES (1, 'a');
+UPDATE vacparted SET b = 'b';
+VACUUM (ANALYZE) vacparted;
+VACUUM (FULL) vacparted;
+VACUUM (FREEZE) vacparted;
+DROP TABLE vacparted;
diff --git a/src/test/regress/sql/truncate.sql b/src/test/regress/sql/truncate.sql
index a3d6f5368f..d61eea1a42 100644
--- a/src/test/regress/sql/truncate.sql
+++ b/src/test/regress/sql/truncate.sql
@@ -215,3 +215,10 @@ SELECT * FROM truncate_a;
 DROP TABLE truncate_a;
 
 SELECT nextval('truncate_a_id1'); -- fail, seq should have been dropped
+
+-- partitioned table
+CREATE TABLE truncparted (a int, b char) 

Re: [HACKERS] Parallel seq. plan is not coming against inheritance or partition table

2017-03-06 Thread Ashutosh Sharma
> I also think that commit didn't intend to change the behavior,
> however, the point is how sensible is it to keep such behavior after
> Parallel Append.  I am not sure if this is the right time to consider
> it or shall we wait till Parallel Append is committed.
>
>> I think the problem here is that compute_parallel_worker() thinks it
>> can use 0 as a sentinel value that means "ignore this", but it can't
>> really, because a heap can actually contain 0 pages.  Here's a patch
>> which does the following:
>>
>
> - if (heap_pages < (BlockNumber) min_parallel_table_scan_size &&
> - index_pages < (BlockNumber) min_parallel_index_scan_size &&
> - rel->reloptkind == RELOPT_BASEREL)
> + if (rel->reloptkind == RELOPT_BASEREL &&
> + ((heap_pages >= 0 && heap_pages < min_parallel_table_scan_size) ||
> + (index_pages >= 0 && index_pages < min_parallel_index_scan_size)))
>   return 0;
>
> The purpose of considering both heap and index pages for
> min_parallel_* is that even if one of them is bigger than the threshold,
> we should try parallelism.

Yes, but this is only true for normal tables, not for partitioned or
inheritance tables. I think for a partitioned table, even if only one heap
page exists, we go for parallelism.

> This could be helpful for parallel
> index scans where we consider parallel workers based on both heap and
> index pages.  Is there a reason for you to change that condition as
> that is not even related to the problem being discussed?
>

I think he has changed it to allow parallelism for inheritance or
partitioned tables. For normal tables, it won't be touched unless the
below if-condition is satisfied:




if (heap_pages < (BlockNumber) min_parallel_table_scan_size &&
    index_pages < (BlockNumber) min_parallel_index_scan_size &&
    rel->reloptkind == RELOPT_BASEREL)
    return 0;
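As a sketch of the sentinel idea being debated, one could use a negative
page count to mean "not applicable", so that an empty (0-page) heap is no
longer confused with "ignore this input". The function below is
illustrative only - names and exact semantics are made up for this sketch
and are not either patch verbatim:

```c
#include <stdbool.h>
#include <assert.h>

/* assumed stand-ins for the GUCs discussed above, in blocks */
static const long min_parallel_table_scan_size = 1024;
static const long min_parallel_index_scan_size = 64;

/*
 * Sketch of the size gate in compute_parallel_worker(): a negative
 * page count means "size not applicable here", while 0 is a real,
 * valid size (an empty heap).  If either known size reaches its
 * threshold, we do not suppress parallelism.
 */
bool
too_small_for_parallelism(long heap_pages, long index_pages, bool is_baserel)
{
	bool		heap_big = (heap_pages >= 0 &&
							heap_pages >= min_parallel_table_scan_size);
	bool		index_big = (index_pages >= 0 &&
							 index_pages >= min_parallel_index_scan_size);

	return is_baserel && !heap_big && !index_big;
}
```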

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-06 Thread Peter Geoghegan
On Mon, Mar 6, 2017 at 3:57 PM, Andres Freund  wrote:
> I'm ok with not immediately doing so, but I think Peter's design isn't
> in line with achieving something like this.

I would be okay with doing this if we had a grab-bag of expensive
checks that all pretty much require some combination of
ExclusiveLocks and/or ShareLocks anyway, and that are amenable to being
combined. I could see that happening, but we're a long way off from
that. When it does happen, we might have other things that are a bit
like bt_index_parent_check(), in terms of locking requirements and
degree of thoroughness, that might themselves require other parameters
specific to the verification that they perform that we cannot
anticipate. For example, maybe bt_index_parent_check() could have a
parameter that is a probability that any given IndexTuple will be
verified against its heap tuple. Then you could have something like a
PRNG seed argument, so you can check different tuples every day. There
are many possibilities, and I hope that eventually amcheck has this
kind of very rich functionality.

When you use the interface you describe to "stack" several checks at
once, less important parameters like the ones I suggest are going to
be an awkward fit, and so will probably be excluded. I'm not opposed
to having what amounts to a second interface to what bt_index_check()
does, to make this kind of thing work where it makes sense. Really,
bt_index_parent_check() is almost an alternative interface to the same
functionality today. There can be another one in the future, if it
serves a purpose, and the locking requirements are roughly the same
for all the checks. I'd be fine with that. Let's just get the basic
feature in for now, though.

-- 
Peter Geoghegan




Re: [HACKERS] Foreign Join pushdowns not working properly for outer joins

2017-03-06 Thread Ashutosh Bapat
On Tue, Mar 7, 2017 at 9:33 AM, Etsuro Fujita
 wrote:
> On 2017/03/06 21:22, Ashutosh Bapat wrote:
>>
>> On Mon, Mar 6, 2017 at 1:29 PM, David Rowley
>>  wrote:
>>>
>>> On 6 March 2017 at 18:51, Etsuro Fujita 
>>> wrote:

 On 2017/03/06 11:05, David Rowley wrote:
>>>
>>> I looked over yours and was surprised to see:
>>>
>>> + /* is_foreign_expr might need server and shippable-extensions info. */
>>> + fpinfo->server = fpinfo_o->server;
>>> + fpinfo->shippable_extensions = fpinfo_o->shippable_extensions;
>>>
>>> That appears to do nothing for the other server options. For example
>>> use_remote_estimate, which is used in estimate_path_cost_size(). As of
>>> now, that's also broken. My patch fixes that problem too, yours
>>> appears not to.
>
>
> Thanks for the comments!  Actually, those options including
> use_remote_estimate are set later in foreign_join_ok, so
> estimate_path_cost_size would work well, for example.
>
>>> Even if you expanded the above code to process all server options, if
>>> someone comes along later and adds a new option, my code will pick it
>>> up, yours could very easily be forgotten about and be the cause of yet
>>> more bugs.
>
>
> I agree with you on that point.
>
>>> It seems like a much better idea to keep the server option processing
>>> in one location, which is what I did.
>
>
> Seems like a better idea.
>
>> I agree with this. However
>> 1. apply_server_options() is parsing the options strings again and
>> again, which seems wastage of CPU cycles. It should probably pick up
>> the options from one of the joining relations. Also, the patch calls
>> GetForeignServer(), which is not needed; in foreign_join_ok(), it will
>> copy it from the outer relation anyway.
>> 2. Some server options like use_remote_estimate and fetch_size are
>> overridden by corresponding table level options. For a join relation
>> the values of these options are derived by some combination of
>> table-level options.
>>
>> I think we should write a separate function
>> apply_fdw_options_for_joinrel() and pass the fpinfos of joinrel, outer
>> and inner relation. The function will copy the values of server level
>> options and derive values for table level options. We would add a note
>> there to keep this function in sync with apply_*_options(). I don't
>> think there's any better way to keep the options in sync for base
>> relations and join relations.
>>
>> Here's the patch attached.
>
>
> I looked over the patch quickly.  I think it's a better idea to gather
> the various option processing in foreign_join_ok (or foreign_grouping_ok)
> in one place.  However, I'm wondering whether we really need to introduce
> apply_table_options and apply_server_options.  ISTM that the option
> processing in postgresGetForeignRelSize is already well gathered in one
> place.

I kept that as is since I liked the way it's modularized now.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

2017-03-06 Thread David Rowley
On 2 March 2017 at 16:06, Amit Kapila  wrote:
> On Wed, Mar 1, 2017 at 5:32 PM, David Rowley
>  wrote:
>> Hackers,
>>
>> I've attached a small patch which aims to improve the performance of
>> AccessExclusiveLock when there are many AccessExclusiveLock stored.
>>
>
> I could see that your idea is quite straightforward to improve
> performance (basically, convert recovery lock list to recovery lock
> hash table based on xid), however, it is not clear under what kind of
> workload this will be helpful?  Do you expect that many concurrent
> sessions are trying to acquire different AEL locks?

Many thanks for looking over this.

The problem case is when many AccessExclusiveLocks are in the list:
each transaction which commits must look through the entire list for
locks which belong to it. This can be a problem when a small number of
transactions create a large number of temp tables, perhaps in some batch
processing job. Other transactions, even ones which don't create any
temp tables, must look through the list to release any locks belonging
to them. All we need to do to recreate this is have a bunch of
transactions creating temp tables, then try to execute some very
small and fast transactions at the same time.

It's possible to reproduce the problem by running the attached
temptable_bench.sql with:

pgbench -f temptable_bench.sql -j 10 -c 10 -T 60 -n postgres

while concurrently running

pgbench -f updatecounter.sql -T 60 -n postgres

after having done:

CREATE TABLE t1 (value int not null);
insert into t1 values(1);

The hot physical standby's lag grows significantly during the run. I saw
up to 500MB of lag on a short 1-minute run, and perf record shows that
StandbyReleaseLocks() is consuming 90% of CPU time.

With the patch, StandbyReleaseLocks() is down at about 2% CPU.

> Few comments on the patch:
> 1.
> +/*
> + * Number of buckets to split RecoveryLockTable into.
> + * This must be a power of two.
> + */
>
> +#define RECOVERYLOCKTABLE_SIZE 1024
>
> On what basis did you decide to keep the lock table size as 1024, is
> it just a guess, if so, then also adding a comment to indicate that
> you think this is sufficient for current use cases seems better.
> However, if it is based on some data, then it would be more
> appropriate.

That may need tweaking. Likely it could be smaller if we had some sort
of bloom filter to mark whether the transaction had obtained any AEL
locks; that way it could skip the scan entirely. Initially I really
didn't want to make the patch too complex. I had thought that a fairly
large hash table would fix the problem well enough, as quite possibly
most buckets would be empty and non-empty buckets would have short lists.
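For illustration, bucket selection for a power-of-two sized table
reduces to a cheap bitwise AND (the names below are made up for this
sketch and are not the patch's actual code):

```c
#include <stdint.h>
#include <assert.h>

#define RECOVERYLOCKTABLE_SIZE 1024	/* must be a power of two */

typedef uint32_t TransactionId;

/*
 * Map an xid to its bucket.  Because the table size is a power of two,
 * the modulo reduces to a bitwise AND, and releasing one transaction's
 * locks only walks that bucket's short list instead of scanning every
 * standby lock on the system.
 */
uint32_t
recovery_lock_bucket(TransactionId xid)
{
	return xid & (RECOVERYLOCKTABLE_SIZE - 1);
}
```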

> 2.
> @@ -607,6 +627,8 @@ StandbyAcquireAccessExclusiveLock(TransactionId
> xid, Oid dbOid, Oid relOid)
>  {
>   xl_standby_lock *newlock;
>   LOCKTAG locktag;
> + size_t pidx;
> +
>
> Comment on top of this function needs some change, it still seems to
> refer to RelationLockList.

Will fix that.

>
> * We keep a single dynamically expandible list of locks in local memory,
>  * RelationLockList, so we can keep track of the various entries made by
>  * the Startup process's virtual xid in the shared lock table.
>
> 3.
> + size_t pidx;
> +
> + if (!TransactionIdIsValid(xid))
> + {
> + StandbyReleaseAllLocks();
> + return;
> + }
>
> What is the need of this new code?

I was not much of a fan of the old way the function coped with an
invalid xid. The test was being performed inside the for loop. My
new for loop was not really compatible with that test, so I moved
handling of invalid xids to outside the loop. I think doing that with
today's code would be an improvement, as it would mean less work inside
the slow loop.

> 4.
> In some cases, it might slow down the performance where you have to
> traverse the complete hash table like StandbyReleaseOldLocks, however,
> I think those cases will be less relative to other lock release calls
> which are called at every commit, so probably this is okay.

Yes, that may be true. Perhaps making the table smaller would help
bring that back to what it was, or perhaps the patch needs to be
rethought to use bloom filters to help identify whether we need to
release any locks.

Opinions on that would be welcome. Meanwhile I'll go off and play with
bloom filters to see if I can knock some percentage points off the perf
report for StandbyReleaseLocks().
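A minimal sketch of that bloom-filter idea (entirely hypothetical; the
filter size and hash functions are arbitrary choices for illustration,
not anything from the patch):

```c
#include <stdbool.h>
#include <stdint.h>
#include <assert.h>

typedef uint32_t TransactionId;

/* a tiny illustrative filter: 1024 bits, two cheap hash functions */
#define AEL_FILTER_BITS 1024
static uint64_t ael_filter[AEL_FILTER_BITS / 64];

static uint32_t
hash1(TransactionId xid) { return (xid * 2654435761u) % AEL_FILTER_BITS; }
static uint32_t
hash2(TransactionId xid) { return (xid * 40503u + 1) % AEL_FILTER_BITS; }

static void
set_bit(uint32_t b) { ael_filter[b / 64] |= UINT64_C(1) << (b % 64); }
static bool
test_bit(uint32_t b) { return (ael_filter[b / 64] >> (b % 64)) & 1; }

/* record that this xid acquired at least one AccessExclusiveLock */
void
ael_filter_add(TransactionId xid)
{
	set_bit(hash1(xid));
	set_bit(hash2(xid));
}

/*
 * May return true for an xid that took no locks (a false positive),
 * but never false for one that did -- so a "false" answer lets the
 * commit path skip the lock-release scan entirely.
 */
bool
ael_filter_might_contain(TransactionId xid)
{
	return test_bit(hash1(xid)) && test_bit(hash2(xid));
}
```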

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


temptable_bench.sql.bz2
Description: BZip2 compressed data


updatecounter.sql
Description: Binary data



Re: [HACKERS] Foreign Join pushdowns not working properly for outer joins

2017-03-06 Thread Etsuro Fujita

On 2017/03/06 21:22, Ashutosh Bapat wrote:

On Mon, Mar 6, 2017 at 1:29 PM, David Rowley
 wrote:

On 6 March 2017 at 18:51, Etsuro Fujita  wrote:

On 2017/03/06 11:05, David Rowley wrote:

I looked over yours and was surprised to see:

+ /* is_foreign_expr might need server and shippable-extensions info. */
+ fpinfo->server = fpinfo_o->server;
+ fpinfo->shippable_extensions = fpinfo_o->shippable_extensions;

That appears to do nothing for the other server options. For example
use_remote_estimate, which is used in estimate_path_cost_size(). As of
now, that's also broken. My patch fixes that problem too, yours
appears not to.


Thanks for the comments!  Actually, those options including 
use_remote_estimate are set later in foreign_join_ok, so 
estimate_path_cost_size would work well, for example.



Even if you expanded the above code to process all server options, if
someone comes along later and adds a new option, my code will pick it
up, yours could very easily be forgotten about and be the cause of yet
more bugs.


I agree with you on that point.


It seems like a much better idea to keep the server option processing
in one location, which is what I did.


Seems like a better idea.


I agree with this. However
1. apply_server_options() is parsing the options strings again and
again, which seems wastage of CPU cycles. It should probably pick up
the options from one of the joining relations. Also, the patch calls
GetForeignServer(), which is not needed; in foreign_join_ok(), it will
copy it from the outer relation anyway.
2. Some server options like use_remote_estimate and fetch_size are
overridden by corresponding table level options. For a join relation
the values of these options are derived by some combination of
table-level options.

I think we should write a separate function
apply_fdw_options_for_joinrel() and pass the fpinfos of joinrel, outer
and inner relation. The function will copy the values of server level
options and derive values for table level options. We would add a note
there to keep this function in sync with apply_*_options(). I don't
think there's any better way to keep the options in sync for base
relations and join relations.

Here's the patch attached.


I looked over the patch quickly.  I think it's a better idea to 
gather the various option processing in foreign_join_ok (or 
foreign_grouping_ok) in one place.  However, I'm wondering whether we 
really need to introduce apply_table_options and apply_server_options.  
ISTM that the option processing in postgresGetForeignRelSize is already 
well gathered in one place.


Best regards,
Etsuro Fujita






Re: [HACKERS] Parallel seq. plan is not coming against inheritance or partition table

2017-03-06 Thread Amit Kapila
On Tue, Mar 7, 2017 at 3:24 AM, Robert Haas  wrote:
> On Sun, Mar 5, 2017 at 9:41 PM, Amit Kapila  wrote:
>>> RCA:
>>> 
>>> From "Replace min_parallel_relation_size with two new GUCs" commit
>>> onwards, we are not assigning parallel workers for the child rel with
>>> zero heap pages. This means we won't be able to create a partial
>>> append path as this requires all the child rels within an Append Node
>>> to have a partial path.
>>
>> Right, but OTOH, if we assign parallel workers by default, then it is
>> quite possible that it would result in much worse plans.  Consider a
>> case where partition hierarchy has 1000 partitions and only one of
>> them is big enough to allow parallel workers.  Now in this case, with
>> your proposed fix it will try to scan all the partitions in parallel
>> workers which I think can easily result in bad performance.  I think
>> the right way to make such plans parallel is by using Parallel Append
>> node (https://commitfest.postgresql.org/13/987/).  Alternatively, if
>> you want to force parallelism in cases like the one you have shown in
>> example, you can use Alter Table .. Set (parallel_workers = 1).
>
> Well, I think this is a bug in
> 51ee6f3160d2e1515ed6197594bda67eb99dc2cc.  The commit message doesn't
> say anything about changing any behavior, just about renaming GUCs,
> and I don't remember any discussion about changing the behavior
> either, and the comment still implies that we have the old behavior
> when we really don't, and parallel append isn't committed yet.
>

I also think that commit didn't intend to change the behavior,
however, the point is how sensible is it to keep such behavior after
Parallel Append.  I am not sure if this is the right time to consider
it or shall we wait till Parallel Append is committed.

> I think the problem here is that compute_parallel_worker() thinks it
> can use 0 as a sentinel value that means "ignore this", but it can't
> really, because a heap can actually contain 0 pages.  Here's a patch
> which does the following:
>

- if (heap_pages < (BlockNumber) min_parallel_table_scan_size &&
- index_pages < (BlockNumber) min_parallel_index_scan_size &&
- rel->reloptkind == RELOPT_BASEREL)
+ if (rel->reloptkind == RELOPT_BASEREL &&
+ ((heap_pages >= 0 && heap_pages < min_parallel_table_scan_size) ||
+ (index_pages >= 0 && index_pages < min_parallel_index_scan_size)))
  return 0;

The purpose of considering both heap and index pages for
min_parallel_* is that even if one of them is bigger than the threshold,
we should try parallelism.  This could be helpful for parallel
index scans where we consider parallel workers based on both heap and
index pages.  Is there a reason for you to change that condition, as
that is not even related to the problem being discussed?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Logical replication existing data copy

2017-03-06 Thread Petr Jelinek
On 06/03/17 23:40, Erik Rijkers wrote:
> On 2017-03-06 16:10, Erik Rijkers wrote:
>> On 2017-03-06 11:27, Petr Jelinek wrote:
>>> Hi,
>>>
>>> updated and rebased version of the patch attached.
>>>
>>
>> I compiled with /only/ this one latest patch:
>>0001-Logical-replication-support-for-initial-data-copy-v6.patch
>>
>> Is that correct, or are other patches still needed on top, or underneath?
>>
> 
> TWIMC, I'll answer my own question: the correct patchset seems to be
> these six:
> 
> 0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch
> 0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch
> 0003-Prevent-snapshot-builder-xmin-from-going-backwards.patch
> 0004-Fix-xl_running_xacts-usage-in-snapshot-builder.patch
> 0005-Skip-unnecessary-snapshot-builds.patch
> 0001-Logical-replication-support-for-initial-data-copy-v6.patch
> 

Yes, the 0004 and 0005 patches improve the performance of snapshot
creation considerably. They're also the reason we uncovered all the
snapbuilder bugs now. Before the performance optimization, it was very
unlikely for some of the race conditions to happen, as the snapshot would
just not be built until there was less concurrent traffic on the server.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Report the number of skipped frozen pages by manual VACUUM

2017-03-06 Thread Masahiko Sawada
On Mon, Mar 6, 2017 at 12:12 PM, Yugo Nagata  wrote:
> Hi,
>
> I think this is good since the information is useful and it is
> a little change.

Thank you for reviewing this patch!

>
> One thing that bothers me is that even when the number of frozen pages is
> one, it will say "1 frozen pages". In other messages, when the
> number of pages is one, the word "page" rather than "pages" is used
> by using ngettext().

I don't think it can say "1 frozen pages" because the number of
skipped pages according to the visibility map is always more than 32
(SKIP_PAGES_THRESHOLD).

>
> In addition, the document (doc/src/sgml/ref/vacuum.sgml) needs a modification
> to use the new messages in its example.

Fixed.

>
> BTW, this patch can't be applied after the following commit.
>
> commit 9eb344faf54a849898d9be012ddfa8204cfeb57c
> Author: Simon Riggs 
> Date:   Fri Mar 3 19:18:25 2017 +0530
>
> Allow vacuums to report oldestxmin

Attached updated v2 patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


report_frozen_skipped_pages_v2.patch
Description: Binary data



Re: [HACKERS] Re: check failure with -DRELCACHE_FORCE_RELEASE -DCLOBBER_FREED_MEMORY

2017-03-06 Thread Andrew Dunstan


On 03/06/2017 05:14 PM, Tom Lane wrote:
> I wrote:
>> I fixed that, and the basic regression tests no longer crash outright with
>> these settings, but I do see half a dozen errors that all seem to be in
>> RLS-related tests.
> Those turned out to all be the same bug in DoCopy.  "make check-world"
> passes for me now with -DRELCACHE_FORCE_RELEASE, but I've only tried
> HEAD not the back branches.
>
>   



I have tried the back branches. They are good.

cheers

andrew


-- 
Andrew Dunstan   https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] Small fix to postgresql.conf.sample's comment on max_parallel_workers

2017-03-06 Thread David Rowley
On 7 March 2017 at 15:21, Amit Kapila  wrote:
> +1.  How about changing the description of
> max_parallel_workers_per_gather to "taken from max_worker_processes,
> limited by max_parallel_workers"?

Thanks for looking.

Seems more accurate to say that it's "taken from
max_parallel_workers", maybe. You can't "take" more than what's there,
so perhaps the extra text is not required.

Patch attached.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


postgresql.conf.sample_fix_v2.patch
Description: Binary data



Re: [HACKERS] Small fix to postgresql.conf.sample's comment on max_parallel_workers

2017-03-06 Thread Amit Kapila
On Tue, Mar 7, 2017 at 6:39 AM, David Rowley
 wrote:
> While scanning over postgresql.conf I happened to notice something
> that didn't ring quite true about max_parallel_workers. The comment
> confuses worker_processes with parallel workers.
>

+1.  How about changing the description of
max_parallel_workers_per_gather to "taken from max_worker_processes,
limited by max_parallel_workers"?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] adding an immutable variant of to_date

2017-03-06 Thread Andreas Karlsson

On 03/03/2017 10:41 PM, Sven R. Kunze wrote:

What do you think?


I have some thoughts:

1) I do not think we currently allow setting the locale like this 
anywhere, so this will introduce a new concept to PostgreSQL. And you 
will probably need to add support for caching per locale.
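The per-locale caching idea could look roughly like this (a hypothetical Python sketch of the concept only, not anything in PostgreSQL; the function name and the stand-in lookup are invented for illustration):

```python
import calendar

# Cache the expensive locale-dependent lookup once per locale name,
# rather than keeping a single global cache that a locale switch would
# invalidate.
_month_name_cache: dict[str, list[str]] = {}

def localized_month_names(locale_name: str) -> list[str]:
    cached = _month_name_cache.get(locale_name)
    if cached is None:
        # Stand-in for a real per-locale lookup of month names.
        cached = list(calendar.month_name)[1:]
        _month_name_cache[locale_name] = cached
    return cached
```

Repeated calls for the same locale return the cached list instead of redoing the lookup.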


2) As far as I can tell from reading the code, to_date currently ignores 
the M suffix which indicates that you want localized month/day names, so 
I guess that to_date is currently immutable. Maybe it is stable due to 
the idea that we may want to support the M suffix in the future.


3) I do not like the to_date function. It is much too forgiving with 
invalid input. For example, 2017-02-30 becomes 2017-03-02. Also, just 
ignoring the M suffix in the format string seems pretty bad.
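To illustrate that forgiveness, here is a minimal Python sketch of the kind of day-overflow normalization described above (not PostgreSQL's actual to_date implementation; the function name is invented):

```python
from datetime import date, timedelta

def forgiving_to_date(year: int, month: int, day: int) -> date:
    # Start at the first of the requested month and let any excess
    # days roll forward, the way a lenient parser normalizes input.
    return date(year, month, 1) + timedelta(days=day - 1)

print(forgiving_to_date(2017, 2, 30))   # 2017-03-02: invalid input silently accepted
print(forgiving_to_date(2017, 2, 28))   # 2017-02-28: valid input passes through unchanged
```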


Personally I would rather see a new date parsing function which is 
easier to work with, or somehow fix to_date without pissing too many 
users off, but I have no idea if this view is shared by the rest of 
the community.


Andreas




Re: [HACKERS] Enabling replication connections by default in pg_hba.conf

2017-03-06 Thread Michael Paquier
On Tue, Mar 7, 2017 at 5:03 AM, Peter Eisentraut
 wrote:
> On 3/3/17 20:30, Michael Paquier wrote:
>> Yeah, it looks sensible to me to keep "replication" for physical
>> replication, and switch logical replication checks to match a database
>> name in hba comparisons.
>
> I think we are OK to move ahead with this.
>
> Another question would be why only enable connections for
> @default_username@ by default, instead of all.

That would make sense as well.

> Also, with this change, some test code that sets up pg_hba.conf for
> replication can be removed.  See attached patch.

Indeed.

I think that the documentation of initdb should mention that
pg_hba.conf entries are configured for replication connections as
well, something like a sentence in the Description paragraph:
initdb sets pg_hba.conf entries using the specified authentication
method (trust by default) for non-replication as well as replication
connections.

What do you think?
-- 
Michael




Re: [HACKERS] wait events for disk I/O

2017-03-06 Thread Amit Kapila
On Tue, Mar 7, 2017 at 5:27 AM, Robert Haas  wrote:
> On Mon, Mar 6, 2017 at 3:27 AM, Rushabh Lathia  
> wrote:
>> Yes, I thought of adding a wait event only for the sync, but then recording
>> the wait event for both write and sync. I understand that OS-level writes
>> are cheap, but they still have some cost attached. Also I thought that the
>> monitoring tools being developed using these wait events will capture more
>> useful data if we try to collect as much info as we can. Or maybe not.
>>
>> I am open for other opinion/suggestions.
>
> Writes are NOT always fast.  I've seen cases of write() blocking for
> LONG periods of time on systems that are in the process of failing, or
> just busy.  So I think we certainly want to advertise a wait event for
> those.
>

Sure, if you think both Writes and Reads at OS level can have some
chance of blocking in obscure cases, then we should add a wait event
for them.

> Likewise, I agree that the prefetch call probably SHOULDN'T block, but
> just because it shouldn't doesn't mean it never will.
>
> I think somebody should try a pgbench run with this patch applied,
> using a scale factor greater than shared_buffers, and generate a wait
> event profile, just to see if these are showing up and how often.
>

Yeah, that makes sense to me and we should try for both read-write and
read-only tests.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Restricting maximum keep segments by repslots

2017-03-06 Thread Craig Ringer
On 28 February 2017 at 12:27, Petr Jelinek  wrote:

>> This patch adds a GUC to put a limit to the number of segments
>> that replication slots can keep. Hitting the limit during
>> checkpoint shows a warning and the segments older than the limit
>> are removed.
>>
>>> WARNING:  restart LSN of replication slots is ignored by checkpoint
>>> DETAIL:  Some replication slots lose required WAL segments to continue.
>>
>
> However this is dangerous as logical replication slot does not consider
> it error when too old LSN is requested so we'd continue replication,
> hiding data loss.

That skipping only happens if you request a startpoint older than
confirmed_flush_lsn.  It doesn't apply to this situation.

The client cannot control where we start decoding, it's always
restart_lsn, and if we can't find a needed WAL segment we'll ERROR. So
this is safe, though the error will be something about being unable to
find a WAL segment, which users might not directly associate with having
set this option. It won't say "slot disabled because needed WAL has
been discarded due to [setting]" or anything.



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Partitioned tables and relfilenode

2017-03-06 Thread Amit Langote
On 2017/03/06 17:22, Michael Paquier wrote:
> On Mon, Mar 6, 2017 at 4:18 PM, Amit Langote
>  wrote:
>> About autovacuum_* parameters - we currently don't handle partitioned
>> tables in autovacuum.c, because no statistics are reported for them. That
>> is, relation_needs_vacanalyze() will never return true for dovacuum,
>> doanalyze and wraparound if it is passed a RELKIND_PARTITIONED_TABLE
>> relation.  That's something to be fixed separately though.  When we add
>> autovacuum support for partitioned tables, we may want to add a new set of
>> reloptions (new because partitioned tables still won't support all options
>> returned by heap_reloptions()).  Am I missing something?
> 
> OK. I got confused by the fact that settings on parents should
> supersede the settings of the children. Or, if you want, a value
> set on the parent would by default apply to a child that has no
> value set, which is where autovacuum_enabled makes sense even for
> partitioned tables.

Hmm, setting autovacuum_enabled on a partitioned parent should be made to
work after we have fixed autovacuum support for partitioned tables.  Using
the parent's value as a default for partitions may not be what we'd want
eventually.

> Leading to the point that parents could have
> reloptions, with a new category of the type RELOPT_KIND_PARTITION.
> Still, it is sensible as well to bypass the parents in autovacuum as
> well, now that I read it. And the handling is more simple.

We will need it though, because the lack of automatically updated
"inheritance" (or whole-tree) statistics on partitioned tables is a problem.

Thanks,
Amit






Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

2017-03-06 Thread Amit Langote
Hi Aleksander,

On 2017/03/07 0:22, Aleksander Alekseev wrote:
> Hello.
> 
> OK, here is a patch.
> 
> Benchmark, before:
> 
> ```
> number of transactions actually processed: 1823
> latency average = 1153.495 ms
> latency stddev = 154.366 ms
> tps = 6.061104 (including connections establishing)
> tps = 6.061211 (excluding connections establishing)
> ```
> 
> Benchmark, after:
> 
> ```
> number of transactions actually processed: 2462
> latency average = 853.862 ms
> latency stddev = 112.270 ms
> tps = 8.191861 (including connections establishing)
> tps = 8.192028 (excluding connections establishing)
> ```
> 
> +35% TPS, just as expected. Feel free to run your own benchmarks on
> different datasets and workloads. `perf top` shows that first bottleneck
> is completely eliminated.

That seems like a good gain.
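For what it's worth, the quoted pgbench numbers do work out to roughly the claimed gain (a quick sanity calculation using the tps figures above):

```python
# tps figures (including connections establishing) from the benchmark
before_tps = 6.061104   # before the patch
after_tps = 8.191861    # after the patch

gain = after_tps / before_tps - 1.0
print(f"{gain:.0%}")    # 35%
```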

> I did nothing about the second bottleneck
> since as Amit mentioned partition-pruning should solve this anyway and
> apparently any micro-optimizations don't worth an effort.

Sorry, I didn't mean to dissuade you from trying those
micro-optimizations.  If general inheritance cases can benefit from it
(which, until we have a different method, will be used by partitioned
tables as well), I think we should try it.

Thanks,
Amit






Re: [HACKERS] WARNING: relcache reference leak: relation "p1" not closed

2017-03-06 Thread Michael Paquier
On Tue, Mar 7, 2017 at 10:37 AM, Amit Langote
 wrote:
> On 2017/03/07 7:28, Tom Lane wrote:
>> Kevin Grittner  writes:
>>> With e434ad39ae7316bcf35fd578dd34ad7e1ff3c25f I did a `make world`,
>>> `make install-world`, a fresh default initdb, a start with default
>>> config, `make installcheck`, connected to the regression database
>>> with psql as the initial superuser, and ran:
>>
>>> regression=# vacuum freeze analyze;
>>> WARNING:  relcache reference leak: relation "p1" not closed
>>> VACUUM
>>
>> p1 is a partitioned table.  (BTW, could I lobby for people not to use such
>> generic, collision-prone names for tables that will be left behind after
>> the regression tests?)  Also, I find that "vacuum analyze" is sufficient,
>> or even just "analyze", or "analyze p1".  I think it's highly likely this
>> was introduced by 3c3bb99330aa9b4c2f6258bfa0265d806bf365c3.  Certainly
>> that failed to add appropriate regression test cases, or we would have
>> noticed this already.
>
> That's right, sorry about that.  Attached patch fixes the relcache leak
> and adds tests in vacuum.sql and truncate.sql.

(I was just poking at that)
 if (childrel != onerel)
 heap_close(childrel, AccessShareLock);
+else
+heap_close(childrel, NoLock);
 continue;
Shouldn't that be conditional on the relkind of childrel?
-- 
Michael




Re: [HACKERS] WARNING: relcache reference leak: relation "p1" not closed

2017-03-06 Thread Amit Langote
On 2017/03/07 7:28, Tom Lane wrote:
> Kevin Grittner  writes:
>> With e434ad39ae7316bcf35fd578dd34ad7e1ff3c25f I did a `make world`,
>> `make install-world`, a fresh default initdb, a start with default
>> config, `make installcheck`, connected to the regression database
>> with psql as the initial superuser, and ran:
> 
>> regression=# vacuum freeze analyze;
>> WARNING:  relcache reference leak: relation "p1" not closed
>> VACUUM
> 
> p1 is a partitioned table.  (BTW, could I lobby for people not to use such
> generic, collision-prone names for tables that will be left behind after
> the regression tests?)  Also, I find that "vacuum analyze" is sufficient,
> or even just "analyze", or "analyze p1".  I think it's highly likely this
> was introduced by 3c3bb99330aa9b4c2f6258bfa0265d806bf365c3.  Certainly
> that failed to add appropriate regression test cases, or we would have
> noticed this already.

That's right, sorry about that.  Attached patch fixes the relcache leak
and adds tests in vacuum.sql and truncate.sql.

Thanks,
Amit
>From b964b4dca34c637ec0acf64146f34caffcbb7aab Mon Sep 17 00:00:00 2001
From: amit 
Date: Tue, 7 Mar 2017 10:26:16 +0900
Subject: [PATCH] Fix relcache ref leak in acquire_inherited_sample_rows

Add tests for vacuum, analyze, truncate on partitioned table, as
3c3bb9933 should have.
---
 src/backend/commands/analyze.c |  4 +++-
 src/test/regress/expected/truncate.out |  6 ++
 src/test/regress/expected/vacuum.out   |  9 +
 src/test/regress/sql/truncate.sql  |  7 +++
 src/test/regress/sql/vacuum.sql| 10 ++
 5 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index a70c760341..354412b886 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -1361,10 +1361,12 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
 		{
 			/*
 			 * ignore, but release the lock on it.  could be a partitioned
-			 * table.
+			 * table.  don't try to unlock the passed-in relation.
 			 */
 			if (childrel != onerel)
 heap_close(childrel, AccessShareLock);
+			else
+heap_close(childrel, NoLock);
 			continue;
 		}
 
diff --git a/src/test/regress/expected/truncate.out b/src/test/regress/expected/truncate.out
index 5c5277e0f1..81612d8c88 100644
--- a/src/test/regress/expected/truncate.out
+++ b/src/test/regress/expected/truncate.out
@@ -420,3 +420,9 @@ SELECT nextval('truncate_a_id1'); -- fail, seq should have been dropped
 ERROR:  relation "truncate_a_id1" does not exist
 LINE 1: SELECT nextval('truncate_a_id1');
^
+-- partitioned table
+CREATE TABLE truncparted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE truncparted1 PARTITION OF truncparted FOR VALUES IN (1);
+INSERT INTO truncparted VALUES (1, 'a');
+TRUNCATE truncparted;
+DROP TABLE truncparted;
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 9b604be4b6..6f68663087 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -82,3 +82,12 @@ VACUUM FULL vactst;
 VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
+-- partitioned table
+CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
+INSERT INTO vacparted VALUES (1, 'a');
+UPDATE vacparted SET b = 'b';
+VACUUM (ANALYZE) vacparted;
+VACUUM (FULL) vacparted;
+VACUUM (FREEZE) vacparted;
+DROP TABLE vacparted;
diff --git a/src/test/regress/sql/truncate.sql b/src/test/regress/sql/truncate.sql
index a3d6f5368f..d61eea1a42 100644
--- a/src/test/regress/sql/truncate.sql
+++ b/src/test/regress/sql/truncate.sql
@@ -215,3 +215,10 @@ SELECT * FROM truncate_a;
 DROP TABLE truncate_a;
 
 SELECT nextval('truncate_a_id1'); -- fail, seq should have been dropped
+
+-- partitioned table
+CREATE TABLE truncparted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE truncparted1 PARTITION OF truncparted FOR VALUES IN (1);
+INSERT INTO truncparted VALUES (1, 'a');
+TRUNCATE truncparted;
+DROP TABLE truncparted;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 7b819f654d..7c5fb04917 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -64,3 +64,13 @@ VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 
 DROP TABLE vaccluster;
 DROP TABLE vactst;
+
+-- partitioned table
+CREATE TABLE vacparted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE vacparted1 PARTITION OF vacparted FOR VALUES IN (1);
+INSERT INTO vacparted VALUES (1, 'a');
+UPDATE vacparted SET b = 'b';
+VACUUM (ANALYZE) vacparted;
+VACUUM (FULL) vacparted;
+VACUUM (FREEZE) vacparted;
+DROP TABLE vacparted;
-- 
2.11.0



[HACKERS] Small fix to postgresql.conf.sample's comment on max_parallel_workers

2017-03-06 Thread David Rowley
While scanning over postgresql.conf I happened to notice something
that didn't ring quite true about max_parallel_workers. The comment
confuses worker_processes with parallel workers.

The attached aims to put this right.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


postgresql.conf.sample_fix.patch
Description: Binary data



Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-06 Thread Andres Freund
On 2017-03-06 19:45:27 -0500, Tom Lane wrote:
> Stephen Frost  writes:
> > * Andres Freund (and...@anarazel.de) wrote:
> >> I'm not quite sure what the best way to attack this is, but I think we
> >> need to do something.
>
> > I tend to agree with this, though I haven't got any great answers,
> > unfortunately.
>
> I don't want to reduce test coverage either.  I think the most painless
> way to improve matters would just be to work harder on running tests in
> parallel.  I think most devs these days do most of their work on 4- or
> 8-core machines, yet almost everything except the core regression tests
> is depressingly serial.  I think we could likely get a 2x or better
> reduction in total runtime without too much work just by attacking that.

A lot more probably, based on my preliminary tests / my local test
script.

I'm just not quite sure what the best way is to make it easier to run
tests in parallel within the tree.

The best I can come up with so far is a toplevel target that creates the temp
install, starts a cluster, and then runs the 'installcheck-or-check'
target on all the subdirectories via recursion. Individual makefiles can
either use the pre-existing cluster (most of contrib, for example), or
use the temporary install and run their pre-existing check target using
that (the tap tests, test_decoding, ...).

Requires editing a bunch of Makefiles to take advantage.  But I don't
really see anything that doesn't require that.

- Andres




Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-03-06 Thread Michael Paquier
On Tue, Mar 7, 2017 at 3:21 AM, Robert Haas  wrote:
> On Sat, Mar 4, 2017 at 10:32 AM, Tom Lane  wrote:
>> Without having actually looked at this patch, I would say that if it added
>> a direct call of fopen() to backend-side code, that was already the wrong
>> thing.  Almost always, AllocateFile() would be a better choice, not only
>> because it's tied into transaction abort, but also because it knows how to
>> release virtual FDs in event of ENFILE/EMFILE errors.  If there is some
>> convincing reason why you shouldn't use AllocateFile(), then a safe
>> cleanup pattern would be to have the fclose() in a PG_CATCH stanza.
>
> I think that my previous remarks on this issue were simply muddled
> thinking.  The SQL-callable function pg_current_logfile() does use
> AllocateFile(), so the ERROR which may occur afterward if the file is
> corrupted is no problem.  The syslogger, on the other hand, uses
> logfile_open() to open the file, but it's careful not to throw an
> ERROR while the file is open, just like other code which runs in the
> syslogger.  So now I think there's no bug here.

-   /*
-* No space found, file content is corrupted.  Return NULL to the
-* caller and inform him on the situation.
-*/
+   /* Uh oh.  No newline found, so file content is corrupted. */
This one just made me smile.
-- 
Michael




Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-06 Thread Tom Lane
Stephen Frost  writes:
> * Andres Freund (and...@anarazel.de) wrote:
>> I'm not quite sure what the best way to attack this is, but I think we
>> need to do something.

> I tend to agree with this, though I haven't got any great answers,
> unfortunately.

I don't want to reduce test coverage either.  I think the most painless
way to improve matters would just be to work harder on running tests in
parallel.  I think most devs these days do most of their work on 4- or
8-core machines, yet almost everything except the core regression tests
is depressingly serial.  I think we could likely get a 2x or better
reduction in total runtime without too much work just by attacking that.

regards, tom lane




Re: [HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Tom Lane
Giuseppe Broccolo  writes:
> I've seen that pg_dump execute the dump of an eventual comment of a
> TSDictionary without specifying the namespace where it is defined:
> https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_dump.c#L13542

Yup, this is clearly an error --- thanks for spotting it!  I've pushed
a fix for this and related mistakes.

> This is actually a problem if a new TSDictionary is created, in a different
> schema specified by the dumped search_path setting.

Just out of curiosity, do you have a concrete test case where it failed
that way?  AFAICS the emitted SQL would be like

SET search_path = schema1, pg_catalog;

CREATE TEXT SEARCH DICTIONARY somedict (...);

COMMENT ON TEXT SEARCH DICTIONARY somedict IS '...';

SET search_path = schema2, pg_catalog;

CREATE TEXT SEARCH DICTIONARY somedict (...);

COMMENT ON TEXT SEARCH DICTIONARY somedict IS '...';

so it should accidentally work anyway.  It's possible that a parallel
restore would get it wrong, or that a schema-selective restore would
omit comments it should include, but I couldn't reproduce a failure
in simple cases.

regards, tom lane




[HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management

2017-03-06 Thread Peter Geoghegan
Recap
=

A design goal of parallel tuplesort is that the parallel case be as
close as possible to the serial case. Very little new code
is needed in tuplesort.c and logtape.c, most of which is about using a
new lower-level facility which is very well encapsulated. And even
buffile.c "unification", the guts of everything, doesn't have many new
special cases.

So, workers start with "unifiable" BufFiles, which differ very little
from conventional temp BufFiles -- they just create filenames in a
deterministic fashion, making them discoverable to the leader process
through a process of unification (plus you have the refcount state in
shared memory, though very little). In principle,
workers could decide to not go through with unification long after the
fact, and close their unifiable temp files without doing anything for
the leader, and that would be just fine. By the same token, the
unified BufFile owned by the leader can be extended if needs be, in a
way that is virtually indistinguishable from just extending the end of
any other BufFile. logtape.c recycling for future randomAccess cases
works just the same as before.

I think that this is a nice state of affairs, since having no special
cases will tend to make the code robust. Applying leader-wise offsets
within logtape.c is done at one precise choke-point (one function),
and so it seems very likely that logtape.c routines that CREATE INDEX
happens to not use will "just work" when parallel tuplesort has more
clients at some point in the future. A logical tapeset doesn't even
remember specifically that it is using unification (unless you count
that it has a non-zero offset to apply).

V8 of the parallel tuplesort patch added a refcount mechanism that is
associated with a "unified" BufFile, consisting of $nworker unifiable
BufFiles, with metadata in shared memory per worker/unifiable BufFile.
We have a refcount per unifiable BufFile, not per fd.c segment. The
refcount thing lets worker processes go away as soon as the leader
begins its final on-the-fly merge -- the leader assumes ownership of
the BufFiles/segments initially owned only by workers. There is a very
brief period of co-ownership, which occurs during unification in the
leader.
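The refcount hand-off can be sketched in miniature (a hypothetical Python toy model, not the patch's actual C code; the class and method names are invented — the point is just that the last owner to detach unlinks, and clearing the recorded path makes any straggling cleanup a no-op rather than a double unlink()):

```python
import os
import tempfile
import threading

class RefCountedTempFile:
    """Toy model of a unifiable BufFile's shared refcount."""

    def __init__(self):
        fd, self.path = tempfile.mkstemp()
        os.close(fd)
        self.refcount = 1          # creating worker holds the first reference
        self.lock = threading.Lock()

    def attach(self):
        # Leader takes co-ownership during unification.
        with self.lock:
            self.refcount += 1

    def detach(self):
        # Only the last owner unlinks; clearing self.path guards a
        # later cleanup path against attempting a second unlink().
        with self.lock:
            self.refcount -= 1
            if self.refcount == 0 and self.path is not None:
                os.unlink(self.path)
                self.path = None

f = RefCountedTempFile()
f.attach()                       # leader attaches during unification
f.detach()                       # worker exits; the file survives
print(os.path.exists(f.path))    # True
f.detach()                       # leader detaches; unlinked exactly once
print(f.path)                    # None
```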

Concern
===

My pending V9 of the patch has to solve a problem that V8 had, namely
that it's not very nice that an error in the leader (e.g. palloc()
failure) during this brief period of co-ownership will make the
resource manager attempt a double unlink() (one from the worker,
another from the leader). There is no danger of data loss, but it's
ugly. The general consensus is that the way to handle it is with an
on_dsm_detach() callback on the top-level parallel context's shared
memory segment. I have already written some rough code that does all
this, and demonstrably removes the double unlink() hazard, but I have
some outstanding concerns, that I'd like to clear up.

We have at least 3 different "resource contexts" in play in this rough code:

1. The DSM segment.

2. fd.c segment clean-up (i.e., cleanup that falls on resourceOwner
stored within buffile.c for BufFile segments).

3. The memory context in which BufFile stuff is allocated.

I am bridging the gap between local memory and shared memory here, in
essence. The local state like virtual FDs is what we need to mark as
non-temp, so the second unlink() is never attempted on fd.c resource
cleanup, which comes last of all (per resource manager) on the one
hand. We do need to touch refcounts in shared memory too, but shared
memory has only some relevant state, which seems like something
to avoid changing, because we want to keep all of those nice
advantages I went into above. On the other hand, it seems wrong to
jump from shared memory to local memory in my on_dsm_detach()
callback, though that's what works for me right now. The callback is
only registered in the leader process, and only really needs to be
around for as long as there is a brief period of co-ownership of
BufFiles.

Say the caller never gets around to a BufFileClose() call. That's a
bug anyway, and will produce a temp file reference leak warning today.
But this change turns that bug into a hard crash, which is a more
serious bug. This happens because the on_dsm_detach() callback ends up
calling BufFileClose() against a pointer to garbage (freed local
memory). Had BufFileClose() been called before resource cleanup called
the callback, which should happen anyway, then it would have marked
the pointer to local memory (stored in *shared* memory) NULL. No hard
crash would happen from the callback, which returns early, never
spuriously calling BufFileClose() a second time within leader.

ISTM that even today, buffile.c caller's memory context tacitly shares
the lifetime of caller's resourceOwner. Otherwise, everything would
break. It doesn't seem that much of a stretch to assume that the DSM
seg callback implicitly has the opportunity to "go first", since ISTM
that things 

Re: [HACKERS] Statement-level rollback

2017-03-06 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
> >> Can you provide some references on how other systems provide this feature?
> >
> > Oracle doesn't.
> 
> Really?

Sorry, my sentence was misleading.
I meant by "Oracle/MySQL doesn't" that they do not provide a configuration 
parameter or START TRANSACTION mode to choose between statement rollback and 
transaction rollback.  They just roll back the failed statement.  I wish 
Postgres could behave the same way.

Regards
Takayuki Tsunakawa





Re: [HACKERS] Performance degradation in TPC-H Q18

2017-03-06 Thread Andres Freund
On 2017-03-06 18:59:02 -0500, Robert Haas wrote:
> On Mon, Mar 6, 2017 at 5:19 PM, Andres Freund  wrote:
> > The whole performance issue that triggered this thread only exists when
> > the hashtable sizes are mis-estimated, right?  Turns out that after
> > applying the just-committed changes, "fixing" the bad mixing and/or
> > doing iteration that's not entirely in hash order slightly degrades
> > performance.  The difference is that without either of those additional
> > changes, we resize to the "right" size very early, when the hashtable is
> > barely filled (i.e. only few entries need to be moved), because the
> > imbalance is observed at the start.  With the changes, however, the
> > resizing happens when the table is pretty full (i.e. a lot of entries
> > need to be moved).  So the early imbalance ends up actually not hurting
> > performance...
> 
> Hmm.  I don't know what to do about that.

Oh, I don't think we need to do much about it either way - I think it's
more an interesting curiosity than anything.  Seems to reduce the
urgency for better iteration a bit, that's it.

- Andres




Re: [HACKERS] wait events for disk I/O

2017-03-06 Thread Michael Paquier
On Tue, Mar 7, 2017 at 8:57 AM, Robert Haas  wrote:
> On Mon, Mar 6, 2017 at 3:27 AM, Rushabh Lathia  
> wrote:
>> Yes, I thought of adding a wait event only for the sync, but then chose
>> to record the wait event for both write and sync. I understand that
>> OS-level writes are cheap, but they still have some cost attached. I
>> also thought that the monitoring tools being developed using these wait
>> events would capture more useful data if we try to collect as much info
>> as we can. Or maybe not.
>>
>> I am open for other opinion/suggestions.
>
> Writes are NOT always fast.  I've seen cases of write() blocking for
> LONG periods of time on systems that are in the process of failing, or
> just busy.  So I think we certainly want to advertise a wait event for
> those.

+1. I see that happen quite often, particularly on overloaded VMs.
Having to debug such systems until you notice that what is slow is just
the underlying system is no fun. So being able to surface an event state
in memory showing that things are stuck on I/O, using only the PG
interface, can really help diagnostics without having to look at
storage-level logs for a first impression.
-- 
Michael




Re: [HACKERS] Performance degradation in TPC-H Q18

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 5:19 PM, Andres Freund  wrote:
> The whole performance issue that triggered this thread only exists when
> the hashtable sizes are mis-estimated, right?  Turns out that after
> applying the just-committed changes, "fixing" the bad mixing and/or
> doing iteration that's not entirely in hash order slightly degrades
> performance.  The difference is that without either of those additional
> changes, we resize to the "right" size very early, when the hashtable is
> barely filled (i.e. only few entries need to be moved), because the
> imbalance is observed at the start.  With the changes, however, the
> resizing happens when the table is pretty full (i.e. a lot of entries
> need to be moved).  So the early imbalance ends up actually not hurting
> performance...

Hmm.  I don't know what to do about that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] wait events for disk I/O

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 3:27 AM, Rushabh Lathia  wrote:
> Yes, I thought of adding a wait event only for the sync, but then chose
> to record the wait event for both write and sync. I understand that
> OS-level writes are cheap, but they still have some cost attached. I
> also thought that the monitoring tools being developed using these wait
> events would capture more useful data if we try to collect as much info
> as we can. Or maybe not.
>
> I am open for other opinion/suggestions.

Writes are NOT always fast.  I've seen cases of write() blocking for
LONG periods of time on systems that are in the process of failing, or
just busy.  So I think we certainly want to advertise a wait event for
those.

Likewise, I agree that the prefetch call probably SHOULDN'T block, but
just because it shouldn't doesn't mean it never will.

I think somebody should try a pgbench run with this patch applied,
using a scale factor greater than shared_buffers, and generate a wait
event profile, just to see if these are showing up and how often.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-06 Thread Andres Freund
On 2017-02-13 12:05:21 -0500, Robert Haas wrote:
> On Fri, Feb 10, 2017 at 8:39 PM, Peter Geoghegan  wrote:
> > On Thu, Feb 9, 2017 at 2:32 PM, Andres Freund  wrote:
> >> Except that the proposed names aren't remotely like that... ;).
> >
> > Revision attached -- V5. We now REVOKE ALL on both functions, as
> > Robert suggested, instead of the previous approach of having a
> > hard-coded superuser check with enforcement.
> >
> >> And I proposed documenting named parameters, and
> >> check_btree(performing_check_requiring_exclusive_locks => true) is just
> >> about as expressive.
> >
> > I have not done this, nor have I renamed the functions. I still think
> > that this is something that we can fix by adding a boolean argument to
> > each function in the future, or something along those lines. I
> > *really* hate the idea of having one function with non-obvious,
> > variable requirements on locking, with locking implications that are
> > not knowable when we PREPARE an SQL statement calling the function. It
> > also removes a useful way of have superusers discriminate against the
> > stronger locking variant bt_index_parent_check() by not granting
> > execute on it (as an anti-footgun measure).
>
> I think Andres is more or less correct that
> "performing_check_requiring_exclusive_locks => true" is just about as
> expressive as calling a different function, but I think that your
> point that the superuser might want to grant access to one function
> but not the other is a good one.  On the other hand, I think Andres
> has a concern that we might have more modes in the future and we don't
> want to end up with 2^n entrypoints.  That also seems valid.  Hmm.

That's part of my concern.  The second part is that I really want to be
able to have a check_relation() (and check_database()) function to which
I can pass a bunch of arguments determining how expensive the checks to
be performed are going to be.

E.g. I'd like to be able to do something like

SELECT *
FROM check_relation(
   'my_table'::regclass,
   test_btree => 'true',
   test_btree_heap_interlock => 'true',
   test_gin => 'true');

SELECT *
FROM check_current_database(
   test_heap_update_chains => 'true',
   test_heap_clog_interlock => 'true',
   test_btree => 'true',
   test_gin => 'false');

etc.

You can't really trivially replace these with a larger query and/or
function, because of the locking considerations (consider what happens
if somebody concurrently drops a table/index - your whole query errors
out, wasting hours of work).

I'm ok with not immediately doing so, but I think Peter's design isn't
in line with achieving something like this.

Regards,

Andres




Re: [HACKERS] contrib modules and relkind check

2017-03-06 Thread Michael Paquier
On Tue, Mar 7, 2017 at 3:10 AM, Corey Huinker  wrote:
> Michael, do you want to add yourself as a reviewer?

Yes, thanks for the reminder.
-- 
Michael




Re: [HACKERS] Proposal : Parallel Merge Join

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 8:15 AM, Dilip Kumar  wrote:
> On Fri, Mar 3, 2017 at 3:57 PM, Robert Haas  wrote:
>> I'm not happy with the way this patch can just happen to latch on to a
>> path that's not parallel-safe rather than one that is and then just
>> give up on a merge join in that case.  I already made this argument in
>> https://www.postgresql.org/message-id/ca+tgmobdw2au1jq5l4ysa2zhqfma-qnvd7zfazbjwm3c0ys...@mail.gmail.com
>> and my opinion hasn't changed.  I've subsequently come to realize that
>> this problem is more widespread than I initially realized: not only do
>> sort_inner_and_outer() and generate_partial_mergejoin_paths() give up
>> without searching for the cheapest parallel-safe path, but the latter
>> later calls get_cheapest_path_for_pathkeys() and then just pretends it
>> didn't find anything unless the result is parallel-safe.  This makes
>> no sense to me: the cheapest parallel-safe path could be 1% more
>> expensive than the cheapest path of any kind, but because it's not the
>> cheapest we arbitrarily skip it, even though parallelizing more of the
>> tree might leave us way ahead overall.
>
> For sort_inner_and_outer I have followed the same logic that
> hash_inner_and_outer uses.  I also moved the logic of selecting the
> cheapest_safe_inner out of the pathkey loop.

+/* Generate partial path if inner is parallel safe. */
+if (inner_cheapest_total->parallel_safe)
+try_partial_mergejoin_path(root,
+   joinrel,
+   outerpath,
+   inner_cheapest_total,
+   merge_pathkeys,
+   mergeclauses,
+   NIL,
+   innersortkeys,
+   jointype,
+   extra);
+
+/* Can't do anything else if inner path needs to be unique'd */
+if (save_jointype == JOIN_UNIQUE_INNER)
+return;

Right after this, you should try_partial_mergejoin_path() with the
result of get_cheapest_parallel_safe_total_inner(), just like
sort_inner_and_outer() already does.

I think with some work you could merge generate_mergejoin_paths with
generate_partial_mergejoin_paths instead of duplicating all the code.
They're not really that different, and you could reduce the
differences further if you made try_mergejoin_path() take an
additional is_partial argument and pass the call to
try_partial_mergejoin_path() if it's true.  The various bits of
special handling related to join types that aren't supported in
parallel queries aren't going to help you in parallel mode, but they
won't hurt either.  This bit:

+/* Generate partial path if inner is parallel safe. */
+if (inner_cheapest_total->parallel_safe)
+try_partial_mergejoin_path(root,

...is really not right.  That value is being passed down from
consider_parallel_mergejoin(), which is getting it from
match_unsorted_outer(), which really shouldn't be calling this code in
the first place with a path that's not parallel-safe.  What it ought
to do is (a) pass inner_cheapest_total if that's non-NULL and
parallel-safe, or else (b) call
get_cheapest_parallel_safe_total_inner() and pass that value, (c)
unless that's NULL also in which case it should skip the call
altogether.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2017-03-06 Thread Andres Freund
Hi,

On 2017-02-13 11:52:53 -0500, Robert Haas wrote:
> On Thu, Feb 9, 2017 at 5:32 PM, Andres Freund  wrote:
> >> > Again, some parts of the code doing something bad isn't a good argument
> >> > for doing again. Releasing locks early is a bad pattern, because backend
> >> > code isn't generally safe against it, we have special code-paths for
> >> > catalog tables to support it for those.
> >>
> >> I don't agree with that.  The reason we keep relation locks until the
> >> end of the transaction is to avoid surprising application code, not
> >> because the system can't tolerate it.  But here it's more likely that
> >> retaining the lock will surprise the user.
> >
> > It's both.  A bunch of code paths rely on early release ony happening on
> > catalogs. E.g., and that's just the first thing that comes to mind,
> > cache invalidations, which really are only guaranteed to be delivered
> > and visible for other cache accesses if the lock is only released after
> > the end of the transaction.  Object accesses like relation_open() accept
> > invalidations *after* acquiring the lock. Combined with lock releases
> > only happening *after* transaction commits that guarantees an up-to-date
> > view of the cache; but if others release the lock earlier and have cache
> > invalidations, that doesn't work anymore.  Now you can argue that it's
> > possibly harmless here because there's presumably no way this can ever
> > trigger cache invals - and you're probably right - but this isn't the
> > only place with such assumption and you'd definitely have to argue *why*
> > it's right.
> 
> I agree that code which issues cache invalidations needs to hold a
> blocking lock until commit IF it's critical for other backends to see
> the update.

Right. And I think that needs to be the default assumption, because
otherwise you need to be careful about all the relevant code paths.  I'm
fine with adding something along the lines of
/*
 * Release lock on relation early, this function doesn't generate any
 * cache invalidations, and it's convenient for callers to be able to
 * check multiple functions in one go.
 */
(after checking that's true of course) and then skipping the lock
release.  What I'm absolutely not ok with is doing so without
consideration and documentation.


> I don't really understand your demand for "an argument why it's
> right".  My argument is simple: I can't think of a single thing that
> it would break, and I don't believe there is any such thing.

That's pretty much an argument.  I just want documentation that calls
attention that we're doing something that requires a modicum of care.  I
fail to see why that's unreasonable.  I don't think it's that unlikely
that somebody will come around adding something like "hey, we can update
the free page statistic here, because we just read the whole damn
thing", suddenly invalidating the logic.  Or we add a heap check that
then might trigger pruning, which quite conceivably could trigger an
invalidation. Or ...


> I think you're trying to invent a coding rule that doesn't exist, but
> I can't prove a negative.

We had bugs around this in the past, so ...


> Your argument so far is that "yeah, well, maybe inval doesn't emit
> sinval traffic (duh?) but it might do something else that breaks,
> prove to me that there is no such thing".

I have no idea what you're trying to say with this.  The rest of that
paragraph just seems like bunch of odd hyperbole ("impossible to exceed
the speed of light...").


> From my point of view, it's likely to be common to want to run amcheck
> in series on a bunch of indexes, like by writing a SELECT query
> against pg_class or pg_index and passing the OIDs to amcheck.  With
> the retain-locks design, that query accumulates locks on a large
> number of objects, possibly blowing out the lock table or blocking
> DDL.

On the other hand, retaining the locks actually increases the chance of
running a check to completion.  If you e.g. write a query checking the
indexes for a table, you suddenly can't be sure that subsequent indexes
for the same table can be checked, because the relation lock has been
released in between.  That would then lead to erroring out after doing
part of the expensive work.

I don't think that refutes your argument, but I also don't think it's
as clear cut as you make it out to be.


Greetings,

Andres Freund




Re: [HACKERS] Automatic cleanup of oldest WAL segments with pg_receivexlog

2017-03-06 Thread Michael Paquier
On Tue, Mar 7, 2017 at 7:16 AM, Robert Haas  wrote:
> On Mon, Mar 6, 2017 at 3:26 PM, Peter Eisentraut
>  wrote:
>> On 3/4/17 02:09, Michael Paquier wrote:
>>> Well, that's one reason why I was thinking that having an independent
>>> in-core option to clean up the tail of the oldest segments is
>>> interesting: users don't need to maintain their own infra logic to do
>>> anything. Now this end-segment command can as well be used with a
>>> small binary doing this cleanup, but the monitoring of the thing gets
>>> harder as multiple processes get spawned.
>>
>> I think the initial idea of having an option that does something
>> specific is better than an invitation to run a general shell command.  I
>> have some doubts that the proposal to clean up old segments based on
>> file system space is workable.  For example, that assumes that you are
>> the only one operating on the file system.  If something else fills up
>> the file system, this system could then be induced to clean up
>> everything immediately, without any reference to what you still need.
>> Also, the various man pages about statvfs() that I found are pretty
>> pessimistic about how portable it is.

You can count Windows in that.

> What if we told pg_receivewal (or pg_receivexlog, whatever that is) a
> maximum number of segments to retain before removing old ones?  Like
> pg_receivewal --limit-retained-segments=50GB, or something like that.

That's of course doable as well by counting the entries available. Now,
one reason I did not do that is that in my case the archiver is started
as a service using a fixed script, and the size to retain can be
flexible as users can decide the VM size at deployment :)
Having this option would be better than nothing, just that it is not
that flexible if you think about it.
-- 
Michael




Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Andres Freund
On 2017-03-02 22:51:09 +0100, Tomas Vondra wrote:
> Attaches is the last part of the patch series, rebased to current master and
> adopting the new chunk header approach.

Something seems to have gone awry while sending that - the attachment
is a whopping 0 bytes...




Re: [HACKERS] rename pg_log directory?

2017-03-06 Thread Andreas Karlsson

On 02/27/2017 03:05 PM, Peter Eisentraut wrote:
> How about changing the default for log_directory from 'pg_log' to, say,
> 'log'?

I have attached a patch which does this. I do not care much about which
name is picked as long as we get rid of the "pg_" prefix.

Btw, is there a reason why global and base do not have the "pg_" prefix?

Andreas
commit 0b71fcdb328f05349775675e0491ba1b82127d4e
Author: Andreas Karlsson 
Date:   Mon Mar 6 23:52:49 2017 +0100

Rename default log directory from pg_log to log

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cd82c04b05..4ee1f605bc 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4324,8 +4324,8 @@ SELECT * FROM parent WHERE key = 2400;
 find the logs currently in use by the instance. Here is an example of
 this file's content:
 
-stderr pg_log/postgresql.log
-csvlog pg_log/postgresql.csv
+stderr log/postgresql.log
+csvlog log/postgresql.csv
 
 
 current_logfiles is recreated when a new log file
@@ -4427,7 +4427,7 @@ local0.*/var/log/postgresql
 cluster data directory.
 This parameter can only be set in the postgresql.conf
 file or on the server command line.
-The default is pg_log.
+The default is log.

   
  
diff --git a/doc/src/sgml/file-fdw.sgml b/doc/src/sgml/file-fdw.sgml
index 309a303e03..359f222352 100644
--- a/doc/src/sgml/file-fdw.sgml
+++ b/doc/src/sgml/file-fdw.sgml
@@ -262,7 +262,7 @@ CREATE FOREIGN TABLE pglog (
   location text,
   application_name text
 ) SERVER pglog
-OPTIONS ( filename '/home/josh/9.1/data/pg_log/pglog.csv', format 'csv' );
+OPTIONS ( filename '/home/josh/9.1/data/log/pglog.csv', format 'csv' );
 
   
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 0707f66631..69b9cdacff 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3298,7 +3298,7 @@ static struct config_string ConfigureNamesString[] =
 			GUC_SUPERUSER_ONLY
 		},
		&Log_directory,
-		"pg_log",
+		"log",
 		check_canonical_path, NULL, NULL
 	},
 	{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 157d775853..e6dbc31591 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -343,7 +343,7 @@
 	# (change requires restart)
 
 # These are only used if logging_collector is on:
-#log_directory = 'pg_log'		# directory where log files are written,
+#log_directory = 'log'			# directory where log files are written,
 	# can be absolute or relative to PGDATA
 #log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'	# log file name pattern,
 	# can include strftime() escapes
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index e5cb348f4c..1c87e39e0d 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -560,7 +560,7 @@ sub _backup_fs
 		$backup_path,
 		filterfn => sub {
 			my $src = shift;
-			return ($src ne 'pg_log' and $src ne 'postmaster.pid');
+			return ($src ne 'log' and $src ne 'postmaster.pid');
 		});
 
 	if ($hot)
diff --git a/src/test/perl/RecursiveCopy.pm b/src/test/perl/RecursiveCopy.pm
index 3e98813286..457488ba5b 100644
--- a/src/test/perl/RecursiveCopy.pm
+++ b/src/test/perl/RecursiveCopy.pm
@@ -48,9 +48,9 @@ attempted.
 
  RecursiveCopy::copypath('/some/path', '/empty/dir',
 filterfn => sub {
-		# omit pg_log and contents
+		# omit log and contents
 		my $src = shift;
-		return $src ne 'pg_log';
+		return $src ne 'log';
 	}
  );
 



Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Andres Freund
Hi,

On 2017-03-06 23:32:30 +0100, Tomas Vondra wrote:
> > > The question however is whether this won't make the optimization
> > > pointless.  I also wonder how much we save by this optimization and
> > > how widely it's used? Can someone point me to some numbers?
> > 
> > I don't recall any recent numbers.  I'm more than a bit doubtful that it
> > really matters - it's only used for the results of aggregate/window
> > functions, and surely they've a good chunk of their own overhead...
> > 
> 
> And if the benefit is negligible, trying to keep the optimization might
> easily result in slowdown (compared to non-optimized code).

I doubt the branch is noticeable here, given that we're doing a memory
allocation otherwise.  Should also be decently predictable.

> But I'm puzzled why we haven't seen any reports of failures? I mean,
> doing sum(int4) is not a particularly extravagant thing, if there really
> is an issue, shouldn't we see a lot of reports? What are we missing?

Reports about what? False positives causing crashes / wrong results?  I
think it's quite unlikely to actually trigger this in practice, because
you need a properly aligned pointer, and then the preceding header has
to point to a bit pattern that's equal to the context - that's
presumably quite unlikely in practice.

Regards,

Andres




Re: [HACKERS] Logical replication existing data copy

2017-03-06 Thread Erik Rijkers

On 2017-03-06 16:10, Erik Rijkers wrote:
> On 2017-03-06 11:27, Petr Jelinek wrote:
>> Hi,
>>
>> updated and rebased version of the patch attached.
>
> I compiled with /only/ this one latest patch:
>    0001-Logical-replication-support-for-initial-data-copy-v6.patch
>
> Is that correct, or are other patches still needed on top, or
> underneath?

TWIMC, I'll answer my own question: the correct patchset seems to be
these six:


0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch
0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch
0003-Prevent-snapshot-builder-xmin-from-going-backwards.patch
0004-Fix-xl_running_xacts-usage-in-snapshot-builder.patch
0005-Skip-unnecessary-snapshot-builds.patch
0001-Logical-replication-support-for-initial-data-copy-v6.patch

These compile, make check, and install fine.  make check-world is also 
without errors.


Logical replication tests are now running again (no errors yet); they'll 
have to run for a few hours with varying parameters to gain some 
confidence but it's looking good for the moment.



Erik Rijkers









Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Tomas Vondra

On 03/06/2017 08:08 PM, Andres Freund wrote:
> Hi,
>
> On 2017-03-06 19:49:56 +0100, Tomas Vondra wrote:
>> On 03/06/2017 07:05 PM, Robert Haas wrote:
>>> On Mon, Mar 6, 2017 at 12:44 PM, Andres Freund wrote:
>>>> On 2017-03-06 12:40:18 -0500, Robert Haas wrote:
>>>>> On Wed, Mar 1, 2017 at 5:55 PM, Andres Freund wrote:
>>>>>> The issue was that on 32bit platforms the Datum returned by some
>>>>>> functions (int2int4_sum in this case) isn't actually a separately
>>>>>> allocated Datum, but rather just something embedded in a larger
>>>>>> struct.  That, combined with the following code:
>>>>>>
>>>>>> if (!peraggstate->resulttypeByVal && !*isnull &&
>>>>>>     !MemoryContextContains(CurrentMemoryContext,
>>>>>>                            DatumGetPointer(*result)))
>>>>>>
>>>>>> seems somewhat problematic to me.  MemoryContextContains() can give
>>>>>> false positives when used on memory that's not a distinctly allocated
>>>>>> chunk, and if so, we violate memory lifetime rules.  It's quite
>>>>>> unlikely, given the required bit patterns, but nonetheless it's making
>>>>>> me somewhat uncomfortable.
>>>>>>
>>>>>> Do others think this isn't an issue and we can just live with it?
>>>>>
>>>>> I think it's 100% broken to call MemoryContextContains() on something
>>>>> that's not guaranteed to be a palloc'd chunk.
>>>>
>>>> I agree, but to me it seems the only fix would be to just yank out the
>>>> whole optimization?
>>>
>>> Dunno, haven't looked into it.
>>
>> I think it might be fixable by adding a flag into the chunk, with 'true'
>> for regular allocations, and 'false' for the optimized ones. And then
>> only use MemoryContextContains() for 'flag=true' chunks.
>
> I'm not quite following here.  We only get a Datum and the knowledge
> that it's a pass-by-ref argument, so we really don't know that much.  We
> could create an "EmbeddedDatum" type that has a preceding chunk header
> (appropriate for the version), that just gets zeroed out at start.  Is
> that what you mean?

Yes, that's roughly what I meant.

>> The question however is whether this won't make the optimization
>> pointless.  I also wonder how much we save by this optimization and how
>> widely it's used? Can someone point me to some numbers?
>
> I don't recall any recent numbers.  I'm more than a bit doubtful that it
> really matters - it's only used for the results of aggregate/window
> functions, and surely they've a good chunk of their own overhead...

And if the benefit is negligible, trying to keep the optimization might
easily result in slowdown (compared to non-optimized code).

But I'm puzzled why we haven't seen any reports of failures? I mean,
doing sum(int4) is not a particularly extravagant thing, if there really
is an issue, shouldn't we see a lot of reports? What are we missing?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-06 Thread Stephen Frost
Andres,

* Andres Freund (and...@anarazel.de) wrote:
> There's also some tests simply taking way too long, e.g. the pg_dump
> tests just do largely redundant tests for 30s.

About half of the time in the pg_dump case, at least, appears to be in
010_dump_connstr.pl.  I've not looked into it, but based on the test
names it does look like some of those tests might be redundant to what
is already being covered in 002_pg_dump.pl.  Of course, the way the TAP
tests run, they require an initdb and startup of the postmaster, and
that's a somewhat fixed amount of time that any TAP test is going to
take.

I'm all for improving things certainly, though, well, while the pg_dump
tests do a lot, they also try to cover a good bit of the code in
pg_dump.c which is quite large.  I wouldn't want to reduce our code
coverage just to make the regression tests go faster.  The changes to
the pg_dump tests that I'm hoping to push soon will at least reduce the
output some, if not the length of time taken, while providing more code
coverage.

All that said, 30s out of 20m (which seems to more-or-less match what I
get locally too) makes me really wonder if that's the place that we need
to focus.  Is it really the longest running test we have and it's just
that we have tons of them that are doing initdb over and over again?

> I'm not quite sure what the best way to attack this is, but I think we
> need to do something.

I tend to agree with this, though I haven't got any great answers,
unfortunately.

Thanks!

Stephen




Re: [HACKERS] WARNING: relcache reference leak: relation "p1" not closed

2017-03-06 Thread Tom Lane
Kevin Grittner  writes:
> With e434ad39ae7316bcf35fd578dd34ad7e1ff3c25f I did a `make world`,
> `make install-world`, a fresh default initdb, a start with default
> config, `make installcheck`, connected to the regression database
> with psql as the initial superuser, and ran:

> regression=# vacuum freeze analyze;
> WARNING:  relcache reference leak: relation "p1" not closed
> VACUUM

p1 is a partitioned table.  (BTW, could I lobby for people not to use such
generic, collision-prone names for tables that will be left behind after
the regression tests?)  Also, I find that "vacuum analyze" is sufficient,
or even just "analyze", or "analyze p1".  I think it's highly likely this
was introduced by 3c3bb99330aa9b4c2f6258bfa0265d806bf365c3.  Certainly
that failed to add appropriate regression test cases, or we would have
noticed this already.

regards, tom lane




Re: [HACKERS] Performance degradation in TPC-H Q18

2017-03-06 Thread Andres Freund
On 2017-03-06 16:58:23 -0500, Robert Haas wrote:
> On Mon, Mar 6, 2017 at 3:32 PM, Andres Freund  wrote:
> >> I think DEBUG1 is far too high for something that could occur with
> >> some frequency on a busy system; I'm fairly strongly of the opinion
> >> that you ought to downgrade that by a couple of levels, say to DEBUG3
> >> or so.
> >
> > I actually planned to remove it entirely, before committing. It was more
> > left in for testers to be able to see when the code triggers.
> 
> Oh, OK.  That'd be fine too.

And pushed that way.


> > FWIW, I played with some better mixing, and it helps a bit with
> > accurately sized hashtables and multiple columns.
> >
> > What's however more interesting is that a better mixed IV and/or better
> > iteration now *slightly* *hurts* performance with grossly misestimated
> > sizes, because resizing has to copy more rows... Not what I predicted.
> 
> I don't quite follow this.

The whole performance issue that triggered this thread only exists when the
hashtable sizes are mis-estimated, right?  Turns out that after applying
the just-committed changes, "fixing" the bad mixing and/or doing
iteration that's not entirely in hash order slightly degrades
performance.  The difference is that without either of those additional
changes, we resize to the "right" size very early, when the hashtable is
barely filled (i.e. only few entries need to be moved), because the
imbalance is observed at the start.  With the changes, however, the resizing
happens when the table is pretty full (i.e. a lot of entries need to be
moved).  So the early imbalance ends up actually not hurting
performance...

- Andres




Re: [HACKERS] Automatic cleanup of oldest WAL segments with pg_receivexlog

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 3:26 PM, Peter Eisentraut
 wrote:
> On 3/4/17 02:09, Michael Paquier wrote:
>> Well, that's one reason why I was thinking that having an independent
>> in-core option to clean up the tail of the oldest segments is
>> interesting: users don't need to maintain their own infra logic to do
>> anything. Now this end-segment command can as well be used with a
>> small binary doing this cleanup, but the monitoring of the thing gets
>> harder as multiple processes get spawned.
>
> I think the initial idea of having an option that does something
> specific is better than an invitation to run a general shell command.  I
> have some doubts that the proposal to clean up old segments based on
> file system space is workable.  For example, that assumes that you are
> the only one operating on the file system.  If something else fills up
> the file system, this system could then be induced to clean up
> everything immediately, without any reference to what you still need.
> Also, the various man pages about statvfs() that I found are pretty
> pessimistic about how portable it is.

What if we told pg_receivewal (or pg_receivexlog, whatever that is) a
maximum number of segments to retain before removing old ones?  Like
pg_receivewal --limit-retained-segments=50GB, or something like that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Need a builtin way to run all tests faster manner

2017-03-06 Thread Andres Freund
Hi,

Right now check-world, a sensible thing to run before commits that
aren't very narrow, takes a long time.  To the point it seems to impact
development velocity enough that it's more sensible to just skip it, and
rely on the buildfarm.

That's not a great solution.

A large amount of the time is actually spent doing completely redundant
initdb, cluster start/stop work, and running tests serially that could
be run in parallel.

A single check-world on my machine takes over 20min.  That's just not
realistic to run without hurting development pace.

We can avoid a lot of redundant work (skip redundant initdb & cluster
start/stop), and we can quite easily parallelize others.

The problem is that doing so isn't entirely trivial to script;
e.g. make installcheck-world doesn't run all tests, and it
doesn't do so in parallel.  Scripting it locally also has the issue that
it's very easy not to notice new stuff being added by others.

As an example of the speedups, here's the comparison for contrib:
make -C contrib                                                 2m21.056s
make -C contrib installcheck                                    0m30.672s
make -C contrib -j16 -s -Otarget installcheck USE_MODULE_DB=1   0m10.418s

that's not an entirely fair comparison however, because test_decoding
doesn't do installcheck, but the general principle holds.

This is obviously a large difference.

A lot of the slow TAP tests could be run in parallel, recovering a lot
more time.

There are also some tests that simply take way too long, e.g. the pg_dump
tests just do largely redundant tests for 30s.

I'm not quite sure what the best way to attack this is, but I think we
need to do something.

Greetings,

Andres Freund




Re: [HACKERS] Re: check failure with -DRELCACHE_FORCE_RELEASE -DCLOBBER_FREED_MEMORY

2017-03-06 Thread Tom Lane
I wrote:
> I fixed that, and the basic regression tests no longer crash outright with
> these settings, but I do see half a dozen errors that all seem to be in
> RLS-related tests.

Those turned out to all be the same bug in DoCopy.  "make check-world"
passes for me now with -DRELCACHE_FORCE_RELEASE, but I've only tried
HEAD not the back branches.

regards, tom lane




Re: [HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> Do you want to deal with this whole mess, or shall I have a go at it?
> 
> > I'm just about to push the pg_upgrade fixes for large object comments
> > and security labels, followed not too far behind by the fix for the
> > public ACL in pg_dump --clean mode.  I have no objection to you handling
> > these and I can go look at some of the other items on my plate.
> 
> OK, I'll wait till you've pushed those, just to avoid any risk of merge
> problems.

Alright, I've pushed the large object fix for pg_upgrade, hopefully the
buildfarm will stay clean.  I'm going to do a bit more testing on the
public ACL/pg_dump --clean patch, but it's also quite a localized patch,
so I think you're good to go now.

Thanks!

Stephen




Re: [HACKERS] Performance degradation in TPC-H Q18

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 3:32 PM, Andres Freund  wrote:
>> I think DEBUG1 is far too high for something that could occur with
>> some frequency on a busy system; I'm fairly strongly of the opinion
>> that you ought to downgrade that by a couple of levels, say to DEBUG3
>> or so.
>
> I actually planned to remove it entirely, before committing. It was more
> left in for testers to be able to see when the code triggers.

Oh, OK.  That'd be fine too.

> FWIW, I played with some better mixing, and it helps a bit with
> accurately sized hashtables and multiple columns.
>
> What's however more interesting is that a better mixed IV and/or better
> iteration now *slightly* *hurts* performance with grossly misestimated
> sizes, because resizing has to copy more rows... Not what I predicted.

I don't quite follow this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Do you want to deal with this whole mess, or shall I have a go at it?

> I'm just about to push the pg_upgrade fixes for large object comments
> and security labels, followed not too far behind by the fix for the
> public ACL in pg_dump --clean mode.  I have no objection to you handling
> these and I can go look at some of the other items on my plate.

OK, I'll wait till you've pushed those, just to avoid any risk of merge
problems.

regards, tom lane




Re: [HACKERS] Parallel seq. plan is not coming against inheritance or partition table

2017-03-06 Thread Robert Haas
On Sun, Mar 5, 2017 at 9:41 PM, Amit Kapila  wrote:
>> RCA:
>> 
>> From "Replace min_parallel_relation_size with two new GUCs" commit
>> onwards, we are not assigning parallel workers for the child rel with
>> zero heap pages. This means we won't be able to create a partial
>> append path as this requires all the child rels within an Append Node
>> to have a partial path.
>
> Right, but OTOH, if we assign parallel workers by default, then it is
> quite possible that it would result in much worse plans.  Consider a
> case where partition hierarchy has 1000 partitions and only one of
> them is big enough to allow parallel workers.  Now in this case, with
> your proposed fix it will try to scan all the partitions in parallel
> workers which I think can easily result in bad performance.  I think
> the right way to make such plans parallel is by using Parallel Append
> node (https://commitfest.postgresql.org/13/987/).  Alternatively, if
> you want to force parallelism in cases like the one you have shown in
> example, you can use Alter Table .. Set (parallel_workers = 1).

Well, I think this is a bug in
51ee6f3160d2e1515ed6197594bda67eb99dc2cc.  The commit message doesn't
say anything about changing any behavior, just about renaming GUCs,
and I don't remember any discussion about changing the behavior
either, and the comment still implies that we have the old behavior
when we really don't, and parallel append isn't committed yet.

I think the problem here is that compute_parallel_worker() thinks it
can use 0 as a sentinel value that means "ignore this", but it can't
really, because a heap can actually contain 0 pages.  Here's a patch
which does the following:

- changes the sentinel value to be -1
- adjusts the test so that a value of -1 is ignored when determining
whether the relation is too small for parallelism
- updates a comment that is out-of-date now that this is no longer
part of the seqscan logic specifically
- moves some variables to narrower scopes
- changes the function parameter types to be doubles, because
otherwise the call in cost_index() has an overflow hazard

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


compute-parallel-worker-fix.patch
Description: Binary data



Re: [HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

2017-03-06 Thread Tomas Vondra

On 03/06/2017 10:13 PM, Peter Eisentraut wrote:

On 3/3/17 09:03, Tomas Vondra wrote:

Attached is v2, fixing both issues.


I wonder if

+   bytea  *raw_page = PG_GETARG_BYTEA_P(0);

+   uargs->page = VARDATA(raw_page);

is expected to work reliably, without copying the argument to a
different memory context.

I think it would be better not to maintain so much duplicate code
between bt_page_items(text, int) and bt_page_items(bytea).  How about
just redefining bt_page_items(text, int) as an SQL-language function
calling bt_page_items(get_raw_page($1, $2))?



Maybe. We can also probably share the code at the C level, so I'll look 
into that.



For page_checksum(), the checksums are internally unsigned, so maybe
return type int on the SQL level, so that there is no confusion about
the sign?



That ship already sailed, I'm afraid. We already have checksum in the 
page_header() output, and it's defined as smallint there. So using int 
here would be inconsistent.


A similar issue is that get_raw_page() uses int for block number, while 
internally it's uint32. That often requires a fair amount of gymnastics 
on large tables.


I'm not against changing either of those bits - using int for checksums 
and bigint for block numbers, but I think it should be a separate patch.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] WARNING: relcache reference leak: relation "p1" not closed

2017-03-06 Thread Kevin Grittner
With e434ad39ae7316bcf35fd578dd34ad7e1ff3c25f I did a `make world`,
`make install-world`, a fresh default initdb, a start with default
config, `make installcheck`, connected to the regression database
with psql as the initial superuser, and ran:

regression=# vacuum freeze analyze;
WARNING:  relcache reference leak: relation "p1" not closed
VACUUM

--
Kevin Grittner




Re: [HACKERS] WARNING: relcache reference leak: relation "p1" not closed

2017-03-06 Thread Kevin Grittner
[original message held up for review -- should be along eventually...]

On Mon, Mar 6, 2017 at 3:11 PM, Kevin Grittner  wrote:
> With e434ad39ae7316bcf35fd578dd34ad7e1ff3c25f I did a `make world`,
> `make install-world`, a fresh default initdb, a start with default
> config, `make installcheck`, connected to the regression database
> with psql as the initial superuser, and ran:
>
> regression=# vacuum freeze analyze;
> WARNING:  relcache reference leak: relation "p1" not closed
> VACUUM

Still present in e6477a8134ace06ef3a45a7ce15813cd398e72d8

-- 
Kevin Grittner




Re: [HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Giuseppe Broccolo (giuseppe.brocc...@2ndquadrant.it) wrote:
> >> I've seen that pg_dump execute the dump of an eventual comment of a
> >> TSDictionary without specifying the namespace where it is defined:
> 
> > Great catch!
> 
> One of my smarter CS professors taught me that whenever you find a bug,
> you should look around to see where else you made the same mistake.

Indeed, was planning to do so.

> Checking for other errors of the same ilk is depressingly fruitful :-(

Ah, ouch.

> It looks to me like operator classes, operator families, and all four
> types of text-search objects have this same error, and have for a long
> time.  I'm also wondering if it wouldn't be wise for the dumpComment()
> call for procedural languages to use "lanschema", so that the comment
> TocEntry matches the parent object in both cases for PL objects.

That does sound like a good idea..

> I'm also kind of wondering why the ArchiveEntry() calls for casts and
> transforms specify "pg_catalog" as the schema but we don't do that in
> their dumpComment() calls.  Maybe we don't need that hack and should
> specify NULL instead.

Hmmm, probably, though I've not looked specifically.

> > Right.  I've got a few other patches queued up for pg_dump, so I'll
> > take care of this also.
> 
> Do you want to deal with this whole mess, or shall I have a go at it?

I'm just about to push the pg_upgrade fixes for large object comments
and security labels, followed not too far behind by the fix for the
public ACL in pg_dump --clean mode.  I have no objection to you handling
these and I can go look at some of the other items on my plate.

Thanks!

Stephen




Re: [HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

2017-03-06 Thread Peter Eisentraut
On 3/3/17 09:03, Tomas Vondra wrote:
> Attached is v2, fixing both issues.

I wonder if

+   bytea  *raw_page = PG_GETARG_BYTEA_P(0);

+   uargs->page = VARDATA(raw_page);

is expected to work reliably, without copying the argument to a
different memory context.

I think it would be better not to maintain so much duplicate code
between bt_page_items(text, int) and bt_page_items(bytea).  How about
just redefining bt_page_items(text, int) as an SQL-language function
calling bt_page_items(get_raw_page($1, $2))?

For page_checksum(), the checksums are internally unsigned, so maybe
return type int on the SQL level, so that there is no confusion about
the sign?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dump a comment of a TSDictionary

2017-03-06 Thread Tom Lane
Stephen Frost  writes:
> * Giuseppe Broccolo (giuseppe.brocc...@2ndquadrant.it) wrote:
>> I've seen that pg_dump execute the dump of an eventual comment of a
>> TSDictionary without specifying the namespace where it is defined:

> Great catch!

One of my smarter CS professors taught me that whenever you find a bug,
you should look around to see where else you made the same mistake.
Checking for other errors of the same ilk is depressingly fruitful :-(
It looks to me like operator classes, operator families, and all four
types of text-search objects have this same error, and have for a long
time.  I'm also wondering if it wouldn't be wise for the dumpComment()
call for procedural languages to use "lanschema", so that the comment
TocEntry matches the parent object in both cases for PL objects.

I'm also kind of wondering why the ArchiveEntry() calls for casts and
transforms specify "pg_catalog" as the schema but we don't do that in
their dumpComment() calls.  Maybe we don't need that hack and should
specify NULL instead.

> Right.  I've got a few other patches queued up for pg_dump, so I'll
> take care of this also.

Do you want to deal with this whole mess, or shall I have a go at it?

regards, tom lane




Re: [HACKERS] Performance degradation in TPC-H Q18

2017-03-06 Thread Andres Freund
On 2017-03-04 11:09:40 +0530, Robert Haas wrote:
> On Sat, Mar 4, 2017 at 5:56 AM, Andres Freund  wrote:
> > attached is a patch to address this problem, and the one reported by
> > Dilip.  I ran a lot of TPC-H and other benchmarks, and so far this
> > addresses all the performance issues, often being noticeably faster than
> > with the dynahash code.
> >
> > Comments?
> 
> I'm still not convinced that raising the fillfactor like this is going
> to hold up in testing, but I don't mind you committing it and we'll
> see what happens.

I didn't see anything in testing, but I agree that it's debatable.  But
I'd rather commit it now, when we all know it's new code.  Raising it in
a new release will be a lot harder.


> I think DEBUG1 is far too high for something that could occur with
> some frequency on a busy system; I'm fairly strongly of the opinion
> that you ought to downgrade that by a couple of levels, say to DEBUG3
> or so.

I actually planned to remove it entirely, before committing. It was more
left in for testers to be able to see when the code triggers.


> > On 2017-03-03 11:23:00 +0530, Kuntal Ghosh wrote:
> >> On Fri, Mar 3, 2017 at 8:41 AM, Robert Haas  wrote:
> >> > On Fri, Mar 3, 2017 at 1:22 AM, Andres Freund  wrote:
> >> >> the resulting hash-values aren't actually meaningfully influenced by the
> >> >> IV. Because we just xor with the IV, most hash-value that without the IV
> >> >> would have fallen into a single hash-bucket, fall into a single
> >> >> hash-bucket afterwards as well; just somewhere else in the hash-range.
> >> >
> >> > Wow, OK.  I had kind of assumed (without looking) that setting the
> >> > hash IV did something a little more useful than that.  Maybe we should
> >> > do something like struct blah { int iv; int hv; }; newhv =
> >> > hash_any(, sizeof(blah)).
> >
> > The hash invocations are already noticeable performancewise, so I'm a
> > bit hesitant to go there.  I'd rather introduce a decent 'hash_combine'
> > function or such.
> 
> Yes, that might be better.  I wasn't too sure the approach I proposed
> would actually do a sufficiently-good job mixing it the bits from the
> IV anyway.  It's important to keep in mind that the values we're using
> as IVs aren't necessarily going to be uniformly distributed in any
> meaningful way.  They're just PIDs, so you might only have 1-3 bits of
> difference between one value and another within the same parallel
> query.  If you don't do something fairly aggressive to make that
> change perturb the final hash value, it probably won't.

FWIW, I played with some better mixing, and it helps a bit with
accurately sized hashtables and multiple columns.

What's however more interesting is that a better mixed IV and/or better
iteration now *slightly* *hurts* performance with grossly misestimated
sizes, because resizing has to copy more rows... Not what I predicted.

Greetings,

Andres Freund




Re: [HACKERS] Automatic cleanup of oldest WAL segments with pg_receivexlog

2017-03-06 Thread Peter Eisentraut
On 3/4/17 02:09, Michael Paquier wrote:
> Well, that's one reason why I was thinking that having an independent
> in-core option to clean up the tail of the oldest segments is
> interesting: users don't need to maintain their own infra logic to do
> anything. Now this end-segment command can as well be used with a
> small binary doing this cleanup, but the monitoring of the thing gets
> harder as multiple processes get spawned.

I think the initial idea of having an option that does something
specific is better than an invitation to run a general shell command.  I
have some doubts that the proposal to clean up old segments based on
file system space is workable.  For example, that assumes that you are
the only one operating on the file system.  If something else fills up
the file system, this system could then be induced to clean up
everything immediately, without any reference to what you still need.
Also, the various man pages about statvfs() that I found are pretty
pessimistic about how portable it is.

I think something that works similar to pg_archivecleanup that knows
what the last base backup is could work.  In fact, could
pg_archivecleanup not be made to work here?  It's an archive, and it
needs cleaning.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-03-06 Thread Jan Michálek
2017-03-06 19:45 GMT+01:00 Peter Eisentraut :

> On 3/5/17 05:40, Jan Michálek wrote:
> > jelen=# \pset linestyle rst
> > Line style is rst.
> > jelen=# \pset format wrapped
> > Output format is wrapped.
> > jelen=# SELECT repeat('Goodnight Irene ', 30);
> > +---
> --+
> > |
> > repeat|
> > +===
> ==+
> > | Goodnight Irene Goodnight Irene Goodnight Irene Goodnight Irene
> > Goodnight I.|
> > |.rene Goodnight Irene Goodnight Irene Goodnight Irene Goodnight Irene
> > Goodni.|
> > |.ght Irene Goodnight Irene Goodnight Irene Goodnight Irene Goodnight
> > Irene G.|
>
> I doubt that this kind of line breaking style is actually proper rst.
>

If you mean the wrapped lines, those were wrapped by the email client (in the
sent messages it looks like a table).
But really, this rst (the original version from the sent messages) doesn't
work (I actually tested it) and I don't know why, but the problem is probably
related to line breaks.
In the last version of the diff, pset is changed from linestyle to format, and
using the "wrapped" format is not possible.

Regards

Jan


> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>



-- 
Jelen
Starší čeledín datovýho chlíva


Re: [HACKERS] rename pg_log directory?

2017-03-06 Thread Tom Lane
Andreas Karlsson  writes:
> On 03/01/2017 05:49 AM, Peter Eisentraut wrote:
>> On 2/27/17 09:51, Tom Lane wrote:
>>> No objection to the basic point, but "log" seems perhaps a little too
>>> generic to me.  Would something like "server_log" be better?

>> Well, "log" is pretty well established.  There is /var/log, and if you
>> unpack a, say, Kafka or Cassandra distribution, they also come with a
>> log or logs directory.

> +1, though I am also fine with server_log.

FWIW, I'm not strongly advocating for "server_log", I just thought it
was worth having some discussion around the name.  Peter's point above
is pretty good though.

regards, tom lane




Re: [HACKERS] Faster methods for getting SPI results

2017-03-06 Thread Peter Eisentraut
On 3/5/17 16:07, Jim Nasby wrote:
>> There is nothing that requires us to materialize the results into an
>> actual list of actual rows.  We could wrap the SPI_tuptable into a
>> Python object and implement __getitem__ or __iter__ to emulate sequence
>> or mapping access.
> Would it be possible to have that just pull tuples directly from the 
> executor? The overhead of populating the tuplestore just to drain it 
> again can become quite significant, and AFAICT it's completely unnecessary.

I think there are many options, depending on what you want.  If you want
to materialize the result, then you have to materialize it somewhere,
and then make a Python object around that.  Or you could make an
iterator interface that just reads a tuple at a time.  Or maybe there
are other options.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] rename pg_log directory?

2017-03-06 Thread Andreas Karlsson

On 03/01/2017 05:49 AM, Peter Eisentraut wrote:

On 2/27/17 09:51, Tom Lane wrote:

No objection to the basic point, but "log" seems perhaps a little too
generic to me.  Would something like "server_log" be better?


Well, "log" is pretty well established.  There is /var/log, and if you
unpack a, say, Kafka or Cassandra distribution, they also come with a
log or logs directory.


+1, though I am also fine with server_log.

Andreas





Re: [HACKERS] Enabling replication connections by default in pg_hba.conf

2017-03-06 Thread Peter Eisentraut
On 3/3/17 20:30, Michael Paquier wrote:
> Yeah, it looks sensible to me to keep "replication" for physical
> replication, and switch logical replication checks to match a database
> name in hba comparisons.

I think we are OK to move ahead with this.

Another question would be why only enable connections for
@default_username@ by default, instead of all.

Also, with this change, some test code that sets up pg_hba.conf for
replication can be removed.  See attached patch.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 6f1c79dd34d67bf36a317d454eb6e6349598a97d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 6 Mar 2017 14:53:27 -0500
Subject: [PATCH] Enable replication connections by default in pg_hba.conf

---
 src/backend/libpq/pg_hba.conf.sample |  6 +++---
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |  7 ++-
 src/test/perl/PostgresNode.pm| 19 ++-
 3 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/src/backend/libpq/pg_hba.conf.sample b/src/backend/libpq/pg_hba.conf.sample
index e0fbfcb026..b0852c82c0 100644
--- a/src/backend/libpq/pg_hba.conf.sample
+++ b/src/backend/libpq/pg_hba.conf.sample
@@ -84,6 +84,6 @@ hostall all 127.0.0.1/32@authmethodhost@
 hostall all ::1/128 @authmethodhost@
 # Allow replication connections from localhost, by a user with the
 # replication privilege.
-@remove-line-for-nolocal@#local   replication @default_username@@authmethodlocal@
-#hostreplication @default_username@127.0.0.1/32@authmethodhost@
-#hostreplication @default_username@::1/128 @authmethodhost@
+@remove-line-for-nolocal@local   replication @default_username@@authmethodlocal@
+hostreplication @default_username@127.0.0.1/32@authmethodhost@
+hostreplication @default_username@::1/128 @authmethodhost@
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index aafb138fd5..14bd813896 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -4,7 +4,7 @@
 use Config;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 73;
+use Test::More tests => 72;
 
 program_help_ok('pg_basebackup');
 program_version_ok('pg_basebackup');
@@ -15,15 +15,12 @@
 my $node = get_new_node('main');
 
 # Initialize node without replication settings
-$node->init(hba_permit_replication => 0);
+$node->init;
 $node->start;
 my $pgdata = $node->data_dir;
 
 $node->command_fails(['pg_basebackup'],
 	'pg_basebackup needs target directory specified');
-$node->command_fails(
-	[ 'pg_basebackup', '-D', "$tempdir/backup" ],
-	'pg_basebackup fails because of hba');
 
 # Some Windows ANSI code pages may reject this filename, in which case we
 # quietly proceed without this bit of test coverage.
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index e5cb348f4c..7e530676b2 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -349,11 +349,7 @@ sub set_replication_conf
 
 	open my $hba, ">>$pgdata/pg_hba.conf";
 	print $hba "\n# Allow replication (set up by PostgresNode.pm)\n";
-	if (!$TestLib::windows_os)
-	{
-		print $hba "local replication all trust\n";
-	}
-	else
+	if ($TestLib::windows_os)
 	{
 		print $hba
 "host replication all $test_localhost/32 sspi include_realm=1 map=regress\n";
@@ -373,9 +369,6 @@ a directory that's only accessible to the current user to ensure that.
 On Windows, we use SSPI authentication to ensure the same (by pg_regress
 --config-auth).
 
-pg_hba.conf is configured to allow replication connections. Pass the keyword
-parameter hba_permit_replication => 0 to disable this.
-
 WAL archiving can be enabled on this node by passing the keyword parameter
 has_archiving => 1. This is disabled by default.
 
@@ -396,8 +389,6 @@ sub init
 	my $pgdata = $self->data_dir;
 	my $host   = $self->host;
 
-	$params{hba_permit_replication} = 1
-	  unless defined $params{hba_permit_replication};
 	$params{allows_streaming} = 0 unless defined $params{allows_streaming};
 	$params{has_archiving}= 0 unless defined $params{has_archiving};
 
@@ -451,7 +442,7 @@ sub init
 	}
 	close $conf;
 
-	$self->set_replication_conf if $params{hba_permit_replication};
+	$self->set_replication_conf if $params{allows_streaming};
 	$self->enable_archiving if $params{has_archiving};
 }
 
@@ -591,9 +582,6 @@ Does not start the node after initializing it.
 
 A recovery.conf is not created.
 
-pg_hba.conf is configured to allow replication connections. Pass the keyword
-parameter hba_permit_replication => 0 to disable this.
-
 Streaming replication can be 

Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements

2017-03-06 Thread Lukas Fittl
On Mon, Mar 6, 2017 at 9:36 AM, Robert Haas  wrote:

> On Sat, Mar 4, 2017 at 1:52 PM, Peter Geoghegan  wrote:
> > In my opinion, we expose query id (and dbid, and userid) as the
> > canonical identifier for each pg_stat_statements entry, and have done
> > so for some time. That's the stable API -- not query text. I'm aware
> > of cases where query text was used as an identifier, but that ended up
> > being hashed anyway.
> >
> > Query text is just for human consumption.
>
> Lukas evidently thinks otherwise, based on the original post.
>

I actually agree with Peter that the queryid+userid+dbid is the canonical
identifier, not the query text.

There is however value in parsing the query, e.g. to find out which
statement type something is, or to determine which table names a query
references (assuming one knows the search_path) programmatically.

It is for that latter reason I'm interested in parsing the query, and
avoiding the ambiguity that ? carries, since it's also an operator.

Based on some hackery, I've previously built a little example script that
filters pg_stat_statements output: https://github.com/lfittl/pg_qtop#usage

This script currently breaks in complex cases of ? operators, since the
pg_stat_statements query text is ambiguous.
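To make the ambiguity concrete, here is a toy illustration (my own sketch, not pg_qtop's actual code; the JSONB "key exists" operator is just one example of a `?`-containing operator):

```python
import re

# In today's pg_stat_statements text, a replaced constant and the JSONB
# "key exists" operator both appear as a bare "?":
normalized = "SELECT * FROM docs WHERE meta ? ? AND id = ?"

# A naive scanner sees three "?" tokens, although only two of them are
# replaced constants (the first one is the operator):
naive_count = normalized.count("?")

# With $n-style placeholders the same normalized query is unambiguous:
with_dollars = "SELECT * FROM docs WHERE meta ? $1 AND id = $2"
dollar_count = len(re.findall(r"\$\d+", with_dollars))

print(naive_count, dollar_count)  # prints: 3 2
```

Nothing short of a real SQL lexer can resolve the first form, which is exactly the breakage described above.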


> > I'd be in favor of a change
> > that makes it easier to copy and paste a query, to run EXPLAIN and so
> > on. Lukas probably realizes that there are no guarantees that the
> > query text that appears in pg_stat_statements will even appear as
> > normalized in all cases. The "sticky entry" stuff is intended to
> > maximize the chances of that happening, but it's still generally quite
> > possible (e.g. pg_stat_statements never swaps constants in a query
> > like "SELECT 5, pg_stat_statements_reset()"). This means that we
> > cannot really say that this buys us a machine-readable query text
> > format, at least not without adding some fairly messy caveats.
>
> Well, Lukas's original suggestion of using $n for a placeholder would
> do that, unless there's already a $n with the same numerical value,
> but Andres's proposal to use $-n or $:n would not.
>

Yes, and I do think that making it easier to run EXPLAIN would be the
primary user-visible benefit in core.

I'd be happy to add a docs section showing how to use this, if there is
some consensus that it's worth pursuing this direction.

Thanks for all the comments, appreciate the discussion.

Best,
Lukas

-- 
Lukas Fittl


Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

2017-03-06 Thread Andres Freund
Hi,

This issue has bothered me in non-partitioned use-cases recently, so
thanks for taking it up.


On 2017-03-06 18:22:17 +0300, Aleksander Alekseev wrote:
> diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
> index 2fb9a8bf58..fa906e7930 100644
> --- a/src/backend/postmaster/pgstat.c
> +++ b/src/backend/postmaster/pgstat.c
> @@ -160,6 +160,16 @@ typedef struct TabStatusArray
>  
>  static TabStatusArray *pgStatTabList = NULL;

Why are we keeping that list / the "batch" allocation pattern? It
doesn't actually seem useful to me after these changes.  Given that we
don't shrink hash-tables, we could just use the hash-table memory for
this, no?

I think a separate list that only contains things that are "active" in
the current transaction makes sense, because the current logic requires
iterating over a potentially very large array at transaction commit.
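The shape Andres describes — a grow-only hash table for lookups plus a short per-transaction list for commit — might be sketched like this (a Python analogy of the C data structures, not actual backend code):

```python
# A hash table that only grows (like the backend's dynahash tables) for
# O(1) rel_id -> entry lookup, plus a short list of the entries actually
# touched in the current transaction, so commit doesn't scan everything.
stats_by_rel = {}      # rel_id -> entry; never shrinks
active_this_xact = []  # only what commit must visit

def count_insert(rel_id):
    entry = stats_by_rel.get(rel_id)
    if entry is None:
        entry = {"rel_id": rel_id, "inserts": 0, "active": False}
        stats_by_rel[rel_id] = entry
    if not entry["active"]:
        entry["active"] = True
        active_this_xact.append(entry)
    entry["inserts"] += 1

def at_commit():
    # Iterate only over this transaction's entries, not the whole table.
    for entry in active_this_xact:
        entry["active"] = False
    active_this_xact.clear()
```

A transaction that touches two relations leaves only two entries on the commit-time list, however many relations have accumulated stats over the backend's lifetime.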


> +/* pgStatTabHash entry */
> +typedef struct TabStatHashEntry
> +{
> + Oid t_id;
> + PgStat_TableStatus* tsa_entry;
> +} TabStatHashEntry;
> +
> +/* Hash table for faster t_id -> tsa_entry lookup */
> +static HTAB *pgStatTabHash = NULL;
> +

'faster ... lookup' doesn't strike me as a useful comment, because it's
only useful in relation of the current code to the new code.



>  /*
>   * Backends store per-function info that's waiting to be sent to the 
> collector
>   * in this hash table (indexed by function OID).
> @@ -815,7 +825,13 @@ pgstat_report_stat(bool force)
>   pgstat_send_tabstat(this_msg);
>   this_msg->m_nentries = 0;
>   }
> +
> + /*
> +  * Entry will be freed soon so we need to remove it from the lookup table.
> +  */
> + hash_search(pgStatTabHash, &entry->t_id, HASH_REMOVE, NULL);
>   }

It's not really freed...


>  static PgStat_TableStatus *
>  get_tabstat_entry(Oid rel_id, bool isshared)
>  {
> + TabStatHashEntry* hash_entry;
>   PgStat_TableStatus *entry;
>   TabStatusArray *tsa;
> - TabStatusArray *prev_tsa;
> - int i;
> +
> + /* Try to find an entry */
> + entry = find_tabstat_entry(rel_id);
> + if(entry != NULL)
> + return entry;

Arguably it'd be better to HASH_ENTER directly here, instead of doing
two lookups.
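Andres's HASH_ENTER point is the classic find-or-create-in-one-probe pattern; in dynahash it is hash_search(..., HASH_ENTER, &found). A language-neutral sketch, with a Python dict standing in for the hash table:

```python
table = {}

def find_or_create_two_lookups(rel_id):
    entry = table.get(rel_id)        # probe #1
    if entry is None:
        entry = {"t_id": rel_id}
        table[rel_id] = entry        # probe #2
    return entry

def find_or_create_one_lookup(rel_id):
    # A single probe that inserts on a miss -- the behavior HASH_ENTER
    # (with its "found" out-parameter) gives the C code.  Note Python
    # builds the default dict eagerly here; the C code would allocate
    # the PgStat_TableStatus slot only when the key is genuinely new.
    return table.setdefault(rel_id, {"t_id": rel_id})
```

Both return the same entry object for a given key; the second just does it with one hash probe instead of two.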


>   /*
> -  * Search the already-used tabstat slots for this relation.
> +  * Entry doesn't exist - creating one.
> +  * First make sure there is a free space in a last element of 
> pgStatTabList.
>*/
> - prev_tsa = NULL;
> - for (tsa = pgStatTabList; tsa != NULL; prev_tsa = tsa, tsa = 
> tsa->tsa_next)
> + if (!pgStatTabList)
>   {
> - for (i = 0; i < tsa->tsa_used; i++)
> - {
> - entry = &tsa->tsa_entries[i];
> - if (entry->t_id == rel_id)
> - return entry;
> - }
> + HASHCTL ctl;
>  
> - if (tsa->tsa_used < TABSTAT_QUANTUM)
> + Assert(!pgStatTabHash);
> +
> + memset(&ctl, 0, sizeof(ctl));
> + ctl.keysize = sizeof(Oid);
> + ctl.entrysize = sizeof(TabStatHashEntry);
> + ctl.hcxt = TopMemoryContext;
> +
> + pgStatTabHash = hash_create("pgstat t_id to tsa_entry lookup table",
> + TABSTAT_QUANTUM, &ctl, HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);

Think it'd be better to move the hash creation into its own function.


- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RADIUS fallback servers

2017-03-06 Thread Adam Brightwell
On Mon, Mar 6, 2017 at 10:24 AM, Adam Brightwell
 wrote:
>>> I wonder if removing the complexity of maintaining two separate lists
>>> for the server and port would be a better/less complex approach.  For
>>> instance, why not go with a list of typical 'host:port' strings for
>>> 'radiusservers'?  If no port is specified, then simply use the default
>>> for that specific host. Therefore, we would not have to worry about
>>> keeping the two lists in sync. Thoughts?
>>
>>
>> If we do that we should do it for all the parameters, no? So not just
>> host:port, but something like host:port:secret:identifier? Mixing the two
>> ways of doing it would be quite confusing I think.
>>
>> And I wonder if that format wouldn't get even more confusing if you for
>> example want to use default ports, but non-default secrets.
>
> Yes, I agree. Such a format would be more confusing and I certainly
> wouldn't be in favor of it.
>
>> I can see how it would probably be easier in some of the simple cases, but I
>> wonder if it wouldn't make it worse in a lot of other cases.
>
> Ultimately, I think that it would be better off in a separate
> configuration file. Something to the effect of each line representing
> a server, something like:
>
> '   '
>
> With 'radiusservers' simply being the path to that file and
> 'radiusserver', etc. would remain as is. Where only one or the other
> could be provided, but not both. Though, that perhaps would be
> beyond the scope of this patch.
>
> At any rate, I'm going to continue moving forward with testing this patch as 
> is.
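For what it's worth, the 'host[:port]' list parsing floated earlier in the thread could be as simple as the following sketch (the comma separator and the 1812 default are my assumptions, not anything from the patch):

```python
DEFAULT_RADIUS_PORT = 1812  # assumption: the standard RADIUS auth port

def parse_radius_servers(value):
    """Parse a comma-separated list of host[:port] entries,
    filling in the default port when none is given."""
    servers = []
    for item in value.split(","):
        item = item.strip()
        if ":" in item:
            host, port = item.rsplit(":", 1)
            servers.append((host, int(port)))
        else:
            servers.append((item, DEFAULT_RADIUS_PORT))
    return servers
```

With this shape there is only one list to keep in sync, which was the original motivation for the suggestion.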

I have run through testing this patch against a small set of RADIUS
servers.  This testing included both single server and multiple server
configurations. All seems to work as expected.

-Adam




Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Andres Freund
Hi,

On 2017-03-06 19:49:56 +0100, Tomas Vondra wrote:
> On 03/06/2017 07:05 PM, Robert Haas wrote:
> > On Mon, Mar 6, 2017 at 12:44 PM, Andres Freund  wrote:
> > > On 2017-03-06 12:40:18 -0500, Robert Haas wrote:
> > > > On Wed, Mar 1, 2017 at 5:55 PM, Andres Freund  
> > > > wrote:
> > > > > The issue was that on 32bit platforms the Datum returned by some
> > > > > functions (int2int4_sum in this case) isn't actually a separately
> > > > > allocated Datum, but rather just something embedded in a larger
> > > > > struct.  That, combined with the following code:
> > > > > if (!peraggstate->resulttypeByVal && !*isnull &&
> > > > > !MemoryContextContains(CurrentMemoryContext,
> > > > >
> > > > > DatumGetPointer(*result)))
> > > > > seems somewhat problematic to me.  MemoryContextContains() can give
> > > > > false positives when used on memory that's not a distinctly allocated
> > > > > chunk, and if so, we violate memory lifetime rules.  It's quite
> > > > > unlikely, given the required bit patterns, but nonetheless it's making
> > > > > me somewhat uncomfortable.
> > > > > 
> > > > > Do others think this isn't an issue and we can just live with it?
> > > > 
> > > > I think it's 100% broken to call MemoryContextContains() on something
> > > > that's not guaranteed to be a palloc'd chunk.
> > > 
> > > I agree, but to me it seems the only fix would be to just yank out the
> > > whole optimization?
> > 
> > Dunno, haven't looked into it.
> > 
> 
> I think it might be fixable by adding a flag into the chunk, with 'true' for
> regular allocations, and 'false' for the optimized ones. And then only use
> MemoryContextContains() for 'flag=true' chunks.

I'm not quite following here.  We only get a Datum and the knowledge
that it's a pass-by-ref argument, so we really don't know that much.  We
could create an "EmbeddedDatum" type that has a preceding chunk header
(appropriately for the version), that just gets zeroed out at start.  Is
that what you mean?


> The question however is whether this won't make the optimization pointless.
> I also, wonder how much we save by this optimization and how widely it's
> used? Can someone point me to some numbers?

I don't recall any recent numbers.  I'm more than a bit doubful that it
really matters - it's only used for the results of aggregate/window
functions, and surely they've a good chunk of their own overhead...

Greetings,

Andres Freund




Re: [HACKERS] Parallel Index Scans

2017-03-06 Thread Gavin Flower

On 07/03/17 02:46, Amit Kapila wrote:

On Mon, Mar 6, 2017 at 6:49 PM, Robert Haas  wrote:

On Mon, Mar 6, 2017 at 6:33 AM, Amit Kapila  wrote:

I was going to do it after index and index-only scans and parallel
bitmap heap scan were committed, but I've been holding off on
committing parallel bitmap heap scan waiting for Andres to fix the
simplehash regressions, so maybe I should just go do an update now and
another one later once that goes in.


As you wish, but one point to note is that as of now parallelism for
index scans can be influenced by table level parameter
parallel_workers.  It sounds slightly awkward, but if we want to keep
that way, then maybe we can update the docs to indicate the same.
Another option is to have a separate parameter for index scans.



My immediate gut feeling was to have separate parameters.

On thinking about it, I think they serve different use cases.  I don't 
think of workers when I think of Index scans, and I suspect I'd find 
more reasons to keep them separate if I looked deeper.



Cheers,
Gavin






Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Tomas Vondra

On 03/06/2017 07:05 PM, Robert Haas wrote:

On Mon, Mar 6, 2017 at 12:44 PM, Andres Freund  wrote:

On 2017-03-06 12:40:18 -0500, Robert Haas wrote:

On Wed, Mar 1, 2017 at 5:55 PM, Andres Freund  wrote:

The issue was that on 32bit platforms the Datum returned by some
functions (int2int4_sum in this case) isn't actually a separately
allocated Datum, but rather just something embedded in a larger
struct.  That, combined with the following code:
if (!peraggstate->resulttypeByVal && !*isnull &&
!MemoryContextContains(CurrentMemoryContext,
   
DatumGetPointer(*result)))
seems somewhat problematic to me.  MemoryContextContains() can give
false positives when used on memory that's not a distinctly allocated
chunk, and if so, we violate memory lifetime rules.  It's quite
unlikely, given the required bit patterns, but nonetheless it's making
me somewhat uncomfortable.

Do others think this isn't an issue and we can just live with it?


I think it's 100% broken to call MemoryContextContains() on something
that's not guaranteed to be a palloc'd chunk.


I agree, but to me it seems the only fix would be to just yank out the
whole optimization?


Dunno, haven't looked into it.



I think it might be fixable by adding a flag into the chunk, with 'true' 
for regular allocations, and 'false' for the optimized ones. And then 
only use MemoryContextContains() for 'flag=true' chunks.


The question however is whether this won't make the optimization 
pointless. I also, wonder how much we save by this optimization and how 
widely it's used? Can someone point me to some numbers?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-03-06 Thread Peter Eisentraut
On 3/5/17 05:40, Jan Michálek wrote:
> jelen=# \pset linestyle rst
> Line style is rst.
> jelen=# \pset format wrapped
> Output format is wrapped.
> jelen=# SELECT repeat('Goodnight Irene ', 30);
> +-+
> |  
> repeat|
> +=+
> | Goodnight Irene Goodnight Irene Goodnight Irene Goodnight Irene
> Goodnight I.|
> |.rene Goodnight Irene Goodnight Irene Goodnight Irene Goodnight Irene
> Goodni.|
> |.ght Irene Goodnight Irene Goodnight Irene Goodnight Irene Goodnight
> Irene G.|

I doubt that this kind of line breaking style is actually proper rst.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] DROP FUNCTION of multiple functions

2017-03-06 Thread Peter Eisentraut
On 2/27/17 01:46, Michael Paquier wrote:
> On Sat, Feb 25, 2017 at 10:27 PM, Peter Eisentraut
>  wrote:
>> Here is a new patch set that addresses your comments.  The structure is
>> still the same, just a bunch of things have been renamed based on
>> suggestions.
> +  
> +   Drop multiple functions in one command:
> +
> +DROP FUNCTION sqrt(integer), sqrt(bigint);
> +
>   
> Perhaps adding as well on the page of DROP OPERATOR an example
> dropping multiple operators at once?

Committed with additional documentation.  Thanks.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Statement-level rollback

2017-03-06 Thread Robert Haas
On Fri, Mar 3, 2017 at 2:15 AM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Eisentraut
>> On 2/28/17 02:39, Tsunakawa, Takayuki wrote:
>> > I'd like to propose statement-level rollback feature.  To repeat myself,
>> this is requested for users to migrate from other DBMSs to PostgreSQL.  They
>> expect that a failure of one SQL statement should not abort the entire
>> transaction and their apps (client programs and stored procedures) can
>> continue the transaction with a different SQL statement.
>>
>> Can you provide some references on how other systems provide this feature?
>
> Oracle doesn't.

Really?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-03-06 Thread Robert Haas
On Sat, Mar 4, 2017 at 10:32 AM, Tom Lane  wrote:
> Without having actually looked at this patch, I would say that if it added
> a direct call of fopen() to backend-side code, that was already the wrong
> thing.  Almost always, AllocateFile() would be a better choice, not only
> because it's tied into transaction abort, but also because it knows how to
> release virtual FDs in event of ENFILE/EMFILE errors.  If there is some
> convincing reason why you shouldn't use AllocateFile(), then a safe
> cleanup pattern would be to have the fclose() in a PG_CATCH stanza.

I think that my previous remarks on this issue were simply muddled
thinking.  The SQL-callable function pg_current_logfile() does use
AllocateFile(), so the ERROR which may occur afterward if the file is
corrupted is no problem.  The syslogger, on the other hand, uses
logfile_open() to open the file, but it's careful not to throw an
ERROR while the file is open, just like other code which runs in the
syslogger.  So now I think there's no bug here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] contrib modules and relkind check

2017-03-06 Thread Corey Huinker
On Tue, Feb 14, 2017 at 1:30 AM, Michael Paquier 
wrote:

> Hm... It may be a good idea to be consistent on the whole system and
> refer to "partitioned table" as a table without storage and used as an
> entry point for partitions. The docs use this term in CREATE TABLE,
> and we would finish with messages like "not a table or a partitioned
> table". Extra thoughts are welcome here, the current inconsistencies
> would be confusing for users.
>

Updated CF status to "Waiting on Author". Waiting, mostly for the
regression tests suggested.

Michael, do you want to add yourself as a reviewer?


Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 12:53 PM, Stephen Frost  wrote:
> Regarding 02, I certainly see that as valuable for the reasons which
> David outlined in his initial email.  I can certainly take point on
> getting it committed, but I wouldn't complain if someone else does
> either.

Sold, to the snowman in the first row.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Robert Haas
On Mon, Mar 6, 2017 at 12:44 PM, Andres Freund  wrote:
> On 2017-03-06 12:40:18 -0500, Robert Haas wrote:
>> On Wed, Mar 1, 2017 at 5:55 PM, Andres Freund  wrote:
>> > The issue was that on 32bit platforms the Datum returned by some
>> > functions (int2int4_sum in this case) isn't actually a separately
>> > allocated Datum, but rather just something embedded in a larger
>> > struct.  That, combined with the following code:
>> > if (!peraggstate->resulttypeByVal && !*isnull &&
>> > !MemoryContextContains(CurrentMemoryContext,
>> >
>> > DatumGetPointer(*result)))
>> > seems somewhat problematic to me.  MemoryContextContains() can give
>> > false positives when used on memory that's not a distinctly allocated
>> > chunk, and if so, we violate memory lifetime rules.  It's quite
>> > unlikely, given the required bit patterns, but nonetheless it's making
>> > me somewhat uncomfortable.
>> >
>> > Do others think this isn't an issue and we can just live with it?
>>
>> I think it's 100% broken to call MemoryContextContains() on something
>> that's not guaranteed to be a palloc'd chunk.
>
> I agree, but to me it seems the only fix would be to just yank out the
> whole optimization?

Dunno, haven't looked into it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional

2017-03-06 Thread Stephen Frost
Robert, all,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Sat, Mar 4, 2017 at 9:12 AM, David Steele  wrote:
> > Yes, that makes sense.  Attached are two patches as requested:
> >
> > 01 - Just marks pg_stop_backup() variants as parallel restricted
> > 02 - Add the wait_for_archive param to pg_stop_backup().
> >
> > These apply cleanly on 272adf4.
> 
> Committed 01.  Nobody's offered an opinion about 02 yet, so I'm not
> going to commit that, but one minor nitpick:
> 
> +WAL to be archived.  This behavior is only useful for backup
> +software which independently monitors WAL archiving, otherwise WAL
> +required to make the backup consistent might be missing and make the 
> backup
> 
> I think this should really say "...which independently monitors WAL
> archiving.  Otherwise, WAL..."

Regarding 02, I certainly see that as valuable for the reasons which
David outlined in his initial email.  I can certainly take point on
getting it committed, but I wouldn't complain if someone else does
either.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional

2017-03-06 Thread Robert Haas
On Sat, Mar 4, 2017 at 9:12 AM, David Steele  wrote:
> Yes, that makes sense.  Attached are two patches as requested:
>
> 01 - Just marks pg_stop_backup() variants as parallel restricted
> 02 - Add the wait_for_archive param to pg_stop_backup().
>
> These apply cleanly on 272adf4.

Committed 01.  Nobody's offered an opinion about 02 yet, so I'm not
going to commit that, but one minor nitpick:

+WAL to be archived.  This behavior is only useful for backup
+software which independently monitors WAL archiving, otherwise WAL
+required to make the backup consistent might be missing and make the backup

I think this should really say "...which independently monitors WAL
archiving.  Otherwise, WAL..."

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Andres Freund
On 2017-03-06 12:40:18 -0500, Robert Haas wrote:
> On Wed, Mar 1, 2017 at 5:55 PM, Andres Freund  wrote:
> > The issue was that on 32bit platforms the Datum returned by some
> > functions (int2int4_sum in this case) isn't actually a separately
> > allocated Datum, but rather just something embedded in a larger
> > struct.  That, combined with the following code:
> > if (!peraggstate->resulttypeByVal && !*isnull &&
> > !MemoryContextContains(CurrentMemoryContext,
> >
> > DatumGetPointer(*result)))
> > seems somewhat problematic to me.  MemoryContextContains() can give
> > false positives when used on memory that's not a distinctly allocated
> > chunk, and if so, we violate memory lifetime rules.  It's quite
> > unlikely, given the required bit patterns, but nonetheless it's making
> > me somewhat uncomfortable.
> >
> > Do others think this isn't an issue and we can just live with it?
> 
> I think it's 100% broken to call MemoryContextContains() on something
> that's not guaranteed to be a palloc'd chunk.

I agree, but to me it seems the only fix would be to just yank out the
whole optimization?

- Andres




Re: [HACKERS] PATCH: two slab-like memory allocators

2017-03-06 Thread Robert Haas
On Wed, Mar 1, 2017 at 5:55 PM, Andres Freund  wrote:
> The issue was that on 32bit platforms the Datum returned by some
> functions (int2int4_sum in this case) isn't actually a separately
> allocated Datum, but rather just something embedded in a larger
> struct.  That, combined with the following code:
> if (!peraggstate->resulttypeByVal && !*isnull &&
> !MemoryContextContains(CurrentMemoryContext,
>
> DatumGetPointer(*result)))
> seems somewhat problematic to me.  MemoryContextContains() can give
> false positives when used on memory that's not a distinctly allocated
> chunk, and if so, we violate memory lifetime rules.  It's quite
> unlikely, given the required bit patterns, but nonetheless it's making
> me somewhat uncomfortable.
>
> Do others think this isn't an issue and we can just live with it?

I think it's 100% broken to call MemoryContextContains() on something
that's not guaranteed to be a palloc'd chunk.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements

2017-03-06 Thread Robert Haas
On Sat, Mar 4, 2017 at 1:52 PM, Peter Geoghegan  wrote:
> On Sat, Mar 4, 2017 at 8:02 AM, Tom Lane  wrote:
>>> Perhaps there could be a choice of behaviors.  Even if we all agreed
>>> that parameter notation was better in theory, there's something to be
>>> said for maintaining backward compatibility, or having an option to do
>>> so.
>>
>> Meh ... we've generally regretted it when we "solved" a backwards
>> compatibility problem by introducing a GUC that changes query semantics.
>> I'm inclined to think we should either do it or not.
>
> In my opinion, we expose query id (and dbid, and userid) as the
> canonical identifier for each pg_stat_statements entry, and have done
> so for some time. That's the stable API -- not query text. I'm aware
> of cases where query text was used as an identifier, but that ended up
> being hashed anyway.
>
> Query text is just for human consumption.

Lukas evidently thinks otherwise, based on the original post.

> I'd be in favor of a change
> that makes it easier to copy and paste a query, to run EXPLAIN and so
> on. Lukas probably realizes that there are no guarantees that the
> query text that appears in pg_stat_statements will even appear as
> normalized in all cases. The "sticky entry" stuff is intended to
> maximize the chances of that happening, but it's still generally quite
> possible (e.g. pg_stat_statements never swaps constants in a query
> like "SELECT 5, pg_stat_statements_reset()"). This means that we
> cannot really say that this buys us a machine-readable query text
> format, at least not without adding some fairly messy caveats.

Well, Lukas's original suggestion of using $n for a placeholder would
do that, unless there's already a $n with the same numerical value,
but Andres's proposal to use $-n or $:n would not.
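The collision Robert describes can be shown in a few lines (a sketch; the replacement scheme here is illustrative only):

```python
# A query that already uses $1/$2 as genuine parameters, plus one constant:
query = "SELECT a + $1 FROM t WHERE b = $2 AND c = 5"

# Replacing the constant with the next unused $n stays syntactically valid
# (and EXPLAIN-able), but the text no longer distinguishes a replaced
# constant from a real parameter:
as_dollar_n = query.replace("5", "$3")

# With Andres's $-n (or $:n) proposal the marker cannot be mistaken for a
# real parameter, at the cost of not being valid SQL as-is:
as_marked = query.replace("5", "$-1")
```

Here `as_dollar_n` ends in `c = $3`, indistinguishable from `$1`/`$2`, while `as_marked` ends in the unmistakable `c = $-1`.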

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] perlcritic

2017-03-06 Thread Dagfinn Ilmari Mannsåker
ilm...@ilmari.org (Dagfinn Ilmari Mannsåker) writes:

> Hi Peter,
>
> Peter Eisentraut  writes:
>
>> I posted this about 18 months ago but then ran out of steam. [ ] Here
>> is an updated patch.  The testing instructions below still apply.
>> Especially welcome would be ideas on how to address some of the places
>> I have marked with ## no critic.
>
> Attached is a patch on top of yours that addresses all the ## no critic
> annotations except RequireFilenameMatchesPackage, which can't be fixed
> without more drastic reworking of the plperl build process.
>
> Tested on perl 5.8.1 and 5.24.0 by configuring with --with-perl and
> --enable-tap-tests followed by make check-world, and running pgindent
> --build.

Attached is an updated version of the patch, in which
src/tools/msvc/gendef.pl actually compiles.  If someone on Windows could
test it, that would be great.

-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law

>From 2bbdd768bdbabe10e0af6b95d2d09d29095d3a8b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 1 Mar 2017 15:32:45 +
Subject: [PATCH] Fix most perlcritic exceptions

The RequireFilenameMatchesPackage ones would require reworking the
plperl build process more drastically.
---
 src/pl/plperl/plc_perlboot.pl | 9 +++--
 src/tools/msvc/gendef.pl  | 2 +-
 src/tools/pgindent/pgindent   | 6 +++---
 3 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/pl/plperl/plc_perlboot.pl b/src/pl/plperl/plc_perlboot.pl
index 292c9101c9..b4212f5ab2 100644
--- a/src/pl/plperl/plc_perlboot.pl
+++ b/src/pl/plperl/plc_perlboot.pl
@@ -81,18 +81,15 @@ sub ::encode_array_constructor
 		} sort keys %$imports;
 		$BEGIN &&= "BEGIN { $BEGIN }";
 
-		return qq[ package main; sub { $BEGIN $prolog $src } ];
+		# default no strict and no warnings
+		return qq[ package main; sub { no strict; no warnings; $BEGIN $prolog $src } ];
 	}
 
 	sub mkfunc
 	{
-		## no critic (ProhibitNoStrict, ProhibitStringyEval);
-		no strict;  # default to no strict for the eval
-		no warnings;# default to no warnings for the eval
-		my $ret = eval(mkfuncsrc(@_));
+		my $ret = eval(mkfuncsrc(@_)); ## no critic (ProhibitStringyEval);
 		$@ =~ s/\(eval \d+\) //g if $@;
 		return $ret;
-		## use critic
 	}
 
 	1;
diff --git a/src/tools/msvc/gendef.pl b/src/tools/msvc/gendef.pl
index 64227c2dce..598699e6ea 100644
--- a/src/tools/msvc/gendef.pl
+++ b/src/tools/msvc/gendef.pl
@@ -174,7 +174,7 @@ sub usage
 
 my %def = ();
 
-while (<$ARGV[0]/*.obj>)  ## no critic (RequireGlobFunction);
+while (glob("$ARGV[0]/*.obj"))
 {
 	my $objfile = $_;
 	my $symfile = $objfile;
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index a6b24b5348..51d6a28953 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -159,8 +159,7 @@ sub process_exclude
 		while (my $line = <$eh>)
 		{
 			chomp $line;
-			my $rgx;
-			eval " \$rgx = qr!$line!;";  ## no critic (ProhibitStringyEval);
+			my $rgx = eval { qr!$line! };
 			@files = grep { $_ !~ /$rgx/ } @files if $rgx;
 		}
 		close($eh);
@@ -435,7 +434,8 @@ sub diff
 
 sub run_build
 {
-	eval "use LWP::Simple;";  ## no critic (ProhibitStringyEval);
+	require LWP::Simple;
+	LWP::Simple->import(qw(getstore is_success));
 
 	my $code_base = shift || '.';
 	my $save_dir = getcwd();
-- 
2.11.0




Re: [HACKERS] [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

2017-03-06 Thread Dagfinn Ilmari Mannsåker
David Christensen  writes:

>> Hi David,
>> 
>> Here's a review of your patch.
>
> Hi Ilmari, thanks for your time and review.  I’m fine with the revised 
> version.

Okay, I've marked the patch as Ready For Committer.

Thanks,

Ilmari

-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law




[HACKERS] Re: [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

2017-03-06 Thread David Christensen
> Hi David,
> 
> Here's a review of your patch.

Hi Ilmari, thanks for your time and review.  I’m fine with the revised version.

Best,

David
--
David Christensen
End Point Corporation
da...@endpoint.com
785-727-1171







[HACKERS] [GSoC] Push-based query executor discussion

2017-03-06 Thread Arseny Sher
Hello,

I would like to work on the push-based executor [1] during GSoC, so I'm
writing to introduce myself and start the discussion of the project. I
should mention beforehand that the subject is my master's thesis topic,
and I have already started working on it. This letter is (obviously) not
a finished proposal, but rather a starting point for discussing the
concept. Below you can find a short review of the idea, a description of
the benefits for the community, the project details, related work and
some info about me.


*Brief review*
The idea is described at the wiki page [1] and in the letter [2]. I
propose to replace the current ExecProcNode interface between execution
nodes with a function called, say, pushTuple, which pushes a ready tuple
to the current node's parent.


*Benefits for the community*
Why would we want this? In general, because the Postgres executor is
slow for CPU-bound queries and this approach should accelerate it. [4]
and [5] argue that this model results in better code and data locality,
and that JIT compilation makes the difference even more drastic.

Besides, while working on this, in order to study the effects of the
model change I will try to investigate the Postgres executor's
performance in both models extensively. For instance, it is commonly
accepted that the current Volcano-style model makes poor use of modern
CPUs' pipelining abilities and causes a large percentage of branch
mispredictions. I am going to see whether, where and when this is true
in Postgres; the profiling results should be useful for any further
executor optimizations.


*Project details*
Technically, I am planning to implement this as follows. The code common
to all nodes that needs to be changed is in execMain.c and
execProcnode.c; standard_ExecutorRun in execMain.c should now start
execution of all leaf nodes in the proper order instead of pulling
tuples one by one from the top-level node. By 'proper' order I mean that
inner nodes are run first and outer nodes second, so that by the time
the first tuple from the outer side of a node arrives, the node has
already received all of its tuples from the inner side.

How do we 'start' execution of a leaf? Recall that instead of
ExecProcNode we now have a pushTuple function with the following signature:

bool pushTuple(TupleTableSlot *slot, PlanState *node, PlanState *pusher)

'slot' is the tuple we push, 'node' is the receiver of the tuple, and
'pusher' is the sender, whose parent is 'node'. We need 'pusher' only to
distinguish inner pushes from outer ones. This function returns true if
'node' still accepts tuples after the push, and false if not; e.g. the
Limit node can return false once the required number of tuples has been
passed through. We also adopt the convention that when a node has
nothing more to push, it calls pushTuple with slot=NULL to let its
parent know that it is done. So, to start execution of a leaf,
standard_ExecutorRun basically needs to call pushTuple(NULL, leaf,
NULL) once. Leaf nodes are a special case because pusher=NULL; another
obvious special case is the top-level node: it calls pushTuple(slot,
NULL, node), and such a call pushes the slot to the destination
((*dest->receiveSlot) (slot, dest) in the current code).

Like ExecProcNode, pushTuple will call the proper implementation, e.g.
pushTupleToLimit. Such implementations will contain code similar to
their pull-model analogues (e.g. ExecLimit), but, very roughly, where we
currently have

return slot;

now, in the push model, we will have

bool parent_accepts_tuples = pushTuple(slot, node->parent, node);

and then we will continue execution if parent_accepts_tuples is true,
or exit if not.

Complex nodes require more involved modifications to preserve correct
behaviour and stay efficient. The latter leads to some architectural
issues: for example, an efficient SeqScan should call pushTuple from a
function doing roughly what heapgettups_pagemode currently does;
otherwise we would need to save and restore its state (lines, page,
etc.) for every single tuple. On the other hand, it is not nice to call
pushTuple from there, because the access layer (heapam.c) currently
knows nothing about PlanStates. Such issues will need to be addressed
and discussed with the community.

Currently, I have a prototype (very much a work in progress) which
implements this model for the SeqScan, Limit, Hash and HashJoin nodes.

Since the TPC-H benchmarks are the de facto standard for evaluating such
things, I am planning to use them for testing. BTW, I've written a
couple of scripts to automate this job [16], although it seems that
everyone who tests TPC-H ends up writing their own version.

Now, it is clear that rewriting all nodes with full support in this
manner is a huge amount of work. Besides, we still don't know the
quantitative benefit of this model.  Because of that, I do not propose
any timeline right now; instead, we should decide which part of this
work (if any) is going to be done in the course of GSoC. Probably, all
TPC-H queries with and without index support is a good initial target,
but this needs to be discussed. Anyway, I don't think that the result
will be a patch 

Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-03-06 Thread Jan Michálek
2017-03-06 16:25 GMT+01:00 Pavel Stehule :

>
>
> 2017-03-06 16:17 GMT+01:00 Jan Michálek :
>
>>
>>
>> 2017-03-06 15:19 GMT+01:00 Pavel Stehule :
>>
>>>
>>>
>>> 2017-03-06 0:26 GMT+01:00 Jan Michálek :
>>>


 2017-03-05 14:02 GMT+01:00 Jan Michálek :

>
>
> 2017-03-05 13:39 GMT+01:00 Pavel Stehule :
>
>>
>>
>> 2017-03-05 13:22 GMT+01:00 Pavel Stehule :
>>
>>>
>>>
>>> 2017-03-05 13:08 GMT+01:00 Jan Michálek :
>>>
 It is a question whether it is really a new format, because the
 formatting is the same as the aligned/wrapped format; only the
 style of the lines changes.

>>>
>>> Please, don't do top posting https://en.wikipedia.org/wiki/
>>> Posting_style#Top-posting
>>>
>>>
 2017-03-05 12:36 GMT+01:00 Pavel Stehule :

>
>
> 2017-03-05 11:40 GMT+01:00 Jan Michálek :
>
>> I know, but both new linestyles are created literally by cloning the
>> ascii linestyle plus a few lines in print_aligned_text. Both work with
>> the "aligned" and "wrapped" formats. In rst the wrapped format is
>> useful, in my opinion; in markdown I can't find how to get a newline
>> inside a record (maybe it is not possible in the main markdown
>> variants). That is why I added markdown and rst as new linestyles. But
>> it is not a problem to change the command to use "\pset format"; I
>> just think this is cleaner.
>>
>
> Using a special linestyle for the new format is probably possible. But
> the new format should be switched with the \pset format command.
>

 changed

>>>
>>> quick test shows it is working.
>>>
>>> Just an idea - can you specify alignment? Right alignment for
>>> numbers, left for others?
>>>
>>
>> I used the "aligned" format as it is, and I don't know if I'm able to
>> do this with the current solution, which is based only on changing the
>> linestyle internally.
>>
>
> Can you try to test the result in some markdown processors? I will not
> be surprised if these processors ignore alignment in the input data.
>

I tested markdown only in retext, and retext aligns the columns marked
with -:| well.
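For concreteness, a markdown rendering of a two-column psql result might
look something like this (a hypothetical sketch, not output from the
patch; the colons in the delimiter row are what markdown processors such
as retext use to decide per-column alignment, `---:` meaning
right-aligned and `:---` left-aligned):

```markdown
| id | name  |
|---:|:------|
|  1 | alpha |
| 42 | beta  |
```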

>
> Regards
>
> Pavel
>
>
>
>>
>>> Regards
>>>
>>> Pavel
>>>
>>>
>>
>>
>> --
>> Jelen
>> Senior farmhand of the data barn
>>
>
>


-- 
Jelen
Senior farmhand of the data barn

