Re: [HACKERS] taking stdbool.h into use

2017-12-24 Thread Michael Paquier
On Thu, Dec 21, 2017 at 1:02 AM, Peter Eisentraut
 wrote:
> On 11/15/17 15:13, Peter Eisentraut wrote:
>> I'm going to put this patch set as Returned With Feedback for now.  The
>> GinNullCategory issues look like they will need quite a bit of work.
>> But it will be worth picking this up some time.
>
> I think the issue with GinNullCategory is practically unfixable.  This
> is on-disk data that needs to be castable to an array of bool.  So
> tolerating a bool of size other than 1 would either require a disk
> format change or extensive code changes, neither of which seem
> worthwhile at this point.
>
> So here is a minimal patch set to perhaps wrap this up for the time
> being.  I have added static assertions that check the sizes of
> GinNullCategory and GinTernaryValue, which I think are the two critical
> places that require compatibility with bool.  And then we include
>  only if its bool has size 1.

+   /*
+* now we can use the nullFlags as category codes
+*/
+   StaticAssertStmt(sizeof(GinNullCategory) == sizeof(bool),
+"sizes of GinNullCategory and bool are not equal");
*categories = (GinNullCategory *) nullFlags;

Hm. So on powerpc compilation is going to fail with this patch as
sizeof(char) is 1, no? Wouldn't it be better to just allocate an array
for GinNullCategory entries and then just fill in the values one by
one with GIN_CAT_NORM_KEY or GIN_CAT_NULL_KEY by scanning nullFlags?
-- 
Michael



Re: Observations in Parallel Append

2017-12-24 Thread Amit Kapila
On Sun, Dec 24, 2017 at 12:06 PM, Robert Haas  wrote:
> On Fri, Dec 22, 2017 at 6:18 AM, Amit Kapila  wrote:
>
>> Also, don't we need to use parallel_divisor for partial paths instead
>> of non-partial paths as those will be actually distributed among
>> workers?
>
> Uh, that seems backwards to me.  We're trying to estimate the average
> number of rows per worker.
>

Okay, but is it appropriate to use the parallel_divisor?  The
parallel_divisor means the contribution of all the workers (+
leader_contribution) whereas for non-partial paths there will be
always only the subset of workers which will operate on them.
Consider a case with one non-partial subpath and five partial subpaths
with six as parallel_divisor, now the current code will try to divide
the rows of non-partial subpath with respect to six workers.  However,
in reality, there will always be one worker which will execute that
path.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



RE: User defined data types in Logical Replication

2017-12-24 Thread Huong Dangminh
> From: Masahiko Sawada [mailto:sawada.m...@gmail.com]
> >> Attached a new version patch incorporated the aboves. Please review it.
> >
> > Thanks for updating the patch.
> > It looks fine to me.

I mean that it passes make check, and subscription TAP tests too.

> >
> 
> Thank you for confirmation.
> 

Also. I have changed status in CF to Ready for Committer.
I would be glad if it can be applied from PostgreSQL 10.


---
Thanks and best regards,
Dang Minh Huong
NEC Solution Innovators, Ltd.
http://www.nec-solutioninnovators.co.jp/en/


Re: User defined data types in Logical Replication

2017-12-24 Thread Masahiko Sawada
On Sat, Dec 23, 2017 at 4:08 PM, Dang Minh Huong  wrote:
> On 2017/12/21 10:05, Masahiko Sawada wrote:
>
>> On Wed, Dec 20, 2017 at 5:39 PM, Huong Dangminh
>>  wrote:
>>>
>>> Hi Sawada-san,
>>>
 Thank you for quick response. The changes look good to me. But I wonder
 if the following changes needs some comments to describe what each
 checks
 does for.

 -if (errarg->attnum < 0)
 +if (errarg->local_attnum <0)
 +return;
 +
 +rel = errarg->rel;
 +remote_attnum = rel->attrmap[errarg->local_attnum];
 +
 +if (remote_attnum < 0)
   return;
>>>
>>> Thanks, I have added some comments as you mentioned.
>>>
>> Thank you for updating the patch.
>>
>> -   if (errarg->attnum < 0)
>> +   /* Check case of slot_store_error_callback() is called before
>> +* errarg.local_attnum is set. */
>> +   if (errarg->local_attnum <0)
>>
>> This comment style isn't preferred by PostgreSQL code. Please refer to
>> https://www.postgresql.org/docs/current/static/source-format.html
>> --
>> $ git diff --check
>> src/backend/replication/logical/worker.c:291: trailing whitespace.
>> +   /* Check case of slot_store_error_callback() is called before
>
> Thanks, I will be careful in the next time.
>>
>> There is an extra white space in the patch.
>>
>> I'm inclined to think SlotErrCallbackArg can have remote_attnum and
>> pass it to the callback function. That way, we don't need to case the
>> case where local_attnum is not set yet.
>
> Nice.
>>
>> Attached a new version patch incorporated the aboves. Please review it.
>
> Thanks for updating the patch.
> It looks fine to me.
>

Thank you for confirmation.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: Reproducible builds: genbki.pl and Gen_fmgrtab.pl

2017-12-24 Thread Michael Paquier
On Mon, Dec 25, 2017 at 2:25 AM, Peter Eisentraut 
 wrote:
> On 12/21/17 08:13, Andrew Dunstan wrote:
>> Looks reasonable. Regarding the change to TestLib.pm, we should make
>> sure that the tests have unique names. There is a small amount of
>> duplication currently:
>>
>> ./src/bin/pg_dump/t/001_basic.pl
>> ./src/bin/pg_rewind/t/001_basic.pl
>> ./src/test/modules/commit_ts/t/001_base.pl
>> ./src/test/modules/test_pg_dump/t/001_base.pl
>
> But that's one of the reasons one has directories, so you don't need to
> have globally unique names.  I don't actually see why the change in
> TestLib.pm is necessary or how it relates to this thread.

+1 to both things here. I have a hard time undestanding why changing the
output of TAP test logs has anything to do with this thread, as well as
why having unique file names is mandatory. If this really matters, I would
suggest spawning a different thread.
-- 
Michael



Re: Reproducible builds: genbki.pl and Gen_fmgrtab.pl

2017-12-24 Thread David Fetter
On Sun, Dec 24, 2017 at 12:26:53PM -0500, Peter Eisentraut wrote:
> On 12/20/17 21:50, Tom Lane wrote:
> > -my $this_script = $0;
> > +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_BIG5.pl';
> 
> This kind of things looks awful.  Why is this better?

What's wrong with using the path?  As you point out elsewhere on this
thread, it's a way to identify the program uniquely and reproducibly.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Reproducible builds: genbki.pl and Gen_fmgrtab.pl

2017-12-24 Thread Peter Eisentraut
On 12/20/17 21:50, Tom Lane wrote:
> -my $this_script = $0;
> +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_BIG5.pl';

This kind of things looks awful.  Why is this better?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Reproducible builds: genbki.pl and Gen_fmgrtab.pl

2017-12-24 Thread Peter Eisentraut
On 12/21/17 08:13, Andrew Dunstan wrote:
> Looks reasonable. Regarding the change to TestLib.pm, we should make
> sure that the tests have unique names. There is a small amount of
> duplication currently:
> 
> ./src/bin/pg_dump/t/001_basic.pl
> ./src/bin/pg_rewind/t/001_basic.pl
> ./src/test/modules/commit_ts/t/001_base.pl
> ./src/test/modules/test_pg_dump/t/001_base.pl

But that's one of the reasons one has directories, so you don't need to
have globally unique names.  I don't actually see why the change in
TestLib.pm is necessary or how it relates to this thread.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2017-12-24 Thread Tomas Vondra


On 12/24/2017 05:51 AM, Craig Ringer wrote:
> On 23 December 2017 at 12:57, Tomas Vondra  > wrote:
> 
> Hi all,
> 
> Attached is a patch series that implements two features to the logical
> replication - ability to define a memory limit for the reorderbuffer
> (responsible for building the decoded transactions), and ability to
> stream large in-progress transactions (exceeding the memory limit).
> 
> I'm submitting those two changes together, because one builds on the
> other, and it's beneficial to discuss them together.
> 
> 
> PART 1: adding logical_work_mem memory limit (0001)
> ---
> 
> Currently, limiting the amount of memory consumed by logical decoding is
> tricky (or you might say impossible) for several reasons:
> 
> * The value is hard-coded, so it's not quite possible to customize it.
> 
> * The amount of decoded changes to keep in memory is restricted by
> number of changes. It's not very unclear how this relates to memory
> consumption, as the change size depends on table structure, etc.
> 
> * The number is "per (sub)transaction", so a transaction with many
> subtransactions may easily consume significant amount of memory without
> actually hitting the limit.
> 
> 
> Also, even without subtransactions, we assemble a ReorderBufferTXN
> per transaction. Since transactions usually occur concurrently,
> systems with many concurrent txns can face lots of memory use.
> 

I don't see how that could be a problem, considering the number of
toplevel transactions is rather limited (to max_connections or so).

> We can't exclude tables that won't actually be replicated at the reorder
> buffering phase either. So txns use memory whether or not they do
> anything interesting as far as a given logical decoding session is
> concerned. Even if we'll throw all the data away we must buffer and
> assemble it first so we can make that decision.

Yep.

> Because logical decoding considers snapshots and cid increments even
> from other DBs (at least when the txn makes catalog changes) the memory
> use can get BIG too. I was recently working with a system that had
> accumulated 2GB of snapshots ... on each slot. With 7 slots, one for
> each DB.
> 
> So there's lots of room for difficulty with unpredictable memory use.
> 

Yep.

> So the patch does two things. Firstly, it introduces logical_work_mem, a
> GUC restricting memory consumed by all transactions currently kept in
> the reorder buffer
> 
> 
> Does this consider the (currently high, IIRC) overhead of tracking
> serialized changes?
>  

Consider in what sense?


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2017-12-24 Thread Erik Rijkers


logical replication of 2 instances is OK but 3 and up fail with:

TRAP: FailedAssertion("!(last_lsn < change->lsn)", File:
"reorderbuffer.c", Line: 1773)

I can cobble up a script but I hope you have enough from the 
assertion

to see what's going wrong...


The assertion says that the iterator produces changes in order that 
does

not correlate with LSN. But I have a hard time understanding how that
could happen, particularly because according to the line number this
happens in ReorderBufferCommit(), i.e. the current (non-streaming) 
case.


So instructions to reproduce the issue would be very helpful.


Using:

0001-Introduce-logical_work_mem-to-limit-ReorderBuffer-v2.patch
0002-Issue-XLOG_XACT_ASSIGNMENT-with-wal_level-logical-v2.patch
0003-Issue-individual-invalidations-with-wal_level-log-v2.patch
0004-Extend-the-output-plugin-API-with-stream-methods-v2.patch
0005-Implement-streaming-mode-in-ReorderBuffer-v2.patch
0006-Add-support-for-streaming-to-built-in-replication-v2.patch

As you expected the problem is the same with these new patches.

I have now tested more, and seen that it not always fails.  I guess 
that
it here fails 3 times out of 4.  But the laptop I'm using at the 
moment

is old and slow -- it may well be a factor as we've seen before [1].

Attached is the bash that I put together.  I tested with
NUM_INSTANCES=2, which yields success, and NUM_INSTANCES=3, which 
fails

often.  This same program run with HEAD never seems to fail (I tried a
few dozen times).



Thanks. Unfortunately I still can't reproduce the issue. I even tried
running it in valgrind, to see if there are some memory access issues
(which should also slow it down significantly).


One wonders again if 2ndquadrant shouldn't invest in some old hardware 
;)


Another Good Thing would be if there was a provision in the buildfarm to 
test patches like these.


But I'm probably not to first one to suggest that; no doubt it'll be 
possible someday.  In the meantime I'll try to repeat this crash on 
other machines (but that will be after the holidays).



Erik Rijkers



Re: General purpose hashing func in pgbench

2017-12-24 Thread Fabien COELHO


Hello Ildar,


Actually the "bad" one appears in YCSB.


Fine. Then it must be kept, whatever its quality.

But if we should choose the only one I would stick to murmur too given 
it provides better results while having similar computational 
complexity.


No. Keep both as there is a justification for the bad one. Just make 
"hash()" default to a good one.



One implementation put constants in defines, the other one uses "const
int". [...]

[...] it looked ugly and hard to read (IMHO), like:

    k *= MURMUR2_M;
    k ^= k >> MURMUR2_R;
    k *= MURMUR2_M;
    result ^= k;
    result *= MURMUR2_M;


Yep. The ugliness is significantly linked to the choice of name. With 
MM2_MUL and MM2_ROT ISTM that it is more readable:



    k *= MM2_MUL;
    k ^= k >> MM2_ROT;
    k *= MM2_MUL;
    result ^= k;
    result *= MM2_MUL;


[...] So I'd better leave it the way it is. Actually I was thinking to 
do the same to fnv1a too : )


I think that the implementation style should be homogeneous, so I'd 
suggest at least to stick to one style.


I noticed from the source of all human knowledege (aka Wikipedia:-) that 
there seems to be a murmur3 successor. Have you considered it? One good 
reason to skip it would be that the implementation is long and complex. 
I'm not sure about a 8-byte input simplified version.


Just a question: Have you looked at SipHash24?

https://en.wikipedia.org/wiki/SipHash

The interesting point is that it can use a key and seems somehow 
cryptographically secure, for a similar cost. However the how to decide 
for/control the key is unclear.


--
Fabien.