Re: MSVC Build support with visual studio 2019

2019-07-03 Thread Haribabu Kommi
On Wed, 3 Jul 2019 at 10:01, Michael Paquier  wrote:

> On Tue, Jul 02, 2019 at 02:10:11PM +0900, Michael Paquier wrote:
> > OK, committed to HEAD for now after perltidy'ing the patch.  Let's see
> > what the buildfarm has to say about it first.  Once we are sure that
> > the thing is stable, I'll try to backpatch it.  This works on my own
> > dev machines with VS 2015 and 2019, but who knows what hides in the
> > shadows...
>
> The buildfarm did not have much to say, so backpatched down to 9.4,
> adjusting things on the way.


Thanks Michael.

Regards,
Haribabu Kommi


Re: Libpq support to connect to standby server as priority

2019-07-01 Thread Haribabu Kommi
On Mon, 3 Jun 2019 at 16:32, Haribabu Kommi 
wrote:

>
>
> On Thu, Apr 11, 2019 at 9:13 AM Haribabu Kommi 
> wrote:
>
>> I fixed all the comments that you have raised above and attached the
>> updated
>> patches.
>>
>
> Rebased patches are attached.
>

Rebased patches are attached.
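
For anyone trying the series out, a minimal libpq client using the new
option could look like the sketch below (the value "prefer-standby"
follows the patch titles and may change before commit):

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        /* Ask for a standby if one is available, else accept any server. */
        PGconn *conn = PQconnectdb("host=primary.example.com,standby.example.com "
                                   "dbname=postgres "
                                   "target_session_attrs=prefer-standby");

        if (PQstatus(conn) != CONNECTION_OK)
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));

        PQfinish(conn);
        return 0;
    }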

Regards,
Haribabu Kommi


0001-Restructure-the-code-to-remove-duplicate-code.patch
Description: Binary data


0002-New-TargetSessionAttrsType-enum.patch
Description: Binary data


0003-Make-transaction_read_only-as-GUC_REPORT-variable.patch
Description: Binary data


0004-New-prefer-read-target_session_attrs-type.patch
Description: Binary data


0005-New-read-only-target_session_attrs-type.patch
Description: Binary data


0006-Primary-prefer-standby-and-standby-options.patch
Description: Binary data


0007-New-function-to-rejecting-the-checked-write-connecti.patch
Description: Binary data


0008-Server-recovery-mode-handling.patch
Description: Binary data


Re: MSVC Build support with visual studio 2019

2019-07-01 Thread Haribabu Kommi
On Thu, 27 Jun 2019 at 17:28, Michael Paquier  wrote:

> On Wed, Jun 26, 2019 at 10:29:05PM +1000, Haribabu Kommi wrote:
> > Thanks for the review. Yes, that patch applies till 9.5, it is my mistake
> > in naming the patch.
>
> I have been able to finally set up an environment with VS 2019 (as
> usual this stuff needs time, anyway..), and I can confirm that the
> patch is able to compile properly.
>

Thanks for the review.


> -  Visual Studio 2017 (including Express
> editions),
> -  as well as standalone Windows SDK releases 6.0 to 8.1.
> +  Visual Studio 2019 (including Express
> editions),
> +  as well as standalone Windows SDK releases 8.1a to 10.
> I would like to understand why this range of requirements is updated.
> Is there any reason to do so.  If we change these docs, what does it
> mean in terms of versions of Visual Studio supported?
>

We stopped supporting builds with all Visual Studio versions older than
2013, and I updated the SDK versions accordingly.



> -or a VS2015Solution or a VS2017Solution, all in Solution.pm, depending on
> -the user's build environment) and adding objects implementing the
> corresponding
> -Project interface (VC2013Project or VC2015Project or VC2017Project from
> -MSBuildProject.pm) to it.
> +or a VS2015Solution or a VS2017Solution or a VS2019Solution, all in
> Solution.pm,
> +depending on the user's build environment) and adding objects implementing
> +the corresponding Project interface (VC2013Project or VC2015Project or
> VC2017Project
> +or VC2019Project from MSBuildProject.pm) to it.
> This formulation is weird the more we accumulate new objects, let's
> put that in a proper list of elements separated with commas except
> for the two last ones which should use "or".
>
> s/greather/greater/.
>
> The patch still has typos, and the format is not satisfying yet, so I
> have done a set of fixes as per the attached.
>

The change in the patch is good.


>
> -   elsif ($major < 6)
> +   elsif ($major < 12)
> {
> croak
> - "Unable to determine Visual Studio version:
> Visual Studio versions before 6.0 aren't supported.";
> + "Unable to determine Visual Studio version:
> Visual Studio versions before 12.0 aren't supported.";
> Well, this is a separate bug fix, still I don't mind fixing that in
> the same patch as we meddle with those code paths now.  Good catch.
>
> -   croak $visualStudioVersion;
> +   carp $visualStudioVersion;
> Same here.  Just wouldn't it be better to print the version found in
> the same message?
>

Yes, that is a good change; I thought of doing the same, but I didn't
know how to do it.

A similar change is required for CreateProject as well.



> +   # The major visual studio that is supported has nmake version >=
>  14.20 and < 15.
> if ($major > 14)
> Comment line is too long.  It seems to me that the condition here
> should be ($major >= 14 && $minor >= 30).  That's not completely
> correct either as we have a version higher than 14.20 for VS 2019 but
> that's better than just using 14.29 or a fake number I guess.
>

The change is good, but the comment is wrong.

+ # The major visual studio that is supported has nmake
+ # version >= 14.30, so stick with it as the latest version

The latest Visual Studio version that is supported has an nmake version
<= 14.30, not >= 14.30.

Except for the above two changes, overall the patch is in good shape.

Regards,
Haribabu Kommi


Re: MSVC Build support with visual studio 2019

2019-06-26 Thread Haribabu Kommi
On Wed, 5 Jun 2019 at 17:22, Juan José Santamaría Flecha <
juanjo.santama...@gmail.com> wrote:

>
> On Wed, May 29, 2019 at 10:30 AM Haribabu Kommi 
> wrote:
>
>>
>> Updated patches are attached.
>>
>>
> All patches apply, build and pass tests. The patch
> '0001-support-building-with-visual-studio-2019_v10_to_v9.6_v3.patch'
> applies on version 9.5.
>
> Not sure if more review is needed before moving to 'ready for committer',
> but I have no more comments to make on current patches.
>

Thanks for the review. Yes, that patch applies down to 9.5; the patch
name is my mistake.

Regards,
Haribabu Kommi


Re: pg_basebackup failure after setting default_table_access_method option

2019-06-05 Thread Haribabu Kommi
On Thu, Jun 6, 2019 at 1:46 AM vignesh C  wrote:

>
> Hi,
>
> *I noticed pg_basebackup failure when default_table_access_method option
> is set.*
>
> *Test steps:*
>
> Step 1: Init database
> ./initdb -D data
>
> Step 2: Start Server
> ./postgres -D data &
>
> Step 3: Set Guc option
> export PGOPTIONS='-c default_table_access_method=zheap'
>
> Step 4: Peform backup
> ./pg_basebackup -D backup -p 5432 --no-sync
> 2019-06-05 20:35:04.088 IST [11601] FATAL:  cannot read pg_class without
> having selected a database
> pg_basebackup: error: could not connect to server: FATAL:  cannot read
> pg_class without having selected a database
>
> *Reason why it is failing:*
> pg_basebackup does not use any database to connect to server as it backs
> up the whole data instance.
> As the option default_table_access_method is set.
> It tries to validate this option, but while validating the option in
> ScanPgRelation function:
> if (!OidIsValid(MyDatabaseId))
> elog(FATAL, "cannot read pg_class without having selected a database");
>
> Here as pg_basebackup uses no database the command fails.
>

Thanks for the detailed steps to reproduce the bug; I was also able to
reproduce the problem.



> Fix:
> The patch has the fix for the above issue:
>
> Let me know your opinion on this.
>

Thanks for the patch; it fixes the problem.
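
For reference, the kind of guard needed here (a sketch only; I have not
re-checked the patch internals, and the hook body below is abbreviated)
is to make the GUC check hook skip catalog validation when no database
is selected, as in a walsender connection:

    static bool
    check_default_table_access_method(char **newval, void **extra,
                                      GucSource source)
    {
        /*
         * Cannot read pg_am without a database/transaction; accept the
         * value for now and let later lookups validate it.
         */
        if (!IsTransactionState())
            return true;

        /* ... normal validation: look up *newval in pg_am ... */
        return true;
    }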

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-06-03 Thread Haribabu Kommi
On Thu, Apr 11, 2019 at 9:13 AM Haribabu Kommi 
wrote:

> I fixed all the comments that you have raised above and attached the
> updated
> patches.
>

Rebased patches are attached.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-New-pg_basebackup-g-option-to-control-the-group-acce.patch
Description: Binary data


0002-New-TargetSessionAttrsType-enum.patch
Description: Binary data


0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch
Description: Binary data


0004-New-prefer-read-target_session_attrs-type.patch
Description: Binary data


0005-New-read-only-target_session_attrs-type.patch
Description: Binary data


0006-Primary-prefer-standby-and-standby-options.patch
Description: Binary data


0007-New-function-to-rejecting-the-checked-write-connecti.patch
Description: Binary data


0008-Server-recovery-mode-handling.patch
Description: Binary data


Re: How to know referenced sub-fields of a composite type?

2019-05-29 Thread Haribabu Kommi
On Wed, May 29, 2019 at 4:51 PM Kohei KaiGai  wrote:

> Hi Amit,
>
> 2019年5月29日(水) 13:26 Amit Langote :
> >
> > Kaigai-san,
> >
> > On 2019/05/29 12:13, Kohei KaiGai wrote:
> > > One interesting data type in Apache Arrow is "Struct" data type. It is
> > > equivalent to composite
> > > type in PostgreSQL. The "Struct" type has sub-fields, and individual
> > > sub-fields have its own
> > > values array for each.
> > >
> > > It means we can skip to load the sub-fields unreferenced, if
> > > query-planner can handle
> > > referenced and unreferenced sub-fields correctly.
> > > On the other hands, it looks to me RelOptInfo or other optimizer
> > > related structure don't have
> > > this kind of information. RelOptInfo->attr_needed tells extension
> > > which attributes are referenced
> > > by other relation, however, its granularity is not sufficient for
> sub-fields.
> >
> > Isn't that true for some other cases as well, like when a query accesses
> > only some sub-fields of a json(b) column?  In that case too, planner
> > itself can't optimize away access to other sub-fields.  What it can do
> > though is match a suitable index to the operator used to access the
> > individual sub-fields, so that the index (if one is matched and chosen)
> > can optimize away accessing unnecessary sub-fields.  IOW, it seems to me
> > that the optimizer leaves it up to the indexes (and plan nodes) to
> further
> > optimize access to within a field.  How is this case any different?
> >
> I think it is a little bit different scenario.
> Even if an index on sub-fields can indicate the tuples to be fetched,
> the fetched tuple contains all the sub-fields because heaptuple is
> row-oriented data.
> For example, if WHERE-clause checks a sub-field: "x" then aggregate
> function references other sub-field "y", Scan/Join node has to return
> a tuple that contains both "x" and "y". IndexScan also pops up a tuple
> with a full composite type, so here is no problem if we cannot know
> which sub-fields are referenced in the later stage.
> Maybe, if IndexOnlyScan supports to return a partial composite type,
> it needs similar infrastructure that can be used for a better composite
> type support on columnar storage.
>

There is another issue related to columnar stores, where the scan needs
the targeted columns for projection; it is discussed in the zedstore
thread [1]. Projecting all columns from a columnar store is much more
expensive than from a row store.

[1] -
https://www.postgresql.org/message-id/CALfoeivu-n5o8Juz9wW%2BkTjnis6_%2BrfMf%2BzOTky1LiTVk-ZFjA%40mail.gmail.com


Regards,
Haribabu Kommi
Fujitsu Australia


Re: MSVC Build support with visual studio 2019

2019-05-29 Thread Haribabu Kommi
On Mon, May 27, 2019 at 8:14 PM Juan José Santamaría Flecha <
juanjo.santama...@gmail.com> wrote:

>
> On Thu, May 23, 2019 at 3:44 AM Haribabu Kommi 
> wrote:
>
>>
>> Updated patches are attached for all branches.
>>
>>
> I have gone through all patches and there are a couple of typos:
>

Thanks for the review.


> 1. s/prodcutname/productname/
>
> 1.1 In file: 0001-support-building-with-visual-studio-2019_v9.4.patch
> @@ -97,8 +97,9 @@
>Visual Studio 2013. Building with
>Visual Studio 2015 is supported down to
>Windows Vista and Windows Server 2008.
> -   Building with Visual Studio 2017 is
> supported
> -   down to Windows 7 SP1 and Windows Server
> 2008 R2 SP1.
> +   Building with Visual Studio 2017 and
> +   Visual Studio 2019 are supported down to
>
> 1.2 In file:
> 0001-support-building-with-visual-studio-2019_v10_to_v9.5.patch
> @@ -97,8 +97,9 @@
>Visual Studio 2013. Building with
>Visual Studio 2015 is supported down to
>Windows Vista and Windows Server 2008.
> -   Building with Visual Studio 2017 is
> supported
> -   down to Windows 7 SP1 and Windows Server
> 2008 R2 SP1.
> +   Building with Visual Studio 2017 and
> +   Visual Studio 2019 are supported down to
>

Corrected.


> 2. s/stuido/studio/
>
>  2.1 In file: 0001-support-building-with-visual-studio-2019_v9.4.patch
>  @@ -143,12 +173,12 @@ sub DetermineVisualStudioVersion
>  sub _GetVisualStudioVersion
>  {
>   my ($major, $minor) = @_;
> - # visual 2017 hasn't changed the nmake version to 15, so still using
> the older version for comparison.
> + # The major visual stuido that is suppored has nmake version >= 14.20
> and < 15.
>
>  2.2 In file:
> 0001-support-building-with-visual-studio-2019_v10_to_v9.5.patch
> @@ -132,12 +162,12 @@ sub DetermineVisualStudioVersion
>  sub _GetVisualStudioVersion
>  {
>   my ($major, $minor) = @_;
> - # visual 2017 hasn't changed the nmake version to 15, so still using
> the older version for comparison.
> + # The major visual stuido that is suppored has nmake version >= 14.20
> and < 15.
>
>  2.3 In file:  0001-support-building-with-visual-studio-2019_v11.patch
> @@ -139,12 +165,12 @@ sub _GetVisualStudioVersion
>  {
>   my ($major, $minor) = @_;
>
> - # visual 2017 hasn't changed the nmake version to 15, so still using
> the older version for comparison.
> + # The major visual stuido that is suppored has nmake version >= 14.20
> and < 15.
>
>  2.4 In file:  0001-Support-building-with-visual-studio-2019_HEAD.patch
> @@ -106,17 +132,17 @@ sub _GetVisualStudioVersion
>  {
>   my ($major, $minor) = @_;
>
> - # visual 2017 hasn't changed the nmake version to 15, so still using
> the older version for comparison.
> + # The major visual stuido that is suppored has nmake version >= 14.20
> and < 15.
>

Corrected. The spelling of 'supported' was also wrong.

Updated patches are attached.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-support-building-with-visual-studio-2019_v11_v3.patch
Description: Binary data


0001-support-building-with-visual-studio-2019_v10_to_v9.6_v3.patch
Description: Binary data


0001-support-building-with-visual-studio-2019_v9.4_v3.patch
Description: Binary data


0001-Support-building-with-visual-studio-2019_HEAD_v3.patch
Description: Binary data


Re: MSVC Build support with visual studio 2019

2019-05-22 Thread Haribabu Kommi
On Wed, May 22, 2019 at 7:36 AM Juanjo Santamaria Flecha <
juanjo.santama...@gmail.com> wrote:

> I have gone through path
> '0001-Support-building-with-visual-studio-2019.patch' only, but I am sure
> some comments will also apply to back branches.
>

Thanks for the review.



> 1. The VisualStudioVersion value looks odd:
>
> +   $self->{VisualStudioVersion}= '16.0.32.32432';
>
> Are you using a pre-release version [1]?
>

I first developed this patch on the preview version.
I updated it to version 16.0.28729.10.


> 2. There is a typo: s/stuido/studio/:
>
> +   # The major visual stuido that is suppored has nmake version >=
> 14.20 and < 15.
>
> There is something in the current code that I think should be also
> updated. The code for _GetVisualStudioVersion contains:
>
>   if ($major > 14)
> {
> carp
>  "The determined version of Visual Studio is newer than the latest
> supported version. Returning the latest supported version instead.";
> return '14.00';
> }
>
> Shouldn't the returned value be '14.20' for Visual Studio 2019?
>

Yes, it is better to return the Visual Studio 2019 version; updated.

Updated patches are attached for all branches.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Support-building-with-visual-studio-2019_HEAD.patch
Description: Binary data


0001-support-building-with-visual-studio-2019_v11.patch
Description: Binary data


0001-support-building-with-visual-studio-2019_v9.4.patch
Description: Binary data


0001-support-building-with-visual-studio-2019_v10_to_v9.5.patch
Description: Binary data


Re: PG 12 draft release notes

2019-05-21 Thread Haribabu Kommi
On Tue, May 21, 2019 at 8:17 AM Andres Freund  wrote:

>
>   
>Add  command to create
>new table types (Haribabu Kommi, Andres Freund, Álvaro Herrera,
>Dimitri Dolgov)
>   
>
> A few points:
>
> 1) Is this really source code, given that CREATE ACCESS METHOD TYPE
>TABLE is a DDL command, and USING (...) for CREATE TABLE etc is an
>option to DDL commands?
>

+1

It would be better to provide a description of the newly added syntax.
Do we need to provide a 'Note' explaining that there is currently no
alternative to heap?


2) I think the description sounds a bit too much like it's about new
>forms of tables, rather than their storage. How about something
>roughly like:
>
>Allow different table access methods to be
>used. This allows to develop and
>use new ways of storing and accessing table data, optimized for
>different use-cases, without having to modify
>PostgreSQL. The existing heap access method
>remains the default.
>
> 3) This misses a large set of commits around making tableam possible, in
>particular the commits around
>
> commit 4da597edf1bae0cf0453b5ed6fc4347b6334dfe1
> Author: Andres Freund 
> Date:   2018-11-16 16:35:11 -0800
>
> Make TupleTableSlots extensible, finish split of existing slot type.
>
>Given that those commits entail an API break relevant for extensions,
>should we have them as a separate "source code" note?
>

+1 to adding it, but I am not sure whether all the breakage introduced
by tableam needs to be described separately, or whether a combined note
explaining it to extension developers is fine.



> 4) I think the attribution isn't quite right. For one, a few names with
>substantial work are missing (Amit Khandekar, Ashutosh Bapat,
>Alexander Korotkov), and the order doesn't quite seem right. On the
>latter part I might be somewhat petty, but I spend *many* months of
>my life on this.
>
>How about:
>Andres Freund, Haribabu Kommi, Alvaro Herrera, Alexander Korotkov,
> David Rowley, Dimitri Golgov
>if we keep 3) separate and
>Andres Freund, Haribabu Kommi, Alvaro Herrera, Ashutosh Bapat,
> Alexander Korotkov, Amit Khandekar, David Rowley, Dimitri Golgov
>otherwise?


+1 to either of the above.
Without Andres's enormous efforts, tableam couldn't have made it into
v12.


Regards,
Haribabu Kommi
Fujitsu Australia


Transparent data encryption support as an extension

2019-04-12 Thread Haribabu Kommi
Hi Hackers,

I have read many of the mailing list discussions on supporting
data-at-rest encryption in PostgreSQL.

I checked the discussions around full-instance, tablespace-level, and
table-level encryption. In my observation, all the proposals try to
modify the core code to support encryption.

I am thinking of an approach that provides tablespace-level encryption,
including WAL, using an extension instead of changing the core code: by
adding hooks in the XLOG write and read flows and the reorderbuffer
flow, and by adding smgr plugin routines to encrypt and decrypt other
pages.

Definitely this approach doesn't work for full-instance encryption.
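
For illustration, the hook points I have in mind might look like the
following (purely hypothetical names; none of these exist in core):

    /* Called before smgrwrite() / after smgrread() for relation pages. */
    typedef void (*buffer_encrypt_hook_type) (const char *plain, char *cipher,
                                              BlockNumber blkno);
    typedef void (*buffer_decrypt_hook_type) (const char *cipher, char *plain,
                                              BlockNumber blkno);

    extern buffer_encrypt_hook_type buffer_encrypt_hook;
    extern buffer_decrypt_hook_type buffer_decrypt_hook;

An extension would install these the same way as existing hooks such as
ExecutorStart_hook, saving and chaining any previous value.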

Any opinions/comments/problems with evaluating the
encryption-as-an-extension approach?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Attempt to consolidate reading of XLOG page

2019-04-12 Thread Haribabu Kommi
On Fri, Apr 12, 2019 at 2:06 AM Antonin Houska  wrote:

> While working on the instance encryption I found it annoying to apply
> decryption of XLOG page to three different functions. Attached is a patch
> that
> tries to merge them all into one function, XLogRead(). The existing
> implementations differ in the way new segment is opened. So I added a
> pointer
> to callback function as a new argument. This callback handles the specific
> ways to determine segment file name and to open the file.
>
> I can split the patch into multiple diffs to make detailed review easier,
> but
> first I'd like to hear if anything is seriously wrong about this
> design. Thanks.
>

I didn't check the code, but it is good to combine all three page-read
functions into one instead of spreading the logic around.
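
From the description, the consolidated reader would have roughly this
shape (the names here are mine for illustration, not necessarily the
patch's exact API):

    /*
     * Callback supplied by each caller (walsender, xlog reader, etc.) to
     * resolve and open the segment file for "segno" on timeline "tli",
     * returning an open file descriptor.
     */
    typedef int (*XLogOpenSegmentCB) (XLogSegNo segno, TimeLineID tli);

    void XLogRead(char *buf, XLogRecPtr startptr, Size count,
                  TimeLineID tli, XLogOpenSegmentCB opensegment);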

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-04-10 Thread Haribabu Kommi
On Fri, Mar 29, 2019 at 7:06 PM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> Hi Hari-san,
>
> I've reviewed all the files.  The patch would be OK when the following
> have been fixed, except for the complexity of fe-connect.c (which probably
> cannot be improved.)
>
> Unfortunately, I'll be absent next week.  The earliest date I can do the
> test will be April 8 or 9.  I hope someone could take care of this patch...
>

Thanks for the review. Apologies that I could not finish it on time
because of other work.



>
> (1) 0001
> With this read-only option type, application can connect to
> to a read-only server in the list of hosts, in case
> ...
> before issuing the SHOW transaction_readonly to find out whether
>
>
> "to" appears twice in a row.
> transaction_readonly -> transaction_read_only
>
>
> (2) 0001
> +succesful connection or failure.
>
> succesful -> successful
>
>
> (3) 0008
> to conenct to a standby server with a faster check instead of
>
> conenct -> connect
>
>
> (4) 0008
> Logically, recovery exit should be notified after the following statement:
>
> XLogCtl->SharedRecoveryInProgress = false;
>
>
> (5) 0008
> +   /* Update in_recovery status. */
> +   if (LocalRecoveryInProgress)
> +   SetConfigOption("in_recovery",
> +   "on",
> +   PGC_INTERNAL,
> PGC_S_OVERRIDE);
> +
>
> This SetConfigOption() is called for every RecoveryInProgress() call on
> the standby.  It may cause undesirable overhead.  How about just calling
> SetConfigOption() once in InitPostgres() to set the initial value for
> in_recovery?  InitPostgres() and its subsidiary functions call
> SetConfigOption() likewise.
>
>
> (6) 0008
> async.c is for LISTEN/UNLISTEN/NOTIFY.  How about adding the new functions
> in postgres.c like RecoveryConflictInterrupt()?
>
>
> (7) 0008
> +   if (pid != 0)
> +   {
> +   (void) SendProcSignal(pid, reason,
> procvxid.backendId);
> +   }
>
> The braces are not necessary because the block only contains a single
> statement.
>

I fixed all the comments you raised above and attached the updated
patches.
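
Regarding (5), the one-time initialization now looks roughly like this
(a sketch; the exact placement inside InitPostgres() follows the review
suggestion and is assumed, not quoted from the patch):

    /*
     * Set the initial value once at backend start instead of on every
     * RecoveryInProgress() call.
     */
    SetConfigOption("in_recovery",
                    RecoveryInProgress() ? "on" : "off",
                    PGC_INTERNAL, PGC_S_OVERRIDE);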

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Restructure-the-code-to-remove-duplicate-code.patch
Description: Binary data


0002-New-TargetSessionAttrsType-enum.patch
Description: Binary data


0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch
Description: Binary data


0004-New-prefer-read-target_session_attrs-type.patch
Description: Binary data


0005-New-read-only-target_session_attrs-type.patch
Description: Binary data


0006-Primary-prefer-standby-and-standby-options.patch
Description: Binary data


0007-New-function-to-rejecting-the-checked-write-connecti.patch
Description: Binary data


0008-Server-recovery-mode-handling.patch
Description: Binary data


Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-04-10 Thread Haribabu Kommi
On Wed, Apr 10, 2019 at 3:32 PM Amit Kapila  wrote:

> On Mon, Apr 8, 2019 at 8:51 AM Amit Kapila 
> wrote:
> >
> > On Mon, Apr 8, 2019 at 7:54 AM Jamison, Kirk 
> wrote:
> > > So I am marking this thread as “Ready for Committer”.
> > >
> >
> > Thanks, Hari and Jamison for verification.  The patches for
> > back-branches looks good to me.  I will once again verify them before
> > commit. I will commit this patch tomorrow unless someone has
> > objections.  Robert/Alvaro, do let me know if you see any problem with
> > this fix?
> >
>
> Pushed.
>

Thanks, Amit.
I will look into it further to handle all the internally generated
transactions.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: pgsql: tableam: basic documentation.

2019-04-09 Thread Haribabu Kommi
On Wed, Apr 10, 2019 at 12:56 PM Michael Paquier 
wrote:

> Hi Andres,
>
> On Thu, Apr 04, 2019 at 12:42:06AM +, Andres Freund wrote:
> > tableam: basic documentation.
> >
> > This adds documentation about the user oriented parts of table access
> > methods (i.e. the default_table_access_method GUC and the USING clause
> > for CREATE TABLE etc), adds a basic chapter about the table access
> > method interface, and adds a note to storage.sgml that it's contents
> > don't necessarily apply for non-builtin AMs.
> >
> > Author: Haribabu Kommi and Andres Freund
> > Discussion:
> https://postgr.es/m/20180703070645.wchpu5muyto5n...@alap3.anarazel.de
>
> +  Any developer of a new table access method can refer
> to
> +  the existing heap implementation present in
> +  src/backend/heap/heapam_handler.c for more details
> of
> +  how it is implemented.
>
> This path is incorrect, it should be that instead (missing "access"):
> src/backend/access/heap/heapam_handler.c
>

Thanks for the review. Yes, I missed that when I added the path.
Patch attached.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-tabeam-docs-file-path-correction.patch
Description: Binary data


Re: MSVC Build support with visual studio 2019

2019-04-09 Thread Haribabu Kommi
On Wed, Mar 27, 2019 at 11:42 AM Haribabu Kommi 
wrote:

>
> On Wed, Mar 27, 2019 at 3:03 AM Andrew Dunstan <
> andrew.duns...@2ndquadrant.com> wrote:
>
>>
>> On 3/20/19 8:36 PM, Haribabu Kommi wrote:
>> > Hi Hackers,
>> >
>> > Here I attached a patch that supports building of PostgreSQL with VS
>> 2019.
>> > VS 2019 is going to release on Apr 2nd 2019, it will be good if version
>> 12
>> > supports compiling. The attached for is for review, it may needs some
>> > updates
>> > once the final version is released.
>> >
>> > Commit d9dd406fe281d22d5238d3c26a7182543c711e74 has reduced the
>> > minimum visual studio support to 2013 to support C99 standards,
>> > because of this
>> > reason, the current attached patch cannot be backpatched as it is.
>> >
>> > I can provide a separate back branches patch later once this patch
>> > comes to a stage of commit. Currently all the supported branches are
>> > possible to compile with VS 2017.
>> >
>> > comments?
>> >
>> >
>>
>>
>> I have verified that this works with VS2019.
>>
>>
>> There are a few typos in the comments that need cleaning up.
>>
>
> Thanks for the review.
>
> I corrected the typos in the comments, hopefully I covered everything.
> Attached is the updated patch. Once the final VS 2019, I will check the
> patch if it needs any more updates.
>

Visual Studio 2019 is now officially released. There is no major change
in the patch, except for some small comment updates.

Patches for the back branches are also attached.

Regards,
Haribabu Kommi
Fujitsu Australia
From 6b21a229400bae51b225dd96d68249d2a61d9ac2 Mon Sep 17 00:00:00 2001
From: Hari Babu 
Date: Fri, 5 Apr 2019 10:15:57 +1100
Subject: [PATCH] support building with visual studio 2019

---
 doc/src/sgml/install-windows.sgml | 21 
 src/tools/msvc/MSBuildProject.pm  | 25 +++
 src/tools/msvc/README | 12 +-
 src/tools/msvc/Solution.pm| 28 ++
 src/tools/msvc/VSObjectFactory.pm | 40 +--
 5 files changed, 103 insertions(+), 23 deletions(-)

diff --git a/doc/src/sgml/install-windows.sgml b/doc/src/sgml/install-windows.sgml
index ac00ac8232..efa9f5dbc2 100644
--- a/doc/src/sgml/install-windows.sgml
+++ b/doc/src/sgml/install-windows.sgml
@@ -19,10 +19,10 @@
  
   There are several different ways of building PostgreSQL on
   Windows. The simplest way to build with
-  Microsoft tools is to install Visual Studio Express 2017
+  Microsoft tools is to install Visual Studio Express 2019
   for Windows Desktop and use the included
   compiler. It is also possible to build with the full
-  Microsoft Visual C++ 2005 to 2017.
+  Microsoft Visual C++ 2005 to 2019.
   In some cases that requires the installation of the
   Windows SDK in addition to the compiler.
  
@@ -69,19 +69,19 @@
   Visual Studio Express or some versions of the
   Microsoft Windows SDK. If you do not already have a
   Visual Studio environment set up, the easiest
-  ways are to use the compilers from Visual Studio Express 2017
+  ways are to use the compilers from Visual Studio Express 2019
   for Windows Desktop or those in the Windows SDK
-  8.1, which are both free downloads from Microsoft.
+  10, which are both free downloads from Microsoft.
  
 
  
   Both 32-bit and 64-bit builds are possible with the Microsoft Compiler suite.
   32-bit PostgreSQL builds are possible with
   Visual Studio 2005 to
-  Visual Studio 2017 (including Express editions),
-  as well as standalone Windows SDK releases 6.0 to 8.1.
+  Visual Studio 2019 (including Express editions),
+  as well as standalone Windows SDK releases 6.0 to 10.
   64-bit PostgreSQL builds are supported with
-  Microsoft Windows SDK version 6.0a to 8.1 or
+  Microsoft Windows SDK version 6.0a to 10 or
   Visual Studio 2008 and above. Compilation
   is supported down to Windows XP and
   Windows Server 2003 when building with
@@ -89,8 +89,9 @@
   Visual Studio 2013. Building with
   Visual Studio 2015 is supported down to
   Windows Vista and Windows Server 2008.
-   Building with Visual Studio 2017 is supported
-   down to Windows 7 SP1 and Windows Server 2008 R2 SP1.
+  Building with Visual Studio 2017 and Visual Studio 2019
+  are supported down to Windows 7 SP1 and
+  Windows Server 2008 R2 SP1.
  
 
  
@@ -162,7 +163,7 @@ $ENV{MSBFLAGS}="/m";
   If your build environment doesn't ship with a supported version of the
   Microsoft Windows SDK it
   is recommended that you upgrade to the latest version (currently
-  version 7.1), available for download from
+  version 10), available for download from
   https://www.microsoft.com/download;>.
  
  
diff --git a/src/tools

Re: Status of the table access method work

2019-04-08 Thread Haribabu Kommi
On Sat, Apr 6, 2019 at 7:25 AM Andres Freund  wrote:

> Hi,
>
> In this email I want to give a brief status update of the table access
> method work - I assume that most of you sensibly haven't followed it
> into all nooks and crannies.
>
> I want to thank Haribabu, Alvaro, Alexander, David, Dmitry and all the
> others that collaborated on making tableam happen. It was/is a huge
> project.


A big thank you, Andres, for your enormous efforts on this patch.
Without your involvement, it couldn't have made it into v12.


With regards to storing the rows themselves, the second biggest
> limitation is a limitation that is not actually a part of tableam
> itself: WAL. Many tableam's would want to use WAL, but we only have
> extensible WAL as part of generic_xlog.h. While that's useful to allow
> prototyping etc, it's imo not efficient enough to build a competitive
> storage engine for OLTP (OLAP probably much less of a problem).  I don't
> know what the best approach here is - allowing "well known" extensions
> to register rmgr entries would be the easiest solution, but it's
> certainly a bit crummy.
>

I had the same doubt when I looked into some of the UNDO patches, which
modify the core code to add UNDO-specific WAL types. Different AMs may
need different sets of operations to be WAL-logged, so wouldn't it be
better for AMs to register their own record types, as in the sketch
below?
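
Here RmgrData is the existing rmgr descriptor from
access/xlog_internal.h, while the registration function and the AM's
record names are purely hypothetical:

    /* Hypothetical: reserve an rmgr ID for an extension's WAL records. */
    extern void RegisterExtensionRmgr(RmgrId rmid, const RmgrData *rmgr);

    /* The AM could then WAL-log with its own record types: */
    XLogRecPtr  recptr;

    XLogBeginInsert();
    XLogRegisterData((char *) &my_record, sizeof(my_record)); /* AM-specific struct */
    recptr = XLogInsert(MY_AM_RMGR_ID, XLOG_MY_AM_INSERT);    /* both constants hypothetical */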

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-04-07 Thread Haribabu Kommi
On Thu, Apr 4, 2019 at 3:29 PM Amit Kapila  wrote:

> On Wed, Apr 3, 2019 at 10:45 AM Haribabu Kommi 
> wrote:
> >
> > Thanks for the review.
> >
> > While changing the approach to use the is_parallel_worker_flag, I thought
> > that the rest of the stats are also not required to be updated and also
> those
> > are any way write operations and those values are zero anyway for
> parallel
> > workers.
> >
> > Instead of expanding the patch scope, I just changed to avoid the commit
> > or rollback stats as discussed, and later we can target the handling of
> all the
> > internal transactions and their corresponding stats.
> >
>
> The patch looks good to me.  I have changed the commit message and ran
> the pgindent in the attached patch.  Can you once see if that looks
> fine to you?  Also, we should backpatch this till 9.6.  So, can you
> once verify if the change is fine in all back branches?   Also, test
> with a force_parallel_mode option.  I have already tested it with
> force_parallel_mode = 'regress' in HEAD, please test it in back
> branches as well.
>


Thanks for the updated patch.
I tested the back branches, including with force_parallel_mode, and it
works as expected. However, the patch fails to apply on the back
branches, so I have attached patches for those branches. On v11 it
applies with offsets.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Avoid-counting-transaction-stats-for-parallel-worker_10.patch
Description: Binary data


0001-Avoid-counting-transaction-stats-for-parallel-worker_96.patch
Description: Binary data


Re: pg_basebackup ignores the existing data directory permissions

2019-04-02 Thread Haribabu Kommi
On Fri, Mar 29, 2019 at 9:05 PM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2019-03-26 03:26, Michael Paquier wrote:
> > Do we really want to extend the replication protocol to control that?
>
> Perhaps we are losing sight of the original problem, which is that if
> you create the target directory with the wrong permissions then ... it
> has the wrong permissions.  And you are free to change the permissions
> at any time.  Many of the proposed solutions sound excessively
> complicated relative to that.
>

Yes, I agree that the proposed solution for fixing the original problem
of existing data directory permissions, adding new group options to
pg_basebackup, is a big change.

Why can't we just fix the permissions of the directory by default to
match the source instance? I feel that change would be useful to more
people than it would cause problems.

From my understanding of the thread discussion:

+1 by:

Michael Paquier
Robert Haas
Haribabu Kommi

-1 by:

Magnus Hagander
Peter Eisentraut

Would anyone like to weigh in on this patch, so we can settle on the
best option for controlling the existing standby data directory
permissions?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-04-02 Thread Haribabu Kommi
On Wed, Apr 3, 2019 at 1:59 PM Amit Kapila  wrote:

> On Thu, Mar 28, 2019 at 11:43 AM Haribabu Kommi
>  wrote:
> >
> > On Wed, Mar 27, 2019 at 11:27 PM Amit Kapila 
> wrote:
> >>
> >> On Wed, Mar 27, 2019 at 6:53 AM Haribabu Kommi <
> kommi.harib...@gmail.com> wrote:
> >> > On Tue, Mar 26, 2019 at 9:08 PM Amit Kapila 
> wrote:
> >> >>
> >> >>   As part of this thread, maybe we can
> >> >> just fix the case of the parallel cooperating transaction.
> >> >
> >> >
> >> > With the current patch, all the parallel implementation transaction
> are getting
> >> > skipped, in my tests parallel workers are the major factor in the
> transaction
> >> > stats counter. Even before parallelism, the stats of the autovacuum
> and etc
> >> > are still present but their contribution is not greatly influencing
> the stats.
> >> >
> >> > I agree with you in fixing the stats with parallel workers and
> improve stats.
> >> >
> >>
> >> I was proposing to fix only the transaction started with
> >> StartParallelWorkerTransaction by using is_parallel_worker flag as
> >> discussed above.  I understand that it won't completely fix the
> >> problem reported by you, but it will be a step in that direction.  My
> >> main worry is that if we fix it the way you are proposing and we also
> >> invent a new way to deal with all other internal transactions, then
> >> the fix done by us now needs to be changed/reverted.  Note, that this
> >> fix needs to be backpatched as well, so we should avoid doing any fix
> >> which needs to be changed or reverted.
> >
> >
> > I tried the approach as your suggested as by not counting the actual
> parallel work
> > transactions by just releasing the resources without touching the
> counters,
> > the counts are not reduced much.
> >
> > HEAD - With 4 parallel workers running query generates 13 stats ( 4 * 3
> + 1)
> > Patch -  With 4 parallel workers running query generates 9 stats ( 4 * 2
> + 1)
> > Old approach patch - With 4 parallel workers running query generates 1
> stat (1)
> >
> > Currently the parallel worker start transaction 3 times in the following
> places.
> > 1. InitPostgres
> > 2. ParallelWorkerMain (2)
> >
> > with the attached patch, we reduce one count from ParallelWorkerMain.
> >
>
> Right, that is expected from this fix.  BTW, why you have changed the
> approach in this patch to not count anything by the parallel worker as
> compared to the earlier version where you were just ignoring the stats
> for transactions.  As of now, either way is fine, but in future (after
> parallel inserts/updates), we want other stats to be updated.  I think
> your previous idea was better, just you need to start using
> is_parallel_worker flag.
>

Thanks for the review.

While changing the approach to use the is_parallel_worker flag, I
thought that the rest of the stats did not need to be updated either;
they are write operations anyway, and those values are zero for
parallel workers.

Instead of expanding the patch scope, I just changed it to avoid the
commit and rollback stats as discussed; later we can target the
handling of all the internal transactions and their corresponding
stats.
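
Roughly, the change has this shape (a sketch; the parameter name is
assumed from this discussion, and the actual signature may differ):

    /*
     * End-of-transaction stats bookkeeping, skipping the commit/rollback
     * counters for parallel workers.
     */
    void
    AtEOXact_PgStat(bool isCommit, bool is_parallel_worker)
    {
        if (!is_parallel_worker)
        {
            if (isCommit)
                pgStatXactCommit++;
            else
                pgStatXactRollback++;
        }
        /* ... the rest of the end-of-xact bookkeeping is unchanged ... */
    }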

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Avoid-counting-parallel-worker-start-transactions_v4.patch
Description: Binary data


Re: Pluggable Storage - Andres's take

2019-04-02 Thread Haribabu Kommi
On Tue, Apr 2, 2019 at 11:53 AM Andres Freund  wrote:

> Hi,
>
> On 2019-04-02 11:39:57 +1100, Haribabu Kommi wrote:
> > > What made you rename indexam.sgml to am.sgml, instead of creating a
> > > separate tableam.sgml?  Seems easier to just have a separate file?
> > >
> >
> > No specific reason, I just thought of adding all the access methods under
> > one file.
> > I can change it to tableam.sgml.
>
> I'd rather keep it separate. It seems likely that both table and indexam
> docs will grow further over time, and they're not that closely
> related. Additionally not moving sect1->sect2 etc will keep links stable
> (which we could also achieve with different sect1 names, I realize
> that).
>

OK.


> > I can understand your point of avoiding function-by-function API
> reference,
> > as the user can check directly the code comments, Still I feel some
> people
> > may refer the doc for API changes. I am fine to remove based on your
> > opinion.
>
> I think having to keeping both tableam.h and the sgml file current is
> too much overhead - and anybody that's going to create a new tableam is
> going to be able to look into the source anyway.
>

I have attached updated patches as per the discussion.
Is the description of table access methods enough, or do you want me to
add some more details?

Regards,
Haribabu Kommi
Fujitsu Australia


0001-tableam-doc-update-of-table-access-methods.patch
Description: Binary data


0002-Doc-updates-for-pluggable-table-access-method-syntax.patch
Description: Binary data


Re: Pluggable Storage - Andres's take

2019-04-01 Thread Haribabu Kommi
On Tue, Apr 2, 2019 at 10:18 AM Andres Freund  wrote:

> Hi,
>
> On 2019-03-16 23:21:31 +1100, Haribabu Kommi wrote:
> > updated patches are attached.
>
> Now that nearly all of the tableam patches are committed (with the
> exception of the copy.c changes which are for bad reasons discussed at
> [1]) I'm looking at the docs changes.
>

Thanks.


> What made you rename indexam.sgml to am.sgml, instead of creating a
> separate tableam.sgml?  Seems easier to just have a separate file?
>

No specific reason; I just thought of putting all the access methods in
one file. I can change it to tableam.sgml.


> I'm currently not planning to include the function-by-function API
> reference you've in your patchset, as I think it's more reasonable to
> centralize all of it in tableam.h. I think I've included most of the
> information there - could you check whether you agree?
>

I checked all the comments, and the explanation provided in tableam.h
is good enough to understand. I have also updated the docs section with
some more details taken from the tableam.h comments.

I understand your point about avoiding a function-by-function API
reference, since users can check the code comments directly. Still, I
feel some people may consult the docs for API changes. I am fine with
removing it, based on your opinion.

Added current set of doc patches for your reference.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Rename-indexam.sgml-to-am.sgml.patch
Description: Binary data


0002-Doc-updates-for-pluggable-table-access-method-syntax.patch
Description: Binary data


0003-Table-access-method-API-explanation.patch
Description: Binary data


Re: Pluggable Storage - Andres's take

2019-03-29 Thread Haribabu Kommi
On Wed, Mar 27, 2019 at 11:17 AM Andres Freund  wrote:

> Hi,
>
> On 2019-02-22 14:52:08 -0500, Robert Haas wrote:
> > On Fri, Feb 22, 2019 at 11:19 AM Amit Khandekar 
> wrote:
> > > Thanks for the review. Attached v2.
> >
> > Thanks.  I took this, combined it with Andres's
> > v12-0040-WIP-Move-xid-horizon-computation-for-page-level-.patch, did
> > some polishing of the code and comments, and pgindented.  Here's what
> > I ended up with; see what you think.
>
> I pushed this after some fairly minor changes, directly including the
> patch to route the horizon computation through tableam. The only real
> change is that I removed the table relfilenode from the nbtree/hash
> deletion WAL record - it was only required to access the heap without
> accessing the catalog and was unused now.  Also added a WAL version
> bump.
>
> It seems possible that some other AM might want to generalize the
> prefetch logic from heapam.c, but I think it's fair to defer that until
> such an AM wants to do so
>

Since I see that you are fixing some typos in the committed code, I
want to share some more corrections that I found in the patches
committed so far.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-dA-to-show-Table-type-access-method.patch
Description: Binary data


0002-Typos-and-comemnt-corrections.patch
Description: Binary data


Re: Libpq support to connect to standby server as priority

2019-03-28 Thread Haribabu Kommi
On Wed, Mar 27, 2019 at 5:17 PM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> I've looked through 0004-0007.  I've only found the following:
>
> (5) 0005
> With this read-only option type, application can connect to
> connecting to a read-only server in the list of hosts, in case
> if there is any read-only servers available, the connection
> attempt fails.
>
> "connecting to" can be removed.
>
> in case if there is any read-only servers
> -> If There's no read only server
>

Thanks for the review.

I addressed all the comments you raised and attached updated versions
of the patches, along with a WIP implementation of the in_recovery
GUC_REPORT variable. This patch reuses some of the implementation
provided earlier in thread [1].

While implementing the in_recovery configuration variable, I noticed it
requires a lot of code addition just for this one variable; is it worth
it?

[1] -
https://www.postgresql.org/message-id/2239254.dtfY1H9x0t%40hammer.magicstack.net
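
On the client side, at least, consuming a GUC_REPORT variable is cheap;
for example (a sketch; "in_recovery" is the variable proposed here, not
an existing server GUC):

    /*
     * Reported parameters arrive via ParameterStatus messages and are
     * cached in the connection, so this is a purely local lookup.
     */
    const char *val = PQparameterStatus(conn, "in_recovery");

    if (val != NULL && strcmp(val, "on") == 0)
    {
        /* the server is still in recovery, i.e. a standby */
    }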

Regards,
Haribabu Kommi
Fujitsu Australia
From aa812716104b3fbd787dae4483341894c9e5ed9a Mon Sep 17 00:00:00 2001
From: Hari Babu 
Date: Thu, 21 Feb 2019 23:11:55 +1100
Subject: [PATCH 1/8] Restructure the code to remove duplicate code

The duplicate code logic of checking for the server version
before issuing the SHOW transaction_readonly to find out whether
the server is read-write or not is restructured under a new
connection status, so that duplicate code is removed. This is
required for the next set of patches
---
 doc/src/sgml/libpq.sgml   | 26 +---
 src/interfaces/libpq/fe-connect.c | 99 ---
 src/interfaces/libpq/libpq-fe.h   |  3 +-
 3 files changed, 57 insertions(+), 71 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c1d1b6b2db..6221e359f0 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1576,17 +1576,27 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
  
   target_session_attrs
   
+   
+The supported options for this parameter are, any and
+read-write. The default value of this parameter,
+any, regards all connections as acceptable.
+If multiple hosts were specified in the connection string, based on the
+specified value, any remaining servers will be tried before confirming
+succesful connection or failure.
+   
+

 If this parameter is set to read-write, only a
 connection in which read-write transactions are accepted by default
-is considered acceptable.  The query
-SHOW transaction_read_only will be sent upon any
-successful connection; if it returns on, the connection
-will be closed.  If multiple hosts were specified in the connection
-string, any remaining servers will be tried just as if the connection
-attempt had failed.  The default value of this parameter,
-any, regards all connections as acceptable.
-  
+is considered acceptable.
+   
+
+   
+To find out whether the server supports read-write transactions are not,
+query SHOW transaction_read_only will be sent upon any
+successful connection; if it returns on, it means server
+doesn't support read-write transactions.
+   
   
 
 
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index e3bf6a7449..c68448786d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3188,6 +3188,43 @@ keep_going:		/* We will come back to here until there is
 	return PGRES_POLLING_WRITING;
 }
 
+conn->status = CONNECTION_CHECK_TARGET;
+goto keep_going;
+			}
+
+		case CONNECTION_SETENV:
+			{
+
+/*
+ * Do post-connection housekeeping (only needed in protocol 2.0).
+ *
+ * We pretend that the connection is OK for the duration of these
+ * queries.
+ */
+conn->status = CONNECTION_OK;
+
+switch (pqSetenvPoll(conn))
+{
+	case PGRES_POLLING_OK:	/* Success */
+		break;
+
+	case PGRES_POLLING_READING: /* Still going */
+		conn->status = CONNECTION_SETENV;
+		return PGRES_POLLING_READING;
+
+	case PGRES_POLLING_WRITING: /* Still going */
+		conn->status = CONNECTION_SETENV;
+		return PGRES_POLLING_WRITING;
+
+	default:
+		goto error_return;
+}
+			}
+
+			/* Intentional fall through */
+
+		case CONNECTION_CHECK_TARGET:
+			{
 /*
  * If a read-write connection is required, see if we have one.
  *
@@ -3229,68 +3266,6 @@ keep_going:		/* We will come back to here until there is
 return PGRES_POLLING_OK;
 			}
 
-		case CONNECTION_SETENV:
-
-			/*
-			 * Do post-connection housekeeping (only needed in protocol 2.0).
-			 *
-			 * We pretend that the connection is OK for the duration of these
-			 * queries.
-			 */
-			conn->

Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-03-28 Thread Haribabu Kommi
On Wed, Mar 27, 2019 at 11:27 PM Amit Kapila 
wrote:

> On Wed, Mar 27, 2019 at 6:53 AM Haribabu Kommi 
> wrote:
> > On Tue, Mar 26, 2019 at 9:08 PM Amit Kapila 
> wrote:
> >>
> >>   As part of this thread, maybe we can
> >> just fix the case of the parallel cooperating transaction.
> >
> >
> > With the current patch, all the parallel implementation transaction are
> getting
> > skipped, in my tests parallel workers are the major factor in the
> transaction
> > stats counter. Even before parallelism, the stats of the autovacuum and
> etc
> > are still present but their contribution is not greatly influencing the
> stats.
> >
> > I agree with you in fixing the stats with parallel workers and improve
> stats.
> >
>
> I was proposing to fix only the transaction started with
> StartParallelWorkerTransaction by using is_parallel_worker flag as
> discussed above.  I understand that it won't completely fix the
> problem reported by you, but it will be a step in that direction.  My
> main worry is that if we fix it the way you are proposing and we also
> invent a new way to deal with all other internal transactions, then
> the fix done by us now needs to be changed/reverted.  Note, that this
> fix needs to be backpatched as well, so we should avoid doing any fix
> which needs to be changed or reverted.
>

I tried the approach you suggested, not counting the actual parallel
worker transactions by just releasing the resources without touching
the counters, but the counts are not reduced by much.

HEAD - a query with 4 parallel workers generates 13 stats (4 * 3 + 1)
Patch - a query with 4 parallel workers generates 9 stats (4 * 2 + 1)
Old approach patch - a query with 4 parallel workers generates 1 stat (1)

Currently a parallel worker starts a transaction three times, in the
following places:
1. InitPostgres
2. ParallelWorkerMain (twice)

With the attached patch, we remove one of the counts from
ParallelWorkerMain.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Avoid-counting-parallel-worker-transactions-stats_v3.patch
Description: Binary data


Re: [HACKERS] Block level parallel vacuum

2019-03-26 Thread Haribabu Kommi
On Wed, Mar 27, 2019 at 1:31 AM Masahiko Sawada 
wrote:

> On Tue, Mar 26, 2019 at 10:19 AM Haribabu Kommi
>  wrote:
> >
> >
> > + for (i = 0; i < nindexes; i++)
> > + {
> > + LVIndStats *s = &(copied_indstats[i]);
> > +
> > + if (s->updated)
> > + lazy_update_index_statistics(Irel[i], &(s->stats));
> > + }
> > +
> > + pfree(copied_indstats);
> >
> > why can't we use the shared memory directly to update the stats once all
> the workers
> > are finished, instead of copying them to a local memory?
>
> Since we cannot use heap_inplace_update() which is called by
> vac_update_relstats() during parallel mode I copied the stats. Is that
> safe if we destroy the parallel context *after* exited parallel mode?
>

OK, understood the reason behind the copy.

Thanks for the updated patches. I reviewed them again and they
are fine. I marked the patch as "ready for committer".

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-03-26 Thread Haribabu Kommi
On Tue, Mar 26, 2019 at 9:08 PM Amit Kapila  wrote:

> On Mon, Mar 25, 2019 at 6:55 PM Haribabu Kommi 
> wrote:
> >
> >
> > Thanks to everyone for their opinions and suggestions to improve.
> >
> > Without parallel workers, there aren't many internal implementation
> > logic code that causes the stats to bloat. Parallel worker stats are not
> > just the transactions, it update other stats also, for eg; seqscan stats
> > of a relation. I also eagree that just fixing it for transactions may not
> > be a proper solution.
> >
> > Using of XID data may not give proper TPS of the instance as it doesn't
> > involve the read only transactions information.
> >
> > How about adding additional two columns that provides all the internal
> > and background worker transactions into that column?
> >
>
> I can see the case where the users will be interested in application
> initiated transactions, so if we want to add new columns, it could be
> to display that information and the current columns can keep
> displaying the overall transactions in the database.   Here, I think
> we need to find a way to distinguish between internal and
> user-initiated transactions.  OTOH, I am not sure adding new columns
> is better than changing the meaning of existing columns.  We can go
> either way based on what others feel good, but I think we can do that
> as a separate head-only feature.


I agree with you. Adding new columns definitely needs more discussion
of which processes should be skipped, what needs to be added, and so
on.



>   As part of this thread, maybe we can
> just fix the case of the parallel cooperating transaction.
>

With the current patch, all the parallel-implementation transactions
are skipped; in my tests, parallel workers are the major contributor to
the transaction stats counter. Even before parallelism, the stats of
autovacuum and so on were present, but their contribution does not
greatly influence the stats.

I agree with you on fixing the stats for parallel workers first and
improving the rest of the stats later.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: MSVC Build support with visual studio 2019

2019-03-26 Thread Haribabu Kommi
On Wed, Mar 27, 2019 at 3:03 AM Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

>
> On 3/20/19 8:36 PM, Haribabu Kommi wrote:
> > Hi Hackers,
> >
> > Here I attached a patch that supports building of PostgreSQL with VS
> 2019.
> > VS 2019 is going to release on Apr 2nd 2019, it will be good if version
> 12
> > supports compiling. The attached for is for review, it may needs some
> > updates
> > once the final version is released.
> >
> > Commit d9dd406fe281d22d5238d3c26a7182543c711e74 has reduced the
> > minimum visual studio support to 2013 to support C99 standards,
> > because of this
> > reason, the current attached patch cannot be backpatched as it is.
> >
> > I can provide a separate back branches patch later once this patch
> > comes to a stage of commit. Currently all the supported branches are
> > possible to compile with VS 2017.
> >
> > comments?
> >
> >
>
>
> I have verified that this works with VS2019.
>
>
> There are a few typos in the comments that need cleaning up.
>

Thanks for the review.

I corrected the typos in the comments; hopefully I covered everything.
Attached is the updated patch. Once the final VS 2019 is released, I
will check whether the patch needs any more updates.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Support-building-with-visual-studio-2019_v2.patch
Description: Binary data


Re: pg_basebackup ignores the existing data directory permissions

2019-03-25 Thread Haribabu Kommi
On Tue, Mar 26, 2019 at 1:27 PM Michael Paquier  wrote:

> On Sun, Mar 24, 2019 at 10:30:47PM +1100, Haribabu Kommi wrote:
> > With the above additional options, the pg_basebackup is able to control
> > the access permissions of the backup files, but when it comes to tar mode
> > all the files are sent from the server and stored as it is in backup, to
> > support
> > tar mode group access mode control, the BASE BACKUP protocol is
> > enhanced with new option GROUP_MODE 'none' or GROUP_MODE 'group'
> > to control the file permissions before they are sent to backup. Sending
> > GROUP_MODE to the server depends on the -g option received to the
> > pg_basebackup utility.
>
>
Thanks for the review.


> Do we really want to extend the replication protocol to control that?
>

Since the backup data is sent in tar format and pg_basebackup also
stores it in tar format, I feel changing the permissions at tar
creation is easier than regenerating the received tar with different
permissions on the pg_basebackup side.

For formats other than tar, a change in pg_basebackup alone can support
independent group access permissions for the standby directory.

I am really questioning if we should keep this stuff isolated within
> pg_basebackup or not.  At the same time, it may be confusing to have
> BASE_BACKUP only use the permissions inherited from the data
> directory, so some input from folks maintaining an external backup
> tool is welcome.
>

It would be good to hear what the authors of external backup tools
think of this change.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: [HACKERS] Block level parallel vacuum

2019-03-25 Thread Haribabu Kommi
On Fri, Mar 22, 2019 at 4:06 PM Masahiko Sawada 
wrote:

>
> Attached the updated version patch. 0001 patch allows all existing
> vacuum options an boolean argument. 0002 patch introduces parallel
> lazy vacuum. 0003 patch adds -P (--parallel) option to vacuumdb
> command.
>

Thanks for sharing the updated patches.

0001 patch:

+PARALLEL [ N ]

This patch adds the PARALLEL syntax but no explanation; I see that it
is explained in 0002. Not a problem, just mentioning it.

+  Specifies parallel degree for PARALLEL option. The
+  value must be at least 1. If the parallel degree
+  integer is omitted, then
+  VACUUM decides the number of workers based on
number of
+  indexes on the relation which further limited by
+  .

Can we also add some more details about backend participation? Parallel
workers come into the picture only when there are at least two indexes
on the table.

+ /*
+ * Do post-vacuum cleanup and statistics update for each index if
+ * we're not in parallel lazy vacuum. If in parallel lazy vacuum, do
+ * only post-vacum cleanup and then update statistics after exited
+ * from parallel mode.
+ */
+ lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+ lps, true);

How about renaming the above function, as it does the cleanup also?
lazy_vacuum_or_cleanup_all_indexes?


+ if (!IsInParallelVacuum(lps))
+ {
+ /*
+ * Update index statistics. If in parallel lazy vacuum, we will
+ * update them after exited from parallel mode.
+ */
+ lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+ if (stats[idx])
+ pfree(stats[idx]);
+ }

The above check in lazy_vacuum_all_indexes can be combined with the outer
if check where the memcpy happens. I still feel that the logic around the
stats makes it a little complex.

+ if (IsParallelWorker())
+ msg = "scanned index \"%s\" to remove %d row versions by parallel vacuum
worker";
+ else
+ msg = "scanned index \"%s\" to remove %d row versions";

I feel this way of composing the error message may not get picked up for
translation. Is there any problem with duplicating the entire ereport
call, one per message?
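
Something along these lines (a sketch only; elevel and the message
arguments are assumed from the surrounding lazy vacuum code):

    /*
     * Sketch: duplicating the whole ereport() keeps each message a
     * single translatable string (variable names are assumptions).
     */
    if (IsParallelWorker())
        ereport(elevel,
                (errmsg("scanned index \"%s\" to remove %d row versions by parallel vacuum worker",
                        RelationGetRelationName(indrel),
                        vacrelstats->num_dead_tuples)));
    else
        ereport(elevel,
                (errmsg("scanned index \"%s\" to remove %d row versions",
                        RelationGetRelationName(indrel),
                        vacrelstats->num_dead_tuples)));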

+ for (i = 0; i < nindexes; i++)
+ {
+ LVIndStats *s = &(copied_indstats[i]);
+
+ if (s->updated)
+ lazy_update_index_statistics(Irel[i], &(s->stats));
+ }
+
+ pfree(copied_indstats);

Why can't we use the shared memory directly to update the stats once all
the workers have finished, instead of copying them to local memory?

+ tab->at_params.nworkers = 0; /* parallel lazy autovacuum is not supported
*/

The user is not required to provide a worker count for parallel vacuum to
work, so just setting the above parameter doesn't stop the parallel
workers; the user must also pass the PARALLEL option. Mentioning that will
be helpful later when we start supporting it, and for anyone reading the
code.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-03-25 Thread Haribabu Kommi
On Sat, Mar 23, 2019 at 11:10 PM Amit Kapila 
wrote:

> On Sat, Mar 23, 2019 at 9:50 AM Alvaro Herrera 
> wrote:
> >
> > On 2019-Mar-23, Amit Kapila wrote:
> >
> > > I think some users might also be interested in the write transactions
> > > happened in the system, basically, those have consumed xid.
> >
> > Well, do they really want to *count* these transactions, or is it enough
> > to keep an eye on the "age" of some XID column?  Other than for XID
> > freezing purposes, I don't see such internal transactions as very
> > interesting.
> >
>
> That's what I also had in mind.  I think doing anything more than just
> fixing the count for the parallel cooperating transaction by parallel
> workers doesn't seem intuitive to me. I mean if we want we can commit
> the fix such that all supporting transactions by parallel worker
> shouldn't be counted, but I am not able to convince myself that that
> is the good fix.  Instead, I think rather than fixing that one case we
> should think more broadly about all the supportive transactions
> happening in the various operations.  Also, as that is a kind of
> behavior change, we should discuss that as a separate topic.
>

Thanks to everyone for the opinions and suggestions for improvement.

Without parallel workers, there isn't much internal implementation logic
that causes the stats to bloat. Parallel worker stats are not just the
transactions; they update other stats as well, e.g. the seqscan stats of
a relation. I also agree that fixing it only for transactions may not be
a proper solution.

Using XID data may not give the proper TPS of the instance, as it doesn't
include read-only transaction information.

How about adding two additional columns that report all the internal and
background worker transactions separately?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-03-25 Thread Haribabu Kommi
On Fri, Mar 22, 2019 at 6:07 PM Haribabu Kommi 
wrote:

>
> On Fri, Mar 22, 2019 at 7:32 AM Haribabu Kommi 
> wrote:
>
>>
>> On Fri, Mar 22, 2019 at 6:57 AM Robert Haas 
>> wrote:
>>
>>> On Thu, Mar 21, 2019 at 2:26 AM Haribabu Kommi 
>>> wrote:
>>> > Based on the above new options that can be added to
>>> target_session_attrs,
>>> >
>>> > primary - it is just an alias to the read-write option.
>>> > standby, prefer-standby - These options should check whether server is
>>> running in recovery mode or not
>>> > instead of checking whether server accepts read-only connections or
>>> not?
>>>
>>> I think it will be best to have one set of attributes that check
>>> default_transaction_read_only and a differently-named set that check
>>> pg_is_in_recovery().  For each, there should be one value that looks
>>> for a 'true' return and one value that looks for a 'false' return and
>>> perhaps values that accept either but prefer one or the other.
>>>
>>> IOW, there's no reason to make primary an alias for read-write.  If
>>> you want read-write, you can just say read-write.  But we can make
>>> 'primary' or 'master' look for a server that's not in recovery and
>>> 'standby' look for one that is.
>>>
>>
>> OK, I agree with opinion. I will produce a patch for those new options.
>>
>
> Here I attached WIP patch for the new options along with other older
> patches.
> The basic cases are working fine, doc and tests are missing.
>
> Currently this patch doesn't implement the GUC_REPORT for recovery mode
> yet. I am yet to optimize the complex if check.
>

Except for the in_hotstandby GUC_REPORT, the rest of the changes are
implemented. Updated patches are attached.

While going through the old patch where the GUC_REPORT was implemented,
Tom commented on the logic of signalling all backends with SIGHUP to
process the hot standby exit. If we add the logic of updating the GUC
variable value on SIGHUP, we may need to change all the SIGHUP handler
code paths. It is also possible that there is no need to update the
variable for any processes other than backends.

Alternatively, adding a new SIGUSR1 reason to check and update the GUC
variable can work for most of the backends and background workers.

Opinions?

Regards,
Haribabu Kommi
Fujitsu Australia

Note: the attachment order may sometimes be wrong.


0001-Restructure-the-code-to-remove-duplicate-code.patch
Description: Binary data


0002-New-TargetSessionAttrsType-enum.patch
Description: Binary data


0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch
Description: Binary data


0005-New-read-only-target_session_attrs-type.patch
Description: Binary data


0004-New-prefer-read-target_session_attrs-type.patch
Description: Binary data


0006-Primary-prefer-standby-and-standby-options.patch
Description: Binary data


0007-New-function-to-rejecting-the-checked-write-connecti.patch
Description: Binary data


Re: current_logfiles not following group access and instead follows log_file_mode permissions

2019-03-25 Thread Haribabu Kommi
On Sun, Mar 24, 2019 at 11:16 PM Michael Paquier 
wrote:

> On Fri, Mar 22, 2019 at 01:01:44PM +0900, Michael Paquier wrote:
> > On Fri, Mar 22, 2019 at 02:35:41PM +1100, Haribabu Kommi wrote:
> > > Thanks for the correction.  Yes, that is correct and it works fine.
> >
> > Thanks for double-checking.  Are there any objections with this patch?
>
> Done and committed down to v11 where group access has been added.
> There could be an argument to do the same in v10, but as the root of
> the problem is the interaction between a data folder using 0640 as
> base mode for files and log_file_mode being more restrictive, then it
> cannot apply to v10.
>
> After testing and reviewing the patch, I noticed that all versions
> sent up to now missed two things done by logfile_open():
> - Bufferring is line-buffered.  For current_logfiles it may not matter
> much as the contents are first written into a temporary file and then
> the file is renamed, but for debugging having the insurance of
> consistent contents is nice even for the temporary file.
> - current_logfiles uses \r\n.  While it does not have a consequence
> for the parsing of the file by pg_current_logfile, it breaks the
> readability of the file on Windows, which is not nice.
> So I have kept the patch with the previous defaults for consistency.
> Perhaps they could be changed, but the current set is a good set.
>

Thanks, Michael and others.
This really helps in choosing restrictive log file permissions even when
the data directory is enabled for group access.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: pg_basebackup ignores the existing data directory permissions

2019-03-24 Thread Haribabu Kommi
On Sat, Mar 23, 2019 at 2:23 AM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2019-03-22 05:00, Michael Paquier wrote:
> > On Fri, Mar 22, 2019 at 02:45:24PM +1100, Haribabu Kommi wrote:
> >> How about letting the pg_basebackup to decide group permissions of the
> >> standby directory irrespective of the primary directory permissions.
> >>
> >> Default - permissions are same as primary
> >> --allow-group-access - standby directory have group access permissions
> >> --no-group--access - standby directory doesn't have group permissions
> >>
> >> The last two options behave irrespective of the primary directory
> >> permissions.
> >
> > Yes, I'd imagine that we would want to be able to define three
> > different behaviors, by either having a set of options, or a sinple
> > option with a switch, say --group-access:
> > - "inherit" causes the permissions to be inherited from the source
> > node, and that's the default.
> > - "none" enforces the default 0700/0600.
> > - "group" enforces group read access.
>
> Yes, we could use those three behaviors.
>

Thanks for all your opinions; attached is an updated patch as discussed.

A new option, -g/--group-mode, is added to pg_basebackup to specify the
group access permissions:

inherit - same permissions as the source instance (default)
none - no group permissions, irrespective of the source instance
group - group permissions, irrespective of the source instance

With the above additional options, pg_basebackup is able to control the
access permissions of the backup files. In tar mode, however, all the
files are sent from the server and stored as-is in the backup. To support
group access control in tar mode, the BASE_BACKUP protocol is extended
with a new option, GROUP_MODE 'none' or GROUP_MODE 'group', to control
the file permissions before they are sent to the backup. Sending
GROUP_MODE to the server depends on the -g option passed to the
pg_basebackup utility.
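
For illustration, usage with the proposed option would look like this
(paths are placeholders):

    pg_basebackup -D /backups/standby1 -g inherit   # default: same as source
    pg_basebackup -D /backups/standby1 -g none      # enforce 0700/0600, no group access
    pg_basebackup -D /backups/standby1 -Ft -g group # tar mode: server applies GROUP_MODE 'group'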

Comments?

Regards,
Haribabu Kommi
Fujitsu Australia


0001-New-pg_basebackup-g-option-to-control-the-group-acce.patch
Description: Binary data


Re: Libpq support to connect to standby server as priority

2019-03-22 Thread Haribabu Kommi
On Fri, Mar 22, 2019 at 7:32 AM Haribabu Kommi 
wrote:

>
> On Fri, Mar 22, 2019 at 6:57 AM Robert Haas  wrote:
>
>> On Thu, Mar 21, 2019 at 2:26 AM Haribabu Kommi 
>> wrote:
>> > Based on the above new options that can be added to
>> target_session_attrs,
>> >
>> > primary - it is just an alias to the read-write option.
>> > standby, prefer-standby - These options should check whether server is
>> running in recovery mode or not
>> > instead of checking whether server accepts read-only connections or not?
>>
>> I think it will be best to have one set of attributes that check
>> default_transaction_read_only and a differently-named set that check
>> pg_is_in_recovery().  For each, there should be one value that looks
>> for a 'true' return and one value that looks for a 'false' return and
>> perhaps values that accept either but prefer one or the other.
>>
>> IOW, there's no reason to make primary an alias for read-write.  If
>> you want read-write, you can just say read-write.  But we can make
>> 'primary' or 'master' look for a server that's not in recovery and
>> 'standby' look for one that is.
>>
>
> OK, I agree with opinion. I will produce a patch for those new options.
>

Attached is a WIP patch for the new options, along with the other, older
patches. The basic cases are working fine; docs and tests are missing.

Currently this patch doesn't implement the GUC_REPORT for recovery mode
yet. I still have to optimize the complex if check.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Restructure-the-code-to-remove-duplicate-code.patch
Description: Binary data


0002-New-TargetSessionAttrsType-enum.patch
Description: Binary data


0005-New-read-only-target_session_attrs-type.patch
Description: Binary data


0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch
Description: Binary data


0004-New-prefer-read-target_session_attrs-type.patch
Description: Binary data


0006-Primary-prefer-standby-and-standby-options.patch
Description: Binary data


Re: pg_basebackup ignores the existing data directory permissions

2019-03-21 Thread Haribabu Kommi
On Fri, Mar 22, 2019 at 11:42 AM Michael Paquier 
wrote:

> On Thu, Mar 21, 2019 at 02:56:24PM -0400, Robert Haas wrote:
> > On Tue, Mar 19, 2019 at 2:29 AM Michael Paquier 
> wrote:
> >> Hm.  We have been assuming that the contents of a base backup inherit
> >> the permission of the source when using pg_basebackup because this
> >> allows users to keep a nodes in a consistent state without deciding
> >> which option to use.  Do you mean that you would like to enforce the
> >> permissions of only the root directory if it exists?  Or the root
> >> directory with all its contents?  The former may be fine.  The latter
> >> is definitely not.
> >
> > Why not?
>
> Because we have released v11 so as we respect the permissions set on
> the source instead from which the backup is taken for all the folder's
> content.  If we begin to enforce it we may break some cases.  If a new
> option is introduced, it seems to me that the default should remain
> what has been released with v11, but that it is additionally possible
> to enforce group permissions or non-group permissions at will on the
> backup taken for all the contents in the data folder, including the
> root folder, created manually or not before running the pg_basebackup
> command.
>

How about letting pg_basebackup decide the group permissions of the
standby directory, irrespective of the primary directory's permissions?

Default - permissions are the same as the primary
--allow-group-access - the standby directory has group access permissions
--no-group-access - the standby directory doesn't have group permissions

The last two options behave irrespective of the primary directory
permissions.

Opinions?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: current_logfiles not following group access and instead follows log_file_mode permissions

2019-03-21 Thread Haribabu Kommi
On Fri, Mar 22, 2019 at 12:24 PM Michael Paquier 
wrote:

> On Thu, Mar 21, 2019 at 12:52:14PM +1100, Haribabu Kommi wrote:
> > Earlier attached patch is wrong.
>
> -   oumask = umask(pg_file_create_mode);
> +   oumask = umask(pg_mode_mask);
> Indeed that was wrong.
>
> > Correct patch attached. Sorry for the inconvenience.
>
> This looks better for the umask setting, still it could be more
> simple.
>
>  #include 
> -
> +#include "common/file_perm.h"
>  #include "lib/stringinfo.h"
> Nit: it is better for readability to keep an empty line between the
> system includes and the Postgres ones.
>
> A second thing, more important, is that you can reset umask just after
> opening the file, as attached.  This way there is no need to reset the
> umask in all the code paths leaving update_metainfo_datafile().  Does
> that look fine to you?
>

Thanks for the correction. Yes, that is correct and it works fine.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-03-21 Thread Haribabu Kommi
On Fri, Mar 22, 2019 at 6:57 AM Robert Haas  wrote:

> On Thu, Mar 21, 2019 at 2:26 AM Haribabu Kommi 
> wrote:
> > Based on the above new options that can be added to target_session_attrs,
> >
> > primary - it is just an alias to the read-write option.
> > standby, prefer-standby - These options should check whether server is
> running in recovery mode or not
> > instead of checking whether server accepts read-only connections or not?
>
> I think it will be best to have one set of attributes that check
> default_transaction_read_only and a differently-named set that check
> pg_is_in_recovery().  For each, there should be one value that looks
> for a 'true' return and one value that looks for a 'false' return and
> perhaps values that accept either but prefer one or the other.
>
> IOW, there's no reason to make primary an alias for read-write.  If
> you want read-write, you can just say read-write.  But we can make
> 'primary' or 'master' look for a server that's not in recovery and
> 'standby' look for one that is.
>

OK, I agree with that opinion. I will produce a patch for those new options.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Pluggable Storage - Andres's take

2019-03-21 Thread Haribabu Kommi
On Fri, Mar 22, 2019 at 5:16 AM Andres Freund  wrote:

> Hi,
>
> Attached is a version of just the first patch. I'm still updating it,
> but it's getting closer to commit:
>
> - There were no tests testing EPQ interactions with DELETE, and only an
>   accidental test for EPQ in UPDATE with a concurrent DELETE. I've added
>   tests. Plan to commit that ahead of the big change.
>
> - I was pretty unhappy about how the EPQ integration looked before, I've
>   changed that now.
>
>   I still wonder if we should restore EvalPlanQualFetch and move the
>   table_lock_tuple() calls in ExecDelete/Update into it. But it seems
>   like it'd not gain that much, because there's custom surrounding code,
>   and it's not that much code.
>
> - I changed heapam_tuple_tuple to return *WouldBlock rather than just
>   the last result. I think that's one of the reason Haribabu had
>   neutered a few asserts.
>
> - I moved comments from heapam.h to tableam.h where appropriate
>
> - I updated the name of HeapUpdateFailureData to TM_FailureData,
>   HTSU_Result to TM_Result, TM_Results members now properly distinguish
>   between update vs modifications (delete & update).
>
> - I separated the HEAP_INSERT_ flags into TABLE_INSERT_* and
>   HEAP_INSERT_ with the latter being a copy of TABLE_INSERT_ with the
>   sole addition of _SPECULATIVE. table_insert_speculative callers now
>   don't specify that anymore.
>
>
> Pending work:
> - Wondering if table_insert/delete/update should rather be
>   table_tuple_insert etc. Would be a bit more consistent with the
>   callback names, but a bigger departure from existing code.
>
> - I'm not yet happy with TableTupleDeleted computation in heapam.c, I
>   want to revise that further
>
> - formatting
>
> - commit message
>
> - a few comments need a bit of polishing (ExecCheckTIDVisible,
> heapam_tuple_lock)
>
> - Rename TableTupleMayBeModified to TableTupleOk, but also probably a
> s/TableTuple/TableMod/
>
> - I'll probably move TUPLE_LOCK_FLAG_LOCK_* into tableam.h
>
> - two more passes through the patch
>

Thanks for the corrections.



> On 2019-03-21 15:07:04 +1100, Haribabu Kommi wrote:
> > As you are modifying the 0003 patch for modify API's, I went and reviewed
> > the
> > existing patch and found couple corrections that are needed, in case if
> you
> > are not
> > taken care of them already.
>
> Some I have...
>
>
>
> > + /* Update the tuple with table oid */
> > + slot->tts_tableOid = RelationGetRelid(relation);
> > + if (slot->tts_tableOid != InvalidOid)
> > + tuple->t_tableOid = slot->tts_tableOid;
> >
> > The setting of slot->tts_tableOid is not required in this function,
> > After set the check is happening, the above code is present in both
> > heapam_heap_insert and heapam_heap_insert_speculative.
>
> I'm not following? Those functions are independent?
>

In those functions, slot->tts_tableOid is set, and the very next statement
checks whether it is invalid. Callers of table_insert should have already
set it, so is setting the value and then checking it on the next line
required? The value cannot be InvalidOid.


>
> > + slot->tts_tableOid = RelationGetRelid(relation);
> >
> > In heapam_heap_update, i don't think there is a need to update
> > slot->tts_tableOid.
>
> Why?
>

slot->tts_tableOid should have been updated before the call to
heap_update; is setting it again after heap_update required?

I also observed slot->tts_tableOid being set after the table_insert_XXX
calls in the ExecInsert function.

Is this to make sure that the AM hasn't modified that value?


> > + default:
> > + elog(ERROR, "unrecognized heap_update status: %u", result);
> >
> > heap_update --> table_update?
> >
> >
> > + default:
> > + elog(ERROR, "unrecognized heap_delete status: %u", result);
> >
> > same as above?
>
> I've fixed that in a number of places.
>
>
> > + /*hari FIXME*/
> > + /*Assert(result != HeapTupleUpdated && hufd.traversed);*/
> >
> > Removing the commented codes in both ExecDelete and ExecUpdate functions.
>
> I don't think that's the right fix. I've refactored that code
> significantly now, and restored the assert in a, imo, correct version.
>
>
OK.


> > + /**/
> > + if (epqreturnslot)
> > + {
> > + *epqreturnslot = epqslot;
> > + return NULL;
> > + }
> >
> > comment update missed?
>
> Well, you'd deleted a comment around there ;). I've added something back
> now...
>

That's not the only problem I could have introduced; all the comments
listed were introduced by me ;).

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-03-21 Thread Haribabu Kommi
On Wed, Mar 20, 2019 at 5:01 PM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: Robert Haas [mailto:robertmh...@gmail.com]
> > I really dislike having both target_sesion_attrs and
> > target_server_type.  It doesn't solve any actual problem.  master,
> > slave, prefer-save, or whatever you like could be put in
> > target_session_attrs just as easily, and then we wouldn't end up with
> > two keywords doing closely related things.  'master' is no more or
> > less a server attribute than 'read-write'.
>
> Hmm, that may be OK.  At first, I felt it strange to treat the server type
> (primary or standby) as a session attribute.  But we can see the server
> type as one attribute in a sense that a session is established for.  I'm
> inclined to agree with:
>
> target_session_attr = {any | read-write | read-only | prefer-read |
> primary | standby | prefer-standby}
>

Thanks for your suggestions.

Based on the above new options that can be added to target_session_attrs:

primary - just an alias for the read-write option.
standby, prefer-standby - these options should check whether the server is
running in recovery mode, instead of checking whether the server accepts
read-only connections, right?
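
For illustration, with the proposed values a client could then request,
e.g. (host names are placeholders):

    psql "host=node1,node2 target_session_attrs=standby"        # require a server in recovery
    psql "host=node1,node2 target_session_attrs=prefer-standby" # prefer one, else fall back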

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-03-21 Thread Haribabu Kommi
On Wed, Mar 20, 2019 at 7:38 PM Amit Kapila  wrote:

> On Sun, Feb 10, 2019 at 10:54 AM Haribabu Kommi
>  wrote:
> > On Sat, Feb 9, 2019 at 4:07 PM Amit Kapila 
> wrote:
> >>
> >> I don't think so.  It seems to me that we should consider it as a
> >> single transaction.  Do you want to do the leg work for this and try
> >> to come up with a patch?  On a quick look, I think we might want to
> >> change AtEOXact_PgStat so that the commits for parallel workers are
> >> not considered.  I think the caller has that information.
> >
> >
> > I try to fix it by adding a check for parallel worker or not and based
> on it
> > count them into stats. Patch attached.
> >
>

Thanks for the review.


> @@ -2057,14 +2058,18 @@ AtEOXact_PgStat(bool isCommit)
>  {
> ..
> + /* Don't count parallel worker transactions into stats */
> + if (!IsParallelWorker())
> + {
> + /*
> + * Count transaction commit or abort.  (We use counters, not just bools,
> + * in case the reporting message isn't sent right away.)
> + */
> + if (isCommit)
> + pgStatXactCommit++;
> + else
> + pgStatXactRollback++;
> + }
> ..
> }
>
> I wonder why you haven't used the 'is_parallel_worker' flag from the
> caller,  see CommitTransaction/AbortTransaction?  The difference is
> that if we use that then it will just avoid counting transaction for
> the parallel work (StartParallelWorkerTransaction), otherwise, it
> might miss the count of any other transaction we started in the
> parallel worker for some intermediate work (for example, the
> transaction we started to restore library and guc state).
>

I understand your comment. The current patch simply avoids counting every
transaction that is used for the parallel operation.

With the 'is_parallel_worker' flag, only the actual transaction that
performed the parallel operation is excluded (all supportive transactions
are counted).
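
A rough sketch of the caller-flag approach (the extra parameter is
hypothetical here; CommitTransaction/AbortTransaction would pass down
whether this transaction's state is shared with a parallel leader):

    /*
     * Sketch only: hypothetical signature change so that just the
     * leader-shared parallel worker transaction is excluded, while
     * other internal transactions in the worker are still counted.
     */
    void
    AtEOXact_PgStat(bool isCommit, bool is_parallel_worker)
    {
        if (!is_parallel_worker)
        {
            if (isCommit)
                pgStatXactCommit++;
            else
                pgStatXactRollback++;
        }
        /* remaining end-of-transaction stats work is unchanged */
    }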


> I think it boils down to whether we want to avoid any transaction that
> is started by a parallel worker or just the transaction which is
> shared among leader and worker.  It seems to me that currently, we do
> count all the internal transactions (started with
> StartTransactionCommand and CommitTransactionCommand) in this counter,
> so we should just try to avoid the transaction for which state is
> shared between leader and workers.
>
> Anyone else has any opinion on this matter?
>

As I said in my first mail, tools including pgAdmin4 display TPS as a
stat on the dashboard. Those TPS values are derived from the xact_commit
column of the pg_stat_database view.

With the inclusion of parallel worker transactions, the TPS becomes a
higher number, so the user may take it as better throughput. This was the
case with one of our tools. The tool changes the configuration randomly
to find the best configuration for the server based on a set of
workloads; during our tests, with some configurations the TPS looked
good, but the overall performance was slow, as the system had too few
resources to keep up with that configuration.

Opinions?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Pluggable Storage - Andres's take

2019-03-20 Thread Haribabu Kommi
Hi,

The psql \dA command currently doesn't show the type for access methods
of type 'Table'.

postgres=# \dA heap
List of access methods
 Name | Type
--+---
 heap |
(1 row)

Attached a simple patch that fixes the problem and outputs as follows.

postgres=# \dA heap
List of access methods
 Name | Type
--+---
 heap | Table
(1 row)

The attached patch directly modifies the query that is sent to the server.
Servers < 12 don't have access methods of type 'Table', but the same query
still works there because of the added CASE expression:

SELECT amname AS "Name",
  CASE amtype WHEN 'i' THEN 'Index' WHEN 't' THEN 'Table' END AS
"Type"
FROM pg_catalog.pg_am ...

Does anyone feel that a separate query is required for servers < 12?

Regards,
Haribabu Kommi
Fujitsu Australia


0001-dA-to-show-Table-type-access-method.patch
Description: Binary data


Re: Pluggable Storage - Andres's take

2019-03-20 Thread Haribabu Kommi
On Sat, Mar 16, 2019 at 5:43 PM Haribabu Kommi 
wrote:

>
>
> On Sat, Mar 9, 2019 at 2:13 PM Andres Freund  wrote:
>
>> Hi,
>>
>> While 0001 is pretty bulky, the interesting bits concentrate on a
>> comparatively small area. I'd appreciate if somebody could give the
>> comments added in tableam.h a read (both on callbacks, and their
>> wrappers, as they have different audiences). It'd make sense to first
>> read the commit message, to understand the goal (and I'd obviously also
>> appreciate suggestions for improvements there as well).
>>
>> I'm pretty happy with the current state of the scan patch. I plan to do
>> two more passes through it (formatting, comment polishing, etc. I don't
>> know of any functional changes needed), and then commit it, lest
>> somebody objects.
>>
>
> I found couple of typos in the committed patch, attached patch fixes them.
> I am not sure about one typo, please check it once.
>
> And I reviewed the 0002 patch, which is a pretty simple and it can be
> committed.
>

As you are modifying the 0003 patch for the modify APIs, I went through
and reviewed the existing patch and found a couple of corrections that
are needed, in case you haven't taken care of them already.


+ /* Update the tuple with table oid */
+ slot->tts_tableOid = RelationGetRelid(relation);
+ if (slot->tts_tableOid != InvalidOid)
+ tuple->t_tableOid = slot->tts_tableOid;

The setting of slot->tts_tableOid is not required in this function; the
check happens right after the assignment. The above code is present in
both heapam_heap_insert and heapam_heap_insert_speculative.
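
That is, something like this simplification (a sketch of the suggestion):

    /*
     * Sketch: if callers always set the slot's table OID, the check is
     * redundant and the copy can be unconditional.
     */
    slot->tts_tableOid = RelationGetRelid(relation);
    tuple->t_tableOid = slot->tts_tableOid;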


+ slot->tts_tableOid = RelationGetRelid(relation);

In heapam_heap_update, I don't think there is a need to update
slot->tts_tableOid.


+ default:
+ elog(ERROR, "unrecognized heap_update status: %u", result);

heap_update --> table_update?


+ default:
+ elog(ERROR, "unrecognized heap_delete status: %u", result);

same as above?


+ /*hari FIXME*/
+ /*Assert(result != HeapTupleUpdated && hufd.traversed);*/

Remove the commented-out code in both the ExecDelete and ExecUpdate functions.


+ /**/
+ if (epqreturnslot)
+ {
+ *epqreturnslot = epqslot;
+ return NULL;
+ }

Was a comment update missed?


Regards,
Haribabu Kommi
Fujitsu Australia


Re: current_logfiles not following group access and instead follows log_file_mode permissions

2019-03-20 Thread Haribabu Kommi
On Thu, Mar 21, 2019 at 12:41 PM Haribabu Kommi 
wrote:

>
> On Wed, Mar 20, 2019 at 4:33 PM Michael Paquier 
> wrote:
>
>> And actually it seems to me that you have a race condition in that
>> stuff.  I think that you had better use umask(), then fopen, and then
>> once again umask() to put back the previous permissions, removing the
>> extra chmod() call.
>>
>
> Changed the patch to use umask() instead of chmod() according to
> your suggestion.
>
> updated patch attached.
>

The earlier attached patch was wrong.
The correct patch is attached; sorry for the inconvenience.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Adjust-current_logfiles-file-permissions_v3.patch
Description: Binary data


Re: MSVC Build support with visual studio 2019

2019-03-20 Thread Haribabu Kommi
On Thu, Mar 21, 2019 at 12:31 PM Michael Paquier 
wrote:

> On Thu, Mar 21, 2019 at 09:47:02AM +0900, Michael Paquier wrote:
> > When it comes to support newer versions of MSVC, we have come up
> > lately to backpatch that down to two stable versions but not further
> > down (see f2ab389 for v10 and v9.6), so it looks sensible to target
> > v11 and v10 as well if we were to do it today, and v11/v12 if we do it
> > in six months per the latest trends.
>
> By the way, you mentioned upthread that all the branches can compile
> with MSVC 2017, but that's not actually the case for 9.5 and 9.4 if
> you don't back-patch f2ab389 further down.
>

Commit f2ab389 was later back-patched to all versions down to 9.3 in
commit 19acfd65. I guess that building the Windows installer for all the
versions with the same Visual Studio may be the reason behind that
back-patch.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: current_logfiles not following group access and instead follows log_file_mode permissions

2019-03-20 Thread Haribabu Kommi
On Wed, Mar 20, 2019 at 4:33 PM Michael Paquier  wrote:

> On Fri, Mar 15, 2019 at 06:51:37PM +1100, Haribabu Kommi wrote:
> > IMO, this update is just a recommendation to the user, and sometimes it
> is
> > still possible that there may be strict permissions for the log file
> > even the data directory is allowed for the group access. So I feel
> > it is still better to update the permissions of the current_logfiles
> > to the database files permissions than log file permissions.
>
> I was just reading again this thread, and the suggestions that
> current_logfiles is itself not a log file is also a sensible
> position.  I was just looking at the patch that you sent at the top of
> the thread here:
>
> https://www.postgresql.org/message-id/cajrrpgceotf1p7awoeqyd3pqr-0xkqg_herv98djbamj+na...@mail.gmail.com
>

Thanks for the review.


> And actually it seems to me that you have a race condition in that
> stuff.  I think that you had better use umask(), then fopen, and then
> once again umask() to put back the previous permissions, removing the
> extra chmod() call.
>

I changed the patch to use umask() instead of chmod(), according to your
suggestion.

The updated patch is attached.
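
For reference, the pattern under discussion looks roughly like this as
standalone C (a minimal sketch, not the actual patch code):

    #include <stdio.h>
    #include <sys/stat.h>
    #include <sys/types.h>

    /*
     * Open a file for writing with restricted creation permissions,
     * avoiding the race window of fopen() followed by a separate
     * chmod(): tighten the umask, open, then restore the old umask.
     */
    static FILE *
    fopen_with_umask(const char *path, mode_t mask)
    {
        mode_t  oumask = umask(mask);   /* e.g. pg_mode_mask in the patch */
        FILE   *fh = fopen(path, "w");

        umask(oumask);                  /* restore the previous umask */
        return fh;
    }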

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Adjust-current_logfiles-file-permissions_v2.patch
Description: Binary data


MSVC Build support with visual studio 2019

2019-03-20 Thread Haribabu Kommi
Hi Hackers,

Attached is a patch that supports building PostgreSQL with VS 2019.
VS 2019 is going to be released on Apr 2nd 2019; it would be good if
version 12 supported compiling with it. The attached patch is for review
and may need some updates once the final version is released.

Commit d9dd406fe281d22d5238d3c26a7182543c711e74 reduced the minimum
supported Visual Studio version to 2013 in order to support the C99
standard; because of this, the attached patch cannot be backpatched
as-is.

I can provide a separate back-branch patch later, once this patch is
close to being committed. Currently, all the supported branches can be
compiled with VS 2017.

Comments?

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Support-building-with-visual-studio-2019.patch
Description: Binary data


Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-03-19 Thread Haribabu Kommi
On Tue, Mar 19, 2019 at 6:47 PM Jamison, Kirk 
wrote:

> Hi Hari-san,
>
>
>
> On Sunday, February 10, 2019 2:25 PM (GMT+9), Haribabu Kommi wrote:
>
> > I try to fix it by adding a check for parallel worker or not and based
> on it
>
> > count them into stats. Patch attached.
>
> >
>
> > With this patch, currently it doesn't count parallel worker
> transactions, and
>
> > rest of the stats like seqscan and etc are still get counted. IMO they
> still
>
> > may need to be counted as those stats represent the number of tuples
>
> > returned and etc.
>
> >
>
> > Comments?
>
>
>
> I took a look at your patch, and it’s pretty straightforward.
>
> However, currently the patch does not apply, so I reattached an updated one
>
> to keep the CFbot happy.
>
>
>
> The previous patch also had a missing header to detect parallel workers
>
> so I added that. --> #include "access/parallel.h"
>

Thanks for the update and review, Kirk.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-03-19 Thread Haribabu Kommi
On Tue, Mar 19, 2019 at 2:32 PM Rahila Syed 
wrote:

> Hi Haribabu,
>
> The latest patch fails while applying header files part. Kindly rebase.
>

Thanks for the review.


> The patch looks good to me. However, I wonder what are the other scenarios
> where xact_commit is incremented because even if I commit a single
> transaction with your patch applied the increment in xact_commit is > 1. As
> you mentioned upthread, need to check what internal operation resulted in
> the increase. But the increase in xact_commit is surely lesser with the
> patch.
>

Currently, the transaction counters are also incremented by background
processes like autovacuum and the checkpointer. If you turn autovacuum
off, you can see exactly how many transaction commits increase.
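
For example, to observe the counter without autovacuum noise (a sketch;
the setting takes effect after a reload):

    ALTER SYSTEM SET autovacuum = off;
    SELECT pg_reload_conf();
    -- note the counter, run a transaction, then check again:
    SELECT xact_commit FROM pg_stat_database
     WHERE datname = current_database();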


> postgres=# BEGIN;
> BEGIN
> postgres=# select xact_commit from pg_stat_database where datname =
> 'postgres';
>  xact_commit
> -
>  158
> (1 row)
>
> postgres=# explain analyze select * from foo where i = 1000;
>QUERY
> PLAN
>
> 
>  Gather  (cost=1000.00..136417.85 rows=1 width=37) (actual
> time=4.596..3792.710 rows=1 loops=1)
>Workers Planned: 2
>Workers Launched: 2
>->  Parallel Seq Scan on foo  (cost=0.00..135417.75 rows=1 width=37)
> (actual time=2448.038..3706.009 rows=0 loops=3)
>  Filter: (i = 1000)
>  Rows Removed by Filter: 333
>  Planning Time: 0.353 ms
>  Execution Time: 3793.572 ms
> (8 rows)
>
> postgres=# commit;
> COMMIT
> postgres=# select xact_commit from pg_stat_database where datname =
> 'postgres';
>  xact_commit
> -
>  161
> (1 row)
>

Thanks for the test and confirmation.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: pg_basebackup ignores the existing data directory permissions

2019-03-19 Thread Haribabu Kommi
On Tue, Mar 19, 2019 at 5:29 PM Michael Paquier  wrote:

> On Mon, Mar 18, 2019 at 11:45:05AM -0400, Robert Haas wrote:
> > So you want to default to no group access regardless of the directory
> > permissions, with an option to enable group access that must be
> > explicitly specified?  That seems like a reasonable option to me; note
> > that initdb does seem to chdir() an existing directory.
>
> Hm.  We have been assuming that the contents of a base backup inherit
> the permission of the source when using pg_basebackup because this
> allows users to keep a nodes in a consistent state without deciding
> which option to use.  Do you mean that you would like to enforce the
> permissions of only the root directory if it exists?  Or the root
> directory with all its contents?  The former may be fine.  The latter
> is definitely not.
>

As per my understanding of the discussion, the option applies to the root
directory along with all its contents.

How about the following change?

pg_basebackup --> copies the contents of the source directory (with group
access) and also the root directory permissions.

pg_basebackup --no-group-access --> copies the contents of the source
directory (with no group access), even for the root directory.

So the default behavior works for most people, and those who need the
restrictive behavior can use the new option.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: What to name the current heap after pluggable storage / what to rename?

2019-03-18 Thread Haribabu Kommi
On Tue, Mar 19, 2019 at 2:32 PM Andres Freund  wrote:

> On 2019-03-18 16:24:40 -0700, Andres Freund wrote:
> > Hi,
> >
> > On 2019-03-13 08:29:47 -0400, Robert Haas wrote:
> > > On Tue, Mar 12, 2019 at 8:39 PM Andres Freund 
> wrote:
> > > > > I like that option.
> > > >
> > > > In that vein, does anybody have an opinion about the naming of
> > > > a) HeapUpdateFailureData, which will be used for different AMs
> > > > b) HTSU_Result itself, which'll be the return parameter for
> > > >update/delete via tableam
> > > > c) Naming of HTSU_Result members (like HeapTupleBeingUpdated)
> > > >
> > > > I can see us doing several things:
> > > > 1) Live with the old names, explain the naming as historical baggage
> > > > 2) Replace names, but add typedefs / #defines for backward
> compatibility
> > > > 3) Rename without backward compatibility
> > >
> > > I think I have a mild preference for renaming HeapUpdateFailureData to
> > > TableUpdateFailureData, but I'm less excited about renaming
> > > HTSU_Result or its members.  I don't care if you want to do
> > > s/HeapTuple/TableTuple/g and s/HTSU_Result/TTSU_Result/g though.
> >
> > Anybody else got an opion on those? I personally don't have meaningful
> > feelings on this.  I'm mildly inclined to the renamings suggested
> > above.
>
> What I now have is:
>
> /*
>  * Result codes for table_{update,delete,lock}_tuple, and for visibility
>  * routines inside table AMs.
>  */
> typedef enum TM_Result
> {
> /* Signals that the action succeeded (i.e. update/delete
> performed) */
> TableTupleMayBeModified,
>
> /* The affected tuple wasn't visible to the relevant snapshot */
> TableTupleInvisible,
>
> /* The affected tuple was already modified by the calling backend
> */
> TableTupleSelfModified,
>
> /* The affected tuple was updated by another transaction */
> TableTupleUpdated,
>
> /* The affected tuple was deleted by another transaction */
> TableTupleDeleted,
>
> /*
>  * The affected tuple is currently being modified by another
> session. This
>  * will only be returned if (update/delete/lock)_tuple are
> instructed not
>  * to wait.
>  */
> TableTupleBeingModified,
>
> /* lock couldn't be acquired, action skipped. Only used by
> lock_tuple */
> TableTupleWouldBlock
> } TM_Result;
>

With the above structure, it is easy to understand where and how these
values are used, so +1 for going with this change.



> I'm kinda wondering about replacing the TableTuple prefix with TableMod,
> seems less confusing to me.


Another option: how about just TupleUpdated, etc., dropping the Table
prefix? The enum name also suggests TM (IMO it stands for tuple
modification?).


  I'm also wondering about replacing
> *MayBeModified with *OK.
>

How about TupleModified? I am fine with *OK also.


> Right now I have
>
> typedef enum TM_Result HTSU_Result;
>
> #define HeapTupleMayBeUpdated TableTupleMayBeModified
> #define HeapTupleInvisible TableTupleInvisible
> #define HeapTupleSelfUpdated TableTupleSelfModified
> #define HeapTupleUpdated TableTupleUpdated
> #define HeapTupleDeleted TableTupleDeleted
> #define HeapTupleBeingUpdated TableTupleBeingModified
> #define HeapTupleWouldBlock TableTupleWouldBlock
>
> in heapam.h (whereas the above is in tableam.h), for backward
> compat. But I'm not sure it's worth it.
>

These old macros are mostly used in internal code, and I doubt anyone
depends on them directly. I vote for removing these backward-compatibility
macros.


Regards,
Haribabu Kommi
Fujitsu Australia


Re: [HACKERS] Block level parallel vacuum

2019-03-18 Thread Haribabu Kommi
On Mon, Mar 18, 2019 at 1:58 PM Masahiko Sawada 
wrote:

> On Tue, Feb 26, 2019 at 7:20 PM Masahiko Sawada 
> wrote:
> >
> > On Tue, Feb 26, 2019 at 1:35 PM Haribabu Kommi 
> wrote:
> > >
> > > On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada 
> wrote:
> > >>
> > >> Thank you. Attached the rebased patch.
> > >
> > >
> > > I ran some performance tests to compare the parallelism benefits,
> >
> > Thank you for testing!
> >
> > > but I got some strange results of performance overhead, may be it is
> > > because, I tested it on my laptop.
> >
> > Hmm, I think the parallel vacuum would help for heavy workloads like a
> > big table with multiple indexes. In your test result, all executions
> > are completed within 1 sec, which seems to be one use case that the
> > parallel vacuum wouldn't help. I suspect that the table is small,
> > right? Anyway I'll also do performance tests.
> >
>
> Here is the performance test results. I've setup a 500MB table with
> several indexes and made 10% of table dirty before each vacuum.
> Compared execution time of the patched postgrse with the current HEAD
> (at 'speed_up' column). In my environment,
>
>  indexes | parallel_degree |  patched   |head| speed_up
> -+-+++--
>   0 |   0 |   238.2085 |   244.7625 |   1.0275
>   0 |   1 |   237.7050 |   244.7625 |   1.0297
>   0 |   2 |   238.0390 |   244.7625 |   1.0282
>   0 |   4 |   238.1045 |   244.7625 |   1.0280
>   0 |   8 |   237.8995 |   244.7625 |   1.0288
>   0 |  16 |   237.7775 |   244.7625 |   1.0294
>   1 |   0 |  1328.8590 |  1334.9125 |   1.0046
>   1 |   1 |  1325.9140 |  1334.9125 |   1.0068
>   1 |   2 |  1333.3665 |  1334.9125 |   1.0012
>   1 |   4 |  1329.5205 |  1334.9125 |   1.0041
>   1 |   8 |  1334.2255 |  1334.9125 |   1.0005
>   1 |  16 |  1335.1510 |  1334.9125 |   0.9998
>   2 |   0 |  2426.2905 |  2427.5165 |   1.0005
>   2 |   1 |  1416.0595 |  2427.5165 |   1.7143
>   2 |   2 |  1411.6270 |  2427.5165 |   1.7197
>   2 |   4 |  1411.6490 |  2427.5165 |   1.7196
>   2 |   8 |  1410.1750 |  2427.5165 |   1.7214
>   2 |  16 |  1413.4985 |  2427.5165 |   1.7174
>   4 |   0 |  4622.5060 |  4619.0340 |   0.9992
>   4 |   1 |  2536.8435 |  4619.0340 |   1.8208
>   4 |   2 |  2548.3615 |  4619.0340 |   1.8126
>   4 |   4 |  1467.9655 |  4619.0340 |   3.1466
>   4 |   8 |  1486.3155 |  4619.0340 |   3.1077
>   4 |  16 |  1481.7150 |  4619.0340 |   3.1174
>   8 |   0 |  9039.3810 |  8990.4735 |   0.9946
>   8 |   1 |  4807.5880 |  8990.4735 |   1.8701
>   8 |   2 |  3786.7620 |  8990.4735 |   2.3742
>   8 |   4 |  2924.2205 |  8990.4735 |   3.0745
>   8 |   8 |  2684.2545 |  8990.4735 |   3.3493
>   8 |  16 |  2672.9800 |  8990.4735 |   3.3635
>  16 |   0 | 17821.4715 | 17740.1300 |   0.9954
>  16 |   1 |  9318.3810 | 17740.1300 |   1.9038
>  16 |   2 |  7260.6315 | 17740.1300 |   2.4433
>  16 |   4 |  5538.5225 | 17740.1300 |   3.2030
>  16 |   8 |  5368.5255 | 17740.1300 |   3.3045
>  16 |  16 |  5291.8510 | 17740.1300 |   3.3523
> (36 rows)
>

The performance results are good. Do we want to add the recommended table
size to the documentation for the parallel option? The parallel option on
smaller tables can lead to performance overhead.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-03-18 Thread Haribabu Kommi
On Thu, Feb 28, 2019 at 1:00 PM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: Haribabu Kommi [mailto:kommi.harib...@gmail.com]
> > Attached are the updated patches.
>
> Thanks, all look fixed.
>
>
> > The target_server_type option yet to be implemented.
>
> Please let me review once more and proceed to testing when the above is
> added, to make sure the final code looks good.  I'd like to see how complex
> the if conditions in multiple places would be after adding
> target_server_type, and consider whether we can simplify them together with
> you.  Even now, the if conditions seem complicated to me... that's probably
> due to the existence of read_write_host_index.
>

Yes, the if checks are a little complex because of the additional checks
needed for identification; I will see if there is an easier way to update
them without introducing code duplication.

While working on the implementation of the new target_server_type
connection option for libpq (to specify master, slave, etc.), I found no
problem when the newly added target_server_type option is used on its
own, but when it is combined with the existing target_session_attrs, some
combinations may be invalid, or no server satisfying them may exist:

target_session_attrs    target_server_type

read-write              prefer-slave, slave
prefer-read             master, slave
read-only               master, prefer-slave

I know that some of the above cases are possible, e.g. a master server
that by default accepts read-only sessions. Instead of adding a check to
validate which combinations are allowed, how about allowing all
combinations, and if a combination cannot be satisfied (meaning no server
supports the requirement), the connection simply fails?

Comments?

Also, since the new option needs to work when connecting to servers < 12
as well, it sends the query "select pg_is_in_recovery()" to the server to
find out whether the server is in recovery mode.

Regarding the implementation, the new target_server_type option is
validated separately; the check for the required server is handled in a
separate switch case. When both options are given, we first find the
required server for target_session_attrs and then validate it again
against target_server_type.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Pluggable Storage - Andres's take

2019-03-16 Thread Haribabu Kommi
On Sat, Mar 9, 2019 at 2:13 PM Andres Freund  wrote:

> Hi,
>
> While 0001 is pretty bulky, the interesting bits concentrate on a
> comparatively small area. I'd appreciate if somebody could give the
> comments added in tableam.h a read (both on callbacks, and their
> wrappers, as they have different audiences). It'd make sense to first
> read the commit message, to understand the goal (and I'd obviously also
> appreciate suggestions for improvements there as well).
>
> I'm pretty happy with the current state of the scan patch. I plan to do
> two more passes through it (formatting, comment polishing, etc. I don't
> know of any functional changes needed), and then commit it, lest
> somebody objects.
>

I found a couple of typos in the committed patch; the attached patch
fixes them. I am not sure about one of them, so please double-check it.

I also reviewed the 0002 patch, which is pretty simple and can be
committed.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-table-access-methods-typos-correction.patch
Description: Binary data


Re: current_logfiles not following group access and instead follows log_file_mode permissions

2019-03-15 Thread Haribabu Kommi
On Tue, Mar 12, 2019 at 5:03 PM Michael Paquier  wrote:

> On Tue, Feb 26, 2019 at 12:22:53PM +1100, Haribabu Kommi wrote:
> > I checked the code why the current_logfiles is not implemented as
> > shared memory and found that the current syslogger doesn't attach to
> > the shared memory of the postmaster. To support storing the
> > current_logfiles in shared memory, the syslogger process also needs
> > to attach to the shared memory, this seems to be a new
> > infrastructure change.
>
> I don't think you can do that anyway and we should not do it.  Shared
> memory can be reset after a backend exits abnormally, but the
> syslogger lives across that.  What you sent upthread to improve the
> documentation is in my opinion sufficient:
>
> https://www.postgresql.org/message-id/cajrrpge-v2_lmfd9nhrbejjy3vvokjwy3w_h+fs2nxcjg3p...@mail.gmail.com
>
> I would not have split the paragraph you broke into two, but instead
> just add this part in-between:
> +   
> +Permissions 0640 are recommended to be
> compatible with
> +initdb option
> --allow-group-access.
> +   
> Any objections in doing that?
>

If I remember correctly, in one of the mails you mentioned that having a
separate para is better. Attached is the updated patch as per your
suggestion.

IMO this update is just a recommendation to the user, and it is still
possible for the log file to have stricter permissions even when the data
directory allows group access. So I still feel it is better to update the
permissions of current_logfiles to match the database file permissions
rather than the log file permissions.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-log_file_mode-recommended-value-update_v2.patch
Description: Binary data


Re: pg_basebackup ignores the existing data directory permissions

2019-03-15 Thread Haribabu Kommi
On Fri, Mar 15, 2019 at 10:33 AM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2019-03-09 02:19, Haribabu Kommi wrote:
> > Yes, I agree that it may be a problem if the existing data directory
> > permissions
> > are 0700 to changing it to 0750. But it may not be a problem for the
> > scenarios,
> > where the existing data permissions  >=0750, to the upstream permissions.
> > Because user must need to change anyway to start the server, otherwise
> > server
> > start fails, and also the files inside the data folder follows the
> > permissions of the
> > upstream data directory.
> >
> > usually production systems follows same permissions are of upstream, I
> don't
> > see a problem in following the same for development environment also?
>
> I think the potential problems of getting this wrong are bigger than the
> issue we are trying to fix.
>

Thanks for your opinion. I am not sure exactly what the problems are, but
I can go with your suggestion anyway.

How about changing the data directory permissions only for the -R
scenario, if executing pg_basebackup onto an existing data directory is a
common scenario? Or should we leave it?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Pluggable Storage - Andres's take

2019-03-10 Thread Haribabu Kommi
On Sat, Mar 9, 2019 at 2:18 PM Andres Freund  wrote:

> Hi,
>
> On 2019-03-09 11:03:21 +1100, Haribabu Kommi wrote:
> > Here I attached the rebased patches that I shared earlier. I am adding
> the
> > comments to explain the API's in the code, will share those patches
> later.
>
> I've started to add those for the callbacks in the first commit. I'd
> appreciate a look!
>

Thanks for the updated patches.

+ /*

+ * Callbacks for hon-modifying operations on individual tuples
+ * 

Typo in tableam.h file. hon->non



> I think I'll include the docs patches, sans the callback documentation,
> in the next version. I'll probably merge them into one commit, if that's
> OK with you?
>

OK.
For easier review, I will still maintain 3 or 4 patches instead of the
current patch series.


> > I observed a crash with the latest patch series in the COPY command.
>
> Hm, which version was this? I'd at some point accidentally posted a
> 'tmp' commit that was just a performance hack
>

Yes, the version I checked did have that commit.
Maybe that is the reason for the failure.

Btw, your patches always are attached out of order:
>
> https://www.postgresql.org/message-id/CAJrrPGd%2Brkz54wE-oXRojg4XwC3bcF6bjjRziD%2BXhFup9Q3n2w%40mail.gmail.com
> 10, 1, 3, 4, 2 ...
>

Sorry about that.
I always wondered why the patches were ordered that way when I attached
them to the mail. I thought it might be Gmail behavior, but by
experimenting I found that, when attaching multiple patches, the last
selected patch is given preference and is listed as the first attachment.

I will take care that this problem doesn't happen again.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: pg_basebackup ignores the existing data directory permissions

2019-03-08 Thread Haribabu Kommi
On Fri, Mar 8, 2019 at 11:59 PM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> pg_basebackup copies the data directory permission mode from the
> upstream server.  But it doesn't copy the ownership.  So if say the
> upstream server allows group access and things are owned by
> postgres:postgres, and I want to make a copy for local development and
> make a backup into a directory owned by peter:staff without group
> access, then it would be inappropriate for pg_basebackup to change the
> permissions on that directory.
>

Yes, I agree that it may be a problem if the existing data directory
permissions are 0700 and we change them to 0750. But it may not be a
problem to follow the upstream permissions in the scenarios where the
existing data directory permissions are >= 0750, because the user must
change them anyway to start the server (otherwise server startup fails),
and the files inside the data folder already follow the permissions of
the upstream data directory.

Production systems usually follow the same permissions as upstream; I
don't see a problem in following the same for development environments.

Comments?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Pluggable Storage - Andres's take

2019-03-08 Thread Haribabu Kommi
On Thu, Mar 7, 2019 at 6:33 AM Andres Freund  wrote:

> Hi,
>
> On 2019-03-05 23:07:21 -0800, Andres Freund wrote:
> > My next steps are:
> > - final polish & push the basic DDL and pg_dump patches
>
> Done and pushed. Some collation dependent fallout, I'm hoping I've just
> fixed that.
>

Thanks for the corrections to the things I missed, and for the extra
changes as well.

Attached are the rebased patches that I shared earlier. I am adding
comments to explain the APIs in the code and will share those patches
later.

I observed a crash with the latest patch series in the COPY command. I am
not sure whether the problem is with the reduce-tableOid patch; I will
check and correct the problem.

Regards,
Haribabu Kommi
Fujitsu Australia


0010-Table-access-method-API-explanation.patch
Description: Binary data


0001-Reduce-the-use-of-HeapTuple-t_tableOid.patch
Description: Binary data


0003-Docs-of-default_table_access_method-GUC.patch
Description: Binary data


0004-Rename-indexam.sgml-to-am.sgml.patch
Description: Binary data


0002-Removal-of-scan_update_snapshot-callback.patch
Description: Binary data


0005-Reorganize-am-as-both-table-and-index.patch
Description: Binary data


0006-Doc-update-of-Create-access-method-type-table.patch
Description: Binary data


0009-Doc-of-CREATE-TABLE-AS-.-USING-syntax.patch
Description: Binary data


0007-Doc-update-of-create-materialized-view-.-USING-synta.patch
Description: Binary data


0008-Doc-update-of-CREATE-TABLE-.-USING-syntax.patch
Description: Binary data


Re: [bug fix] Produce a crash dump before main() on Windows

2019-03-04 Thread Haribabu Kommi
On Mon, Mar 4, 2019 at 3:23 PM Kyotaro HORIGUCHI <
horiguchi.kyot...@lab.ntt.co.jp> wrote:

> Hello.
>
> At Tue, 6 Nov 2018 15:53:37 +1100, Haribabu Kommi <
> kommi.harib...@gmail.com> wrote in <
> cajrrpgcxzi4z_sttnuuvyoaw+sadk7+cjgypuf7ao43vujl...@mail.gmail.com>
> > Thanks for confirmation of that PostgreSQL runs as service.
> >
> > Based on the following details, we can decide whether this fix is
> required
> > or not.
> > 1. Starting of Postgres server using pg_ctl without service is of
> > production use or not?
> > 2. Without this fix, how difficult is the problem to find out that
> > something is preventing the
> > server to start? In case if it is easy to find out, may be better to
> > provide some troubleshoot
> > guide for windows users can help.
> >
> > I am in favor of doc fix if it easy to find the problem instead of
> assuming
> > the user usage.
>
> I don't have an idea about which is better behavior, but does
> this work for you?
>
>
> https://docs.microsoft.com/en-us/windows/desktop/wer/collecting-user-mode-dumps
>

No, this option does not generate local dumps for PostgreSQL, but it is
useful for generating dumps for applications that don't have specific WER
reporting.



> > These dumps are configured and controlled independently of the
> > rest of the WER infrastructure. You can make use of the local
> > dump collection even if WER is disabled or if the user cancels
> > WER reporting. The local dump can be different than the dump sent
> > to Microsoft.
>
> I found that symantec offers this in the steps for error
> reporting of its products.
>

Adding some doc changes so that users can look up what the issue on Windows
may be if the PostgreSQL server doesn't start and there is no core file
available.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Inheriting table AMs for partitioned tables

2019-03-04 Thread Haribabu Kommi
On Tue, Mar 5, 2019 at 10:47 AM Andres Freund  wrote:

> Hi,
>
> In the pluggable storage patch [1], one thing that I'm wondering about
> is how exactly to inherit the storage AM across partitions. I think
> that's potentially worthy of a discussion with a wider audience than I'd
> get in that thread.  It seems also related to the recent discussion in [2]
>
> Consider (excerpted from the tests):
>
> CREATE TABLE tableam_parted_heap2 (a text, b int) PARTITION BY list (a)
> USING heap2;
>
> SET default_table_access_method = 'heap';
> CREATE TABLE tableam_parted_a_heap2 PARTITION OF tableam_parted_heap2 FOR
> VALUES IN ('a');
>
> SET default_table_access_method = 'heap2';
> CREATE TABLE tableam_parted_b_heap2 PARTITION OF tableam_parted_heap2 FOR
> VALUES IN ('b');
>
> CREATE TABLE tableam_parted_c_heap2 PARTITION OF tableam_parted_heap2 FOR
> VALUES IN ('c') USING heap;
> CREATE TABLE tableam_parted_d_heap2 PARTITION OF tableam_parted_heap2 FOR
> VALUES IN ('d') USING heap2;
>
> It seems pretty clear that tableam_parted_heap2, tableam_parted_d_heap2
> would be stored via heap2, and tableam_parted_c_heap2 via heap.
>
> But for tableam_parted_a_heap2 tableam_parted_b_heap2 the answer isn't
> quite as clear.  I think it'd both be sensible for new partitions to
> inherit the AM from the root, but it'd also be sensible to use the
> current default.
>
> Out of laziness (it's how it works rn) I'm inclined to to go with using
> the current default, but I'd be curious if others disagree.
>


As others have said, I also agree that default_table_access_method should be
preferred when the access method is not explicitly specified during table
creation.

This discussion raises the point that if the user wants to change the access
method of a table after it is created, there is currently no option for
that. At the moment there are no alternative table access methods available
for the user to switch to, but that will certainly be needed later.

I will provide a patch to alter the access method of a table for v13.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-03-01 Thread Haribabu Kommi
On Sat, Mar 2, 2019 at 7:27 AM Robert Haas  wrote:

> On Thu, Feb 7, 2019 at 3:28 AM Masahiko Sawada 
> wrote:
> > WAL encryption will follow as an additional feature.
>
> I don't think WAL encryption is an optional feature.  You can argue
> about whether it's useful to encrypt the disk files in the first place
> given that there's no privilege boundary between the OS user and the
> database, but a lot of people seem to think it is and maybe they're
> right.  However, who can justify encrypting only SOME of the disk
> files and not others?  I mean, maybe you could justify not encryption
> the SLRU files on the grounds that they probably don't leak much in
> the way of interesting information, but the WAL files certainly do --
> your data is there, just as much as in the data files themselves.
>

+1

WAL encryption is not optional; the WAL must be encrypted.



> To be honest, I think there is a lot to like about the patches
> Cybertec has proposed.  Those patches don't have all of the fancy
> key-management stuff that you are proposing here, but maybe that
> stuff, if we want it, could be added, rather than starting over from
> scratch.  It seems to me that those patches get a lot of things right.
> In particular, it looked to me when I looked at them like they made a
> pretty determined effort to encrypt every byte that might go down to
> the disk.  It seems to me that you if you want encryption, you want
> that.
>


The patches proposed by Cybertec do the encryption at the instance level;
AFAIK, the current discussion is trying to reduce the scope of the
encryption to the object level (tablespace, database or table) to avoid the
encryption performance impact for databases and tables that don't need it.

IMO, the overall performance of encryption depends on WAL encryption,
whether we choose instance-level or object-level encryption.

As WAL encryption is not optional, even if encryption is set at the object
level, the WAL of the corresponding objects needs to be encrypted. This
should be done at WAL insertion, not when WAL is written to disk, because
some entries would be encrypted and some not. Alternatively, we could
encrypt the entire WAL even if only one object is set for encryption.
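
To make the point concrete, here is a minimal sketch of the
encrypt-at-insertion idea. The helper names (tde_object_is_encrypted,
tde_encrypt) are hypothetical and not from any posted patch; only
XLogRecData and RelFileNode are the usual PostgreSQL types.

/* Hypothetical helpers -- not part of any posted patch. */
extern bool tde_object_is_encrypted(const RelFileNode *rnode);
extern void tde_encrypt(char *data, uint32 len);

/*
 * Sketch: encrypt a record's payload chain at WAL-insertion time, so
 * plaintext for opted-in objects never reaches the WAL buffers.
 * Encrypting at XLogWrite() time instead would have to cope with
 * encrypted and plaintext records mixed in a single write.
 */
static void
tde_maybe_encrypt_record(XLogRecData *rdata, const RelFileNode *rnode)
{
	if (rnode == NULL || !tde_object_is_encrypted(rnode))
		return;

	for (; rdata != NULL; rdata = rdata->next)
		tde_encrypt(rdata->data, rdata->len);
}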


Regards,
Haribabu Kommi
Fujitsu Australia


Re: Drop type "smgr"?

2019-02-27 Thread Haribabu Kommi
On Thu, Feb 28, 2019 at 5:37 PM Tom Lane  wrote:

> Thomas Munro  writes:
> > On Thu, Feb 28, 2019 at 7:08 PM Tom Lane  wrote:
> >> I agree that smgrtype as it stands is pretty pointless, but what
> >> will we be using instead to get to those other implementations?
>
> > Our current thinking is that smgropen() should know how to map a small
> > number of special database OIDs to different smgr implementations
>
> Hmm.  Maybe mapping based on tablespaces would be a better idea?
>

Thanks for bringing up this idea of multiple smgr implementations. I have
also thought of implementing our own smgr to support transparent data
encryption on disk, based on a tablespace mapping.
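
As a rough illustration of that tablespace-based dispatch, a sketch inside
smgr.c with hypothetical names (tablespace_uses_tde, smgr_tde_ops); today
smgr.c only has a single-entry table of md.c callbacks:

/* Hypothetical lookup and callback tables -- they do not exist today. */
extern bool tablespace_uses_tde(Oid spcNode);
extern const f_smgr smgr_tde_ops;	/* encrypting storage manager */
extern const f_smgr smgr_md_ops;	/* plain md.c storage manager */

/* Sketch: smgropen() could pick the implementation per tablespace. */
static const f_smgr *
smgr_pick(RelFileNode rnode)
{
	if (tablespace_uses_tde(rnode.spcNode))
		return &smgr_tde_ops;
	return &smgr_md_ops;
}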

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-02-26 Thread Haribabu Kommi
On Mon, Feb 25, 2019 at 11:38 AM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> Hi Hari-san,
>
> I've reviewed all files.  I think I'll proceed to testing when I've
> reviewed the revised patch and the patch for target_server_type.
>
>
Thanks for the review.


>
> (1) patch 0001
> CONNECTION_CHECK_WRITABLE,  /* Check if we could make a
> writable
>  *
> connection. */
> +   CONNECTION_CHECK_TARGET,/* Check if we have a proper target
> +*
> connection */
> CONNECTION_CONSUME  /* Wait for any pending
> message and consume
>  * them. */
>
> According to the following comment, a new enum value should be added at
> the end.
>
> /*
>  * Although it is okay to add to these lists, values which become unused
>  * should never be removed, nor should constants be redefined - that would
>  * break compatibility with existing code.
>  */
>

fixed.


>
>
> (2) patch 0002
> It seems to align better with the existing code to set the default value
> to pg_conn.requested_session_type explicitly in makeEmptyPGconn(), even if
> the default value is 0.  Although I feel it's redundant, other member
> variables do so.
>
>
fixed.


>
> (3) patch 0003
>  IntervalStyle was not reported by releases before
> 8.4;
> -application_name was not reported by releases
> before 9.0.)
> +application_name was not reported by releases
> before 9.0
> +transaction_read_only was not reported by releases
> before 12.0.)
>
> ";" is missing at the end of the third line.
>
>
fixed.


>
> (4) patch 0004
> -   /* Type of connection to make.  Possible values: any, read-write.
> */
> +   /* Type of connection to make.  Possible values: any, read-write,
> perfer-read. */
> char   *target_session_attrs;
>
> perfer -> prefer
>
>
fixed.


>
> (5) patch 0004
> @@ -3608,6 +3691,9 @@ makeEmptyPGconn(void)
> conn = NULL;
> }
>
> +   /* Initial value */
> +   conn->read_write_host_index = -1;
>
> The new member should be initialized earlier in this function.  Otherwise,
> as you can see in the above fragment, conn can be NULL in an out-of-memory
> case.
>
>
fixed.



>
> (6) patch 0004
> Don't we add read-only as well as prefer-read, which corresponds to Slave
> or Secondary of PgJDBC's targetServerType?  I thought the original proposal
> was to add both.
>
>
Added read-only option.



>
> (7) patch 0004
> @@ -2347,6 +2367,7 @@ keep_going:
>  /* We will come back to here until there is
>
> conn->try_next_addr = true;
> goto keep_going;
> }
> +
>
appendPQExpBuffer(&conn->errorMessage,
>
> libpq_gettext("could not create socket: %s\n"),
>
> Is this an unintended newline addition?
>
>
removed.


>
> (8) patch 0004
> +   const char *type =
> (conn->requested_session_type == SESSION_TYPE_PREFER_READ) ?
> +
>  "read-only" : "writable";
>
> I'm afraid these strings are not translatable into other languages.
>
>
OK. I added two separate prepared error messages, one for read-write and one
for read-only, so it shouldn't be a problem.



>
> (9) patch 0004
> +   /* Record read-write host
> index */
> +   if
> (conn->read_write_host_index == -1)
> +
>  conn->read_write_host_index = conn->whichhost;
>
> At this point, the session can be either read-write or read-only, can't
> it?  Add the check "!conn->transaction_read_only" in this if condition?
>

Yes, fixed.


>
> (10) patch 0004
> +   if ((conn->target_session_attrs != NULL) &&
> +  (conn->requested_session_type
> == SESSION_TYPE_PREFER_READ) &&
> +  (conn->read_write_host_index !=
> -2))
>
> The first condition is not necessary because the second one exists.
>
> The parenthesis surrounding each of these conditions are redundant.  It
> would be better to remove them for readability.  This also applies to the
> following part:
>
> +   if (((strncmp(val, "on", 2) == 0)
> &&

Re: Pluggable Storage - Andres's take

2019-02-26 Thread Haribabu Kommi
On Wed, Feb 27, 2019 at 11:10 AM Andres Freund  wrote:

> Hi,
>
> On 2019-01-21 10:32:37 +1100, Haribabu Kommi wrote:
> > I am not able to remove the complete t_tableOid from HeapTuple,
> > because of its use in triggers, as the slot is not available in triggers
> > and I need to store the tableOid also as part of the tuple.
>
> What precisely do you man by "use in triggers"? You mean that a trigger
> might access a HeapTuple's t_tableOid directly, even though all of the
> information is available in the trigger context?
>

I forgot the exact scenario, but during trigger function execution, the
pl/pgsql executor accesses TableOidAttributeNumber of the stored tuple using
a heap_get* function. Because of the lack of slot support in triggers, we
still need to maintain t_tableOid with the proper OID. The heap tuple's
t_tableOid member is updated whenever the heap tuple is generated from a
slot.
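
A minimal sketch of the pattern I mean, assuming the patch series'
tts_tableOid slot field and ExecCopySlotHeapTuple(); the real call sites
differ, but the idea is:

/*
 * Sketch: when materializing a HeapTuple from a slot, carry the table
 * OID along so trigger code that fetches TableOidAttributeNumber via
 * heap_getsysattr() still sees a valid value.
 */
static HeapTuple
tuple_from_slot(TupleTableSlot *slot)
{
	HeapTuple	tuple = ExecCopySlotHeapTuple(slot);

	tuple->t_tableOid = slot->tts_tableOid;
	return tuple;
}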

Regards,
Haribabu Kommi
Fujitsu Australia


Re: current_logfiles not following group access and instead follows log_file_mode permissions

2019-02-25 Thread Haribabu Kommi
On Mon, Feb 4, 2019 at 12:16 PM Haribabu Kommi 
wrote:

>
> And regarding current_logfiles permissions, I feel this file should have
> permissions of data directory files as it is present in the data directory
> whether it stores the information of log file, until this file is
> completely
> removed with another approach to store the log file details.
>
> I am not sure whether this has been already discussed or not? How about
> using shared memory to store the log file names? So that we don't need
> of this file?
>

I checked the code to see why current_logfiles is not implemented in shared
memory, and found that the syslogger currently doesn't attach to the
postmaster's shared memory. To store current_logfiles in shared memory, the
syslogger process would also need to attach to shared memory, which seems to
require new infrastructure.

In case we are not going to change the permissions of the file to group
access mode and instead stick strictly with log_file_mode, I tried the
attached patch that moves the current_logfiles file into log_directory. The
only drawback of this approach is that if the user changes log_directory,
the current_logfiles file remains in the old log_directory. I don't see that
as a problem.

comments?

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Move-the-current_logfiles-file-into-log_directory.patch
Description: Binary data


Re: [HACKERS] Block level parallel vacuum

2019-02-23 Thread Haribabu Kommi
On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada 
wrote:

> On Wed, Feb 13, 2019 at 9:32 PM Haribabu Kommi 
> wrote:
> >
> >
> > On Sat, Feb 9, 2019 at 11:47 PM Masahiko Sawada 
> wrote:
> >>
> >> On Tue, Feb 5, 2019 at 12:14 PM Haribabu Kommi <
> kommi.harib...@gmail.com> wrote:
> >> >
> >> >
> >> > On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada 
> wrote:
> >> >>
> >> >>
> >> >> The passing stats = NULL to amvacuumcleanup and ambulkdelete means
> the
> >> >> first time execution. For example, btvacuumcleanup skips cleanup if
> >> >> it's not NULL.In the normal vacuum we pass NULL to ambulkdelete or
> >> >> amvacuumcleanup when the first time calling. And they store the
> result
> >> >> stats to the memory allocated int the local memory. Therefore in the
> >> >> parallel vacuum I think that both worker and leader need to move it
> to
> >> >> the shared memory and mark it as updated as different worker could
> >> >> vacuum different indexes at the next time.
> >> >
> >> >
> >> > OK, understood the point. But for btbulkdelete whenever the stats are
> NULL,
> >> > it allocates the memory. So I don't see a problem with it.
> >> >
> >> > The only problem is with btvacuumcleanup, when there are no dead
> tuples
> >> > present in the table, the btbulkdelete is not called and directly the
> btvacuumcleanup
> >> > is called at the end of vacuum, in that scenario, there is code flow
> difference
> >> > based on the stats. so why can't we use the deadtuples number to
> differentiate
> >> > instead of adding another flag?
> >>
> >> I don't understand your suggestion. What do we compare deadtuples
> >> number to? Could you elaborate on that please?
> >
> >
> > The scenario where the stats should pass NULL to btvacuumcleanup
> function is
> > when there no dead tuples, I just think that we may use that deadtuples
> structure
> > to find out whether stats should pass NULL or not while avoiding the
> extra
> > memcpy.
> >
>
> Thank you for your explanation. I understood. Maybe I'm worrying too
> much but I'm concernced compatibility; currently we handle indexes
> individually. So if there is an index access method whose ambulkdelete
> returns NULL at the first call but returns a palloc'd struct at the
> second time or other, that doesn't work fine.
>
> The documentation says that passed-in 'stats' is NULL at the first
> time call of ambulkdelete but doesn't say about the second time or
> more. Index access methods may expect that the passed-in 'stats'  is
> the same as what they has returned last time. So I think to add an
> extra flag for keeping comptibility.
>

I checked some of the ambulkdelete functions, and they do not return NULL
whenever they are called; however, the palloc'd structure doesn't always get
filled with the details.

IMO, no extra code should be needed for parallel vacuum compared to normal
vacuum.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: pg_basebackup ignores the existing data directory permissions

2019-02-22 Thread Haribabu Kommi
On Wed, Feb 20, 2019 at 7:40 PM Magnus Hagander  wrote:

>
> On Wed, Feb 20, 2019 at 5:17 AM Haribabu Kommi 
> wrote:
>
>>
>> On Fri, Feb 15, 2019 at 10:15 AM Michael Paquier 
>> wrote:
>>
>>> On Thu, Feb 14, 2019 at 11:21:19PM +1100, Haribabu Kommi wrote:
>>> > On Thu, Feb 14, 2019 at 8:57 PM Magnus Hagander 
>>> wrote:
>>> >> I think it could be argued that neither initdb *or* pg_basebackup
>>> should
>>> >> change the permissions on an existing directory, because the admin
>>> may have
>>> >> done that intentionally. But when they do create the directory, they
>>> should
>>> >> follow the same patterns.
>>> >
>>> > Hmm, even if the administrator set some specific permissions to the
>>> data
>>> > directory, PostgreSQL server doesn't allow server to start if the
>>> > permissions are not (0700) for versions less than 11 and (0700 or
>>> > 0750) for version 11 or later.
>>>
>>> Yes, particularly with pg_basebackup -R this adds an extra step in the
>>> user flow.
>>>
>>
> Perhaps we should make the enforcement of permissions conditional on -R?
> OTOH that's documented as "write recovery.conf", but we could change that
> to be "prepare for replication" or something?
>

Yes, the enforcement of permissions can be done only when -R option is
provided.
The documentation is changed in v12 already as "write configuration for
replication".


>
> > To let the user to use the PostgreSQL server, user must change the
>>> > permissions of the data directory. So, I don't see a problem in
>>> > changing the permissions by these tools.
>>>
>>> I certainly agree with the point of Magnus that both tools should
>>> behave consistently, and I cannot actually imagine why it would be
>>> useful for an admin to keep a more permissive data folder while all
>>> the contents already have umasks set at the same level as the primary
>>> (or what initdb has been told to use), but perhaps I lack imagination.
>>> If we doubt about potential user impact, the usual, best, answer is to
>>> let back-branches behave the way they do now, and only do something on
>>> HEAD.
>>>
>>
>> I also agree that both inidb and pg_basebackup should behave same.
>> Our main concern is that standby data directory that doesn't follow
>> the primary data directory permissions can lead failures when the standby
>> gets promoted.
>>
>
> I don't think that follows at all. There are many scenarios where you'd
> want the standby to have different permissions than the primary.
>

I am really having a hard time understanding how different permissions would
be possible. I imagine the standby having more restrictive permissions,
perhaps when the standby is not a hot standby?

Can you please provide some more details?

Up to v11, PostgreSQL requires the data directory permissions to be 0700;
otherwise server start fails, even if pg_basebackup succeeded. In my testing
I found that the server also starts with data directory permissions more
restrictive than 0700, e.g. 0300. I think the data directory validation only
checks whether any group (or world) permissions are present, but not whether
the current owner has all permissions. Is this the scenario you are
expecting?
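
For reference, a simplified paraphrase of the server-side check (see
checkDataDir() in postmaster.c) is below; it is not the exact source, but it
shows that only extra group/world bits are rejected and nothing verifies
that the owner bits are all present, which would explain the 0300 case:

	struct stat st;

	if (stat(DataDir, &st) != 0)
		ereport(FATAL,
				(errmsg("could not stat data directory \"%s\"", DataDir)));

	/*
	 * v11+: group write or any world bits are rejected (pre-11 rejects
	 * any group/world bits at all), but nothing requires S_IRWXU to be
	 * fully set -- so a 0300 directory passes and the server starts.
	 */
	if (st.st_mode & PG_MODE_MASK_GROUP)
		ereport(FATAL,
				(errmsg("data directory \"%s\" has invalid permissions",
						DataDir)));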



> And I'm not sure it's our business to enforce that. A much much more
> common mistake people make is run pg_basebackup as the wrong user, thereby
> getting the wrong owner of all files. But that doesn't mean we should
> enforce the owner/group of the files.
>

I didn't understand this point clearly either. All the database files become
associated with the system user who executes the pg_basebackup command, and
if that user doesn't have permission to access the existing data folder, the
backup fails.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Removal of duplicate variable declarations in fe-connect.c

2019-02-21 Thread Haribabu Kommi
On Fri, Feb 22, 2019 at 3:22 PM Michael Paquier  wrote:

> On Fri, Feb 22, 2019 at 11:33:17AM +1100, Haribabu Kommi wrote:
> > During the development of another feature, I found that same local
> > variables are declared twice.
> > IMO, there is no need of again declaring the local variables. Patch
> > attached.
>
> Indeed, fixed.  That's not a good practice, and each variable is
> assigned in its own block before getting used, so there is no
> overlap.
>

Thanks.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-02-21 Thread Haribabu Kommi
On Fri, Feb 22, 2019 at 5:47 PM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: Haribabu Kommi [mailto:kommi.harib...@gmail.com]
> Here I attached first set of patches that implemented the prefer-read
> option
> > after reporting the transaction_read_only GUC to client. Along the lines
> > of adding prefer-read option patch,
>
> Great, thank you!  I'll review and test it.
>

Thanks.


> > 3. Existing read-write code is modified to use the new reported GUC
> instead
> > of executing the show command.
>
> Is the code path left to use SHOW for older servers?
>

Yes, for server versions older than 12 it still uses the SHOW command
approach.
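
A sketch of that version gate; in the real patch this is driven through the
connection state machine rather than a blocking call:

	/*
	 * Servers before v12 do not report transaction_read_only via
	 * GUC_REPORT, so fall back to asking explicitly.
	 */
	if (PQserverVersion(conn) < 120000)
		PQsendQuery(conn, "SHOW transaction_read_only");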

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-02-21 Thread Haribabu Kommi
On Thu, Feb 14, 2019 at 1:04 PM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: Haribabu Kommi [mailto:kommi.harib...@gmail.com]
> >   No.  It's not good if the user has to be bothered by
> > default_transaction_read_only when he simply wants to a standby.
> >
> >
> >
> > OK. Understood.
> > so if we are going to differentiate between readonly and standby types,
> > then I still
> > feel that adding a prefer-read to target_session_attrs is still valid
> > improvement.
>
> I agree that it's valid improvement to add prefer-read to
> target_session_attr, as a means to "get a read-only session."
>
> > But the above improvement can be enhanced once the base work of
> GUC_REPORT
> > is finished.
>
> Is it already in progress in some thread, or are you trying to start from
> scratch?  (I may have done it, but I don't remember it well...)
>
> > Yes, I want to work on this patch, hopefully by next commitfest. In case
> > if I didn't get time,
> > I can ask for your help.
>
> I'm glad to hear that.  Sure.  I'd like to review your patch, and possibly
> add/modify code if necessary.  Are you going to add
> target_server_type={primary | standby | prefer_standby} as well as add
> prefer-read to target_session_attr?
>
>
> >   (I wonder which of server_type or server_role feels natural in
> > English.)
> >
> >
> >
> > server_type may be good as it stands with connection option
> > (target_server_type).
>
> Thanks, agreed.  That also follows PgJDBC's targetServerType.
>

Here I attached the first set of patches, which implement the prefer-read
option after reporting the transaction_read_only GUC to the client. Along
with adding the prefer-read option patch:

1. I refactored the existing code to reduce duplication.
2. Added an enum to represent the user-requested target_session_attrs type;
this is used in comparisons instead of doing a strcmp every time (see the
sketch below).
3. The existing read-write code is modified to use the newly reported GUC
instead of executing the SHOW command.
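
A sketch of the enum from patch 0002; SESSION_TYPE_PREFER_READ and
conn->requested_session_type are names quoted in the review of these
patches, while the remaining members here are assumed:

typedef enum
{
	SESSION_TYPE_ANY = 0,		/* no preference (default) */
	SESSION_TYPE_READ_WRITE,	/* target_session_attrs=read-write */
	SESSION_TYPE_READ_ONLY,		/* target_session_attrs=read-only */
	SESSION_TYPE_PREFER_READ	/* target_session_attrs=prefer-read */
} TargetSessionAttrsType;

/*
 * Parsed once from the connection string into
 * conn->requested_session_type, so connection-state checks compare an
 * integer instead of calling strcmp() each time.
 */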

The basic patches are working; some documentation work may still be needed.

Next I will add another parameter, target_server_type, to choose primary,
standby or prefer-standby, as discussed upthread, together with a new GUC
variable.

Regards,
Haribabu Kommi
Fujitsu Australia


0004-New-prefer-read-target_session_attrs-type.patch
Description: Binary data


0001-Restructure-the-code-to-remove-duplicate-code.patch
Description: Binary data


0002-New-TargetSessionAttrsType-enum.patch
Description: Binary data


0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch
Description: Binary data


Removal of duplicate variable declarations in fe-connect.c

2019-02-21 Thread Haribabu Kommi
Hi Hackers,

During the development of another feature, I found that the same local
variables are declared twice. IMO, there is no need to declare the local
variables again. Patch attached.
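
For illustration only (this is not the actual fe-connect.c hunk), the
pattern looks like the following: an inner block re-declares a name the
enclosing scope already declared, shadowing it for no benefit.

#include <stdio.h>

static void
report_host(const char *host, const char *port)
{
	const char *displayed_host = host;

	if (port != NULL)
	{
		const char *displayed_host = host;	/* duplicate declaration */

		printf("%s:%s\n", displayed_host, port);
		return;
	}

	printf("%s\n", displayed_host);
}

int
main(void)
{
	report_host("localhost", "5432");
	report_host("localhost", NULL);
	return 0;
}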

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Removal-of-duplicate-local-variable-declaration.patch
Description: Binary data


Re: pg_basebackup ignores the existing data directory permissions

2019-02-19 Thread Haribabu Kommi
On Fri, Feb 15, 2019 at 10:15 AM Michael Paquier 
wrote:

> On Thu, Feb 14, 2019 at 11:21:19PM +1100, Haribabu Kommi wrote:
> > On Thu, Feb 14, 2019 at 8:57 PM Magnus Hagander 
> wrote:
> >> I think it could be argued that neither initdb *or* pg_basebackup should
> >> change the permissions on an existing directory, because the admin may
> have
> >> done that intentionally. But when they do create the directory, they
> should
> >> follow the same patterns.
> >
> > Hmm, even if the administrator set some specific permissions to the data
> > directory, PostgreSQL server doesn't allow server to start if the
> > permissions are not (0700) for versions less than 11 and (0700 or
> > 0750) for version 11 or later.
>
> Yes, particularly with pg_basebackup -R this adds an extra step in the
> user flow.
>
> > To let the user to use the PostgreSQL server, user must change the
> > permissions of the data directory. So, I don't see a problem in
> > changing the permissions by these tools.
>
> I certainly agree with the point of Magnus that both tools should
> behave consistently, and I cannot actually imagine why it would be
> useful for an admin to keep a more permissive data folder while all
> the contents already have umasks set at the same level as the primary
> (or what initdb has been told to use), but perhaps I lack imagination.
> If we doubt about potential user impact, the usual, best, answer is to
> let back-branches behave the way they do now, and only do something on
> HEAD.
>

I also agree that both initdb and pg_basebackup should behave the same.
Our main concern is that a standby data directory that doesn't follow the
primary data directory's permissions can lead to failures when the standby
gets promoted.

Given the lack of complaints from users, how about making this change only
in HEAD?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Pluggable Storage - Andres's take

2019-02-19 Thread Haribabu Kommi
On Tue, Nov 27, 2018 at 4:59 PM Amit Langote 
wrote:

> Hi,
>
> On 2018/11/02 9:17, Haribabu Kommi wrote:
> > Here I attached the cumulative fixes of the patches, new API additions
> for
> > zheap and
> > basic outline of the documentation.
>
> I've read the documentation patch while also looking at the code and here
> are some comments.
>

Thanks for the review, and apologies for the delay.
I have taken care of most of your comments in the latest version of the doc
patches.


>
> +  
> +
> +TupleTableSlotOps *
> +slot_callbacks (Relation relation);
> +
> +   API to access the slot specific methods;
> +   Following methods are available;
> +   TTSOpsVirtual,
> +   TTSOpsHeapTuple,
> +   TTSOpsMinimalTuple,
> +   TTSOpsBufferTuple,
> +  
>
> Unless I'm misunderstanding what the TupleTableSlotOps abstraction is or
> its relations to the TableAmRoutine abstraction, I think the text
> description could better be written as:
>
> "API to get the slot operations struct for a given table access method"
>
> It's not clear to me why various TTSOps* structs are listed here?  Is the
> point that different AMs may choose one of the listed alternatives?  For
> example, I see that heap AM implementation returns TTOpsBufferTuple, so it
> manipulates slots containing buffered tuples, right?  Other AMs are free
> to return any one of these?  For example, some AMs may never use buffer
> manager and hence not use TTOpsBufferTuple.  Is that understanding correct?
>

Yes, an AM can decide which type of slot method it wants to use.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Pluggable Storage - Andres's take

2019-02-19 Thread Haribabu Kommi
On Mon, Feb 4, 2019 at 2:31 PM Haribabu Kommi 
wrote:

>
> On Tue, Jan 22, 2019 at 1:43 PM Haribabu Kommi 
> wrote:
>
>>
>>
>> OK. I will work on the doc changes.
>>
>
> Sorry for the delay.
>
> Attached a draft patch of doc and comments changes that I worked upon.
> Currently I added comments to the callbacks that are present in the
> TableAmRoutine
> structure and I copied it into the docs. I am not sure whether it is a
> good approach or not?
> I am yet to add description for the each parameter of the callbacks for
> easier understanding.
>
> Or, Giving description of each callbacks in the docs with division of
> those callbacks
> according to them divided in the TableAmRoutine structure? Currently
> following divisions
> are available.
> 1. Table scan
> 2. Parallel table scan
> 3. Index scan
> 4. Manipulation of physical tuples
> 5. Non-modifying operations on individual tuples
> 6. DDL
> 7. Planner
> 8. Executor
>
> Suggestions?
>

Here I attached the doc patches for pluggable storage. I divided the APIs
into the above-specified groups and explained them in the docs. I can add
further details if the approach seems fine.

Regards,
Haribabu Kommi
Fujitsu Australia


0008-Table-access-method-API-explanation.patch
Description: Binary data


0001-Docs-of-default_table_access_method-GUC.patch
Description: Binary data


0002-Rename-indexam.sgml-to-am.sgml.patch
Description: Binary data


0003-Reorganize-am-as-both-table-and-index.patch
Description: Binary data


0004-Doc-update-of-Create-access-method-type-table.patch
Description: Binary data


0005-Doc-update-of-create-materialized-view-.-USING-synta.patch
Description: Binary data


0006-Doc-update-of-CREATE-TABLE-.-USING-syntax.patch
Description: Binary data


0007-Doc-of-CREATE-TABLE-AS-.-USING-syntax.patch
Description: Binary data


Re: pg_basebackup ignores the existing data directory permissions

2019-02-14 Thread Haribabu Kommi
On Thu, Feb 14, 2019 at 8:57 PM Magnus Hagander  wrote:

> On Thu, Feb 14, 2019 at 9:10 AM Michael Paquier 
> wrote:
>
>> On Thu, Feb 14, 2019 at 06:34:07PM +1100, Haribabu Kommi wrote:
>> > we have an application that is used to create the data directory with
>>
>> Well, initdb would do that happily, so there is no actual any need to
>> do that to begin with.  Anyway..
>>
>> > owner access (0700), but with initdb group permissions option, it
>> > automatically
>> > converts to (0750) by the initdb. But pg_basebackup doesn't change it
>> when
>> > it tries to do a backup from a group access server.
>>
>> So that's basically the opposite of the case I was thinking about,
>> where you create a path for a base backup with permissions strictly
>> higher than 700, say 755, and the base backup path does not have
>> enough restrictions.  And in your case the permissions are too
>> restrictive because of the application creating the folder itself but
>> they should be relaxed if group access is enabled.  Actually, that's
>> something that we may want to do consistently across all branches.  If
>> an application calls pg_basebackup after creating a path, they most
>> likely change the permissions anyway to allow the postmaster to
>> start.
>>
>
> I think it could be argued that neither initdb *or* pg_basebackup should
> change the permissions on an existing directory, because the admin may have
> done that intentionally. But when they do create the directory, they should
> follow the same patterns.
>

Hmm, even if the administrator sets some specific permissions on the data
directory, the PostgreSQL server doesn't allow the server to start if the
permissions are not 0700 for versions before 11, or 0700/0750 for version 11
or later.

To use the PostgreSQL server at all, the user must change the permissions of
the data directory. So, I don't see a problem in these tools changing the
permissions.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: pg_basebackup ignores the existing data directory permissions

2019-02-13 Thread Haribabu Kommi
On Thu, Feb 14, 2019 at 3:04 PM Michael Paquier  wrote:

> On Wed, Feb 13, 2019 at 06:42:36PM +1100, Haribabu Kommi wrote:
> > This should back-patch till 11 where the group access is introduced.
> > Because of lack of complaints, I agree with you that there is no need of
> > further back-patch.
>
> I am confused by the link with group access.


Apologies for confusing you by linking it with group access. This patch
doesn't interact with group access. From v11 onwards, the PostgreSQL server
accepts two sets of permissions for the data directory because of group
access.

We have an application that creates the data directory with owner-only
access (0700), but with initdb's group permissions option, initdb
automatically converts it to 0750. However, pg_basebackup doesn't change it
when it takes a backup from a group-access server.


> The patch you are
> sending is compatible down to v11, but we could also do it further
> down by just using chmod with S_IRWXU on the target folder if it
> exists and is empty.
>

Yes, I agree with you that changing chmod as you suggest fixes it in the
back-branches.
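
A sketch of that back-branch fix in pg_basebackup's terms (basedir and
progname are from pg_basebackup.c; the exact placement is up for debate):

	/*
	 * If the target directory already exists (and is empty), force
	 * owner-only permissions before writing the backup, mirroring what
	 * initdb does for a fresh directory.
	 */
	if (chmod(basedir, S_IRWXU) != 0)
	{
		fprintf(stderr, _("%s: could not change permissions of directory \"%s\": %s\n"),
				progname, basedir, strerror(errno));
		exit(1);
	}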

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-02-13 Thread Haribabu Kommi
On Fri, Feb 8, 2019 at 8:16 PM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: Haribabu Kommi [mailto:kommi.harib...@gmail.com]
> > target_session_attrs checks for the default_transaction_readonly or not?
>
> PG 11 uses transaction_read_only, not default_transaction_readonly.
> That's fine, because its purpose is to get a read-only session as the name
> suggests, not to connect to a standby.
>

Thanks for the correction; yes, it uses transaction_read_only.


> > target_server_type checks for whether the server is in recovery or not?
>
> Yes.
>
>
> > I feel having two options make this feature complex to use it from the
> user
> > point of view?
> >
> > The need of two options came because of a possibility of a master server
> > with default_transaction_readonly set to true. Even if the default
> > transaction
> > is readonly, it is user changeable parameter, so there shouldn't be any
> > problem.
>
> No.  It's not good if the user has to be bothered by
> default_transaction_read_only when he simply wants to a standby.
>

OK. Understood.
So if we are going to differentiate between read-only and standby types,
then I still feel that adding a prefer-read option to target_session_attrs
is a valid improvement.

But that improvement can be built on once the base GUC_REPORT work is
finished.



> > how about just adding one parameter that takes the options similar like
> > JDBC?
> > target_server_type - Master, standby and prefer-standby. (The option
> names
> > can revised based on the common words on the postgresql docs?)
>
> "Getting a read-only session" is not equal to "connecting to a standby",
> so two different parameters make sense.
>
>
> > And one more thing, what happens when the server promotes to master but
> > the connection requested is standby? I feel we can maintain the existing
> > connections
> > and later new connections can be redirected? comments?
>
> Ideally, it should be possible for the user to choose the behavior like
> Oracle below.  But that's a separate feature.
>
>
> 9.2 Role Transitions Involving Physical Standby Databases
>
> https://docs.oracle.com/en/database/oracle/oracle-database/18/sbydb/managing-oracle-data-guard-role-transitions.html#GUID-857F6F45-DC1C-4345-BD39-F3BE7D79F1CD
> --
> Keeping Physical Standby Sessions Connected During Role Transition
>
> As of Oracle Database 12c Release 2 (12.2.0.1), when a physical standby
> database is converted into a primary you have the option to keep any
> sessions connected to the physical standby connected, without disruption,
> during the switchover/failover.
>
> To enable this feature, set the STANDBY_DB_PRESERVE_STATES initialization
> parameter in your init.ora file before the standby instance is started.
> This parameter applies to physical standby databases only. The allowed
> values are:
>
> NONE — No sessions on the standby are retained during a
> switchover/failover. This is the default value.
>
> ALL — User sessions are retained during switchover/failover.
>
> SESSION — User sessions are retained during switchover/failover.
> --
>

Yes, the above feature is a completely different role-transition enhancement
that can be taken up separately.


> Would you like to work on this patch?  I'm not sure if I can take time,
> but I'm willing to do it if you don't have enough time.
>
> As Tom mentioned, we need to integrate and clean patches in three mail
> threads:
>
> * Make a new GUC_REPORT parameter, server_type, to show the server role
> (primary or standby).
> * Add target_server_type libpq connection parameter, whose values are
> either primary, standby, or prefer_standby.
> * Failover timeout, load balancing, etc. that someone proposed in the
> other thread?
>

Yes, I want to work on this patch, hopefully by the next commitfest. In case
I don't find the time, I will ask for your help.


> (I wonder which of server_type or server_role feels natural in English.)
>

server_type may be good, as it aligns with the connection option
(target_server_type).

Regards,
Haribabu Kommi
Fujitsu Australia


Re: [HACKERS] Block level parallel vacuum

2019-02-13 Thread Haribabu Kommi
On Sat, Feb 9, 2019 at 11:47 PM Masahiko Sawada 
wrote:

> On Tue, Feb 5, 2019 at 12:14 PM Haribabu Kommi 
> wrote:
> >
> >
> > On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada 
> wrote:
> >>
> >>
> >> The passing stats = NULL to amvacuumcleanup and ambulkdelete means the
> >> first time execution. For example, btvacuumcleanup skips cleanup if
> >> it's not NULL.In the normal vacuum we pass NULL to ambulkdelete or
> >> amvacuumcleanup when the first time calling. And they store the result
> >> stats to the memory allocated int the local memory. Therefore in the
> >> parallel vacuum I think that both worker and leader need to move it to
> >> the shared memory and mark it as updated as different worker could
> >> vacuum different indexes at the next time.
> >
> >
> > OK, understood the point. But for btbulkdelete whenever the stats are
> NULL,
> > it allocates the memory. So I don't see a problem with it.
> >
> > The only problem is with btvacuumcleanup, when there are no dead tuples
> > present in the table, the btbulkdelete is not called and directly the
> btvacuumcleanup
> > is called at the end of vacuum, in that scenario, there is code flow
> difference
> > based on the stats. so why can't we use the deadtuples number to
> differentiate
> > instead of adding another flag?
>
> I don't understand your suggestion. What do we compare deadtuples
> number to? Could you elaborate on that please?
>

The scenario where the stats should be passed as NULL to the
btvacuumcleanup function is when there are no dead tuples. I just think that
we may use the deadtuples structure to find out whether the stats should be
NULL, while avoiding the extra memcpy.
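
In code, the suggestion amounts to something like this sketch (the field
names are assumed, following the patch's shared-memory layout):

	IndexBulkDeleteResult *stats;

	/*
	 * Treat "no dead tuples so far" as the first call: pass NULL so
	 * btvacuumcleanup takes its first-time path; otherwise hand the AM
	 * the stats stored in shared memory -- no separate 'updated' flag.
	 */
	stats = (lvshared->dead_tuples > 0)
		? &(lvshared->indstats[idx].stats)
		: NULL;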


> > And also this scenario is not very often, so avoiding
> > memcpy for normal operations would be better. It may be a small gain,
> just
> > thought of it.
> >
>
> This scenario could happen periodically on an insert-only table.
> Additional memcpy is executed once per indexes in a vacuuming but I
> agree that the avoiding memcpy would be good.
>

Yes, understood. If possible, removing the need for the memcpy would be
good. The latest patch doesn't apply anymore and needs a rebase.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: pg_basebackup ignores the existing data directory permissions

2019-02-12 Thread Haribabu Kommi
On Wed, Feb 13, 2019 at 12:42 PM Michael Paquier 
wrote:

> On Tue, Feb 12, 2019 at 06:03:37PM +1100, Haribabu Kommi wrote:
> > During the testing allow group access permissions on the standby database
> > directory, one of my colleague found the issue, that pg_basebackup
> > doesn't verify whether the existing data directory has the required
> > permissions or not?  This issue is not related allow group access
> > permissions. This problem was present for a long time, (I think from
> > the time the pg_basebackup was introduced).
>
> In which case this would cause the postmaster to fail to start.
>

Yes, the postmaster fails to start, but I feel it would be good if
pg_basebackup corrected the directory permissions automatically, like initdb
does.


> > Attached patch fixes the problem similar like initdb by changing the
> > permissions of the data
> > directory to the required permissions.
>
> It looks right to me and takes care of the case where group access is
> allowed.  Still, we have not seen any complains about the current
> behavior either and pg_basebackup is around for some time already, so
> I would tend to not back-patch that.  Any thoughts from others?
>

This should be back-patched to 11, where group access was introduced. Given
the lack of complaints, I agree with you that there is no need to back-patch
any further.

Regards,
Haribabu Kommi
Fujitsu Australia


pg_basebackup ignores the existing data directory permissions

2019-02-11 Thread Haribabu Kommi
Hi Hackers,

During testing of group access permissions on the standby data directory,
one of my colleagues found that pg_basebackup doesn't verify whether the
existing data directory has the required permissions. This issue is not
specific to group access permissions; the problem has been present for a
long time (I think since pg_basebackup was introduced).

The attached patch fixes the problem, similarly to initdb, by changing the
permissions of the data directory to the required permissions.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-pg_basebackup-Correct-the-existing-directory-permiss.patch
Description: Binary data


Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-02-07 Thread Haribabu Kommi
On Thu, Feb 7, 2019 at 9:31 PM Haribabu Kommi 
wrote:

> Hi Hackers,
>
> Does increase in Transaction commits per second means good query
> performance?
> Why I asked this question is, many monitoring tools display that number of
> transactions
> per second in the dashboard (including pgadmin).
>
> During the testing of bunch of queries with different set of
> configurations, I observed that
> TPS of some particular configuration has increased compared to default
> server configuration, but the overall query execution performance is
> decreased after comparing all queries run time.
>
> This is because of larger xact_commit value than default configuration.
> With the changed server configuration, that leads to generate more parallel
> workers and every parallel worker operation is treated as an extra commit,
> because of this reason, the total number of commits increased, but the
> overall query performance is decreased.
>
> Is there any relation of transaction commits to performance?
>
> Is there any specific reason to consider the parallel worker activity also
> as a transaction commit? Especially in my observation, if we didn't
> consider the parallel worker activity as separate commits, the test doesn't
> show an increase in transaction commits.
>

The following statements show the increase in the xact_commit value with
parallel workers. I can understand the workers updating the seq_scan stats,
as they performed the seq scan, but does the same reasoning apply to
parallel workers' transaction commits as well?

The transaction commit counter is also updated by internal operations like
autovacuum, checkpoints, etc., but the increase in the counter from those
operations is not nearly as visible as with parallel workers.

Is the spike in TPS with parallel workers acceptable?

postgres=# select relname, seq_scan from pg_stat_user_tables where relname
= 'tbl';
 relname | seq_scan
-+--
 tbl |   16
(1 row)

postgres=# begin;
BEGIN
postgres=# select xact_commit from pg_stat_database where datname =
'postgres';
 xact_commit
-
 524
(1 row)

postgres=# explain analyze select * from tbl where f1 = 1000;
QUERY PLAN

---
 Gather  (cost=0.00..3645.83 rows=1 width=214) (actual time=1.703..79.736
rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on tbl  (cost=0.00..3645.83 rows=1 width=214)
(actual time=28.180..51.672 rows=0 loops=3)
 Filter: (f1 = 1000)
 Rows Removed by Filter: 3
 Planning Time: 0.090 ms
 Execution Time: 79.776 ms
(8 rows)

postgres=# commit;
COMMIT
postgres=# select xact_commit from pg_stat_database where datname =
'postgres';
 xact_commit
-
 531
(1 row)

postgres=# select relname, seq_scan from pg_stat_user_tables where relname
= 'tbl';
 relname | seq_scan
-+--
 tbl |   19
(1 row)

Regards,
Haribabu Kommi
Fujitsu Australia


Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-02-07 Thread Haribabu Kommi
Hi Hackers,

Does an increase in transaction commits per second mean good query
performance? I ask because many monitoring tools display the number of
transactions per second in their dashboards (including pgAdmin).

While testing a bunch of queries with different sets of configurations, I
observed that the TPS of one particular configuration increased compared to
the default server configuration, but the overall query execution
performance decreased when comparing the total run time of all queries.

This is because of a larger xact_commit value than with the default
configuration. The changed server configuration leads to more parallel
workers being generated, and every parallel worker operation is treated as
an extra commit; for this reason, the total number of commits increased
while the overall query performance decreased.

Is there any relation between transaction commits and performance?

Is there any specific reason to count parallel worker activity as a
transaction commit? Notably, in my observation, if parallel worker activity
were not counted as separate commits, the test would not show an increase in
transaction commits.

Suggestions?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-02-05 Thread Haribabu Kommi
On Mon, Jan 21, 2019 at 5:48 PM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: Haribabu Kommi [mailto:kommi.harib...@gmail.com]
> > Thanks for finding out the problem, how about the following way of
> checking
> > for prefer-read/prefer-standby.
> >
> > 1. (default_transaction_read_only = true and recovery = true)
> >
> > 2. If none of the host satisfies the above scenario, then recovery = true
> > 3. Last check is for default_transaction_read_only = true
>
> That would be fine.  But as I mentioned in another mail, I think "get
> read-only session" and "connect to standby" differ.  So I find it better to
> separate parameters for those request; target_session_attr and
> target_server_type.
>

Thanks for your opinion.

target_session_attrs checks whether default_transaction_read_only is set,
and target_server_type checks whether the server is in recovery, correct?

I feel that having two options makes this feature complex to use from the
user's point of view.

The need for two options came from the possibility of a master server with
default_transaction_read_only set to true. But even if the default
transaction mode is read-only, it is a user-changeable parameter, so there
shouldn't be any problem.

The same applies to logical replication: the user can change the default
transaction mode once the connection is established, if it does not match
their requirements.

How about just adding one parameter that takes options similar to JDBC?
target_server_type - master, standby and prefer-standby. (The option names
can be revised based on the common terms in the PostgreSQL docs; a usage
sketch follows below.)
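
A hypothetical usage sketch if a single JDBC-style parameter were adopted;
target_server_type and its values are only the proposal here, not an
existing libpq option:

#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb(
		"host=node1,node2 dbname=postgres "
		"target_server_type=prefer-standby");	/* proposed option */

	if (PQstatus(conn) == CONNECTION_OK)
	{
		/* connected to a standby when one was available */
	}
	PQfinish(conn);
	return 0;
}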

And one more thing: what happens when the server is promoted to master but
the connection requested a standby? I feel we can maintain the existing
connections, and later new connections can be redirected. Comments?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Memory contexts reset for trigger invocations

2019-02-04 Thread Haribabu Kommi
On Tue, Feb 5, 2019 at 4:29 PM Andres Freund  wrote:

> Hi,
>
> trigger.c goes through some trouble to free the tuples returned by
> trigger functions. There's plenty codepaths that look roughly like:
> if (oldtuple != newtuple && oldtuple != slottuple)
> heap_freetuple(oldtuple);
> if (newtuple == NULL)
> {
> if (trigtuple != fdw_trigtuple)
> heap_freetuple(trigtuple);
> return NULL;/* "do nothing" */
> }
>
> but we, as far as I can tell, do not reset the memory context in which
> the trigger functions have been called.
>
> Wouldn't it be better to reset an appropriate context after each
> invocation? Yes, that'd require some care to manage the lifetime of
> tuples returned by triggers, but that seems OK?
>

Currently the trigger functions are executed under a per-tuple memory
context, but the returned tuples are allocated from the executor memory
context in some scenarios:

	/*
	 * Copy tuple to upper executor memory.  But if user just did
	 * "return new" or "return old" without changing anything, there's
	 * no need to copy; we can return the original tuple (which will
	 * save a few cycles in trigger.c as well as here).
	 */
	if (rettup != trigdata->tg_newtuple &&
		rettup != trigdata->tg_trigtuple)
		rettup = SPI_copytuple(rettup);

We would need to take care of these as well before switching to such a
context?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: [HACKERS] Block level parallel vacuum

2019-02-04 Thread Haribabu Kommi
On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada 
wrote:

> On Wed, Jan 30, 2019 at 2:06 AM Haribabu Kommi 
> wrote:
> >
> >
> >
> >
> > + * Before starting parallel index vacuum and parallel cleanup index we
> launch
> > + * parallel workers. All parallel workers will exit after processed all
> indexes
> >
> > parallel vacuum index and parallel cleanup index?
> >
> >
>
> ISTM we're using like "index vacuuming", "index cleanup" and "FSM
> vacuming" in vacuumlazy.c so maybe "parallel index vacuuming" and
> "parallel index cleanup" would be better?
>

OK.


> > + /*
> > + * If there is already-updated result in the shared memory we
> > + * use it. Otherwise we pass NULL to index AMs and copy the
> > + * result to the shared memory segment.
> > + */
> > + if (lvshared->indstats[idx].updated)
> > + result = &(lvshared->indstats[idx].stats);
> >
> > I didn't really find a need of the flag to differentiate the stats
> pointer from
> > first run to second run? I don't see any problem in passing directing
> the stats
> > and the same stats are updated in the worker side and leader side.
> Anyway no two
> > processes will do the index vacuum at same time. Am I missing something?
> >
> > Even if this flag is to identify whether the stats are updated or not
> before
> > writing them, I don't see a need of it compared to normal vacuum.
> >
>
> The passing stats = NULL to amvacuumcleanup and ambulkdelete means the
> first time execution. For example, btvacuumcleanup skips cleanup if
> it's not NULL.In the normal vacuum we pass NULL to ambulkdelete or
> amvacuumcleanup when the first time calling. And they store the result
> stats to the memory allocated int the local memory. Therefore in the
> parallel vacuum I think that both worker and leader need to move it to
> the shared memory and mark it as updated as different worker could
> vacuum different indexes at the next time.
>

OK, understood the point. But btbulkdelete allocates the memory whenever the
stats are NULL, so I don't see a problem with it.

The only problem is with btvacuumcleanup: when there are no dead tuples in
the table, btbulkdelete is not called and btvacuumcleanup is called directly
at the end of vacuum. In that scenario there is a code-flow difference based
on the stats, so why can't we use the dead-tuples number to differentiate,
instead of adding another flag? Also, this scenario is not very common, so
avoiding the memcpy for normal operations would be better. It may be a small
gain; I just thought of it.


>
> > + initStringInfo();
> > + appendStringInfo(,
> > + ngettext("launched %d parallel vacuum worker %s (planned: %d",
> > +   "launched %d parallel vacuum workers %s (planned: %d",
> > +   lvstate->pcxt->nworkers_launched),
> > + lvstate->pcxt->nworkers_launched,
> > + for_cleanup ? "for index cleanup" : "for index vacuum",
> > + lvstate->pcxt->nworkers);
> > + if (lvstate->options.nworkers > 0)
> > + appendStringInfo(, ", requested %d", lvstate->options.nworkers);
> >
> > what is the difference between planned workers and requested workers,
> aren't both
> > are same?
>
> The request is the parallel degree that is specified explicitly by
> user whereas the planned is the actual number we planned based on the
> number of indexes the table has. For example, if we do like 'VACUUM
> (PARALLEL 3000) tbl' where the tbl has 4 indexes, the request is 3000
> and the planned is 4. Also if max_parallel_maintenance_workers is 2
> the planned is 2.
>

OK.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Pluggable Storage - Andres's take

2019-02-03 Thread Haribabu Kommi
On Tue, Jan 22, 2019 at 1:43 PM Haribabu Kommi 
wrote:

>
>
> OK. I will work on the doc changes.
>

Sorry for the delay.

Attached is a draft patch of the doc and comment changes that I worked on.
Currently I added comments to the callbacks present in the TableAmRoutine
structure and copied them into the docs. I am not sure whether that is a
good approach or not. I have yet to add a description of each parameter of
the callbacks for easier understanding.

Or should I describe each callback in the docs, grouped according to the
divisions used in the TableAmRoutine structure? Currently the following
divisions are available:
1. Table scan
2. Parallel table scan
3. Index scan
4. Manipulation of physical tuples
5. Non-modifying operations on individual tuples
6. DDL
7. Planner
8. Executor

Suggestions?

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Doc-and-comments-update.patch
Description: Binary data


Re: current_logfiles not following group access and instead follows log_file_mode permissions

2019-02-03 Thread Haribabu Kommi
On Fri, Feb 1, 2019 at 7:22 PM Michael Paquier  wrote:

> On Fri, Jan 18, 2019 at 09:50:40AM -0500, Stephen Frost wrote:
> > Yes, we should update the documentation in this regard, though it's
> > really an independent thing as that documentation should have been
> > updated in the original group-access patch, so I'll see about fixing
> > it and back-patching it.
>
> Stephen, could you apply Hari's patch then?  I am not sure what the
> consensus is, but documenting the restriction is the minimum we can
> do.
>
> -The default permissions are 0600, meaning only the
> -server owner can read or write the log files.  The other commonly
> -useful setting is 0640, allowing members of the
> owner's
> -group to read the files.  Note however that to make use of such a
> -setting, you'll need to alter  to
> -store the files somewhere outside the cluster data directory.  In
> -any case, it's unwise to make the log files world-readable, since
> -they might contain sensitive data.
> +The default permissions are either 0600, meaning
> only the
> +server owner can read or write the log files or
> 0640, that
> +allows any user in the same group can read the log files, based on
> the new
> +cluster created with --allow-group-access option of
> initdb
> +command. Note however that to make use of any setting other than
> default,
> +you'll need to alter  to store the
> files
> +somewhere outside the cluster data directory.
>
> I would formulate that differently, by just adding an extra paragraph
> to mention that using 0640 is recommended to be
> compatible with initdb's --allow-group-access instead of sticking it
> on the middle of the existing paragraph.
>

Thanks for the review.
I changed the log_file_mode doc patch as per your comment.

How about the attached?

And regarding the current_logfiles permissions, I feel this file should have
the permissions of the data directory files, since it lives in the data
directory even though it stores information about the log files, at least
until the file is replaced by another approach to storing the log file
details.

I am not sure whether this has already been discussed: how about using
shared memory to store the log file names, so that we don't need this file
at all?

Regards,
Haribabu Kommi
Fujitsu Australia


0001-log_file_mode-recommended-value-update.patch
Description: Binary data


Re: initdb --allow-group-access behaviour in windows

2019-02-03 Thread Haribabu Kommi
On Sun, Feb 3, 2019 at 10:34 AM Michael Paquier  wrote:

> On Sat, Feb 02, 2019 at 08:50:14AM +0200, David Steele wrote:
> > How about:
> >
> > +files created by initdb.  This option is
> ignored
> > +on Windows, which does not support
> > +POSIX-style group permissions.
>
> Fine for me.  Anybody else has an opinion to offer?
>

+1 to the above changes. Thanks for working on it.

Regards,
Haribabu Kommi
Fujitsu Australia


initdb --allow-group-access behaviour in windows

2019-01-31 Thread Haribabu Kommi
Hi Hackers,

While one of my colleagues was testing the new feature allowing group access
mode for the data directory by specifying initdb --allow-group-access, we
didn't observe any change in the data folder permissions: with or without
group access, the data folder has the same group permissions.

As Microsoft Windows doesn't support POSIX-style permissions, we always mark
the permission-related GUCs as not supported on Windows. The new GUC added
with the --allow-group-access feature also mentions the same.

Since initdb --allow-group-access doesn't have any effect on Microsoft
Windows, I feel it would be better to state that in the initdb docs.

Need a patch?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: [HACKERS] Block level parallel vacuum

2019-01-29 Thread Haribabu Kommi
On Thu, Jan 24, 2019 at 1:16 PM Masahiko Sawada 
wrote:

>
> Attached the latest patches.
>

Thanks for the updated patches.
Some more code review comments.

+ started by a single utility command.  Currently, the parallel
+ utility commands that support the use of parallel workers are
+ CREATE INDEX and VACUUM
+ without FULL option, and only when building
+ a B-tree index.  Parallel workers are taken from the pool of


I feel the above sentence may not give the proper picture; how about the
following modification?

CREATE INDEX only when building a B-tree index
and VACUUM without FULL option.



+ * parallel vacuum, we perform both index vacuum and index cleanup in
parallel.
+ * Individual indexes is processed by one vacuum process. At beginning of

How about "vacuum index" and "cleanup index", similar to other places?


+ * memory space for dead tuples. When starting either index vacuum or
cleanup
+ * vacuum, we launch parallel worker processes. Once all indexes are
processed

same here as well?


+ * Before starting parallel index vacuum and parallel cleanup index we
launch
+ * parallel workers. All parallel workers will exit after processed all
indexes

How about "parallel vacuum index" and "parallel cleanup index"?


+ /*
+ * If there is already-updated result in the shared memory we
+ * use it. Otherwise we pass NULL to index AMs and copy the
+ * result to the shared memory segment.
+ */
+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);

I didn't really find a need for the flag to differentiate the stats pointer
between the first run and the second run. I don't see any problem in
passing the stats directly, with the same stats updated on both the worker
side and the leader side; in any case, no two processes will vacuum the
same index at the same time. Am I missing something?

Even if the flag is there to identify whether the stats have been updated
before writing them, I don't see a need for it compared to a normal vacuum.


+ * Enter the parallel mode, allocate and initialize a DSM segment. Return
+ * the memory space for storing dead tuples or NULL if no workers are
prepared.
+ */

+ pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+ request, true);

But we are passing the serializable_okay flag as true, which means it
doesn't return NULL. Is that expected?


+ initStringInfo();
+ appendStringInfo(,
+ ngettext("launched %d parallel vacuum worker %s (planned: %d",
+   "launched %d parallel vacuum workers %s (planned: %d",
+   lvstate->pcxt->nworkers_launched),
+ lvstate->pcxt->nworkers_launched,
+ for_cleanup ? "for index cleanup" : "for index vacuum",
+ lvstate->pcxt->nworkers);
+ if (lvstate->options.nworkers > 0)
+ appendStringInfo(, ", requested %d", lvstate->options.nworkers);

What is the difference between planned workers and requested workers?
Aren't they the same?


- COMPARE_SCALAR_FIELD(options);
- COMPARE_NODE_FIELD(rels);
+ if (a->options.flags != b->options.flags)
+ return false;
+ if (a->options.nworkers != b->options.nworkers)
+ return false;

The options comparison is changed from COMPARE_SCALAR_FIELD to an explicit
check, but why is the rels check removed? Since options changed from an int
to a structure, using SCALAR may not work in other functions like
_copyVacuumStmt, etc.


+typedef struct VacuumOptions
+{
+ VacuumFlag flags; /* OR of VacuumFlag */
+ int nworkers; /* # of parallel vacuum workers */
+} VacuumOptions;


Do we need to add a NodeTag to the above structure, since it is part of the
VacuumStmt structure? Something like the sketch below.
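For illustration (the node tag name is hypothetical):

typedef struct VacuumOptions
{
    NodeTag     type;       /* hypothetically T_VacuumOptions */
    VacuumFlag  flags;      /* OR of VacuumFlag */
    int         nworkers;   /* # of parallel vacuum workers */
} VacuumOptions;

That way the generic node routines (_copyVacuumStmt, _equalVacuumStmt and
friends) could treat it like any other parse node.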


+vacuumdb will require background
workers,
+so make sure your 
+setting is more than one.

How about removing the vacuumdb reference and rephrasing it as "This
option will ..."?

I will continue the testing of this patch and share the details.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: [HACKERS] Block level parallel vacuum

2019-01-22 Thread Haribabu Kommi
On Fri, Jan 18, 2019 at 11:42 PM Masahiko Sawada 
wrote:

> On Fri, Jan 18, 2019 at 10:38 AM Haribabu Kommi
>  wrote:
> >
> >
> > On Tue, Jan 15, 2019 at 6:00 PM Masahiko Sawada 
> wrote:
> >>
> >>
> >> Rebased.
> >
> >
> > I started reviewing the patch, I didn't finish my review yet.
> > Following are some of the comments.
>
> Thank you for reviewing the patch.
>
> >
> > +PARALLEL  class="parameter">N
> > +
> > + 
> > +  Execute index vacuum and cleanup index in parallel with
> >
> > I doubt that user can understand the terms index vacuum and cleanup
> index.
> > May be it needs some more detailed information.
> >
>
> Agreed. Table 27.22 "Vacuum phases" has a good description of vacuum
> phases. So maybe adding the referencint to it would work.
>

OK.


> >
> > -typedef enum VacuumOption
> > +typedef enum VacuumOptionFlag
> >  {
> >
> > I don't find the new name quite good, how about VacuumFlags?
> >
>
> Agreed with removing "Option" from the name but I think VacuumFlag
> would be better because this enum represents only one flag. Thoughts?
>

OK.


>
> > postgres=# vacuum (parallel 2, verbose) tbl;
> >
> > With verbose, no parallel workers related information is available.
> > I feel giving that information is required even when it is not parallel
> > vacuum also.
> >
>
> Agreed. How about the folloiwng verbose output? I've added the number
> of launched, planned and requested vacuum workers and purpose (vacuum
> or cleanup).
>
> postgres(1:91536)=# vacuum (verbose, parallel 30) test; -- table
> 'test' has 3 indexes
> INFO:  vacuuming "public.test"
> INFO:  launched 2 parallel vacuum workers for index vacuum (planned:
> 2, requested: 30)
> INFO:  scanned index "test_idx1" to remove 2000 row versions
> DETAIL:  CPU: user: 0.12 s, system: 0.00 s, elapsed: 0.12 s
> INFO:  scanned index "test_idx2" to remove 2000 row versions by
> parallel vacuum worker
> DETAIL:  CPU: user: 0.07 s, system: 0.05 s, elapsed: 0.12 s
> INFO:  scanned index "test_idx3" to remove 2000 row versions by
> parallel vacuum worker
> DETAIL:  CPU: user: 0.09 s, system: 0.05 s, elapsed: 0.14 s
> INFO:  "test": removed 2000 row versions in 10 pages
> DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
> INFO:  launched 2 parallel vacuum workers for index cleanup (planned:
> 2, requested: 30)
> INFO:  index "test_idx1" now contains 991151 row versions in 2745 pages
> DETAIL:  2000 index row versions were removed.
> 24 index pages have been deleted, 18 are currently reusable.
> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
> INFO:  index "test_idx2" now contains 991151 row versions in 2745 pages
> DETAIL:  2000 index row versions were removed.
> 24 index pages have been deleted, 18 are currently reusable.
> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
> INFO:  index "test_idx3" now contains 991151 row versions in 2745 pages
> DETAIL:  2000 index row versions were removed.
> 24 index pages have been deleted, 18 are currently reusable.
> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
> INFO:  "test": found 2000 removable, 367 nonremovable row versions in
> 41 out of 4425 pages
> DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 500
> There were 6849 unused item pointers.
> Skipped 0 pages due to buffer pins, 0 frozen pages.
> 0 pages are entirely empty.
> CPU: user: 0.12 s, system: 0.01 s, elapsed: 0.17 s.
> VACUUM
>

The verbose output is good.

> Since the previous patch conflicts with 285d8e12 I've attached the
> latest version patch that incorporated the review comment I got.
>

Thanks for the latest patch. I have some more minor comments.

+  Execute index vacuum and cleanup index in parallel with

Better to use "vacuum index" and "cleanup index"? That is in line with
the description of the vacuum phases, and it is better to follow the same
notation throughout the patch.


+ dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);

With this change, lazy_space_alloc() takes care of initializing the
parallel vacuum; can we note that in the comments?


+ initprog_val[2] = dead_tuples->max_dead_tuples;

The dead_tuples variable may need a rename for better readability?



+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);
+ else
+ copy_result = true;


I don't see a need for the copy_result variable; how about directly using
the updated flag to decide whether to copy or not? Once the result is
copied, update the flag, roughly as in the sketch below.
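Roughly like the following (untested, just to show the suggestion):

/* Use the shared 'updated' flag directly instead of copy_result. */
if (lvshared->indstats[idx].updated)
    result = &(lvshared->indstats[idx].stats);

/* ... index vacuum or cleanup runs here, possibly filling 'result' ... */

if (!lvshared->indstats[idx].updated && result != NULL)
{
    memcpy(&(lvshared->indstats[idx].stats), result,
           sizeof(IndexBulkDeleteResult));
    lvshared->indstats[idx].updated = true;
}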


+use Test::More tests => 34;

I don't find any new tests added in this patch.

I am also wondering about the performance penalty if we use the parallel
option of vacuum on a small table.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Pluggable Storage - Andres's take

2019-01-21 Thread Haribabu Kommi
On Tue, Jan 22, 2019 at 12:15 PM Andres Freund  wrote:

> Hi,
>
> Thanks!
>
> On 2019-01-22 11:51:57 +1100, Haribabu Kommi wrote:
> > Attached the patch for removal of scan_update_snapshot
> > and also the rebased patch of reduction in use of t_tableOid.
>
> I'll soon look at the latter.
>

Thanks.


>
> > > - consider removing table_gimmegimmeslot()
> > > - add substantial docs for every callback
> > >
> >
> > Will work on the above two.
>
> I think it's easier if I do the first, because I can just do it while
> rebasing, reducing unnecessary conflicts.
>
>
OK. I will work on the doc changes.


> > > While I saw an initial attempt at writing smgl docs for the table AM
> > > API, I'm not convinced that's the best approach.  I think it might make
> > > more sense to have high-level docs in sgml, but then do all the
> > > per-callback docs in tableam.h.
> > >
> >
> > OK, I will update the sgml docs accordingly.
> > Index AM has per callback docs in the sgml, refactor them also?
>
> I don't think it's a good idea to tackle the index docs at the same time
> - this patchset is already humongously large...
>

OK.


>
> > diff --git a/src/backend/access/heap/heapam_handler.c
> b/src/backend/access/heap/heapam_handler.c
> > index 62c5f9fa9f..3dc1444739 100644
> > --- a/src/backend/access/heap/heapam_handler.c
> > +++ b/src/backend/access/heap/heapam_handler.c
> > @@ -2308,7 +2308,6 @@ static const TableAmRoutine heapam_methods = {
> >   .scan_begin = heap_beginscan,
> >   .scan_end = heap_endscan,
> >   .scan_rescan = heap_rescan,
> > - .scan_update_snapshot = heap_update_snapshot,
> >   .scan_getnextslot = heap_getnextslot,
> >
> >   .parallelscan_estimate = table_block_parallelscan_estimate,
> > diff --git a/src/backend/executor/nodeBitmapHeapscan.c
> b/src/backend/executor/nodeBitmapHeapscan.c
> > index 59061c746b..b48ab5036c 100644
> > --- a/src/backend/executor/nodeBitmapHeapscan.c
> > +++ b/src/backend/executor/nodeBitmapHeapscan.c
> > @@ -954,5 +954,9 @@ ExecBitmapHeapInitializeWorker(BitmapHeapScanState
> *node,
> >   node->pstate = pstate;
> >
> >   snapshot = RestoreSnapshot(pstate->phs_snapshot_data);
> > - table_scan_update_snapshot(node->ss.ss_currentScanDesc, snapshot);
> > + Assert(IsMVCCSnapshot(snapshot));
> > +
> > + RegisterSnapshot(snapshot);
> > + node->ss.ss_currentScanDesc->rs_snapshot = snapshot;
> > + node->ss.ss_currentScanDesc->rs_temp_snap = true;
> >  }
>
> I was rather thinking that we'd just move this logic into
> table_scan_update_snapshot(), without it invoking a callback.
>

OK, changed accordingly. The table_scan_update_snapshot() function is moved
into tableam.c, though, to avoid an additional snapmgr.h header inclusion
in tableam.h.
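The moved function now looks roughly like this (same logic as the diff
above, just without the callback indirection; type names as in the
patchset):

void
table_scan_update_snapshot(TableScanDesc scan, Snapshot snapshot)
{
    Assert(IsMVCCSnapshot(snapshot));

    RegisterSnapshot(snapshot);
    scan->rs_snapshot = snapshot;
    scan->rs_temp_snap = true;
}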

Regards,
Haribabu Kommi
Fujitsu Australia


0002-Removal-of-scan_update_snapshot-callback.patch
Description: Binary data


Re: Pluggable Storage - Andres's take

2019-01-21 Thread Haribabu Kommi
On Mon, Jan 21, 2019 at 1:01 PM Andres Freund  wrote:

> Hi,
>
> On 2018-12-10 18:13:40 -0800, Andres Freund wrote:
> > On 2018-11-26 17:55:57 -0800, Andres Freund wrote:
> > > FWIW, now that oids are removed, and the tuple table slot abstraction
> > > got in, I'm working on rebasing the pluggable storage patchset ontop of
> > > that.
> >
> > I've pushed a version to that to the git tree, including a rebased
> > version of zheap:
> > https://github.com/anarazel/postgres-pluggable-storage
> > https://github.com/anarazel/postgres-pluggable-zheap
>
> I've pushed the newest, substantially revised, version to the same
> repository. Note, that while the newest pluggable-zheap version is newer
> than my last email, it's not based on the latest version, and the
> pluggable-zheap development is now happening in the main zheap
> repository.
>

Thanks for the new version of patches and changes.


> Todo:
> - consider removing scan_update_snapshot
>

Attached is the patch for the removal of scan_update_snapshot, along with
the rebased patch that reduces the use of t_tableOid.


> - consider removing table_gimmegimmeslot()
> - add substantial docs for every callback
>

Will work on the above two.

> While I saw an initial attempt at writing smgl docs for the table AM
> API, I'm not convinced that's the best approach.  I think it might make
> more sense to have high-level docs in sgml, but then do all the
> per-callback docs in tableam.h.
>

OK, I will update the sgml docs accordingly.
The index AM has per-callback docs in the sgml; should we refactor them
also?

Regards,
Haribabu Kommi
Fujitsu Australia


0002-Removal-of-scan_update_snapshot.patch
Description: Binary data


0001-Reduce-the-use-of-HeapTuple-t_tableOid.patch
Description: Binary data


Re: Pluggable Storage - Andres's take

2019-01-20 Thread Haribabu Kommi
On Tue, Jan 15, 2019 at 6:05 PM Andres Freund  wrote:

> Hi,
>
> On 2019-01-15 18:02:38 +1100, Haribabu Kommi wrote:
> > On Tue, Dec 11, 2018 at 1:13 PM Andres Freund 
> wrote:
> >
> > > Hi,
> > >
> > > On 2018-11-26 17:55:57 -0800, Andres Freund wrote:
> > > Further tasks I'm not yet planning to tackle, that I'd welcome help on:
> > > - pg_upgrade testing
> > >
> >
> > I did the pg_upgrade testing from older version with some tables and
> views
> > exists,  and all of them are properly transformed into new server with
> heap
> > as the default access method.
> >
> > I will add the dimitry pg_dump patch and test the pg_upgrade to confirm
> > the proper access method is retained on the upgraded database.
> >
> >
> >
> > > - I think we should consider removing HeapTuple->t_tableOid, it should
> > >   imo live entirely in the slot
> > >
> >
> > I removed the t_tableOid from HeapTuple and during testing I found some
> > problems with triggers, will post the patch once it is fixed.
>
>
> Please note that I'm working on a heavily revised version of the patch
> right now, trying to clean up a lot of things (you might have seen some
> of the threads I started). I hope to post it ~Thursday.  Local-ish
> patches shouldn't be a problem though.
>

Yes, I am following your other threads on refactoring and cleanups.
I will rebase this patch once the revised code is available.

I am not able to remove t_tableOid from HeapTuple completely because of
its use in triggers: the slot is not available there, so the table OID
still needs to be stored as part of the tuple.

Currently, t_tableOid is set only when the tuple is formed from a slot,
and elsewhere its use is replaced with the slot member, roughly as in the
sketch below.
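A minimal sketch of that flow, assuming the slot carries a tts_tableOid
member (the helper name is hypothetical, not the actual patch code):

static HeapTuple
slot_copy_heap_tuple(TupleTableSlot *slot)
{
    HeapTuple   tuple;

    slot_getallattrs(slot);     /* make tts_values/tts_isnull valid */
    tuple = heap_form_tuple(slot->tts_tupleDescriptor,
                            slot->tts_values, slot->tts_isnull);
    /* the only place where t_tableOid still gets set */
    tuple->t_tableOid = slot->tts_tableOid;

    return tuple;
}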

comments?

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Reduce-the-use-of-HeapTuple-t_tableOid.patch
Description: Binary data


Re: Libpq support to connect to standby server as priority

2019-01-20 Thread Haribabu Kommi
On Fri, Jan 18, 2019 at 5:33 PM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: Tsunakawa, Takayuki [mailto:tsunakawa.ta...@jp.fujitsu.com]
> > As for prefer-standby/prefer-read, if host parameter specifies
> host1,host2
> > in this order, and host1 is the primary with
> > default_transaction_read_only=true, does the app get a connection to
> host1?
> > I want to connect to host2 (standby) only if host1 is down.
>
> Oops, reverse -- I wanted to say "I want to connect to host1 (primary)
> only if host2 is down."
>

Thanks for finding the problem. How about the following order of checks
for prefer-read/prefer-standby?

1. default_transaction_read_only = true and recovery = true
2. If no host satisfies the above, then recovery = true
3. As a last check, default_transaction_read_only = true
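In pseudocode, the selection would work roughly like this (illustrative
sketch only, not actual libpq code):

typedef struct HostInfo
{
    struct HostInfo *next;
    bool        in_recovery;    /* server reports it is in recovery */
    bool        default_ro;     /* default_transaction_read_only = true */
} HostInfo;

static HostInfo *
choose_prefer_read_host(HostInfo *hosts)
{
    /* Three passes, in decreasing order of preference. */
    for (int pass = 0; pass < 3; pass++)
    {
        for (HostInfo *h = hosts; h != NULL; h = h->next)
        {
            if ((pass == 0 && h->default_ro && h->in_recovery) ||
                (pass == 1 && h->in_recovery) ||
                (pass == 2 && h->default_ro))
                return h;
        }
    }
    return hosts;   /* no preferred host found; fall back to the first */
}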

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2019-01-17 Thread Haribabu Kommi
On Fri, Jan 18, 2019 at 2:34 PM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: Laurenz Albe [mailto:laurenz.a...@cybertec.at]
> > I think that transaction_read_only is good.
> >
> > If it is set to false, we are sure to be on a replication primary or
> > stand-alone server, which is enough to know for the load balancing use
> case.
>
> As Tatsuo-san said, setting default_transaction_read_only leads to a
> misjudgement of the primary.
>
>
> > I deem it unlikely that someone will set default_transaction_read_only to
> > FALSE and then complain that the feature is not working as expected, but
> > again
> > I cannot prove that claim.
>
> I wonder what default_transaction_read_only exists for.  For maing the
> database by default and allowing only specific users to write to the
> database with "CREATE/ALTER USER SET default_transaction_read_only = false"?
>

default_transaction_read_only is a user-settable parameter: even if it is
set to true by default, any user can change it later. Deciding the server
type (read-write or read-only) based on it can therefore go wrong.


> I'm sorry to repeat myself, but anyway, I think we need a method to
> connect to a standby as the original desire, because the primary instance
> may be read only by default while only limited users update data.  That's
> for reducing the burdon on the primary and minimizing the impact on users
> who update data.  For example,
>
> * run data reporting on the standby
> * backup the database from the standby
> * cascade replication from the standby
>

IMO, if we use only pg_is_in_recovery() to decide whether to connect, we
may not be able to support all the possible target_session_attrs values.
How about using both to decide?

Master/read-write -- recovery = false and default_transaction_read_only = false
Standby/read-only -- recovery = true
prefer-standby/prefer-read -- recovery = true or default_transaction_read_only = true
any -- nothing to be verified

I feel the above verifications can cover both physical and logical
replication. We can then decide which of these options we want to support;
and if we don't want to rely on the user-settable
default_transaction_read_only parameter, we can add a new parameter that
can be changed only with a server restart. A sketch of the mapping follows
below.
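As a sketch, the mapping above could be evaluated like this (illustrative
only, not actual libpq code):

#include <stdbool.h>
#include <string.h>

static bool
server_matches_target(const char *target, bool in_recovery, bool default_ro)
{
    if (strcmp(target, "read-write") == 0)
        return !in_recovery && !default_ro;     /* master */
    if (strcmp(target, "read-only") == 0)
        return in_recovery;                     /* standby */
    if (strcmp(target, "prefer-read") == 0)
        return in_recovery || default_ro;       /* preferred; the caller
                                                 * falls back to any host */
    return true;                                /* "any": nothing to verify */
}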

Regards,
Haribabu Kommi
Fujitsu Australia


Re: current_logfiles not following group access and instead follows log_file_mode permissions

2019-01-17 Thread Haribabu Kommi
On Thu, Jan 17, 2019 at 5:49 AM Stephen Frost  wrote:

> Greetings,
>
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > Stephen Frost  writes:
> > > * Michael Paquier (mich...@paquier.xyz) wrote:
> > >> On Tue, Jan 15, 2019 at 10:53:30AM -0500, Tom Lane wrote:
> > >>> On reflection, maybe the problem is not that we're giving the file
> > >>> the wrong permissions, but that we're putting it in the wrong place?
> >
> > > I'm not really sure putting it into the logfile directory is such a hot
> > > idea as users might have set up external log file rotation of files in
> > > that directory.  Of course, in that case they'd probably signal PG
> right
> > > afterwards and PG would go write out a new file, but it still seems
> > > pretty awkward.  I'm not terribly against solving this issue that way
> > > either though, but I tend to think the originally proposed patch is
> more
> > > sensible.
> >
> > I dunno, I think that the current design was made without any thought
> > whatsoever about the log-files-outside-the-data-directory case.  If
> > you're trying to set things up that way, it's because you want to give
> > logfile read access to people who shouldn't be able to look into the
> > data directory proper.  That makes current_logfiles pretty useless
> > to such people, as it's now designed.
>
> ... or you just want to move the log files to a more sensible location
> than the data directory.  The justification for log_file_mode existing
> is because you might want to have log files with different privileges,
> but that's quite a different thing.
>

Thanks for sharing your opinions.

The current_logfiles file is used to store metadata about the log files
currently being written. That is different from the log files themselves,
so giving it the log file permissions may not be correct.

> Now, if the expectation is that current_logfiles is just an internal
> > working file that users shouldn't access directly, then this argument
> > is wrong --- but then why is it documented in user-facing docs?
>
> I really couldn't say why it's documented in the user-facing docs, and
> for my 2c I don't really think it should be- there's a function to get
> that information.  Sprinkling the data directory with files for users to
> access directly doesn't exactly fit my view of what a good API looks
> like.
>
> The fact that there isn't any discussion about where that file actually
> lives does make me suspect you're right that log files outside the data
> directory wasn't really contemplated.
>

The only case I can think of where a user would read this file directly is
when the server is not available, but I don't find any scenario where that
is actually required.



> > If we're going to accept the patch as-is, then it logically follows
> > that we should de-document current_logfiles, because we're taking the
> > position that it's an internal temporary file not meant for user access.
>
> ... and hopefully we'd get rid of it one day entirely.
>

If there is no use for it when the server is offline, it would be better
to remove the file and provide an alternative way to obtain the current
log file name.

With group access mode, the default value of log_file_mode is changed; the
attached patch reflects the same in the docs.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-log_file_mode-default-value-update.patch
Description: Binary data


Re: [HACKERS] Block level parallel vacuum

2019-01-17 Thread Haribabu Kommi
On Tue, Jan 15, 2019 at 6:00 PM Masahiko Sawada 
wrote:

>
> Rebased.
>

I started reviewing the patch; I haven't finished my review yet.
Following are some of the comments.

+PARALLEL N
+
+ 
+  Execute index vacuum and cleanup index in parallel with

I doubt that users can understand the terms "index vacuum" and "cleanup
index". Maybe it needs some more detailed information.


- VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7 /* don't skip any pages */
+ VACOPT_PARALLEL = 1 << 7, /* do lazy VACUUM in parallel */
+ VACOPT_DISABLE_PAGE_SKIPPING = 1 << 8 /* don't skip any pages */
+} VacuumOptionFlag;

Any specific reason for not adding it as the last member of the enum?
Something like the sketch below.
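That is, keeping the existing flag values as they are in vacuum.h and
appending the new bit at the end:

typedef enum VacuumOptionFlag
{
    VACOPT_VACUUM = 1 << 0,     /* do VACUUM */
    VACOPT_ANALYZE = 1 << 1,    /* do ANALYZE */
    VACOPT_VERBOSE = 1 << 2,    /* print progress info */
    VACOPT_FREEZE = 1 << 3,     /* FREEZE option */
    VACOPT_FULL = 1 << 4,       /* FULL (non-concurrent) vacuum */
    VACOPT_NOWAIT = 1 << 5,     /* don't wait to get lock (autovacuum only) */
    VACOPT_SKIPTOAST = 1 << 6,  /* don't process the TOAST table, if any */
    VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,  /* don't skip any pages */
    VACOPT_PARALLEL = 1 << 8    /* do lazy VACUUM in parallel */
} VacuumOptionFlag;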


-typedef enum VacuumOption
+typedef enum VacuumOptionFlag
 {

I don't find the new name quite good; how about VacuumFlags?


+typedef struct VacuumOption
+{

How about VacuumOptions, since this structure can contain all the options
provided to the vacuum operation?



+ vacopt1->flags |= vacopt2->flags;
+ if (vacopt2->flags == VACOPT_PARALLEL)
+ vacopt1->nworkers = vacopt2->nworkers;
+ pfree(vacopt2);
+ $$ = vacopt1;
+ }

As the above statement indicates, only the last specified number of
parallel workers is taken into account; can we explain that in the docs?


postgres=# vacuum (parallel 2, verbose) tbl;

With verbose, no information related to parallel workers is available.
I feel that information should be provided even when it is not a parallel
vacuum.


Regards,
Haribabu Kommi
Fujitsu Australia


Re: What to name the current heap after pluggable storage / what to rename?

2019-01-15 Thread Haribabu Kommi
On Tue, Jan 15, 2019 at 1:43 PM Andres Freund  wrote:

> Hi,
>
> On 2019-01-11 12:01:36 -0500, Robert Haas wrote:
> > On Thu, Jan 10, 2019 at 7:05 PM Andres Freund 
> wrote:
> > > ISTM that the first block would best belong into new files like
> > > access/rel[ation].h and access/common/rel[ation].h.
> >
> > +1.
> >
> > > I think the second
> > > set should be renamed to be table_open() (with backward compat
> > > redirects, there's way way too many references) and should go into
> > > access/table.h access/table/table.c alongside tableam.[ch],
> >
> > Sounds reasonable.
> >
> > > but I could
> > > also see just putting them into relation.[ch].
> >
> > I would view that as a less-preferred alternative, but not crazy.
>
> Here's a set of patches. Not commit quality, but enough to discuss.
>
>
> The first patch, the only really interesting one, splits out
> relation_(open|openrv|openrv_extended|close) into access/relation.h and
> access/common/relation.c
> and
> heap_(open|openrv|openrv_extended|close) into access/table.h and
> access/table/table.c
>
> It's worthwhile to note that there's another header named
> nodes/relation.h. But there's also utils/rel.h, so I couldn't think of a
> another good name.
>

The access/relation.[c|h] name is fine. Or how about access/rel.[c|h],
since nodes/relation.h is related to the planner? utils/rel.h, though, is
somewhat related to the relation cache.


> I'm basically thinking that table.h, even in the post pluggable storage
> world, should not contain lower level functionality like dispatching
> into table-am (that'll reside in tableam.h). But e.g. a
> simple_table_(insert|update|delete) could live there, as well as
> potentially some other heap_ functionality strewn around the backend.
>
> I made table.h not include relation.h, which means that a few files
> might need both.  I'm not sure that's the right choice, but it seems
> easier to extend that later if shows to be painful, than to do the
> reverse.
>

The need for both relation.h and table.h in a single file comes from using
both the relation_open and table_open functions. When I checked one of the
files where both headers are included, I found that it simply used
relation_open even though the type of the relation is known.

I didn't check whether it is possible everywhere, but if the kind of
relation is known at every caller of relation_open, the call could be
replaced with table_open, index_open or composite_type_open, so that none
of the relation functions would need to be exposed. For example:
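At a hypothetical call site:

/* before: generic open, needs access/relation.h */
rel = relation_open(relid, AccessShareLock);

/* after: the caller already knows relid is a plain table */
rel = table_open(relid, AccessShareLock);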


> I've left the following in table.h:
> /*
>  * heap_ used to be the prefix for these routines, and a lot of code will
> just
>  * continue to work without adaptions after the introduction of pluggable
>  * storage, therefore just map these names.
>  */
> #define heap_open(r, l) table_open(r, l)
> #define heap_openrv(r, l)   table_openrv(r, l)
> #define heap_openrv_extended(r, l, m)   table_openrv_extended(r, l, m)
> #define heap_close(r, l)table_close(r, l)
>
> and I think we should leave that in there for the forseeable future.
>

The above compatibility macros are good; they are useful to avoid breaking
third-party extensions.

Regards,
Haribabu Kommi
Fujitsu Australia

