Re: [HACKERS] Improving executor performance
On 25 June 2016 05:00, Andres Freund wrote:
>To: pgsql-hackers@postgresql.org
>Subject: [HACKERS] Improving executor performance
>
>My observations about the performance bottlenecks I found, and partially
>addressed, are in rough order of importance (there's interdependence
>between most of them):
>
>1) Cache misses cost us a lot, doing more predictable accesses can make
>   a huge difference. Without addressing that, many other bottlenecks
>   don't matter all that much. I see *significant* performance
>   improvements for large seqscans (when the results are used) simply
>   memcpy()'ing the current target block.
>
>   This partially is an intrinsic problem of analyzing a lot of data,
>   and partially because our access patterns are bad.
>
>2) Barring 1), tuple deforming is the biggest bottleneck in nearly all
>   queries I looked at. There's various places we trigger deforming,
>   most ending in either slot_deform_tuple(), heap_getattr(),
>   heap_deform_tuple().

Agreed. I have also observed similar behavior, specifically for (2), while working on improving executor performance. I did some prototype work to improve performance using native compilation. Details are available on my blog:
http://rajeevrastogi.blogspot.in/2016/03/native-compilation-part-2-pgday-asia.html

>3) Our 1-by-1 tuple flow in the executor has two major issues:

Agreed. IMHO, in order to tackle this we should:
1. Make the processing data-centric instead of operator-centric.
2. Instead of pulling each tuple from the immediate operator, an operator can push the tuple to its parent. Pushing can continue until it reaches an operator that cannot proceed without results from another operator.
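The push model in points 1 and 2 can be illustrated with a minimal stand-alone sketch (Python, purely illustrative; the operator names and structure here are my own, not the native-compilation prototype):

```python
# Minimal sketch of push-based (data-centric) execution, as opposed to the
# executor's pull-based ExecProcNode loop. The scan drives the pipeline and
# pushes each tuple upward while it is still hot in cache.

class Filter:
    def __init__(self, predicate, parent):
        self.predicate = predicate
        self.parent = parent

    def push(self, row):
        # Push qualifying rows straight to the parent operator instead of
        # waiting to be pulled.
        if self.predicate(row):
            self.parent.push(row)

class Collect:
    """A sink operator that cannot push further; it just accumulates."""
    def __init__(self):
        self.rows = []

    def push(self, row):
        self.rows.append(row)

def seq_scan(table, consumer):
    # The leaf scan is the loop driver; each row flows through the whole
    # pipeline before the next one is produced.
    for row in table:
        consumer.push(row)

sink = Collect()
plan = Filter(lambda r: r % 2 == 0, sink)
seq_scan(range(10), plan)
print(sink.rows)  # [0, 2, 4, 6, 8]
```

The key difference from the current executor is that control stays inside one tight loop rather than re-entering each node via function-call chains per tuple.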
More details from another thread:
https://www.postgresql.org/message-id/bf2827dcce55594c8d7a8f7ffd3ab77159a9b...@szxeml521-mbs.china.huawei.com

Thanks and Regards,
Kumar Rajeev Rastogi

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Academic help for Postgres
On 11 May 2016 19:50, Bruce Momjian wrote:
>I am giving a keynote at an IEEE database conference in Helsinki next
>week (http://icde2016.fi/). (Yes, I am not attending PGCon Ottawa
>because I accepted the Helsinki conference invitation before the PGCon
>Ottawa date was changed from June to May).
>
>As part of the keynote, I would like to mention areas where academia can
>help us. The topics I can think of are:
>
>    Query optimization
>    Optimizer statistics
>    Indexing structures
>    Reducing function call overhead
>    CPU locality
>    Sorting
>    Parallelism
>    Sharding
>
>Any others?

How about:
1. Considering NUMA-aware architecture.
2. Optimizer tuning for new hardware trends.
3. More effective versions of join algorithms (e.g., compared to the traditional "build and then probe" mechanism of hash join, there are now pipelined hash joins where probe and build happen together).

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] asynchronous and vectorized execution
On 09 May 2016 23:04, Robert Haas wrote:
>2. vectorized execution, by which I mean the ability of a node to return
>tuples in batches rather than one by one. Andres has opined more than
>once that repeated trips through ExecProcNode defeat the ability of the
>CPU to do branch prediction correctly, slowing the whole system down,
>and that they also result in poor CPU cache behavior, since we jump all
>over the place executing a little bit of code from each node before
>moving onto the next rather than running one bit of code first, and then
>another later. I think that's probably right.
>
>For example, consider a 5-table join where all of the joins are
>implemented as hash tables. If this query plan is going to be run to
>completion, it would make much more sense to fetch, say, 100 tuples
>from the driving scan and then probe for all of those in the first hash
>table, and then probe for all of those in the second hash table, and so
>on. What we do instead is fetch one tuple and probe for it in all 5
>hash tables, and then repeat. If one of those hash tables would fit in
>the CPU cache but all five together will not, that seems likely to be a
>lot worse.
>
>But even just ignoring the CPU cache aspect of it for a minute, suppose
>you want to write a loop to perform a hash join. The inner loop fetches
>the next tuple from the probe table and does a hash lookup. Right now,
>fetching the next tuple from the probe table means calling a function
>which in turn calls another function which probably calls another
>function which probably calls another function and now about 4 layers
>down we actually get the next tuple. If the scan returned a batch of
>tuples to the hash join, fetching the next tuple from the batch would
>probably be 0 or 1 function calls rather than ... more. Admittedly,
>you've got to consider the cost of marshaling the batches but I'm
>optimistic that there are cycles to be squeezed out here.
>We might also want to consider storing batches of tuples in a
>column-optimized rather than row-optimized format so that iterating
>through one or two attributes across every tuple in the batch touches
>the minimal number of cache lines.

This sounds like a really great idea for improving performance. I would like to share my thoughts based on our research work in a similar area (mostly along the lines you have mentioned). Our goals with this work were to:
1. Make the processing data-centric instead of operator-centric.
2. Instead of pulling each tuple from the immediate operator, an operator can push the tuple to its parent. Pushing can continue until it reaches an operator that cannot proceed without results from another operator.
3. Use the above two points to achieve better data locality.

For example, we did some quick prototyping (take it just as a thought provoker) as mentioned below.

Query: select * from tbl1, tbl2, tbl3 where tbl1.a=tbl2.b and tbl2.b=tbl3.c;

For hash join:

    For each tuple t2 of tbl2
        Materialize a hash tbl1.a = tbl2.b
    For each tuple t3 of tbl3
        Materialize a hash tbl2.b = tbl3.c
    For each tuple t1 of tbl1
        Search in hash tbl1.a = tbl2.b
        Search in hash tbl2.b = tbl3.c
        Output t1*t2*t3

Of course, at each level any additional quals for the table can be applied in the same way. Similarly for nested loop join, the plan tree can be processed in the form of a post-order traversal of the tree:

    Scan the first relation (leftmost relation), store all tuples (or some part of the total tuples) --> Outer
    Loop through all scan-node relations starting from the second one
        Scan the current relation
        For each tuple, match with all tuples of the outer result, build the combined tuples
        Save all combined satisfying tuples --> Outer

The results we got were really impressive. There is a paper by Thomas Neumann on this idea: http://www.vldb.org/pvldb/vol4/p539-neumann.pdf

Note: VitesseDB has also implemented this approach for hash join along with compilation, and their results are really impressive.
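To make the hash-join pseudocode above concrete, here is a tiny stand-alone version of the same build-then-probe flow (Python, purely illustrative; the table and column names follow the example query, everything else — the function name, the dict-based tuples — is an assumption of this sketch):

```python
# Sketch of the hand-written join loop for:
#   select * from tbl1, tbl2, tbl3 where tbl1.a = tbl2.b and tbl2.b = tbl3.c
# Build a hash table per non-driving relation once, then probe both hash
# tables back to back for each driving tuple, so the per-tuple work stays
# in tight, predictable loops instead of re-entering the executor per node.
from collections import defaultdict

def hash_join_3way(tbl1, tbl2, tbl3):
    # Build phase: one pass over tbl2 (keyed on b) and tbl3 (keyed on c).
    h2 = defaultdict(list)
    for t2 in tbl2:
        h2[t2["b"]].append(t2)
    h3 = defaultdict(list)
    for t3 in tbl3:
        h3[t3["c"]].append(t3)

    # Probe phase: each tbl1 tuple probes both hash tables in sequence.
    out = []
    for t1 in tbl1:
        for t2 in h2.get(t1["a"], ()):
            for t3 in h3.get(t2["b"], ()):
                out.append((t1, t2, t3))    # Output t1*t2*t3
    return out

tbl1 = [{"a": 1}, {"a": 2}]
tbl2 = [{"b": 1}, {"b": 3}]
tbl3 = [{"c": 1}, {"c": 2}]
print(hash_join_3way(tbl1, tbl2, tbl3))  # one joined row: a = b = c = 1
```

Batching the probe side (probe 100 tuples against h2, then all survivors against h3) is the vectorized variant Robert describes; the single-tuple loop above already shows where the batches would slot in.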
Thanks and Regards,
Kumar Rajeev Rastogi
http://rajeevrastogi.blogspot.in/
Re: [HACKERS] Dangling Client Backend Process
On 30 October 2015 20:33, Andres Freund wrote:
>On 2015-10-30 10:57:45 -0400, Tom Lane wrote:
>> Andres Freund <and...@anarazel.de> writes:
>> > adding a parseInput(conn) into the loop yields the expected
>> > FATAL: 57P01: terminating connection due to unexpected postmaster
>> > exit
>> > Is there really any reason not to do that?
>>
>> Might work, but it probably needs some study:
>
>Yea, definitely. I was just at pgconf.eu's keynote catching up on a talk.
>No fully thought through proposals to be expected ;)
>
>> (a) is it safe
>
>I don't immediately see why not.
>
>> (b) is this the right place / are there other places
>
>I think it's ok for the send failure case, in a quick lookthrough I
>didn't find anything else for writes - I'm not entirely sure all read
>cases are handled tho, but it seems less likely to be mishandled.

I also did some analysis, as Andres suggested, and observed that the error message has already been sent by the backend to the frontend, so it is available on the connection channel; it is just not received/processed by the frontend. Also, pqHandleSendFailure was checking for available data, but the data was ignored rather than processed.

So, as Andres suggested above, if we add parseInput to receive and parse the available message on the connection channel, we can display the appropriate error. IMHO the proposed place to add parseInput seems right, as this function already takes care of handling any "NOTICE message that the backend might have sent just before it died".

Attached is the patch with this change. Comments?

Thanks and Regards,
Kumar Rajeev Rastogi

dangle-v4.patch
Description: dangle-v4.patch
Re: [HACKERS] Dangling Client Backend Process
On 23 October 2015 01:58, Robert Haas [mailto:robertmh...@gmail.com] wrote:
>Well, I'm not buying this extra PostmasterIsAlive() call on every pass
>through the main loop. That seems more expensive than we can really
>justify. Checking this when we're already calling WaitLatchOrSocket is
>basically free, but the other part is not.

Agreed.

>Here's a version with that removed and some changes to the comments.

Thanks for changing.

>I still don't think this is quite working right, though, because here's
>what happened when I killed the postmaster:
>
>rhaas=# select 1;
> ?column?
>----------
>        1
>(1 row)
>
>rhaas=# \watch
>Watch every 2s    Thu Oct 22 16:24:10 2015
>
> ?column?
>----------
>        1
>(1 row)
>
>Watch every 2s    Thu Oct 22 16:24:12 2015
>
> ?column?
>----------
>        1
>(1 row)
>
>Watch every 2s    Thu Oct 22 16:24:14 2015
>
> ?column?
>----------
>        1
>(1 row)
>
>Watch every 2s    Thu Oct 22 16:24:16 2015
>
> ?column?
>----------
>        1
>(1 row)
>
>server closed the connection unexpectedly
>This probably means the server terminated abnormally
>before or while processing the request.
>The connection to the server was lost. Attempting reset: Failed.
>
>Note that the error message doesn't actually show up on the client (it
>did show up in the log). I guess that may be inevitable if we're
>blocked in secure_write(), but if we're in secure_read() maybe it should
>work? I haven't investigated why it doesn't.

Actually, in this case the client is not waiting for a response from the server; rather, it is waiting for a new command from the user. So even though the server has sent the response, the client is not able to receive it. By the time the client receives the next command to execute, the connection has already been terminated from the server side, and hence it comes out with the above message (i.e. "server closed the connection..."). Though I am also in favor of providing some error message to the client.
But with the current infrastructure, I don't think there is any way to pass this error message to the client [this error happened without involvement of the client side]. Comments?

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Dangling Client Backend Process
On 20 October 2015 23:34, Robert Haas [mailto:robertmh...@gmail.com] wrote:
>On Tue, Oct 20, 2015 at 12:11 PM, Alvaro Herrera
><alvhe...@2ndquadrant.com> wrote:
>> Robert Haas wrote:
>>> I don't think that proc_exit(1) is the right way to exit here. It's
>>> not very friendly to exit without at least attempting to give the
>>> client a clue about what has gone wrong. I suggest something like
>>> this:
>>>
>>> ereport(FATAL,
>>>         (errcode(ERRCODE_ADMIN_SHUTDOWN),
>>>          errmsg("terminating connection due to postmaster shutdown")));
>>
>> Agreed, but I don't think "shutdown" is the right word to use here --
>> that makes it sound like it was orderly. Perhaps "crash"?
>
>Well, that's a little speculative. "due to unexpected postmaster exit"?

Agreed. Attached is the patch with the changes.

Thanks and Regards,
Kumar Rajeev Rastogi

dangling_backend_process_v2.patch
Description: dangling_backend_process_v2.patch
Re: [HACKERS] Dangling Client Backend Process
On 19 October 2015 21:37, Robert Haas [mailto:robertmh...@gmail.com] wrote:
>On Sat, Oct 17, 2015 at 4:52 PM, Alvaro Herrera
><alvhe...@2ndquadrant.com> wrote:
>> Andres Freund wrote:
>>> On 2015-10-14 17:33:01 +0900, Kyotaro HORIGUCHI wrote:
>>> > If I recall correctly, he concerned about killing the backends
>>> > running transactions which could be saved. I have a sympathy with
>>> > the opinion.
>>>
>>> I still don't. Leaving backends alive after postmaster has died
>>> prevents the auto-restart mechanism from working from there on.
>>> Which means that we'll potentially continue happily after another
>>> backend has PANICed and potentially corrupted shared memory. Which
>>> isn't all that unlikely if postmaster isn't around anymore.
>>
>> I agree. When postmaster terminates without waiting for all backends
>> to go away, things are going horribly wrong -- either a DBA has done
>> something stupid, or the system is misbehaving. As Andres says, if
>> another backend dies at that point, things are even worse -- the dying
>> backend could have been holding a critical lwlock, for instance, or it
>> could have corrupted shared buffers on its way out. It is not
>> sensible to leave the rest of the backends in the system still trying
>> to run just because there is no one there to kill them.
>
>Yep. +1 for changing this.

It seems many people are in favor of this change. I have made changes to handle backend exit on postmaster death (after the backend has finished its work and is waiting for a new command). The changes follow the approach explained in my earlier mail, i.e.:
1. WaitLatchOrSocket, called from the secure_read and secure_write functions, will wait on an additional event, WL_POSTMASTER_DEATH.
2. There is a possibility that a command is read without waiting on the latch. This case is handled by checking the postmaster status after the command is read (i.e. after ReadCommand).

Attached is the patch.
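The reason waiting on WL_POSTMASTER_DEATH is nearly free is that every backend inherits the read end of a pipe whose write end only the postmaster keeps open; when the postmaster dies, the read end becomes readable (EOF). A stand-alone demonstration of that mechanism (Python on POSIX, not PostgreSQL code; the exit code 42 is an arbitrary marker chosen for this sketch):

```python
# Demonstrate parent-death detection via pipe EOF, the same mechanism that
# backs WL_POSTMASTER_DEATH in WaitLatchOrSocket.
import os
import select

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child ("backend"): drop its copy of the write end, then wait on the
    # read end alongside its normal input sources, like a latch wait does.
    os.close(w)
    select.select([r], [], [])   # wakes up once no write end remains open
    os._exit(42)                 # parent death detected -> exit promptly
else:
    # Parent ("postmaster"): dying would implicitly close its write end;
    # here we simulate that with an explicit close().
    os.close(r)
    os.close(w)                  # child's select() returns immediately
    _, status = os.waitpid(pid, 0)
    print(os.WEXITSTATUS(status))  # 42
```

Because the wait on the pipe fd happens in the same select/poll call as the socket wait, checking for postmaster death costs nothing extra per iteration, which is exactly Robert's point about WaitLatchOrSocket versus an explicit PostmasterIsAlive() call.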
Thanks and Regards,
Kumar Rajeev Rastogi

dangling_backend_process.patch
Description: dangling_backend_process.patch
Re: [HACKERS] Dangling Client Backend Process
On 14 October 2015 14:03, Kyotaro HORIGUCHI wrote:
>Subject: Re: [HACKERS] Dangling Client Backend Process
>
>At Wed, 14 Oct 2015 11:08:37 +0530, Amit Kapila <amit.kapil...@gmail.com>
>wrote in <CAA4eK1L8SGWymhXF+yDpxiyA2ARCiEgQ88XsLhEvJba3Fh_F=q...@mail.gmail.com>
>> On Tue, Oct 13, 2015 at 3:54 PM, Rajeev rastogi
>> <rajeev.rast...@huawei.com> wrote:
>> > If we add the event WL_POSTMASTER_DEATH also, client backend process
>> > handling will become the same as for other backend processes. So
>> > postmaster death can be detected in the same way.
>> >
>> > But I am not sure if the WL_POSTMASTER_DEATH event was left out
>> > intentionally for some reason. Please confirm.
>> >
>> > Also, is it OK to add this event handling in the generic path of
>> > libpq?
>> >
>> > Please let me know if I am missing something?
>>
>> I feel this is worth investigation, for example what kind of cases
>> libpq is used for with non-blocking sockets, because for such cases
>> the above idea will not work.
>
>Blocking mode of a port is changed using socket_set_nonblocking(). I
>found two points where the function is called with true:
>pq_getbyte_if_available() and socket_flush_if_writable(). They seem to
>be used only in walsender *so far*.

Yes, and in that case I assume the proposed solution will work in all cases.

>> Here, I think the bigger point is that Tom was not in favour of this
>> proposal (making backends exit on postmaster death) at that time; not
>> sure whether he has changed his mind.
>
>If I recall correctly, he was concerned about killing backends running
>transactions which could be saved. I have sympathy with that opinion.
>But I also think it reasonable to kill all backends immediately so that
>a new postmaster can run...

Yes, it would be really helpful to know the earlier reasoning for not making backends exit on postmaster death. Please let me know if there is any thread I can refer to, to find the same.
IMHO there could be the following major issues if we don't kill the client backend process on postmaster death:
1. The postmaster cannot be restarted again, as pointed out by Kyotaro and Andres.
2. If, from an existing client session, we try to do some operation that has a dependency on a background process, then that operation will either fail or may sometimes lead to undefined behavior.
3. Unintentionally, all operations done by the application will not be checkpointed and will not be flushed by the background writer.
4. Replication of WAL to the standby will stop, and if a synchronous standby is configured, commands on the master will hang.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Dangling Client Backend Process
On 12th October 2015 20:45, Rajeev Rastogi wrote:
>>> I observed one strange behavior today that if the postmaster process
>>> gets crashed/killed, then it kills all background processes but not
>>> the client backend process.
>>
>> This is a known behaviour and there was some discussion on this
>> topic [1] previously as well.
>
> Now as it is confirmed to be a valid issue, I will spend some time on
> this to find if there is a more appropriate solution.

I checked the latest code and found that Heikki has already added code for secure_read using the latch mechanism (using WaitLatchOrSocket). It currently waits on two events, i.e. WL_LATCH_SET and WL_SOCKET_READABLE.
Commit id: 80788a431e9bff06314a054109fdea66ac538199

If we add the event WL_POSTMASTER_DEATH also, client backend process handling will become the same as for other backend processes. So postmaster death can be detected in the same way.

But I am not sure if the WL_POSTMASTER_DEATH event was left out intentionally for some reason. Please confirm.

Also, is it OK to add this event handling in the generic path of libpq?

Please let me know if I am missing something?

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Dangling Client Backend Process
On 10 October 2015 20:45, Amit Kapila wrote:
>> I observed one strange behavior today that if the postmaster process
>> gets crashed/killed, then it kills all background processes but not
>> the client backend process.
>
> This is a known behaviour and there was some discussion on this
> topic [1] previously as well. I think that thread didn't reach a
> conclusion, but there were a couple of other reasons discussed in that
> thread as well to have the behaviour you are proposing here.

Oops, I did not know about this. I shall check the older thread to get other opinions.

>> One way to handle this issue will be to check whether the postmaster
>> is alive after every command read, but it will add extra cost for
>> each query execution.
>
> I don't think that is a good idea, as if there is no command execution
> it will still stay as it is, and doing such operations on each command
> doesn't sound to be a good idea even though the overhead might not be
> big. There are some other ideas discussed in that thread [2] to
> achieve this behaviour, but I think we need to find a portable way to
> achieve it.

Yes, you are right that the process will not exit until a new command comes, but I think it does not harm functionality in any way, except that the process and its acquired resources do not get freed. Also, the use case where an application's client process stays idle for a very long time will be rare. But at the same time I agree this is not the best solution; we should look for a more appropriate/better one.

Now as it is confirmed to be a valid issue, I will spend some time on this to find if there is a more appropriate solution.

Thanks and Regards,
Kumar Rajeev Rastogi
[HACKERS] Dangling Client Backend Process
I observed one strange behavior today: if the postmaster process gets crashed/killed, it kills all background processes but not the client backend processes. Moreover, it is still possible to execute queries on the connected client session without any other background processes running.

But if I try to execute some command (like CHECKPOINT) from the client session that requires a background task to be performed, it fails because it cannot find the corresponding background process (like the checkpointer process).

I am not sure if this is already known behavior, but I found it to be a little awkward. This may lead to some unknown behavior in user applications.

Currently all background processes keep checking whether the postmaster is alive while they wait for any event, but for client backend processes there is no such mechanism.

One way to handle this issue would be to check whether the postmaster is alive after every command read, but that will add extra cost for each query execution.

Any comments?

Thanks and Regards,
Kumar Rajeev Rastogi
[HACKERS] Memory Context Info dump
In our customer environments, as well as during development, we have observed that many times we need details of the memory used by each context in order to analyze memory consumption/leaks. So I would like to propose a contrib function interface that will dump the whole memory context information into a file.

One major piece of infrastructure, the function MemoryContextStats (which dumps all memory context details to the console), already exists in PG, but it is only called automatically in scenarios such as "system is out of memory". So the plan is:
1. Create one contrib function, dump_memctxt_info, which when executed calls the existing function MemoryContextStats.
2. The dump file will be created in the following format: context_<pid>_<timestamp>.dump
3. The file handle will be passed to MemoryContextStats.
4. If the parameter to MemoryContextStats is NULL (all existing calls will pass NULL), it will behave as before, i.e. dump to the console.
5. The output format of the context dump will be the same as before.
6. One sample file is attached.

Use case: In order to check for a memory leak, this contrib function can be called before and after execution of a query. Comparison of these two dumps is then helpful for further analysis.

Attached is the patch. Please provide your opinion. If it is OK, I will add it to the next CommitFest.

Thanks and Regards,
Kumar Rajeev Rastogi

memory_ctxt_dumpv1.patch
Description: memory_ctxt_dumpv1.patch

context_13192_2015-09-08_181236.dump
Description: context_13192_2015-09-08_181236.dump
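The before/after comparison in the use case above could itself be scripted. A rough sketch (Python; the dump-line format below is assumed from typical MemoryContextStats output and may need adjusting, and diff_dumps is a hypothetical helper for illustration, not part of the patch):

```python
# Compare two memory-context dumps and report contexts whose "used" bytes
# grew, as a first pass at spotting a leak. Assumed line format, e.g.:
#   TopMemoryContext: 68688 total in 10 blocks; 4560 free (8 chunks); 64128 used
import re

LINE = re.compile(r"^\s*(?P<name>\S.*?):\s+(?P<total>\d+) total in .*?(?P<used>\d+) used")

def parse_dump(lines):
    """Map context name -> bytes used, skipping lines that don't match."""
    used = {}
    for line in lines:
        m = LINE.match(line)
        if m:
            used[m.group("name")] = int(m.group("used"))
    return used

def diff_dumps(before, after):
    """Return {context: growth_in_bytes} for contexts that grew."""
    b, a = parse_dump(before), parse_dump(after)
    return {name: a[name] - b.get(name, 0)
            for name in a if a[name] > b.get(name, 0)}

before = ["TopMemoryContext: 68688 total in 10 blocks; 4560 free (8 chunks); 64128 used"]
after  = ["TopMemoryContext: 70736 total in 11 blocks; 4560 free (8 chunks); 66176 used",
          "ExecutorState: 8192 total in 1 blocks; 2048 free (1 chunks); 6144 used"]
print(diff_dumps(before, after))  # {'TopMemoryContext': 2048, 'ExecutorState': 6144}
```

A persistent leak would show the same context growing across repeated query executions, whereas one-off growth (e.g. caches warming up) levels off.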
Re: [HACKERS] Autonomous Transaction is back
On 18 August 2015 21:18, Robert Haas wrote:
>This footnote goes to my point. It seems clear to me that having the
>autonomous transaction see the effects of the outer uncommitted
>transaction is a recipe for trouble. If the autonomous transaction
>updates a row and commits, and the outer transaction later aborts, the
>resulting state is inconsistent with any serial history. I'm fairly
>certain that's going to leave us in an unhappy place. Even more
>obviously, ending up with two committed row versions that are both
>updates of a single ancestor version is no good. So, I agree that this
>scenario should be an error.
>
>What I don't agree with is the idea that it should be the deadlock
>detector's job to throw that error. Rather, I think that when we
>examine the xmax of the tuple we can see - which is the original one,
>not the one updated by the outer transaction - we should check whether
>that XID belongs to an outer transaction. If it does, we should throw
>an error instead of trying to lock it. That way (1) the error message
>will be clear and specific to the situation and (2) we don't need a
>separate PGPROC for each autonomous transaction. The first of those
>benefits is agreeable; the second one is, in my opinion, a key design
>goal for this feature.

Yes, I agree with this. I was in favor of throwing an error all along, without involving the deadlock detector.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction is back
On 03 August 2015 18:40, Merlin Moncure [mailto:mmonc...@gmail.com] wrote:
>On Sun, Aug 2, 2015 at 11:37 PM, Rajeev rastogi
><rajeev.rast...@huawei.com> wrote:
>> On 31 July 2015 23:10, Robert Haas wrote:
>>> I think we're going entirely down the wrong path here. Why is it ever
>>> useful for a backend's lock requests to conflict with themselves, even
>>> with autonomous transactions? That seems like an artifact of somebody
>>> else's implementation that we should be happy we don't need to copy.
>>
>> IMHO, since most locking is managed at the transaction level, not the
>> backend level, and we consider the autonomous transaction to be an
>> independent transaction, they may practically conflict. It is also
>> right, as you said, that there is no useful use case where an
>> autonomous transaction conflicts with the main (parent) transaction.
>> But we cannot take it for granted, as a user might make a mistake. So
>> at least we should have some mechanism to handle this rare case, for
>> which as of now I think throwing an error from the autonomous
>> transaction is one solution. Once the error is thrown from the
>> autonomous transaction, the main transaction may continue as it is (or
>> should the main transaction also abort?).
>
>hm. OK, what's the behavior of:
>
>BEGIN
>  UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>
>  BEGIN WITH AUTONOMOUS TRANSACTION
>    UPDATE foo SET x = x + 1 WHERE foo_id = 1;
>  END;
>
>  RAISE EXCEPTION ...;
>EXCEPTION ...
>
>END;

It should throw an error (or something equivalent), as the second update will wait for a record lock that in this case will never be released until the first update's transaction finishes. So, catch-22.

>Also,
>*) What do the other candidate implementations do? IMO, compatibility
>should be the underlying design principle.

Oracle throws an error in such a case. But we can decide on whatever behavior we want to keep.

>*) What will the SQL only feature look like?

Similar to PL as mentioned in your example, we can provide the SQL-only feature also.

>*) Is the SPI interface going to be extended to expose AT?
I don't think at this point that there is any need to expose an SPI interface for this.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction is back
On 31 July 2015 23:10, Robert Haas wrote:
>On Tue, Jul 28, 2015 at 6:01 AM, Craig Ringer <cr...@2ndquadrant.com> wrote:
>> That should be practical to special-case by maintaining a list of
>> parent transaction (virtual?) transaction IDs. Attempts to wait on a
>> lock held by any of those should fail immediately. There's no point
>> waiting for the deadlock detector since the outer tx can never
>> progress and commit/rollback to release locks, and it might not be
>> able to see the parent/child relationship from outside the backend
>> doing the nested tx anyway.
>
>I think we're going entirely down the wrong path here. Why is it ever
>useful for a backend's lock requests to conflict with themselves, even
>with autonomous transactions? That seems like an artifact of somebody
>else's implementation that we should be happy we don't need to copy.

IMHO, since most locking is managed at the transaction level, not the backend level, and we consider the autonomous transaction to be an independent transaction, they may practically conflict. It is also right, as you said, that there is no useful use case where an autonomous transaction conflicts with the main (parent) transaction. But we cannot take it for granted, as a user might make a mistake. So at least we should have some mechanism to handle this rare case, for which as of now I think throwing an error from the autonomous transaction is one solution. Once the error is thrown from the autonomous transaction, the main transaction may continue as it is (or should the main transaction also abort?).

Any other suggestion to handle this will be really helpful.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction is back
On 28 July 2015 03:21, Josh Berkus wrote:
>On 07/27/2015 02:47 AM, Rajeev rastogi wrote:
>>> Why have any fixed maximum?
>>
>> Since we are planning to have nested autonomous transactions, a limit
>> is required so that resources can be controlled.
>
>Is there a particular reason why this limit wouldn't just be
>max_stack_depth?

We will need to allocate some initial resources in order to handle all nested autonomous transactions. So I think it is better to have a separate configuration parameter.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction is back
On 28 July 2015 15:31, Craig Ringer wrote:
>> 2. It should be allowed to deadlock with the master transaction. We
>> need to work out a solution to avoid deadlock.
>
>The deadlock case in autonomous txs is a bit different. Assuming you
>don't intend to allow interleaving, where you can switch between
>transactions at will rather than just at begin/commit, the only way a
>deadlock can happen is when the outer tx holds a lock that the inner tx
>tries to acquire.
>
>That should be practical to special-case by maintaining a list of
>parent transaction (virtual?) transaction IDs. Attempts to wait on a
>lock held by any of those should fail immediately. There's no point
>waiting for the deadlock detector since the outer tx can never progress
>and commit/rollback to release locks, and it might not be able to see
>the parent/child relationship from outside the backend doing the nested
>tx anyway.

Thanks, that sounds like a good idea. I shall evaluate it.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction is back
On 23 July 2015 21:04, Robert Haas wrote:
>On Thu, Jul 23, 2015 at 1:31 AM, Rajeev rastogi
><rajeev.rast...@huawei.com> wrote:
>> 2. It should be allowed to deadlock with master transaction. We need
>> to work out a solution to avoid deadlock.
>
>This sentence seems to contradict itself. I thought the consensus was
>that the transaction should NOT conflict with the master transaction.

Since we are saying the transaction is autonomous to the parent transaction, we cannot guarantee that it does not take any conflicting lock, unless the application is designed to avoid it. But yes, we should have a mechanism to deal with the possible deadlock.

>> 3. It can support multiple levels of nesting based on the
>> configuration (maybe max as 70).
>
>Why have any fixed maximum?

Since we are planning to have nested autonomous transactions, a limit is required so that resources can be controlled.

>> 2. The above commands can be issued either inside a procedure to run
>> a few statements of the procedure inside an autonomous transaction,
>> or even in stand-alone query execution.
>
>I think inside a procedure the autonomous transaction will need to be
>lexically scoped. You won't be able to do this, for example:
>
>BEGIN AUTONOMOUS TRANSACTION;
>FOR x IN SELECT ... LOOP
>    COMMIT;
>    BEGIN AUTONOMOUS TRANSACTION;
>END LOOP;

I am not sure how we would be able to restrict this. IMHO the user should be able to write this, especially since restricting it does not have any meaning from the user's perspective. Please let me know if I am missing something here.

>Rather you'd have to do something like this:
>
>FOR x IN SELECT .. LOOP
>    BEGIN WITH AUTONOMOUS TRANSACTION
>        do stuff
>    END;
>END LOOP;

Thanks and Regards,
Kumar Rajeev Rastogi
[HACKERS] Autonomous Transaction is back
After a few failed attempts to propose autonomous transactions earlier, I, along with Simon Riggs, would like to propose it again, with a completely different approach. We also discussed this feature at the PGCon 2015 Unconference Day; those who missed that discussion, please refer to https://wiki.postgresql.org/wiki/AutonomousTransactionsUnconference2015

Before jumping into the design and code proposal for this feature, Simon Riggs and I wanted to propose its behavior and usage, to keep everyone in the same boat. So we have summarized the behavior and usage of autonomous transactions, based on the discussion with community members at the PGCon 2015 Unconference Day:

Behavior of Autonomous Transaction:
1. The autonomous transaction is treated as a completely different transaction from the master transaction.
2. It should be allowed to deadlock with the master transaction. We need to work out a solution to avoid deadlock.
3. It can support multiple levels of nesting based on the configuration (maybe a max of 70).
4. The outer (i.e. main or upper autonomous) transaction is suspended while the inner autonomous transaction is running.
5. The outer transaction should not see data of the inner one until the inner is committed (a serializable upper transaction should not see it even after the inner transaction commits).

How to Use Autonomous Transaction:
1. We can issue an explicit command to start an autonomous transaction, as below:
   BEGIN AUTONOMOUS TRANSACTION (don't worry about keywords at this point)
   Do your work.
   COMMIT/ROLLBACK (will commit/rollback the autonomous transaction and return to the main transaction or upper autonomous transaction).
2. The above commands can be issued either inside a procedure, to put a few statements of the procedure inside an autonomous transaction, or even in stand-alone query execution.
3. We can make the whole procedure itself autonomous, which is similar to starting an autonomous transaction at the beginning of the procedure and committing/rolling back at the end of it.
There was another discussion on the Unconference Day about whether to implement COMMIT/ROLLBACK inside procedures or autonomous transactions. Our opinion is that COMMIT/ROLLBACK inside a procedure is somewhat different from an autonomous transaction: in the first case, once we commit inside the procedure, it also commits everything done before the call of the procedure. This is the behavior of Oracle. So in that case the user is required to be very careful not to perform, before the call of the procedure, any operation that is not yet intended to be committed inside it. So we prefer to implement autonomous transactions, which will not only be compatible with Oracle but also give a really strong, much-needed feature. I have not put the use-cases here, as we already agree about its strong use-cases.

Requesting everyone's opinion, based on which we can proceed to enhance/tune/re-write our design.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] 9.5 release notes
On 11 June 2015 09:45, Bruce Momjian wrote:
> I have committed the first draft of the 9.5 release notes. You can view the output here:
> http://momjian.us/pgsql_docs/release-9-5.html
> and it will eventually appear here:
> http://www.postgresql.org/docs/devel/static/release.html
> I am ready to make suggested adjustments, though I am traveling for conferences for the next ten days so there might be a delay in my replies.

The below performance improvement in the "General Performance" category is missing:

    Reduce btree scan overhead for < and > strategies
    For <, <=, > and >= strategies, mark the first scan key as already
    matched if scanning in an appropriate direction. If the index tuple
    contains no nulls we can skip the first re-check for each tuple.
    Author: Kumar Rajeev Rastogi
    Committer: Simon Riggs

Thanks and Regards,
Kumar Rajeev Rastogi
[HACKERS] Pluggable Parser
What is a Pluggable Parser?
It is an option for users to select a different kind of parser, to evaluate PostgreSQL for their business logic without much manual effort.

Why do we need it?
As of now, migration from other databases to PostgreSQL requires the manual effort of translating SQL and PL/SQL to PostgreSQL-equivalent SQL queries. Because of this manual effort in converting scripts, migration to PostgreSQL is considered to be very slow and sometimes de-motivating as well. So if we allow plugging different database syntaxes into PostgreSQL, it will be a strong motivator for many DBAs to try PostgreSQL.

How to do it?
This can be realized by supporting, in the parser, new SQL/stored-procedure syntaxes corresponding to each other database's syntax, and plugging in the one to be evaluated by putting the corresponding .so file in dynamic_library_path. The default will be PostgreSQL syntax.

Parser interface:

    /* Hook for plugins to get control in the parser */
    typedef List *(*parser_hook_type)(const char *str);
    parser_hook_type parser_hook = NULL;
    extern PGDLLIMPORT parser_hook_type parser_hook;

Make the parser entry point a function pointer (raw_parser), which can be loaded based on the parser type. parser_hook will be initialized with the proper function during shared library loading, which is done only during server startup. This way it can be ensured that only the one parser provided by the user is used in PostgreSQL. Each database-syntax-specific parser can be implemented as part of a contrib module, to keep it separate from the core code.

To start with, I am planning to (for the 2015-06 CommitFest):
1. Support the infrastructure to allow plugging.
2. Support Oracle SQL query syntaxes.

Please let me know if the community would be interested in this, or if there was already any discussion about this in the past. Please provide your opinions/suggestions.

Thanks and Regards,
Kumar Rajeev Rastogi
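The hook-dispatch pattern the proposal describes can be sketched as below. This is a minimal, self-contained illustration with stand-in types (ParseTree, builtin_raw_parser are invented for this sketch and are not the actual PostgreSQL API): a loadable parser module sets parser_hook at load time, and raw_parser() dispatches to it, falling back to the built-in parser when no hook is installed.

```c
#include <stddef.h>

/* Stand-in for the parse-tree type a real parser would build. */
typedef struct ParseTree
{
    const char *query;
    int         kind;           /* 0 = built-in grammar, other = plugin */
} ParseTree;

/* Hook for plugins to get control in the parser (as in the proposal). */
typedef ParseTree *(*parser_hook_type)(const char *str);

parser_hook_type parser_hook = NULL;    /* set by a contrib module at load time */

static ParseTree builtin_result = { NULL, 0 };

/* Stand-in for the core PostgreSQL grammar. */
static ParseTree *
builtin_raw_parser(const char *str)
{
    builtin_result.query = str;
    builtin_result.kind = 0;
    return &builtin_result;
}

/* Parser entry point: dispatch to the pluggable grammar when one is loaded. */
ParseTree *
raw_parser(const char *str)
{
    if (parser_hook != NULL)
        return parser_hook(str);
    return builtin_raw_parser(str);
}
```

A contrib module would simply assign its own function to parser_hook in its library-load callback; since loading happens once at server startup, exactly one parser is active for the lifetime of the server, as the proposal requires.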
Re: [HACKERS] Parallel Seq Scan
On 20 March 2015 17:37, Amit Kapila wrote:
> So the patches have to be applied in below sequence:
> HEAD Commit-id : 8d1f2390
> parallel-mode-v8.1.patch [2]
> assess-parallel-safety-v4.patch [1]
> parallel-heap-scan.patch [3]
> parallel_seqscan_v11.patch (Attached with this mail)

While I was going through this patch, I observed one invalid Assert in the function ExecInitFunnel, i.e.

    Assert(outerPlan(node) == NULL);

The outer node of a Funnel node is always non-NULL, and currently it will be a PartialSeqScan node. Maybe asserts were disabled while building the code, because of which this issue has not yet been observed.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Parallel Seq Scan
On 25 March 2015 16:00, Amit Kapila wrote:
> Which version of patch you are looking at? I am seeing below code in ExecInitFunnel() in Version-11 to which you have replied.
> + /* Funnel node doesn't have innerPlan node. */
> + Assert(innerPlan(node) == NULL);

I was looking at version-10. I just checked version-11 and version-12 and found it is already fixed. I should have checked the latest version before sending the report…☺

Thanks and Regards,
Kumar Rajeev Rastogi
[HACKERS] Materialization is slower than non-materialized
During my routine work, I observed that execution of a plan having the inner node of a nested loop join as a Materialize node (on top of a SeqScan) is slower than with a non-materialized SeqScan node. This happens only if work_mem is not big enough to hold all tuples in memory. To make the test easy and fast, I set work_mem to 256kB. The results are as below:

===== With Material off =====
postgres=# set enable_material to off;
SET
Time: 0.225 ms
postgres=# select count(tbl.id1) from tbl, tbl2 where tbl.id1 < tbl2.id1;
  count
----------
 49995000
(1 row)
Time: 26674.299 ms
postgres=# explain select count(tbl.id1) from tbl, tbl2 where tbl.id1 < tbl2.id1;
                          QUERY PLAN
--------------------------------------------------------------
 Aggregate  (cost=2783478.33..2783478.34 rows=1 width=4)
   ->  Nested Loop  (cost=0.00..2700145.00 rows= width=4)
         Join Filter: (tbl.id1 < tbl2.id1)
         ->  Seq Scan on tbl  (cost=0.00..145.00 rows=1 width=4)
         ->  Seq Scan on tbl2  (cost=0.00..145.00 rows=1 width=4)
 Planning time: 0.120 ms
(6 rows)

===== With Material on =====
postgres=# set enable_material to on;
SET
Time: 0.222 ms
postgres=# select count(tbl.id1) from tbl, tbl2 where tbl.id1 < tbl2.id1;
  count
----------
 49995000
(1 row)
Time: 32839.627 ms
postgres=# explain select count(tbl.id1) from tbl, tbl2 where tbl.id1 < tbl2.id1;
                          QUERY PLAN
--------------------------------------------------------------
 Aggregate  (cost=1983648.33..1983648.34 rows=1 width=4)
   ->  Nested Loop  (cost=0.00..1900315.00 rows= width=4)
         Join Filter: (tbl.id1 < tbl2.id1)
         ->  Seq Scan on tbl  (cost=0.00..145.00 rows=1 width=4)
         ->  Materialize  (cost=0.00..235.00 rows=1 width=4)
               ->  Seq Scan on tbl2  (cost=0.00..145.00 rows=1 width=4)
 Planning time: 0.140 ms
(7 rows)

As per my analysis, the above result is aligned with our current design.

Materialize node:
- Cost calculation at plan time: if the result spills over to disk in the case of materialization, that cost is included in the total cost.
- Actual execution: the result is actually fetched from disk, even on re-scan.
Scan node:
- Cost calculation at plan time: the cost of a re-scan of the SeqScan node is considered the same as the first scan, which always assumes that the records are fetched from disk, and hence disk access cost is added (as we don't really know how much memory will be available for caching during execution).
- Actual execution: after the first scan, once all records are loaded into memory (provided shared_buffers is big enough), re-scans read the records from memory only, and hence are much faster.

So because of this, at planning time the cost of the Materialize node is less than that of the SeqScan node, but at execution time the SeqScan is faster because it fetches tuples from memory on re-scan. I am not sure if we can consider this a problem or not, but I just wanted to share it, as users generally expect materialization to be faster than non-materialization. Please provide your opinion. If we can do something about this, then I can take up this work.

Thanks and Regards,
Kumar Rajeev Rastogi

--
This e-mail and its attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!
Re: [HACKERS] Materialization is slower than non-materialized
On 23 March 2015 21:39, Robert Haas wrote:
> On Mon, Mar 23, 2015 at 6:01 AM, Rajeev rastogi <rajeev.rast...@huawei.com> wrote:
>> The cost of re-scan of the SeqScan node is considered the same as the first scan, which always assumes that the records are fetched from disk and hence disk access cost is added (as we don't know really how much memory will be available to cache during execution).
>
> That's a general problem not limited to materialize nodes. We might choose to do a heap-sort rather than a quick-sort, but it may turn out that the tapes we create end up in the OS buffer cache instead of on physical storage; in fact, it's probably the common case. Scans are costed using seq_page_cost and random_page_cost, but most of the time the random page cost will not be the cost of a head seek, because we'll find the data in the OS page cache. Some of the time it really will be a head seek, but we have no idea whether that will happen in any given case. The autovacuum cost delays have this problem too: a miss in shared buffers may really be a hit in the OS page cache, but we don't know.

Yes, I agree.

> This kind of disclaimer is inappropriate on a public mailing list. Don't send confidential information to public mailing lists. You probably don't have any legal right to control what happens to it after that, regardless of what you put in your email.

Sorry for this. Generally we delete this legal message before sending mail to the community, but somehow we missed doing so this time.

Thanks and Regards,
Kumar Rajeev Rastogi
[HACKERS] Multiple call of GetTransactionSnapshot in single flow
I have observed that in some places, such as exec_bind_message and exec_simple_query, though these two functions have already got a snapshot, before they call PortalStart the current snapshot gets popped off, and they then pass InvalidSnapshot as a parameter, because of which a snapshot is taken again inside PortalStart. In cases where many transactions are running, taking the snapshot multiple times may be very costly.

What is the reason for taking the snapshot multiple times?
1. Is this implementation to make sure the snapshot is at a more granular level?
2. Is it something to do with the current command id of the snapshot?
3. Or is there any other specific reason for this which I am not able to visualize?
4. Or am I missing something else?

If it is just reason 1, then maybe we can try to pass the same snapshot to PortalStart as taken in the caller; this can enhance performance in many cases. With this change, I did one small performance test on pgbench with prepared queries (pgbench select) with 16 users, and observed a performance benefit of 10%.

Please provide your opinion.

Thanks and Regards,
Kumar Rajeev Rastogi
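The proposed change can be illustrated with a toy model. All names below are invented stand-ins (Snapshot, get_transaction_snapshot, portal_start), not the real PostgreSQL functions: the idea is simply that the caller hands down the snapshot it already holds, and the callee acquires a fresh one only when none was supplied (the InvalidSnapshot path today).

```c
#include <stddef.h>

/* Toy snapshot: just a serial number so we can count acquisitions. */
typedef struct Snapshot
{
    int serial;
} Snapshot;

static int snapshots_taken = 0;

/* Stand-in for GetTransactionSnapshot(): each call is "expensive". */
static Snapshot *
get_transaction_snapshot(void)
{
    static Snapshot snap;
    snap.serial = ++snapshots_taken;
    return &snap;
}

/*
 * Stand-in for PortalStart(): with the proposed change, a supplied
 * snapshot is reused as-is; only a NULL (InvalidSnapshot) argument
 * forces a second, redundant acquisition.
 */
static Snapshot *
portal_start(Snapshot *supplied)
{
    if (supplied != NULL)
        return supplied;                /* reuse caller's snapshot: no extra cost */
    return get_transaction_snapshot();  /* current behaviour */
}
```

Under this model the caller's existing snapshot flows through unchanged, which is exactly the saving the 10% pgbench number above is attributed to; whether it is safe hinges on the command-id question raised in point 2.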
Re: [HACKERS] Index scan optimization
On 17 November 2014 16:05, Simon Riggs wrote:
> I confirm 5% improvement in both short and long index scans, on the least beneficial datatype. Looks likely to be a very positive win overall.

Thanks a lot for the testing and confirmation.

> The language used in the comments is not clear enough. I'll do my best to improve that, then look to commit this in about 5 hours.

Thanks a lot for the support. Please let me know if I also need to add something.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Index scan optimization
On 26 October 2014 10:42, Haribabu Kommi wrote:
> Hi, I reviewed the index scan optimization patch; the following are the observations.
> - Patch applies cleanly.
> - Compiles without warnings.
> - All regression tests pass.
> There is a good performance gain with the patch in almost all scenarios.

Thanks for reviewing.

> I have a question regarding the setting of the key's matched flag. Only the first key was set as matched, even if we have multiple index conditions. Is there any reason behind that?

Actually, keysCount can be more than one only if the key strategy is BTEqualStrategyNumber (i.e. the '=' condition). But our optimization is only for the '>' or '>=' conditions. So adding code to set the flag for all keys is of no use. Let me know if I am missing anything here.

> If any volatile function is present in the index condition, the index scan itself is not chosen. Is there any need to handle that, similar to NULLs?

Do you mean to ask whether we should add any special handling for volatile functions? If yes, then as you said, since an index scan itself is not selected for such functions, I feel we don't need to add anything for that. Any opinion?

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Index scan optimization
On 22 September 2014 18:51, Stephen Frost wrote:
> * Rajeev rastogi (rajeev.rast...@huawei.com) wrote:
>> Thanks, I shall start to prepare a patch for this optimization and share it in 1 or 2 days.
>
> This sounded interesting to me also - please be sure to put it on the open commitfest once you have posted the patch.

Thanks, I have added it to the next CommitFest, i.e. the October CommitFest.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] make pg_controldata accept -D dirname
On 24 September 2014 17:15, Michael Paquier wrote:
> On Wed, Sep 24, 2014 at 7:59 PM, Abhijit Menon-Sen <a...@2ndquadrant.com> wrote:
>> I can never remember that pg_controldata takes only a dirname rather than -D dirname, and I gather I'm not the only one. Here's a tiny patch to make it work either way. As far as I can see, it doesn't break anything, not even if you have a data directory named -D.
>
> +1. I forget it all the time as well...

+1. I always get confused when pg_controldata does not work with -D.
Re: [HACKERS] Index scan optimization
On 22 September 2014 19:17, Heikki Linnakangas wrote:
> On 09/22/2014 04:45 PM, Tom Lane wrote:
>> Heikki Linnakangas <hlinnakan...@vmware.com> writes:
>>> On 09/22/2014 07:47 AM, Rajeev rastogi wrote:
>>>> So my proposal is to skip the condition check on the first scan key condition for every tuple.
>>> The same happens in a single-column case. If you have a query like SELECT * FROM tbl2 where id2 > 'a', once you've found the start position of the scan, you know that all the rows that follow match too.
>> ... unless you're doing a backwards scan.
> Sure. And you have to still check for NULLs. Have to get the details right..

I have finished implementing the discussed optimization. I got a performance improvement of around 30% on the schema and data shared in the earlier mail. I also tested the index scan cases where our optimization does not apply, and observed that those queries are unaffected by this change.

Change details:
I have added a new flag SK_BT_MATCHED as part of sk_flags (ScanKey structure); the value used for it is 0x0004, which was unused. Inside the function _bt_first, once we finish finding the start scan position based on the first key, I append the flag SK_BT_MATCHED to the first key. Then in the function _bt_checkkeys, during the key comparison, I check whether the key has the SK_BT_MATCHED flag set; if yes, there is no need for further comparison. But if the tuple has a NULL value, then even if this flag is set, we continue with the comparison (this handles Heikki's point about checking NULLs).

I will add this patch to the next CommitFest. Please let me know your feedback.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: index_scan_opt.patch
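The key-check logic described above can be modelled in miniature. This is a simplified sketch, not the patch's actual nbtree code -- the struct, the toy ">" comparison, and check_key are stand-ins, and only the SK_BT_MATCHED value (0x0004) comes from the mail: once _bt_first has located the start position using the first scan key, that key is tagged SK_BT_MATCHED and later tuples skip its comparison, except when the attribute is NULL, in which case the full check still runs.

```c
#include <stdbool.h>

#define SK_BT_MATCHED 0x0004    /* value from the patch description */

/* Drastically simplified scan key: one int operand, "attr > operand". */
typedef struct ScanKeyData
{
    int sk_flags;
    int sk_argument;
} ScanKeyData;

static bool
check_key(const ScanKeyData *key, int attr, bool attr_is_null)
{
    /* Flag set and value not NULL: already proven by the start-position search. */
    if ((key->sk_flags & SK_BT_MATCHED) && !attr_is_null)
        return true;

    if (attr_is_null)
        return false;           /* NULLs never satisfy the qual */

    return attr > key->sk_argument;   /* the comparison the flag lets us skip */
}
```

The saving is exactly the skipped comparison: for a varchar first column that comparison is a varstr_cmp call per tuple, which is where the ~661M saved instructions reported earlier in the thread come from.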
Re: [HACKERS] Index scan optimization
On 22 September 2014 12:35, Heikki Linnakangas wrote:
>> I have observed a scope of considerable performance improvement in the case of indexes by a very minor code change. Consider the below schema:
>> create table tbl2(id1 int, id2 varchar(10), id3 int);
>> create index idx2 on tbl2(id2, id3);
>> Query as: select count(*) from tbl2 where id2 > 'a' and id3 > 99;
>> As per the current design, it takes the following steps to retrieve index tuples:
>> 1. Find the scan start position by searching for the first position in the BTree as per the first key condition, i.e. as per id2 > 'a'.
>> 2. Then it fetches each tuple from the position found in step 1.
>> 3. For each tuple, it matches all scan key conditions; in our example it matches both scan key conditions.
>> 4. If the conditions match, it returns the tuple; otherwise the scan stops.
>> Now the problem here is that the first scan key condition has already been matched to find the scan start position (step 1), so it is obvious that any further tuple will also match the first scan key condition (as records are sorted). So comparing the first scan key condition again in step 3 seems to be redundant.
>> So my proposal is to skip the condition check on the first scan key condition for every tuple.
>
> The same happens in a single-column case. If you have a query like SELECT * FROM tbl2 where id2 > 'a', once you've found the start position of the scan, you know that all the rows that follow match too.

Very much true.

>> I would like to submit the patch for this improvement. Please provide your feedback. Also let me know if I am missing something.
>
> Yeah, sounds like a good idea. This scenario might not arise very often, but it should be cheap to check, so I doubt it will add any measurable overhead to the cases where the optimization doesn't help.

Thanks, I shall start to prepare a patch for this optimization and share it in 1 or 2 days.

Thanks and Regards,
Kumar Rajeev Rastogi
[HACKERS] Index scan optimization
I have observed a scope of considerable performance improvement in the case of indexes by a very minor code change. Consider the below schema:

create table tbl2(id1 int, id2 varchar(10), id3 int);
create index idx2 on tbl2(id2, id3);

Query as: select count(*) from tbl2 where id2 > 'a' and id3 > 99;

As per the current design, it takes the following steps to retrieve index tuples:
1. Find the scan start position by searching for the first position in the BTree as per the first key condition, i.e. as per id2 > 'a'.
2. Then it fetches each tuple from the position found in step 1.
3. For each tuple, it matches all scan key conditions; in our example it matches both scan key conditions.
4. If the conditions match, it returns the tuple; otherwise the scan stops.

Now the problem here is that the first scan key condition has already been matched to find the scan start position (step 1), so it is obvious that any further tuple will also match the first scan key condition (as records are sorted). So comparing the first scan key condition again in step 3 seems to be redundant.

So my proposal is to skip the condition check on the first scan key condition for every tuple. I have prototype code to see the result. Below is some test data as per the query attached with this mail:

Time:
Original code took: 3621 ms
Optimized code took: 2576 ms
So the overall performance improvement is around 29%.

Instructions:
Original code took: 20057 M
Optimized code took: 14557 M
So the overall instruction improvement is around 27%.

Also, I observed that the function varstr_cmp alone took around 661M instructions, which is completely saved in the optimized code.

This optimization can be extended:
1. To the case where, once all scan keys match the condition, there is no need to check the conditions for further tuples for any scan key; just keep returning the tuples.
2. The above approach can be used for the '<' operator also, by finding the position of the last matching tuple.

I would like to submit the patch for this improvement. Please provide your feedback.
Also let me know if I am missing something.

Thanks and Regards,
Kumar Rajeev Rastogi

--Schema
create table tbl2(id1 int, id2 varchar(10), id3 int);
create index idx2 on tbl2(id2, id3);

--Procedure to insert 1M data:
create or replace function insert_data(count1 int, count2 int) returns void AS $$
Begin
    for i IN 1..count1 loop
        insert into tbl2 values(i, 'a', i);
    end loop;
    for i IN count1+1..count2 loop
        insert into tbl2 values(i, 'b', i);
    end loop;
End;
$$ language plpgsql;

select insert_data(99, 100);

--Procedure to select data 1000 times (1000 times selected to make data more appropriate.)
create or replace function select_data(count1 int) returns void AS $$
declare
    x int;
Begin
    for i IN 1..count1 loop
        select count(*) into x from tbl2 where id2 > 'a' and id3 > 99;
    end loop;
End;
$$ language plpgsql;

select select_data(1000);

--Result (ms):
            Optimized    Original
Reading-1   2579.27      3621.82
Reading-2   2573.82      3618.29
Reading-3   2575.08      3625.16
Average     2576.06      3621.76

Overall improvement: 29%

Instructions:
Original code took: 20057 M instructions
Optimized code took: 14557 M instructions
So the overall instruction improvement is around 27%.
Re: [HACKERS] Support for N synchronous standby servers
On 23 August 2014 11:22, Michael Paquier wrote:
>> 2. The logic of deciding the highest-priority one seems to be incorrect.
>> Assume s_s_num = 3, s_s_names = 3,4,2,1, and the standby nodes are in this order: 1,2,3,4,5,6,7.
>> As per the logic in the patch, node 4 with priority 2 will not be added to the list, whereas 1,2,3 will be added. The problem is that the priority tracked for the next comparison is not the highest priority as of that iteration; it is just the priority of the last node added to the list. So it may happen that a node with higher priority is still outside the list while we are comparing against some other, smaller priority.
>
> Fixed. Nice catch! Actually, by re-reading the code I wrote yesterday I found that the fix in v6 for that is not correct. That's really fixed with v7 attached.

I have done some more review; below are my comments:

1. There are currently two loops on *num_sync. Can we simplify the function SyncRepGetSynchronousNodes by moving the priority calculation inside the upper loop?

    if (*num_sync == allowed_sync_nodes)
    {
        for (j = 0; j < *num_sync; j++)
        {

Anyway, we require the priority only if the *num_sync == allowed_sync_nodes condition matches. So in this loop itself we can calculate the priority as well as do the assignment of new standbys with lower priority. Let me know if you see any issue with this.

2. Comment inside the function SyncRepReleaseWaiters:

    /*
     * We should have found ourselves at least, except if it is not expected
     * to find any synchronous nodes.
     */
    Assert(num_sync > 0);

I think "except if it is not expected to find any synchronous nodes" is not required, as reaching this point means at least this node is synchronous.

3. The documentation says that s_s_num should be less than max_wal_senders, but code-wise there is no protection for the same. IMHO s_s_num should be less than or equal to max_wal_senders; otherwise COMMIT will never return to the console, without the user knowing why.
I see that some discussion has happened regarding this, but I think just adding documentation for this is not enough. I am not sure what issue was observed with adding the check during GUC initialization, but if there is an unavoidable issue during GUC initialization, then can't we try to add the check at a later point?

4. Similarly for the interaction between the parameters s_s_names and s_s_num: I see some discussion has happened regarding this, and it is acceptable to have s_s_num larger than the number of names in s_s_names. But I was thinking we should at least give some notice message to the user in such a case, along with some documentation.

config.sgml
5. "At any one time there will be at a number of active synchronous standbys": this sentence is not proper.
6. "When this parameter is set to literal 0, all the standby nodes will be considered as asynchronous." Can we make this: "When this parameter is set to literal 0, all the standby nodes will be considered asynchronous, irrespective of the value of synchronous_standby_names."
7. "Are considered as synchronous the first elements of xref linkend=guc-synchronous-standby-names in number of xref linkend=guc-synchronous-standby-num that are connected." The start of this sentence looks to be incomplete.

high-availability.sgml
8. "Standbys listed after this will take over the role of synchronous standby if the first one should fail." Should we not make it: "Standbys listed after this will take over the role of synchronous standby if any of the first synchronous_standby_num standbys fails."

Let me know in case something is not clear.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Support for N synchronous standby servers
On 09 August 2014 11:33, Michael Paquier wrote:
> Please find attached a patch to add support of synchronous replication for multiple standby servers. This is controlled by the addition of a new GUC parameter called synchronous_standby_num, that makes the server wait for transaction commit on the first N standbys defined in synchronous_standby_names. The implementation is really straight-forward, and has just needed a couple of modifications in walsender.c for pg_stat_get_wal_senders and syncrep.c.

I have just started looking into this patch. Please find below my first level of observations on the patch:

1. The allocation of memory for sync_nodes in the function SyncRepGetSynchronousNodes should be based on allowed_sync_nodes instead of max_wal_senders, as anyway we are not going to store more sync standbys than allowed_sync_nodes.

    sync_nodes = (int *) palloc(allowed_sync_nodes * sizeof(int));

2. The logic of deciding the highest-priority one seems to be incorrect.
Assume s_s_num = 3, s_s_names = 3,4,2,1, and the standby nodes are in this order: 1,2,3,4,5,6,7.
As per the logic in the patch, node 4 with priority 2 will not be added to the list, whereas 1,2,3 will be added. The problem is that the priority tracked for the next comparison is not the highest priority as of that iteration; it is just the priority of the last node added to the list. So it may happen that a node with higher priority is still outside the list while we are comparing against some other, smaller priority.

3. Can we optimize the function SyncRepGetSynchronousNodes in such a way that it gets the number of standby nodes from s_s_names itself? With this it will be possible to stop scanning the moment we get the first s_s_num potential standbys.

Thanks and Regards,
Kumar Rajeev Rastogi
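The selection bug in point 2 can be illustrated with a small, self-contained sketch (this is illustrative code, not the patch's actual SyncRepGetSynchronousNodes): keep the N standbys with the lowest priority values (a lower number means higher priority). The bug comes from comparing new candidates against the priority of the *last node added* rather than the *worst priority currently kept*; recomputing the worst kept slot before each replacement avoids it.

```c
/*
 * Fill out[] with the indexes of the nsync nodes having the lowest
 * (i.e. best) priority values in prio[].  Returns how many were kept.
 */
static int
pick_sync_standbys(const int *prio, int nnodes, int *out, int nsync)
{
    int count = 0;

    for (int i = 0; i < nnodes; i++)
    {
        if (count < nsync)
        {
            out[count++] = i;           /* still room: take every candidate */
            continue;
        }

        /* Find the slot holding the worst (largest) priority kept so far. */
        int worst = 0;
        for (int j = 1; j < nsync; j++)
            if (prio[out[j]] > prio[out[worst]])
                worst = j;

        /* A better candidate replaces the current worst, never an arbitrary one. */
        if (prio[i] < prio[out[worst]])
            out[worst] = i;
    }
    return count;
}
```

With the email's example (s_s_names = 3,4,2,1, so the listed nodes carry priorities 4, 3, 1, 2 in connection order), this keeps the nodes with priorities 1, 2 and 3, including node 4 that the buggy version dropped.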
[HACKERS] Unwanted LOG during recovery of DROP TABLESPACE REDO
I found and fixed a bug that causes recovery (crash recovery, PITR) to emit an unwanted LOG message if the tablespace symlink is not found during the processing of the DROP TABLESPACE redo:

    LOG: could not remove symbolic link pg_tblspc/16384: No such file or directory

To reproduce the issue:
1. Start the server.
2. Create a tablespace.
3. Perform a checkpoint.
4. Drop the tablespace.
5. Stop the server using immediate mode.
6. Start the server: at this stage, recovery emits the log message mentioned above.

The reason is that DROP TABLESPACE has already removed the symlink, and removal is then tried again during recovery. As it is very much possible that DROP TABLESPACE succeeded and cleaned up the file before the server crashed, this should be considered a valid scenario, and there is no need to emit any LOG in such a case. The processing of the CREATE TABLESPACE redo already handles the analogous case.

I will add this to the 2014-08 CF for review.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: rec_issue_with_drop_tblspc_redo.patch
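The fix's idea can be sketched in plain C (this is an illustrative model, not the server's tablespace code, and remove_symlink_for_redo is an invented name): during redo of DROP TABLESPACE, a missing symlink is an expected state, because the original transaction may already have removed it before the crash, so ENOENT is tolerated silently instead of producing a LOG message; any other failure is still reported.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Remove the tablespace symlink while replaying DROP TABLESPACE.
 * Returns true when the on-disk state matches what redo wants
 * (link gone), whether we removed it now or it was already absent.
 */
static bool
remove_symlink_for_redo(const char *path)
{
    if (remove(path) == 0)
        return true;            /* removed now: redo did the cleanup */

    if (errno == ENOENT)
        return true;            /* already gone before the crash: valid, stay silent */

    perror(path);               /* any other failure is still worth reporting */
    return false;
}
```

This mirrors how the CREATE TABLESPACE redo path already treats "object already in the expected state" as success rather than a loggable anomaly.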
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
On 04 July 2014 19:29, MauMau Wrote:

[How to fix] Of course, adding -o '-c synchronous_commit=local' or -o '-c synchronous_standby_names=' to pg_ctl start in the recovery script would prevent the problem. But isn't there anything to fix in PostgreSQL? I think the doc needs improvement so that users won't misunderstand that only write transactions would block at commit.

As of now there is no solution for this in PostgreSQL, but I had submitted a patch, Standalone synchronous master, in the 9.4 2014-01 CommitFest, which was rejected because of some issues. That patch was meant to degrade the synchronous level of the master if all synchronous standbys are down. I plan to resubmit it with a better design sometime in 9.5.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
On 01 July 2014 12:00, Amit Kapila Wrote:
On Tue, Jul 1, 2014 at 11:46 AM, Rajeev rastogi rajeev.rast...@huawei.com wrote:
On 30 June 2014 22:50, Pavel Stehule Wrote: I didn't find a related message. ?

I think there has been some confusion: the design idea was never rejected, but yes, there were a few pieces of feedback/concern, which I had clarified. Also, some of the other concerns are already fixed in the latest patch.

Simon has mentioned that exactly this idea was rejected at PGCon two years back. Please refer to the mail below: http://www.postgresql.org/message-id/bf2827dcce55594c8d7a8f7ffd3ab7713dde1...@szxeml508-mbx.china.huawei.com As far as I can see, you never came back with a different solution.

Yeah, right. I tried to search the archived mails to get the details of that discussion, but I could not find anything regarding the design. So I am not sure how I should make my solution different from the earlier one, as the earlier solution is not accessible to me. Any help regarding this would be a great help to me.

Also, from the current autonomous transaction discussion thread (including ca+u5nmkeum4abrqbndlyt5ledektae8rbiyw3977yhmeowq...@mail.gmail.com), I have summarized all the important feedback below, along with the resolution adopted:

1. Pavel Stehule (07-04-2014): -1 for Oracle syntax - it is hardly consistent with Postgres.
Changed the syntax to "START AUTONOMOUS TRANSACTION".

2. Pavan (10-04-2014): Make autonomous transaction properties independent of the main transaction.
Made all properties of the autonomous transaction (including read-only) independent of the main transaction, except the isolation level, which I did not find very useful to keep different. But if others' opinion differs, we can make this property independent as well.

3. Alvaro Herrera (09-04-2014): Autonomous transactions to have their own separate proc entry.
The conclusion was not to have a separate proc entry for autonomous transactions, and that is what was done.

4. Tom Lane (09-04-2014): The point being that you need to change both pg_subtrans and pg_clog to make that state transition.
This is handled for autonomous transactions.

5. Robert Haas (09-04-2014): Not in favour of the current design of maintaining lock masks for autonomous transactions.
I had replied to this mail explaining why this design was chosen, but if the design for this part is still not acceptable, I can rework it to make it better. To do so, I would be very happy to have more discussion, to get concrete feedback and direction for improvement.

6. Tom Lane (09-04-2014): No justification for distinguishing normal and autonomous transactions at this level (the locking level).
I had replied to this earlier as well. The reason for distinguishing at this level is to handle any kind of deadlock possibility between the main and the autonomous transaction. Deadlock handling between the main and autonomous transaction was one of the requirements discussed at PGCon 2012 as part of the autonomous transaction discussion. Please let me know if I am missing something in this.

All of the above mentioned changes are included in the latest patch shared. Please let me know if I have missed any other important points from the earlier discussion; I would like to address those as well.

Have you checked the discussion in the developer meeting notes? Please check it at the link below: http://wiki.postgresql.org/wiki/PgCon_2012_Developer_Meeting#Autonomous_Transactions

From the discussion, I am able to make out two important points:
1. The main transaction and the autonomous transaction should be independent and can conflict. This is already included in our latest patch.
2. Utility commands like VACUUM and CREATE INDEX CONCURRENTLY should work from an autonomous transaction. Neither of these utility commands is supported even inside a main transaction block, so they do not work within an autonomous transaction either.
Any opinion about this? Please let me know if I have missed any points from the link given.

So I wanted to have this patch in the commitfest application, so that we can have a healthy discussion and rectify all the issues. But now I see that this patch has already been moved to the rejected category, which will put a brake on further review.

I believe this patch should ideally have been marked as Returned with feedback, as you already got feedback long back and never came up with a solution for it.

Since this patch is very big and complex, it is better that we continue discussing it from the first CommitFest itself, so that we get ample time to hear everyone's opinion and then address all possible issues. Any opinions/suggestions are welcome. Also let me know if I have missed something.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Add the number of pinning backends to pg_buffercache's output
On 23 June 2014 15:22, Andres Freund Wrote:

This may be harmless, but pinning_backends should be defined as int4 rather than int8, because BufferDesc->refcount is just defined as unsigned and it's converted to Datum by Int32GetDatum().

Well, in theory a uint32 can't generally be converted to an int32. That's why I chose an int64, because it's guaranteed to have sufficient range. It's fairly unlikely to have that many pins, but using an int64 seems cheap enough here.

But since we have declared pinning_backends as int8, we should use Int64GetDatum to form the tuple datum. Using Int32GetDatum will lead to a problem especially on WIN32 or any other platform where int8 is not pass-by-value (i.e. where FLOAT8PASSBYVAL is 0). On such a platform, while forming the tuple, the type will be marked as pass-by-reference, but the raw value (rather than its address) will be passed as the Datum, which will lead to a crash. I have done testing to verify this: I can observe the crash on win32, whereas it works fine on Linux.

Also, can we change the description of the function pg_buffercache_pages to include pinning_backends in the description as well?

Thanks and Regards,
Kumar Rajeev Rastogi
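The truncation hazard can be illustrated without PostgreSQL headers. This is a hedged, standalone sketch: NarrowDatum and wrong_int32_get_datum are invented stand-ins, not real PostgreSQL symbols. On a platform where Datum is only 32 bits wide, an int8 value must go through Int64GetDatum, which passes it by reference; forcing it through a 32-bit conversion silently drops the high bits (and the tuple-forming code may then dereference a raw integer as a pointer):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a 32-bit-wide Datum, as on a platform where
 * FLOAT8PASSBYVAL is 0 and int8 is pass-by-reference. */
typedef uint32_t NarrowDatum;

/* What Int32GetDatum effectively does to an int64 value on such a
 * platform: the high 32 bits are lost. */
static NarrowDatum
wrong_int32_get_datum(int64_t v)
{
    return (NarrowDatum) v;
}
```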
Re: [HACKERS] Autonomous Transaction (WIP)
On 30 June 2014 22:50, Pavel Stehule Wrote:

2014-06-30 12:38 GMT+02:00 Abhijit Menon-Sen a...@2ndquadrant.com: If I understand correctly, the design of this patch has already been considered earlier and rejected. So I guess the patch should also be marked rejected?

I didn't find a related message. ?

I think there has been some confusion: the design idea was never rejected, but yes, there were a few pieces of feedback/concern, which I had clarified. Also, some of the other concerns are already fixed in the latest patch. So I wanted to have this patch in the commitfest application, so that we can have a healthy discussion and rectify all the issues. But now I see that this patch has already been moved to the rejected category, which will put a brake on further review. So is there any way to bring this patch back and continue reviewing it? Please let me know if there is any issue or if I am missing something.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
On 01 July 2014 12:26, Pavel Stehule Wrote:

Have you checked the discussion in the developer meeting notes? Please check it at the link below: http://wiki.postgresql.org/wiki/PgCon_2012_Developer_Meeting#Autonomous_Transactions

Are these notes still valid?

* Why should autonomous transactions be tied closely to functions? We can implement AT as a first step, and integration of AT into stored procedures can be the next step.

We have implemented AT along the lines of subtransactions. We have also integrated AT with stored procedures, i.e. we can create an autonomous transaction inside a stored procedure, and it can also be committed there.

* When an autonomous transaction is independent of the parent transaction, locks of the parent and the autonomous transaction should conflict.

Yes, our implementation makes the autonomous transaction independent of the main transaction, and hence, per our design, the parent (main) transaction and the autonomous transaction may conflict. For this we have implemented a deadlock detection mechanism between an autonomous transaction and its parent transaction.

I thought about integration with PL/pgSQL, and I don't think such close integration between autonomous transactions and procedures is optimal. A more practical design makes an autonomous transaction similar to a subtransaction. Then we could simply write code like:

BEGIN
  .. some code
WHEN OTHERS THEN
  -- I would like to write to the log permanently
  BEGIN AUTONOMOUS
    INSERT INTO log VALUES(..);
  WHEN OTHERS
    RAISE WARNING 'Cannot write to log ..';
    RAISE EXCEPTION ' ...'  -- forward the exception from the autonomous transaction up to the parent transaction
  END
END;

Now I am thinking that the PL/SQL design of autonomous transactions is relatively limited and is not the best one to follow.

Yes, as mentioned above, our implementation of autonomous transactions follows the subtransaction model. With our approach, we can use an autonomous transaction in a procedure as given below:

BEGIN
  .. some code
WHEN OTHERS THEN
  -- I would like to write to the log permanently
  START AUTONOMOUS TRANSACTION
    INSERT INTO log VALUES(..);
  COMMIT;
WHEN OTHERS
  RAISE WARNING 'Cannot write to log ..';
  RAISE EXCEPTION ' ...'  -- forward the exception from the autonomous transaction up to the parent transaction
END
END;

Please let me know if I have missed answering any of your queries.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] psql: show only failed queries
On 26 June 2014 11:53, Samrat Revagade Wrote:

I am sending an updated patch - the buggy statement is printed via the more logical psql_error function instead of printf.

Thank you for updating the patch, I really appreciate your efforts. Now everything is good from my side:
* it applies cleanly to the current git master
* it includes the necessary docs
* I think it is a very good and necessary feature.
If Kumar Rajeev Rastogi does not have any extra comments, then I think the patch is ready for committer.

I have reviewed this patch. Please find my review comments below:
1. A command start-up option (like -a/--echo-all for ECHO=all) is not provided for the new functionality.
2. The new command start-up option should be added to psql --help as well as to the documentation.

Also, as I understand it, this new option is a kind of subset of the existing option (ECHO=query), so shouldn't we display the query string in the same format as it was printed earlier? Though I also feel that prefixing the query with the word STATEMENT will be helpful for grep, at the same time I am worried about inconsistency with the existing option.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] psql: show only failed queries
On 30 June 2014 12:24, Pavel Stehule Wrote:

I have reviewed this patch. Please find my review comments below:
1. A command start-up option (like -a/--echo-all for ECHO=all) is not provided for the new functionality.

Not all options entered via psql variables have a corresponding psql option. I plan to add a new description to the --help-variables list. If it is necessary I can add a long option --echo-errors; I didn't find a good character for a short option. Any idea?

But since the new option we are adding follows the track of the existing options, it would be better to add a start-up option for this one as well. The long option --echo-errors seems fine to me too. For the short option, I think we can use "-b", standing for blunder. That is the closest one I could think of.

2. The new command start-up option should be added to psql --help as well as to the documentation.

Depends on the previous point.

Right.

Also, as I understand it, this new option is a kind of subset of the existing option (ECHO=query), so shouldn't we display the query string in the same format as it was printed earlier? Though I also feel that prefixing the query with the word STATEMENT will be helpful for grep, at the same time I am worried about inconsistency with the existing option.

This is the question, and I don't have a strong feeling about what should be preferred. Still, I am inclined to prefer the variant with the STATEMENT prefix. The -a mode is used for a different purpose than the show-errors-only mode, and the prefixed output is much more consistent with a log entry displaying an error. So I agree there is a potential inconsistency (though the output is defined nowhere), but this output is more practical when you are focused on error processing.

Yeah, right. I just wanted to raise the point to provoke thought and see if anyone has a different opinion. If there is no objection from others, we can go ahead with the current prefixing approach.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Proposal for CSN based snapshots
On 13 May 2014 14:06, Heikki Linnakangas Wrote:

The core of the design is to store the LSN of the commit record in pg_clog. Currently, we only store 2 bits per transaction there, indicating if the transaction committed or not, but the patch will expand it to 64 bits, to store the LSN. To check the visibility of an XID in a snapshot, the XID's commit LSN is looked up in pg_clog, and compared with the snapshot's LSN.

Won't it be a bit inefficient to look into pg_clog to read the XID's commit LSN for every visibility check?

Maybe. If no hint bit is set on the tuple, you have to check the clog anyway to determine if the tuple is committed. And for XIDs older than xmin or newer than xmax, you don't need to check pg_clog. But it's true that for tuples with the hint bit set, and xmin < XID < xmax, you have to check the pg_clog in the new system, when currently you only need to do a binary search of the local array in the snapshot. My gut feeling is that it won't be significantly slower in practice. If it becomes a problem, some rearrangement of the pg_clog code might help, or you could build a cache in SnapshotData of XID->CSN mappings that you've already looked up. So I don't think that's going to be a show-stopper.

Yes, definitely it should not be a show-stopper. This can be optimized later by the methods you mentioned, and also by some cut-off technique by which we decide that any XID beyond a certain range will always be visible, thereby avoiding the pg_clog lookup.

Thanks and Regards,
Kumar Rajeev Rastogi
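The visibility rule being debated above can be sketched as a toy model. This is a hedged, illustrative implementation: the names, the array-backed "clog", and the fast paths are stand-ins for the real pg_clog and snapshot machinery, and the sketch assumes aborted transactions are filtered out earlier (e.g. via hint bits):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;
typedef uint64_t Lsn;

#define MAX_XID     256
#define INVALID_LSN 0           /* not yet committed */

/* toy pg_clog: commit LSN per xid, INVALID_LSN while in progress */
static Lsn clog_commit_lsn[MAX_XID];

/* An xid is visible to a snapshot iff it committed at an LSN no later
 * than the snapshot's LSN.  The xmin/xmax fast paths avoid the clog
 * lookup for clearly-old and clearly-new xids, as discussed above. */
static bool
xid_visible(Xid xid, Xid snap_xmin, Xid snap_xmax, Lsn snap_lsn)
{
    if (xid < snap_xmin)
        return true;            /* finished before any in-progress xact */
    if (xid >= snap_xmax)
        return false;           /* started after the snapshot was taken */

    Lsn commit_lsn = clog_commit_lsn[xid];
    return commit_lsn != INVALID_LSN && commit_lsn <= snap_lsn;
}
```

The proposed XID->CSN cache would simply memoize the clog_commit_lsn lookups inside the xmin/xmax window.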
Re: [HACKERS] Proposal for CSN based snapshots
On 12 May 2014 19:27, Heikki Linnakangas Wrote:

On 01/24/2014 02:10 PM, Rajeev rastogi wrote: We are also planning to implement CSN based snapshots. So I am curious to know whether any further development is happening on this.

I started looking into this, and plan to work on this for 9.5. It's a big project, so any help is welcome. The design I have in mind is to use the LSN of the commit record as the CSN (as Greg Stark suggested).

Great!

Some problems and solutions I have been thinking of: The core of the design is to store the LSN of the commit record in pg_clog. Currently, we only store 2 bits per transaction there, indicating if the transaction committed or not, but the patch will expand it to 64 bits, to store the LSN. To check the visibility of an XID in a snapshot, the XID's commit LSN is looked up in pg_clog, and compared with the snapshot's LSN.

Won't it be a bit inefficient to look into pg_clog to read the XID's commit LSN for every visibility check?

With this mechanism, taking a snapshot is just a matter of reading the current WAL insertion point. There is no need to scan the proc array, which is good. However, it probably still makes sense to record an xmin and an xmax in SnapshotData, for performance reasons. An xmax, in particular, will allow us to skip checking the clog for transactions that will surely not be visible. We will no longer track the latest completed XID or the xmin like we do today, but we can use ShmemVariableCache->nextXid as a conservative value for xmax, and keep a cached global xmin value in shared memory, updated when convenient, that can simply be copied into the snapshot.

I think we can update xmin whenever the transaction whose XID equals xmin gets committed (i.e. in ProcArrayEndTransaction).

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Allowing empty target list in SELECT (1b4f7f93b4693858cb983af3cd557f6097dab67b)
On 02 May 2014 10:00, Amit Langote Wrote:

Is the following behavior perceived fix-worthy?

-- note the '1's in the output
postgres=# CREATE TABLE test AS SELECT;
SELECT 1
postgres=# insert into test select;
INSERT 0 1

Or maybe it just means one 'null' row/record, and not no row at all?

It just creates an item pointer, and the corresponding heap tuple header (without data or a null bitmap) gets stored as part of the insertion. So although it does not insert anything (not even NULL), it still reserves one row position. Hence SELECT will not display anything, but the command tags show the actual number of rows. Even the syntax below is allowed:

CREATE TABLE no_column_table();

IMO this might be useful for dynamic use of a table (columns might be added later using ALTER) or for use as an abstract ancestor in a class hierarchy.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] So why is EXPLAIN printing only *plan* time?
On 27 April 2014 22:38, Tom Lane Wrote:

... and not, in particular, parse analysis or rewrite time? I'd been a bit suspicious of the recent patch to add $SUBJECT without the other pre-execution components, but it just now occurred to me that there's at least one reason why this might be a significant omission: any delay caused by waiting to acquire locks on the query's tables will be spent in the parser. It's probably okay to omit raw parsing time (that is, flex+bison), even though profiling sometimes suggests that that's a nontrivial cost. It's completely predictable and more or less linear in the query string length, so there are not likely to be any surprises for users in that time. But it's not at all clear to me that the same can be said for parse analysis or rewrite times. Rewrite timing could easily be captured by EXPLAIN since that call is done within ExplainQuery(). Parse analysis isn't, but we could imagine having transformExplainStmt() time the operation and stick the result into a new field in struct ExplainStmt. I'm not sure if it'd be appropriate to add all of these measurements as separate printout lines; arguably we should just fold them into planning time. Thoughts?

My understanding is that the planning time printed by EXPLAIN exists only to show how much time was spent creating the execution plan. If we start folding all the earlier phases (i.e. parsing + analysis + rewrite) into planning time, the figure might lose its whole meaning. And if we add the time spent waiting for locks, the planning time of the same query with the same statistics will vary with how long it had to wait to acquire the locks, which will be a bit confusing for users. Maybe, if we really have to print the other times (parsing + analysis + rewrite), IMHO we should print them under some different name instead of including them in planning time.
Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
On 14 April 2014 20:10, Simon Riggs wrote:

Autonomous Transaction Storage: As for the main transaction, the structure PGXACT is used to store main transactions, which are created in shared memory of size (number of processes) * sizeof(struct PGXACT). Similarly, a new structure will be defined to store autonomous transactions: struct PGAutonomousXACT.

I already proposed exactly this design two years ago and it was rejected at the PgCon hackers meeting. I have a better design worked out now and will likely be working on it for 9.5.

Oh... I had already added this patch to the 2014-June CommitFest, thinking that everyone was busy with work to wrap up 9.4. Can we work together to take this feature to its final goal? Maybe you can go through my complete patch and see which parts of it and its design can be reused along with your new design. Also, if possible, please share your design (even a rough one is OK); I will see if I can contribute to it in some way.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
On 09 April 2014 21:25, Robert Haas Wrote:

Deadlock Detection: I'm not sure how this would work out internally.

In order to detect deadlocks, two member variables will be added to the structure PROCLOCK:
Bitmask for lock types currently held by the autonomous transaction: LOCKMASK holdMaskByAutoTx[MAX_AUTO_TX_LEVEL]
Bitmask for lock types currently held by the main transaction: LOCKMASK holdMaskByNormalTx
Now when we grant a lock to a particular transaction, depending on the type of transaction, the bit mask will be set on either holdMaskByAutoTx or holdMaskByNormalTx. Similarly, when a lock is ungranted, the corresponding bitmask will be cleared.

That sounds pretty ugly, not to mention the fact that it will cause a substantial increase in the amount of memory required to store PROCLOCKs. It will probably slow things down, too.

Actually, I followed the above design to keep it aligned with the existing design: as I understand it, all lock conflicts are currently checked based on the corresponding lock bitmasks. It is a good catch that the shared memory required will increase, but isn't that justified from the user's perspective, since we are allowing more transactions per session and hence the memory required to store the various kinds of resources will increase? Since we only additionally set a bitmask for each lock (if there is an autonomous transaction, there will be one more bitmask to set and one more deadlock check), I don't think it should slow down the overall operation. Also, we can keep the number of autonomous transactions configurable (default 0), to keep the impact low when the feature is not configured.

An autonomous transaction can also conflict with the main transaction, so in order to check conflicts between them, I am distinguishing at this level. Please correct me if I am wrong anywhere, and please share your thoughts on this and on the overall design.
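The per-PROCLOCK separation being proposed can be sketched in miniature. This is a hedged, standalone illustration with invented names (ProcLockSketch, a two-mode conflict table): the real lock.c conflict table has eight modes, but the mechanism is the same — a lock request made by the autonomous transaction is checked against the modes its own parent holds on the same object:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t LOCKMASK;
#define LOCKBIT_ON(mode) ((LOCKMASK) 1 << (mode))

/* two lock modes are enough for the sketch */
enum { ShareMode = 1, ExclusiveMode = 2 };

/* Exclusive conflicts with everything; Share only with Exclusive. */
static const LOCKMASK conflicts[3] = {
    0,
    LOCKBIT_ON(ExclusiveMode),                          /* ShareMode */
    LOCKBIT_ON(ShareMode) | LOCKBIT_ON(ExclusiveMode)   /* ExclusiveMode */
};

/* Miniature of the proposed PROCLOCK extension: the same backend keeps
 * separate hold masks for its main transaction and its autonomous
 * transaction (one level here, for simplicity). */
typedef struct
{
    LOCKMASK holdMaskByNormalTx;
    LOCKMASK holdMaskByAutoTx;
} ProcLockSketch;

/* Does a request from the autonomous transaction conflict with what
 * the main transaction of the same session already holds? */
static bool
auto_tx_request_conflicts(const ProcLockSketch *pl, int mode)
{
    return (conflicts[mode] & pl->holdMaskByNormalTx) != 0;
}
```

Without the separate masks, both transactions' holds would be merged into one mask and such an intra-session conflict (the seed of a main-vs-autonomous deadlock) could never be detected.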
Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
On 10 April 2014 11:18, Pavan Deolasee Wrote:

I could think of a few global variables, such as those related to transaction properties (i.e. read-only mode, isolation level, etc.). As I plan to keep the transaction properties of the autonomous transaction the same as those of the main transaction, there is no need to have these global variables separately. Apart from this, there are global variables like within-transaction counters, GUCs, and xactStartTimeStamp. I think there is no need to maintain these variables separately either; they can continue from their previous values for the autonomous transaction, similar to what a subtransaction does.

Hmm. Is that in line with what other databases do? I would have preferred an AT to run like a standalone transaction without any influence from the starting transaction, managing its own resources/locks/visibility/triggers etc.

To me it seems not very useful to keep the transaction properties separate, except for the read-only property (though Oracle does not share any transaction properties). So we can have the restriction that the isolation and deferrable properties of the main transaction are inherited by the autonomous transaction, while the read-only property can be set independently by the autonomous transaction. That looks like a fair restriction to me. To keep the read-only property separate, there is already infrastructure in PG: inside the structure TransactionStateData there is the variable prevXactReadOnly (the entry-time xact r/o state), which can keep the parent transaction's read-only property while XactReadOnly is changed to the current transaction's property. Moreover, we can treat this (transaction properties) as a feature enhancement once the basic infrastructure is established, if that is acceptable to everyone.

An autonomous transaction will not share resources/locks/visibility etc. with the main transaction. This has already been taken care of in the WIP patch.
In the case of an autonomous transaction, the only global variables specifically initialized are those related to resources (similar to a subtransaction), which in any case get stored in the current transaction state. Please let me know if I am missing something or if you have an issue with some specific global variable.

No, I don't have any specific issues in mind. Mostly all such global state is managed through the various AtStart/AtEOX and related routines. So a careful examination of all those routines will give a good idea of what needs to be handled. You will probably need to write AtATStart/AtATEOX and similar routines to manage the state at AT start/commit/rollback. Sorry, I haven't looked at your WIP patch yet.

For some of the resources, I have already written AtATStart/AtATEOX-style routines in the WIP patch. Comments/feedback/doubts are welcome.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
On 09 April 2014 12:14, Pavan Deolasee Wrote:

Whenever I was asked to have a look at implementing this feature, I always wondered about the great amount of global state that a backend maintains, which is normally tied to a single top transaction. Since an AT will have the same characteristics as a top-level transaction, I wonder how you plan to separate those global state variables. Sure, we can group them in a structure and put them on a stack when an AT starts and pop them off when the original top transaction becomes active again, but finding all such global state variables is going to be tricky.

I could think of a few global variables, such as those related to transaction properties (i.e. read-only mode, isolation level, etc.). As I plan to keep the transaction properties of the autonomous transaction the same as those of the main transaction, there is no need to have these global variables separately. Apart from this, there are global variables like within-transaction counters, GUCs, and xactStartTimeStamp. I think there is no need to maintain these variables separately either; they can continue from their previous values for the autonomous transaction, similar to what a subtransaction does. In the case of an autonomous transaction, the only global variables specifically initialized are those related to resources (similar to a subtransaction), which in any case get stored in the current transaction state. Please let me know if I am missing something or if you have an issue with some specific global variable.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
On 09 April 2014 01:09, Robert Haas Wrote:

I'm also pretty unconvinced that multiple PGPROCs is the right way to go. First, PGPROCs have a bunch of state in them that is assumed to exist once per backend. We might find pretty substantial code churn there if we try to go change that.

Yes, you are right. That is why I am not creating a separate procarray entry to maintain the autonomous transaction. Please find the details in my previous reply, sent earlier today.

Second, why do other backends really need to know about our ATs? As far as I can see, if other backends see the AT as a subtransaction of our top-level transaction up until it actually commits, that ought to be just fine. Maybe the backend needs to internally frob visibility rules, but that's not a matter for shared memory.

In order for other sessions to build their snapshots, they need access to the autonomous transactions and their subtransactions. During snapshot creation, an autonomous transaction is considered a main transaction, and the list of all running autonomous transactions and their subtransactions gets stored in the snapshot data. e.g. suppose the processes below are running with the given transactions:

Proc-1: 100
Proc-2: 101, 102 (Auto Tx1), 103 (Auto Tx2), 104 (Sub-tx of Auto Tx2)
Proc-3: 105, 106 (Auto Tx2), 107 (Auto Tx2)

Suppose the latest completed transaction is 108. Then the snapshot data for autonomous transaction 107 will be as below:

Xmin: 100
Xmax: 109
Snapshot->xip[]: 100, 101, 102, 103, 105, 106
Snapshot->subxip[]: 104

Thanks and Regards,
Kumar Rajeev Rastogi
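The snapshot example above can be turned into a small executable sketch. This is illustrative only (the names SnapshotSketch and xid_in_snapshot are invented, and the real XidInMVCCSnapshot is more involved): an xid listed in the snapshot's xip or subxip arrays, including autonomous xids and their subtransactions, is treated as still in progress and therefore invisible:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;

typedef struct
{
    Xid xmin;
    Xid xmax;
    const Xid *xip;     int nxip;     /* running top-level + autonomous xids */
    const Xid *subxip;  int nsubxip;  /* their subtransactions */
} SnapshotSketch;

/* Is xid still "in progress" as far as this snapshot is concerned? */
static bool
xid_in_snapshot(Xid xid, const SnapshotSketch *snap)
{
    if (xid >= snap->xmax)
        return true;            /* too new: treat as in progress */
    if (xid < snap->xmin)
        return false;           /* definitely finished */
    for (int i = 0; i < snap->nxip; i++)
        if (snap->xip[i] == xid)
            return true;
    for (int i = 0; i < snap->nsubxip; i++)
        if (snap->subxip[i] == xid)
            return true;
    return false;
}
```

With the email's snapshot for autonomous transaction 107, xid 104 (the subtransaction of Auto Tx2) is in progress, while the completed xid 108 is not.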
Re: [HACKERS] Autonomous Transaction (WIP)
On 09 April 2014 01:43, Tom Lane Wrote:
>> I'm also pretty unconvinced that multiple PGPROCs is the right way to go. First, PGPROCs have a bunch of state in them that is assumed to exist once per backend. We might find pretty substantial code churn there if we try to go change that. Second, why do other backends really need to know about our ATs? As far as I can see, if other backends see the AT as a subtransaction of our top-level transaction up until it actually commits, that ought to be just fine.
> If we can make it work like that, sure. I'm a bit worried about how you'd decouple a subtransaction and commit it atomically ... or if that's not atomic, will it create any problems?

Although an autonomous transaction uses a mixed approach (partly sub-transaction, partly main transaction), its transaction state is handled independently. So depending on the transaction state of the autonomous transaction (TBLOCK_AUTOCOMMIT for commit), the transaction will be committed. While committing:

1. The commit record and the corresponding WAL are written in the same way as for a main transaction (except for the way the autonomous transaction and its sub-transactions are accessed). This automatically takes care of updating pg_clog for the autonomous transaction as well.
2. It also marks the autonomous transaction as finished by setting the appropriate fields of MyPgAutonomousXact, in a similar manner as done for the main transaction.
3. Freeing of all resources and popping back to the parent transaction happen in the same way as for a sub-transaction.

> The point being that you need to change both pg_subtrans and pg_clog to make that state transition.

Yes, I am changing both, but no specific changes were required: during commit and assignment of the autonomous transaction, it is taken care of automatically.

Any comments/feedback/doubts are welcome.
Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
On 07 April 2014 12:20, Craig Ringer Wrote:
>> Syntax to create autonomous transaction can be as: PRAGMA AUTONOMOUS TRANSACTION;
> Wouldn't you want to use SET TRANSACTION for this? Or a suffix on BEGIN, like BEGIN AUTONOMOUS TRANSACTION? What's the logic behind introducing PRAGMA? If you wanted to use that syntax for Oracle compatibility you'd need to use: PRAGMA AUTONOMOUS_TRANSACTION; (note underscore). But really, this would not be a pragma at all; PostgreSQL doesn't really have the concept. Calling it that would just be misleading.

Actually it is the same as Oracle (i.e. PRAGMA AUTONOMOUS_TRANSACTION); it was just a typo in the previous mail. But if this syntax is not acceptable either, we can discuss it and come up with a syntax everyone agrees on.

>> Starting of Autonomous Transaction: Starting of autonomous transaction will be exactly same as starting sub-transaction.
> If you don't want it to dirty read data from the parent tx, or inherit parent locks, then it cannot be the same at all.

While starting a sub-transaction, we just initialize the required resources and link them to the parent transaction, which we require for an autonomous transaction as well. I am not able to see the issue you mention with this. Please let me know if I am missing something or have misunderstood your concern.

>> 2. Freeing of all resource and popping of previous transaction happens in the same way as sub-transaction.
> I'm not sure what you mean here.

It means that, during commit of an autonomous transaction, freeing of all resources is done in the same way as for a sub-transaction. Also, the current autonomous transaction gets popped off, and we point back to the parent transaction in a similar way as for a sub-transaction.

> Overall, this looks like a HUGE job to make work well. I know some others have been doing work along the same lines, so hopefully you'll be able to collaborate and share ideas.
Yes, it is a huge amount of work, which is why I proposed it at the beginning of 9.5: so that we can have multiple rounds of discussion and address all concerns. Also, I have proposed to finish this feature in multiple stages; i.e., in the first patch we can support autonomous transactions from standalone SQL commands only, which will set up the infrastructure for future work in this area.

Using the WIP patch sent, I have done basic testing and it works fine. Any comments?

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
On 07 April 2014 12:12, Pavel Stehule wrote:
> +1 for feature

Thanks.

> -1 for Oracle syntax - it is hardly inconsistent with Postgres

We can discuss it and come up with a syntax everyone agrees on.

> Autonomous transactions should be used everywhere - not only in plpgsql

Yes, you are right; I am not planning to support it only from plpgsql. Initially we can support it using standalone SQL commands, and later we can build on this infrastructure so that it can be used from plpgsql and triggers.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Proposal: COUNT(*) (and related) speedup
On 04 April 2014 18:09, Joshua Yanovski Wrote:
> The counter would be updated only by VACUUM, which would perform the same computation performed by the COUNT operation but add it permanently to counter just before it set the visible_map bit to 1 (I am not certain whether this would require unusual changes to WAL replay or not).

I think there is one more disadvantage with this (or maybe I have missed something): the user may not use count(*) at all, or may use it just once after creating the proposed new kind of index, but VACUUM will have to keep performing an operation equivalent to an iterative count(*), which might degrade VACUUM performance.

Thanks and Regards,
Kumar Rajeev Rastogi
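The trade-off being discussed can be illustrated with a small model (hypothetical Python classes, not PostgreSQL code): VACUUM pays for counting the tuples of each page it marks all-visible, so COUNT(*) only has to scan pages whose visibility-map bit is still unset.

```python
# Toy model of the proposal: a persistent counter covers all-visible
# pages; COUNT(*) scans only the remaining pages. The per-page counting
# in vacuum() is the extra VACUUM work this mail is concerned about.

class Table:
    def __init__(self, pages):
        self.pages = pages                       # each page: list of live tuples
        self.all_visible = [False] * len(pages)  # visibility-map bits
        self.counter = 0                         # maintained only by "VACUUM"

    def vacuum(self):
        for i, page in enumerate(self.pages):
            if not self.all_visible[i]:
                self.counter += len(page)        # count page before setting bit
                self.all_visible[i] = True

    def count(self):
        # counter covers all-visible pages; scan only the rest
        return self.counter + sum(len(p) for i, p in enumerate(self.pages)
                                  if not self.all_visible[i])

t = Table([[1, 2, 3], [4, 5], [6]])
before = t.count()      # full scan: 6
t.vacuum()
after = t.count()       # counter alone: 6, no page scanned
```

The point of the objection stands out in vacuum(): the counting work happens on every VACUUM pass over not-all-visible pages, whether or not anyone ever runs COUNT(*).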
Re: [HACKERS] [review] PostgreSQL Service on Windows does not start if data directory given is relative path
On 05 April 2014 04:14, Tom Lane Wrote:
> [ pgctl_win32service_rel_dbpath_v6.patch ] Committed with minor corrections, mostly but not all cosmetic.

Thanks a lot...
[HACKERS] Observed an issue in CREATE TABLE syntax
I observed an issue where CREATE TABLE succeeds even though the same constraint is given multiple times for a column. None of the constraints below has any meaning when given multiple times:

postgres=# create table new (id int NULL NULL);
CREATE TABLE
postgres=# create table new1 (id serial NOT NULL NOT NULL);
CREATE TABLE
postgres=# create table new2 (id int unique unique);
CREATE TABLE

Should we not throw an error for the above syntaxes? Please find the attached patch with the fix.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: multiconstissuev1.patch
Re: [HACKERS] datistemplate of pg_database does not behave as per description in documentation
On 27 March 2014 17:16, Tom Lane Wrote:
> I agree we need to make the docs match the code, but changing behavior that's been like that for ten or fifteen years isn't the answer.

Sounds good. Please find the attached patch with only the documentation change.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: datistemplate_issueV2.patch
[HACKERS] datistemplate of pg_database does not behave as per description in documentation
As per the documentation, datistemplate of pg_database is used in the following way:

datistemplate (bool): If true then this database can be used in the TEMPLATE clause of CREATE DATABASE to create a new database as a clone of this one.

But the current code does not behave this way: even if datistemplate of a database is false, it is still allowed to be used as a template database.

postgres=# select datname, datistemplate from pg_database;
  datname  | datistemplate
-----------+---------------
 template1 | t
 template0 | t
 postgres  | f
(3 rows)

postgres=# create database tempdb template postgres;  -- Actually this should fail.
CREATE DATABASE

I am not sure whether we should modify the source code to align the behavior with the documentation, or change the documentation itself. To me a code change looks better, so I am attaching a patch with the source code change. After the modification, the result is as follows:

postgres=# create database newtempdb template postgres;
ERROR: DB name postgres given as template is not a template database

Please provide your feedback.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: datistemplate_issue.patch
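The check the patch proposes can be sketched as follows. This is an illustrative Python model, not the actual createdb() code; note that real PostgreSQL also has permission-based exceptions (e.g. for superusers) that this sketch ignores.

```python
# Hypothetical model of the proposed check: refuse a template database
# whose datistemplate flag is false. Illustration only; not the patch.

def check_template(pg_database, template_name):
    row = pg_database[template_name]
    if not row["datistemplate"]:
        raise ValueError(
            f"DB name {template_name} given as template is not a template database")
    return True

pg_database = {
    "template1": {"datistemplate": True},
    "template0": {"datistemplate": True},
    "postgres":  {"datistemplate": False},
}

ok = check_template(pg_database, "template1")   # allowed
err = None
try:
    check_template(pg_database, "postgres")     # refused, as in the example
except ValueError as e:
    err = str(e)
```

With the example catalog contents above, cloning template1 passes while cloning postgres raises the error message shown in the mail.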
Re: [HACKERS] Standby server won't start
On 21 March 2014 13:41, Tatsuo Wrote:
> I changed primary server's max_connections from 100 to 4 just for testing purposes. Now the standby server won't start and complains: hot standby is not possible because max_connections = 4 is a lower setting than on the master server (its value was 100). My guess is this is because standby's pg_control file contains the previous primary setting (max_connections = 100). Is there any way to start the standby server without re-creating pg_control (which implies getting a base backup again)? If not, there should be some way to allow starting the standby server without getting a base backup...

I think there is no way to do this, because the parameter from the master is already recorded in the pg_control file, which cannot be changed without taking a new backup from the master. Also, it is not recommended to have the standby's max_connections value lower than the master's max_connections value.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Standby server won't start
On 21 March 2014 16:17, Tatsuo Wrote:
> In my case I had already changed primary's max_connections to 4 and restarted it. So at that point both postgresql.conf of primary and standby were 4.

If you changed max_connections to 4 only on the primary, then I am not able to understand how it got changed on the standby as well (if you have not taken a base backup again). Let me know if I have missed something.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
On 12 March 2014 23:57, Tom Lane Wrote:
>> On Wed, Mar 12, 2014 at 12:09 PM, Tom Lane wrote: My inclination now (see later traffic) is to suppress the status report when the COPY destination is the same as pset.queryFout (ie, a simple test whether the FILE pointers are equal). This would suppress the status report for \copy to stdout and COPY TO STDOUT cases, and also for \copy to pstdout if you'd not redirected queryFout with \o.

Based on my analysis, just comparing FILE pointers may not be sufficient to decide whether to display the command tag. For example, imagine the scenario below:

psql.exe -d postgres -o 'file.dat' -c "\copy tbl to 'file.dat'"

Though both destinations are the same file, the FILE pointers will be different, and hence printing the status into 'file.dat' would overwrite part of the data copied to the file. Also, we don't have any direct way of comparing the file names themselves. As I see it, \copy ... TO ... would print the status only in the case of \copy to pstdout when the -o option is given. So instead of having so much confusion and inconsistency for one very specific case, I thought it better not to print the status for all cases of STDOUT and \COPY ... TO ...

>> This is reasonably similar to what we already do for SELECT, isn't it? I mean, the server always sends back a command tag, but psql sometimes opts not to print it.
> Right, the analogy to SELECT gives some comfort that this is reasonable.

I have modified the patch based on the above analysis:
1. In the case of COPY ... TO STDOUT, the command tag will not be displayed.
2. In the case of \COPY ... TO ..., the command tag will not be displayed.
3. In all other cases, the command tag will be displayed as it was earlier.

I have modified the corresponding documentation as well. Please find the attached revised patch.
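The three rules settled on above can be condensed into one decision function. This is a hypothetical helper for illustration, not actual psql code.

```python
# Sketch of the tag-suppression rules from the mail: suppress the command
# tag for COPY ... TO STDOUT and for client-side \copy ... to ...;
# display it in all other cases. Hypothetical helper, not psql source.

def show_copy_tag(is_copy_to, to_stdout, is_backslash_copy):
    if not is_copy_to:
        return True                  # COPY FROM keeps its tag
    if to_stdout or is_backslash_copy:
        return False                 # rules 1 and 2 above
    return True                      # rule 3: e.g. server-side COPY ... TO 'file'
```

A few cases: `show_copy_tag(True, True, False)` (COPY ... TO STDOUT) and `show_copy_tag(True, False, True)` (\copy ... to 'file') suppress the tag, while `show_copy_tag(True, False, False)` (server-side COPY ... TO 'file') keeps it.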
Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: psql-copy-count-tag-20140313.patch
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
On 10 March 2014 23:44, Tom Lane wrote:
> Unfortunately, while testing it I noticed that there's a potentially fatal backwards-compatibility problem, namely that the COPY n status gets printed on stdout, which is the same place that COPY OUT data is going. While this isn't such a big problem for interactive use, usages like this one are pretty popular: psql -c 'copy mytable to stdout' mydatabase | some-program. With the patch, COPY n gets included in the data sent to some-program, which never happened before and is surely not what the user wants. The same if the -c string uses \copy. There are several things we could do about this:
> 1. Treat this as a non-backwards-compatible change, and document that people have to use -q if they don't want the COPY tag in the output. I'm not sure this is acceptable.
> 2. Kluge ProcessResult so that it continues to not pass back a PGresult for the COPY TO STDOUT case, or does so only in limited circumstances (perhaps only if isatty(stdout), for instance).
> I'm inclined to think #2 is the best answer if we can't stomach #1.

Is it OK to have different status output for different flavors of the COPY command? I am afraid it would lead to inconsistent results. Also, not providing the command result status would be inconsistent with the behavior of other SQL commands. I agree that it breaks backward compatibility, but I am not sure if anyone is so tightly coupled to this (or whether they would really be affected by an additional status line). To me option #1 seems more suitable, especially since there is an option to disable the status output by giving -q. Please provide your opinion, or let me know if I have missed something.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
On 11 March 2014 19:52, Tom Lane wrote:
> After sleeping on it, I'm inclined to think we should continue to not print status for COPY TO STDOUT. Aside from the risk of breaking scripts, there's a decent analogy to be made to SELECT: we don't print a status tag for that either.

It is correct that SELECT does not print a conventional status tag, but it still prints the number of rows selected (e.g. "(2 rows)") along with the actual row values, which can very well be considered a kind of status. The user can make out from this result how many rows have been selected. But in the case of COPY TO STDOUT, if we don't print anything, the user has no direct way of finding out how many rows were copied from the table to STDOUT, which might have been very useful. Please let me know your opinion, or whether I have missed something.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] review: psql command copy count tag
As mentioned by Pavel also, this patch will be very useful. It provides the following enhancements:
1. Brings consistency between copy from "stdin" and "file".
2. Consistent with the server-side copy statement.
3. Also fixes the issue that the \copy destination file becomes the default destination file for the next command given in sequence.

This has been in the "Ready for committer" stage for a long time. Please check whether this can be committed now, or whether any other changes are required.

Thanks and Regards,
Kumar Rajeev Rastogi

From: Pavel Stehule [mailto:pavel.steh...@gmail.com]
Sent: 30 January 2014 01:57
To: PostgreSQL Hackers
Cc: Amit Khandekar; Rajeev rastogi
Subject: review: psql command copy count tag

related to http://www.postgresql.org/message-id/bf2827dcce55594c8d7a8f7ffd3ab7713ddb1...@szxeml508-mbx.china.huawei.com

Hello

1. I had to rebase this patch - actualised version is attached - I merged two patches to one
2. The psql code is compiled without issues after patching
3. All regress tests are passed without errors
5. We like this feature - it shows interesting info without any slowdown - psql copy command is more consistent with server side copy statement from psql perspective.

This patch is ready for commit

Regards
Pavel
Re: [HACKERS] [review] PostgreSQL Service on Windows does not start if data directory given is relative path
While registering the service for PostgreSQL on Windows, we cannot expect the user to always give an absolute path, so it is required to support a relative path as the data directory. This patch will be very useful for handling that.

This patch has been in the "Ready for committer" stage for a long time. Please check whether this can be committed now, or whether any other changes are required.

Thanks and Regards,
Kumar Rajeev Rastogi

-----Original Message-----
From: MauMau [mailto:maumau...@gmail.com]
Sent: 24 February 2014 15:31
To: Rajeev rastogi
Cc: pgsql-hackers@postgresql.org; Magnus Hagander
Subject: Re: [HACKERS] [review] PostgreSQL Service on Windows does not start if data directory given is relative path

From: Rajeev rastogi
> Please find the attached modified patch.

Thanks, reviewed and made this patch ready for committer.

Regards
MauMau
Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire
On 04 February 2014 14:38, Myself wrote:
>> On 4th February 2014, Christian Kruse Wrote:
>>> On 04/02/14 12:38, Fujii Masao wrote: ISTM that the phrase "Request queue" is not used much around the lock. Using the phrase "wait queue" or Simon's suggestion sound better to at least me. Thought?
>> Sounds reasonable to me. Attached patch changes messages to the following:
>> Process holding the lock: A. Wait queue: B.
>> Processes holding the lock: A, B. Wait queue: C.
> This looks good to me also.

I have tested the revised patch and found it ready to be committed. I am marking this as "Ready for Committer".

Thanks and Regards,
Kumar Rajeev Rastogi
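The singular/plural wording above (emitted in the patch via errdetail_log_plural()) can be sketched as a small formatting helper. This is a hypothetical Python illustration of the message shape, not the C patch itself.

```python
# Sketch of the agreed DETAIL wording: singular "Process" for one holder,
# plural "Processes" for several, followed by the wait queue. Hypothetical
# helper, not PostgreSQL source code.

def lock_detail(holder_pids, queue_pids):
    holders = ", ".join(str(p) for p in holder_pids)
    queue = ", ".join(str(p) for p in queue_pids)
    if len(holder_pids) == 1:
        return f"Process holding the lock: {holders}. Wait queue: {queue}."
    return f"Processes holding the lock: {holders}. Wait queue: {queue}."
```

For example, `lock_detail([8001], [8002])` yields the singular form and `lock_detail([8001, 8002], [8003])` the plural form shown in the quoted messages (PIDs here are made up).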
Re: [HACKERS] [review] PostgreSQL Service on Windows does not start if data directory given is relative path
On 22 February 2014 06:16, MauMau Wrote:
> Thanks for reviewing again. Please make small cosmetic changes so that make_absolute_path() follows the style of other parts. Then I'll make this ready for committer.
> (1) Add the function name in the comment as in:
> /*
>  * make_absolute_path
>  *
>  * ...existing function description
>  */

Added.

> (2) Add errno description as in:
> fprintf(stderr, _("could not get current working directory: %s\n"), strerror(errno));

Modified. Please find the attached modified patch.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: pgctl_win32service_rel_dbpath_v6.patch
Re: [HACKERS] [review] PostgreSQL Service on Windows does not start if data directory given is relative path
On 18 February 2014 21:47, MauMau Wrote:
>> We already have two different versions of make_absolute_path() in the tree - one in src/backend/utils/init/miscinit.c and one in src/test/regress/pg_regress.c. I don't think we need a third one. If we put it in port/ like this patch does, then we should make the other callers use it. We might need one for frontend and one for backend (I haven't looked into that much detail), but definitely the one between pg_regress and pg_ctl should be shared.
> Agreed. Sorry, I should have proposed the refactoring. Rajeev, could you refactor the code so that there is only one make_absolute_path()? I think we can do it like this:
> 1. Delete get_absolute_path() in src/port/path.c.
> 2. Delete make_absolute_path() in src/test/regress/pg_regress.c.
> 3. Move make_absolute_path() from src/backend/utils/init/miscinit.c to src/port/path.c, and delete the declaration in miscadmin.h.
> 4. Modify make_absolute_path() in path.c so that it can be used both in backend and frontend. That is, surround the error message output with #ifdef FRONTEND ... #else ... #endif. See some other source files in src/port.

Changed the patch as per your suggestion. There is now only one version of make_absolute_path(), defined in src/port/path.c. I also found one small memory leak in the existing make_absolute_path() in miscinit.c and fixed it.

Please find the revised patch.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: pgctl_win32service_rel_dbpath_v5.patch
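The core behavior of the shared make_absolute_path() being discussed (prepend the current working directory to a relative path, leave absolute paths untouched, and do not resolve symlinks in the cwd name) can be sketched briefly. This is an illustrative Python version; the real function is C code in src/port/path.c.

```python
# Illustrative sketch of make_absolute_path() semantics: relative paths
# are anchored at the current working directory; absolute paths pass
# through unchanged. Not the actual C implementation.

import os

def make_absolute_path(path):
    if os.path.isabs(path):
        return path
    # getcwd() can fail; the C version reports strerror(errno) here
    # (to stderr in frontend code, via ereport in the backend)
    cwd = os.getcwd()
    return os.path.join(cwd, path)
```

For example, registering the Windows service with `-D ..\pgdata` from `D:\pgdata` stores the absolute `D:\pgdata` in the registry, which is what makes service start-up independent of the working directory.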
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 12 February 2014 12:16, KONDO Mitsumasa Wrote:
>> (2014/01/29 17:31), Rajeev rastogi wrote: No issue, you can share the test cases with me; I will take the performance report.
> Attached patch is adapted to the latest pg_stat_statements. It includes min, max, and stdev statistics. Could you run a compile test on your Windows environment? I think the compilation error was fixed.

It got compiled successfully on Windows.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire
On 4th February 2014, Christian Kruse Wrote:
>> On 04/02/14 12:38, Fujii Masao wrote: ISTM that the phrase "Request queue" is not used much around the lock. Using the phrase "wait queue" or Simon's suggestion sound better to at least me. Thought?
> Sounds reasonable to me. Attached patch changes messages to the following:
> Process holding the lock: A. Wait queue: B.
> Processes holding the lock: A, B. Wait queue: C.

This looks good to me also.

Thanks and Regards,
Kumar Rajeev Rastogi
[HACKERS] Re: [review] PostgreSQL Service on Windows does not start if data directory given is relative path
On 3rd February 2014, MauMau Wrote:
> Thanks, your patch looks good. I confirmed the following:
> * Applied cleanly to the latest source tree, and built on Linux and Windows (with MSVC) without any warnings.
> * Changed to D:\ and ran "pg_ctl register -N pg -D pgdata", and checked the registry value ImagePath with regedit. It contained -D "D:\pgdata".
> * Changed to D:\pgdata and ran "pg_ctl register -N pg -D ..\pgdata", and checked the registry value ImagePath with regedit. It contained -D "D:\pgdata".

Thanks for reviewing and testing the patch.

> Finally, please change the function description comment of get_absolute_path() so that the style follows other functions in path.c. That is:
> /*
>  * function_name
>  *
>  * Function description
>  */

I have modified the function description comment as per your suggestion.

> Then update the CommitFest entry with the latest patch and let me know of that. Then, I'll change the patch status to ready for committer.

I will update the CommitFest entry with the latest patch immediately after sending this mail.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: pgctl_win32service_rel_dbpath_v4.patch
[HACKERS] Re: [review] PostgreSQL Service on Windows does not start if data directory given is relative path
On 1st February 2014, MauMau Wrote:
> I reviewed the patch content. I find this fix useful. I'd like to suggest some code improvements. I'll apply and test the patch when I receive your reply.

Thanks for reviewing the patch.

> (1) I think it is appropriate to place find_my_abs_path() in path.c rather than exec.c. Please look at the comments at the beginning of those files. exec.c handles functions related to executables, while path.c handles general functions handling paths.

I have moved this function to path.c.

> It's better to rename the function to follow the naming of other functions in path.c, something like get_absolute_path() or so. Unfortunately, we cannot use make_absolute_path() as the name because it is used in src/backend/utils/init/miscinit.c, which conflicts in the backend module.

Renamed the function to get_absolute_path().

> (2) In pg_ctl.c, dbpath can be better named as datadir, because it holds the data directory location. dbpath is used to mean some different location in other source files.

Renamed it to dataDir.

> (3) find_my_abs_path() had better not call make_native_path() because the role of this function should be to just return an absolute path. pg_ctl.c can instead call make_native_path() after find_my_abs_path().

Changed as per suggestion.

> (4) find_my_abs_path() should not call resolve_symlinks(). For reference, look at make_absolute_path() in src/backend/utils/init/miscinit.c and src/test/regress/pg_regress.c. I guess the reason is that if the user changed to the directory through a symbolic link, we should retain the symbolic link name.

Changed as per suggestion.

> (5) Change "file/path" in the comment of find_my_abs_path() to "path", because a file is also a path.

Changed as per suggestion.

Please find the attached revised patch.
Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: pgctl_win32service_rel_dbpath_v3.patch
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 28th January, Mitsumasa KONDO wrote:
>>> On 21 January 2014, Simon Riggs wrote: Does this fix the Windows bug reported by Kumar on 20/11/2013?
>> Sorry, I was misunderstanding. The first name of Mr. Rajeev Rastogi is Kumar! I searched only by e-mail address and title for his name... I don't have a Windows compiler environment, but the attached patch might fix it. Could I ask Mr. Rajeev Rastogi to test my patch again?
>> [Rajeev:] I tried to test this, but I could not apply the patch on the latest git HEAD. This may be because of the recent pg_stat_statements patch ("pg_stat_statements external query text storage"), which got committed on 27th January.
> Thank you for trying to test my patch. As you say, the recent commit changes pg_stat_statements.c a lot, so I have to revise my patch. Please wait for a while. By the way, the latest pg_stat_statements might affect performance on Windows systems, because it uses an fflush() system call on every creation of a new entry in pg_stat_statements, and it calls many fread()s to warm the file cache. It works well on Linux systems, but I'm not sure about Windows systems. If you have time, could you test it on your Windows system? If it affects performance a lot, we can still change it.

No issue; you can share the test cases with me and I will take the performance report.

Thanks and Regards,
Kumar Rajeev Rastogi
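The min, max, and stdev statistics being added to pg_stat_statements can all be maintained incrementally, one update per executed statement, without storing individual timings. A standard way to do this is Welford's online algorithm; the sketch below is a Python illustration of the idea, not the actual patch.

```python
# Incremental min/max/mean/stddev over execution times, one add() per
# statement. Welford's online algorithm keeps the running mean and the
# sum of squared deviations (m2). Illustration only, not the C patch.

class ExecTimeStats:
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0                  # sum of squared deviations from mean
        self.min = float("inf")
        self.max = float("-inf")

    def add(self, t):
        self.n += 1
        delta = t - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (t - self.mean)
        self.min = min(self.min, t)
        self.max = max(self.max, t)

    def stddev(self):
        # population standard deviation; 0.0 until two samples exist
        return (self.m2 / self.n) ** 0.5 if self.n > 1 else 0.0

s = ExecTimeStats()
for t in [1.0, 2.0, 3.0, 4.0]:
    s.add(t)
# mean 2.5, min 1.0, max 4.0, stddev sqrt(1.25)
```

The appeal for a stats view is constant memory per entry and numerical stability compared with the naive sum-of-squares formula.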
[HACKERS] Observed Compilation warning in WIN32 build
I observed the below WIN32 compilation warnings in postmaster.c (seemingly introduced by commit ea9df812d8502fff74e7bc37d61bdc7d66d77a7f, "Relax the requirement that all lwlocks be stored in a single array."):

src\backend\postmaster\postmaster.c(5625) : warning C4133: '=' : incompatible types - from 'LWLockPadded *' to 'LWLock *'
src\backend\postmaster\postmaster.c(5856) : warning C4133: '=' : incompatible types - from 'LWLock *' to 'LWLockPadded *'

Attached is the patch with the fix.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: compile_issue_lwlock.patch
Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire
On 28/01/14, Christian Kruse wrote:
>> I have checked the revised patch. It looks fine to me except for one minor code formatting issue: in elog.c, two tabs are missing in the definition of the function errdetail_log_plural. Please run the pgindent tool to check the same.
> I did, but this reformats various other locations in the file, too. Nevertheless I now ran pgindent against it and removed the other parts. Attached you will find the corrected patch version.

>> Also I would like to highlight one behavior: the process ID of the process trying to acquire the lock is also listed in the "Request queue" list. E.g.:
>> session 1 with process id X: BEGIN; LOCK TABLE foo IN SHARE MODE;
>> session 2 with process id Y: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE;
>> On execution of LOCK in session 2, the log will display: DETAIL: Process holding the lock: X. Request queue: Y. Here Y is the process ID of the same process that was trying to acquire the lock.
> This is on purpose, due to the rewording of the message. In the first version the PID of the backend was missing. Thanks for the review!

Now the patch looks fine to me. I am marking this as "Ready for Committer".

Thanks and Regards,
Kumar Rajeev Rastogi
[HACKERS] Function definition removed but prototype still there
As part of the below commit, 36a35c550ac114caa423bcbe339d3515db0cd957 (Compress GIN posting lists, for smaller index size.), the definition of the function GinDataPageAddItemPointer was removed, but the corresponding prototype is still there. Attached is the patch to fix this (unwanted_prototype.patch). Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire
On 23/01/14, Christian Kruse wrote: Well, is it context or detail? Those fields have reasonably well defined meanings IMO. I find the distinction somewhat blurry and think both would be appropriate. But since I wasn't sure I changed to detail. If we need errcontext_plural, let's add it, not adopt inferior solutions just because that isn't there for lack of previous need. I would've added it if I would've been sure. But having said that, I think this is indeed detail not context. (I kinda wonder whether some of the stuff that's now in the primary message shouldn't be pushed to errdetail as well. It looks like some previous patches in this area have been lazy.) I agree, the primary message is not very well worded. On the other hand finding an appropriate alternative seems hard for me. While I'm griping, this message isn't even trying to follow the project's message style guidelines. Detail or context messages are supposed to be complete sentence(s), with capitalization and punctuation to match. Hm, I hope I fixed it in this version of the patch. Lastly, is this information that we want to be shipping to clients? Perhaps from a security standpoint that's not such a wise idea, and errdetail_log() is what should be used. Fixed. I added an errdetail_log_plural() for this, too. I have checked the revised patch. It looks fine to me except one minor code formatting issue: in elog.c, two tabs are missing in the definition of the function errdetail_log_plural. Please run the pgindent tool to check the same. Also, I would like to highlight one behavior: the process ID of the process trying to acquire the lock is also listed in the Request queue. E.g. session 1 with process id X: BEGIN; LOCK TABLE foo IN SHARE MODE; session 2 with process id Y: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE; On execution of LOCK in session 2, the log will display: DETAIL: Process holding the lock: X. Request queue: Y.
where Y is the process ID of the very process that is trying to acquire the lock. To me, this seems to be correct behavior: the message lists all processes already holding the lock, all processes waiting in the queue, and the position of the requesting process in the wait list. In the above example, it indicates that process Y is at the top of the wait list. Thanks and Regards, Kumar Rajeev Rastogi
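The errdetail_log_plural() behavior discussed in this thread can be sketched in plain C; this is a simplified stand-in, not PostgreSQL's real ereport machinery (names here are illustrative):

```c
#include <stdio.h>

/* Pick the singular or plural format string by count, the core idea of
 * errdetail_log_plural(). PostgreSQL additionally routes the result to the
 * server log only, never to the client. */
const char *pick_plural(unsigned long n, const char *sing, const char *plur)
{
    return n == 1 ? sing : plur;
}

/* Build a detail line such as:
 *   "Processes holding the lock: 13775, 13776. Request queue: 13777." */
void detail_plural(char *out, size_t cap, const char *sing,
                   const char *plur, unsigned long n, const char *pids)
{
    snprintf(out, cap, pick_plural(n, sing, plur), pids);
}
```

The translatable singular/plural pair is the reason a dedicated `_plural` variant is needed: some languages pluralize differently, so the choice must happen inside the reporting function, not at the call site.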
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 27th January, Mitsumasa KONDO wrote: 2014-01-26 Simon Riggs si...@2ndquadrant.com: On 21 January 2014 19:48, Simon Riggs si...@2ndquadrant.com wrote: On 21 January 2014 12:54, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: Rebased patch is attached. Does this fix the Windows bug reported by Kumar on 20/11/2013? Sorry, I was misunderstanding. The first name of Mr. Rajeev Rastogi is Kumar! I searched only the e-mail address and title by his name... I don't have a Windows compiler environment, but the attached patch might be fixed. Could I ask Mr. Rajeev Rastogi to test my patch again? I tried to test this, but I could not apply the patch on the latest git HEAD. This may be because of a recent patch related to pg_stat_statements ("external query text storage"), which was committed on 27th January. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Standalone synchronous master
On 01/25/2014, Josh Berkus wrote: ISTM the consensus is that we need better monitoring/administration interfaces so that people can script the behavior they want in external tools. Also, a new synchronous apply replication mode would be handy, but that'd be a whole different patch. We don't have a patch on the table that we could consider committing any time soon, so I'm going to mark this as rejected in the commitfest app. I don't feel that we'll never do auto-degrade is determinative; several hackers were for auto-degrade, and they have a good use-case argument. However, we do have consensus that we need more scaffolding than this patch supplies in order to make auto-degrade *safe*. I encourage the submitter to resubmit an improved version of this patch (one with more monitorability) for 9.5 CF1. That'll give us a whole dev cycle to argue about it. I shall rework to improve this patch. Below is a summary of all the discussions, which will be used as input for improving the patch: 1. Method of degrading the synchronous mode: a. Expose the configuration variable to new SQL-callable functions. b. Using ALTER SYSTEM SET. c. Auto-degrade using some sort of configuration parameter, as done in the current patch. d. Or perhaps a combination of the above, which the DBA can use depending on their use cases. We can discuss further to decide on one of these approaches. 2. The synchronous mode should be upgraded/restored after at least one synchronous standby comes up and has caught up with the master. 3. A better monitoring/administration interface, which could be even better if made into a generic trap system. I shall propose a better approach for this. 4. Send committing clients a WARNING if they have committed a synchronous transaction and we are in degraded mode. 5. Please add more if I am missing something.
Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Proposal for CSN based snapshots
On 11th June 2013, Markus Wanner wrote: Agreed. Postgres-R uses a CommitOrderId, which is very similar in concept, for example. Do you think having this snapshot scheme would be helpful for Postgres-R? Yeah, it could help to reduce patch size, after a rewrite to use such a CSN. Or why do you need to tell apart aborted from in-progress transactions by CSN? I need to detect aborted transactions so they can be discarded during the eviction process, otherwise the sparse array will fill up. They could also be filtered out by cross-referencing uncommitted slots with the procarray. Having the abort case do some additional work to make xid assignment cheaper looks like a good tradeoff. I see. The sparse buffer needs to be at least big enough to fit CSN slots for the xids of all active transactions and non-overflowed subtransactions. At the current level PGPROC_MAX_CACHED_SUBXIDS=64, the minimum comes out at 16 bytes * (64 + 1) slots * 100 backends = 101.6KB per buffer, or 203KB total in the default configuration. A CSN is 8 bytes, the XID 4, resulting in 12 bytes per slot. So I guess the given 16 bytes includes alignment to 8 byte boundaries. Sounds good. 8 byte alignment for CSNs is needed for atomic writes, if not something else. Oh, right, atomic writes. I think the size could be cut in half by using a base value for CSNs if we assume that no xid is active for longer than 2B transactions, as is currently the case. I didn't want to include the complication in the first iteration, so I didn't verify if that would have any gotchas. In Postgres-R, I effectively used a 32-bit order id which wraps around. In this case, I guess adjusting the base value will get tricky. Wrapping could probably be used as well, instead. The number of times each cache line can be invalidated is bounded by 8. Hm.. good point. We are also planning to implement CSN-based snapshots, so I am curious to know whether any further development is happening on this. If not, what is the reason? Am I missing something?
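The sparse-buffer sizing quoted above (16 bytes * 65 slots * 100 backends) can be double-checked with a small calculation; the constants are taken from the thread:

```c
/* Sparse-buffer sizing from the thread: one 16-byte slot per xid of every
 * active transaction plus its cached (non-overflowed) subtransactions. */
enum
{
    SLOT_SIZE = 16,                  /* 8-byte CSN + 4-byte XID, padded to 16 */
    PGPROC_MAX_CACHED_SUBXIDS = 64,  /* cached subxids per backend */
    BACKENDS = 100                   /* default max_connections */
};

long sparse_buffer_bytes(void)
{
    /* (64 + 1) slots per backend: the top-level xid plus its subxids */
    return (long) SLOT_SIZE * (PGPROC_MAX_CACHED_SUBXIDS + 1) * BACKENDS;
}
```

This gives 104000 bytes, i.e. the 101.6KB per buffer (and 203KB for two buffers) cited in the message.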
Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire
On 23/01/14 14:45, Christian Kruse wrote: Well, is it context or detail? Those fields have reasonably well defined meanings IMO. I find the distinction somewhat blurry and think both would be appropriate. But since I wasn't sure I changed to detail. If we need errcontext_plural, let's add it, not adopt inferior solutions just because that isn't there for lack of previous need. I would've added it if I would've been sure. But having said that, I think this is indeed detail not context. (I kinda wonder whether some of the stuff that's now in the primary message shouldn't be pushed to errdetail as well. It looks like some previous patches in this area have been lazy.) I agree, the primary message is not very well worded. On the other hand finding an appropriate alternative seems hard for me. While I'm griping, this message isn't even trying to follow the project's message style guidelines. Detail or context messages are supposed to be complete sentence(s), with capitalization and punctuation to match. Hm, I hope I fixed it in this version of the patch. Lastly, is this information that we want to be shipping to clients? Perhaps from a security standpoint that's not such a wise idea, and errdetail_log() is what should be used. Fixed. I added an errdetail_log_plural() for this, too. I think you have attached the wrong patch. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire
On 31st December 2013, Christian Kruse wrote: Hi there, I created two patches improving the log messages generated by log_lock_waits. The first patch shows the process IDs holding a lock we try to acquire (show_pids_in_lock_log.patch), sample output (log_lock_waits=on required): session 1: BEGIN; LOCK TABLE foo IN SHARE MODE; session 2: BEGIN; LOCK TABLE foo IN SHARE MODE; session 3: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE; Output w/o patch: LOG: process 13777 still waiting for ExclusiveLock on relation 16385 of database 16384 after 1000.030 ms Output with patch: LOG: process 13777 still waiting for ExclusiveLock on relation 16385 of database 16384 after 1000.030 ms CONTEXT: processes owning lock: 13775, 13776 I am reviewing this patch. The idea seems reasonable. Following are my first-level observations: 1. I found an issue in the following scenario: session 1 with process id X: BEGIN; LOCK TABLE foo IN SHARE MODE; session 2 with process id Y: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE; session 3 with process id Z: BEGIN; LOCK TABLE foo IN SHARE MODE; On execution of LOCK in session 3, the log will display: processes owning lock: X, Y But actually, Y does not yet own the lock; it is still waiting, with higher priority. This may mislead the user. Maybe we should change the message to convey both: which processes own the lock and which processes are already in the queue. 2. Can we give a better name to the new variable 'buf1'? 3. Do we need to take performance readings to see if there is any impact? 4. Do we require documentation? Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] PostgreSQL Service on Windows does not start if data directory given is relative path
On Tue, Jan 12, 2014, David Rowley wrote: I have found a case where PostgreSQL as a win32 service does not start if the data directory is given as a relative path. The error observed in this case is: The PostgreSQL on Local Computer started and then stopped. This may happen because the relative path given will be relative to the path from where the service is registered, but the path from where the WIN32 service execution gets invoked may be different, and hence it won't be able to find the data directory. I have fixed this by internally converting the relative path to an absolute path, as is being done for the executable file. Hi, I've not quite got around to testing this yet, but I think the patch is a good idea, as I can see that I can use a relative path when I start postgres.exe from the command line; I guess the behaviour should likely be the same when installed as a windows service. Thanks for reviewing and providing the first level of feedback. So, I've just been looking over this patch and I'm just wondering about a few things: In pgwin32_CommandLine you declare dbPath; it looks like you could declare this in the scope of if (pg_config). Yes, I have declared it in the scope of if (pg_config). In find_my_abs_path you're making a call to StrNCpy; I know you likely just used find_my_exec as an example here, but I'd say the use of StrNCpy is not really needed here and is a bit of a waste of cycles. I'd rather see strlcpy being used, as strncpy will needlessly zero the remaining buffer space. Yes, you are right; it is much better to use strlcpy instead of StrNCpy. I have modified the patch. Also in find_my_abs_path, I'm just wondering if the cwd variable is really needed. I think you could just use retpath each time, and it would also save a little bit of the copying that's done in join_path_components(). By the looks of it you can just call join_path_components(retpath, retpath, inpath). Perhaps some people would disagree, but I'm not really seeing the harm in it, and it saves some copying too.
Yes, I am also convinced by your suggestion; it really avoids one level of copying. I have seen that a similar mechanism is used in many places where join_path_components() is called, so I think it should be OK to use in our case also. I have made the necessary changes and attached the changed patch (pgctl_win32service_rel_dbpath_v2.patch). Please provide your further feedback, if any. Thanks and Regards, Kumar Rajeev Rastogi
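The underlying fix, resolving a relative data directory against the current working directory, can be sketched as below. This is plain C with POSIX getcwd() for illustration; the actual patch uses pg_ctl's own helpers such as join_path_components(), and on Windows an absolute path test would also need to accept drive-letter forms:

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Resolve a possibly-relative path against the current working directory.
 * Illustrative stand-in for the patch's find_my_abs_path(); per the review,
 * the result is built directly in retpath with no intermediate cwd buffer. */
void make_abs_path(char *retpath, size_t cap, const char *inpath)
{
    if (inpath[0] == '/')               /* already absolute: keep as-is */
    {
        snprintf(retpath, cap, "%s", inpath);
        return;
    }
    if (getcwd(retpath, cap) == NULL)   /* anchor at the current directory */
        retpath[0] = '\0';
    size_t len = strlen(retpath);
    /* join_path_components(retpath, retpath, inpath), in pg_ctl terms */
    snprintf(retpath + len, cap - len, "/%s", inpath);
}
```

Doing this once at service registration time means the later service start, whose working directory may be C:/Windows/system32, still finds the data directory.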
Re: [HACKERS] Standalone synchronous master
On Sun, Jan 12, Amit Kapila wrote: How would that work? Would it be a tool in contrib? There already is a timeout, so if a tool checked more frequently than the timeout, it should work. The durable notification of the admin would happen in the tool, right? Well, you know what tool *I'm* planning to use. Thing is, when we talk about auto-degrade, we need to determine things like Is the replica down or is this just a network blip? and take action according to the user's desired configuration. This is not something, realistically, that we can do on a single request. Whereas it would be fairly simple for an external monitoring utility to do: 1. Decide the replica is offline for the duration (several poll attempts have failed). 2. Send ALTER SYSTEM SET to the master and change/disable the synch_replicas. Will that be possible with the current mechanism, given that presently the master will not accept any new command when the sync replica is not available? Or is there something else which needs to be done along with the above 2 points to make it possible? Since no WAL is written for the ALTER SYSTEM SET command, the master should be able to handle it even though the sync replica is not available. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Standalone synchronous master
On 13th January 2013, Josh Berkus wrote: I'm leading this off with a review of the features offered by the actual patch submitted. My general discussion of the issues of Sync Degrade, which justifies my specific suggestions below, follows that. Rajeev, please be aware that other hackers may have different opinions than me on what needs to change about the patch, so you should collect all opinions before changing code. Thanks for reviewing and providing the first level of comments. We will surely collect all feedback to improve this patch. Add a new parameter: synchronous_standalone_master = on | off I think this is a TERRIBLE name for any such parameter. What does synchronous standalone even mean? A better name for the parameter would be auto_degrade_sync_replication or synchronous_timeout_action = error | degrade, or something similar. It would be even better for this to be a mode of synchronous_commit, except that synchronous_commit is heavily overloaded already. Yes, we can change this parameter name. Some of the suggestions for degrading the mode: 1. Auto-degrade using some sort of configuration parameter, as done in the current patch. 2. Expose the configuration variable to new SQL-callable functions, as suggested by Heikki. 3. Or use ALTER SYSTEM SET, as suggested by others. Some issues raised by this log script: LOG: standby tx0113 is now the synchronous standby with priority 1 LOG: waiting for standby synchronization -- standby wal receiver on the standby is killed (SIGKILL) LOG: unexpected EOF on standby connection LOG: not waiting for standby synchronization -- restart standby so that it connects again LOG: standby tx0113 is now the synchronous standby with priority 1 LOG: waiting for standby synchronization -- standby wal receiver is first stopped (SIGSTOP) to make sure The not waiting for standby synchronization message should be marked something stronger than LOG. I'd like ERROR. Yes, we can change this to ERROR.
Second, you have the master resuming sync rep when the standby reconnects. How do you determine when it's safe to do that? You're making the assumption that you have a failing sync standby instead of one which simply can't keep up with the master, or a flaky network connection (see discussion below). Yes, this can be further improved so that the master is upgraded back to synchronous mode, by one of the methods discussed above, only once we have made sure that the synchronous standby has caught up with the master node (this may require a better design). a. Master_to_standalone_cmd: To be executed before master switches to standalone mode. b. Master_to_sync_cmd: To be executed before master switches from sync mode to standalone mode. I'm not at all clear what the difference between these two commands is. When would one be executed, and when would the other be executed? Also, renaming ... There is a typo in the above explanation; the meanings of the two commands are: a. Master_to_standalone_cmd: To be executed during degradation of sync mode. b. Master_to_sync_cmd: To be executed before upgrade or restoration of sync mode. These two commands are per the TODO item to inform the DBA. But as per Heikki's suggestion, we should not use this mechanism to inform the DBA; rather, we should have some sort of generic trap system instead of adding this one particular extra config option specifically for this feature. This looks to be a better idea, so we can have further discussion to come up with a proper design. Missing features: a) we should at least send committing clients a WARNING if they have committed a synchronous transaction and we are in degraded mode. Yes, that is a great idea. One of the reasons that there's so much disagreement about this feature is that most of the folks strongly in favor of auto-degrade are thinking *only* of the case that the standby is completely down. There are many other reasons for a sync transaction to hang, and the walsender has absolutely no way of knowing which is the case.
For example: * Transient network issues * Standby can't keep up with master * Postgres bug * Storage/IO issues (think EBS) * Standby is restarting You don't want to handle all of those issues the same way as far as sync rep is concerned. For example, if the standby is restarting, you probably want to wait instead of degrading. I think if we support some external SQL-callable functions to degrade, as Heikki suggested, instead of auto-degrade, then users can handle at least some of the above scenarios, if not all, based on their experience and observation. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Standalone synchronous master
On 8th Jan, 2014, Amit Kapila wrote: Add a new eager synchronous mode that starts out synchronous but reverts to asynchronous after a failure timeout period. This would require some type of command to be executed to alert administrators of this change. http://archives.postgresql.org/pgsql-hackers/2011-12/msg01224.php This patch implementation is along the same lines as given in the earlier thread. Some of the additional important changes are: 1. Added two GUC variables to take commands from the user to be executed: a. Master_to_standalone_cmd: To be executed before master switches to standalone mode. b. Master_to_sync_cmd: To be executed before master switches from sync mode to standalone mode. In the description of both switches (a and b), you are saying that it will switch to standalone mode; I think by your point 1b. you mean to say the other way (switch from standalone to sync mode). Yes, you are right. It's a typo. Instead of getting commands, why can't we just log such actions? I think adding 3 new GUC variables for this functionality seems a bit much. Actually, in the earlier discussion as well as in the TODO item, it is mentioned that some kind of command should be executed to alert the administrator. http://archives.postgresql.org/pgsql-hackers/2011-12/msg01224.php In my current patch, I have kept the LOG along with the command. Also, what will happen when it switches to standalone mode in case there are some async standbys already connected to it before going to standalone mode? If it continues to send data, then I think naming it 'enable_standalone_master' is not good. Yes, we can change the name to something more appropriate; some options are: 1. enable_async_master 2. sync_standalone_master 3. enable_nowait_master 4. enable_nowait_resp_master Please provide your suggestions on the above names, or any other.
Thanks and Regards, Kumar Rajeev Rastogi
[HACKERS] PostgreSQL Service on Windows does not start if data directory given is relative path
I have found a case where PostgreSQL as a win32 service does not start if the data directory is given as a relative path. The error observed in this case is: The PostgreSQL on Local Computer started and then stopped. This may happen because the relative path given will be relative to the path from where the service is registered, but the path from where the WIN32 service execution gets invoked may be different, and hence it won't be able to find the data directory. I have fixed this by internally converting the relative path to an absolute path, as is being done for the executable file. Attached is the patch with the fix (pgctl_win32service_rel_dbpath.patch). Please provide your opinion. I will add this to the Jan 2014 CommitFest. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] TODO: Split out pg_resetxlog output into pre- and post-sections
On 12 December 2013, Heikki Linnakangas wrote: Further review of this patch: I have done a few more cosmetic changes in your patch; please find the updated patch attached with this mail. Kindly check once whether the changes are okay. The changes are fine. Thank you. I have uploaded the latest patch in the CF app and marked it as Ready For Committer. Thanks a lot. Committed, with some minor kibitzing of comments and docs. Thanks! Thanks a lot. :-)
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
On 12th December 2013, Rajeev Rastogi wrote: On 9th December, Amit Khandelkar wrote: 1. slashcopyissuev1.patch: This patch fixes the \COPY issue. You have removed the if condition in this statement, mentioning that it is always true now: - if (copystream == pset.cur_cmd_source) - pset.lineno++; + pset.lineno++; But copystream can be different than pset.cur_cmd_source, right? As per the earlier code, the condition result was always true, so pset.lineno was always incremented. In the earlier code pset.cur_cmd_source was sent as a parameter to the function, and inside the function the same parameter was used under the name copystream. So on entry to this function both will be one and the same. I checked that inside the function handleCopyIn, neither of these parameters changes before the above check. Also, since pset is specific to a single session, it cannot change concurrently. Please let me know if I am missing something. + FILE *copyStream; /* Stream to read/write for copy command */ There is no tab between FILE and *copystream, hence it is not aligned. OK. I shall change accordingly. I ran pgindent on the settings.h file but found no issue reported. It seems a tab is not mandatory between variable declarations; for the other fields in the psqlSetting structure too, a space instead of a tab is being used. So it seems no change is required for this. Please confirm. 2. initialcopyissuev1_ontopofslashcopy.patch: Fix for COPY table FROM STDIN/STDOUT doesn't show count tag. The following header comments of ProcessResult() need to be modified: * Changes its argument to point to the last PGresult of the command string, * or NULL if that result was for a COPY FROM STDIN or COPY TO STDOUT. OK. I shall change accordingly. I have changed it in the latest patch. Regression results show all passed. Other than this, the patch needs a new regression test. I had checked the existing regression test cases and observed that they already cover all the relevant cases,
like copystdin, copystdout, \copy.stdin and \copy.stdout. But since the regression framework runs in quiet (-q) mode, it does not show any messages except query output, so our new code change does not impact the regression framework. Please let me know if you were expecting any other test cases. Summary of the two patches: 1. slashcopyissuev1.patch: No change in this patch (same as earlier). 2. initialcopyissuev2_ontopofslashcopy.patch: This patch is modified to change the comment as per the above review comments. Please provide your opinion or let me know if any other changes are required. Thanks and Regards, Kumar Rajeev Rastogi
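The "condition was always true" argument can be shown in miniature. The structure and names below are hypothetical simplifications of psql's pset and handleCopyIn(); the point is only that the caller passes pset.cur_cmd_source as the copystream argument, so the two are aliases on entry:

```c
#include <stdio.h>

/* Miniature of psql's global settings struct (illustrative fields only). */
struct settings
{
    FILE *cur_cmd_source;   /* current command input stream */
    int   lineno;           /* line counter */
};

struct settings pset;

/* The caller always passes pset.cur_cmd_source here, so comparing the
 * parameter against it at entry is a tautology ... */
void handle_copy_in(FILE *copystream)
{
    if (copystream == pset.cur_cmd_source)  /* always true at this point */
        pset.lineno++;                      /* ... hence the guard was removed */
}
```

Since nothing reassigns either variable before the check and pset is per-session, the simplified unconditional `pset.lineno++` is behavior-preserving.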
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
On 9th December, Amit Khandelkar wrote: 1. slashcopyissuev1.patch: This patch fixes the \COPY issue. You have removed the if condition in this statement, mentioning that it is always true now: - if (copystream == pset.cur_cmd_source) - pset.lineno++; + pset.lineno++; But copystream can be different than pset.cur_cmd_source, right? As per the earlier code, the condition result was always true, so pset.lineno was always incremented. In the earlier code pset.cur_cmd_source was sent as a parameter to the function, and inside the function the same parameter was used under the name copystream. So on entry to this function both will be one and the same. I checked that inside the function handleCopyIn, neither of these parameters changes before the above check. Also, since pset is specific to a single session, it cannot change concurrently. Please let me know if I am missing something. + FILE *copyStream; /* Stream to read/write for copy command */ There is no tab between FILE and *copystream, hence it is not aligned. OK. I shall change accordingly. 2. initialcopyissuev1_ontopofslashcopy.patch: Fix for COPY table FROM STDIN/STDOUT doesn't show count tag. The following header comments of ProcessResult() need to be modified: * Changes its argument to point to the last PGresult of the command string, * or NULL if that result was for a COPY FROM STDIN or COPY TO STDOUT. OK. I shall change accordingly. Regression results show all passed. Other than this, the patch needs a new regression test. I had checked the existing regression test cases and observed that they already cover all the relevant cases, like copystdin, copystdout, \copy.stdin and \copy.stdout. But since the regression framework runs in quiet (-q) mode, it does not show any messages except query output, so our new code change does not impact the regression framework. Please let me know if you were expecting any other test cases.
I don't think we need to do any doc changes, because the doc already mentions that COPY should show the COUNT tag, and does not mention anything specific to client-side COPY. OK. Please provide your opinion, based on which I shall prepare a new patch and share it. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
On 26 November 2013, Amit Khandelkar wrote: Can you please submit the \COPY patch as a separate patch? Since these are two different issues, I would like to have these two fixed and committed separately. You can always test the \COPY issue using \COPY TO followed by INSERT. Please find attached the two separate patches: 1. slashcopyissuev1.patch: This patch fixes the \COPY issue. 2. initialcopyissuev1_ontopofslashcopy.patch: Fix for COPY table FROM STDIN/STDOUT doesn't show count tag. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] TODO: Split out pg_resetxlog output into pre- and post-sections
On 29 November 2013, Amit Kapila wrote: Further review of this patch: b. Why display 'First log segment after reset' in the 'Current pg_control values' section, given that the same information is now available in the new section Values to be used after reset:? It may not always be the same. Suppose the user has given new values of seg no and TLI; then it will be different. Otherwise it will be the same. So now I have changed the code in such a way that the value of the XLOG segment # read from the checkpoint record gets printed as part of the current values, and any further new value gets printed in Values to be reset (this will always be at least one plus the current value), because of the following code in FindEndOfXLOG: xlogbytepos = newXlogSegNo * ControlFile.xlog_seg_size; newXlogSegNo = (xlogbytepos + XLogSegSize - 1) / XLogSegSize; newXlogSegNo++; It can be different, but I don't think we should display it in both sections because: a. it doesn't belong to the control file. b. if you carefully look at the original link (http://www.postgresql.org/message-id/1283277511-sup-2...@alvh.no-ip.org), these values are not displayed in the pg_control values section. So I suggest it is better to remove it from the pg_control values section. Done as per suggestion. I have done a few more cosmetic changes in your patch; please find the updated patch attached with this mail. Kindly check once whether the changes are okay. The changes are fine. Thank you. Thanks and Regards, Kumar Rajeev Rastogi
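The FindEndOfXLOG() excerpt quoted above is why the value printed under Values to be used after reset is always at least one more than the current segment number; a standalone check of that step (assuming the default 16MB segment size):

```c
#include <stdint.h>

/* Default XLOG segment size in a stock build (assumed here). */
#define XLOG_SEG_SIZE ((uint64_t) 16 * 1024 * 1024)

/* The rounding-and-advance step from FindEndOfXLOG(), applied to the
 * segment number read from the checkpoint record. */
uint64_t next_segment(uint64_t newXlogSegNo)
{
    uint64_t xlogbytepos = newXlogSegNo * XLOG_SEG_SIZE;

    /* round the byte position up to a whole segment ... */
    newXlogSegNo = (xlogbytepos + XLOG_SEG_SIZE - 1) / XLOG_SEG_SIZE;
    /* ... then advance past it */
    newXlogSegNo++;
    return newXlogSegNo;
}
```

For a segment boundary the round-up is a no-op, so the net effect is exactly current + 1, which is the "at least one plus the current value" claim in the message.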
Re: [HACKERS] TODO: Split out pg_resetxlog output into pre- and post-sections
On 29 November 2013, Amit Kapila wrote:
>>> Further review of this patch: I have done a few more cosmetic changes
>>> in your patch; please find the updated patch attached with this mail.
>>> Kindly check whether the changes are okay.
>>
>> The changes are fine. Thank you.
>
> I have uploaded the latest patch in the CF app and marked it as Ready
> For Committer.

Thanks a lot.

Thanks and Regards,
Kumar Rajeev Rastogi
Re: [HACKERS] PostgreSQL Service on Windows does not start. ~ is not a valid Win32 application
On 27 November 2013, Naoya Anzai wrote:
> Hi Rajeev,
>
>> I tested the latest patch. My observation is: if we give a relative
>> data directory path while registering the service, the service fails
>> to start, but the same works if the data directory is an absolute
>> path. This looks like an existing issue; maybe we need to internally
>> convert the relative data path to an absolute one. Since the
>> mentioned issue is pre-existing and not caused by this patch, can we
>> take it as a separate defect and fix it? If so, I can move this patch
>> to Ready For Committer.
>
> I think so too. When booted as a service, CurrentDirectory seems to be
> C:/Windows/system32, so a relative data directory path would have to
> be given relative to C:/Windows/system32.

OK. Then I am moving it to Ready For Committer.
Re: [HACKERS] TODO: Split out pg_resetxlog output into pre- and post-sections
On 28 November 2013, Amit Kapila wrote:
>>> Further review of this patch:
>>> b. Why display 'First log segment after reset' in the 'Current
>>>    pg_control values' section, now that the same information is
>>>    available in the new section "Values to be used after reset:"?
>>
>> Not always the same. If the user has given a new segment number and
>> TLI, it will be different; otherwise it will be the same. So I have
>> changed the code so that the XLOG segment # read from the checkpoint
>> record is printed as part of the current values, and any new value is
>> printed under "Values to be used after reset" (always at least one
>> more than the current value), because of the following code in
>> FindEndOfXLOG():
>>
>>     xlogbytepos = newXlogSegNo * ControlFile.xlog_seg_size;
>>     newXlogSegNo = (xlogbytepos + XLogSegSize - 1) / XLogSegSize;
>>     newXlogSegNo++;
>
> It can be different, but I don't think we should display it in both
> sections, because:
> a. It doesn't belong to the control file.
> b. If you look carefully at the original link
>    (http://www.postgresql.org/message-id/1283277511-sup-2...@alvh.no-ip.org),
>    these values are not displayed in the pg_control values section.
> So I suggest it is better to remove it from the pg_control values
> section.

Done as per suggestion.

> Few new comments:
>
> 1. The formatting for 'OldestMulti's DB' and 'OldestXID's DB' is not
>    consistent with the other values; try printing it.

Changed to align the output.

> 2. +It will print values in two sections. In first section it will print all original/guessed values
>    +and in second section all values to be used after reset. In second section filename will be
>    +always printed as segment number will be at-least one more than current. Also if any other parameter
>    +is given using appropriate switch, then those new values will be also printed.
>    a. The length of the newly added lines is not in sync with the
>       previous text; try to keep lines under 80 characters.
>    b. I think there is no need for the text "In second section"; you
>       can remove all text after that point.

Done.

> 3. !    printf(_("  -n    no update, just show extracted control values (for testing) and to be reset values of parameters (if given)\n"));
>    I find this line too long and elaborative; try to make it shorter.

Changed accordingly.

Please provide your opinion.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: pg_resetxlogsectionV4.patch
Re: [HACKERS] TODO: Split out pg_resetxlog output into pre- and post-sections
On 26 November 2013, Amit Kapila wrote:
> Further review of this patch:
>
> a. There are a lot of trailing whitespaces in the patch.

Fixed.

> b. Why display 'First log segment after reset' in the 'Current
>    pg_control values' section, now that the same information is
>    available in the new section "Values to be used after reset:"?

Not always the same. If the user has given a new segment number and TLI,
it will be different; otherwise it will be the same. So I have changed
the code so that the XLOG segment # read from the checkpoint record is
printed as part of the current values, and any new value is printed
under "Values to be used after reset" (always at least one more than the
current value), because of the following code in FindEndOfXLOG():

    xlogbytepos = newXlogSegNo * ControlFile.xlog_seg_size;
    newXlogSegNo = (xlogbytepos + XLogSegSize - 1) / XLogSegSize;
    newXlogSegNo++;

> c. I think it is better to display 'First log segment after reset' as
>    a file name, as it was earlier. This utility takes a filename as
>    input, so I think we should display a filename.

Fixed. Printing the filename.

> d. I still think it is not a good idea to use the new parameter
>    changedParam to detect which parameters are changed; the code
>    already has that information. We should use that rather than
>    introduce a new parameter meaning the same thing.

Fixed. Removed changedParam and made the existing variables like set_xid
global; the same are now used for the check.

> e.  if (guessed && !force)
>     {
>         PrintControlValues(guessed);
>         if (!noupdate)
>         {
>             printf(_("\nIf these values seem acceptable, use -f to force reset.\n"));
>             exit(1);
>         }
>     }
>    Do you think there is any chance that noupdate can be true in the
>    above block? If not, why keep the check for it?

Fixed, along with the last comment.

> f.  if (changedParam & DISPLAY_XID)
>     {
>         printf(_("NextXID:   %u\n"), ControlFile.checkPointCopy.nextXid);
>         printf(_("oldestXID: %u\n"), ControlFile.checkPointCopy.oldestXid);
>     }
>    Here you are printing oldestXid but not oldestXidDB; if the user
>    provides an xid, both values are changed. The same is the case for
>    multitransactions.

Fixed.

> g. /* newXlogSegNo will be always printed unconditionally*/
>    Extra space between "always" and "printed". In single-line comments
>    a space should be provided where the comment text ends; please refer
>    to other single-line comments.

Fixed.

> When the values are guessed and the -n option is not given, this patch
> will not be able to distinguish the values. Don't you think it is
> better to display the values in two sections for guessed values without
> the -n flag as well? Otherwise this utility will have two formats to
> display values.

Yes, you are right. The values are now printed in two sections in two
cases: a. -n is given, or b. we had to guess and -f was not given.

Please let me know your opinion.

Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: pg_resetxlogsectionV3.patch
Re: [HACKERS] PostgreSQL Service on Windows does not start. ~ is not a valid Win32 application
On 25 November 2013, Rajeev Rastogi wrote:
>> One suggestion: instead of using sizeof(cmdLine),
>> a. can't we use strlen (hence a small 'for' loop), or
>> b. use memmove to move one byte?
>
> I looked at this patch a bit. I agree that we need to fix
> pgwin32_CommandLine to double-quote the executable name, but it needs a
> great deal more work than that :-(. Whoever wrote this code was
> apparently unacquainted with the concept of buffer overrun. It's not
> going to be hard at all to crash pg_ctl with overlength arguments. I'm
> not sure that that amounts to a security bug, but it's certainly bad.
>
> After some thought it seems like the most future-proof fix is to not
> use a fixed-length buffer for the command string at all. The attached
> revised patch switches it over to using a PQExpBuffer instead, which is
> pretty much free since we're relying on libpq anyway in this program.
> (We still use a fixed-length buffer for the program path, which is OK
> because that's what find_my_exec and find_other_exec expect.) In
> addition, I fixed it to append ".exe" in both cases, not just the one.
> I'm not in a position to actually test this, but it does compile
> without warnings.

I tested the latest patch. My observation is: if we give a relative data
directory path while registering the service, the service fails to
start, but the same works if the data directory is an absolute path.
This looks like an existing issue; maybe we need to internally convert
the relative data path to an absolute one. Since the mentioned issue is
pre-existing and not caused by this patch, can we take it as a separate
defect and fix it? If so, I can move this patch to Ready For Committer.

Thanks and Regards,
Kumar Rajeev Rastogi