Re: [HACKERS] Partition-wise aggregation/grouping

2017-11-01 Thread Jeevan Chalke
On Sat, Oct 28, 2017 at 3:07 PM, Robert Haas <robertmh...@gmail.com> wrote:

> On Fri, Oct 27, 2017 at 1:01 PM, Jeevan Chalke
> <jeevan.cha...@enterprisedb.com> wrote:
> > 1. Added separate patch for costing Append node as discussed up-front
> > in the patch-set.
> > 2. Since we now cost Append node, we don't need the
> > partition_wise_agg_cost_factor GUC. So removed that. The remaining patch
> > is hence merged into the main implementation patch.
> > 3. Updated rows in test-cases so that we will get partition-wise plans.
>
> With 0006 applied, cost_merge_append() is now a little bit confused:
>
> /*
>  * Also charge a small amount (arbitrarily set equal to operator cost)
> per
>  * extracted tuple.  We don't charge cpu_tuple_cost because a
> MergeAppend
>  * node doesn't do qual-checking or projection, so it has less overhead
>  * than most plan nodes.
>  */
> run_cost += cpu_operator_cost * tuples;
>
> /* Add MergeAppend node overhead like we do it for the Append node */
> run_cost += cpu_tuple_cost * DEFAULT_APPEND_COST_FACTOR * tuples;
>
> The first comment says that we don't add cpu_tuple_cost, and the
> second one then adds half of it anyway.
>

Yep.
But as David reported earlier, if we remove the first part, i.e. adding
cpu_operator_cost per tuple, MergeAppend will be preferred over an Append
node, unlike before. Thus I thought it better to keep both charges, but I am
not so sure. Should we remove that part altogether, or combine both into a
single statement with updated comments?


> I think it's fine to have a #define for DEFAULT_APPEND_COST_FACTOR,
> because as you say it's used twice, but I don't think that should be
> exposed in cost.h; I'd make it private to costsize.c and rename it to
> something like APPEND_CPU_COST_MULTIPLIER.  The word DEFAULT, in
> particular, seems useless to me, since there's no provision for it to
> be overridden by a different value.
>

Agree. Will make that change.
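To make the direction concrete, here is a minimal, self-contained sketch of
how the two per-tuple charges could be folded into one statement, using the
APPEND_CPU_COST_MULTIPLIER name suggested above (the values are just the
postgresql.conf defaults; this is an illustration, not committed code):

#include <stdio.h>

/* Default planner cost parameters (postgresql.conf defaults). */
static const double cpu_tuple_cost = 0.01;
static const double cpu_operator_cost = 0.0025;

/* Suggested rename, kept private to costsize.c. */
#define APPEND_CPU_COST_MULTIPLIER 0.5

/*
 * Per-tuple run cost for MergeAppend if both charges are combined:
 * cpu_operator_cost for the heap maintenance work, plus a fraction of
 * cpu_tuple_cost for pulling each tuple through the node, mirroring
 * the overhead now charged for Append.
 */
static double
merge_append_run_cost(double tuples)
{
	return (cpu_operator_cost +
			cpu_tuple_cost * APPEND_CPU_COST_MULTIPLIER) * tuples;
}

int
main(void)
{
	printf("run cost for 1e6 tuples: %.2f\n", merge_append_run_cost(1e6));
	return 0;
}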


>
> What testing, if any, can we think about doing with this plan to make
> sure it doesn't regress things?  For example, if we do a TPC-H run
> with partitioned tables and partition-wise join enabled, will any
> plans change with this patch?


I have tried doing this on my local development machine. For a 1GB database
(TPC-H scale factor 1), I see no plan change with or without this patch.

I have tried scale factor 10 as well, but the queries do not run well there
due to space and memory constraints on my machine. Can someone else try that
out?


>   Do they get faster or not?  Anyone have
> other ideas for what to test?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Partition-wise aggregation/grouping

2017-10-27 Thread Jeevan Chalke
Hi,

Attached new patch-set here. Changes include:

1. Added separate patch for costing Append node as discussed up-front in the
patch-set.
2. Since we now cost Append node, we don't need the
partition_wise_agg_cost_factor GUC. So removed that. The remaining patch is
hence merged into the main implementation patch.
3. Updated rows in test-cases so that we will get partition-wise plans.

Thanks


On Wed, Oct 18, 2017 at 9:53 AM, Dilip Kumar <dilipbal...@gmail.com> wrote:

> On Tue, Oct 17, 2017 at 10:44 PM, Jeevan Chalke
> <jeevan.cha...@enterprisedb.com> wrote:
> >
>
> > I didn't get what you mean by regression here. Can you please explain?
> >
> > I see that the PWA plan is selected over the regular plan, when enabled,
> > on the basis of costing.
> > Regular planning needs a Result node, which increases the cost, whereas
> > PWA does not need one and thus wins.
>
> Sorry for not explaining clearly. I meant that with the normal plan the
> execution time is 263.678 ms, whereas with PWA it is 339.929 ms.
>
> I only set enable_partition_wise_agg=on, and it switched to PWA and
> execution time increased by 30%.
> I understand that this is the worst case for PWA, where
> FinalizeAggregate receives all the tuples.
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
>



-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


partition-wise-agg-v6.tar.gz
Description: GNU Zip compressed data



Re: [HACKERS] Partition-wise aggregation/grouping

2017-10-17 Thread Jeevan Chalke
On Tue, Oct 17, 2017 at 7:13 PM, Dilip Kumar <dilipbal...@gmail.com> wrote:

> On Fri, Oct 13, 2017 at 12:06 PM, Jeevan Chalke
> <jeevan.cha...@enterprisedb.com> wrote:
> >
> While playing around with the patch I have noticed one regression with
> the partial partition-wise aggregate.
>
> I am consistently able to reproduce this on my local machine.
>
> Scenario: Group by on non-key column and only one tuple per group.
>

I didn't get what you mean by regression here. Can you please explain?

I see that the PWA plan is selected over the regular plan, when enabled, on
the basis of costing.
Regular planning needs a Result node, which increases the cost, whereas PWA
does not need one and thus wins.


-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Partition-wise aggregation/grouping

2017-10-16 Thread Jeevan Chalke
On Fri, Oct 13, 2017 at 1:13 PM, David Rowley <david.row...@2ndquadrant.com>
wrote:

>
> I looked over the patch and saw this:
>
> @@ -1800,6 +1827,9 @@ cost_merge_append(Path *path, PlannerInfo *root,
>   */
>   run_cost += cpu_operator_cost * tuples;
>
> + /* Add MergeAppend node overhead like we do it for the Append node */
> + run_cost += cpu_tuple_cost * DEFAULT_APPEND_COST_FACTOR * tuples;
> +
>   path->startup_cost = startup_cost + input_startup_cost;
>   path->total_cost = startup_cost + run_cost + input_total_cost;
>  }
>
> You're doing that right after a comment that says we don't do that. It
> also does look like the "run_cost += cpu_operator_cost * tuples" is
> trying to do the same thing, so perhaps it's worth just replacing
> that, which by default will double that additional cost, although
> doing so would have the planner slightly prefer a MergeAppend to an
> Append than previously.
>

I think we can remove that code block entirely; I have already added
relevant comments around DEFAULT_APPEND_COST_FACTOR.
However, I am not sure about doing this since, as you correctly said, it may
make the planner prefer a MergeAppend to an Append. Will it be fine if we
remove that code block?
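For concreteness, the "just replace it" option would reduce
cost_merge_append() to a single per-tuple charge, something like this
sketch (with the comment updated accordingly; not the exact patch code):

	/*
	 * Charge one combined per-tuple amount covering both the heap
	 * maintenance work and pulling the tuple through the node.  With
	 * default costs this is twice the old cpu_operator_cost charge.
	 */
	run_cost += cpu_tuple_cost * DEFAULT_APPEND_COST_FACTOR * tuples;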


> +#define DEFAULT_APPEND_COST_FACTOR 0.5
>
> I don't really think the DEFAULT_APPEND_COST_FACTOR adds much. it
> means very little by itself. It also seems that most of the other cost
> functions just use the magic number.
>

Agree, but those magic numbers are each used only once at their respective
places, whereas here there are two call sites. If someone wants to update
the value, they need to remember to update it in both places. To minimize
that risk, having a #define seems better.


>
> --
>  David Rowley       http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>



-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Partition-wise aggregation/grouping

2017-10-13 Thread Jeevan Chalke
On Tue, Oct 10, 2017 at 1:31 PM, David Rowley <david.row...@2ndquadrant.com>
wrote:

> On 10 October 2017 at 17:57, Ashutosh Bapat
> <ashutosh.ba...@enterprisedb.com> wrote:
> > Append node just returns the result of ExecProcNode(). Charging
> > cpu_tuple_cost may make it too expensive. In other places where we
> > charge cpu_tuple_cost there's some processing done to the tuple like
> > ExecStoreTuple() in SeqNext(). May be we need some other measure for
> > Append's processing of the tuple.
>
> I don't think there's any need to invent any new GUC. You could just
> divide cpu_tuple_cost by something.
>
> I did a quick benchmark on my laptop to see how much Append really
> costs, and with the standard costs the actual cost seems to be about
> cpu_tuple_cost / 2.4. So probably cpu_tuple_cost / 2 might be
> realistic. create_set_projection_path() does something similar and
> brincostestimate() does some similar magic and applies 0.1 *
> cpu_operator_cost to the total cost.
>
> # create table p (a int, b int);
> # create table p1 () inherits (p);
> # insert into p1 select generate_series(1,100);
> # vacuum analyze p1;
> # \q
> $ echo "select count(*) from p1;" > p1.sql
> $ echo "select count(*) from p;" > p.sql
> $ pgbench -T 60 -f p1.sql -n
>
> latency average = 58.567 ms
>
> $ pgbench -T 60 -f p.sql -n
> latency average = 72.984 ms
>
> $ psql
> psql (11devel)
> Type "help" for help.
>
> # -- check the cost of the plan.
> # explain select count(*) from p1;
> QUERY PLAN
> --
>  Aggregate  (cost=16925.00..16925.01 rows=1 width=8)
>->  Seq Scan on p1  (cost=0.00..14425.00 rows=100 width=0)
> (2 rows)
>
> # -- selecting from the parent is the same due to zero Append cost.
> # explain select count(*) from p;
>QUERY PLAN
> 
>  Aggregate  (cost=16925.00..16925.01 rows=1 width=8)
>->  Append  (cost=0.00..14425.00 rows=101 width=0)
>  ->  Seq Scan on p  (cost=0.00..0.00 rows=1 width=0)
>  ->  Seq Scan on p1  (cost=0.00..14425.00 rows=100 width=0)
> (4 rows)
>
> # -- extrapolate the additional time taken for the Append scan and
> work out what the planner
> # -- should add to the plan's cost, then divide by the number of rows
> in p1 to work out the
> # -- tuple cost of pulling a row through the append.
> # select (16925.01 * (72.984 / 58.567) - 16925.01)  / 100;
> ?column?
> 
>  0.00416630302337493743
> (1 row)
>
> # show cpu_tuple_cost;
>  cpu_tuple_cost
> 
>  0.01
> (1 row)
>
> # -- How does that compare to the cpu_tuple_cost?
> # select current_Setting('cpu_tuple_cost')::float8 /
> 0.00416630302337493743;
> ?column?
> 
>  2.400209476818
> (1 row)
>
> Maybe it's worth trying with different row counts to see if the
> additional cost is consistent, but it's probably not worth being too
> critical here.
>

I have tried exactly the same tests to derive this factor on my local
development machine. With parallelism enabled I got this number as 7.9.
However, if I disable parallelism (and I believe David disabled it too), I
get 1.8. For 1 rows, I get 1.7:

-- With Gather
# select current_Setting('cpu_tuple_cost')::float8 / ((10633.56 * (81.035 /
72.450) - 10633.56)  / 100);
7.9

-- Without Gather
# select current_Setting('cpu_tuple_cost')::float8 / ((16925.01 * (172.838
/ 131.400) - 16925.01)  / 100);
1.8

-- With 1 rows (so no Gather too)
# select current_Setting('cpu_tuple_cost')::float8 / ((170.01 * (1.919 /
1.424) - 170.01)  / 1);
1.7

So it is not so straightforward to come up with the correct heuristic here.
Thus, using 50% of cpu_tuple_cost looks good to me.

As suggested by Ashutosh and Robert, I have attached a separate small WIP
patch for it.

I think it will be better if we move this topic to another mail thread.
Do you agree?
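For reference, here is a rough sketch of the shape such a cost function
could take under that 50% assumption (my illustration of the attached WIP
patch's idea, not its exact code):

#define DEFAULT_APPEND_COST_FACTOR 0.5

/*
 * Sketch: charge half a cpu_tuple_cost for pulling each row through
 * the Append node, on top of the children's own costs.  'tuples' is
 * the total row estimate across all child paths.
 */
static void
cost_append_sketch(double tuples, double cpu_tuple_cost,
				   double input_startup_cost, double input_total_cost,
				   double *startup_cost, double *total_cost)
{
	double		run_cost;

	run_cost = cpu_tuple_cost * DEFAULT_APPEND_COST_FACTOR * tuples;

	*startup_cost = input_startup_cost;
	*total_cost = input_startup_cost + run_cost + input_total_cost;
}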


>
> --
>  David Rowley   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>



-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index ce32b8a4..c0bf602 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1742,6 +1742,33 @@ cost_sort(Path *path, PlannerInfo *root,
 }
 
 /*
 * cost_append
 *	  Determines and returns the cost of an Append node

Re: [HACKERS] Partition-wise aggregation/grouping

2017-10-10 Thread Jeevan Chalke
On Tue, Oct 10, 2017 at 10:27 AM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> On Tue, Oct 10, 2017 at 3:15 AM, David Rowley
> <david.row...@2ndquadrant.com> wrote:
> > On 10 October 2017 at 01:10, Jeevan Chalke
> > <jeevan.cha...@enterprisedb.com> wrote:
> >> Attached new patch set having HEAD at 84ad4b0 with all these review
> points
> >> fixed. Let me know if I missed any thanks.
> >
> > I've only really skimmed over this thread and only opened the code
> > enough to extract the following:
> >
> > + /* Multiply the costs by partition_wise_agg_cost_factor. */
> > + apath->startup_cost *= partition_wise_agg_cost_factor;
> > + apath->total_cost *= partition_wise_agg_cost_factor;
> >
> > I've not studied how all the path plumbing is done, but I think
> > instead of doing this costing magic we should really stop pretending
> > that Append/MergeAppend nodes are cost-free. I think something like
> > charging cpu_tuple_cost per row expected through Append/MergeAppend
> > would be a better approach to this.
> >
> > If you perform grouping or partial grouping before the Append, then in
> > most cases the Append will receive less rows, so come out cheaper than
> > if you perform the grouping after it. I've not learned the
> > partition-wise join code enough to know if this is going to affect
> > that too, but for everything else, there should be no plan change,
> > since there's normally no alternative paths. I see there's even a
> > comment in create_append_path() which claims the zero cost is a bit
> > optimistic.
> >
>
> +1.


Yes. Ashutosh and I had discussed this off-list: we will need to cost the
Append node too, as having an extra GUC to control this is not a good idea
per se. Thanks for your vote too.

I will try doing this and see how the plans come out with it.

> Partition-wise join will also benefit from costing Append
> processing. Number of rows * width of join result compared with the
> sum of that measure for joining relations decides whether Append node
> processes more data in Append->Join case than Join->Append case.
>
> Append node just returns the result of ExecProcNode(). Charging
> cpu_tuple_cost may make it too expensive. In other places where we
> charge cpu_tuple_cost there's some processing done to the tuple like
> ExecStoreTuple() in SeqNext(). May be we need some other measure for
> Append's processing of the tuple.
>
> May be we should try to measure the actual time spent in Append node
> as a fraction of say time spent in child seq scans. That might give us
> a clue as to how Append processing can be charged in terms of costing.
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>



-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company



Re: [HACKERS] Partition-wise aggregation/grouping

2017-10-09 Thread Jeevan Chalke
 BY clause or a degenerate (constant)
> one,
> + * in which case there will only be a single group.
> + */
> append path here can be the output of either a merge append or an append.
> If it's the output of a merge append, we don't need to sort it again, do
> we?
>

Yes, you are right; we don't need an explicit sort over a merge-append.
I have made those changes, along the lines sketched below.
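The shape of that change is roughly the following, using the existing
planner helpers ('subpath' here stands for the appended child-aggregation
path; this is an outline of the idea, not the exact patch code):

/*
 * Only add an explicit Sort when the appended input is not already
 * ordered on the grouping pathkeys; a MergeAppend output already is.
 */
static Path *
maybe_sort_for_grouping(PlannerInfo *root, RelOptInfo *grouped_rel,
						Path *subpath)
{
	if (!pathkeys_contained_in(root->group_pathkeys, subpath->pathkeys))
		subpath = (Path *) create_sort_path(root, grouped_rel, subpath,
											root->group_pathkeys, -1.0);
	return subpath;
}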


> create_partition_agg_paths() creates append paths and then adds
> finalization
> path if necessary. The code to add finalization path seems to be similar
> to the
> code that adds finalization path for parallel query. May be we could take
> out
> common code into a function and call that function in two places. I see
> this
> function as accepting a partial aggregation/grouping path and returning a
> path
> that finalizes partial aggregates/groups.
>

It seems that would become messy. Per my understanding, the only common
code is the add_path() call with the appropriate create_agg_path() or
create_group_path(). Those are function calls anyway, and I don't see any
reason to split them out into a separate function.


>
>
Attached new patch set with HEAD at 84ad4b0 and all these review points
fixed. Let me know if I missed any. Thanks.

I have merged the parallelism changes into the main patch, i.e. 0007, as
most of the changes in that patch actually modify the same lines added by
0007.

Thanks

-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


partition-wise-agg-v5.tar.gz
Description: GNU Zip compressed data



Re: [HACKERS] Partition-wise aggregation/grouping

2017-09-27 Thread Jeevan Chalke
Thanks Ashutosh for reviewing.

Attached new patch-set with following changes:

1. Removed the earlier 0007 and 0008 patches, which were a PoC for
supporting partial aggregation over FDWs. That is a different issue
altogether, so I will tackle it separately once this is done.

This patch-set now includes support for parallel plans within partitions.

Notes:
HEAD: 59597e6
Partition-wise Join Version: 34

(First six patches 0001 - 0006, remains the same functionality-wise)
0007 - Refactors partial grouping path creation into a separate function.
0008 - Enables parallelism within partition-wise aggregation.

This patch also includes a fix for the crash reported by Rajkumar.
While forcibly applying the scan/join target to all the Paths for the
scan/join rel, I was earlier using apply_projection_to_path(), which
modifies the path in place; this caused the crash, as the finally chosen
path had been updated by partition-wise aggregate path creation. Now I use
create_projection_path(), as we do for partial aggregation paths. (See the
sketch below.)
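In other words, the fix replaces the in-place mutation with a new path node
layered on top; roughly (a sketch, with 'rel' and 'path' standing for the
scan/join rel and one of its paths):

/*
 * Before (buggy): apply_projection_to_path() modifies 'path' in place,
 * corrupting it for any other plan that still points at it.
 *
 *		path = apply_projection_to_path(root, rel, path, scanjoin_target);
 *
 * After (fixed): build a separate ProjectionPath on top, leaving the
 * original path untouched.
 */
path = (Path *) create_projection_path(root, rel, path, scanjoin_target);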

Also, fixed issues reported by Ashutosh.

On Tue, Sep 26, 2017 at 6:16 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> Hi Jeevan,
> I have started reviewing these patches.
>
> 0001 looks fine. There might be some changes that will be needed, but
> those will be clear when I review the patch that uses this
> refactoring.
>
> 0002
> + *
> + * If targetlist is provided, we use it else use targetlist from the root.
>   */
>  static double
>  get_number_of_groups(PlannerInfo *root,
>  double path_rows,
> -grouping_sets_data *gd)
> +grouping_sets_data *gd,
> +List *tlist)
>  {
> Query  *parse = root->parse;
> double  dNumGroups;
> +   List   *targetList = (tlist == NIL) ? parse->targetList : tlist;
>
> May be we should just pass targetlist always. Instead of passing NIL,
> pass parse->targetList directly. That would save us one conditional
> assignment. May be passing NIL is required for the patches that use
> this refactoring, but that's not clear as is in this patch.
>

Done in attached patch-set.


>
> 0003
> In the documentation of enable_partition_wise_aggregate, we should
> probably explain why the default is off or, like the partition_wise_join
> GUC, explain the consequences of turning it off.


I have updated this. Please have a look.


> I doubt we could accept something like partition_wise_agg_cost_factor.
> But we can discuss this at a later stage. Meanwhile, it may be worthwhile
> to fix the reason why we would require this GUC. If the regular
> aggregation has a lower cost than partition-wise aggregation in most of
> the cases, then probably we need to fix the cost model.
>

Yep. I will have a look meanwhile.


>
> I will continue reviewing rest of the patches.
>
>
Thanks
-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


partition-wise-agg-v4.tar.gz
Description: GNU Zip compressed data



Re: [HACKERS] Partition-wise aggregation/grouping

2017-09-22 Thread Jeevan Chalke
On Thu, Sep 21, 2017 at 5:00 PM, Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> Hi Jeevan,
>
> while testing latest v3 patches, I am  getting a server crash if I reset
> partition_wise_agg_cost_factor, please take a look.
>
> CREATE TABLE lp (a TEXT, b FLOAT, c INT) PARTITION BY LIST(c);
> CREATE TABLE lp1 PARTITION OF lp FOR VALUES IN (10,20);
> CREATE TABLE lp2 PARTITION OF lp FOR VALUES IN (30,40);
>
> INSERT INTO lp VALUES ('a1',800, 20);
> INSERT INTO lp VALUES ('a2',1250,30);
> INSERT INTO lp VALUES ('a3',2975,20);
> INSERT INTO lp VALUES ('a3',2850,30);
>
> postgres=# SET enable_partition_wise_agg TO true;
> SET
> postgres=# SET partition_wise_agg_cost_factor TO 0.5;
> SET
> postgres=#
> postgres=# SELECT MAX(b), AVG(b) FROM lp GROUP BY a HAVING a = 'a3' ORDER
> BY 1,2;
>  max  |  avg
> --+
>  2975 | 2912.5
> (1 row)
>
> postgres=# RESET partition_wise_agg_cost_factor;
> RESET
> postgres=#
> postgres=# SELECT MAX(b), AVG(b) FROM lp GROUP BY a HAVING a = 'a3' ORDER
> BY 1,2;
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
>
Oops. Will fix this.

I have added these tests to the regression tests, where they work as
expected. However, running the same steps in psql reproduces the crash
(though not consistently).

Looking into it. Thanks for reporting.


> Thanks & Regards,
> Rajkumar Raghuwanshi
> QMG, EnterpriseDB Corporation
>


-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-09-20 Thread Jeevan Chalke
other partition-wise
> >> optimizations individually. So, they are somewhat independent
> >> switches. I don't think we should bundle all of those into one.
> >> Whatever names we choose for those GUCs, I think they should have same
> >> naming convention e.g. "partition_wise_xyz". I am open to suggestions
> >> about the names.
> >
> > I think the chances of you getting multiple GUCs for different
> > partition-wise optimizations past Tom are pretty low.
>
> We do have enable_hashjoin and enable_hashagg to control use of
> hashing for aggregate and join. On similar lines we can have three
> GUCs to enable use of partition-wise strategy, one for each of join,
> aggregation and sorting. Having granular switches would be useful for
> debugging and may be to turn partition-wise strategies off when they
> are not optimal.


I think having granular control over each of these optimizations will be
handy for DBAs too.


> Do we want a switch to turn ON/OFF partition pruning?
> That said, I am fine with a single GUC controlling all. We won't set any
> partitioning information in RelOptInfo if that GUC is turned OFF.
>
> [1] https://pdfs.semanticscholar.org/27c2/ba75f8b6a39d4bce85d5579dace609
> c9abaa.pdf
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>
>
>
>


-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Partition-wise aggregation/grouping

2017-09-18 Thread Jeevan Chalke
On Tue, Sep 12, 2017 at 6:21 PM, Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

>
>
> On Tue, Sep 12, 2017 at 3:24 PM, Rajkumar Raghuwanshi <
> rajkumar.raghuwan...@enterprisedb.com> wrote:
>
>>
>> Hi Jeevan,
>>
>> I have started testing partition-wise-aggregate and got one observation,
>> please take a look.
>> with the v2 patch, here if I change target list order, query is not
>> picking full partition-wise-aggregate.
>>
>
> Thanks Rajkumar for reporting this.
>
> I am looking into this issue and will post updated patch with the fix.
>

The logic for checking whether the partition keys lead the GROUP BY keys
needs to be updated here. The GROUP BY expressions can appear in any order
without affecting the final result, so requiring the partition keys to be
the leading GROUP BY keys for full aggregation is not necessary. Instead,
we must ensure that the partition keys are a subset of the GROUP BY keys to
compute full aggregation on a partition, as sketched below.
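A sketch of the revised check: instead of insisting that the partition keys
form a prefix of the GROUP BY list, we only require that each partition key
expression appears somewhere in it. The function name and list-based
signature here are my illustration, assuming the backend list helpers:

/*
 * Return true if every partition key expression appears (anywhere) in
 * the GROUP BY expression list; position no longer matters.
 */
static bool
group_by_has_partkey(List *partkey_exprs, List *groupby_exprs)
{
	ListCell   *lc;

	foreach(lc, partkey_exprs)
	{
		Expr	   *partkey = (Expr *) lfirst(lc);

		/* list_member() compares with equal(), i.e. by node contents. */
		if (!list_member(groupby_exprs, partkey))
			return false;
	}
	return true;
}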

Attached is the revised patch-set with the above fix.

Also, in test-cases, I have removed DROP/ANALYZE commands on child
relations and also removed VERBOSE from the EXPLAIN.

Notes:
HEAD: 8edacab209957520423770851351ab4013cb0167
Partition-wise Join patch-set version: v32

Thanks

-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


partition-wise-agg-v3.tar.gz
Description: GNU Zip compressed data



Re: [HACKERS] Constraint exclusion for partitioned tables

2017-09-13 Thread Jeevan Chalke
On Tue, Sep 12, 2017 at 8:12 PM, Robert Haas <robertmh...@gmail.com> wrote:

> On Tue, Sep 12, 2017 at 7:08 AM, Jeevan Chalke
> <jeevan.cha...@enterprisedb.com> wrote:
> > This patch clearly improves the planning time with given conditions.
> >
> > To verify that, I have created a table like:
> > create table foo(a int, b int check (b > 100), c text) partition by
> > range(a);
> > And then used following query to get planning time:
> > select * from foo where b < 100;
> >
> > And on my local setup, I have observed that,
> > For 16 partitions, planning time was 0.234692 ms, which reduced to
> 0.112948
> > ms with this patch.
> > For 128 partitions, planning time was 1.62305 ms, which reduced to
> 0.654252
> > ms with this patch.
> > For 1024 partitions, planning time was 18.720993 ms, which reduced to
> > 9.667395 ms with this patch.
> >
> > This clearly shows an improvement in planning time.
>
> What about the extra cost of checking the parent when it doesn't help?
>  In that case we will have some loss.
>
> I'm inclined to think that's OK, but it's something to think about.
>

I have updated the query to:
select * from foo where b > 100;
which matches the CHECK constraint. Here are the results on my local setup:

Time in milliseconds
Partitions | without patch | with patch
-----------|---------------|-----------
2          | 0.072551      | 0.074154
4          | 0.102537      | 0.108024
8          | 0.162703      | 0.175017
16         | 0.288589      | 0.305285
128        | 2.7119        | 2.636247
1024       | 29.101347     | 29.48275

So yes, as you said, there will be a slight (maybe negligible) overhead.

These observations are from a local setup, and I have also seen a large
standard deviation across runs.

Thanks


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Partition-wise aggregation/grouping

2017-09-12 Thread Jeevan Chalke
On Tue, Sep 12, 2017 at 3:24 PM, Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

>
> On Fri, Sep 8, 2017 at 5:47 PM, Jeevan Chalke <jeevan.chalke@enterprisedb.
> com> wrote:
>
>> Here are the new patch-set re-based on HEAD (f0a0c17) and
>> latest partition-wise join (v29) patches.
>>
>
> Hi Jeevan,
>
> I have started testing partition-wise-aggregate and got one observation,
> please take a look.
> with the v2 patch, here if I change target list order, query is not
> picking full partition-wise-aggregate.
>

Thanks Rajkumar for reporting this.

I am looking into this issue and will post updated patch with the fix.

Thanks & Regards,
> Rajkumar Raghuwanshi
> QMG, EnterpriseDB Corporation
>
>
>
>

-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Constraint exclusion for partitioned tables

2017-09-12 Thread Jeevan Chalke
Hi,

I had a look at these changes and here are my observations:

1. Patch applies cleanly with "git apply".
2. make / make install / make check-world all are good.

This patch clearly improves the planning time under the given conditions.

To verify that, I have created a table like:
create table foo(a int, b int check (b > 100), c text) partition by
range(a);
And then used following query to get planning time:
select * from foo where b < 100;

And on my local setup, I have observed that,
For 16 partitions, planning time was 0.234692 ms, which reduced to 0.112948
ms with this patch.
For 128 partitions, planning time was 1.62305 ms, which reduced to 0.654252
ms with this patch.
For 1024 partitions, planning time was 18.720993 ms, which reduced to
9.667395 ms with this patch.

This clearly shows an improvement in planning time.

Patch looks good to me. So passing that to the committer.

Thanks


On Tue, Aug 1, 2017 at 7:17 AM, Robert Haas <robertmh...@gmail.com> wrote:

> On Thu, Apr 6, 2017 at 6:47 AM, Ashutosh Bapat
> <ashutosh.ba...@enterprisedb.com> wrote:
> > I am guessing that for normal inheritance, a constraint on parent
> > doesn't necessarily imply the same constraint on the child (Amit
> > Langote gives me an example of NOT NULL constraint).
>
> CHECK constraints that apply to the parent would apply to all
> children, unless they are NO INHERIT, so even for regular inheritance,
> it might still be possible to prove something by ignoring things that
> won't necessarily cascade.
>
> For partitioning, it may be that we've got enough restrictions in
> place on what can happen that we can assume everything can cascade.
> Actually, I hope that's true, since the partitioned table has no
> storage of its own.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: proposal - using names as primary names of plpgsql function parameters instead $ based names

2017-09-11 Thread Jeevan Chalke
Hi Pavel,


On Sat, Sep 9, 2017 at 11:42 AM, Pavel Stehule <pavel.steh...@gmail.com>
wrote:

> Hi
>
> 2017-09-08 9:36 GMT+02:00 Jeevan Chalke <jeevan.cha...@enterprisedb.com>:
>
>> Hi Pavel,
>> I like the idea of using parameter name instead of $n symbols.
>>
>> However, I am slightly worried that if, at execution time, we want to
>> know the parameter position in the actual function signature, it will
>> become difficult to get that from the corresponding datum variable. I
>> don't have a use-case for that, though. Apart from this concern, the
>> idea looks good to me.
>>
>
> Understood - but that was the reason why I implemented this function -
> when I have to search for a parameter name via offset, I cannot use
> string searching. When you know the parameter name, you can use string
> searching in a text editor or pager.
>
> It is better supported now than with the current behaviour.
>

Makes sense.


>
>
>>
>> BTW, instead of doing all these changes, I have done these changes this
>> way:
>>
>> -   /* Build variable and add to datum list */
>> -   argvariable = plpgsql_build_variable(buf, 0,
>> -argdtype, false);
>> +   /*
>> +* Build variable and add to datum list.  If there's a
>> name for
>> +* the argument, then use that else use $n name.
>> +*/
>> +   argvariable = plpgsql_build_variable((argnames &&
>> argnames[i][0] != '\0') ?
>> +argnames[i] : buf,
>> +0, argdtype, false);
>>
>> This requires no new variable and thus no more changes elsewhere.
>>
>> Attached patch with these changes. Please have a look.
>>
>
> Looks great - I added a check for NULL only
>

Looks good.
I had not made those changes in my earlier patch as I did not want to
touch other code unrelated to this patch.

Anyway, your changes related to the NULL check seem reasonable.
However, in the attached patch I have fixed the indentation.

Passing it on to the committer.

Thanks
-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index e9d7ef5..f450156 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -433,9 +433,14 @@ do_compile(FunctionCallInfo fcinfo,
 			 errmsg("PL/pgSQL functions cannot accept type %s",
 									format_type_be(argtypeid))));
 
-/* Build variable and add to datum list */
-argvariable = plpgsql_build_variable(buf, 0,
-	 argdtype, false);
+/*
+ * Build variable and add to datum list.  If there's a name for
+ * the argument, then use that else use $n name.
+ */
+argvariable = plpgsql_build_variable((argnames != NULL &&
+	  argnames[i][0] != '\0') ?
+	 argnames[i] : buf,
+	 0, argdtype, false);
 
 if (argvariable->dtype == PLPGSQL_DTYPE_VAR)
 {
@@ -461,7 +466,7 @@ do_compile(FunctionCallInfo fcinfo,
 add_parameter_name(argitemtype, argvariable->dno, buf);
 
 /* If there's a name for the argument, make an alias */
-if (argnames && argnames[i][0] != '\0')
+if (argnames != NULL && argnames[i][0] != '\0')
 	add_parameter_name(argitemtype, argvariable->dno,
 	   argnames[i]);
 			}
diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out
index 7109996..cf589d5 100644
--- a/src/test/regress/expected/plpgsql.out
+++ b/src/test/regress/expected/plpgsql.out
@@ -6029,3 +6029,17 @@ SELECT * FROM list_partitioned_table() AS t;
  2
 (2 rows)
 
+--
+-- Check argument name is used instead of $n
+--
+CREATE TYPE ct AS (a int, b int);
+-- Should fail, error message should contain argument name instead of $1
+CREATE OR REPLACE FUNCTION fx(x ct) RETURNS void AS $$
+BEGIN
+  GET DIAGNOSTICS x = ROW_COUNT;
+  RETURN;
+END; $$ LANGUAGE plpgsql;
+ERROR:  "x" is not a scalar variable
+LINE 3:   GET DIAGNOSTICS x = ROW_COUNT;
+  ^
+DROP TYPE ct;
diff --git a/src/test/regress/sql/plpgsql.sql b/src/test/regress/sql/plpgsql.sql
index 771d682..42f51e9 100644
--- a/src/test/regress/sql/plpgsql.sql
+++ b/src/test/regress/sql/plpgsql.sql
@@ -4811,3 +4811,17 @@ BEGIN
 END; $$ LANGUAGE plpgsql;
 
 SELECT * FROM list_partitioned_table() AS t;
+
+--
+-- Check argument name is used instead of $n
+--
+CREATE TYPE ct AS (a int, b int);
+
+-- Should fail, error message should contain argument name instead of $1
+CREATE OR REPLACE FUNCTION fx(x ct) RETURNS void AS $$
+BEGIN
+  GET DIAGNOSTICS x = ROW_COUNT;
+  RETURN;
+END; $$ LANGUAGE plpgsql;
+
+DROP TYPE ct;



Re: [HACKERS] Partition-wise aggregation/grouping

2017-09-08 Thread Jeevan Chalke
On Wed, Aug 23, 2017 at 4:43 PM, Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

> Hi,
>
> Attached is the patch to implement partition-wise aggregation/grouping.
>
> As explained earlier, we produce a full aggregation for each partition
> when the partition keys are the leading GROUP BY clauses, and then an
> Append is performed. Otherwise, we do a partial aggregation on each
> partition, append the results, and then add a finalization step on top.
>
> I have observed that the cost estimated for partition-wise aggregation
> and the cost for plans without it are almost the same. However, execution
> time shows significant improvement (as explained in my very first email)
> with partition-wise aggregates. The planner chooses a plan according to
> the costs, and thus most of the time the plan without partition-wise
> aggregation is chosen. Hence, to force partition-wise plans for the
> regression runs, I have added a GUC named partition_wise_agg_cost_factor
> to adjust the costing.
>
> This feature is only used when enable_partition_wise_agg GUC is set to on.
>
> Here are the details of the patches in the patch-set:
>

Here are the new patch-set re-based on HEAD (f0a0c17) and
latest partition-wise join (v29) patches.


>
> 0001 - Refactors sort and hash final grouping paths into separate
> functions.
> Since partition-wise aggregation too builds paths same as that of
> create_grouping_paths(), separated path creation for sort and hash agg into
> separate functions. These functions later used by main partition-wise
> aggregation/grouping patch.
>
> 0002 - Passes targetlist to get_number_of_groups().
> We need to estimate groups for individual child relations and thus need to
> pass targetlist corresponding to the child rel.
>
> 0003 - Adds enable_partition_wise_agg and partition_wise_agg_cost_factor
> GUCs.
>
> 0004 - Implements partition-wise aggregation.
>
> 0005 - Adds test-cases.
>
> 0006 - postgres_fdw changes which enable pushing aggregation for other
> upper
> relations.
>

0007 - Provides infrastructure to allow partial aggregation.
This will allow us to push partial aggregation over an FDW.
With this, one can write SUM(PARTIAL x) to get a partial-sum
result. Since PARTIAL is used in the syntax, I had to move it
to the reserved keywords category. This is a kind of PoC patch
and needs input on the approach and the way it is implemented.

0008 - Teaches postgres_fdw to push partial aggregation.
With this, we can push an aggregate to the remote server even when
the GROUP BY key does not match the PARTITION key.


>
>
> Since this patch is highly dependent on partition-wise join [1], one needs
> to
> apply all those patches on HEAD (my repository head was at:
> 66ed3829df959adb47f71d7c903ac59f0670f3e1) before applying these patches in
> order.
>
> Suggestions / feedback / inputs ?
>
> [1] https://www.postgresql.org/message-id/CAFjFpRd9Vqh_=-Ldv-
> XqWY006d07TJ+VXuhXCbdj=p1juky...@mail.gmail.com
>
>
>
-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


partition-wise-agg-v2.tar.gz
Description: GNU Zip compressed data



Re: [HACKERS] Re: proposal - using names as primary names of plpgsql function parameters instead $ based names

2017-09-08 Thread Jeevan Chalke
Hi Pavel,

On Sat, May 20, 2017 at 11:55 AM, Pavel Stehule <pavel.steh...@gmail.com>
wrote:

>
>
>
> 2017-05-19 5:48 GMT+02:00 Pavel Stehule <pavel.steh...@gmail.com>:
>
>>
>>
>> 2017-05-19 3:14 GMT+02:00 Peter Eisentraut <peter.eisentraut@2ndquadrant.
>> com>:
>>
>>> On 5/15/17 14:34, Pavel Stehule wrote:
>>> > Now, I when I working on plpgsql_check, I have to check function
>>> > parameters. I can use fn_vargargnos and out_param_varno for list of
>>> > arguments and related varno(s). when I detect some issue, I am
>>> using
>>> > refname. It is not too nice now, because these refnames are $
>>> based.
>>> > Long names are alias only. There are not a possibility to find
>>> > related alias.
>>> >
>>> > So, my proposal. Now, we can use names as refname of parameter
>>> > variable. $ based name can be used as alias. From user perspective
>>> > there are not any change.
>>> >
>>> > Comments, notes?
>>> >
>>> > here is a patch
>>>
>>>
I like the idea of using the parameter name instead of the $n symbols.

However, I am slightly worried that if, at execution time, we want to know
the parameter position in the actual function signature, it will become
difficult to get that from the corresponding datum variable. I don't have a
use-case for that, though. Apart from this concern, the idea looks good to
me.

Here are review comments on the patch:

1.
+char   *argname = NULL;

There is no need to initialize argname here. The later code does that.

2.
+argname = (argnames && argnames[i][0] != 0) ? argnames[i]
: NULL;

It will be better to check for '\0' instead of 0, as we do elsewhere already.

3.
The check for argname's existence is not consistent. At one place you have
used "argname != NULL" and at the other it is "argname != '\0'".
Better to use "argname != NULL" in both places.

4.
-- should fail -- message should to contain argument name
Should be something like this:
-- Should fail, error message should contain argument name

5.
+argvariable = plpgsql_build_variable(argname != NULL ?
+   argname : buf,
+   0, argdtype,
false);

Please correct indentation.

---

BTW, instead of doing all these changes, I have done these changes this way:

-   /* Build variable and add to datum list */
-   argvariable = plpgsql_build_variable(buf, 0,
-argdtype, false);
+   /*
+* Build variable and add to datum list.  If there's a name
for
+* the argument, then use that else use $n name.
+*/
+   argvariable = plpgsql_build_variable((argnames &&
argnames[i][0] != '\0') ?
+argnames[i] : buf,
+0, argdtype, false);

This requires no new variable and thus no more changes elsewhere.

Attached patch with these changes. Please have a look.

Thanks


-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index e9d7ef5..dd17bb5 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -433,9 +433,13 @@ do_compile(FunctionCallInfo fcinfo,
 			 errmsg("PL/pgSQL functions cannot accept type %s",
 									format_type_be(argtypeid))));
 
-/* Build variable and add to datum list */
-argvariable = plpgsql_build_variable(buf, 0,
-	 argdtype, false);
+/*
+ * Build variable and add to datum list.  If there's a name for
+ * the argument, then use that else use $n name.
+ */
+argvariable = plpgsql_build_variable((argnames && argnames[i][0] != '\0') ?
+	 argnames[i] : buf,
+	 0, argdtype, false);
 
 if (argvariable->dtype == PLPGSQL_DTYPE_VAR)
 {
diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out
index 7109996..cf589d5 100644
--- a/src/test/regress/expected/plpgsql.out
+++ b/src/test/regress/expected/plpgsql.out
@@ -6029,3 +6029,17 @@ SELECT * FROM list_partitioned_table() AS t;
  2
 (2 rows)
 
+--
+-- Check argument name is used instead of $n
+--
+CREATE TYPE ct AS (a int, b int);
+-- Should fail, error message should contain argument name instead of $1
+CREATE OR REPLACE FUNCTION fx(x ct) RETURNS void AS $$
+BEGIN
+  GET DIAGNOSTICS x = ROW_COUNT;
+  RETURN;
+END; $$ LANGUAGE plpgsql;
+ERROR:  "x

Re: [HACKERS] WIP: Aggregation push-down

2017-09-07 Thread Jeevan Chalke
Hi Antonin,

To understand the feature you have proposed, I have been studying your
patch. Here are a few comments on it so far:

1.
+ if (aggref->aggvariadic ||
+ aggref->aggdirectargs || aggref->aggorder ||
+ aggref->aggdistinct || aggref->aggfilter)

I did not understand why you are not pushing the aggregate down in the
above cases. Can you please explain?

2. "make check" in contrib/postgres_fdw crashes.

  SELECT COUNT(*) FROM ft1 t1;
! server closed the connection unexpectedly
!   This probably means the server terminated abnormally
!   before or while processing the request.
! connection to server was lost

From your given setup, a query like:
EXPLAIN VERBOSE SELECT count(*) FROM orders_0;
crashes.

It seems a few checks are missing.

3. In the case of pushing partial aggregation to the remote side, you use a
schema named "partial"; I did not understand that change. If I have an
AVG() aggregate, I end up getting an error saying
"ERROR:  schema "partial" does not exist".

4. I am not sure about the code changes related to pushing a partial
aggregate to the remote side. Basically, you push the whole aggregate to
the remote side and treat its result as partial on the basis of
aggtype == transtype. I am not sure that is correct, unless I am missing
something: the types may be the same, but the results may differ.

5. There are quite a few TODO comments in the patch-set; are you working
on those?

Thanks


On Thu, Aug 17, 2017 at 8:52 PM, Antonin Houska <a...@cybertec.at> wrote:

> Antonin Houska <a...@cybertec.at> wrote:
>
> > Antonin Houska <a...@cybertec.at> wrote:
> >
> > > This is a new version of the patch I presented in [1].
> >
> > Rebased.
> >
> > cat .git/refs/heads/master
> > b9a3ef55b253d885081c2d0e9dc45802cab71c7b
>
> This is another version of the patch.
>
> Besides other changes, it enables the aggregation push-down for
> postgres_fdw,
> although only for aggregates whose transient aggregation state is equal to
> the
> output type. For other aggregates (like avg()) the remote nodes will have
> to
> return the transient state value in an appropriate form (maybe bytea type),
> which does not depend on PG version.
>
> shard.tgz demonstrates the typical postgres_fdw use case. One query shows
> base
> scans of base relation's partitions being pushed to shard nodes, the other
> pushes down a join and performs aggregation of the join result on the
> remote
> node. Of course, the query can only references one particular partition,
> until
> the "partition-wise join" [1] patch gets committed and merged with this my
> patch.
>
> One thing I'm not sure about is whether the patch should remove
> GetForeignUpperPaths function from FdwRoutine, which it essentially makes
> obsolete. Or should it only be deprecated so far? I understand that
> deprecation is important for "ordinary SQL users", but FdwRoutine is an
> interface for extension developers, so the policy might be different.
>
> [1] https://commitfest.postgresql.org/14/994/
>
> Any feedback is appreciated.
>
> --
> Antonin Houska
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26
> A-2700 Wiener Neustadt
> Web: http://www.postgresql-support.de, http://www.cybertec.at
>
>
>
>
>


-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Replacing lfirst() with lfirst_node() appropriately in planner.c

2017-09-05 Thread Jeevan Chalke
Hi,

The lfirst_node() changes are missing for the List node type, and I was
thinking about adding those. But it looks like we cannot, since a List can
be a T_List, T_IntList, or T_OidList. So I am OK with that.

Apart from this, changes look good to me. Everything works fine.

As we are now doing it for lfirst(), can we also do this for linitial()?
I did not find any usage for lsecond(), lthird(), lfourth() and llast()
though.

Attached is a patch replacing linitial() with linitial_node() appropriately
in planner.c.
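For reference, the _node() variants are thin wrappers that add a castNode()
type check at the extraction site; as I recall from pg_list.h, they are
simply:

#define lfirst_node(type, lc)	castNode(type, lfirst(lc))
#define linitial_node(type, l)	castNode(type, linitial(l))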

Thanks

On Fri, Jul 14, 2017 at 2:25 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> On Thu, Jul 13, 2017 at 9:15 PM, Alvaro Herrera
> <alvhe...@2ndquadrant.com> wrote:
> > Ashutosh Bapat wrote:
> >
> >> Happened to stumble across some instances of lfirst() which could use
> >> lfirst_node() in planner.c. Here's patch which replaces calls to
> >> lfirst() extracting node pointers by lfirst_node() in planner.c.
> >
> > Sounds good.
> >
> >> Are we carrying out such replacements in master or this will be
> >> considered for v11?
> >
> > This is for pg11 definitely; please add to the open commitfest.
>
> Thanks. Added. https://commitfest.postgresql.org/14/1195/
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>
>
>



-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index a1dd157..9115655 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1556,8 +1556,8 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 			/*--
 			  translator: %s is a SQL row locking clause such as FOR UPDATE */
 	 errmsg("%s is not allowed with UNION/INTERSECT/EXCEPT",
-			LCS_asString(((RowMarkClause *)
-						  linitial(parse->rowMarks))->strength))));
+			LCS_asString(linitial_node(RowMarkClause,
+									   parse->rowMarks)->strength))));
 
 		/*
 		 * Calculate pathkeys that represent result ordering requirements
@@ -1687,7 +1687,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		qp_extra.tlist = tlist;
 		qp_extra.activeWindows = activeWindows;
 		qp_extra.groupClause = (gset_data
-? (gset_data->rollups ? ((RollupData *) linitial(gset_data->rollups))->groupClause : NIL)
+? (gset_data->rollups ? linitial_node(RollupData, gset_data->rollups)->groupClause : NIL)
 : parse->groupClause);
 
 		/*
@@ -1757,25 +1757,25 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 			split_pathtarget_at_srfs(root, final_target, sort_input_target,
 									 &final_targets,
 									 &final_targets_contain_srfs);
-			final_target = (PathTarget *) linitial(final_targets);
+			final_target = linitial_node(PathTarget, final_targets);
 			Assert(!linitial_int(final_targets_contain_srfs));
 			/* likewise for sort_input_target vs. grouping_target */
 			split_pathtarget_at_srfs(root, sort_input_target, grouping_target,
 									 &sort_input_targets,
 									 &sort_input_targets_contain_srfs);
-			sort_input_target = (PathTarget *) linitial(sort_input_targets);
+			sort_input_target = linitial_node(PathTarget, sort_input_targets);
 			Assert(!linitial_int(sort_input_targets_contain_srfs));
 			/* likewise for grouping_target vs. scanjoin_target */
 			split_pathtarget_at_srfs(root, grouping_target, scanjoin_target,
 									 &grouping_targets,
 									 &grouping_targets_contain_srfs);
-			grouping_target = (PathTarget *) linitial(grouping_targets);
+			grouping_target = linitial_node(PathTarget, grouping_targets);
 			Assert(!linitial_int(grouping_targets_contain_srfs));
 			/* scanjoin_target will not have any SRFs precomputed for it */
 			split_pathtarget_at_srfs(root, scanjoin_target, NULL,
 									 &scanjoin_targets,
 									 &scanjoin_targets_contain_srfs);
-			scanjoin_target = (PathTarget *) linitial(scanjoin_targets);
+			scanjoin_target = linitial_node(PathTarget, scanjoin_targets);
 			Assert(!linitial_int(scanjoin_targets_contain_srfs));
 		}
 		else
@@ -2194,7 +2194,7 @@ preprocess_grouping_sets(PlannerInfo *root)
 		/*
 		 * Get the initial (and therefore largest) grouping set.
 		 */
-		gs = linitial(current_sets);
+		gs = linitial_node(GroupingSetData, current_sets);
 
 		/*
 		 * Order the groupClause appropriately.  If the first grouping set is
@@ -2344,8 +2344,8 @@ preprocess_rowmarks(PlannerInfo *root)
 		 * CTIDs invalid.  This is also checked at parse time, but that's
 		 * insufficient because of rule substitution, query pullup, etc.
 		 */
-		CheckSelectLocki

Re: [HACKERS] Partition-wise aggregation/grouping

2017-08-23 Thread Jeevan Chalke
Hi,

Attached is the patch to implement partition-wise aggregation/grouping.

As explained earlier, we produce a full aggregation for each partition when
the partition keys are the leading GROUP BY clauses, and then an Append is
performed. Otherwise, we do a partial aggregation on each partition, append
the results, and then add a finalization step on top.

I have observed that the cost estimated for partition-wise aggregation and
the cost for plans without it are almost the same. However, execution time
shows significant improvement (as explained in my very first email) with
partition-wise aggregates. The planner chooses a plan according to the
costs, and thus most of the time the plan without partition-wise
aggregation is chosen. Hence, to force partition-wise plans for the
regression runs, I have added a GUC named partition_wise_agg_cost_factor to
adjust the costing.

This feature is only used when enable_partition_wise_agg GUC is set to on.

Here are the details of the patches in the patch-set:

0001 - Refactors sort and hash final grouping paths into separate functions.
Since partition-wise aggregation builds paths the same way as
create_grouping_paths(), the path creation for sort and hash aggregation is
separated into its own functions, which are later used by the main
partition-wise aggregation/grouping patch.

0002 - Passes targetlist to get_number_of_groups().
We need to estimate the number of groups for individual child relations and
thus need to pass the targetlist corresponding to the child rel.

0003 - Adds enable_partition_wise_agg and partition_wise_agg_cost_factor
GUCs.

0004 - Implements partition-wise aggregation.

0005 - Adds test-cases.

0006 - postgres_fdw changes which enable pushing aggregation for other upper
relations.


Since this patch is highly dependent on partition-wise join [1], one needs
to
apply all those patches on HEAD (my repository head was at:
66ed3829df959adb47f71d7c903ac59f0670f3e1) before applying these patches in
order.

Suggestions / feedback / inputs?

[1]
https://www.postgresql.org/message-id/CAFjFpRd9Vqh_=-Ldv-XqWY006d07TJ+VXuhXCbdj=p1juky...@mail.gmail.com


On Tue, Mar 21, 2017 at 12:47 PM, Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

> Hi all,
>
> Declarative partitioning is supported in PostgreSQL 10 and work is already
> in
> progress to support partition-wise joins. Here is a proposal for
> partition-wise
> aggregation/grouping.  Our initial performance measurement has shown 7
> times
> performance when partitions are on foreign servers and approximately 15%
> when
> partitions are local.
>
> Partition-wise aggregation/grouping computes aggregates for each partition
> separately.  If the group clause contains the partition key, all the rows
> belonging to a given group come from one partition, thus allowing
> aggregates
> to be computed completely for each partition.  Otherwise, partial
> aggregates
> computed for each partition are combined across the partitions to produce
> the
> final aggregates. This technique improves performance because:
> i. When partitions are located on foreign server, we can push down the
> aggregate to the foreign server.
> ii. If hash table for each partition fits in memory, but that for the whole
> relation does not, each partition-wise aggregate can use an in-memory hash
> table.
> iii. Aggregation at the level of partitions can exploit properties of
> partitions like indexes, their storage etc.
>
> Attached an experimental patch for the same based on the partition-wise
> join
> patches posted in [1].
>
> This patch currently implements partition-wise aggregation when group
> clause
> contains the partitioning key.  A query below, involving a partitioned
> table
> with 3 partitions containing 1M rows each, producing total 30 groups showed
> 15% improvement over non-partition-wise aggregation. Same query showed 7
> times
> improvement when the partitions were located on the foreign servers.
>
> Here is the sample plan:
>
> postgres=# set enable_partition_wise_agg to true;
> SET
> postgres=# EXPLAIN ANALYZE SELECT a, count(*) FROM plt1 GROUP BY a;
>   QUERY
> PLAN
> 
> --
>  Append  (cost=5100.00..61518.90 rows=30 width=12) (actual
> time=324.837..944.804 rows=30 loops=1)
>->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual
> time=324.837..324.838 rows=10 loops=1)
>  Relations: Aggregate on (public.fplt1_p1 plt1)
>->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual
> time=309.954..309.956 rows=10 loops=1)
>  Relations: Aggregate on (public.fplt1_p2 plt1)
>->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual
> time=310.00

[HACKERS] Server crash (FailedAssertion) due to catcache refcount mis-handling

2017-08-08 Thread Jeevan Chalke
Hi,

We have observed a random server crash (FailedAssertion) while running a
few tests at our end. The stack trace is attached.

Looking at the stack trace, and after discussing it with my team members,
we observed that SearchCatCacheList() increments the refcount and
decrements it at the end. However, if an interrupt arrives while we are
inside the PG_TRY() block (where the refcount is incremented), we never
decrement the refcount, and the assertion fails later during cleanup.
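If that is right, one possible fix direction is to release the acquired
refcounts in a PG_CATCH() block before re-throwing. A sketch of that shape,
abbreviated from the structure of SearchCatCacheList() (the cleanup body is
my assumption, not an actual patch):

PG_TRY();
{
	/* ... scan the catalog, building ctlist and taking a refcount on
	 * each member tuple ... */
}
PG_CATCH();
{
	/* Drop the refcounts we took, so that AtEOXact_CatCache()'s
	 * Assert(ct->refcount == 0) holds during abort cleanup. */
	foreach(ctlist_item, ctlist)
	{
		ct = (CatCTup *) lfirst(ctlist_item);
		Assert(ct->refcount > 0);
		ct->refcount--;
	}
	PG_RE_THROW();
}
PG_END_TRY();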

To mimic the scenario, I have added a sleep in SearchCatCacheList() as
given below:

diff --git a/src/backend/utils/cache/catcache.c
b/src/backend/utils/cache/catcache.c
index e7e8e3b..eb6d4b5 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1520,6 +1520,9 @@ SearchCatCacheList(CatCache *cache,
hashValue = CatalogCacheComputeTupleHashValue(cache, ntp);
hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);

+   elog(INFO, "Sleeping for 0.1 seconds.");
+   pg_usleep(100000L); /* 0.1 seconds */
+
bucket = &cache->cc_bucket[hashIndex];
dlist_foreach(iter, bucket)
{

I then followed these steps to get a server crash:

-- Terminal 1
DROP TYPE typ;
DROP FUNCTION func(x int);

CREATE TYPE typ AS (X VARCHAR(50), Y INT);

CREATE OR REPLACE FUNCTION func(x int) RETURNS int AS $$
DECLARE
  rec typ;
  var2 numeric;
BEGIN
  RAISE NOTICE 'Function Called.';
  REC.X := 'Hello';
  REC.Y := 0;

  IF (rec.Y + var2) = 0 THEN
RAISE NOTICE 'Check Pass';
  END IF;

  RETURN 1;
END;
$$ LANGUAGE plpgsql;

SELECT pg_backend_pid();

SELECT func(1);

-- Terminal 2, should be run in parallel when SELECT func(1) is in progress
in terminal 1.
SELECT pg_terminate_backend(<pid returned by pg_backend_pid() in terminal 1>);


I thought it worth posting here to get others' attention.

I have observed this on the master branch, but it should also be reproducible
on back-branches.

Thanks
-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Program terminated with signal 6, Aborted.
#0  0x7fdff78951d7 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x7fdff78951d7 in raise () from /lib64/libc.so.6
#1  0x7fdff78968c8 in abort () from /lib64/libc.so.6
#2  0x009d43ff in ExceptionalCondition (conditionName=0xc02255 
"!(ct->refcount == 0)", errorType=0xc020b4 "FailedAssertion", 
fileName=0xc02060 "catcache.c", lineNumber=567) at assert.c:54
#3  0x009b59e0 in AtEOXact_CatCache (isCommit=0 '\000') at 
catcache.c:567
#4  0x00538a8d in AbortTransaction () at xact.c:2613
#5  0x0053a95e in AbortOutOfAnyTransaction () at xact.c:4271
#6  0x009e8e99 in ShutdownPostgres (code=1, arg=0) at postinit.c:1146
#7  0x008456e1 in shmem_exit (code=1) at ipc.c:228
#8  0x008455d5 in proc_exit_prepare (code=1) at ipc.c:185
#9  0x00845543 in proc_exit (code=1) at ipc.c:102
#10 0x009d4bae in errfinish (dummy=0) at elog.c:543
#11 0x008761b4 in ProcessInterrupts () at postgres.c:2882
#12 0x009d4bea in errfinish (dummy=0) at elog.c:565
#13 0x009d7097 in elog_finish (elevel=17, fmt=0xc02678 "Sleeping for 
0.1 seconds.") at elog.c:1378
#14 0x009b765b in SearchCatCacheList (cache=0x27636d0, nkeys=1, 
v1=11604184, v2=0, v3=0, v4=0) at catcache.c:1523
#15 0x009cbd20 in SearchSysCacheList (cacheId=37, nkeys=1, 
key1=11604184, key2=0, key3=0, key4=0) at syscache.c:1338
#16 0x0056e22e in OpernameGetCandidates (names=0x27f3b08, oprkind=98 
'b', missing_schema_ok=0 '\000') at namespace.c:1576
#17 0x005f09a6 in oper (pstate=0x27f3ec0, opname=0x27f3b08, ltypeId=23, 
rtypeId=1700, noError=0 '\000', location=14) at parse_oper.c:414
#18 0x005f132d in make_op (pstate=0x27f3ec0, opname=0x27f3b08, 
ltree=0x27f40f0, rtree=0x27f4128, last_srf=0x0, location=14) at parse_oper.c:778
#19 0x005e5e8d in transformAExprOp (pstate=0x27f3ec0, a=0x27f3a60) at 
parse_expr.c:933
#20 0x005e454e in transformExprRecurse (pstate=0x27f3ec0, 
expr=0x27f3a60) at parse_expr.c:217
#21 0x005e5e44 in transformAExprOp (pstate=0x27f3ec0, a=0x27f3b78) at 
parse_expr.c:930
#22 0x00

Re: [HACKERS] Partition-wise aggregation/grouping

2017-04-27 Thread Jeevan Chalke
On Thu, Apr 27, 2017 at 4:53 PM, Antonin Houska <a...@cybertec.at> wrote:

> Robert Haas <robertmh...@gmail.com> wrote:
>
> > On Wed, Apr 26, 2017 at 6:28 AM, Antonin Houska <a...@cybertec.at> wrote:
> > > Attached is a diff that contains both patches merged. This is just to
> prove my
> > > assumption, details to be elaborated later. The scripts attached
> produce the
> > > following plan in my environment:
> > >
> > >QUERY PLAN
> > > 
> > >  Parallel Finalize HashAggregate
> > >Group Key: b_1.j
> > >->  Append
> > >  ->  Parallel Partial HashAggregate
> > >Group Key: b_1.j
> > >->  Hash Join
> > >  Hash Cond: (b_1.j = c_1.k)
> > >  ->  Seq Scan on b_1
> > >  ->  Hash
> > >->  Seq Scan on c_1
> > >  ->  Parallel Partial HashAggregate
> > >Group Key: b_2.j
> > >->  Hash Join
> > >  Hash Cond: (b_2.j = c_2.k)
> > >  ->  Seq Scan on b_2
> > >  ->  Hash
> > >->  Seq Scan on c_2
> >
> > Well, I'm confused.  I see that there's a relationship between what
> > Antonin is trying to do and what Jeevan is trying to do, but I can't
> > figure out whether one is a subset of the other, whether they're both
> > orthogonal, or something else.  This plan looks similar to what I
> > would expect Jeevan's patch to produce,
>
> The point is that the patch Jeevan wanted to work on is actually a subset
> of
> [1] combined with [2].
>

Seems so, as you are targeting every relation whether or not it is
partitioned, whereas I am targeting only partitioned relations in my patch.


>
> > except i have no idea what "Parallel" would mean in a plan that contains
> no
> > Gather node.
>
> parallel_aware field was set mistakenly on the AggPath. Fixed patch is
> attached below, producing this plan:
>
>QUERY PLAN
> 
>  Finalize HashAggregate
>Group Key: b_1.j
>->  Append
>  ->  Partial HashAggregate
>Group Key: b_1.j
>->  Hash Join
>  Hash Cond: (b_1.j = c_1.k)
>  ->  Seq Scan on b_1
>  ->  Hash
>->  Seq Scan on c_1
>  ->  Partial HashAggregate
>Group Key: b_2.j
>->  Hash Join
>  Hash Cond: (b_2.j = c_2.k)
>  ->  Seq Scan on b_2
>  ->  Hash
>->  Seq Scan on c_2
>

With my patch, I am getting the following plan, where we push the entire
aggregation below the Append.

QUERY PLAN
--
 Append
   ->  HashAggregate
 Group Key: b_1.j
 ->  Hash Join
   Hash Cond: (b_1.j = c_1.k)
   ->  Seq Scan on b_1
   ->  Hash
 ->  Seq Scan on c_1
   ->  HashAggregate
 Group Key: b_2.j
 ->  Hash Join
   Hash Cond: (b_2.j = c_2.k)
   ->  Seq Scan on b_2
   ->  Hash
 ->  Seq Scan on c_2
(15 rows)


Antonin, I have tried applying your patch on master but it does not apply.
Can you please tell me the HEAD commit it is based on, and any other patches
that need to be applied first?

What does the plan look like when the GROUP BY key does not match the
partitioning key, i.e. GROUP BY b.v?


>
> [1] https://www.postgresql.org/message-id/9666.1491295317%40localhost
>
> [2] https://commitfest.postgresql.org/14/994/
>
> --
> Antonin Houska
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26
> A-2700 Wiener Neustadt
> Web: http://www.postgresql-support.de, http://www.cybertec.at
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 66449694

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb



Re: [HACKERS] Partition-wise aggregation/grouping

2017-03-23 Thread Jeevan Chalke
On Tue, Mar 21, 2017 at 1:47 PM, Antonin Houska <a...@cybertec.at> wrote:

> Jeevan Chalke <jeevan.cha...@enterprisedb.com> wrote:
>
> > Declarative partitioning is supported in PostgreSQL 10 and work is
> already in
> > progress to support partition-wise joins. Here is a proposal for
> partition-wise
> > aggregation/grouping. Our initial performance measurement has shown 7
> times
> > performance when partitions are on foreign servers and approximately 15%
> when
> > partitions are local.
> >
> > Partition-wise aggregation/grouping computes aggregates for each
> partition
> > separately. If the group clause contains the partition key, all the rows
> > belonging to a given group come from one partition, thus allowing
> aggregates
> > to be computed completely for each partition. Otherwise, partial
> aggregates
> > computed for each partition are combined across the partitions to
> produce the
> > final aggregates. This technique improves performance because:
>
> > i. When partitions are located on foreign server, we can push down the
> > aggregate to the foreign server.
>
> > ii. If hash table for each partition fits in memory, but that for the
> whole
> > relation does not, each partition-wise aggregate can use an in-memory
> hash
> > table.
>
> > iii. Aggregation at the level of partitions can exploit properties of
> > partitions like indexes, their storage etc.
>
> I suspect this overlaps with
>
> https://www.postgresql.org/message-id/29111.1483984605%40localhost
>
> I'm working on the next version of the patch, which will be able to
> aggregate
> the result of both base relation scans and joins. I'm trying hard to make
> the
> next version available before an urgent vacation that I'll have to take at
> random date between today and early April. I suggest that we coordinate the
> effort, it's lot of work in any case.
>

IIUC, it seems that you are trying to push down the aggregation into the
joining relations. So basically you are converting
Agg -> Join -> {scan1, scan2} into
FinalAgg -> Join -> {PartialAgg -> scan1, PartialAgg -> scan2}.
In addition to that, your patch pushes aggregates on a base rel down to its
children, if any.

Whereas what I propose here is pushing down the aggregation below the Append
node, keeping the join/scan as is. So basically I am converting
Agg -> Append -> Join -> {scan1, scan2} into
Append -> Agg -> Join -> {scan1, scan2}.
This will require partition-wise join as posted in [1].
But I am planning to make this work for partitioned relations and not for
generic inheritance.
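Conceptually, for a table p partitioned into p1 and p2 on column a (these
names are only illustrative), grouping on the partition key means that

SELECT a, count(*) FROM p GROUP BY a;

can be computed as

SELECT a, count(*) FROM p1 GROUP BY a
UNION ALL
SELECT a, count(*) FROM p2 GROUP BY a;

since the partitions are disjoint on the grouping key, each group lives
entirely within one partition.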

I treat these two as separate strategies/paths to be considered while
planning.

Our work will overlap when we are pushing down the aggregate on a partitioned
base relation to its children/partitions.

I think you should continue working on pushing down aggregates onto
joins/scans, whereas I will continue my work on pushing down aggregates to
partitions (joins as well as single tables). Once we are done with these
tasks, we may need to find a way to integrate them.

[1]
https://www.postgresql.org/message-id/flat/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=ead...@mail.gmail.com#CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=ead...@mail.gmail.com



>
> --
> Antonin Houska
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26
> A-2700 Wiener Neustadt
> Web: http://www.postgresql-support.de, http://www.cybertec.at
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


[HACKERS] Partition-wise aggregation/grouping

2017-03-21 Thread Jeevan Chalke
Hi all,

Declarative partitioning is supported in PostgreSQL 10 and work is already in
progress to support partition-wise joins. Here is a proposal for
partition-wise aggregation/grouping.  Our initial performance measurement has
shown a 7x performance improvement when partitions are on foreign servers and
approximately a 15% improvement when partitions are local.

Partition-wise aggregation/grouping computes aggregates for each partition
separately.  If the group clause contains the partition key, all the rows
belonging to a given group come from one partition, thus allowing aggregates
to be computed completely for each partition.  Otherwise, partial aggregates
computed for each partition are combined across the partitions to produce the
final aggregates. This technique improves performance because:
i. When partitions are located on foreign servers, we can push down the
aggregate to the foreign server.
ii. If the hash table for each partition fits in memory, but that for the
whole relation does not, each partition-wise aggregate can use an in-memory
hash table.
iii. Aggregation at the level of partitions can exploit properties of
partitions such as indexes, their storage, etc.
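To make this concrete, here is a minimal sketch of such a setup (the DDL is
hypothetical; the actual plt1/fplt1_p* definitions used for the measurements
below are not reproduced here). With the patch applied:

CREATE TABLE plt1 (a int, b int) PARTITION BY LIST (a);
CREATE TABLE plt1_p1 PARTITION OF plt1
  FOR VALUES IN (0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
CREATE TABLE plt1_p2 PARTITION OF plt1
  FOR VALUES IN (10, 11, 12, 13, 14, 15, 16, 17, 18, 19);
CREATE TABLE plt1_p3 PARTITION OF plt1
  FOR VALUES IN (20, 21, 22, 23, 24, 25, 26, 27, 28, 29);
INSERT INTO plt1 SELECT i % 30, i FROM generate_series(1, 3000000) i;
ANALYZE plt1;

SET enable_partition_wise_agg TO true;
-- GROUP BY contains the partition key, so each partition can be
-- aggregated completely and the per-partition results appended.
EXPLAIN SELECT a, count(*) FROM plt1 GROUP BY a;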

Attached is an experimental patch for the same, based on the partition-wise
join patches posted in [1].
This patch currently implements partition-wise aggregation when the group
clause contains the partitioning key.  The query below, involving a
partitioned table with 3 partitions containing 1M rows each and producing 30
groups in total, showed a 15% improvement over non-partition-wise
aggregation. The same query showed a 7x improvement when the partitions were
located on the foreign servers.

Here is the sample plan:

postgres=# set enable_partition_wise_agg to true;
SET
postgres=# EXPLAIN ANALYZE SELECT a, count(*) FROM plt1 GROUP BY a;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Append  (cost=5100.00..61518.90 rows=30 width=12) (actual time=324.837..944.804 rows=30 loops=1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=324.837..324.838 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p1 plt1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=309.954..309.956 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p2 plt1)
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual time=310.002..310.004 rows=10 loops=1)
         Relations: Aggregate on (public.fplt1_p3 plt1)
 Planning time: 0.370 ms
 Execution time: 945.384 ms
(9 rows)

postgres=# set enable_partition_wise_agg to false;
SET
postgres=# EXPLAIN ANALYZE SELECT a, count(*) FROM plt1 GROUP BY a;
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=121518.01..121518.31 rows=30 width=12) (actual time=6498.452..6498.459 rows=30 loops=1)
   Group Key: plt1.a
   ->  Append  (cost=0.00..106518.00 rows=301 width=4) (actual time=0.595..5769.592 rows=300 loops=1)
         ->  Seq Scan on plt1  (cost=0.00..0.00 rows=1 width=4) (actual time=0.007..0.007 rows=0 loops=1)
         ->  Foreign Scan on fplt1_p1  (cost=100.00..35506.00 rows=100 width=4) (actual time=0.587..1844.506 rows=100 loops=1)
         ->  Foreign Scan on fplt1_p2  (cost=100.00..35506.00 rows=100 width=4) (actual time=0.384..1839.633 rows=100 loops=1)
         ->  Foreign Scan on fplt1_p3  (cost=100.00..35506.00 rows=100 width=4) (actual time=0.402..1876.505 rows=100 loops=1)
 Planning time: 0.251 ms
 Execution time: 6499.018 ms
(9 rows)

The patch still needs a lot of improvement, including:
1. Support for partial partition-wise aggregation
2. Estimating number of groups for every partition
3. Estimating cost of partition-wise aggregation based on sample partitions
similar to partition-wise join
and much more.

In order to support partial aggregation on foreign partitions, we need
support to fetch partially aggregated results from the foreign server. That
can be handled as a separate follow-on patch.

Though a lot of work remains to be done, I would like to get
suggestions/opinions from hackers.

I would like to thank Ashutosh Bapat for providing a draft patch and helping
me off-list on this feature while he is busy working on the partition-wise
join feature.

[1]
https://www.postgresql.org/message-id/CAFjFpRcbY2QN3cfeMTzVEoyF5Lfku-ijyNR%3DPbXj1e%3D9a%3DqMoQ%40mail.gmail.com

Thanks

-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


pg_partwise_agg_WIP.patch
Description: application/download



Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan [take-2]

2017-03-13 Thread Jeevan Chalke
Hi,

I have reviewed this patch further and here are my comments:

1.
Would it be better to use compare_pathkeys() instead of
equal(root->query_pathkeys, pathkeys)?

2.
I think it would be good if we added a simple test-case in postgres_fdw.sql
which shows that the LIMIT is passed to the remote server. We might convert
one of the affected EXPLAINs to EXPLAIN ANALYZE.
With this patch, we do push the LIMIT to the foreign server, but not at
planning time. Thus we still show the remote query in the EXPLAIN output
without the LIMIT clause, even though we actually push it at execution time.
Such an explain plan confuses the reader. So I think we had better convert
them to EXPLAIN ANALYZE or put some comments before the query explaining
this. I prefer to put comments, as EXPLAIN ANALYZE is costly. A sketch of
such a test is given below.
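For instance, something along these lines (ft1 here is hypothetical, and the
comment records why the Remote SQL shown carries no LIMIT):

-- The LIMIT is attached to the remote query at execution time only, so
-- the Remote SQL shown by plain EXPLAIN does not contain a LIMIT clause.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT c1 FROM ft1 ORDER BY c1 LIMIT 10;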

3.
"-- CROSS JOIN, not pushed down"
The above comment needs to be changed now.
As you explained already, due to the limit clause the CROSS JOIN is now
pushed down to the remote server, so this comment needs updating.

The rest of the changes look good to me.

Thanks

-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


[HACKERS] PassDownLimitBound for ForeignScan/CustomScan [take-2]

2016-11-24 Thread Jeevan Chalke
On Mon, Nov 21, 2016 at 1:59 PM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:

> Hello,
>
> The attached patch is a revised version of pass-down LIMIT to FDW/CSP.
>
> Below is the updates from the last version.
>
> 'ps_numTuples' of PlanState was declared as uint64, instead of long
> to avoid problems on 32bits machine when a large LIMIT clause is
> supplied.
>
> 'ps_numTuples' is re-interpreted; 0 means that its upper node wants
> to fetch all the tuples. It allows to eliminate a boring initialization
> on ExecInit handler for each executor node.
>
> Even though it was not suggested, estimate_path_cost_size() of postgres_fdw
> adjusts number of rows if foreign-path is located on top-level of
> the base-relations and LIMIT clause takes a constant value.
> It will make more adequate plan as follows:
>
> * WITHOUT this patch
> 
> postgres=# explain verbose select * from t_a, t_b where t_a.id = t_b.id
> and t_a.x < t_b.x LIMIT 100;
>QUERY PLAN
> 
> 
>  Limit  (cost=261.17..274.43 rows=100 width=88)
>Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>->  Hash Join  (cost=261.17..581.50 rows=2416 width=88)
>  Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>  Hash Cond: (t_a.id = t_b.id)
>  Join Filter: (t_a.x < t_b.x)
>  ->  Foreign Scan on public.t_a  (cost=100.00..146.12 rows=1204
> width=44)
>Output: t_a.id, t_a.x, t_a.y
>Remote SQL: SELECT id, x, y FROM public.t
>  ->  Hash  (cost=146.12..146.12 rows=1204 width=44)
>Output: t_b.id, t_b.x, t_b.y
>->  Foreign Scan on public.t_b  (cost=100.00..146.12
> rows=1204 width=44)
>  Output: t_b.id, t_b.x, t_b.y
>  Remote SQL: SELECT id, x, y FROM public.t
> (14 rows)
>
> * WITH this patch
> -
> postgres=# explain verbose select * from t_a, t_b where t_a.id = t_b.id
> and t_a.x < t_b.x LIMIT 100;
>
> QUERY PLAN
> 
> --
>  Limit  (cost=100.00..146.58 rows=100 width=88)
>Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>->  Foreign Scan  (cost=100.00..146.58 rows=100 width=88)
>  Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y
>  Relations: (public.t_a) INNER JOIN (public.t_b)
>  Remote SQL: SELECT r1.id, r1.x, r1.y, r2.id, r2.x, r2.y FROM
> (public.t r1 INNER JOIN public.t r2 ON (((r1.x < r2.x)) AND ((r1.id =
> r2.id
> (6 rows)
>
>
> That's nice.


> On the other hands, I noticed it is not safe to attach LIMIT clause at
> the planner stage because root->limit_tuples is declared as double.
> Even if LIMIT clause takes a constant value, it is potentially larger
> than 2^53 which is the limitation we can represent accurately with
> float64 data type but LIMIT clause allows up to 2^63-1.
> So, postgres_fdw now attaches LIMIT clause on the remote query on
> execution time only.
>

I think it's OK.

Here are a few comments on the latest patch:

1.
make/make check is fine; however, I am getting a regression failure in the
postgres_fdw contrib module (attached regression.diff).
Please investigate and fix.

2.
+ *
+ * MEMO: root->limit_tuples is not attached when query contains
+ * grouping-clause or aggregate functions. So, we don's adjust
+ * rows even if LIMIT  is supplied.

Can you please explain why you are not doing this for grouping clauses or
aggregate functions?

3.
Typo:

don's  => don't

The rest of the changes look good to me.

Thanks


> Thanks,
> 
> PG-Strom Project / NEC OSS Promotion Center
> KaiGai Kohei <kai...@ak.jp.nec.com>
>
>
> > -Original Message-
> > From: Robert Haas [mailto:robertmh...@gmail.com]
> > Sent: Thursday, November 10, 2016 3:08 AM
> > To: Kaigai Kouhei(海外 浩平) <kai...@ak.jp.nec.com>
> > Cc: pgsql-hackers@postgresql.org; Jeevan Chalke
> > <jeevan.cha...@enterprisedb.com>; Etsuro Fujita
> > <fujita.ets...@lab.ntt.co.jp>; Andres Freund <and...@anarazel.de>
> > Subject: ##freemail## Re: PassDownLimitBound for ForeignScan/CustomScan
> > [take-2]
> >
> > On Mon, Oct 31, 2016 at 10:20 AM, Kouhei Kaigai <kai...@ak.jp.nec.com>
> > wrote:
> > > As an example, I enhanced postgres_fdw to understand the ps_numTuples
> > > if it is set. If and when remote ORDER BY is pushed down, the latest
> > > code tries to sort the entire remote table because 

Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan [take-2]

2016-11-08 Thread Jeevan Chalke
Hi,

I have reviewed the patch.

Patch applies cleanly. make/"make install"/"make check" all are fine.

I too see a good improvement in execution time when the LIMIT is pushed with
the cursor query in postgres_fdw at execution time.

However, here are a few comments on the changes:

1. ps_numTuples is declared as long; however, the offset and count members
in the LimitState struct and the bound member in the SortState struct are
int64.  A long on a 32-bit machine may be 32 bits, and thus I think
tuples_needed, which is long, may have overflow hazards, as it may store
int64 + int64.  I think ps_numTuples should be int64.

2. Robert suggested the following in the previous discussion:
"For example, suppose we add a new PlanState member "long
numTuples" where 0 means that the number of tuples that will be needed
is unknown (so that most node types need not initialize it), a
positive value is an upper bound on the number of tuples that will be
fetched, and -1 means that it is known for certain that we will need
all of the tuples."

We should have 0 for the default case so that we don't need to initialize it
in most places.  But I see many such changes in the patch.  I think this is
not possible here, since 0 can be a legal user-provided value which cannot be
used as the default (the default is all rows).

However, do you think we can avoid that? Is there any other way so that we
don't need to set ps_numTuples explicitly in every node?

Apart from this, the patch looks good to me.


Regards,
-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Substantial bloat in postgres_fdw regression test runtime

2016-11-03 Thread Jeevan Chalke
On Wed, Nov 2, 2016 at 10:09 PM, Tom Lane  wrote:

> In 9.6, "make installcheck" in contrib/postgres_fdw takes a shade
> under 3 seconds on my machine.  In HEAD, it's taking 10 seconds.
> I am not happy, especially not since there's no parallelization
> of the contrib regression tests.  That's a direct consumption of
> my time and all other developers' time too.  This evidently came
> in with commit 7012b132d (Push down aggregates to remote servers),
> and while that's a laudable feature, I cannot help thinking that
> it does not deserve this much of an imposition on every make check
> that's ever done for the rest of eternity.
>

Thanks, Tom, for reporting this substantial bloat in the postgres_fdw
regression test runtime. On my machine "make installcheck" in
contrib/postgres_fdw takes 6.2 seconds on master (HEAD:
770671062f130a830aa89100c9aa2d26f8d4bf32).
However, if I remove all the tests added for aggregate push down, it drops
down to 2.2 seconds. Oops, 4 seconds more.

I have timed each of the tests added as part of the aggregate push down patch
and observed that one of them (the LATERAL one) takes around 3.5 seconds.
This is caused by the parameterization. I have added a filter so that we have
fewer rows for parameterization. With this, the lateral query in question now
runs in 100ms. I also updated a few more queries that were taking more than
100ms so that they run in around 30ms or so. So effectively, with these
changes, "make installcheck" now takes around 2.5 seconds.

Attached is a patch with the test-case modifications.

Thanks


>
> regards, tom lane
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


speedup_agg_push_down_tests.patch
Description: binary/octet-stream



Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

2016-10-24 Thread Jeevan Chalke
On Sat, Oct 22, 2016 at 9:09 PM, Tom Lane  wrote:

> brolga is still not terribly happy with this patch: it's choosing not to
> push down the aggregates in one of the queries.  While I failed to
> duplicate that result locally, investigation suggests that brolga's result
> is perfectly sane; in fact it's not very clear why we're not getting that
> from multiple critters, because the plan brolga is choosing is not
> inferior to the expected one.
>
> The core of the problem is this subquery:
>
> contrib_regression=# explain verbose select min(13), avg(ft1.c1),
> sum(ft2.c1) from ft1 right join ft2 on (ft1.c1 = ft2.c1) where ft1.c1 = 12;
>
>  QUERY PLAN
> 
> 
> -
>  Foreign Scan  (cost=108.61..108.64 rows=1 width=44)
>Output: (min(13)), (avg(ft1.c1)), (sum(ft2.c1))
>Relations: Aggregate on ((public.ft1) INNER JOIN (public.ft2))
>Remote SQL: SELECT min(13), avg(r1."C 1"), sum(r2."C 1") FROM ("S 1"."T
> 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = 12)) AND ((r1."C 1" =
> 12
> (4 rows)
>
> If you look at the estimate to just fetch the data, it's:
>
> contrib_regression=# explain verbose select ft1.c1, ft2.c1 from ft1 right
> join ft2 on (ft1.c1 = ft2.c1) where ft1.c1 = 12;
>   QUERY PLAN
> 
> --
>  Foreign Scan  (cost=100.55..108.62 rows=1 width=8)
>Output: ft1.c1, ft2.c1
>Relations: (public.ft1) INNER JOIN (public.ft2)
>Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN
> "S 1"."T 1" r2 ON (((r2."C 1" = 12)) AND ((r1."C 1" = 12
> (4 rows)
>
> Note we're expecting only one row out of the join.  Now the cost of doing
> three aggregates on a single row of input is not a lot.  Comparing these
> local queries:
>
> regression=# explain select min(13),avg(q1),sum(q2) from int8_tbl where
> q2=456;
>   QUERY PLAN
> ---
>  Aggregate  (cost=1.07..1.08 rows=1 width=68)
>->  Seq Scan on int8_tbl  (cost=0.00..1.06 rows=1 width=16)
>  Filter: (q2 = 456)
> (3 rows)
>
> regression=# explain select (q1),(q2) from int8_tbl where q2=456;
>QUERY PLAN
> -
>  Seq Scan on int8_tbl  (cost=0.00..1.06 rows=1 width=16)
>Filter: (q2 = 456)
> (2 rows)
>
> we seem to have startup = input cost + .01 and then another .01
> for total.  So the estimate to do the above remote scan and then
> aggregate locally should have been 108.63 startup and 108.64 total,
> give or take.  The estimate for aggregating remotely is a hair better,
> but it's not nearly better enough to guarantee that the planner won't
> see it as fuzzily the same cost.
>
> In short: the problem with this test case is that it's considering
> aggregation over only a single row, which is a situation in which
> pushing down the aggregate actually doesn't save us anything, because
> we're retrieving one row from the remote either way.  So it's not at all
> surprising that we don't get a stable plan choice.  The test query needs
> to be adjusted so that the aggregation is done over multiple rows,
> allowing fdw_tuple_cost to kick in and provide some daylight between
> the cost estimates.
>

Attached is a patch which performs aggregation over 1000 rows, as suggested
by Tom; an indicative sketch of the adjustment appears below.

I believe it will have a stable output/plan now.
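Indicatively, the adjustment is of this shape (not necessarily the exact
text of the patch; the filter is simply widened so that the join produces on
the order of 1000 rows before aggregation, letting fdw_tuple_cost separate
the push-down and local-aggregation cost estimates):

EXPLAIN VERBOSE
SELECT min(13), avg(ft1.c1), sum(ft2.c1)
FROM ft1 RIGHT JOIN ft2 ON (ft1.c1 = ft2.c1)
WHERE ft1.c1 < 1000;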

Thanks


>
> regards, tom lane
>



-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


agg_push_down_fix_testcase.patch
Description: binary/octet-stream



Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

2016-10-20 Thread Jeevan Chalke
On Thu, Oct 20, 2016 at 12:49 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> The patch compiles and make check-world doesn't show any failures.
>
> >>
> >
> >
> > I have tried it. Attached separate patch for it.
> > However I have noticed that istoplevel is always false (at-least for the
> > testcase we have, I get it false always). And I also think that taking
> > this decision only on PlanState is enough. Does that in attached patch.
> > To fix this, I have passed deparse_namespace to the private argument and
> > accessed it in get_special_variable(). Changes looks very simple. Please
> > have a look and let me know your views.
> > This issue is not introduced by the changes done for the aggregate push
> > down, it only got exposed as we now ship expressions in the target list.
> > Thus I think it will be good if we make these changes separately as new
> > patch, if required.
> >
>
>
> The patch looks good and pretty simple.
>
> +* expression.  However if underneath PlanState is ForeignScanState,
> then
> +* don't force parentheses.
> We need to explain why it's safe not to add paranthesis. The reason
> this function adds paranthesis so as to preserve any operator
> precedences imposed by the expression tree of which this IndexVar is
> part. For ForeignScanState, the Var may represent the root of the
> expression tree, and thus not need paranthesis. But we need to verify
> this and explain it in the comments.
>
> As you have explained this is an issue exposed by this patch;
> something not must have in this patch. If committers feel that
> aggregate pushdown needs this fix as well, please provide a patch
> addressing the above comment.
>
>
Sure.


>
> Ok. PFA patch with cosmetic changes (mostly rewording comments). Let
> me know if those look good. I am marking this patch is ready for
> committer.
>

Changes look good to me.
Thanks for the detailed review.

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

2016-10-12 Thread Jeevan Chalke
On Thu, Sep 8, 2016 at 10:41 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

>
>
> I think we should try to measure performance gain because of aggregate
> pushdown. The EXPLAIN
> doesn't show actual improvement in the execution times.
>

I did performance testing for aggregate push down and I see good performance
gains with the patch.

Attached is the script I have used to get the performance numbers, along with
the results I got with and without the patch (pg_agg_push_down_v3.patch).
Also attached are a few gnuplot charts from the readings I got. These were
run on my local VM, which has the following details:

Linux centos7 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC
2016 x86_64 x86_64 x86_64 GNU/Linux
RAM alloted: 8 GB
CPUs alloted: 8
postgresql.conf is default.

With aggregate push down I see around a 12x performance improvement for the
count(*) operation.
In another test, I have observed that if the number of groups returned from
the remote server is the same as the number of input rows, then aggregate
push down performs slightly worse, which I think is expected. However, in
all other cases, where the number of groups is smaller than the number of
input rows, pushing down the aggregate gives better performance than
performing the grouping on the local server. (A hypothetical sketch of the
test table is given below.)
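For reference, a sketch of the remote table shape these tests assume; the
table name and row count here are hypothetical, the actual DDL is in the
attached script, and fperf1 itself is the foreign table over such a remote
table:

CREATE TABLE fperf1_remote AS
SELECT i AS c1,          -- effectively unique
       i % 3    AS c2,   -- 3 groups
       i % 5    AS c3,   -- 5 groups
       i % 10   AS c4,   -- 10 groups
       i % 100  AS c5,   -- 100 groups
       i % 1000 AS c6    -- 1000 groups
FROM generate_series(1, 1000000) i;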

I did this performance testing on my local setup. I would expect even better
numbers on a specialized high-end performance machine.  I would be more than
happy if someone does this testing on such a high-end machine.

Let me know if you need any help.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
=== WITH PATCH ===

 query | rows | avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
-------+------+--------------+------------------+--------------+--------------
 select count(*) from fperf1 | 1 | 141.8686 | 4.4668353500495 | 138.903 | 152.241
 select c2, avg(c1) from fperf1 group by c2 having count(*) < 34 | 2 | 405.7661 | 11.9403142844368 | 400.689 | 439.675
 select c2, sum(c1) from fperf1 group by c2 | 3 | 363.2299 | 4.29278180851687 | 354.815 | 369.739
 select c3, avg(c1), sum(c2) from fperf1 group by c3 | 5 | 454.4478 | 3.98680590057494 | 447.248 | 457.955
 select c4, avg(c1), sum(c2) from fperf1 group by c4 | 10 | 501.0197 | 5.26951028823454 | 491.726 | 508.574
 select c5, avg(c1), sum(c2) from fperf1 group by c5 | 100 | 490.0783 | 5.64261263462349 | 480.78 | 496.662
 select c6, avg(c1), sum(c2) from fperf1 group by c6 | 1000 | 582.6842 | 9.9196984474776 | 564.425 | 592.69
 select c1%1, avg(c1), sum(c2) from fperf1 group by c1%1 | 1 | 901.1682 | 9.58382273302904 | 888.383 | 923.386
 select c1%10, avg(c1), sum(c2) from fperf1 group by c1%10 | 10 | 1032.1959 | 6.89087326268629 | 1018.598 | 1045.423
 select c1%100, avg(c1), sum(c2) from fperf1 group by c1%100 | 100 | 1076.3834 | 11.1022883947539 | 1061.305 | 1093.892
 select c1%1000, avg(c1), sum(c2) from fperf1 group by c1%1000 | 1000 | 1113.6001 | 11.2143472634172 | 1092.863 | 1133.007
 select c1%1, avg(c1), sum(c2) from fperf1 group by c1%1 | 1 | 1182.1374 | 32.5953859659133 | 1148.961 | 1231.296
 select c1%10, avg(c1), sum(c2) from fperf1 group by c1%10 | 10 | 1467.1811 | 14.3535175437048 | 1443.95 | 1485.645
 select c1%100, avg(c1), sum(c2) from fperf1 group by c1%100 | 100 | 5466.2306 | 633.367848489717 | 5127.339 | 7248.381
(14 rows)


=== WITHOUT PATCH ===

 query | rows | avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
-------+------+--------------+------------------+--------------+--------------
 select count(*) from fperf1 | 1 | 1674.5339 | 27.1549108570754 | 1637.345 | 1725.057
 select c2, avg(c1) from fperf1 group by c2 having count(*) < 34 | 2 | 2467.8368 | 22.606929678949 | 2437.674 | 2506.438
 select c2, sum(c1) from fperf1 group by c2 | 3 | 2387.39 | 34.3686766983568 | 2350.396 | 2444.313
 select c3, avg(c1), sum(c2) from fperf1 group by c3 | 5 | 2702.3344 | 28.0312843452488 | 2665.317 | 2746.862
 select c4, avg(c1), sum(c2) from fperf1 group by c4 | 10 | 2850.9818 | 42.5758532759606 | 2813.562 | 2946.991
 select c5, avg(c1), 

Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

2016-09-30 Thread Jeevan Chalke
On Mon, Sep 26, 2016 at 6:15 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> This patch will need some changes to conversion_error_callback(). That
> function reports an error in case there was an error converting the
> result obtained from the foreign server into an internal datum e.g.
> when the string returned by the foreign server is not acceptable by
> local input function for the expected datatype. In such cases, the
> input function will throw error and conversion_error_callback() will
> provide appropriate context for that error. postgres_fdw.sql has tests
> to test proper context
> We need to fix the error context to provide meaningful information or
> at least not crash. This has been discussed briefly in [1].
>

Oops, I had that in mind when working on this, but somehow skipped checking
the conversion error context. I have fixed that in the v3 patch: removed the
assert, and for non-Var expressions we now print a generic context.
This context is almost in line with the discussion you referred to here.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Add support for restrictive RLS policies

2016-09-29 Thread Jeevan Chalke
Hi Stephen,


> 4. It will be good if we have an example for this in section
> > "5.7. Row Security Policies"
>
> I haven't added one yet, but will plan to do so.
>
> I think you are going to add this in this patch itself, right?

I have reviewed your latest patch and it fixes almost all of my review
comments.
Also, I agree with your responses to a couple of the comments, like those on
ALTER POLICY and tab completion. No issues with that.

However, in the documentation, PERMISSIVE and RESTRICTIVE are actually
literals and not parameters as such.  Also, can we combine these two options
into one entry, like below (similar to how we document CASCADE and RESTRICT
for DROP POLICY):

   
   <varlistentry>
    <term><literal>PERMISSIVE</literal></term>
    <term><literal>RESTRICTIVE</literal></term>
    <listitem>
     <para>
      ... explain PERMISSIVE ...
     </para>
     <para>
      ... explain RESTRICTIVE ...
     </para>
    </listitem>
   </varlistentry>

Apart from this, the changes look excellent to me.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Add support for restrictive RLS policies

2016-09-27 Thread Jeevan Chalke
On Tue, Sep 27, 2016 at 12:45 PM, Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

> Hello Stephen,
>


> I am reviewing the latest patch in detail now and will post my review
> comments later.
>


Here are the review comments:

1. In the documentation, we should put both permissive as well as restrictive
in the header, like permissive|restrictive. Or maybe use a generic term, say,
policy type (like we have for command, with options like all, select, etc.),
followed by the options in the description. Or do it like we have CASCADE and
RESTRICT in the DROP case.

2. "If the policy is a "permissive" or "restrictive" policy." seems broken
as
sentence starts with "If" and there is no other part to it. Will it be
better
to say "Specifies whether the policy is a "permissive" or "restrictive"
policy."?

3. " .. a policy can instead by "restrictive""
Do you mean "instead be" here?

4. It will be good if we have an example for this in section
"5.7. Row Security Policies"

5. AS ( PERMISSIVE | RESTRICTIVE )
should use '{' and '}' instead of parentheses.

6. I think the following changes are irrelevant for this patch.
They should be submitted as a separate patch if required.

@@ -2133,6 +2139,25 @@ psql_completion(const char *text, int start, int end)
 /* Complete "CREATE POLICY  ON  USING (" */
 else if (Matches6("CREATE", "POLICY", MatchAny, "ON", MatchAny,
"USING"))
 COMPLETE_WITH_CONST("(");
+/* CREATE POLICY  ON  AS PERMISSIVE|RESTRICTIVE FOR
ALL|SELECT|INSERT|UPDATE|DELETE */
+else if (Matches8("CREATE", "POLICY", MatchAny, "ON", MatchAny, "AS",
MatchAny, "FOR"))
+COMPLETE_WITH_LIST5("ALL", "SELECT", "INSERT", "UPDATE", "DELETE");
+/* Complete "CREATE POLICY  ON  AS PERMISSIVE|RESTRICTIVE
FOR INSERT TO|WITH CHECK" */
+else if (Matches9("CREATE", "POLICY", MatchAny, "ON", MatchAny, "AS",
MatchAny, "FOR", "INSERT"))
+COMPLETE_WITH_LIST2("TO", "WITH CHECK (");
+/* Complete "CREATE POLICY  ON  AS PERMISSIVE|RESTRICTIVE
FOR SELECT|DELETE TO|USING" */
+else if (Matches9("CREATE", "POLICY", MatchAny, "ON", MatchAny, "AS",
MatchAny, "FOR", "SELECT|DELETE"))
+COMPLETE_WITH_LIST2("TO", "USING (");
+/* CREATE POLICY  ON  AS PERMISSIVE|RESTRICTIVE FOR
ALL|UPDATE TO|USING|WITH CHECK */
+else if (Matches9("CREATE", "POLICY", MatchAny, "ON", MatchAny, "AS",
MatchAny, "FOR", "ALL|UPDATE"))
+COMPLETE_WITH_LIST3("TO", "USING (", "WITH CHECK (");
+/* Complete "CREATE POLICY  ON  AS PERMISSIVE|RESTRICTIVE
TO " */
+else if (Matches8("CREATE", "POLICY", MatchAny, "ON", MatchAny, "AS",
MatchAny, "TO"))
+COMPLETE_WITH_QUERY(Query_for_list_of_grant_roles);
+/* Complete "CREATE POLICY  ON  AS PERMISSIVE|RESTRICTIVE
USING (" */
+else if (Matches8("CREATE", "POLICY", MatchAny, "ON", MatchAny, "AS",
MatchAny, "USING"))
+COMPLETE_WITH_CONST("(");

7. Natts_pg_policy should be updated to 7 now.

8. In the following error report, $2 and @2 should be used to correctly
display the option and its location.

ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
 errmsg("unrecognized row security option
\"%s\"", $1),
 errhint("Only PERMISSIVE or RESTRICTIVE
policies are supported currently."),
 parser_errposition(@1)));

Otherwise you will see the following result:

postgres=# CREATE POLICY my_policy ON foo AS a123;
ERROR:  unrecognized row security option "as"
LINE 1: CREATE POLICY my_policy ON foo AS a123;
   ^
HINT:  Only PERMISSIVE or RESTRICTIVE policies are supported currently.

I think a negative test exercising this error should be added to the
regression suite.

9. The following comments in gram.y need to be updated to reflect the new
changes.

 *QUERIES:
 *CREATE POLICY name ON table [FOR cmd] [TO role, ...]
 *[USING (qual)] [WITH CHECK (with_check)]

10. ALTER POLICY has no changes for this. Any reason why that is not
considered here?

11. Would it be better to use a boolean for polpermissive in _policyInfo,
and to set that appropriately while getting the policies? That way we later
only need to test the boolean, avoiding string comparisons.

12

Re: [HACKERS] Add support for restrictive RLS policies

2016-09-27 Thread Jeevan Chalke
Hello Stephen,

On Tue, Sep 27, 2016 at 12:57 AM, Stephen Frost <sfr...@snowman.net> wrote:

> Jeevan,
>
> * Jeevan Chalke (jeevan.cha...@enterprisedb.com) wrote:
> > I have started reviewing this patch and here are couple of points I have
> > observed so far:
> >
> > 1. Patch applies cleanly
> > 2. make / make install / initdb all good.
> > 3. make check (regression) FAILED. (Attached diff file for reference).
>
> I've re-based my patch on top of current head and still don't see the
> failures which you are getting during the regression tests.  Is it
> possible you were doing the tests without a full rebuild of the source
> tree..?
>
> Can you provide details of your build/test environment and the full
> regression before and after output?
>

I still get the same failures with the latest sources and the new patch.
Here are a few details of my setup; let me know if I missed anything.

$ uname -a
Linux centos7 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC
2016 x86_64 x86_64 x86_64 GNU/Linux

HEAD at
commit 51c3e9fade76c12e4aa37bffdf800bbf74fb3fb1

configure switches:
./configure --with-openssl --with-tcl --with-perl --with-python
--with-ossp-uuid --with-ldap --with-pam --with-zlib --with-pgport=5432
--enable-depend --enable-debug --enable-cassert --prefix=`pwd`/install
CFLAGS="-g -O0"

Regression FAILED. Regression diff is same as previous one.

Without patch I don't get any regression failure.

Well, I could not stop myself from debugging this mystery, and I was finally
able to find the reason why this is failing. It was strange that it did not
crash and simply gave different results.

With this patch, the pg_policy catalog now has seven columns; however,
Natts_pg_policy is still set to 6. It should be updated to 7.
After doing that, the regression tests pass for me.

I am reviewing the latest patch in detail now and will post my review
comments later.

Thanks


>
> Thanks!
>
> Stephen
>



-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Add support for restrictive RLS policies

2016-09-26 Thread Jeevan Chalke
Hi,

I have started reviewing this patch, and here are a couple of points I have
observed so far:

1. Patch applies cleanly.
2. make / make install / initdb all good.
3. make check (regression) FAILED. (Attached diff file for reference.)

Please have a look at the failures.

Meanwhile I will go ahead and review the code changes.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


regression.diffs
Description: Binary data



Re: [HACKERS] Add support for restrictive RLS policies

2016-09-26 Thread Jeevan Chalke
On Mon, Sep 12, 2016 at 7:27 AM, Stephen Frost  wrote:

> Robert,
>
> * Robert Haas (robertmh...@gmail.com) wrote:
> > On Thu, Sep 8, 2016 at 5:21 PM, Tom Lane  wrote:
> > > Stephen Frost  writes:
> > >> * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> > >>> Can't you keep those words as Sconst or something (DefElems?) until
> the
> > >>> execution phase, so that they don't need to be keywords at all?
> > >
> > >> Seems like we could do that, though I'm not convinced that it really
> > >> gains us all that much.  These are only unreserved keywords, of
> course,
> > >> so they don't impact users the way reserved keywords (of any kind)
> can.
> > >> While there may be some places where we use a string to represent a
> set
> > >> of defined options, I don't believe that's typical
> > >
> > > -1 for having to write them as string literals; but I think what Alvaro
> > > really means is to arrange for the words to just be identifiers in the
> > > grammar, which you strcmp against at execution.  See for example
> > > reloption_list.  (Whether you use DefElem as the internal
> representation
> > > is a minor detail, though it might help for making the parsetree
> > > copyObject-friendly.)
> > >
> > > vacuum_option_elem shows another way to avoid making a word into a
> > > keyword, although to me that one is more of an antipattern; it'd be
> better
> > > to leave the strcmp to execution, since there's so much other code that
> > > does things that way.
> >
> > There are other cases like that, too, e.g. AlterOptRoleElem; I don't
> > think it's a bad pattern.  Regardless of the specifics, I do think
> > that it would be better not to bloat the keyword table with things
> > that don't really need to be keywords.
>
> The AlterOptRoleElem case is, I believe, what Tom was just suggesting as
> an antipattern, since the strcmp() is being done at parse time instead
> of at execution time.
>
> If we are concerned about having too many unreserved keywords, then I
> agree that AlterOptRoleElem is a good candidate to look at for reducing
> the number we have, as it appears to contain 3 keywords which are not
> used anywhere else (and 1 other which is only used in one other place).
>
> I do think that using IDENT for the various role attributes does make
> sense, to be clear, as there are quite a few of them, they change
> depending on major version, and those keywords are very unlikely to ever
> have utilization elsewhere.
>
> For this case, there's just 2 keywords which seem like they may be used
> again (perhaps for ALTER or DROP POLICY, or default policies if we grow
> those in the future), and we're unlikly to grow more in the future for
> that particular case (as we only have two binary boolean operators and
> that seems unlikely to change), though, should that happens, we could
> certainly revisit this.
>

To me, adding two new keywords for two new options does not look good, as it
will bloat the keywords list. Per my understanding, we should add a keyword
if and only if we have no option other than adding it, e.g. to avoid grammar
conflicts.

I am also inclined to think that using an identifier will be a good choice
here.
However, I would prefer to do the string comparison right in the grammar
itself, so that if a wrong option is given as input there, we do not proceed
further with it. We are anyway going to throw an error later, so why not at
an early stage?

Thanks


>
> Thanks!
>
> Stephen
>



-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

2016-09-16 Thread Jeevan Chalke
On Mon, Sep 12, 2016 at 5:19 PM, Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

>
>
> On Mon, Sep 12, 2016 at 12:20 PM, Prabhat Sahu <
> prabhat.s...@enterprisedb.com> wrote:
>
>> Hi,
>>
>> While testing "Aggregate pushdown", i found the below error:
>> -- GROUP BY alias showing different behavior after adding patch.
>>
>> -- Create table "t1", insert few records.
>> create table t1(c1 int);
>> insert into t1 values(10), (20);
>>
>> -- Create foreign table:
>> create foreign table f_t1 (c1 int) server db1_server options (table_name
>> 't1');
>>
>> -- with local table:
>> postgres=# select 2 a, avg(c1) from t1 group by a;
>>  a | avg
>> ---+-
>>  2 | 15.
>> (1 row)
>>
>> -- with foreign table:
>> postgres=# select 2 a, avg(c1) from f_t1 group by a;
>> ERROR:  aggregate functions are not allowed in GROUP BY
>> CONTEXT:  Remote SQL command: EXPLAIN SELECT 2, avg(c1) FROM public.t1
>> GROUP BY 2
>>
>>
>>
> Thanks for reporting this bug in *v1.patch Prabhat.
>
> I will have a look over this issue and will post a fix in next version.
>

To fix this issue, we need to make the deparseConst() function aware of the
showtype flag, exactly like get_const_expr().  While deparsing a Const in the
GROUP BY clause, we need to emit "::typename" so that the remote server does
not treat the constant value as a column position in the target list, but
rather as a constant value.
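To illustrate (the remote SQL below is indicative): without the fix the
query was deparsed as

SELECT 2, avg(c1) FROM public.t1 GROUP BY 2

where the remote server reads the "2" in GROUP BY as a target-list position
(the avg() column), hence the error above. With the showtype flag honoured,
the constant is deparsed with an explicit cast, roughly

SELECT 2, avg(c1) FROM public.t1 GROUP BY 2::integer

which the remote server treats as a plain constant value.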

Fixed this in the earlier attached patch and added a test-case too.

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

2016-09-12 Thread Jeevan Chalke
On Mon, Sep 12, 2016 at 12:20 PM, Prabhat Sahu <
prabhat.s...@enterprisedb.com> wrote:

> Hi,
>
> While testing "Aggregate pushdown", i found the below error:
> -- GROUP BY alias showing different behavior after adding patch.
>
> -- Create table "t1", insert few records.
> create table t1(c1 int);
> insert into t1 values(10), (20);
>
> -- Create foreign table:
> create foreign table f_t1 (c1 int) server db1_server options (table_name
> 't1');
>
> -- with local table:
> postgres=# select 2 a, avg(c1) from t1 group by a;
>  a | avg
> ---+-
>  2 | 15.
> (1 row)
>
> -- with foreign table:
> postgres=# select 2 a, avg(c1) from f_t1 group by a;
> ERROR:  aggregate functions are not allowed in GROUP BY
> CONTEXT:  Remote SQL command: EXPLAIN SELECT 2, avg(c1) FROM public.t1
> GROUP BY 2
>
>
>
Thanks for reporting this bug in the v1 patch, Prabhat.

I will have a look at this issue and will post a fix in the next version.

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

2016-09-12 Thread Jeevan Chalke
On Thu, Sep 8, 2016 at 10:41 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

>
>
>
>
>> While checking for shippability, we build the target list which is passed
>> to
>> the foreign server as fdw_scan_tlist. The target list contains
>> a. All the GROUP BY expressions
>> b. Shippable entries from the target list of upper relation
>> c. Var and Aggref nodes from non-shippable entries from the target list of
>> upper relation
>>
>
> The code in the patch doesn't seem to add Var nodes explicitly. It assumes
> that
> the Var nodes will be part of GROUP BY clause. The code is correct, I
> think.
>

Yes, the code is correct. Var nodes are already part of the GROUP BY clause;
otherwise we would hit an error well before this point.

Thanks, Ashutosh, for the detailed review comments.

I am working on them and will post an updated patch once I address all your
concerns.

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan

2016-09-06 Thread Jeevan Chalke
Hi,

Changes look good to me.

However, there are a couple of minor issues that need to be fixed.

1.
"under" is repeated on the second line; please remove one.
+if and when CustomScanState is located under
+under LimitState; which implies the underlying node is
not

2.
Typo: dicsussion => discussion
Please fix.

Apart from this I see no issues.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan

2016-09-02 Thread Jeevan Chalke
Hi,

On Mon, Aug 29, 2016 at 7:25 AM, Kouhei Kaigai  wrote:

> Hello,
>
> The attached patch adds an optional callback to support special
> optimization
> if ForeignScan/CustomScan are located under the Limit node in plan-tree.
>
> Our sort node wisely switches the behavior when we can preliminary know
> exact number of rows to be produced, because all the Sort node has to
> return is the top-k rows when it is located under the Limit node.
> It is much lightweight workloads than sorting of entire input rows when
> nrows is not small.
>
> In my case, this information is very useful because GPU can complete its
> sorting operations mostly on L1-grade memory if we can preliminary know
> the top-k value is enough small and fits to size of the fast memory.
>
> Probably, it is also valuable for Fujita-san's case because this
> information
> allows to attach "LIMIT k" clause on the remote query of postgres_fdw.
> It will reduce amount of the network traffic and remote CPU consumption
> once we got support of sort pushdown.
>

> One thing we need to pay attention is cost estimation on the planner stage.
> In the existing code, only create_ordered_paths() and
> create_merge_append_path()
> considers the limit clause for cost estimation of sorting. They use the
> 'limit_tuples' of PlannerInfo; we can reference the structure when
> extension
> adds ForeignPath/CustomPath, so I think we don't need a special enhancement
> on the planner stage.
>
>
I believe this hook gets called at execution time.
So to push the LIMIT clause down like you said above, we should use
"limit_tuples" at planning time and then use this hook to optimize at
runtime, right?

Apart from that, the attached patch applies cleanly on the latest sources,
and I found no issues with make or with the regression tests.

However, this patch is just infrastructure for any possible optimization when
a foreign/custom scan is under a LIMIT.

So it looks good to me.

I quickly tried adding hook support in postgres_fdw, and it gets called
correctly when we have a foreign scan under a LIMIT (the limit being
evaluated on the local server).

So code-wise there is no issue. Please also add the hook's details to the
documentation.

Thanks


> Thanks,
> --
> NEC Business Creation Division / PG-Strom Project
> KaiGai Kohei 
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Small patch for snapmgr.c

2016-08-31 Thread Jeevan Chalke
Hi Aleksander,

This has already been fixed with commit
4f9f495889d3d410195c9891b58228727b340189

Thanks

On Fri, Apr 8, 2016 at 6:02 PM, Aleksander Alekseev <
a.aleks...@postgrespro.ru> wrote:

> Hello
>
> Currently there is a following piece of code in snapmgr.c:
>
> ```
> /* Copy all required fields */
> snapshot = (Snapshot) MemoryContextAlloc(TopTransactionContext, size);
> snapshot->satisfies = HeapTupleSatisfiesMVCC;
> snapshot->xmin = serialized_snapshot->xmin;
> snapshot->xmax = serialized_snapshot->xmax;
> snapshot->xip = NULL;
> snapshot->xcnt = serialized_snapshot->xcnt;
> snapshot->subxip = NULL;
> /* ... */
>
> /* Copy XIDs, if present. */
> if (serialized_snapshot->xcnt > 0)
> {
> snapshot->xip = (TransactionId *) (snapshot + 1);
> memcpy(snapshot->xip, serialized_xids,
>serialized_snapshot->xcnt * sizeof(TransactionId));
> }
>
> /* Copy SubXIDs, if present. */
> if (serialized_snapshot->subxcnt > 0)
> {
> snapshot->subxip = snapshot->xip + serialized_snapshot->xcnt;
> memcpy(snapshot->subxip, ...
> ```
>
> It assumes that subxcnt > 0 iff xcnt > 0. As I understand this is true.
> At least I hope so, otherwise this code can call memcpy passing NULL as
> a first argument. But Clang Static Analyzer is not aware of all this:
>
> http://afiskon.ru/s/db/5c956077e9_snapmgr.png
>
> I propose a patch that makes static analyzers happy and makes the
> intention of this code a little more obvious.
>
> --
> Best regards,
> Aleksander Alekseev
> http://eax.me/
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>
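
For anyone hitting the same analyzer warning elsewhere: below is a minimal
standalone sketch of one way to restructure such code so the analyzer can
prove the memcpy destination is never NULL. It only mirrors the shape of
the snapmgr.c code quoted above; it is an illustration, not the actual
committed fix.

#include <stdio.h>
#include <string.h>

typedef unsigned int TransactionId;

typedef struct
{
	TransactionId *xip;
	unsigned int xcnt;
} ToySnapshot;

/*
 * Assign the array pointer unconditionally and keep the memcpy guarded.
 * Behavior is unchanged when xcnt == 0, but the analyzer can now see
 * that memcpy never receives a NULL destination.
 */
static void
copy_xids(ToySnapshot *snap, TransactionId *payload,
		  const TransactionId *serialized_xids, unsigned int xcnt)
{
	snap->xip = payload;		/* never NULL now */
	snap->xcnt = xcnt;
	if (xcnt > 0)
		memcpy(snap->xip, serialized_xids, xcnt * sizeof(TransactionId));
}

int
main(void)
{
	TransactionId payload[4];
	TransactionId xids[] = {100, 101, 102};
	ToySnapshot snap;

	copy_xids(&snap, payload, xids, 3);
	printf("%u %u %u\n", snap.xip[0], snap.xip[1], snap.xip[2]);
	return 0;
}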


-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 30589500

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb



Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

2016-08-31 Thread Jeevan Chalke
On Tue, Aug 30, 2016 at 6:51 PM, Pavel Stehule <pavel.steh...@gmail.com>
wrote:

> Hi
>
> 2016-08-30 15:02 GMT+02:00 Jeevan Chalke <jeevan.cha...@enterprisedb.com>:
>
>> Hi all,
>>
>> Attached is the patch which adds support to push down aggregation and
>> grouping
>> to the foreign server for postgres_fdw. Performing aggregation on foreign
>> server results in fetching fewer rows from the foreign side as compared to
>> fetching all the rows and aggregating/grouping locally. Performing
>> grouping on
>> foreign server may use indexes if available. So pushing down aggregates/
>> grouping on foreign server performs better than doing that locally.
>> (Attached
>> EXPLAIN output for few simple grouping queries, with and without push
>> down).
>>
>
> does it work without FDW too? It could be pretty interesting too.
>

No. Aggregate push down is supported through the GetForeignUpperPaths() hook
added for postgres_fdw. Thus it works only with postgres_fdw.

Do you mean whether this works with any extension implementing the
create_upper_paths_hook() function?
The answer is no: this patch does not touch that hook.
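
For readers unfamiliar with the hook, its rough shape is sketched below.
Treat the signature as an assumption -- it is written from memory of the
9.6-era fdwapi.h, the body is a stub rather than the real postgres_fdw
implementation, and it compiles only inside the server tree:

/* Hedged sketch only; check fdwapi.h for the authoritative signature. */
static void
postgresGetForeignUpperPaths(PlannerInfo *root, UpperRelationKind stage,
							 RelOptInfo *input_rel, RelOptInfo *output_rel)
{
	/* We only consider pushing down grouping/aggregation here. */
	if (stage != UPPERREL_GROUP_AGG)
		return;

	/*
	 * The real implementation verifies that every aggregate and grouping
	 * expression is shippable, then adds a ForeignPath to output_rel with
	 * the estimated cost of performing the aggregation remotely.
	 */
}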


>
> Regards
>
> Pavel
>
>
>>
>>
>

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


[HACKERS] Strange result with LATERAL query

2016-08-23 Thread Jeevan Chalke
Hi,

While playing with LATERAL along with some aggregates in a sub-query, I
have observed somewhat unusual behavior.

Consider following steps:

create table tab1(c1 int, c2 int);
insert into tab1 select id, 1 from generate_series(1, 3) id;
create function sum_tab1(extra int) returns setof bigint as $$
  select sum(c1 + extra) sum from tab1 group by c1
$$ language sql;

-- This gives wrong output
select c1, c2, sum from tab1 t1, lateral
  (select sum(t2.c1 + t1.c1) sum from tab1 t2 group by t2.c1) qry
  order by 1, 2, 3;
-- This gives correct output
select c1, c2, sum from tab1 t1, lateral
  (select sum_tab1 sum from sum_tab1(c1)) qry
  order by 1, 2, 3;

I would expect the same result from both of these queries, but
unfortunately they differ.
Per my understanding, the second query, involving the function, gives the
correct result, whereas the first query's output seems wrong.

Is this expected behavior, or are we giving a wrong result for the first
query?

Thanks
-- 
Jeevan B Chalke


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-09 Thread Jeevan Chalke
Hi,

I have reviewed the patch and it looks good to me.
make/make install/make check is fine (when done without -Wall -Werror).

Here are a few comments:
1.
With -Wall -Werror, I see a couple of warnings:

postgres_fdw.c: In function ‘estimate_path_cost_size’:
postgres_fdw.c:2248:13: error: ‘run_cost’ may be used uninitialized in this
function [-Werror=uninitialized]
postgres_fdw.c: In function ‘conversion_error_callback’:
postgres_fdw.c:3832:6: error: ‘attname’ may be used uninitialized in this
function [-Werror=uninitialized]
cc1: all warnings being treated as errors
make: *** [postgres_fdw.o] Error 1

2. Typo:
scna_clauses => scan_clauses

3. Does this new addition require documentation?


I did not see any issues in my testing. The code changes are good too.
The patch has very good test cases covering everything required. Nice work.

Thanks.

On Mon, Feb 8, 2016 at 7:11 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

>
>
> > On Mon, Feb 8, 2016 at 4:15 PM, Etsuro Fujita wrote:
>
>> On 2016/02/05 17:50, Ashutosh Bapat wrote:
>>
>> Btw, IIUC, I think the patch fails to adjust the targetlist of the
>>> top plan created that way, to output the fdw_scan_tlist, as
>>> discussed in [1] (ie, I think the attached patch is needed, which is
>>> created on top of your patch pg_fdw_join_v8.patch).
>>>
>>
>> fdw_scan_tlist represents the output fetched from the foreign server and
>>> is not necessarily the output of ForeignScan. ForeignScan node's output
>>> is represented by tlist argument to.
>>>
>>> 1119 return make_foreignscan(tlist,
>>> 1120 local_exprs,
>>> 1121 scan_relid,
>>> 1122 params_list,
>>> 1123 fdw_private,
>>> 1124 fdw_scan_tlist,
>>> 1125 remote_exprs,
>>> 1126 outer_plan);
>>>
>>> This tlist is built using build_path_tlist() for all join plans. IIUC,
>>> all of them output the same targetlist. We don't need to make sure that
>>> targetlist match as long as we are using the targetlist passed in by
>>> create_scan_plan(). Do you have a counter example?
>>>
>>
>> Maybe my explanation was not correct, but I'm saying that the targetlist
>> of the above outer_plan should be set to the fdw_scan_tlist, to avoid
>> misbehavior.  Here is such an example (add() in the example is a user
>> defined function that simply adds two arguments, defined by: create
>> function add(integer, integer) returns integer as '/path/to/func', 'add'
>> language c strict):
>>
>> postgres=# create foreign table foo (a int) server myserver options
>> (table_name 'foo');
>> postgres=# create foreign table bar (a int) server myserver options
>> (table_name 'bar');
>> postgres=# create table tab (a int, b int);
>> postgres=# insert into foo select a from generate_series(1, 1000) a;
>> postgres=# insert into bar select a from generate_series(1, 1000) a;
>> postgres=# insert into tab values (1, 1);
>> postgres=# analyze foo;
>> postgres=# analyze bar;
>> postgres=# analyze tab;
>>
>> [Terminal 1]
>> postgres=# begin;
>> BEGIN
>> postgres=# update tab set b = b + 1 where a = 1;
>> UPDATE 1
>>
>> [Terminal 2]
>> postgres=# explain verbose select tab.* from tab, foo, bar where foo.a =
>> bar.a and add(foo.a, bar.a) > 0 limit 1 for update;
>>
>> QUERY PLAN
>>
>>
>> 
>> -
>>  Limit  (cost=100.00..107.70 rows=1 width=70)
>>Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
>>->  LockRows  (cost=100.00..2663.48 rows=333 width=70)
>>  Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
>>  ->  Nested Loop  (cost=100.00..2660.15 rows=333 width=70)
>>Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
>>->  Foreign Scan  (cost=100.00..2654.97 rows=333 width=56)
>>  Output: foo.*, bar.*
>>  Filter: (add(foo.a, bar.a) > 0)
>>  Relations: (public.foo) INNER JOIN (public.bar)
>>  Remote SQL: SELECT ROW(r2.a), ROW(r3.a), r2.a, r3.a
>> FROM (public.foo r2 INNER JOIN public.bar r3 ON (TRUE)) WHERE ((r2.a =
>> r3.a)) F
>> OR UPDATE OF r2 FOR UPDATE OF r3
>>  ->  Hash Join  (cost=247.50..301.25 rows=333
>> width=56)
>>Output: foo.*, bar.*
>>Hash Cond: (foo.a = bar.a)
>>Join Filter: (add(foo.a, bar.a) > 0)
>>->  Foreign Scan on public.foo
>> (cost=100.00..135.00 rows=1000 width=32)
>>  Output: foo.*, foo.a
>>  Remote SQL: SELECT a FROM public.foo FOR
>> UPDATE
>>->  Hash  

Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-16 Thread Jeevan Chalke
On Thu, Oct 15, 2015 at 10:44 PM, Robert Haas  wrote:

> On Thu, Oct 15, 2015 at 3:04 AM, Kyotaro HORIGUCHI wrote:
> > I confirmed that an epqtuple of foreign parameterized scan is
> > correctly rejected by fdw_recheck_quals with modified outer
> > tuple.
> >
> > I have no objection to this and have two humble comments.
> >
> > In file_fdw.c, the comment for the last parameter just after the
> > added line seems to be better to be aligned with other comments.
>
> I've pgindented the file.  Any other space we might choose would just
> be changed by the next pgindent run, so there's no point in trying to
> vary.
>
> > In subselect.c, the added break is in the added curly-braces but
> > it would be better to place it after the closing brace, like the
> > other cases.
>
> Changed that, and committed.
>

With the latest sources containing this commit, when I follow the same
steps, I get an
 ERROR:  unrecognized node type: 525
error.

It looks like we have missed handling T_RestrictInfo.
I am getting this error from expression_tree_mutator().


> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-16 Thread Jeevan Chalke
On Fri, Oct 16, 2015 at 3:10 PM, Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

>
>
> On Thu, Oct 15, 2015 at 10:44 PM, Robert Haas <robertmh...@gmail.com>
> wrote:
>
>> On Thu, Oct 15, 2015 at 3:04 AM, Kyotaro HORIGUCHI
>> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>> > I confirmed that an epqtuple of foreign parameterized scan is
>> > correctly rejected by fdw_recheck_quals with modified outer
>> > tuple.
>> >
>> > I have no objection to this and have two humble comments.
>> >
>> > In file_fdw.c, the comment for the last parameter just after the
>> > added line seems to be better to be aligned with other comments.
>>
>> I've pgindented the file.  Any other space we might choose would just
>> be changed by the next pgindent run, so there's no point in trying to
>> vary.
>>
>> > In subselect.c, the added break is in the added curly-braces but
>> > it would be better to place it after the closing brace, like the
>> > other cases.
>>
>> Changed that, and committed.
>>
>
> With the latest sources containing this commit, when I follow the same
> steps, I get an
>  ERROR:  unrecognized node type: 525
> error.
>
> It looks like we have missed handling T_RestrictInfo.
> I am getting this error from expression_tree_mutator().
>

Ignore this.
It was caused by a compilation issue on my system.

It is working as expected in the latest sources.

Sorry for the noise and any inconvenience caused.


-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Getting sorted data from foreign server

2015-10-13 Thread Jeevan Chalke
> On Thu, Oct 8, 2015 at 9:39 PM, Robert Haas  wrote:
>
>>
>> In the interest of full disclosure, I asked Ashutosh to work on this
>> patch and have discussed the design with him several times.  I believe
>> that this is a good direction for PostgreSQL to be going.  It's
>> trivially easy right now to write a query against an FDW that performs
>> needlessly easy, because a join, or a sort, or an aggregate is
>> performed on the local server rather than the remote one.   I, and
>> EnterpriseDB, want that to get fixed.  However, we also want it to get
>> fixed in the best possible way, and not to do anything unless there is
>> consensus on it.  So, if anyone has opinions on this topic, please
>> jump in.
>>
>

Are we planning to push sorting down to the foreign server with a
parameterized foreign path?

We might get a parameterized path when the local table is small and the
foreign table is bigger; for example, for this query
SELECT big_ft.x FROM big_ft, small_lt WHERE big_ft.x = small_lt.y;
we might end up with a parameterized foreign path whose remote query
looks like:
SELECT big_ft.x FROM big_ft WHERE big_ft.x = $1;

And with this, if we have an ORDER BY clause like "ORDER BY big_ft.x",
we will get a remote query like:
SELECT big_ft.x FROM big_ft WHERE big_ft.x = $1 ORDER BY big_ft.x;

Is this possible?

If yes, then don't we need to sort again on the local server?

Assume each row on the local server matches three rows from the foreign
table; then for each $1 we will have 3 rows returned from the foreign
server, sorted of course. But all of these rows, fetched in batches of 3,
need to be re-sorted on the local server, don't they?
If so, this will be much more costly than fetching unsorted rows and
sorting them locally only once.

Or am I missing something here?
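
To make that concern concrete, here is a trivial standalone C check (my
own illustration): three per-$1 batches, each individually sorted, do not
form a globally sorted stream once concatenated, so a local sort (or at
least a merge) is still required.

#include <stdio.h>

int
main(void)
{
	/* one sorted 3-row batch per outer row ($1 = 1, 2, 3) */
	int			batches[3][3] = {{1, 4, 7}, {2, 5, 8}, {3, 6, 9}};
	int			i, j, prev = -1, sorted = 1;

	for (i = 0; i < 3; i++)
		for (j = 0; j < 3; j++)
		{
			if (batches[i][j] < prev)
				sorted = 0;
			prev = batches[i][j];
		}
	printf(sorted ? "globally sorted\n" : "needs local re-sort\n");
	return 0;
}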

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-09 Thread Jeevan Chalke
On Fri, Oct 9, 2015 at 3:35 PM, Etsuro Fujita 
wrote:

Hi,

Just to get hands-on, I have started looking into this issue and trying to
grasp it, as this is totally new code for me. Later I want to review
these code changes.

I have noticed that this thread started by saying we get a crash with the
given steps with foreign_join_v16.patch; am I correct?

Then there are various patches trying to fix this:
fdw-eval-plan-qual-*.patch

I tried applying foreign_join_v16.patch, which applied fine, and tried
reproducing the crash. But instead of a crash I am getting the following
error:

ERROR:  could not serialize access due to concurrent update
CONTEXT:  Remote SQL command: SELECT a FROM public.foo FOR UPDATE
Remote SQL command: SELECT a FROM public.tab FOR UPDATE


Then I applied fdw-eval-plan-qual-3.0.patch on top of it. It did not apply
cleanly (maybe due to other changes in the meantime), so
I fixed the conflicts and the warnings to make it compile.

When I run the same SQL sequence, I get a crash in terminal 2 at the
EXPLAIN itself.

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

I am using the following SQL statements:

create table tab (a int, b int);
create foreign table foo (a int) server myserver options(table_name 'tab');
create foreign table bar (a int) server myserver options(table_name 'tab');

insert into tab values (1, 1);
insert into foo values (1);
insert into bar values (1);

analyze tab;
analyze foo;
analyze bar;


Run the example:

[Terminal 1]
begin;
update tab set b = b + 1 where a = 1;

[Terminal 2]
explain verbose select tab.* from tab, foo, bar where tab.a =
foo.a and foo.a = bar.a for update;


Am I missing something here?
Do I need to apply any other patch from other mail threads?

Do you have a sample test case explaining the issue and the fix?

With these simple questions I might be taking the thread slightly away
from the design considerations; please excuse me for that.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] TEXT vs VARCHAR join qual push down difference, bug or expected?

2015-09-24 Thread Jeevan Chalke
On Wed, Sep 23, 2015 at 7:29 PM, Tom Lane  wrote:

> Removing that entirely would be quite incorrect, because then you'd be
> lying to the parent node about what collation your node outputs.
>

Yes. I too thought so and thus wanted to fix that code block by
considering the default collation.


>
> After thinking a bit more about the existing special case for non-foreign
> Vars, I wonder if what we should do is change these code blocks to look
> like
>
> collation = r->resultcollid;
> if (collation == InvalidOid)
> state = FDW_COLLATE_NONE;
> else if (inner_cxt.state == FDW_COLLATE_SAFE &&
>  collation == inner_cxt.collation)
> state = FDW_COLLATE_SAFE;
> +   else if (collation == DEFAULT_COLLATION_OID)
> +   state = FDW_COLLATE_NONE;
> else
> state = FDW_COLLATE_UNSAFE;
>
> That is, only explicit introduction of a non-default collation causes
> a subexpression to get labeled FDW_COLLATE_UNSAFE.  Default collations
> would lose out when getting merged with a nondefault collation from a
> foreign Var, so they should work all right.
>

Agreed.
I had suggested a similar change in approach (2),
but you have put that check at exactly the required place.


> regards, tom lane
>



-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] TEXT vs VARCHAR join qual push down difference, bug or expected?

2015-09-24 Thread Jeevan Chalke
On Wed, Sep 23, 2015 at 10:15 PM, Tom Lane  wrote:

> I wrote:
> > Hm ... actually, we probably need *both* types of changes if that's
> > what we believe the state values mean.
>
>
I too was confused by the state explanations in the code comments we have
now. Your explanation here clears that up.
Thanks for explaining those.

After a bit more thinking and experimentation, I propose the attached
> patch.
>

I had a look over the patch and reviewed it. It is in excellent shape to
check in.


> regards, tom lane
>
>

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] TEXT vs VARCHAR join qual push down difference, bug or expected?

2015-09-24 Thread Jeevan Chalke
On Thu, Sep 24, 2015 at 10:22 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Jeevan Chalke <jeevan.cha...@enterprisedb.com> writes:
> > On Wed, Sep 23, 2015 at 10:15 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> >> After a bit more thinking and experimentation, I propose the attached
> >> patch.
>
> > I had a look over the patch and reviewed it. It is in excellent state to
> > check-in.
>
> After further thought I decided that the base case for
> Const/Param/non-foreign-Vars wasn't quite right either.  If we don't like
> the collation we should just set the state to UNSAFE not fail immediately,
> because it might appear in a context where collation doesn't matter.
> An example is "var IS NOT NULL".
>

Makes sense.


>
> So I've committed the attached modification of that patch.
>
> Thanks


> regards, tom lane
>
>


-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] TEXT vs VARCHAR join qual push down difference, bug or expected?

2015-09-23 Thread Jeevan Chalke
Hi Tom,

On Tue, Sep 22, 2015 at 12:31 AM, Tom Lane  wrote:

>
> I think you're blaming the wrong code; RelabelType is handled basically
> the same as most other cases.
>
> It strikes me that this function is really going about things the wrong
> way.  Rather than trying to determine the output collation per se, what
> we ought to be asking is "does every operator in the proposed expression
> have an input collation that can be traced to some foreign Var further
> down in the expression"?  That is, given the example in hand,
>
> RelabelType(ForeignVar) = RelabelType(LocalVar)
>
> the logic ought to be like "the ForeignVar has collation X, and that
> bubbles up without change through the RelabelType, and then the equals
> operator's inputcollation matches that, so accept it --- regardless of
> where the other operand's collation came from exactly".  The key point
> is that we want to validate operator input collations, not output
> collations, as having something to do with what the remote side would do.
>
> This would represent a fairly significant rewrite of foreign_expr_walker's
> collation logic; although I think the end result would be no more
> complicated, possibly simpler, than it is now.
>
> regards, tom lane
>


IIUC, you are saying that the check on the output collation is not
necessary for OpExpr/FuncExpr/ArrayRef etc.
Should we remove code blocks like
collation = r->resultcollid;
if (collation == InvalidOid)
state = FDW_COLLATE_NONE;
else if (inner_cxt.state == FDW_COLLATE_SAFE &&
 collation == inner_cxt.collation)
state = FDW_COLLATE_SAFE;
else
state = FDW_COLLATE_UNSAFE;

and just bubble up the collation and state to the next level?

Here I see that, in the result collation validation, we missed the case
where the result collation is the default collation. For a foreign Var, we
return the collation as is in the inner context, with the state set to
SAFE. But in the case of a local Var, we only allow InvalidOid or the
default OID for the collation; however, while returning, we set the
collation to InvalidOid and the state to NONE even for the default
collation. I think we are missing something here.

To handle this case, we need to either:
1. allow the local Var to set the inner_cxt collation to what the Var
actually has (which will be either Invalid or DEFAULT), and set the state
to NONE if it is non-collatable or to SAFE if it is the default collation,
like:
  In T_Var, local var case
collation = var->varcollid;
state = OidIsValid(collation) ? FDW_COLLATE_SAFE : FDW_COLLATE_NONE;

OR
2. in the above code block, which checks the result collation, handle the
default collation, like:
else if (collation == DEFAULT_COLLATION_OID)
state = inner_cxt.state;

Let me know if I missed any.
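
To make the intended semantics of option (2) concrete, here is a tiny
standalone model -- my own simplification, not the actual postgres_fdw
code, and the OID values are stand-ins -- of the result-collation check
with the default-collation case added:

#include <stdio.h>

typedef enum
{
	FDW_COLLATE_NONE,			/* expression is of a noncollatable type */
	FDW_COLLATE_SAFE,			/* collation derives from a foreign Var */
	FDW_COLLATE_UNSAFE			/* non-default collation from elsewhere */
} FDWCollateState;

#define InvalidOid 0u
#define DEFAULT_COLLATION_OID 100u	/* stand-in value for this model */

/* Option (2) above: the default collation inherits the inner state. */
static FDWCollateState
check_result_collation(unsigned int collation,
					   FDWCollateState inner_state,
					   unsigned int inner_collation)
{
	if (collation == InvalidOid)
		return FDW_COLLATE_NONE;
	if (inner_state == FDW_COLLATE_SAFE && collation == inner_collation)
		return FDW_COLLATE_SAFE;
	if (collation == DEFAULT_COLLATION_OID)
		return inner_state;
	return FDW_COLLATE_UNSAFE;
}

int
main(void)
{
	/*
	 * A RelabelType over a local Var with the default collation: without
	 * the special case this came out UNSAFE; with it, the node inherits
	 * NONE from the inner context and the qual stays pushable.
	 */
	printf("state = %d\n",
		   check_result_collation(DEFAULT_COLLATION_OID,
								  FDW_COLLATE_NONE, InvalidOid));
	return 0;
}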

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


[HACKERS] TEXT vs VARCHAR join qual push down difference, bug or expected?

2015-09-21 Thread Jeevan Chalke
Hi,

It is observed that, when we have one remote (huge) table and one local
(small) table and a join between them, then:
 1. If the column type is text, we push the join qual to the remote
server, so we have fewer rows to fetch, and thus the execution time
is very low.
 2. If the column type is varchar, we do not push the join qual to the
remote server, resulting in a large number of rows fetched and thus
a very high execution time.

Here is the EXPLAIN plan for such queries:

When VARCHAR column:

QUERY
PLAN
---
 Nested Loop  (cost=100.15..4594935.73 rows=230 width=120) (actual
time=0.490..291.339 rows=1 loops=1)
   Output: a.ename, d.dname
   Join Filter: ((a.deptno)::text = (d.deptno)::text)
   Rows Removed by Join Filter: 100099
   ->  Index Scan using emp2_pk on public.emp2 a  (cost=0.15..8.17 rows=1
width=76) (actual time=0.009..0.013 rows=1 loops=1)
 Output: a.empno, a.ename, a.deptno
 Index Cond: (a.empno = '7369'::numeric)
   ->  Foreign Scan on public.fdw_dept2 d  (cost=100.00..4594353.50
rows=45925 width=120) (actual time=0.466..274.990 rows=100100 loops=1)
 Output: d.deptno, d.dname
 Remote SQL: SELECT deptno, dname FROM public.dept2
 Planning time: 0.697 ms
 Execution time: 291.467 ms
(12 rows)


When TEXT column:

  QUERY
PLAN
--
 Nested Loop  (cost=100.57..216.63 rows=238 width=120) (actual
time=0.375..0.378 rows=1 loops=1)
   Output: a.ename, d.dname
   ->  Index Scan using emp3_pk on public.emp3 a  (cost=0.15..8.17 rows=1
width=70) (actual time=0.010..0.011 rows=1 loops=1)
 Output: a.empno, a.ename, a.deptno
 Index Cond: (a.empno = '7369'::numeric)
   ->  Foreign Scan on public.fdw_dept3 d  (cost=100.42..208.45 rows=1
width=114) (actual time=0.362..0.362 rows=1 loops=1)
 Output: d.deptno, d.dname
 Remote SQL: SELECT deptno, dname FROM public.dept3 WHERE
(($1::text = deptno))
 Planning time: 1.220 ms
 Execution time: 0.498 ms
(10 rows)


Attached is a test script to reproduce this behavior.

I have observed that, since we do not have an equality operator for the
VARCHAR type, we convert VARCHAR to TEXT using a RelabelType and use the
texteq operator function.
However, in foreign_expr_walker(), for the T_RelabelType case, we have
these conditions, which do not allow us to push the qual to the remote
server:

            /*
             * RelabelType must not introduce a collation not derived from
             * an input foreign Var.
             */
collation = r->resultcollid;
if (collation == InvalidOid)
state = FDW_COLLATE_NONE;
else if (inner_cxt.state == FDW_COLLATE_SAFE &&
 collation == inner_cxt.collation)
state = FDW_COLLATE_SAFE;
else
state = FDW_COLLATE_UNSAFE;

I guess, since we do push the qual to the remote server in the TEXT case,
we should do the same for VARCHAR too.

Also, given that a RelabelType is just a dummy wrapper for binary-compatible
types, can we simply set the collation and state from its inner context
instead of the above check block? Like:

            /*
             * Since RelabelType represents a "dummy" type coercion between
             * two binary-compatible datatypes, set the collation and state
             * from the inner_cxt.
             */
collation = inner_cxt.collation;
state = inner_cxt.state;

Inputs/Thoughts?


-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
\c template1

-- Create database localdb and foreigndb;
CREATE DATABASE localdb;
CREATE DATABASE foreigndb;

-- Create tables in foreigndb
\c foreigndb

DROP TABLE IF EXISTS dept1; 

-- #Case 1, deptno datatype is NUMERIC. 
CREATE TABLE dept1 (
  deptno  NUMERIC(10) NOT NULL CONSTRAINT dept1_pk PRIMARY KEY,
  dname   VARCHAR(32) NOT NULL DEFAULT (md5(random()::VARCHAR))
);

INSERT INTO dept1 VALUES (generate_series(1,100100));

DROP TABLE IF EXISTS dept2; 

-- #Case 2, deptno datatype is VARCHAR.
CREATE TABLE dept2 (
  deptno  VARCHAR(10) NOT NULL CONSTRAINT dept2_pk PRIMARY KEY,
  dname   VARCHAR(32) NOT NULL DEFAULT (md5(random()::VARCHAR))
);

INSERT INTO dept2 VALUES (trim(to_char(generate_series(1,100100),'999999')));

DROP TABLE IF EXISTS dept3; 

-- #Case 3, deptno datatype is TEXT.
CREATE TABLE dept3 (
  deptno  TEXT NOT NULL CONSTRAINT dept3_pk PRIMARY KEY,
  dname   VARCHAR(32) NOT NULL DEFAULT (md5(random()::VARCHAR))
);

INSERT INTO dept3 VALUES 

Re: [HACKERS] GSets: Fix bug involving GROUPING and HAVING together

2015-07-24 Thread Jeevan Chalke
Hi,

This will fail too.

Note that when we have only one element in GROUPING SETS,
we add it to the GROUP BY list and set parse->groupingSets to NULL.

And hence it will have the same issue.

However, the tests added in my patch are failing too.

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


[HACKERS] Gsets: ROW expression semantic broken between 9.4 and 9.5

2015-07-22 Thread Jeevan Chalke
Hi

It looks like we have broken the ROW expression without an explicit
ROW keyword in GROUP BY.
I mean, after the grouping sets merge, if we have (c1, c2) in GROUP BY,
we treat it as a ROW expression for grouping, but at the same
time we allow the individual columns in the target list.
However, this was not true with PG 9.4, where we errored out saying
column c1 must appear in the GROUP BY clause...

But if I use an explicit ROW keyword, like ROW(c1, c2), then on PG 9.5
it errors out for an individual column reference in the select list.

An example may make this clearer:

ON PG 9.5 (after grouping sets implementation)

postgres=# create view gstest1(a,b,v)
postgres-#   as values (1,1,10),(1,1,11),(1,2,12),(1,2,13),(1,3,14),
postgres-# (2,3,15),
postgres-# (3,3,16),(3,4,17),
postgres-# (4,1,18),(4,1,19);
CREATE VIEW

postgres=#
postgres=# SELECT a, b, max(v) FROM gstest1 GROUP BY (a, b)
ORDER BY 1, 2, 3 DESC;
 a | b | max
---+---+-----
 1 | 1 |  11
 1 | 2 |  13
 1 | 3 |  14
 2 | 3 |  15
 3 | 3 |  16
 3 | 4 |  17
 4 | 1 |  19
(7 rows)

postgres=# SELECT a, b, max(v) FROM gstest1 GROUP BY ROW(a, b)
ORDER BY 1, 2, 3 DESC;
ERROR:  column gstest1.a must appear in the GROUP BY clause or be used in
an aggregate function
LINE 1: SELECT a, b, max(v) FROM gstest1 GROUP BY ROW(a, b) ORDER BY...
   ^

In the above example, you can see that when we have just (a, b), it works
fine, but when we have ROW(a, b), it throws an error.
On PG 9.4 both cases fail. Here it is:

postgres=# SELECT a, b, max(v) FROM gstest1 GROUP BY (a, b)
ORDER BY 1, 2, 3 DESC;
ERROR:  column gstest1.a must appear in the GROUP BY clause or be used in
an aggregate function
LINE 1: SELECT a, b, max(v) FROM gstest1 GROUP BY (a, b) ORDER BY 1,...
   ^
postgres=# SELECT a, b, max(v) FROM gstest1 GROUP BY ROW(a, b)
ORDER BY 1, 2, 3 DESC;
ERROR:  column gstest1.a must appear in the GROUP BY clause or be used in
an aggregate function
LINE 1: SELECT a, b, max(v) FROM gstest1 GROUP BY ROW(a, b) ORDER BY...
   ^

Did we break the ROW expression semantics in the grouping sets
implementation?

Any idea why this is happening?

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Grouping Sets: Fix unrecognized node type bug

2015-07-20 Thread Jeevan Chalke
On Sat, Jul 18, 2015 at 12:27 AM, Andrew Gierth and...@tao11.riddles.org.uk wrote:

  Kyotaro == Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp
 writes:

  Kyotaro Hello, this looks to be a kind of thinko. The attached patch
  Kyotaro fixes it.

 No, that's still wrong. Just knowing that there is a List is not enough
 to tell whether to concat it or append it.

 Jeevan's original patch tries to get around this by making the RowExpr
 case wrap another List around its result (which is then removed by the
 concat), but this is the wrong approach too because it breaks nested
 RowExprs (which isn't valid syntax in the spec, because the spec allows
 only column references in GROUP BY, not arbitrary expressions, but which
 we have no reason not to support).

 Attached is the current version of my fix (with Jeevan's regression
 tests plus one of mine).


Looks good to me.



 --
 Andrew (irc:RhodiumToad)




-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


[HACKERS] GSets: Getting collation related error when GSets has text column

2015-07-17 Thread Jeevan Chalke
Hi,

When we have a text column in GROUPING SETS (and with some specific
order of columns), we get an error saying
"could not determine which collation to use for string comparison".

Here is the example:

postgres=# select sum(ten) from onek group by rollup(four::text), two
order by 1;
ERROR:  could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.

However, if we have SQL like
select sum(ten) from onek group by two, rollup(four::text)
order by 1;

it does not fail.

After spending some time on this, if I understand the code correctly, we
are re-ordering the sort columns but we are not remapping grpColIdx
correctly.

Attached is a patch which attempts to fix this issue. However, I am not
sure whether it is correct, but it does not add any new issues as such.
I have added a few tests in the patch as well.

Please have a look.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index a6ce96e..96def6b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -2402,13 +2402,8 @@ build_grouping_chain(PlannerInfo *root,
 	 * Prepare the grpColIdx for the real Agg node first, because we may need
 	 * it for sorting
 	 */
-	if (list_length(rollup_groupclauses) > 1)
-	{
-		Assert(rollup_lists && llast(rollup_lists));
-
-		top_grpColIdx =
-			remap_groupColIdx(root, llast(rollup_groupclauses));
-	}
+	if (parse->groupingSets)
+		top_grpColIdx = remap_groupColIdx(root, llast(rollup_groupclauses));
 
 	/*
 	 * If we need a Sort operation on the input, generate that.
@@ -2429,11 +2424,14 @@ build_grouping_chain(PlannerInfo *root,
 	while (list_length(rollup_groupclauses) > 1)
 	{
 		List	   *groupClause = linitial(rollup_groupclauses);
-		List	   *gsets = linitial(rollup_lists);
+		List	   *gsets;
 		AttrNumber *new_grpColIdx;
 		Plan	   *sort_plan;
 		Plan	   *agg_plan;
 
+		Assert(rollup_lists && linitial(rollup_lists));
+		gsets = linitial(rollup_lists);
+
 		Assert(groupClause);
 		Assert(gsets);
 
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index 5c47717..4ff5963 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -755,4 +755,27 @@ select array(select row(v.a,s1.*) from (select two,four, count(*) from onek grou
  {(2,0,0,250),(2,0,2,250),(2,0,,500),(2,1,1,250),(2,1,3,250),(2,1,,500),(2,,0,250),(2,,1,250),(2,,2,250),(2,,3,250),(2,,,1000)}
 (2 rows)
 
+-- Grouping on text columns
+select sum(ten) from onek group by two, rollup(four::text) order by 1;
+ sum  
+--
+ 1000
+ 1000
+ 1250
+ 1250
+ 2000
+ 2500
+(6 rows)
+
+select sum(ten) from onek group by rollup(four::text), two order by 1;
+ sum  
+--
+ 1000
+ 1000
+ 1250
+ 1250
+ 2000
+ 2500
+(6 rows)
+
 -- end
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index e478d34..cc5b89a 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -205,4 +205,8 @@ group by rollup(ten);
 select * from (values (1),(2)) v(a) left join lateral (select v.a, four, ten, count(*) from onek group by cube(four,ten)) s on true order by v.a,four,ten;
 select array(select row(v.a,s1.*) from (select two,four, count(*) from onek group by cube(two,four) order by two,four) s1) from (values (1),(2)) v(a);
 
+-- Grouping on text columns
+select sum(ten) from onek group by two, rollup(four::text) order by 1;
+select sum(ten) from onek group by rollup(four::text), two order by 1;
+
 -- end

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Grouping Sets: Fix unrecognized node type bug

2015-07-17 Thread Jeevan Chalke
On Wed, Jul 15, 2015 at 10:21 PM, Andrew Gierth and...@tao11.riddles.org.uk
 wrote:

  Jeevan == Jeevan Chalke jeevan.cha...@enterprisedb.com writes:

  Jeevan Hi,
  Jeevan It looks like we do support nested GROUPING SETS, I mean Sets
  Jeevan withing Sets, not other types.  However this nesting is broken.

 Good catch, but I'm not yet sure your fix is correct; I'll need to look
 into that.


Sure. Thanks.

However, I wonder why we are supporting GROUPING SETS inside GROUPING SETS;
Oracle throws an error for it.
We are not trying to be Oracle-compatible; I am just curious.

I have tried restricting it in the attached patch.

But it may require a few comment adjustments.

Thanks


 --
 Andrew (irc:RhodiumToad)




-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e0ff6f1..738715f 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -371,9 +371,9 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 relation_expr_list dostmt_opt_list
 transform_element_list transform_type_list
 
-%type <list>	group_by_list
+%type <list>	group_by_list grouping_sets_list
 %type <node>	group_by_item empty_grouping_set rollup_clause cube_clause
-%type <node>	grouping_sets_clause
+%type <node>	grouping_sets_clause grouping_sets_item
 
 %type <list>	opt_fdw_options fdw_options
 %type <defelt>	fdw_option
@@ -10343,6 +10343,18 @@ group_by_item:
 			| grouping_sets_clause	{ $$ = $1; }
 		;
 
+grouping_sets_list:
+			grouping_sets_item						{ $$ = list_make1($1); }
+			| grouping_sets_list ',' grouping_sets_item		{ $$ = lappend($1,$3); }
+		;
+
+grouping_sets_item:
+			a_expr	{ $$ = $1; }
+			| empty_grouping_set	{ $$ = $1; }
+			| cube_clause			{ $$ = $1; }
+			| rollup_clause			{ $$ = $1; }
+		;
+
 empty_grouping_set:
 			'(' ')'
 {
@@ -10371,7 +10383,7 @@ cube_clause:
 		;
 
 grouping_sets_clause:
-			GROUPING SETS '(' group_by_list ')'
+			GROUPING SETS '(' grouping_sets_list ')'
 {
 	$$ = (Node *) makeGroupingSet(GROUPING_SET_SETS, $4, @1);
 }
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index 842c2ae..e75dceb 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -145,6 +145,25 @@ select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
|   |  12 |   36
 (6 rows)
 
+-- nesting with grouping sets
+select sum(c) from gstest2
+  group by grouping sets((), grouping sets(((a, b))))
+  order by 1 desc;
+ERROR:  syntax error at or near "sets"
+LINE 2:   group by grouping sets((), grouping sets(((a, b))))
+  ^
+select sum(c) from gstest2
+  group by grouping sets(grouping sets(rollup(c), grouping sets(cube(c))))
+  order by 1 desc;
+ERROR:  syntax error at or near "sets"
+LINE 2:   group by grouping sets(grouping sets(rollup(c), grouping s...
+  ^
+select sum(c) from gstest2
+  group by grouping sets(grouping sets(a, grouping sets(a), a))
+  order by 1 desc;
+ERROR:  syntax error at or near "sets"
+LINE 2:   group by grouping sets(grouping sets(a, grouping sets(a), ...
+  ^
 -- empty input: first is 0 rows, second 1, third 3 etc.
 select a, b, sum(v), count(*) from gstest_empty group by grouping sets ((a,b),a);
  a | b | sum | count 
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index 0bffb85..b7e4826 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -73,6 +73,17 @@ select grouping(a), a, array_agg(b),
 select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
   from gstest2 group by rollup (a,b) order by rsum, a, b;
 
+-- nesting with grouping sets
+select sum(c) from gstest2
+  group by grouping sets((), grouping sets(((a, b))))
+  order by 1 desc;
+select sum(c) from gstest2
+  group by grouping sets(grouping sets(rollup(c), grouping sets(cube(c))))
+  order by 1 desc;
+select sum(c) from gstest2
+  group by grouping sets(grouping sets(a, grouping sets(a), a))
+  order by 1 desc;
+
 -- empty input: first is 0 rows, second 1, third 3 etc.
 select a, b, sum(v), count(*) from gstest_empty group by grouping sets ((a,b),a);
 select a, b, sum(v), count(*) from gstest_empty group by grouping sets ((a,b),());

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Grouping Sets: Fix unrecognized node type bug

2015-07-15 Thread Jeevan Chalke
Hi,

It looks like we do support nested GROUPING SETS, I mean Sets within
Sets, not the other types.  However, this nesting is broken.

Here is a simple example where I would expect three rows in the
result.  But unfortunately it gives an "unrecognized node type"
error, which is weird and needs a fix.

postgres=# create table gstest2 (a integer, b integer, c integer);
postgres=# insert into gstest2 values (1,1,1), (1,1,1), (1,1,1),
(1,1,1), (1,1,1), (1,1,1), (1,1,2), (1,2,2), (2,2,2);
postgres=# select sum(c) from gstest2
  group by grouping sets((), grouping sets((), grouping sets(())))
  order by 1 desc;
ERROR:  unrecognized node type: 926


I spent a lot of time understanding the cause and was looking into
the transformGroupingSet() and transformGroupClauseList() functions.
I tried fixing the "unrecognized node type: 926" error there,
but then it failed with "unrecognized node type: 656".

Later I realized that we actually have an issue while
flattening grouping sets.  If we have nested grouping sets like
the above, then we get a GroupingSet node inside the list, and
transformGroupClauseList() does not expect that and ends up with
this error.

I have tried fixing this issue in flatten_grouping_sets(): after
flattening a grouping sets node, we need to concat the result with
the existing list and not append it.  This alone does not
solve the issue, as we need a list when we have a ROW expression.
Thus, there, if not at top level, I now create a list.
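
As a side note, the append-vs-concat distinction is easy to see with a
toy cons-list. The following standalone C program (my own model, not
PostgreSQL's List API) prints the nested shape that appending produces
versus the flattened shape that concatenating produces:

#include <stdio.h>
#include <stdlib.h>

typedef struct Cell
{
	const char *atom;			/* leaf value, or NULL if this cell nests */
	struct Cell *sublist;		/* nested list when atom == NULL */
	struct Cell *next;
} Cell;

static Cell *
cons(const char *atom, Cell *sublist, Cell *next)
{
	Cell	   *c = malloc(sizeof(Cell));

	c->atom = atom;
	c->sublist = sublist;
	c->next = next;
	return c;
}

static void
print_list(const Cell *c)
{
	printf("(");
	for (; c != NULL; c = c->next)
	{
		if (c->atom)
			printf(" %s", c->atom);
		else
		{
			printf(" ");
			print_list(c->sublist);
		}
	}
	printf(" )");
}

int
main(void)
{
	/* the flattened inner grouping set: (b c) */
	Cell	   *inner = cons("b", NULL, cons("c", NULL, NULL));

	/* appending keeps it as one nested element: (a (b c)) */
	Cell	   *appended = cons("a", NULL, cons(NULL, inner, NULL));

	/* concatenating splices its elements into the parent: (a b c) */
	Cell	   *concated = cons("a", NULL, inner);

	print_list(appended);		/* the shape the transform code trips on */
	printf("\n");
	print_list(concated);		/* the shape it expects */
	printf("\n");
	return 0;
}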

Attached patch with few testcases too.

Please have a look.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index e90e1d6..31d4331 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -1779,8 +1779,19 @@ flatten_grouping_sets(Node *expr, bool toplevel, bool *hasGroupingSets)
 				RowExpr    *r = (RowExpr *) expr;
 
 				if (r->row_format == COERCE_IMPLICIT_CAST)
-					return flatten_grouping_sets((Node *) r->args,
-												 false, NULL);
+				{
+					Node	   *n1 = flatten_grouping_sets((Node *) r->args,
+														   false, NULL);
+
+	/*
+	 * Make a list for row expression if toplevel is false,
+	 * return the flattened list otherwise
+	 */
+	if (toplevel)
+		return (Node *) n1;
+	else
+		return (Node *) list_make1(n1);
+}
 			}
 			break;
 		case T_GroupingSet:
@@ -1805,7 +1816,10 @@ flatten_grouping_sets(Node *expr, bool toplevel, bool *hasGroupingSets)
 {
 	Node	   *n2 = flatten_grouping_sets(lfirst(l2), false, NULL);
 
-	result_set = lappend(result_set, n2);
+	if (IsA(n2, List))
+		result_set = list_concat(result_set, (List *) n2);
+	else
+		result_set = lappend(result_set, n2);
 }
 
 /*
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index adb39b3..5c47717 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -145,6 +145,112 @@ select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
|   |  12 |   36
 (6 rows)
 
+-- nesting with grouping sets
+select sum(c) from gstest2
+  group by grouping sets((), grouping sets((), grouping sets((
+  order by 1 desc;
+ sum 
+-
+  12
+  12
+  12
+(3 rows)
+
+select sum(c) from gstest2
+  group by grouping sets((), grouping sets((), grouping sets(((a, b)))))
+  order by 1 desc;
+ sum 
+-
+  12
+  12
+   8
+   2
+   2
+(5 rows)
+
+select sum(c) from gstest2
+  group by grouping sets(grouping sets(rollup(c), grouping sets(cube(c))))
+  order by 1 desc;
+ sum 
+-
+  12
+  12
+   6
+   6
+   6
+   6
+(6 rows)
+
+select sum(c) from gstest2
+  group by grouping sets(a, grouping sets(a, cube(b)))
+  order by 1 desc;
+ sum 
+-
+  12
+  10
+  10
+   8
+   4
+   2
+   2
+(7 rows)
+
+select sum(c) from gstest2
+  group by grouping sets(grouping sets((a, (b))))
+  order by 1 desc;
+ sum 
+-
+   8
+   2
+   2
+(3 rows)
+
+select sum(c) from gstest2
+  group by grouping sets(grouping sets((a, b)))
+  order by 1 desc;
+ sum 
+-
+   8
+   2
+   2
+(3 rows)
+
+select sum(c) from gstest2
+  group by grouping sets(grouping sets(a, grouping sets(a), a))
+  order by 1 desc;
+ sum 
+-
+  10
+  10
+  10
+   2
+   2
+   2
+(6 rows)
+
+select sum(c) from gstest2
+  group by grouping sets(grouping sets(a, grouping sets(a, grouping sets(a), ((a)), a, grouping sets(a), (a)), a))
+  order by 1 desc;
+ sum 
+-
+  10
+  10
+  10
+  10
+  10
+  10
+  10
+  10
+   2
+   2
+   2
+   2
+   2
+   2
+   2
+   2
+(16 rows)
+
 -- empty input: first is 0 rows, second 1, third 3 etc.
 select a, b, sum(v), count(*) from gstest_empty group by grouping sets ((a,b),a);
  a | b | sum | count 
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index 0883afd..e478d34 100644
--- 

[HACKERS] GSets: Fix bug involving GROUPING and HAVING together

2015-07-14 Thread Jeevan Chalke
Hi,

I have observed some fishy behavior related to GROUPING in the HAVING
clause when we have only one element in GROUPING SETS.

Basically, when we have only one element in GROUPING SETS, we assume it is
a simple GROUP BY with one column, because of which we end up with
this error.

If we have ROLLUP/CUBE, or GROUPING SETS with multiple elements, then we do
not get this error.

Here are some examples:

postgres=# select ten, grouping(ten) from onek
postgres-# group by grouping sets(ten) having grouping(ten) = 0
postgres-# order by 2,1;
ERROR:  parent of GROUPING is not Agg node
postgres=# select ten, grouping(ten) from onek
postgres-# group by grouping sets(ten, four) having grouping(ten) > 0
postgres-# order by 2,1;
 ten | grouping
-----+----------
(0 rows)

postgres=# select ten, grouping(ten) from onek
postgres-# group by rollup(ten) having grouping(ten) > 0
postgres-# order by 2,1;
 ten | grouping
-----+----------
     |        1
(1 row)

postgres=# select ten, grouping(ten) from onek
postgres-# group by cube(ten) having grouping(ten) > 0
postgres-# order by 2,1;
 ten | grouping
-----+----------
     |        1
(1 row)


I had a look over the relevant code and found that contain_agg_clause_walker()
does not consider GroupingFunc an aggregate node, due to which
subquery_planner() fails to account for it in a HAVING clause.

Fix this by adding the GroupingFunc node to this walker.  We already do this
correctly in contain_aggs_of_level_walker(), which has handling for
GroupingFunc.

Attached patch to fix this.

The side effect is that we can now use GROUPING in a HAVING clause even with
a plain GROUP BY clause.  But I guess that is fine.

Let me know if I missed anything to consider here.
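
To illustrate the shape of the fix, here is a tiny standalone model of
the walker logic -- my own simplification, not the real
expression_tree_walker machinery: the HAVING check only works if the
walker treats GroupingFunc as aggregate-like, which is exactly what the
attached patch adds.

#include <stdio.h>
#include <stdbool.h>

/* Simplified node model -- just enough to mimic the walker logic. */
typedef enum
{
	T_OpExpr,
	T_Aggref,
	T_GroupingFunc
} NodeTag;

typedef struct Node
{
	NodeTag		tag;
	struct Node *left;			/* child expressions, NULL if leaf */
	struct Node *right;
} Node;

/*
 * Mirrors the fixed contain_agg_clause_walker(): report true for Aggref
 * *and* GroupingFunc nodes anywhere in the expression tree.
 */
static bool
contains_agg(const Node *node)
{
	if (node == NULL)
		return false;
	if (node->tag == T_Aggref || node->tag == T_GroupingFunc)
		return true;
	return contains_agg(node->left) || contains_agg(node->right);
}

int
main(void)
{
	/* HAVING grouping(ten) = 0  ~  an OpExpr over a GroupingFunc leaf */
	Node		gf = {T_GroupingFunc, NULL, NULL};
	Node		having = {T_OpExpr, &gf, NULL};

	printf("HAVING clause contains an aggregate-like node: %s\n",
		   contains_agg(&having) ? "yes" : "no");
	return 0;
}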

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index d40083d..cdb6955 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -390,7 +390,7 @@ make_ands_implicit(Expr *clause)
 
 /*
  * contain_agg_clause
- *	  Recursively search for Aggref nodes within a clause.
+ *	  Recursively search for Aggref/GroupingFunc nodes within a clause.
  *
  *	  Returns true if any aggregate found.
  *
@@ -417,6 +417,11 @@ contain_agg_clause_walker(Node *node, void *context)
 		Assert(((Aggref *) node)->agglevelsup == 0);
 		return true;			/* abort the tree traversal and return true */
 	}
+	if (IsA(node, GroupingFunc))
+	{
+		Assert(((GroupingFunc *) node)->agglevelsup == 0);
+		return true;			/* abort the tree traversal and return true */
+	}
 	Assert(!IsA(node, SubLink));
 	return expression_tree_walker(node, contain_agg_clause_walker, context);
 }
diff --git a/src/test/regress/expected/groupingsets.out b/src/test/regress/expected/groupingsets.out
index 842c2ae..adb39b3 100644
--- a/src/test/regress/expected/groupingsets.out
+++ b/src/test/regress/expected/groupingsets.out
@@ -486,6 +486,68 @@ having exists (select 1 from onek b where sum(distinct a.four) = b.four);
9 |   3
 (25 rows)
 
+-- HAVING with GROUPING queries
+select ten, grouping(ten) from onek
+group by grouping sets(ten) having grouping(ten) = 0
+order by 2,1;
+ ten | grouping 
+-+--
+   0 |0
+   1 |0
+   2 |0
+   3 |0
+   4 |0
+   5 |0
+   6 |0
+   7 |0
+   8 |0
+   9 |0
+(10 rows)
+
+select ten, grouping(ten) from onek
+group by grouping sets(ten, four) having grouping(ten) > 0
+order by 2,1;
+ ten | grouping 
+-+--
+ |1
+ |1
+ |1
+ |1
+(4 rows)
+
+select ten, grouping(ten) from onek
+group by rollup(ten) having grouping(ten) > 0
+order by 2,1;
+ ten | grouping 
+-+--
+ |1
+(1 row)
+
+select ten, grouping(ten) from onek
+group by cube(ten) having grouping(ten) > 0
+order by 2,1;
+ ten | grouping 
+-+--
+ |1
+(1 row)
+
+select ten, grouping(ten) from onek
+group by (ten) having grouping(ten) = 0
+order by 2,1;
+ ten | grouping 
+-+--
+   0 |0
+   1 |0
+   2 |0
+   3 |0
+   4 |0
+   5 |0
+   6 |0
+   7 |0
+   8 |0
+   9 |0
+(10 rows)
+
 -- FILTER queries
 select ten, sum(distinct four) filter (where four::text ~ '123') from onek a
 group by rollup(ten);
diff --git a/src/test/regress/sql/groupingsets.sql b/src/test/regress/sql/groupingsets.sql
index 0bffb85..0883afd 100644
--- a/src/test/regress/sql/groupingsets.sql
+++ b/src/test/regress/sql/groupingsets.sql
@@ -154,6 +154,23 @@ select ten, sum(distinct four) from onek a
 group by grouping sets((ten,four),(ten))
 having exists (select 1 from onek b where sum(distinct a.four) = b.four);
 
+-- HAVING with GROUPING queries
+select ten, grouping(ten) from onek
+group by 

Re: [HACKERS] GSets: Fix bug involving GROUPING and HAVING together

2015-07-14 Thread Jeevan Chalke
On Tue, Jul 14, 2015 at 4:23 PM, Andrew Gierth and...@tao11.riddles.org.uk wrote:

  Jeevan == Jeevan Chalke jeevan.cha...@enterprisedb.com writes:

  Jeevan Basically, when we have only one element in GROUING SETS, we
  Jeevan are assuming it as a simple GROUP BY with one column. Due to
  Jeevan which we are ending up with this error.

  Jeevan If we have ROLLUP/CUBE or GROUPING SETS with multiple elements,
  Jeevan then we are not getting this error.

 There's two issues here. One you correctly identified, which is that
 contain_agg_clause() should be true for GroupingFuncs too.

 The other is that in subquery_planner, the optimization of converting
 HAVING clauses to WHERE clauses is suppressed if parse->groupingSets
 isn't empty. (It is empty if there's either no group by clause at all,
 or if there's exactly one grouping set, which must not be empty, however
 derived.) This is costing us some optimizations, especially in the case
 of an explicit GROUP BY () clause;


I had observed that, but I thought it was indeed intentional,
as we don't want to choose the code path for GSets when we have
only one element, which is converted to a plain GROUP BY,
thus setting parse->groupingSets to NULL.


 I'll look into this.


OK. Thanks


 In the meantime your patch looks OK (and necessary) to me.

  Jeevan The side effect is that, if we have plain group by clause, then
  Jeevan too we can use GROUPING in HAVING clause. But I guess it is
  Jeevan fine.

 GROUPING is, per spec, explicitly allowed anywhere you could have an
 aggregate call, whether the group by clause is plain or not.

 --
 Andrew (irc:RhodiumToad)




-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] [PATCH] two-arg current_setting() with fallback

2015-07-07 Thread Jeevan Chalke
On Fri, Jul 3, 2015 at 2:12 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Jeevan Chalke jeevan.cha...@enterprisedb.com writes:
  Attached patch which fixes my review comments.

 Applied with minor adjustments (mostly cosmetic, but did neither of you
 notice the compiler warning?)


Oops. Sorry for that.
I have added -Wall -Werror to my configuration.

Thanks


 regards, tom lane




-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


[HACKERS] Missing tab-complete for PASSWORD word in CREATE ROLE syntax

2015-06-19 Thread Jeevan Chalke
Hi,

I have observed that we are not tab-completing the word PASSWORD in the
following syntaxes:

1.
CREATE|ALTER ROLE|USER rolname

2.
CREATE|ALTER ROLE|USER rolname WITH

PASSWORD is used many times and should be in the tab-complete list.

Was there any reason to deliberately keep this out?
If yes, please ignore my patch.
Attached is a patch to add those missing tab-completions.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


tab_complete_password_in_create_alter_role_v1.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Dead code in Create/RenameRole() after RoleSpec changes related to CURRENT/SESSION_USER

2015-06-09 Thread Jeevan Chalke
Hi,

I found some dead code in the CREATE/RENAME ROLE code path.
Attached is a patch to remove it.

We have introduced RoleSpec and now handle the "public" and "none" role
names in the grammar itself. We still have this handling in CreateRole()
and RenameRole(), which is no longer valid.

Here is the related commit:
commit 31eae6028eca4365e7165f5f33fee1ed0486aee0
Author: Alvaro Herrera alvhe...@alvh.no-ip.org
Date:   Mon Mar 9 15:41:54 2015 -0300

Allow CURRENT/SESSION_USER to be used in certain commands ...


Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index 3b381c5..d232f52 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -311,13 +311,6 @@ CreateRole(CreateRoleStmt *stmt)
 					 errmsg("permission denied to create role")));
 	}
 
-	if (strcmp(stmt->role, "public") == 0 ||
-		strcmp(stmt->role, "none") == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_RESERVED_NAME),
-				 errmsg("role name \"%s\" is reserved",
-						stmt->role)));
-
 	/*
 	 * Check the pg_authid relation to be certain the role doesn't already
 	 * exist.
@@ -1159,13 +1152,6 @@ RenameRole(const char *oldname, const char *newname)
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
 				 errmsg("role \"%s\" already exists", newname)));
 
-	if (strcmp(newname, "public") == 0 ||
-		strcmp(newname, "none") == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_RESERVED_NAME),
-				 errmsg("role name \"%s\" is reserved",
-						newname)));
-
 	/*
 	 * createrole is enough privilege unless you want to mess with a superuser
 	 */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bugfix: incomplete implementation of errhidecontext

2015-06-09 Thread Jeevan Chalke
On Mon, Jun 8, 2015 at 8:19 PM, Andres Freund and...@anarazel.de wrote:

 On 2015-06-08 14:44:53 +, Jeevan Chalke wrote:
  The following review has been posted through the commitfest application:
  make installcheck-world:  tested, passed
  Implements feature:   tested, passed
  Spec compliant:   tested, passed
  Documentation:tested, passed
 
  This is trivial bug fix in the area of hiding error context.
 
  I observed that there are two places from which we are calling this
 function
  to hide the context in log messages. Those were broken.

 Broken in which sense? They did prevent stuff to go from the server log?

 I'm not convinced that hiding stuff from the client is really
 necessarily the same as hiding it from the server log. We e.g. always
 send the verbose log to the client, even if we only send the terse
 version to the server log.  I don't mind adjusting things for
 errhidecontext(), but it's not just a bug.


The function name itself says that we need to hide the context,
and I assume that means from all the logs, the client, etc.

I said it is broken because these two call sites call this function
passing TRUE explicitly, but even so I can see the
context messages on the client.

Anyway, I don't want to argue about whether it is a bug or not.



 Greetings,

 Andres Freund


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] psql :: support for \ev viewname and \sv viewname

2015-06-09 Thread Jeevan Chalke
Hi

The patch looks excellent now; no issues.

I found one typo, which I have fixed in the attached patch.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 62a3b21..4467b8c 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1651,6 +1651,34 @@ Tue Oct 26 21:40:57 CEST 1999
 
 
   <varlistentry>
+    <term><literal>\ev <optional> <replaceable class="parameter">viewname</replaceable> <optional>  <replaceable class="parameter">line_number</replaceable> </optional> </optional> </literal></term>
+
+    <listitem>
+    <para>
+     This command fetches and edits the definition of the named view,
+     in the form of a <command>CREATE OR REPLACE VIEW</command> command.
+     Editing is done in the same way as for <literal>\edit</literal>.
+     After the editor exits, the updated command waits in the query buffer;
+     type semicolon or <literal>\g</literal> to send it, or <literal>\r</literal>
+     to cancel.
+    </para>
+
+    <para>
+     If no view is specified, a blank <command>CREATE VIEW</command>
+     template is presented for editing.
+    </para>
+
+    <para>
+     If a line number is specified, <application>psql</application> will
+     position the cursor on the specified line of the view definition.
+     (Note that the view definition typically does not begin on the first
+     line of the file.)
+    </para>
+    </listitem>
+  </varlistentry>
+
+
+  <varlistentry>
     <term><literal>\encoding [ <replaceable class="parameter">encoding</replaceable> ]</literal></term>
 
     <listitem>
@@ -2522,6 +2550,26 @@ testdb=&gt; <userinput>\setenv LESS -imx4F</userinput>
 
 
   <varlistentry>
+    <term><literal>\sv[+] <replaceable class="parameter">viewname</replaceable> </literal></term>
+
+    <listitem>
+     <para>
+      This command fetches and shows the definition of the named view,
+      in the form of a <command>CREATE OR REPLACE VIEW</command> command.
+      The definition is printed to the current query output channel,
+      as set by <command>\o</command>.
+     </para>
+
+     <para>
+      If <literal>+</literal> is appended to the command name, then the
+      output lines are numbered, with the first line of the view definition
+      being line 1.
+     </para>
+    </listitem>
+  </varlistentry>
+
+
+  <varlistentry>
     <term><literal>\t</literal></term>
     <listitem>
     <para>
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 38253fa..71a2dfd 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -49,6 +49,16 @@
 #include "settings.h"
 #include "variables.h"
 
+/*
+ * Editable database object types.
+ * Currently functions and views are supported.
+ */
+typedef enum PgObjType
+{
+   PgObjTypeFunction,
+   PgObjTypeView
+   /* add new editable object type here */
+} PgObjType;
 
 /* functions for use in this file */
 static backslashResult exec_command(const char *cmd,
 static bool do_edit(const char *filename_arg, PQExpBuffer query_buf,
 static bool do_connect(char *dbname, char *user, char *host, char *port);
 static bool do_shell(const char *command);
 static bool do_watch(PQExpBuffer query_buf, long sleep);
-static bool lookup_function_oid(const char *desc, Oid *foid);
-static bool get_create_function_cmd(Oid oid, PQExpBuffer buf);
-static int strip_lineno_from_funcdesc(char *func);
+static bool lookup_object_oid(const char *desc,
+ PgObjType obj_type,
+ Oid *obj_oid);
+static bool get_create_object_cmd(Oid oid, PQExpBuffer buf, PgObjType type);
+static void format_create_view_cmd(char *view, PQExpBuffer buf);
+static int strip_lineno_from_objdesc(char *obj);
 static void minimal_error_message(PGresult *res);
+static int count_lines_in_buf(PQExpBuffer buf);
+static void print_with_linenumbers(FILE *output,
+  char *lines,
+  const char *header_cmp_keyword);
 
 static void printSSLInfo(void);
 static bool printPsetInfo(const char *param, struct printQueryOpt *popt);
@@ -612,7 +629,7 @@ exec_command(const char *cmd,
 
 func = psql_scan_slash_option(scan_state,
   OT_WHOLE_LINE, NULL, true);
-   lineno = strip_lineno_from_funcdesc(func);
+   lineno = strip_lineno_from_objdesc(func);
if (lineno == 0)
{
/* error already reported */
@@ -629,12 +646,12 @@ exec_command(const char *cmd,
  AS 
$function$\n
  

Re: [HACKERS] psql :: support for \ev viewname and \sv viewname

2015-06-09 Thread Jeevan Chalke
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Patch looks good to pass to committer.

The new status of this patch is: Ready for Committer




Re: [HACKERS] bugfix: incomplete implementation of errhidecontext

2015-06-08 Thread Jeevan Chalke
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

This is a trivial bug fix in the area of hiding error context.

I observed that there are two places from which we call this function
to hide the context in log messages. Both were broken.

This patch fixes them.

So it is good to go in.

The new status of this patch is: Ready for Committer




Re: [HACKERS] [PATCH] two-arg current_setting() with fallback

2015-06-04 Thread Jeevan Chalke
Hi,

Attached is a patch which fixes my review comments.

Since the code changes were good, I just fixed the reported cosmetic issues.
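
At the SQL level, the behaviour with the patch should then look like this
(a quick sketch; the parameter name is made up):

SELECT current_setting('no.such.setting');
ERROR:  unrecognized configuration parameter "no.such.setting"

-- with missing_ok = true we get a NULL instead of the error
SELECT current_setting('no.such.setting', true);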

David, can you please cross check?

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/contrib/tsearch2/tsearch2.c b/contrib/tsearch2/tsearch2.c
index 143dabb..4354c5b 100644
--- a/contrib/tsearch2/tsearch2.c
+++ b/contrib/tsearch2/tsearch2.c
@@ -363,7 +363,7 @@ tsa_tsearch2(PG_FUNCTION_ARGS)
 		tgargs[i + 1] = trigger->tgargs[i];
 
 	tgargs[1] = pstrdup(GetConfigOptionByName("default_text_search_config",
-			  NULL));
+			  NULL, false));
 	tgargs_old = trigger->tgargs;
 	trigger->tgargs = tgargs;
 	trigger->tgnargs++;
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c6e3540..7f6c4ad 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -16435,7 +16435,7 @@ SELECT collation for ('foo' COLLATE de_DE);
<indexterm>
 <primary>current_setting</primary>
</indexterm>
-<literal><function>current_setting(<parameter>setting_name</parameter>)</function></literal>
+<literal><function>current_setting(<parameter>setting_name</parameter> [, <parameter>missing_ok</parameter> ])</function></literal>
   </entry>
   <entry><type>text</type></entry>
   <entry>get current value of setting</entry>
@@ -16474,7 +16474,9 @@ SELECT collation for ('foo' COLLATE de_DE);
 The function <function>current_setting</function> yields the
 current value of the setting <parameter>setting_name</parameter>.
 It corresponds to the <acronym>SQL</acronym> command
-<command>SHOW</command>.  An example:
+<command>SHOW</command>.  If <parameter>setting</parameter> does not exist,
+throws an error unless <parameter>missing_ok</parameter> is
+<literal>true</literal>.  An example:
<programlisting>
 SELECT current_setting('datestyle');
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index b3c9f14..a43925d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -7137,7 +7137,7 @@ ExtractSetVariableArgs(VariableSetStmt *stmt)
 		case VAR_SET_VALUE:
 			return flatten_set_variable_args(stmt->name, stmt->args);
 		case VAR_SET_CURRENT:
-			return GetConfigOptionByName(stmt->name, NULL);
+			return GetConfigOptionByName(stmt->name, NULL, false);
 		default:
 			return NULL;
 	}
@@ -7206,7 +7206,7 @@ set_config_by_name(PG_FUNCTION_ARGS)
 			 true, 0, false);
 
 	/* get the new current value */
-	new_value = GetConfigOptionByName(name, NULL);
+	new_value = GetConfigOptionByName(name, NULL, false);
 
 	/* Convert return string to text */
 	PG_RETURN_TEXT_P(cstring_to_text(new_value));
@@ -7633,7 +7633,7 @@ GetPGVariableResultDesc(const char *name)
 		const char *varname;
 
 		/* Get the canonical spelling of name */
-		(void) GetConfigOptionByName(name, &varname);
+		(void) GetConfigOptionByName(name, &varname, false);
 
 		/* need a tuple descriptor representing a single TEXT column */
 		tupdesc = CreateTemplateTupleDesc(1, false);
@@ -7656,7 +7656,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	char	   *value;
 
 	/* Get the value and canonical spelling of name */
-	value = GetConfigOptionByName(name, &varname);
+	value = GetConfigOptionByName(name, &varname, false);
 
 	/* need a tuple descriptor representing a single TEXT column */
 	tupdesc = CreateTemplateTupleDesc(1, false);
@@ -7740,19 +7740,26 @@ ShowAllGUCConfig(DestReceiver *dest)
 }
 
 /*
- * Return GUC variable value by name; optionally return canonical
- * form of name.  Return value is palloc'd.
+ * Return GUC variable value by name; optionally return canonical form of
+ * name.  If the GUC is unset, then throw an error unless missing_ok is true,
+ * in which case return NULL.  Return value is palloc'd.
  */
 char *
-GetConfigOptionByName(const char *name, const char **varname)
+GetConfigOptionByName(const char *name, const char **varname, bool missing_ok)
 {
 	struct config_generic *record;
 
 	record = find_option(name, false, ERROR);
 	if (record == NULL)
+	{
+		if (missing_ok)
+			return NULL;
+
 		ereport(ERROR,
 (errcode(ERRCODE_UNDEFINED_OBJECT),
 			   errmsg("unrecognized configuration parameter \"%s\"", name)));
+	}
+
 	if ((record->flags & GUC_SUPERUSER_ONLY) && !superuser())
 		ereport(ERROR,
 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
@@ -8046,12 +8053,42 @@ show_config_by_name(PG_FUNCTION_ARGS)
 	varname = TextDatumGetCString(PG_GETARG_DATUM(0));
 
 	/* Get the value */
-	varval = GetConfigOptionByName(varname, NULL);
+	varval = GetConfigOptionByName(varname, NULL, false);
+
+	/* Convert to text */
+	PG_RETURN_TEXT_P(cstring_to_text(varval));
+}
+
+/*
+ * show_config_by_name_missing_ok - equiv to SHOW X command but implemented as
+ * a function.  If X does not exist, suppress the error and just return NULL
+ * if missing_ok is TRUE.
+ */
+Datum
+show_config_by_name_missing_ok(PG_FUNCTION_ARGS)
+{
+	char	   *varname;
+	bool   missing_ok;
+	char	   *varval;

Re: [HACKERS] [PATCH] two-arg current_setting() with fallback

2015-06-04 Thread Jeevan Chalke
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

I have reviewed the patch.
Here are my review comments:

1. Patch does not apply due to some recent changes in pg_proc.h

2. 
-GetConfigOptionByName(const char *name, const char **varname)
+GetConfigOptionByNameMissingOK(const char *name, const char **varname, bool 
missing_ok)

Will it be better if we keep the name as is and change the callers to pass
false for the missing_ok parameter?
It looks weird to have an extra #define just to avoid that.
I see only a handful of callers and thus see NO issues changing those.

3. Oid used for new function is already used. Check unused_oids.sh.

4. Changes in builtins.h are accidental. Need to remove that.


However, code changes looks good and implements the desired feature.


The new status of this patch is: Waiting on Author




Re: [HACKERS] psql :: support for \ev viewname and \sv viewname

2015-06-01 Thread Jeevan Chalke
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, failed

I have reviewed this patch. Most of the code is just rearranged.
Since this is based upon \ef and \sf, the code is almost similar.

However, here are my review comments:

1.
make failed with docs

2.
 \ev vw1 3

This syntax is supported. But documentation only says:
\ev [ viewname ] 
Missing optional line_number clause

3.
 strip_lineno_from_objdesc(char *func)

Can we have the parameter name as obj instead of func?
You have renamed the function, as it is now called in the case of views as
well. Better rename the parameter names as well.

4.
Also please update the comments above strip_lineno_from_objdesc().
It is specific to functions which is not the case now.

5.
 print_with_linenumbers(FILE *output,
  char *lines,
  const char *header_cmp_keyword,
  size_t header_cmp_sz)

Can't we calculate the length of the header (header_cmp_sz) inside the function?
This will avoid any sloppy changes like changing the keyword but forgetting to
change the size.
Let's just accept the keyword and calculate the size within the function.

6.
*
* Note that this loop scribbles on func_buf.
*/

These lines at commands.c:1357 look NO more valid now, as there is NO loop
there.

7.
I see a few comment lines explaining which is line 1 in the case of a function,
for which "AS " is used. Similarly, for a view "SELECT " is used.
Can you add a similar kind of explanation there?

8.
 get_create_object_cmd_internal
 get_create_function_cmd
 get_create_view_cmd

Can these three functions be grouped together into just get_create_object_cmd()?
This function would take an extra parameter to indicate the object type,
say O_FUNC and O_VIEW, for example.

For the distinct part, just have a switch case over this type.

This will add the flexibility that if we add another such \e or \s option, we
don't need new functions, but rather just a new enum value like O_new and a new
case in this switch statement.
It will also make the code easier to read.

Similarly, you can do the same for
 lookup_object_oid_internal
 get_create_function_cmd
 lookup_function_oid

9.
 static int count_lines_in_buf(PQExpBuffer buf)
 static void print_with_linenumbers(FILE *output, .. )
 static bool lookup_view_oid(const char *desc, Oid *view_oid)
 static bool lookup_object_oid_internal(PQExpBuffer query, Oid *obj_oid)

Can we have a smaller description for each of these functions at its
definition, explaining what the function does?

10.
 + \\e, \\echo, \\ef, \\ev, \\encoding,

Can you keep this sorted?
It will be good if it is sorted, but I see no such restriction, as I see a few
out-of-order options. But better keep it ordered.
Ignore if you disagree.


The new status of this patch is: Waiting on Author




Re: [HACKERS] bugfix: incomplete implementation of errhidecontext

2015-05-29 Thread Jeevan Chalke
Pavel, will it be good if you separately submit the
bugfix: incomplete implementation of errhidecontext
patch in this commitfest?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql tabcomplete - minor bugfix - tabcomplete for SET ROLE TO xxx

2015-05-29 Thread Jeevan Chalke
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:not tested

I agree with Peter that we don't tab-complete everything we possibly could,
but using tabs after SET ROLE TO provides DEFAULT as an option, which seems
wrong.
This patch adds the list of roles over there, which I guess is better to have
than offering something unusual.

I reviewed this straightforward patch and it looks good to me.

Since we might not want this at all, the review is done and I am thus passing
it to the committer to decide.

The new status of this patch is: Ready for Committer




Re: [HACKERS] Add LINE: hint when schemaname.typename is a non-existent schema

2015-03-18 Thread Jeevan Chalke
Álvaro,

I think there are a few open questions here and thus I am marking it back to
Waiting on Author.

Please share your views on the review comments already posted.
Also make the changes Tom suggested about placing pstate at the beginning.

I am more concerned about this:

1.
postgres=# create or replace function
f1(a abc.test.id%type) returns int as
$$ select 1; $$
language sql;
ERROR:  schema "abc" does not exist

Is that expected?

2.
Also, what about pushing setup_parser_errposition_callback() inside
func_get_detail() as well, just to limit it to the namespace lookup?

Thanks

The new status of this patch is: Waiting on Author




Re: [HACKERS] How about to have relnamespace and relrole?

2015-03-12 Thread Jeevan Chalke
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Looks good. Passing it to committer.

The new status of this patch is: Ready for Committer




Re: [HACKERS] How about to have relnamespace and relrole?

2015-03-04 Thread Jeevan Chalke
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

 1.
 +#include "utils/acl.h"

 Can you please add it at the appropriate place, as the #include list is an
 ordered list?

regrolein calls reg_role_oid in acl.c, which is declared in acl.h.

Well, I was suggesting putting the #include "utils/acl.h" line after
#include "parser/parse_type.h" and before #include "utils/builtins.h"
so that they will be in order.
I understand that it is needed for reg_role_oid() call.

Review comments on *_v4 patches:

1.
+ The OID alias types don't sctrictly comply the transaction isolation

Typo. sctrictly = strictly


2.
+ joining the system tables correnspond to the OID types.
Typo. correnspond = correspond

Apart from these typos, I see no issues.
However, you can have a look over Jim's suggestion on the doc wording.

If you feel comfortable, I have no issues if you mark this as
Ready for Committer once you fix the typos and doc wording.




Re: [HACKERS] Add LINE: hint when schemaname.typename is a non-existent schema

2015-03-03 Thread Jeevan Chalke
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   tested, failed
Spec compliant:   not tested
Documentation:not tested

Tom has already suggested a few changes which I too think the author needs to
address. So marking it Waiting on Author.

However, I see following, example does not work well.

postgres=# create or replace function
f1(a abc.test.id%type) returns int as
$$ select 1; $$
language sql;
ERROR:  schema "abc" does not exist

Is that expected?
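
For clarity, what I would have expected with the patch is a position hint
along these lines (a sketch only; the exact line and caret position may
differ):

ERROR:  schema "abc" does not exist
LINE 2: f1(a abc.test.id%type) returns int as
             ^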
I guess we need it at all the places in parse_*.c where we look up a namespace.
Please fix.


Also, like Tom's suggestion on make_oper_cache_key, can we push this down
inside func_get_detail() as well, just to limit it to the namespace lookup?

However, the patch does not apply cleanly on the latest sources; it needs a rebase.

 On Tom's comments on parse_utilcmd.c:
I guess the block is moved after the pstate and CreateStmtContext are set up
properly.  I guess we can move it to just after the pstate setup, so that it
will result in minimal changes?

Can we have a small test-case? Or will it be too much for this feature?

The new status of this patch is: Waiting on Author




Re: [HACKERS] How about to have relnamespace and relrole?

2015-02-27 Thread Jeevan Chalke
 The attatched are the fourth version of this patch.

 0001-Add-regrole_v4.patch
 0002-Add-regnamespace_v4.patch


Seems like you have missed attaching both the patches.

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Review of GetUserId() Usage

2015-02-26 Thread Jeevan Chalke
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

I have reviewed the patch.
The patch is in excellent shape and does what was expected and discussed.
The changes are straightforward too.

So it looks good to go in.

However, I have one question:

What is the motive for splitting the function return value
SIGNAL_BACKEND_NOPERMISSION into
SIGNAL_BACKEND_NOSUPERUSER and SIGNAL_BACKEND_NOPERMISSION?

Is that required for some other upcoming patches, OR is it just for simplicity?

Currently we have a combined error for both, which is simply split into two.
No issue as such; I am just curious, as it does not go well with the subject.

You can mark this for ready for committer.

The new status of this patch is: Waiting on Author




Re: [HACKERS] How about to have relnamespace and relrole?

2015-02-24 Thread Jeevan Chalke
Hi,

Personally, I was looking for something like this, as I need to see the role
name and namespace name many times in my queries rather than their OIDs.
But writing a JOIN expression every time was a pain. This certainly makes it
easier. And I see most DBAs are looking for it.
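
To illustrate the kind of query this simplifies (a sketch using the new
casts; the schema name is just an example):

-- today: join pg_namespace just to filter by schema name
SELECT c.relname
FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pg_catalog';

-- with regnamespace: no join needed
SELECT relname FROM pg_class
WHERE relnamespace = 'pg_catalog'::regnamespace;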

I agree with Tom's concern about the MVCC issue, but we already hit that when we
introduced regclass and the others. So I see no additional issue with these
changes as such. About planner slowness, I guess the updated documentation looks
perfect for that.

So I went ahead reviewing these patches.

All patches are straightforward, and since similar code already
exists, I did not find any issue at the code level. They are consistent with
the other
functions.

The patches apply with patch -p1. make, make install, and make check show
no issues. Testing was fine too.

However here are few review comments on the patches attached:

Review points on 0001-Add-regrole.patch
---
1.
+#include "utils/acl.h"

Can you please add it at the appropriate place, as the #include list is an
ordered list?

2.
+char   *role_or_oid = PG_GETARG_CSTRING(0);

Can we have the variable named role_name_or_oid, like in other similar
functions?

3.
+/*
+ * Normal case: parse the name into components and see if it matches
any
+ * pg_role entries in the current search path.
+ */

I believe roles are never searched via the search path, so the comment here
needs updating.


Review points on 0002-Add-regnamespace.patch
---
1.
+ * regnamespacein- converts classname to class OID

Typos. It should be nspname instead of classname, and namespace OID instead of
class OID.


Review points on 0003-Check-newpatch
---
1.
@@ -1860,6 +1860,7 @@ StoreAttrDefault(Relation rel, AttrNumber attnum,
 	Oid			attrdefOid;
 	ObjectAddress colobject,
 				defobject;
+	Oid			exprtype;

This seems unrelated. Please remove.


Apart from this, it will be good if you have ONLY two patches:
(1) regrole-specific and (2) regnamespace-specific,
with all related changes folded into one patch each. I mean, all role-related
changes, i.e.
0001 + the role-related changes of 0003 + the role-related changes of 0004 and
the docs update,
AND
0002 + the nsp-related changes of 0003 + the nsp-related changes of 0004.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] How about to have relnamespace and relrole?

2015-02-24 Thread Jeevan Chalke
Review posted directly on the mail thread instead of on the commitfest app.




Re: [HACKERS] detect custom-format dumps in psql and emit a useful error

2014-10-17 Thread Jeevan Chalke
Hi,

Regarding Loading Custom Format Dump:
===
When we supply plain sql file to pg_restore, we get following error:
$ ./install/bin/pg_restore a.sql
pg_restore: [archiver] input file does not appear to be a valid archive

So I would expect a similar kind of message when we provide a non-plain sql
file to psql. Something like:
input file does not appear to be a valid sql script file
(use pg_restore instead)

I have added the additional details in parentheses, as we correctly identified
it as a custom dump file which the user wanted to restore.

However I do not see any issue with the patch.


Regarding Directory Error:
===
I am strongly against the proposal. This patch changes the error message to
something like this:
psql:blah:0: Input path is a directory. Use pg_restore to restore
directory-format database dumps.

So even though I accidentally provide a directory instead of a sql script
file when I have NO intention of restoring a dump, the above message looks
weird. Instead, the current message looks perfectly fine here, i.e.
could not read from input file: Is a directory

psql always expects a file and NOT a directory. Also, it is not necessarily
being used to restore a dump.


Thanks

On Fri, Oct 17, 2014 at 9:41 AM, Craig Ringer cr...@2ndquadrant.com wrote:

 On 09/18/2014 05:58 PM, Andres Freund wrote:
  I don't think we need to make any discinction between psql -f mydb.dump,
  psql  mydb.dump, and whatever | psql. Just check, when noninteractively
  reading the first line in mainloop.c:MainLoop(), whether it starts with
  the magic header. That'd also trigger the warning on \i pg_restore_file,
  but that's hardly a problem.

 Done, patch attached.

 If psql sees that the first line begins with PGDMP it'll emit:

   The input is a PostgreSQL custom-format dump. Use the pg_restore
   command-line client to restore this dump to a database.

 then discard the rest of the current input source.

  pg_restore already knows to tell you to use psql if it sees an SQL file
  as input. Having something similar for pg_dump would be really useful.
 
  Agreed.
 
  We could additionally write out a hint whenever a directory is fed to
  psql -f that psql can't be used to read directory type dumps.

 Unlike the confusion between pg_restore and psql for custom file format
 dumps I haven't seen people getting this one muddled. Perhaps directory
 format dumps are just a bit more niche, or perhaps it's just more
 obvious that:

 psql:sometump:0: could not read from input file: Is a directory

 ... means psql is the wrong tool.

 Still, separate patch attached. psql will now emit:

 psql:blah:0: Input path is a directory. Use pg_restore to restore
 directory-format database dumps.

 I'm less sure that this is a worthwhile improvement than the check for
 PGDMP and custom format dumps though.

 --
  Craig Ringer   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services






-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 30589500

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb



Re: [HACKERS] Re: proposal: ignore null fields in not relation type composite type based constructors

2014-09-05 Thread Jeevan Chalke
On Thu, Sep 4, 2014 at 11:41 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:

 I am sory

 too much patches


:)

Patch looks good to me.
Marking Ready for Committer.

Thanks



 Regards

 Pavel



-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: proposal: ignore null fields in not relation type composite type based constructors

2014-09-03 Thread Jeevan Chalke
Hi Pavel,

Here are few more comments on new implementation.

1.
 /*
- * SQL function row_to_json(row)
+ * SQL function row_to_json(row record, pretty bool, ignore_nulls bool)
  */

In the above comment, the parameter name row should be changed to rowval.

2.
-DATA(insert OID = 3155 (  row_to_json   PGNSP PGUID 12 1 0 0 0 f f f f
t f s 1 0 114 2249 _null_ _null_ _null_ _null_ row_to_json _null_ _null_
_null_ ));
+DATA(insert OID = 3155 (  row_to_json   PGNSP PGUID 12 1 0 0 0 f f f f
t f s 1 0 114 2249 16 16 _null_ _null_ {rowval,pretty,ignore_nulls}
_null_ row_to_json _null_ _null_ _null_ ));

The number of arguments (pronargs) should be 3 now. However, when we create the
function again with default values it gets updated. Still, we should not have an
inconsistency here.

3.
 extern Datum row_to_json(PG_FUNCTION_ARGS);
 extern Datum row_to_json_pretty(PG_FUNCTION_ARGS);
+extern Datum row_to_json_pretty_choosy(PG_FUNCTION_ARGS);
 extern Datum to_json(PG_FUNCTION_ARGS);

With this new implementation, we have NOT added the row_to_json_pretty_choosy()
function, so that added line needs to be removed. Also, we now have only one
function, with default arguments, and have removed the row_to_json_pretty()
function as well, so its extern needs to be removed too.

4.
Can we have a couple of test cases with named arguments, along with a test
that skips the pretty parameter?


Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: proposal: ignore null fields in not relation type composite type based constructors

2014-09-03 Thread Jeevan Chalke
Hi Pavel,

You have attached wrong patch.

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: proposal: ignore null fields in not relation type composite type based constructors

2014-09-02 Thread Jeevan Chalke
Hi Pavel,

it needs a redesign of original implementation, we should to change API to
 use default values with named parameters

 but it doesn't help too much (although it can be readable little bit more)

 instead row_to_json(x, false, true)

 be

 row_ro_json(x, ignore_null := true)

 it is not too much work, but I need a names for parameters


I have tried adding dummy names (a, b, c) in the pg_proc entry you have added.
But that is not sufficient. We need to have default values provided for these
arguments to make the row_to_json(x, ignore_null := true) call work.
It was not trivial, so I have not put much thought into it.

For names, I would choose (row, pretty, ignore_nulls) or similar.

However, that was just my thought.
If it is too complex or not so useful, then we can ignore it.

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: proposal: ignore null fields in not relation type composite type based constructors

2014-09-01 Thread Jeevan Chalke
Hi Pavel,

Patch does look good to me. And found no issues as such.

However here are my optional suggestions:

1. Frankly, I did not like the name of the function row_to_json_pretty_choosy.
Something like row_to_json_pretty_ignore_nulls seems better to me.

2. To use the ignore-nulls feature, I always have to pass the pretty flag,
which seems weird.

Since we do support named arguments, can we avoid that?
No idea how difficult it is. If we had default arguments for this
function, then we would not need the one- and two-argument variations of it,
and we could use a named argument to omit the ones not required.
Just a thought.
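
To make the suggestion concrete, this is the call style I have in mind
(hypothetical signature with defaults; the parameter names are only
suggestions):

-- today: pretty must be spelled out just to reach the last flag
SELECT row_to_json(t, false, true) FROM t;

-- with defaults plus named notation, pretty could be skipped
SELECT row_to_json(t, ignore_nulls := true) FROM t;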

Rest looks good to me.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: proposal: ignore null fields in not relation type composite type based constructors

2014-08-22 Thread Jeevan Chalke
Hi Pavel,

You have said that XMLFOREST has something which ignores nulls; what's that?
Will you please provide an example?

I am NOT sure, but here you are trying to omit the entire field from the output
when its value is NULL. That will add extra effort at the other end
which consumes this output: that application needs to know all the fields and
then substitute NULL values for the missing ones. However, we have an
optional argument for ignoring nulls and thus we are safe; the application will
use it as per its choice.

Well, apart from that, I tried reviewing the patch. The patch applied, but make
failed with the following error:

make[3]: Entering directory `/home/jeevan/pg_master/src/backend/catalog'
cd ../../../src/include/catalog && '/usr/bin/perl' ./duplicate_oids
3255
make[3]: *** [postgres.bki] Error 1

Please run the unused_oids script to find an unused OID.

However, I had a quick look over the code changes. At first glance they look
good, but I still need to check at the SQL level and then do a code
walk-through.

Waiting for the updated patch.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] add line number as prompt option to psql

2014-08-22 Thread Jeevan Chalke
 I would like to ignore this, as UINTMAX lines are too many for an input
  buffer to hold. There is almost no chance of hitting this.

 Yeah, most likely you will run out of memory before reaching that point,
 or out of patience.

 Yep.

BTW, I have marked this as waiting for committer.

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] add line number as prompt option to psql

2014-08-20 Thread Jeevan Chalke
Hi,

I have reviewed this:

I have initialized cur_lineno to UINTMAX - 2, and then observed the following
behaviour to check the wrap-around.

postgres=# \set PROMPT1 '%/[%l]%R%# '
postgres[18446744073709551613]=# \set PROMPT2 '%/[%l]%R%# '
postgres[18446744073709551613]=# select
postgres[18446744073709551614]-# a
postgres[18446744073709551615]-# ,
postgres[0]-# b
postgres[1]-# from
postgres[2]-# dual;

It is wrapping to 0, whereas line numbers always start at 1. Any issues?

I would like to ignore this, as UINTMAX lines are too many for an input
buffer to hold. There is almost no chance of hitting this.


However, I think you need to use UINT64_FORMAT while printing a uint64
number. Currently it is %u, which wraps around at UINT_MAX.
See how pset.lineno is displayed.

Apart from this, I didn't see any issues in my testing.

Patch applies cleanly. make/make install/initdb/make check all are well.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] add line number as prompt option to psql

2014-07-11 Thread Jeevan Chalke
Hi,

Found new issues with the latest patch:


 Thank you for reviewing the patch with variable cases.
 I have revised the patch, and attached latest patch.

  A:
  Will you please explain the idea behind these changes ?
 I thought wrong about adding new to tail of query_buf.
 The latest patch does not change related to them.

 Thanks.


  B:
 I added the condition of cur_line  0.


A.

However, this introduced a new bug. As I said, when the number of lines in the
editor reaches INT_MAX, it starts giving negative numbers. You tried overcoming
this issue by adding a check against 0, but I guess you again fumbled in setting
the value correctly. You are setting it to INT_MAX - 1, which forces each new
line to show the line number as INT_MAX - 1; that is incorrect.

postgres=# \set PROMPT1 '%/[%l]%R%# '
postgres[1]=# \set PROMPT2 '%/[%l]%R%# '
postgres[1]=# \e
postgres[2147483646]-# limit
postgres[2147483646]-# 1;
   relname
--
 pg_statistic
(1 row)

postgres[1]=# \e
postgres[2147483646]-# =
postgres[2147483646]-# '
postgres[2147483646]'# abc
postgres[2147483646]'# '
postgres[2147483646]-# ;
 relname
-
(0 rows)

postgres[1]=# select
relname
from
pg_class
where
relname
=
'
abc
'
;


Again, to mimic that, I have simply initialized newline to INT_MAX - 2.
Please don't take me wrong, but it seems that your unit testing is not
enough. I discovered the above issue by doing exactly the same steps I did when
reviewing the previous patch. If you had tested your new patch with those steps,
I guess you would have caught it yourself.


B.

+ /* Calculate the line number */
+ if (scan_result != PSCAN_INCOMPLETE)
+ {
+ /* The one new line is always added to tail of query_buf */
+ newline = (newline != 0) ? (newline + 1) : 1;
+ cur_line += newline;
+ }

Here in the above code changes, you are adding 1 to newline in either case, i.e.
when it is 0 you are setting it to 1 (+1), and when it is non-zero you are
setting it to newline + 1 (again +1).
So why can't you simply use
if (scan_result != PSCAN_INCOMPLETE)
    cur_line += (newline + 1);

Or better, why can't you initialize newline with 1 itself and then directly
assign cur_line from newline? That would eliminate the above entire code block,
wouldn't it?
Or much better, simply get rid of newline and use cur_line itself; will
this work well for you?


C. Typos:
1.
/* Count the number of new line for calculate ofline number */

Missing space between 'of' and 'line'.
However, better to improve it to something like (just a suggestion):
Count the number of new lines to correctly adjust current line number

2.
/* Avoid cur_line and newline exceeds the INT_MAX */

You are saying avoid cur_line AND newline, but there is no adjustment for
newline in the code following the comment.

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] add line number as prompt option to psql

2014-07-11 Thread Jeevan Chalke
Hi,


On Fri, Jul 11, 2014 at 3:13 PM, Sawada Masahiko sawada.m...@gmail.com
wrote:

 
 

 To my understating cleanly, you means that line number is not changed
 when newline has reached to INT_MAX, is incorrect?


As per my thinking yes.


 And the line number should be switched to 1 when line number has
 reached to INT_MAX?


Yes, when it goes beyond INT_MAX, wrap around to 1.

BTW, I wonder, can't we simply use an unsigned int instead?

Also, what is the behaviour of LINE n in the error message in case of such a
wrap-around?



 
  Or much better, simply get rid of newline, and use cur_line itself, will
  this work well for you?

 this is better.  I will change code to this.
 Thanks.
 I will fix it.



Meanwhile I have tried this. A patch is attached, to get your suggestion on
it.
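
For what it's worth, the INT_MAX + 2 adjustment in the attached patch works
out as follows: once cur_line is incremented past INT_MAX it wraps to
INT_MIN, and INT_MIN + (INT_MAX + 2) = 1, so counting restarts at line 1 as
intended (leaving aside that signed overflow is strictly undefined behaviour
in C).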

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 255e8ca..030f4d0 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -3298,6 +3298,11 @@ testdb=&gt; <userinput>INSERT INTO my_table VALUES (:'content');</userinput>
   </varlistentry>
 
   <varlistentry>
+<term><literal>%l</literal></term>
+<listitem><para>The current line number</para></listitem>
+  </varlistentry>
+
+  <varlistentry>
 <term><literal>%</literal><replaceable class="parameter">digits</replaceable></term>
 <listitem>
 <para>
diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index c3aff20..675b550 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -8,6 +8,7 @@
 #include "postgres_fe.h"
 #include "mainloop.h"
 
+#include <limits.h>
 
 #include "command.h"
 #include "common.h"
@@ -58,6 +59,7 @@ MainLoop(FILE *source)
 	pset.cur_cmd_source = source;
 	pset.cur_cmd_interactive = ((source == stdin) && !pset.notty);
 	pset.lineno = 0;
+	cur_line = 1;
 
 	/* Create working state */
 	scan_state = psql_scan_create();
@@ -225,6 +227,7 @@ MainLoop(FILE *source)
 		{
 			PsqlScanResult scan_result;
 			promptStatus_t prompt_tmp = prompt_status;
+			char *tmp = line;
 
 			scan_result = psql_scan(scan_state, query_buf, &prompt_tmp);
 			prompt_status = prompt_tmp;
@@ -235,6 +238,30 @@ MainLoop(FILE *source)
 exit(EXIT_FAILURE);
 			}
 
+			/* 
+			 * Increase current line number counter with the new lines present
+			 * in the line buffer
+			 */
+			while (*tmp != '\0' && scan_result != PSCAN_INCOMPLETE)
+			{
+				if (*(tmp++) == '\n')
+					cur_line++;
+			}
+
+			/* The one new line is always added to tail of query_buf */
+			if (scan_result != PSCAN_INCOMPLETE)
+				cur_line++;
+
+			/*
+			 * If we overflow, then we start at INT_MIN and move towards 0.  So
+			 * to get +ve wrap-around line number we have to add INT_MAX + 2 to
+			 * this number.  We add 2 due to the fact that we have difference
+			 * of 1 in absolute value of INT_MIN and INT_MAX and another 1 as
+			 * line number starts at one and not at zero.
+			 */
+			if (cur_line < 0)
+				cur_line += INT_MAX + 2;
+
 			/*
 			 * Send command if semicolon found, or if end of line and we're in
 			 * single-line mode.
@@ -256,6 +283,7 @@ MainLoop(FILE *source)
 /* execute query */
 				success = SendQuery(query_buf->data);
 slashCmdStatus = success ? PSQL_CMD_SEND : PSQL_CMD_ERROR;
+				cur_line = 1;
 
 /* transfer query to previous_buf by pointer-swapping */
 {
@@ -303,6 +331,7 @@ MainLoop(FILE *source)
  query_buf : previous_buf);
 
 success = slashCmdStatus != PSQL_CMD_ERROR;
+				cur_line = 1;
 
 				if ((slashCmdStatus == PSQL_CMD_SEND || slashCmdStatus == PSQL_CMD_NEWEDIT) &&
 					query_buf->len == 0)
diff --git a/src/bin/psql/prompt.c b/src/bin/psql/prompt.c
index 26fca04..6a62e5f 100644
--- a/src/bin/psql/prompt.c
+++ b/src/bin/psql/prompt.c
@@ -44,6 +44,7 @@
  *		in prompt2 -, *, ', or ;
  *		in prompt3 nothing
  * %x - transaction status: empty, *, !, ? (unknown or no connection)
+ * %l - the line number
  * %? - the error code of the last query (not yet implemented)
  * %% - a percent sign
  *
@@ -229,6 +230,9 @@ get_prompt(promptStatus_t status)
 		}
 	break;
 
+case 'l':
+	sprintf(buf, "%d", cur_line);
+	break;
 case '?':
 	/* not here yet */
 	break;
diff --git a/src/bin/psql/prompt.h b/src/bin/psql/prompt.h
index 4d2f7e3..f1f80d2 100644
--- a/src/bin/psql/prompt.h
+++ b/src/bin/psql/prompt.h
@@ -22,4 +22,7 @@ typedef enum _promptStatus
 
 char	   *get_prompt(promptStatus_t status);
 
+/* Current line number */
+int			cur_line;
+
 #endif   /* PROMPT_H */



Re: [HACKERS] add line number as prompt option to psql

2014-07-10 Thread Jeevan Chalke
Hi,

Found a few more bugs in the new code:

A:
This got bad:

jeevan@ubuntu:~/pg_master$ ./install/bin/psql postgres
psql (9.5devel)
Type help for help.

postgres=# \set PROMPT1 '%/[%l]%R%# '
postgres[1]=# \set PROMPT2 '%/[%l]%R%# '
postgres[1]=# select
postgres[2]-# *
postgres[3]-# from
postgres[4]-# abc;
ERROR:  syntax error at or near "fromabc"
LINE 1: select*fromabc;
   ^
postgres[1]=#
postgres[1]=#
postgres[1]=# \e
ERROR:  syntax error at or near "fromabc"
LINE 1: select*fromabc;
   ^
postgres[1]=# select*fromabc;
ERROR:  syntax error at or near "fromabc"
LINE 1: select*fromabc;
   ^
postgres[1]=#


See the query text in LINE 1:. This is because you have removed the addition of
the newline character (related to added_nl_pos). It needs more investigation.
However, I don't think these changes are relevant to what you wanted in this
feature.
Will you please explain the idea behind these changes?

Moreover, if you don't want to add the newline character, then I guess the
entire logic related to added_nl_pos is no longer required. You may remove this
variable and its logic altogether, though I'm not sure. Also make sure you
update the relevant comments while doing so. Here you have removed the code
which adds the newline, but the comment there still reads:
/* insert newlines into query buffer between source lines */

This needs more thought.


B:
postgres=# \set PROMPT1 '%/[%l]%R%# '
postgres[1]=# \set PROMPT2 '%/[%l]%R%# '
postgres[1]=# \e
postgres[-2147483645]-# limit 1;
   relname
--
 pg_statistic
(1 row)

postgres[1]=#
postgres[1]=# select
relname
from
pg_class
limit 1;

The logic related to wrapping around the cur_line counter is wrong. Actually,
the issue is with the newline variable. If the number of lines in the \e editor
goes beyond INT_MAX (NOT sure about the practical use), then newline will be
negative, which then forces cur_line to be negative. To mimic this I have
initialized newline to INT_MAX - 1.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Allowing NOT IN to use ANTI joins

2014-07-09 Thread Jeevan Chalke
Hi,



 With further testing I noticed that the patch was not allowing ANTI joins
 in cases like this:

 explain select * from a where id not in(select x from b natural join c);



I too found this with natural joins and was about to report it. But it's
good that you found that and fixed it as well.

I have reviewed this and didn't find any issues as such. So marking it
Ready for Committer.
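
For anyone following along, the planner change is easy to see with something
like this (a sketch; the table names are made up, and the NOT NULL
constraints are what make the transformation safe):

CREATE TABLE a (id int NOT NULL);
CREATE TABLE b (x int NOT NULL);

-- with the patch this can use an Anti Join instead of a hashed subplan,
-- since neither side can introduce NULLs
EXPLAIN SELECT * FROM a WHERE id NOT IN (SELECT x FROM b);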

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] add line number as prompt option to psql

2014-07-07 Thread Jeevan Chalke
Hi,

Found two issues (A and B) with the latest patch:

A:
-- Set prompts
postgres=# \set PROMPT1 '%/[%l]%R%# '
postgres[1]=# \set PROMPT2 '%/[%l]%R%# '

postgres[1]=#
postgres[1]=# select
postgres[2]-# *
postgres[3]-# from
postgres[4]-# abc;
ERROR:  relation "abc" does not exist
LINE 4: abc;
^

Now I used \e to edit the source, deleted the last line (i.e. abc;),
returned to the prompt, landed at line 4, and typed abc;

postgres[1]=# \e
postgres[4]-# abc;
ERROR:  relation "abc" does not exist
LINE 5: abc;
^

In the above steps, the error message says LINE 5, whereas on the prompt abc is
at line 4.

postgres[1]=# select
*
from
abc;
ERROR:  relation "abc" does not exist
LINE 4: abc;
^

Here I again see the error at line 4. Something is fishy; please investigate.
It looks like a bug in the LINE numbering of the error message, though I'm not
sure. But it should be in sync with the prompt's line number feature.


B:
However, I see that you have removed the code changes related to INT_MAX.
Why?
I have set cur_line to INT_MAX - 2 and then observed that after 2 lines I
started getting negative numbers.

Thanks


On Sun, Jul 6, 2014 at 10:48 PM, Sawada Masahiko sawada.m...@gmail.com
wrote:

 On Fri, Jun 20, 2014 at 7:17 PM, Jeevan Chalke
 jeevan.cha...@enterprisedb.com wrote:
  Hi Sawada Masahiko,
 
  I liked this feature. So I have reviewed it.
 
  Changes are straight forward and looks perfect.
  No issues found with make/make install/initdb/regression.
 
  However I would suggest removing un-necessary braces at if, as we have
 only
  one statement into it.
 
  if (++cur_line >= INT_MAX)
  {
  cur_line = 1;
  }
 
  Also following looks wrong:
 
  postgres[1]=# select
  postgres[2]-# *
  postgres[3]-# from
  postgres[4]-# tab;
   a
  ---
  (0 rows)
 
  postgres[1]=# select
  *
  from
  tab
  postgres[2]-# where t  10;
  ERROR:  column t does not exist
  LINE 5: where t  10;
^
 
  Line number in ERROR is 5 which is correct.
  But line number in psql prompt is wrong.
 
  To get first 4 lines I have simply used up arrow followed by an enter for
  which I was expecting 5 in psql prompt.
  But NO it was 2 which is certainly wrong.
 
  Need to handle above carefully.
 
  Thanks
 
 
  On Thu, Jun 12, 2014 at 10:46 PM, Sawada Masahiko sawada.m...@gmail.com
 
  wrote:
 
  Hi all,
 
  The attached IWP patch is one prompt option for psql, which shows
  current line number.
  If the user made syntax error with too long SQL then psql outputs
  message as following.
 
  ERROR:  syntax error at or near "a"
  LINE 250: hoge
  ^
  psql teaches me where syntax error is occurred, but it is not kind
  when SQL is too long.
  We can use write SQL with ¥e(editor) command(e.g., emacs) , and we can
  know line number.
  but it would make terminal log dirty . It will make analyzing of log
  difficult after error is occurred.
  (I think that we usually use copy  paste)
 
  After attached this patch, we will be able to use %l option as prompting
  option.
 
  e.g.,
  $ cat ~/.psqlrc
  \set PROMPT2 '%/[%l]%R%# '
  \set PROMPT1 '%/[%l]%R%# '
  $ psql -d postgres
  postgres[1]=# select
  postgres[2]-# *
  postgres[3]-# from
  postgres[4]-# hoge;
 
  The past discussion is following.
 
  
 http://www.postgresql.org/message-id/cafj8prc1rupk6+cha1jpxph3us_zipabdovmceex4wpbp2k...@mail.gmail.com
 
 
  Please give me feedback.
 
  Regards,
 
  ---
  Sawada Masahiko
 
 
 
 
 
 
  --
  Jeevan B Chalke
  Principal Software Engineer, Product Development
  EnterpriseDB Corporation
  The Enterprise PostgreSQL Company
 
  Phone: +91 20 30589500
 
  Website: www.enterprisedb.com
  EnterpriseDB Blog: http://blogs.enterprisedb.com/
  Follow us on Twitter: http://www.twitter.com/enterprisedb
 

 Thank you for reviewing patch, and sorry for late response.

 I have updated this patch, and attached it.

  postgres[1]=# select
  *
  from
  tab
  postgres[2]-# where t  10;
  ERROR:  column t does not exist
  LINE 5: where t  10;
 Attached patch can handle this case.

 Please give me feedback.

 Regards,

 ---
 Sawada Masahiko




-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Allowing NOT IN to use ANTI joins

2014-07-02 Thread Jeevan Chalke
On Sun, Jun 29, 2014 at 4:18 PM, David Rowley dgrowle...@gmail.com wrote:

 I think I'm finally ready for a review again, so I'll update the
 commitfest app.


I have reviewed this at the code level.

1. Patch gets applied cleanly.
2. make/make install/make check all are fine

No issues found till now.

However some cosmetic points:

1.
 * The API of this function is identical to convert_ANY_sublink_to_join's,
 * except that we also support the case where the caller has found NOT
EXISTS,
 * so we need an additional input parameter under_not.

Since we now support NOT IN handling in convert_ANY_sublink_to_join(), we
have an under_not parameter there too. So the above comment near the
convert_EXISTS_sublink_to_join() function is NO longer valid.


2. The following is an unnecessary change and can be removed:

@@ -506,6 +560,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
return NULL;
}
}
+
}
/* Else return it unmodified */
return node;


3. Typos:

a.
+ * queryTargetListListCanHaveNulls
...
+queryTargetListCanHaveNulls(Query *query)

Function name is queryTargetListCanHaveNulls() not
queryTargetListListCanHaveNulls()

b.
 *For example the presense of; col IS NOT NULL, or col = 42 would
allow

presense = presence


4. get_attnotnull() throws an error for an invalid relid/attnum.
But I see other get_att* functions which do not throw an error but rather
return some invalid value, e.g. NULL in the case of attname, InvalidOid in
the case of atttype, and InvalidAttrNumber in the case of attnum. I have
observed that we cannot return such an invalid value for type boolean, and I
guess that's the reason you are throwing an error. But it somehow looks
unusual. You have put a note there, which is good. But yes, callers of this
function should be careful enough not to assume it works like the other
get_att*() functions.
However, for this patch I would simply accept this solution. But I wonder if
someone thinks the other way round.


Testing more at the SQL level.

However, assigning it back to the author to think over the above cosmetic issues.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] add line number as prompt option to psql

2014-06-20 Thread Jeevan Chalke
Hi Sawada Masahiko,

I liked this feature. So I have reviewed it.

The changes are straightforward and look perfect.
No issues found with make/make install/initdb/regression.

However, I would suggest removing the unnecessary braces at the if, as we have
only one statement in it.

if (++cur_line >= INT_MAX)
{
cur_line = 1;
}

Also following looks wrong:

postgres[1]=# select
postgres[2]-# *
postgres[3]-# from
postgres[4]-# tab;
 a
---
(0 rows)

postgres[1]=# select
*
from
tab
postgres[2]-# where t  10;
ERROR:  column t does not exist
LINE 5: where t  10;
  ^

The line number in the ERROR is 5, which is correct.
But the line number in the psql prompt is wrong.

To get the first 4 lines I simply used the up arrow followed by an enter, for
which I was expecting 5 in the psql prompt.
But NO, it was 2, which is certainly wrong.

The above needs to be handled carefully.

Thanks


On Thu, Jun 12, 2014 at 10:46 PM, Sawada Masahiko sawada.m...@gmail.com
wrote:

 Hi all,

 The attached IWP patch is one prompt option for psql, which shows
 current line number.
 If the user made syntax error with too long SQL then psql outputs
 message as following.

 ERROR:  syntax error at or near "a"
 LINE 250: hoge
 ^
 psql teaches me where syntax error is occurred, but it is not kind
 when SQL is too long.
 We can use write SQL with ¥e(editor) command(e.g., emacs) , and we can
 know line number.
 but it would make terminal log dirty . It will make analyzing of log
 difficult after error is occurred.
 (I think that we usually use copy  paste)

 After attached this patch, we will be able to use %l option as prompting
 option.

 e.g.,
 $ cat ~/.psqlrc
 \set PROMPT2 '%/[%l]%R%# '
 \set PROMPT1 '%/[%l]%R%# '
 $ psql -d postgres
 postgres[1]=# select
 postgres[2]-# *
 postgres[3]-# from
 postgres[4]-# hoge;

 The past discussion is following.
 
 http://www.postgresql.org/message-id/cafj8prc1rupk6+cha1jpxph3us_zipabdovmceex4wpbp2k...@mail.gmail.com
 

 Please give me feedback.

 Regards,

 ---
 Sawada Masahiko






-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 30589500

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb



Re: [HACKERS] pg_dump reporing version of server pg_dump as comments in the output

2014-06-17 Thread Jeevan Chalke
On Tue, Mar 4, 2014 at 11:28 AM, Wang, Jing ji...@fast.au.fujitsu.com
wrote:

 I don't buy your argument. Why isn't verbose option sufficient? Did you
 read the old thread about this [1]?
 [1] http://www.postgresql.org/message-id/3677.1253912...@sss.pgh.pa.us

 AFAICS a lot of people compare pg_dump diffs. If we apply this patch,
 it would break those applications.


Well, as already mentioned by Peter in the old thread above, "a diff of the
same database made by different (major) versions of pg_dump
will already be different in most situations, so adding the pg_dump version
number in it is essentially free from this perspective".

So I don't think there will be any issue with applications breaking.

Also, it is *already* available if
 you add verbose option (which is sufficient to satisfy those that want
 the client and/or
 server version) in plain mode (the other modes already include the
 desired info by default). In the past, timestamps were removed to avoid
 noise in diffs.


The verbose option does include timestamps, which will fail while comparing.
Also, verbose output has too many details which people may not be interested
in, but I guess many more people will clearly want to see the server and
client versions.

Thus I see this patch as useful and have therefore reviewed it.


 Sorry, I don't understand which application will break?  Can you give me
 more detail information?
 Timestamps always vary which do affect the diff.  But I can't imagine
 why adding the version will affect the pg_dump diffs.
 I don't think the version number vary  frequently.


Review comments:

1. The patch applies cleanly. make/make install/make check are all fine.
2. No issues found as such in my unit testing.
3. As Tom suggested,
pg_dump  foo is giving identical output to pg_dump -Fc | pg_restore  foo

Basically, only the code lines are re-arranged, and thus it has NO effect on
the system as such. So no issues found, and as per me it is good to go in.
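
To illustrate, with this patch a plain-mode dump now starts with the version
comments by default, something like (the version strings below are just
placeholders):

--
-- PostgreSQL database dump
--

-- Dumped from database version 9.5devel
-- Dumped by pg_dump version 9.5devel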

However following code chunk looks weird:

+   else
+   {
+   ahprintf(AH, "\n");
+   }

You need the extra line when NOT in verbose mode, but you don't need it when in
verbose mode. This is just because the dumpTimestamp() call in
verbose mode adds an extra line.
Can't we remove this else altogether and print the extra line unconditionally?
Also, can we remove the curly braces in the if and else near these code changes?
(The attached patch is along those lines.)

Anyone has any other views ?

Thanks

-- 
Jeevan B Chalke
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index f6fbf44..4aed1cb 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -353,16 +353,17 @@ RestoreArchive(Archive *AHX)
 
 	ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
 
+	if (AH->archiveRemoteVersion)
+		ahprintf(AH, "-- Dumped from database version %s\n",
+ AH->archiveRemoteVersion);
+	if (AH->archiveDumpVersion)
+		ahprintf(AH, "-- Dumped by pg_dump version %s\n",
+ AH->archiveDumpVersion);
+
 	if (AH->public.verbose)
-	{
-		if (AH->archiveRemoteVersion)
-			ahprintf(AH, "-- Dumped from database version %s\n",
-	 AH->archiveRemoteVersion);
-		if (AH->archiveDumpVersion)
-			ahprintf(AH, "-- Dumped by pg_dump version %s\n",
-	 AH->archiveDumpVersion);
 		dumpTimestamp(AH, "Started on", AH->createDate);
-	}
+
+	ahprintf(AH, "\n");
 
 	if (ropt->single_txn)
 	{


