Re: [HACKERS] PATCH: use foreign keys to improve join estimates v1

2015-09-24 Thread Tomas Vondra

Hi,

Thanks for the review and time spent on reworking the patch!

On 09/24/2015 07:41 AM, David Rowley wrote:

On 23 September 2015 at 17:11, David Rowley wrote:

find_foreign_key_clauses() should look for the longest match and
return a Bitmapset of the list indexes to the caller.
It might be possible to fool the longest-match logic by duplicating
clauses, e.g. a1 = b1 AND a1 = b1 AND a1 = b1 AND a2 = b2 AND a3 =
b3. I can't imagine that matters, but if it did, we could make the
code smart enough to see through that.


I took a bash at implementing what I described, and I've ended up with
the attached.

git diff --stat gives me:

  src/backend/optimizer/path/costsize.c      | 717 --
  src/backend/optimizer/plan/analyzejoins.c  |   1 +
  src/backend/optimizer/util/plancat.c       | 112 +++--
  3 files changed, 228 insertions(+), 602 deletions(-)

So it's removed quite a bit of code. I also discovered that 1.0 /
Max(rel->tuples,1.0) is no good for SEMI and ANTI joins. I've coded
around this in the attached, but I'm not certain it's the best way of
doing things.

I thought that it might be possible to add some regression test around
this: simply find a plan that uses a nested loop due to underestimation
of matching rows, and then make sure that it no longer uses a nested
loop once the foreign key is added. I've not yet done this in the
attached.

Patched attached in delta form and complete form.

I still need to perform more analysis on the plancat.c changes.

Have I made any changes that you disagree with?


I think the changes in general are a step in the right direction, but
I've noticed some differences in behavior - some of them may actually be
desirable, while others are bugs, I believe.



1) (rel)hasindex


I'm perfectly fine with getting rid of hasindex, it was mostly 
cargo-cult programming anyway.



2) find_best_match_foreign_key
--

I think the comment before the function needs rephrasing (seems a bit 
broken to me). I do like the approach in general, although it changes 
the semantics a bit. The original code only considered "fully matching" 
fkeys, while the new code simply takes the longest match.


Let me illustrate on a somewhat artificial example (fkjoin1.sql):

create table t1 (a int, b int, c int, d int,
                 unique(a,b), unique(a,b,c,d));

create table t2 (a int, b int, c int, d int,
                 foreign key (a,b) references t1(a,b),
                 foreign key (a,b,c,d) references t1(a,b,c,d));

insert into t1 select i, i, i, i from generate_series(1,1000000) s(i);
insert into t2 select i, i, i, i from generate_series(1,1000000) s(i);

Now, let's say a query joins the tables on (a,b,c). The original code
did this:

explain select * from t1 join t2 using (a,b,c);

                               QUERY PLAN
-------------------------------------------------------------------------
 Hash Join  (cost=37789.00..79094.01 rows=1 width=20)
   Hash Cond: ((t1.a = t2.a) AND (t1.b = t2.b) AND (t1.c = t2.c))
   ->  Seq Scan on t1  (cost=0.00..15406.00 rows=1000000 width=16)
   ->  Hash  (cost=15406.00..15406.00 rows=1000000 width=16)
         ->  Seq Scan on t2  (cost=0.00..15406.00 rows=1000000 width=16)

while the new code does this:

                               QUERY PLAN
-------------------------------------------------------------------------
 Hash Join  (cost=37789.00..79094.01 rows=1000000 width=20)
   Hash Cond: ((t1.a = t2.a) AND (t1.b = t2.b) AND (t1.c = t2.c))
   ->  Seq Scan on t1  (cost=0.00..15406.00 rows=1000000 width=16)
   ->  Hash  (cost=15406.00..15406.00 rows=1000000 width=16)
         ->  Seq Scan on t2  (cost=0.00..15406.00 rows=1000000 width=16)

This happens (I believe) because the new code only looks for the longest 
match, without the "full match" restriction. So while the old code finds 
(a,b)+c, the new code finds (a,b,c,d).


I'm not saying the new code is wrong - the "full match" restriction was 
there just to simplify v1 of the patch, because dealing with partial 
matches is a bit more complicated. So it's desirable to relax this 
condition.


It's also true that the new code works better in some cases, e.g. this

explain select * from t1 join t2 using (a,b,c,d);

                               QUERY PLAN
-------------------------------------------------------------------------
 Hash Join  (cost=40289.00..85344.01 rows=1000000 width=16)
   Hash Cond: ((t1.a = t2.a) AND (t1.b = t2.b) AND (t1.c = t2.c)
               AND (t1.d = t2.d))
   ->  Seq Scan on t1  (cost=0.00..15406.00 rows=1000000 width=16)
   ->  Hash  (cost=15406.00..15406.00 rows=1000000 width=16)
         ->  Seq Scan on t2  (cost=0.00..15406.00 rows=1000000 width=16)

was originally estimated like this

 QUERY PLAN

Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-24 Thread Gavin Flower


An example from a book on PostgreSQL server programming that I'm working
through (note that it is obviously awkward to write with gendered pronouns
when gender is irrelevant - see the "he she" in one place and "he/she"
in another!):


   "If the user is a superuser, then he she has permission to see the
   full query. If the user is a regular user, he/she will only see the
   full query for his/her queries."

Written in 'Gender Appropriate' style (only refer to gender when it is 
relevant):


   "If the user is a superuser, then they have permission to see the
   full query. If the user is a regular user, they will only see the
   full query for their queries."

I think the second version is easier to read - and in this case, shorter!


-Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Proposal] Table partition + join pushdown

2015-09-24 Thread Taiki Kondo
Hello, KaiGai-san.

Thank you for your comment, and sorry for late response.

The attached patch is completely rewritten from previous patch[1], at your 
suggestion[2].
Please find attached.


This patch contains the following implementation choices, but I can't determine
whether they are correct or wrong.

1. Cost estimation
In this patch, an additional row filter is implemented for Hash join, Merge join
and Nested Loop.
I implemented the cost estimation for this filter by following how other filters
are costed elsewhere, but I am not sure this implementation is correct.
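
For illustration, the usual pattern elsewhere in costsize.c for charging quals
looks roughly like this (untested sketch; filter_clauses and inner_path_rows
are placeholder names, not the patch's actual variables):

    /*
     * Untested sketch: charge for evaluating the pushed-down row filter,
     * following the usual qual-costing pattern in costsize.c.
     * "filter_clauses" and "inner_path_rows" are placeholder names.
     */
    QualCost    filter_cost;

    cost_qual_eval(&filter_cost, filter_clauses, root);
    startup_cost += filter_cost.startup;
    run_cost += filter_cost.per_tuple * inner_path_rows;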


2. Workaround for a failing assertion in allpaths.c
In standard_join_search(), we expect to have a single rel at the final level.
But this expectation is broken by the join pushdown feature, because it searches
for combinations not considered by the original standard_join_search() at the
final level. Therefore, once a join pushdown succeeds, the assertion in
allpaths.c fails.

So I implemented a workaround that temporarily sets root->join_rel_level to NULL
while trying join pushdown, but I think this implementation may be wrong.
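
In outline, the idea is something like this (untested sketch;
try_join_pushdown() and the parameter list are placeholder names, not the
actual patch):

    /*
     * Untested sketch of the described workaround: hide root->join_rel_level
     * while building the pushed-down join, so the extra joinrel is not
     * registered at the final join level (which is what trips the assertion
     * in allpaths.c).  try_join_pushdown() is an assumed entry-point name.
     */
    static void
    try_join_pushdown_guarded(PlannerInfo *root,
                              RelOptInfo *outer_rel,
                              RelOptInfo *inner_rel)
    {
        List   **save_join_rel_level = root->join_rel_level;

        root->join_rel_level = NULL;    /* keep build_join_rel() from registering it */
        try_join_pushdown(root, outer_rel, inner_rel);
        root->join_rel_level = save_join_rel_level;
    }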


3. Searching pathkeys for Merge Join
When the join pushdown feature chooses a merge join for the pushed-down join
operation, the planner fails to create the merge join node because it is unable
to find pathkeys for it. I found this is caused by child tables being skipped
when pathkeys are collected.

I expect that skipping is done to make the planner faster, so I changed it so
the planner does not skip child tables when finding pathkeys for a merge join.
But I am not sure this expectation is correct.


Any comments/suggestions are welcome.


Remarks :
[1] 
http://www.postgresql.org/message-id/12a9442fbae80d4e8953883e0b84e0885c0...@bpxm01gp.gisp.nec.co.jp
[2] 
http://www.postgresql.org/message-id/9a28c8860f777e439aa12e8aea7694f801134...@bpxm15gp.gisp.nec.co.jp

Best regards,
--
Taiki Kondo

NEC Solution Innovators, Ltd.


-Original Message-
From: Kaigai Kouhei(海外 浩平) [mailto:kai...@ak.jp.nec.com] 
Sent: Tuesday, August 18, 2015 5:47 PM
To: Kondo Taiki(近藤 太樹); pgsql-hackers@postgresql.org
Cc: Iwaasa Akio(岩浅 晃郎)
Subject: RE: [Proposal] Table partition + join pushdown

Hello Kondo-san,

I briefly checked your patch. Let me put some comments about its design and
implementation, even though I have no objections to its concept. :-)

* Construction of RelOptInfo

In your patch, try_hashjoin_pushdown(), called by try_hashjoin_path(), constructs
the RelOptInfo of the join-rel between the inner rel and a subpath of the Append
node. That is an entirely wrong implementation.

I can understand that we (may) have no RelOptInfo for the joinrel between
tbl_child_0 and other_table when the planner investigates a join path that joins
the Append path with other_table.

>   HashJoin
> -> Append
>   -> SeqScan on tbl_child_0
>   -> SeqScan on tbl_child_1
>   -> SeqScan on tbl_child_2
>   -> SeqScan on tbl_child_3
> -> Hash
>   -> SeqScan on other_table
>
How about these alternatives?

- Call make_join_rel() for the pair of tbl_child_X and other_table
  from try_hashjoin_pushdown() or somewhere similar. make_join_rel()
  internally constructs a RelOptInfo for the supplied pair of relations,
  so the relevant RelOptInfo will be properly constructed (see the
  sketch after this list).
- make_join_rel() also calls add_paths_to_joinrel() for all the join
  logic, so it makes it easier to support pushing down other join
  strategies, including nested-loop or custom-join.
- It may be an idea to add an extra argument to make_join_rel() to
  pass the expressions to be applied for tuple filtering during
  construction of the inner hash table.
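
In other words, something like this (untested sketch; child_rel and other_rel
are placeholder names for relations the pushdown code already has at hand):

    /*
     * Untested sketch of the first alternative: let make_join_rel() build
     * (or look up) the RelOptInfo for the child/other pair.  It internally
     * calls add_paths_to_joinrel() for every join strategy, so nested-loop
     * and custom joins come along for free.  Variable names are assumptions.
     */
    RelOptInfo *child_joinrel;

    child_joinrel = make_join_rel(root, child_rel, other_rel);
    if (child_joinrel == NULL)
        return;         /* join order restrictions prevented this join */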

* Why only SeqScan is supported

I think it is the role of the Hash node to filter out inner tuples obviously
unrelated to the join (when the CHECK constraint of the outer relation gives
that information), because this join-pushdown may then be able to support
multi-level pushdown.

For example, what if the planner considers a path to join this Append path with
another relation, and the join clause contains a reference to X?

>   Append
>    -> HashJoin
>         -> SeqScan on tbl_child_0
>         -> Hash ... Filter: hash_func(X) % 4 = 0
>              -> SeqScan on other_table
>    -> HashJoin
>         -> SeqScan on tbl_child_1
>         -> Hash ... Filter: hash_func(X) % 4 = 1
>              -> SeqScan on other_table
>    -> HashJoin
>         -> SeqScan on tbl_child_2
>         -> Hash ... Filter: hash_func(X) % 4 = 2
>              -> SeqScan on other_table
>    -> HashJoin
>         -> SeqScan on tbl_child_3
>         -> Hash ... Filter: hash_func(X) % 4 = 3
>              -> SeqScan on other_table
>
It may be a good challenge to consider additional join pushdown, even if 
subpaths of Append are HashJoin, not SeqScan, like:

   Append
    -> HashJoin
         -> HashJoin
              -> SeqScan on tbl_child_0
              -> Hash ... Filter: hash_func(X) % 4 = 0
                   -> SeqScan on other_table
         -> Hash ... Filter: hash_func(X) % 4 = 0
              -> SeqScan on another_table
    -> HashJoin
         -> HashJoin
              -> SeqScan on tbl_child_1
              -> Hash ... Filter: hash_func(X) % 4 = 

Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Tomas Vondra

Hi,

On 09/23/2015 11:25 PM, Tom Lane wrote:

Tomas Vondra  writes:

On 09/11/2015 07:16 PM, Robert Haas wrote:

On Fri, Sep 11, 2015 at 1:12 PM, Tomas Vondra
 wrote:

I'm arguing for fixing the existing bug, and then addressing the case of
over-estimation separately, with proper analysis.



Well, this is part of how we're looking it differently.  I think the
bug is "we're passing a value to palloc that is too large, so
sometimes it fails" and the way to fix that is to properly limit the
value.  You are clearly defining the bug a bit differently.



Yes, I see it differently.



I don't quite understand why limiting the value is more "proper" than
using a function that can handle the actual value.



The proposed bugfix addresses the issue in the most straightforward way,
without introducing additional considerations about possible
over-estimations (which the current code completely ignores, so this is
a new thing). I think bugfixes should not introduce such changes to
behavior (albeit internal), especially not without any numbers.


This thread seems to have stalled out...

After re-reading it, I'm going to agree with Robert that we should
clamp the initial pointer-array size to work with palloc (ie, 512MB
worth of pointers, which please recall is going to represent at
least 10X that much in hashtable entries, probably more).


10x that much in entries? Do you mean NTUP_PER_BUCKET? Because that was 
reduced to 1 last year as part of the hashjoin improvements. So we do 
have more buckets than tuples (to keep load factor below 1.0). It's 
still true the entries will need more space than buckets (because of 
headers and such), but it may easily get well below 10x.


In the example reported by KaiGai-san, the entries are 8B wide (~24B
with header), while buckets are 8B. That's a 1:3 ratio. It is a bit of an
extreme example, because in other cases the tuples will be wider.


It also seems to me that the higher the ratio, the lower the need to
actually impose such a limit, because it increases the natural pressure
keeping the number of buckets down (as both buckets and entries need to
share a work_mem of limited size).



The argument that allowing it to be larger would be a performance win
seems entirely unbased on any evidence, while the risks of allowing
arbitrarily large allocations based on planner estimates seem pretty
obvious to me.


Do we agree that resizing the hash table is not free? Because my
argument was that we're forcing the well-estimated cases to do an
additional resize, so maybe we should measure the impact first.

Now, maybe it does not really matter in this case - we probably get
slightly inaccurate estimates all the time. Not terribly wrong, but
enough to make the initial number of buckets too low, so we may end up
doing the resize anyway. Also, if we're dealing with hash tables of
this size, we're probably dealing with a much larger outer relation, and
the additional resize is going to be just noise ...


I however quite dislike the dismissal of the possible impact. It should 
be the responsibility of the person introducing the change to show that 
no such impact actually exists, not just waving it off as "unbased on 
any evidence" when there's no evidence presented.


Had I been adding such a limit, I'd have done at least some testing and
presented the results here. Perhaps I'm a bit over-protective of this piece of
code as I've spent quite a bit of time getting it faster, but I believe
the principle that the person proposing a change should demonstrate the
(lack of) performance impact is sound.



And the argument that such overestimates are a bug that we can easily
fix is based on even less evidence; in fact, I would dismiss that out
of hand as hubris.


I don't think anyone was suggesting the overestimates are easy to fix 
(or even possible to fix in general). If I ever claimed that, I was 
clearly wrong.




Now there is a separate issue about whether we should allow hashtable
resizes to exceed that limit.  There I would vote yes, as long as the
resize is based on arrival of actual data and not estimates (following
Robert's point that the existing uses of repalloc_huge are driven by
actual need).

So I don't like any of the proposed patches exactly as-is.

BTW, just looking at the code in question, it seems to be desperately
in need of editorial review.  A couple of examples:

* Some of the code goes to the trouble of maintaining a
log2_nbuckets_optimal value, but ExecHashIncreaseNumBuckets doesn't
know about that and recomputes the value.


Yeah, that's a stupid bug.



* ExecHashIncreaseNumBuckets starts out with a run-time test on
something that its sole caller has just Assert()'d to not be the
case, and which really ought to be impossible with or without that
Assert.


Yeah, right. Excessively defensive coding on my side (I think I've added 
the Assert later, or something).




* ExecHashTableInsert believes it can only increase 

Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-24 Thread Geoff Winkless
On 24 September 2015 at 11:33, Gavin Flower 
wrote:

> An example from a book on PostgreSQL server programming that I'm working
> through (Note that it is obviously awkward to write with gender pronouns
> when gender is irrelevant, note the "he she" in one place and "he/she" in
> another!):
>
>"If the user is a superuser, then they have permission to see the
>full query. If the user is a regular user, they will only see the
>full query for their queries."
>

Can I quietly suggest "Users with superuser permissions can always see the
full query, while regular users will only see the full query for their own
queries."?

Geoff


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-24 Thread Gavin Flower

On 24/09/15 22:41, Geoff Winkless wrote:
On 24 September 2015 at 11:33, Gavin Flower wrote:


An example from a book on PostgreSQL server programming that I'm
working through (Note that it is obviously awkward to write with
gender pronouns when gender is irrelevant, note the "he she" in
one place and "he/she" in another!):

   "If the user is a superuser, then they have permission to see the
   full query. If the user is a regular user, they will only see the
   full query for their queries."

Can I quietly suggest "Users with superuser permissions can always see
the full query, while regular users will only see the full query for
their own queries."?


Geoff

By all means say it quietly!  :-)

But I was simply trying to change it into a Gender Appropriate form, rather
than improve it in other aspects.


However, your rephrasing is better still!


-Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> Josh Berkus wrote:
> > When we discussed this 8 years ago, Debian said debbugs wasn't ready for
> > anyone else to use.  Has that changed?
> 
> Emacs uses debbugs now, so there's at least one non-Debian user.

Oh?  Nice.  debbugs matches up largely with what Thomas Munro was just
suggesting regarding the workflow also.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Robert Haas
On Thu, Sep 24, 2015 at 5:50 AM, Tomas Vondra
 wrote:
> I however quite dislike the dismissal of the possible impact. It should be
> the responsibility of the person introducing the change to show that no such
> impact actually exists, not just waving it off as "unbased on any evidence"
> when there's no evidence presented.

So, we're talking about determining the behavior in a case that
currently fails.  Making it behave like a case that currently works
can't but be an improvement.  Making it do something that currently
never happens might be better still, or it might be equivalent, or it
might be worse.  I just don't buy the argument that somebody's got to
justify on performance grounds a decision not to allocate more memory
than we currently ever allocate.  That seems 100% backwards to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Stephen Frost
* Thomas Kellerer (spam_ea...@gmx.net) wrote:
> >  And email integration for Jira is nonexistent.
> 
> That is not true. We do have an email integration where customers can create
> issues by sending an email to a specific "Jira Email" address. And as far as
> I know this is a standard module from Atlassian. I _think_ it can also be
> configured that you can reply to the notification emails and the reply will
> be added as a comment to the issue.

I've tried to make email integration work also and it's really painful.
I agree it exists but it's not nearly as good as debbugs.

> One thing that is indeed still missing is a Git integration the way the
> Subversion integration works. Jira scans the commit messages for ticket
> numbers and automatically links an issue to the commits. 

Not hard to do w/ debbugs.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Calculage avg. width when operator = is missing

2015-09-24 Thread Shulgin, Oleksandr
On Thu, Sep 24, 2015 at 12:30 AM, Tom Lane  wrote:

> Alvaro Herrera  writes:
> > Tom Lane wrote:
> >> Personally I think Alvaro's position is unduly conservative: to the
> extent
> >> that plans change it'd likely be for the better.  But I'm not excited
> >> enough to fight hard about it.
>
> > I don't really care enough.  We have received some complaints about
> > keeping plans stable, but maybe it's okay.
>
> The other side of the coin is that there haven't been so many requests for
> changing this; more than just this one, but not a groundswell.  So 9.5
> only seems like a good compromise unless we get more votes for back-patch.
>
> I reviewed the patch and concluded that it would be better to split
> compute_minimal_stats into two functions instead of sprinkling it so
> liberally with if's.  So I did that and pushed it.
>

Thanks, I was not really happy about all the checks because some of them
were rather implicit (e.g. num_mcv being 0 due to track being NULL, etc.).
Adding this as a separate function makes me feel safer.

--
Alex


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Thomas Kellerer
>  And email integration for Jira is nonexistent.

That is not true. We do have an email integration where customers can create
issues by sending an email to a specific "Jira Email" address. And as far as
I know this is a standard module from Atlassian. I _think_ it can also be
configured that you can reply to the notification emails and the reply will
be added as a comment to the issue.

> like that you can't have more than one person assigned to a bug

That is true, there is only one "Assignee" for an issue - the person who is
responsible for it. 

But given the flexibility of Jira I'm pretty sure one could configure an
additional field (e.g. "is being worked on by") that can be assigned to
multiple users.

One thing that is indeed still missing is a Git integration the way the
Subversion integration works. Jira scans the commit messages for ticket
numbers and automatically links an issue to the commits. 

Thomas

(Note that I'm not in any way affiliated with Atlassian - just to avoid that
impression)



--
View this message in context: 
http://postgresql.nabble.com/No-Issue-Tracker-Say-it-Ain-t-So-tp5867020p5867093.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TEXT vs VARCHAR join qual push down diffrence, bug or expected?

2015-09-24 Thread Jeevan Chalke
On Wed, Sep 23, 2015 at 7:29 PM, Tom Lane  wrote:

> Removing that entirely would be quite incorrect, because then you'd be
> lying to the parent node about what collation your node outputs.
>

Yes. I too thought so and thus wanted to fix that code block by
considering the default collation.


>
> After thinking a bit more about the existing special case for non-foreign
> Vars, I wonder if what we should do is change these code blocks to look
> like
>
> collation = r->resultcollid;
> if (collation == InvalidOid)
> state = FDW_COLLATE_NONE;
> else if (inner_cxt.state == FDW_COLLATE_SAFE &&
>  collation == inner_cxt.collation)
> state = FDW_COLLATE_SAFE;
> +   else if (collation == DEFAULT_COLLATION_OID)
> +   state = FDW_COLLATE_NONE;
> else
> state = FDW_COLLATE_UNSAFE;
>
> That is, only explicit introduction of a non-default collation causes
> a subexpression to get labeled FDW_COLLATE_UNSAFE.  Default collations
> would lose out when getting merged with a nondefault collation from a
> foreign Var, so they should work all right.
>

Agree.
I had suggested a similar change in approach (2),
but you put that check at the exact required place.


> regards, tom lane
>



-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] TEXT vs VARCHAR join qual push down diffrence, bug or expected?

2015-09-24 Thread Jeevan Chalke
On Wed, Sep 23, 2015 at 10:15 PM, Tom Lane  wrote:

> I wrote:
> > Hm ... actually, we probably need *both* types of changes if that's
> > what we believe the state values mean.
>
>
I too was confused by the state explanations in the code comments we have now.
Your explanation here clears that up.
Thanks for explaining those.

> After a bit more thinking and experimentation, I propose the attached
> patch.
>

I had a look over the patch and reviewed it. It is in an excellent state to
be checked in.


> regards, tom lane
>
>

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Torsten Zuehlsdorff


On 23.09.2015 20:43, Robert Haas wrote:

On Wed, Sep 23, 2015 at 2:30 PM, Kam Lasater  wrote:

Thanks for the suggestion. However, an issue tracker is not a
replacement for mailing list(s) and vice versa. They are both
necessary for success.


I venture to say that we are succeeding as it is, although of course
we might have more success if we did some things better, including
this.  However, as Stephen says, the problem with an issue tracker is
that, unless some person or persons are committed to keeping it up to date,
it would just fill up with crap. We have an issue tracker for database
server issues here at EnterpriseDB, and keeping it up to date is a ton
of work.  If nobody's volunteering to do that work in the PostgreSQL
community, an issue tracker is going to end up not being useful,
because it will just be wrong all the time.


I would volunteer to do that work if the community decides to get a
bug tracker.


Greetings,
Torsten


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Torsten Zuehlsdorff


On 24.09.2015 01:33, Josh Berkus wrote:

On 09/23/2015 03:05 PM, Jim Nasby wrote:

On 9/23/15 3:12 PM, Thomas Kellerer wrote:

They also support Postgres as their backend (and you do find hints
here and
there
that it is the recommended open source DBMS for them - but they don't
explicitly state it like that). We are using Jira at the company I
work for
and
all Jira installations run on Postgres there.


I'll second Jira as well. It's the only issue tracker I've seen that you
can actually use for multiple different things without it becoming a
mess. IE: it could track Postgres bugs, infrastructure issues, and the
TODO list if we wanted, allow issues to reference each other
intelligently, yet still keep them as 3 separate bodies.


Speaking as someone who uses Jira for commercial work, I'm -1 on them.


Full ACK. -1 from me too.

Greetings,
Torsten


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-09-24 Thread Masahiko Sawada
On Fri, Sep 11, 2015 at 10:15 AM, Michael Paquier
 wrote:
> On Fri, Sep 11, 2015 at 3:41 AM, Beena Emerson  
> wrote:
>> Please find attached the WIP patch for the proposed feature. It is built
>> based on the already discussed design.
>>
>> Changes made:
>> - add new parameter "sync_file" to provide the location of the pg_syncinfo
>> file. The default is 'ConfigDir/pg_syncinfo.conf', same as for pg_hba and
>> pg_ident file.
>
> I am not sure that's really necessary. We could just hardcode its location.
>
>> - pg_syncinfo file will hold the sync rep information in the approved JSON
>> format.
>
> OK. Have you considered as well the approach to add support for
> multi-line GUC parameters? This has been mentioned a couple of time
> above as well, with something like that I imagine:
> param = 'value1,' \
> 'value2,' \
> 'value3'
> and this reads as 'value1,value2,value3'. This would benefit as well
> for other parameters.
>

I agree with adding support for multi-line GUC parameters.
But I thought it would be:
param = 'param1,
param2,
param3'

This reads as 'param1,param2,param3'.

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Robert Haas
On Thu, Sep 24, 2015 at 1:25 AM, Thomas Munro
 wrote:
> The two most common interactions could go something like this:
>
> 1.  User enters bug report via form, creating an issue in NEW state
> and creating a pgsql-bugs thread.  Someone responds by email that this
> is expected behaviour, not a bug, not worth fixing or not a Postgres
> issue etc using special trigger words.  The state is automatically
> switched to WORKS_AS_DESIGNED or WONT_FIX.  No need to touch the web
> interface: the only change from today's workflow is awareness of the
> right wording to trigger the state change.
>
> 2.  User enters bug report via form, creating issue #1234 in NEW
> state.   Someone responds by email to acknowledge that that may indeed
> be an issue, and any response to an issue in NEW state that doesn't
> reject it switches it to UNDER_DISCUSSION.  Maybe if a commitfest item
> references the same thread (or somehow references the issue number?)
> its state is changed to IN_COMMITFEST, or maybe as you say there could
> be a way to generate the commitfest item from the issue, not sure
> about that.  Eventually a commit log message says "Fixes bug #1234"
> and the state automatically goes to FIXED.
>
> Other interactions (curation of unresolved issues, reopening disputed
> issues, resolving issues where the committer or responder forgot to
> use the right magic words, etc etc) could be done via the web
> interface which would initially have only a couple of pages: one for
> paging through issues in different states and one for viewing/editing
> them.  (Obviously you could go further and deal with assigning issues
> to people, and adding more states etc etc).
>
> I don't know much about pgweb and friends but it seems like we already
> have a bunch of Python machinery processing all mailing list traffic,
> a database and a webapp doing something pretty closely related.  How
> hard would it be to teach it to track issues this way, building on the
> existing infrastructure, compared to rolling out a new separate
> product, and could the result be better integrated?

I think all this sounds pretty cool, frankly.  The astute among you
will have picked up on the fact that bug-trackers are not my absolute
favorite piece of technology, but it seems like something of this sort
could be a significant advance over the status quo.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Tomas Vondra



On 09/24/2015 01:51 PM, Robert Haas wrote:

On Thu, Sep 24, 2015 at 5:50 AM, Tomas Vondra
 wrote:

I however quite dislike the dismissal of the possible impact. It should be
the responsibility of the person introducing the change to show that no such
impact actually exists, not just waving it off as "unbased on any evidence"
when there's no evidence presented.


So, we're talking about determining the behavior in a case that
currently fails. Making it behave like a case that currently works
can't but be an improvement. Making it do something that currently
never happens might be better still, or it might be equivalent, or
it might be worse. I just don't buy the argument that somebody's got
to justify on performance grounds a decision not to allocate more
memory than we currently ever allocate. That seems 100% backwards to
me.


Yes, it's true that if you hit the issue it fails, so I understand your
view that it's a win to fix this by introducing the (arbitrary) limit. I
disagree with this view because the behavior changes abruptly at the limit -
if you get a good estimate just below the limit, you get no resize, while a
slightly higher estimate gets you a resize.


So while it does not introduce a behavior change in this particular case
(because it currently fails, as you point out), it introduces a behavior
change in general - it simply triggers behavior that does not happen below
the limit. Would we accept the change if the proposed limit were 256MB, for
example?


It also seems to me that we don't really need the hash table until after 
MultiExecHash(), so maybe building the hash table incrementally is just 
unnecessary and we could simply track the optimal number of buckets and 
build the buckets at the end of MultiExecHash (essentially at the place 
where we do the resize now). We'd have to walk the tuples and insert 
them into the buckets, but that seems more efficient than the 
incremental build (no data to support that at this point).
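
Something like this, essentially reusing the chunk walk that
ExecHashIncreaseNumBuckets() already does (untested sketch; the function name
is made up and this is not a tested patch):

    /*
     * Untested sketch: build the bucket array once, at the end of
     * MultiExecHash(), instead of maintaining it incrementally.  The chunk
     * walk mirrors ExecHashIncreaseNumBuckets(); the function name is an
     * assumption.
     */
    static void
    ExecHashBuildBuckets(HashJoinTable hashtable)
    {
        HashMemoryChunk chunk;

        hashtable->buckets = (HashJoinTuple *)
            palloc0(hashtable->nbuckets * sizeof(HashJoinTuple));

        /* walk all tuples stored in the dense-allocated memory chunks */
        for (chunk = hashtable->chunks; chunk != NULL; chunk = chunk->next)
        {
            size_t      idx = 0;

            while (idx < chunk->used)
            {
                HashJoinTuple hashTuple = (HashJoinTuple) (chunk->data + idx);
                int         bucketno;
                int         batchno;

                ExecHashGetBucketAndBatch(hashtable, hashTuple->hashvalue,
                                          &bucketno, &batchno);

                /* link the tuple into its (final) bucket */
                hashTuple->next = hashtable->buckets[bucketno];
                hashtable->buckets[bucketno] = hashTuple;

                idx += MAXALIGN(HJTUPLE_OVERHEAD +
                                HJTUPLE_MINTUPLE(hashTuple)->t_len);
            }
        }
    }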



regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Add pages deleted from pending list to FSM

2015-09-24 Thread Fujii Masao
On Sat, Sep 19, 2015 at 5:20 AM, Jeff Janes  wrote:
> On Fri, Sep 18, 2015 at 6:27 AM, Tom Lane  wrote:
>>
>> [ moving thread to -hackers ]
>>
>> Fujii Masao  writes:
>> > So autoanalyze still doesn't call IndexFreeSpaceMapVacuum().
>> > That is, only backend can clean the list in INSERT-only workload.
>> > I don't think that this is desirable. Because the backend will
>> > periodically take a big performance penalty.
>
>
> Calling IndexFreeSpaceMapVacuum is only need to reuse the space freed up by
> cleaning the pending list.
>
> The list is still getting cleaned and truncated by autoanalyze, it is just
> that the space is not getting marked for reuse.
>
> When I wrote this, my thought was that a table vacuum is going to call
> IndexFreeSpaceMapVacuum() from another place in its code and so should not
> call it here as well.  I neglected to consider the autoanalyze case.
>
>
>>
>> > So I'm thinking that even autoanalyze should call
>> > IndexFreeSpaceMapVacuum()
>> > to clean the list in a background, in order to avoid such spikes in
>> > INSERT response time. Thought?
>
>
> But it already is cleaning the list.  If that space is not marked as
> reusable, that causes index bloat, but doesn't cause latency spikes.

Yep, you're right. The problem here is an index bloat.

>> It seems quite bizarre for auto-analyze to do that.  auto-vacuum, sure,
>> but I do not think this should get plugged into ANALYZE.
>
>
> It may be odd that autoanalyze cleans the list at all (which this patch
> doesn't change), but given that it does clean the list I don't see why it
> would be bizarre to make the cleaned-up space re-usable.

Agreed. It's odd to clean up the list but not mark the pages as re-usable.
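
To illustrate the direction (untested sketch; whether the (auto)analyze path
should do this at all is exactly the open question, and freed_blkno is a
made-up name):

    /*
     * Untested sketch only: record each page freed from the GIN pending list
     * and then vacuum the index FSM so the freed space becomes reusable.
     */
    RecordFreeIndexPage(index, freed_blkno);    /* once per freed page */
    IndexFreeSpaceMapVacuum(index);             /* publish upper-level FSM info */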

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-09-24 Thread Masahiko Sawada
On Wed, Sep 23, 2015 at 12:24 AM, Syed, Rahila  wrote:
> Hello,
>
> Please find attached patch with bugs reported by Thom and Sawada-san solved.
>
>>* The progress of vacuum by autovacuum seems not to be displayed.
> The progress is stored in shared variables during autovacuum. I guess the 
> reason they are not visible is that the entries are deleted as soon as the 
> process exits.
> But the progress can be viewed while autovacuum worker is running.
>

Thank you for updating the patch.

I tested the latest version patch.
The followings are my review comments and questions.

* pg_stat_vacuum_progress should have the OID of the relation being vacuumed.

When we run "VACUUM;", all tables of the current database will be vacuumed.
So pg_stat_vacuum_progress should include these OIDs in order to show
which table is being vacuumed now.

* The progress_message variable in PgBackendStatus is not used at all.
IIRC, the progress_message variable is supposed to be set to a description
of the current processing.

* The progress of VACUUM FULL seems wrong.
When I run VACUUM FULL for a table, I got following progress.

postgres(1)=# select * from pg_stat_vacuum_progress ;
-[ RECORD 1 ]---+--
pid | 19190
total_pages | 1
scanned_pages   | 1
total_heap_pages| 1
scanned_heap_pages  | 1
total_index_pages   |
scanned_index_pages |
percent_complete| 100

The table being vacuumed is 400MB, so it's not a 1-page table.

* The vacuum run by autovacuum is not displayed.
I tested this by executing the following queries in a row,
but the vacuum run by autovacuum is not displayed:

postgres(1)=# select datname, pid, backend_start, query, state from pg_stat_activity ;
 datname  |  pid  |         backend_start         |                                   query                                   | state
----------+-------+-------------------------------+----------------------------------------------------------------------------+--------
 postgres | 20123 | 2015-09-24 17:44:26.467021+09 | autovacuum: VACUUM ANALYZE public.hoge                                     | active
 postgres | 19779 | 2015-09-24 17:42:31.57918+09  | select datname, pid, backend_start, query, state from pg_stat_activity ;   | active
(3 rows)

postgres(1)=# select * from pg_stat_vacuum_progress ;
 pid | total_pages | scanned_pages | total_heap_pages | scanned_heap_pages | total_index_pages | scanned_index_pages | percent_complete
-----+-------------+---------------+------------------+--------------------+-------------------+---------------------+------------------
(0 rows)

postgres(1)=# select datname, pid, backend_start, query, state from pg_stat_activity ;
 datname  |  pid  |         backend_start         |                                   query                                   | state
----------+-------+-------------------------------+----------------------------------------------------------------------------+--------
 postgres | 20123 | 2015-09-24 17:44:26.467021+09 | autovacuum: VACUUM ANALYZE public.hoge                                     | active
 postgres | 19779 | 2015-09-24 17:42:31.57918+09  | select datname, pid, backend_start, query, state from pg_stat_activity ;   | active
(3 rows)

The vacuuming of the hoge table took about 2 minutes, but the progress of
the vacuum is never displayed.
Could you check this in your environment?

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5: Can't connect with PGSSLMODE=require on Windows

2015-09-24 Thread Thom Brown
On 23 September 2015 at 13:10, Michael Paquier
 wrote:
>
>
> On Wed, Sep 23, 2015 at 2:15 AM, Robert Haas  wrote:
>>
>> On Tue, Sep 22, 2015 at 11:23 AM, Andrew Dunstan 
>> wrote:
>> > "git bisect" is your friend.
>>
>> Yeah, but finding someone who has a working Windows build environment
>> and a lot of time to run this down is my enemy.  We're trying, but if
>> anyone else has a clue, that would be much appreciated.
>
>
> That's not cool. I have added this problem in the list of open items for
> 9.5.

It appears that this might be related to the version of OpenSSL that's
been packaged with PostgreSQL 9.5 alpha 2.  When swapping this out for
the version that's shipped with 9.4, it works.  I don't have the
specific OpenSSL versions to hand, but I'll report back as I learn more.

-- 
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_resetxlog sentences

2015-09-24 Thread Fujii Masao
On Fri, Sep 18, 2015 at 1:58 AM, Euler Taveira  wrote:
> On 17-09-2015 00:25, Fujii Masao wrote:
>>
>> One relevant question is; why doesn't pg_controldata report
>> newestCommitTs?
>>
> I thought about it while looking at the code but forgot to ask. AFAICS it is
> an oversight. See attached patch.

Applied. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Andrew Dunstan



On 09/24/2015 10:28 AM, k...@rice.edu wrote:

On Wed, Sep 23, 2015 at 04:33:33PM -0700, Josh Berkus wrote:

On 09/23/2015 03:05 PM, Jim Nasby wrote:

On 9/23/15 3:12 PM, Thomas Kellerer wrote:

They also support Postgres as their backend (and you do find hints
here and
there
that it is the recommended open source DBMS for them - but they don't
explicitly state it like that). We are using Jira at the company I
work for
and
all Jira installations run on Postgres there.

I'll second Jira as well. It's the only issue tracker I've seen that you
can actually use for multiple different things without it becoming a
mess. IE: it could track Postgres bugs, infrastructure issues, and the
TODO list if we wanted, allow issues to reference each other
intelligently, yet still keep them as 3 separate bodies.

Speaking as someone who uses Jira for commercial work, I'm -1 on them.
  I simply don't find Jira to be superior to OSS BT systems, and inferior
in several ways (like that you can't have more than one person assigned
to a bug).  And email integration for Jira is nonexistent.

When we discussed this 8 years ago, Debian said debbugs wasn't ready for
anyone else to use.  Has that changed?


I do not think using a commercial system is a good idea. Currently, Jira
is free for open-source, but there is no guarantee. That could change at
anytime and result in possibly an expensive license cost or port to another
system. We use Jira/Confluence and the random loss of support for various
plugins caused by forced security-based upgrades has resulted in a lot of
unexpected work to maintain the system.





+1

Regardless of the quality of any non-OSS tracker, about which I have no 
comment, I firmly believe that as an OSS project we should use OSS 
infrastructure.


About 10 years ago I helped get Bugzilla over the hurdle of database 
mono-culturism (basically by coming up with the initial version of this: 
). 
Part of my motivation was to have a tracker to support the PostgreSQL 
project that would run on PostgreSQL. We can see how well that worked 
out :-)


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread k...@rice.edu
On Wed, Sep 23, 2015 at 04:33:33PM -0700, Josh Berkus wrote:
> On 09/23/2015 03:05 PM, Jim Nasby wrote:
> > On 9/23/15 3:12 PM, Thomas Kellerer wrote:
> >> They also support Postgres as their backend (and you do find hints
> >> here and
> >> there
> >> that it is the recommended open source DBMS for them - but they don't
> >> explicitly state it like that). We are using Jira at the company I
> >> work for
> >> and
> >> all Jira installations run on Postgres there.
> > 
> > I'll second Jira as well. It's the only issue tracker I've seen that you
> > can actually use for multiple different things without it becoming a
> > mess. IE: it could track Postgres bugs, infrastructure issues, and the
> > TODO list if we wanted, allow issues to reference each other
> > intelligently, yet still keep them as 3 separate bodies.
> 
> Speaking as someone who uses Jira for commercial work, I'm -1 on them.
>  I simply don't find Jira to be superior to OSS BT systems, and inferior
> in several ways (like that you can't have more than one person assigned
> to a bug).  And email integration for Jira is nonexistent.
> 
> When we discussed this 8 years ago, Debian said debbugs wasn't ready for
> anyone else to use.  Has that changed?
> 

I do not think using a commercial system is a good idea. Currently, Jira
is free for open-source, but there is no guarantee. That could change at
anytime and result in possibly an expensive license cost or port to another
system. We use Jira/Confluence and the random loss of support for various
plugins caused by forced security-based upgrades has resulted in a lot of
unexpected work to maintain the system.

Regards,
Ken


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Robert Haas
On Thu, Sep 24, 2015 at 9:49 AM, Tomas Vondra
 wrote:
> So while it does not introduce behavior change in this particular case
> (because it fails, as you point out), it introduces a behavior change in
> general - it simply triggers behavior that does not happen below the limit.
> Would we accept the change if the proposed limit was 256MB, for example?

So, I'm a huge fan of arbitrary limits.

That's probably the single thing I'll say this year that sounds most
like a troll, but it isn't.  I really, honestly believe that.
Doubling things is very sensible when they are small, but at some
point it ceases to be sensible.  The fact that we can't set a
black-and-white threshold as to when we've crossed over that line
doesn't mean that there is no line.  We can't imagine that the
occasional 32GB allocation when 4GB would have been optimal is no more
problematic than the occasional 32MB allocation when 4MB would have
been optimal.  Where exactly to put the divider is subjective, but
"what palloc will take" is not an obviously unreasonable barometer.

Of course, if we can postpone sizing the hash table until after the
input size is known, as you suggest, then that would be better still
(but not back-patch material).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clearing opfuncid vs. parallel query

2015-09-24 Thread Robert Haas
On Wed, Sep 23, 2015 at 7:25 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Sep 23, 2015 at 5:39 PM, Tom Lane  wrote:
>>> Yeah, though I think of that as a longer-term issue, ie we could clean it
>>> up sometime later.
>
>> So, you're thinking of something as simple as the attached?
>
> Right.  I'll make a mental to-do to see about getting rid of set_opfuncid
> later.

Cool, thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clearing opfuncid vs. parallel query

2015-09-24 Thread Alvaro Herrera
Tom Lane wrote:

> To my mind though, the lack of an ALTER OPERATOR SET FUNCTION command
> is on par with our very limited ability to alter the contents of
> an operator class.  In principle it would be nice, but the practical
> value is so small that it's not surprising it hasn't been done ---
> and we shouldn't continue to hold the door open for a simple way of
> implementing it when there are significant costs to doing so.

I think allowing an operator's implementation function to change would
be rather problematic, would it not?  There's no way to know whether the
semantic changes to stored rules would make sense, not least because the
person running ALTER OPERATOR wouldn't know (== has no easy way to find
out) what is being changed in the first place.

To me, it looks like we should just not allow ALTER OPERATOR SET FUNCTION
to be implemented at all.

It's not like changing an operator's implementation is an oft-requested
feature anyway.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Tom Lane
Robert Haas  writes:
> Of course, if we can postpone sizing the hash table until after the
> input size is known, as you suggest, then that would be better still
> (but not back-patch material).

AFAICS, it works that way today as long as the hash fits in memory
(ie, single-batch).  We load into a possibly seriously undersized hash
table, but that won't matter for performance until we start probing it.
At the conclusion of loading, MultiExecHash will call
ExecHashIncreaseNumBuckets which will re-hash into a better-sized hash
table.  I doubt this can be improved on much.

It would be good if we could adjust the numbuckets choice at the
conclusion of the input phase for the multi-batch case as well.
The code appears to believe that wouldn't work, but I'm not sure if
it's right about that, or how hard it'd be to fix if so.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics

2015-09-24 Thread Andres Freund
On 2015-09-15 20:16:10 +0300, YUriy Zhuravlev wrote:
> We will be tested.

Did you have a chance to run some benchmarks?

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics

2015-09-24 Thread Alexander Korotkov
On Thu, Sep 24, 2015 at 6:32 PM, Andres Freund  wrote:

> On 2015-09-15 20:16:10 +0300, YUriy Zhuravlev wrote:
> > We will be tested.
>
> Did you have a chance to run some benchmarks?
>

Yes, we now have a 60-physical-core Intel server and we're running
benchmarks on it.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] clearing opfuncid vs. parallel query

2015-09-24 Thread Andres Freund
On 2015-09-23 17:29:50 -0400, Robert Haas wrote:
> Well, I can't vouch for what any human being on earth has thought
> about over a twenty-year period.  It's not intrinsically unreasonable
> in my mind to want to alter an operator to point at a different
> procedure.

Wouldn't we use plan invalidation to deal with that anyway?

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clearing opfuncid vs. parallel query

2015-09-24 Thread Tom Lane
Andres Freund  writes:
> On 2015-09-23 17:29:50 -0400, Robert Haas wrote:
>> Well, I can't vouch for what any human being on earth has thought
>> about over a twenty-year period.  It's not intrinsically unreasonable
>> in my mind to want to alter an operator to point at a different
>> procedure.

> Wouldn't we use plan invalidation to deal with that anyway?

Plan invalidation wouldn't help, because the obsolete data exists
on-disk in stored rules.  You'd have to run through the pg_rewrite
entries and update them.

To my mind though, the lack of an ALTER OPERATOR SET FUNCTION command
is on par with our very limited ability to alter the contents of
an operator class.  In principle it would be nice, but the practical
value is so small that it's not surprising it hasn't been done ---
and we shouldn't continue to hold the door open for a simple way of
implementing it when there are significant costs to doing so.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clearing opfuncid vs. parallel query

2015-09-24 Thread Robert Haas
On Thu, Sep 24, 2015 at 11:54 AM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2015-09-23 17:29:50 -0400, Robert Haas wrote:
>>> Well, I can't vouch for what any human being on earth has thought
>>> about over a twenty-year period.  It's not intrinsically unreasonable
>>> in my mind to want to alter an operator to point at a different
>>> procedure.
>
>> Wouldn't we use plan invalidation to deal with that anyway?
>
> Plan invalidation wouldn't help, because the obsolete data exists
> on-disk in stored rules.  You'd have to run through the pg_rewrite
> entries and update them.
>
> To my mind though, the lack of an ALTER OPERATOR SET FUNCTION command
> is on par with our very limited ability to alter the contents of
> an operator class.  In principle it would be nice, but the practical
> value is so small that it's not surprising it hasn't been done ---
> and we shouldn't continue to hold the door open for a simple way of
> implementing it when there are significant costs to doing so.

Also, it's not like this change couldn't be UN-done at a future point.
I mean, Tom didn't like the flag I added aesthetically, but if we
needed it, we could have it.  Or we could engineer something else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ON CONFLICT issues around whole row vars,

2015-09-24 Thread Andres Freund
On 2015-09-19 18:40:14 -0700, Peter Geoghegan wrote:
> > An actually trivial, although not all that pretty, fix is to simply
> > accept wholerow references in fix_join_expr_mutator(), even if not in
> > the targetlist. As far as I can see the problem right now really can
> > only be hit for whole row references.
> 
> I am concerned about the risk of adding bugs to unrelated code paths
> that this could create.

How? This is a must-not-reach code path currently?

Stuff I want to fix by tomorrow:
* Whole row var references to exclude
* wrong offsets for columns after dropped ones
* INSTEAD DO UPDATE for tables with oids

Do you know of anything else?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Lower *_freeze_max_age minimum values.

2015-09-24 Thread Alvaro Herrera
Cc'ing -hackers.

Andres Freund wrote:
> On 2015-09-24 10:37:33 -0400, Tom Lane wrote:
> > Andres Freund  writes:

> > Should this patch not have also touched the per-table limits in
> > reloptions.c?
> 
> Hm. I guess that'd make sense. It's not really related to the goal of
> making it realistic to test multixact/clog truncation, but it's less
> confusing if consistent.

Yeah, agreed.

> > and I found places in create_table.sgml that claim these variables can be
> > set to zero.  You didn't break that with this patch, but it's still wrong.
> 
> Seems to have been "broken" back in 834a6da4f7 - the old table based
> approach doesn't seem to have imposed lower limits. I'm not really sure
> whether making the limits consistent and updating the docs or removing
> them altogether is the better approach.

I'm surprised the error has survived this long.  Without checking I
can't say what's the best solution either, but I would opt for
documenting the limits we have -- if we want to change them back to 0 I
say that merits its own discussion.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clearing opfuncid vs. parallel query

2015-09-24 Thread Tom Lane
Alvaro Herrera  writes:
> I think allowing an operator's implementation function to change would
> be rather problematic, would it not?  There's no way to know whether the
> semantic changes to stored rules would make sense, not least because the
> person running ALTER OPERATOR wouldn't know (== has no easy way to find
> out) what is being changed in the first place.

> To me, it looks like we should just not allow ALTER OPERATOR SET FUNCTION
> to be implemented at all.

> It's not like changing an operator's implementation is an oft-requested
> feature anyway.

Well, the point is that usually anything you want in this line can be
accomplished by executing CREATE OR REPLACE FUNCTION on the operator's
function.  It's up to you that that doesn't create any interesting
semantic incompatibilities.  That would still be true for an ALTER
OPERATOR SET FUNCTION command: if you break it, you get to keep both
pieces.  But the availability of that alternative really cuts down
on the plausible use-cases for ALTER OPERATOR SET FUNCTION.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clearing opfuncid vs. parallel query

2015-09-24 Thread Tom Lane
Robert Haas  writes:
> Also, it's not like this change couldn't be UN-done at a future point.
> I mean, Tom didn't like the flag I added aesthetically, but if we
> needed it, we could have it.  Or we could engineer something else.

For the record: that's true for the patch you just committed.  But once
I remove the hopefully-now-dead planner support for recomputing opfuncid,
it would get a lot more painful to reverse the decision.

regards, tom lane




Re: [HACKERS] multivariate statistics / patch v7

2015-09-24 Thread Josh Berkus
Tomas,

> attached is v7 of the multivariate stats patch. The main improvement is
> major refactoring of the clausesel.c portion - splitting the awfully
> long spaghetti-style functions into smaller pieces, making it much more
> understandable etc.

So presumably v7 handles varlena attributes as well, yes?   I have a
destruction test case for correlated column stats, so I'd like to test
your patch on it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Tomas Vondra



On 09/24/2015 05:09 PM, Robert Haas wrote:

On Thu, Sep 24, 2015 at 9:49 AM, Tomas Vondra
 wrote:

So while it does not introduce behavior change in this particular
case (because it fails, as you point out), it introduces a behavior
change in general - it simply triggers behavior that does not
happen below the limit. Would we accept the change if the proposed
limit was 256MB, for example?


So, I'm a huge fan of arbitrary limits.

That's probably the single thing I'll say this year that sounds most
 like a troll, but it isn't. I really, honestly believe that.
Doubling things is very sensible when they are small, but at some
point it ceases to be sensible. The fact that we can't set a
black-and-white threshold as to when we've crossed over that line
doesn't mean that there is no line. We can't imagine that the
occasional 32GB allocation when 4GB would have been optimal is no
more problematic than the occasional 32MB allocation when 4MB would
have been optimal. Where exactly to put the divider is subjective,
but "what palloc will take" is not an obviously unreasonable
barometer.


There are two machines - one with 32GB of RAM and work_mem=2GB, the 
other one with 256GB of RAM and work_mem=16GB. The machines are hosting 
about the same data, just scaled accordingly (~8x more data on the large 
machine).


Let's assume there's a significant over-estimate - we expect to get 
about 10x the actual number of tuples, and the hash table is expected to 
almost exactly fill work_mem. Using the 1:3 ratio (as in the query at 
the beginning of this thread) we'll use ~512MB and ~4GB for the buckets, 
and the rest is for entries.


Thanks to the 10x over-estimate, ~64MB and 512MB would be enough for the 
buckets, so we're wasting ~448MB (13% of RAM) on the small machine and 
~3.5GB (~1.3%) on the large machine.


How does it make any sense to address the 1.3% and not the 13%?



Of course, if we can postpone sizing the hash table until after the
input size is known, as you suggest, then that would be better still
 (but not back-patch material).


This dynamic resize is 9.5-only anyway.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] clearing opfuncid vs. parallel query

2015-09-24 Thread Robert Haas
On Thu, Sep 24, 2015 at 12:35 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Also, it's not like this change couldn't be UN-done at a future point.
>> I mean, Tom didn't like the flag I added aesthetically, but if we
>> needed it, we could have it.  Or we could engineer something else.
>
> For the record: that's true for the patch you just committed.  But once
> I remove the hopefully-now-dead planner support for recomputing opfuncid,
> it would get a lot more painful to reverse the decision.

True.  I think it's pretty wacky that we store the opfuncid in the
tree at all.  If somebody were to propose adding a dependent value of
that sort to a node type that didn't already have it, I suspect either
you or I would do our best to shoot that down.  The only possible
argument for having that in there at all is that the performance gains
from so doing are so large that we have no choice but to sacrifice a
principle to expediency.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Stephen Frost
All,

* Stephen Frost (sfr...@snowman.net) wrote:
> Not saying it's perfect, of course, but it's probably the best option
> for minimizing impact on our existing process.

I discussed the current state of debbugs with Don Armstrong (one of the
main individuals behind it) and his opinion is that debbugs could
certainly work.  Further, he's been working to add support for
PostgreSQL to it, which would certainly be nice.

* Josh Berkus (j...@agliodbs.com) wrote:
> I adore "Toggle useless messages" as a feature.  ;-)

Ditto. :)

There are a few questions regarding just how it would work and what the
structure would be, but my thinking is that we'd handle it in much the
same way that Debian already does, therefore, without thinking about it
too much, I imagine we'd have:

'source' packages which map up to individual git repos from
git.postgresql.org.

'binary' packages (for the 'postgres' source package; other source
packages can make their own decisions) which map up to each binary we
have (psql, pg_dump, etc).  That's a bit more granular than Debian does
but I don't think that's a bad thing.

I'm sure there are a number of other bits regarding the setup that need
to be considered and how it integrates with our existing mailing lists,
etc, etc, some of which will probably involve discussion with Don as we
work through them (based on my sub-1-hour response from him today, I
don't anticipate that being an issue), but the next big question is:

Are there any objections to pginfra standing up bugs.postgresql.org with
debbugs?  Obviously, it'd be more-or-less beta as we play with it, and
we could set it up as beta-bugs.p.o, if there's concern about that.

Thanks!

Stephen




Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Josh Berkus
On 09/24/2015 10:24 AM, Stephen Frost wrote:
> * Joe Conway (m...@joeconway.com) wrote:
>> On 09/24/2015 10:08 AM, Stephen Frost wrote:
>>> debbugs does most of the above by default, no programming needed...  I'm
>>> sure we could get it to integrate with the commitfest and have a git
>>> commit hook which sends the appropriate email to it also.
>>>
>>> That the emacs folks are using it makes me *much* more interested in the
>>> idea of getting debbugs up and running..
>>
>> I'm not familiar with debbugs myself, but given that description it
>> sounds to me like it would be worth giving it a try.
> 
> It started out as Debian's bug tracking system, but apparently others
> are using it now also.
> 
> Here's an example.
> 
> The main view for a particular package:
> 
> https://bugs.debian.org/cgi-bin/pkgreport.cgi?pkg=util-linux;dist=unstable
> 
> A specific bug:
> 
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=786804

I adore "Toggle useless messages" as a feature.  ;-)


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Tomas Vondra



On 09/24/2015 07:42 PM, Robert Haas wrote:

On Thu, Sep 24, 2015 at 12:40 PM, Tomas Vondra
 wrote:

There are two machines - one with 32GB of RAM and work_mem=2GB, the other
one with 256GB of RAM and work_mem=16GB. The machines are hosting about the
same data, just scaled accordingly (~8x more data on the large machine).

Let's assume there's a significant over-estimate - we expect to get about
10x the actual number of tuples, and the hash table is expected to almost
exactly fill work_mem. Using the 1:3 ratio (as in the query at the beginning
of this thread) we'll use ~512MB and ~4GB for the buckets, and the rest is
for entries.

Thanks to the 10x over-estimate, ~64MB and 512MB would be enough for the
buckets, so we're wasting ~448MB (13% of RAM) on the small machine and
~3.5GB (~1.3%) on the large machine.

How does it make any sense to address the 1.3% and not the 13%?


One of us is confused, because from here it seems like 448MB is 1.3%
of 32GB, not 13%.


Meh, you're right - I got the math wrong. It's 1.3% in both cases.

However the question still stands - why should we handle the 
over-estimate in one case and not the other? We're wasting the same 
fraction of memory in both cases.
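
To make the arithmetic easy to re-check, here is a small standalone sketch 
(mine, not from the thread or any patch) that just redoes the numbers above, 
assuming the 1:3 bucket-to-entry split of work_mem and the bucket array 
shrinking by 8x once the 10x over-estimate is rounded down to a power of two:

#include <stdio.h>

/* wasted bucket space as a fraction of RAM, per the scenario above */
static void
waste(const char *label, double ram_gb, double work_mem_gb)
{
	double	buckets_gb = work_mem_gb / 4.0;	/* 1:3 split: 1/4 of work_mem for buckets */
	double	needed_gb = buckets_gb / 8.0;	/* 10x over-estimate ~ 8x after rounding */
	double	wasted_gb = buckets_gb - needed_gb;

	printf("%s: %.2f GB wasted, %.1f%% of RAM\n",
		   label, wasted_gb, 100.0 * wasted_gb / ram_gb);
}

int
main(void)
{
	waste("small machine", 32.0, 2.0);		/* ~0.44 GB, ~1.4% */
	waste("large machine", 256.0, 16.0);	/* ~3.50 GB, ~1.4% */
	return 0;
}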


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Ryan Pedela
Kam Lasater wrote:
> I'd suggest: Github Issues, Pivotal Tracker or Redmine (probably in
> that order). There are tens to hundreds of other great ones out there,
> I'm sure one of them would also work.

Why not just use Github issues?

1. You can set it up to send emails to the list when an issue is created or
updated.
2. Replies to the email will automatically update the issue on Github.
3. Github is where most of the OSS activity happens now.
4. If Github goes away, so what? The core workflow never changes.

Thanks,
Ryan Pedela


[HACKERS] Rename withCheckOptions to insertedCheckClauses

2015-09-24 Thread Stephen Frost
All,

and this totally should have gone to -hackers instead, sorry about that.
Please ignore the one to -committers, if possible.

Tom, all,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> My vote would be to rename and reposition the field in HEAD and 9.5
> and accept the corresponding initdb.  We already forced an initdb
> since alpha2, so it's basically free as far as testers are concerned,
> and keeping 9.5 in sync with HEAD in this area seems like a really
> good idea for awhile to come.

Alright, attached is an attempt at doing these renames.  I went a bit
farther since it seemed to make sense to me (at least at the time, I'm
wondering a bit about it now), so, please provide any thoughts or
feedback you have regarding these changes.

Thanks!

Stephen
From 579dcfb80ffb922cabf10590219de221bb12267a Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Thu, 24 Sep 2015 10:21:42 -0400
Subject: [PATCH] Rename withCheckOptions to insertedCheckClauses

CHECK clauses which are generated from the WITH CHECK OPTION being
specified on a view or from RLS WITH CHECK policies are always inserted
by the rewriter and therefore don't ever need to be persisted out on
disk for a given query.

This modifies outfuncs to no longer output the rewriter-added nodes for
a Query or a ModifyTable, and readfuncs to no longer read them in.

Further, rename 'withCheckOptions' in Query and ModifyTable to
'insertedCheckClauses', ExecWithCheckOption to ExecCheckClauses,
ri_WithCheckOptions and ri_WithCheckOptionExprs to
ri_InsertedCheckClauses and ri_InsertedCheckExprs, respectively, all to
make it clear that these are the check clauses which were added by the
rewriter, not the actual WITH CHECK OPTION indication for a view (which
is stored in reloptions for the view) nor the WITH CHECK expressions for
a policy (which are stored in pg_policy).

Requires a catversion bump.

Per discussion with Tom and Alvaro.

Backpatch to 9.5 since we're still in alpha.
---
 src/backend/executor/execMain.c   | 16 +--
 src/backend/executor/nodeModifyTable.c| 45 +++
 src/backend/nodes/copyfuncs.c |  4 +--
 src/backend/nodes/equalfuncs.c|  2 +-
 src/backend/nodes/nodeFuncs.c |  4 +--
 src/backend/nodes/outfuncs.c  |  2 --
 src/backend/nodes/readfuncs.c | 21 ---
 src/backend/optimizer/plan/createplan.c   |  8 +++---
 src/backend/optimizer/plan/planner.c  | 42 +
 src/backend/optimizer/plan/setrefs.c  |  4 +--
 src/backend/optimizer/prep/prepsecurity.c |  2 +-
 src/backend/rewrite/rewriteHandler.c  | 37 -
 src/backend/rewrite/rowsecurity.c | 28 +--
 src/include/executor/executor.h   |  2 +-
 src/include/nodes/execnodes.h |  8 +++---
 src/include/nodes/parsenodes.h|  8 --
 src/include/nodes/plannodes.h |  2 +-
 src/include/optimizer/planmain.h  |  2 +-
 src/include/rewrite/rowsecurity.h |  2 +-
 19 files changed, 114 insertions(+), 125 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 85ff46b..d344b09 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1738,17 +1738,17 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
 }
 
 /*
- * ExecWithCheckOptions -- check that tuple satisfies any WITH CHECK OPTIONs
+ * ExecCheckClauses -- check that tuple satisfies any inserted CHECK clauses,
  * of the specified kind.
  *
  * Note that this needs to be called multiple times to ensure that all kinds of
- * WITH CHECK OPTIONs are handled (both those from views which have the WITH
- * CHECK OPTION set and from row level security policies).  See ExecInsert()
- * and ExecUpdate().
+ * CHECK clauses are handled (both those from views which have the WITH CHECK
+ * OPTION set and from row level security policies).  See ExecInsert() and
+ * ExecUpdate().
  */
 void
-ExecWithCheckOptions(WCOKind kind, ResultRelInfo *resultRelInfo,
-	 TupleTableSlot *slot, EState *estate)
+ExecCheckClauses(WCOKind kind, ResultRelInfo *resultRelInfo,
+ TupleTableSlot *slot, EState *estate)
 {
 	Relation	rel = resultRelInfo->ri_RelationDesc;
 	TupleDesc	tupdesc = RelationGetDescr(rel);
@@ -1766,8 +1766,8 @@ ExecWithCheckOptions(WCOKind kind, ResultRelInfo *resultRelInfo,
 	econtext->ecxt_scantuple = slot;
 
 	/* Check each of the constraints */
-	forboth(l1, resultRelInfo->ri_WithCheckOptions,
-			l2, resultRelInfo->ri_WithCheckOptionExprs)
+	forboth(l1, resultRelInfo->ri_InsertedCheckClauses,
+			l2, resultRelInfo->ri_InsertedCheckClauseExprs)
 	{
 		WithCheckOption *wco = (WithCheckOption *) lfirst(l1);
 		ExprState  *wcoExpr = (ExprState *) lfirst(l2);
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 1ef76d0..a310299 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ 

Re: [HACKERS] Rename withCheckOptions to insertedCheckClauses

2015-09-24 Thread Tom Lane
Stephen Frost  writes:
> Alright, attached is an attempt at doing these renames.  I went a bit
> farther since it seemed to make sense to me (at least at the time, I'm
> wondering a bit about it now), so, please provide any thoughts or
> feedback you have regarding these changes.

> Further, rename 'withCheckOptions' in Query and ModifyTable to
> 'insertedCheckClauses', ExecWithCheckOption to ExecCheckClauses,
> ri_WithCheckOptions and ri_WithCheckOptionExprs to
> ri_InsertedCheckClauses and ri_InsertedCheckExprs, respectively, all to
> make it clear that these are the check clauses which were added by the
> rewriter, not the actual WITH CHECK OPTION indication for a view (which
> is stored in reloptions for the view) nor the WITH CHECK expressions for

That commitlog entry doesn't seem to quite square with the patch,
wherein I see

- *		WithCheckOptions			list of WithCheckOption's to be checked
- *		WithCheckOptionExprs		list of WithCheckOption expr states
+ *		InsertedCheckClauses		list of WithCheckOption's to be checked
+ *		InsertedCheckClauseExprs	list of WithCheckOption expr states

-   List   *ri_WithCheckOptions;
-   List   *ri_WithCheckOptionExprs;
+   List   *ri_InsertedCheckClauses;
+   List   *ri_InsertedCheckClauseExprs;

The distinction between a "clause" and an "expr" is not very obvious,
and certainly most other places in the code use those terms pretty
interchangeably, so I find both the old and new names unclear here.
How about ri_InsertedCheckClauseStates instead for the second list?
And similarly if you're using "Expr" to mean ExprState anywhere else.

regards, tom lane




Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > Are there any objections to pginfra standing up bugs.postgresql.org with
> > debbugs?  Obviously, it'd be more-or-less beta as we play with it, and
> > we could set it up as beta-bugs.p.o, if there's concern about that.
> 
> I agree with the idea that we don't yet want to give the impression that
> this is the official bug tracker.  However, "beta-bugs" could give the
> impression that it was specifically for bugs about 9.5beta, without
> dispelling the idea that it is official.  Maybe "bugs-test.p.o"?

Works for me.

> (But let's not spend too much time bikeshedding the site name.)

Agreed.

Thanks!

Stephen




Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Tomas Vondra



On 09/24/2015 07:04 PM, Tom Lane wrote:

Tomas Vondra  writes:

But what about computing the number of expected batches, but always
start executing assuming no batching? And only if we actually fill
work_mem, we start batching and use the expected number of batches?


Hmm. You would likely be doing the initial data load with a "too
small" numbuckets for single-batch behavior, but if you successfully
loaded all the data then you could resize the table at little
penalty. So yeah, that sounds like a promising approach for cases
where the initial rowcount estimate is far above reality.


I don't understand the comment about "too small" numbuckets - isn't 
doing that the whole point of using the proposed limit? The batching is 
merely a consequence of how bad the over-estimate is.



But I kinda thought we did this already, actually.


I don't think so - I believe we haven't modified this aspect at all. It 
may not have been as pressing thanks to NTUP_PER_BUCKET=10 in the past.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Stephen Frost
* Joe Conway (m...@joeconway.com) wrote:
> On 09/24/2015 10:08 AM, Stephen Frost wrote:
> > debbugs does most of the above by default, no programming needed...  I'm
> > sure we could get it to integrate with the commitfest and have a git
> > commit hook which sends the appropriate email to it also.
> > 
> > That the emacs folks are using it makes me *much* more interested in the
> > idea of getting debbugs up and running..
> 
> I'm not familiar with debbugs myself, but given that description it
> sounds to me like it would be worth giving it a try.

It started out as Debian's bug tracking system, but apparently others
are using it now also.

Here's an example.

The main view for a particular package:

https://bugs.debian.org/cgi-bin/pkgreport.cgi?pkg=util-linux;dist=unstable

A specific bug:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=786804

Note that the interface for working with the bug tracker is primarily
email, while everything is available for display over the web.  There is
an interface at the bottom of the 'main' package page for submitting a
bug and, iirc, there's a system-wide bug submission form somewhere too.

You'll notice that, at the bottom of the bug referenced above, is a
message which is auto-generated from a Debian Developer uploading a new
version that included 'Closes: #786804' in the changelog.  Having
something similar work for commits wouldn't be hard at all.

Including the bug's email address (all bugs have their own email
address) on the thread is also an excellent way to keep track of all of
the discussion which happened around a certain bug, even through subject
changes or whatever.

Not saying it's perfect, of course, but it's probably the best option
for minimizing impact on our existing process.

Thanks!

Stephen




Re: [HACKERS] clearing opfuncid vs. parallel query

2015-09-24 Thread Tom Lane
Robert Haas  writes:
> On Thu, Sep 24, 2015 at 12:35 PM, Tom Lane  wrote:
>> For the record: that's true for the patch you just committed.  But once
>> I remove the hopefully-now-dead planner support for recomputing opfuncid,
>> it would get a lot more painful to reverse the decision.

> True.  I think it's pretty wacky that we store the opfuncid in the
> tree at all.  If somebody were to propose adding a dependent value of
> that sort to a node type that didn't already have it, I suspect either
> you or I would do our best to shoot that down.  The only possible
> argument for having that in there at all is that the performance gains
> from so doing are so large that we have no choice but to sacrifice a
> principle to expediency.

Hm.  It might be interesting to try removing those node fields altogether
and see what it costs us.  Setting the field at parse time is zero-cost,
at least in the primary code path through make_op(), because we have our
hands on the pg_operator row at that time anyway.  (I have a vague
recollection that that was once not true, but it certainly is true now
and has been for a long time.)  Not having it would cost us one syscache
lookup per operator at execution start, and maybe more than that if there
are additional places that would need the function OID, which there
probably are --- volatility checks in the planner come to mind.  But
I'm not sure how significant that would be.  There are certainly plenty
of syscache lookups being done at plan startup anyway.
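
For a concrete picture of the cost in question, here is a minimal sketch 
(mine, not from any patch) of what executor startup would have to do if the 
field were removed; get_opcode() is the existing lsyscache helper that does 
the pg_operator syscache lookup:

#include "postgres.h"
#include "nodes/primnodes.h"
#include "utils/lsyscache.h"

/*
 * Resolve the implementation function of an OpExpr.  Today opfuncid is
 * filled in at parse/plan time, so this is a plain field read; without
 * the field, every operator node would cost one syscache lookup here.
 */
static Oid
opexpr_function(OpExpr *opexpr)
{
	if (OidIsValid(opexpr->opfuncid))
		return opexpr->opfuncid;
	return get_opcode(opexpr->opno);
}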

But this is mostly moot unless someone is actively planning to try to
implement ALTER OPERATOR SET FUNCTION.

regards, tom lane




Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Tom Lane
Stephen Frost  writes:
> Are there any objections to pginfra standing up bugs.postgresql.org with
> debbugs?  Obviously, it'd be more-or-less beta as we play with it, and
> we could set it up as beta-bugs.p.o, if there's concern about that.

I agree with the idea that we don't yet want to give the impression that
this is the official bug tracker.  However, "beta-bugs" could give the
impression that it was specifically for bugs about 9.5beta, without
dispelling the idea that it is official.  Maybe "bugs-test.p.o"?

(But let's not spend too much time bikeshedding the site name.)

regards, tom lane




Re: [HACKERS] row_security GUC, BYPASSRLS

2015-09-24 Thread Noah Misch
On Tue, Sep 22, 2015 at 10:38:53AM -0400, Stephen Frost wrote:
> * Noah Misch (n...@leadboat.com) wrote:
> > On Mon, Sep 21, 2015 at 09:30:15AM -0400, Stephen Frost wrote:
> > > One item which wasn't discussed, that I recall, is just how it will work
> > > without SECURITY_ROW_LEVEL_DISABLED, or something similar, to
> > > differentiate when internal referencial integrity queries are being run,
> > > which should still bypass RLS (even in the FORCE ROW SECURITY case), and
> > > when regular or SECURITY DEFINER originated queries are being run.
> > 
> > If the table owner enables FORCE ROW SECURITY, policies will affect
> > referential integrity queries.  Choose policies accordingly.  For example,
> > given only ON UPDATE NO ACTION constraints, it would be no problem to set
> > owner-affecting policies for INSERT, UPDATE and/or DELETE.
> 
> Perhaps I'm not following correctly, but the above doesn't look correct
> to me.  An ON UPDATE NO ACTION constraint would run a query against the
> referring table (which has FORCE ROW SECURITY set, perhaps by mistake
> after a debugging session of the owner, with a policy that does not
> allow any records to be seen by the owner), fail to find any rows, and
> conclude that no error needs to be thrown, resulting in the referring
> table having records which refer to keys in the referred-to table that
> no longer exist (the UPDATE having changed them).

Yes, the table owner could define policies that thwart referential integrity.

> This would lead to trivial to cause (and likely hard to diagnose)
> referential integrity data corruption issues.  I find that a very hard
> pill to swallow in favor of a theoretical concern that new code may open
> avenues of exploit for a new security context mode to bypass RLS when
> doing internal referential integrity checks.  Further, it changes this
> additional capability, which was agreed would be added to offset removal
> of the 'row_security = force' option, from useful to downright
> dangerous.

In schema reviews, I will raise a red flag for use of this feature; the best
designs will instead use additional roles.  I forecast that PostgreSQL would
fare better with no owner-constrained-by-RLS capability.  Even so, others want
it, and FORCE ROW SECURITY would deliver it with an acceptable risk profile.
"SET row_security=force" was too risky, and not in a way particular to foreign
key constraints, because the session user chose row_security=force independent
of object owners.  With FORCE ROW SECURITY, each table owner would make both
decisions.  A foreign key constraint, plus a SELECT policy hiding rows from
the table owner, plus FORCE ROW SECURITY is one example of self-contradictory
policy design.  That example is unexceptional amidst the countless ways a
table owner can get security policy wrong.

SECURITY_ROW_LEVEL_DISABLED could have been okay.  I removed an incomplete
implementation (e.g. didn't affect CASCADE constraints).  Writing a full one
would be a mammoth job, and for what?  Setting the wrong SELECT policies can
disrupt _any_ application logic; no foreign keys or FORCE ROW SECURITY need be
involved.  Protecting just foreign keys brings some value, but it will not
materially reduce the vigilance demanded of RLS policy authors and reviewers.

nm




Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Greg Stark
On Thu, Sep 24, 2015 at 6:16 PM, Tom Lane  wrote:
> Yeah; "let's write our own bug tracker" is a good way to make sure nothing
> comes of this.  On the other hand, "let's get all the Postgres hackers to
> change their habits" is an equally good way to make sure nothing comes of
> this.


I think the way to achieve something is to set a limited scope and
agree what we're going to use the tool to accomplish. If some people
want to drive the entire development workflow off the tool and others
are using it to track issues and others to track bugs then the three
groups will all end up dissatisfied and give up.

My personal feeling is that we should use it basically like a
spreadsheet listing the current pending bugs that we don't want to
forget about. Generally with Postgres there are maybe a half-dozen
such bugs at any time that are being actively worked on and perhaps a
few dozen other bugs that we consider wishlist items that live on
people's todo lists.

What we should NOT do is

a) drive the entire workflow off a trouble ticketing system which is
what tools like Jira are intended for. That will be frustrating
because some people will spend a lot of effort entering information
and then get annoyed that others are ignoring it and working on what
they want. There's no need to mark bug "owners" or keep track of
"milestone targets" to track who is going to do what when about the
bug -- the goal is just that we don't forget about it.

b) try to keep a massive database to search through of every user
report that ever came in. This seems to be what Bugzilla is designed
for -- the default screen doesn't show all the bug reports instead it
has lots of tools for tagging and curating your bugs and searching
through them. Every bugzilla I'm familiar with is full of thousands of
bugs that get ignored until the release they were reported on gets
EOL'd and lots of effort goes into curating this list.

I think those two antipatterns are what most people are afraid of in
this community. I think Debbugs was the best culture fit in that it
isn't designed for either of these two usage models and is email
driven. I think github issues might be a good alternative though.

-- 
greg




[HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Feng Tian
Hi,

Here is an extension for 64 and 128 bit decimal types using IEEE decimal
floating point.  The original idea/implementation is from
http://pgxn.org/dist/pgdecimal/1.0.0/   Original thread for discussion is at

http://www.postgresql.org/message-id/CAFj8pRApakE6s-H2yJcXD=ubpukwa6i7rx4vuvtb4puhga5...@mail.gmail.com

I reimplemented 64/128 bits instead of 32/64 bits.  The code uses the decNumber
library instead of GCC's _Decimal64/128.   Also added more operators.

Compared to the numeric type, decimal64 arithmetic is about 2x faster and
decimal128 is about 1.5x faster.  However, the cast between decimal and
float4/8 is implemented rather naively and is slow.   As always, it depends on
the workload: decimal may take more or less space, and may be slower if the
cast is performed frequently.

Again, thanks to the original author okbob (Pavel).   Enjoy.

Thanks,
Feng


Re: [HACKERS] PGXS "check" target forcing an install ?

2015-09-24 Thread Noah Misch
On Fri, Jun 26, 2015 at 09:09:15AM -0400, Robert Haas wrote:
> On Tue, Jun 23, 2015 at 1:31 AM, Michael Paquier  
> wrote:
> >> I tracked the dangerous -rf to come from Makefile.global and it's empty
> >> string being due to abs_top_builddir not being define in my own Makefile.
> >> But beside that, which I can probably fix, it doesn't sound correct
> >> that a "check" rule insists in finding an "install" rule.
> >
> > Oops, this is a regression, and a dangerous one indeed. This is caused
> > by dcae5fac.
> >
> > One fix is to use NO_TEMP_INSTALL=yes in Makefile.global in the
> > context of PGXS, like in the patch attached, this variable needing to
> > be set before Makefile.global is loaded.

This seems reasonable in concept, though the patch's addition is off-topic in
a section marked "# Support for VPATH builds".  However, ...

> Gulp.  I certainly agree that emitting rm -rf /tmp_install is a scary
> thing for a PostgreSQL Makefile to be doing.  Fortunately, people
> aren't likely to have a directory under / by that name, and maybe not
> permissions on it even if they did, but all the same it's not good.  I
> propose trying to guard against that a bit more explicitly, as in the
> attached.

... agreed.

> --- a/src/Makefile.global.in
> +++ b/src/Makefile.global.in
> @@ -304,12 +304,14 @@ check: temp-install
>  .PHONY: temp-install
>  temp-install:
>  ifndef NO_TEMP_INSTALL
> +ifneq ($(abs_top_builddir),)
>  ifeq ($(MAKELEVEL),0)
>   rm -rf '$(abs_top_builddir)'/tmp_install
>   $(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install 
> install
>  endif
>   $(if $(EXTRA_INSTALL),for extra in $(EXTRA_INSTALL); do $(MAKE) -C 
> '$(top_builddir)'/$$extra DESTDIR='$(abs_top_builddir)'/tmp_install install 
> || exit; done)
>  endif
> +endif

With this in place, there's no need for the NO_TEMP_INSTALL change.




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-09-24 Thread Jim Nasby

On 9/24/15 7:37 AM, Masahiko Sawada wrote:

* The progress of VACUUM FULL seems wrong.
When I run VACUUM FULL for a table, I got following progress.


It never occurred to me that this patch was attempting to measure the 
progress of a CLUSTER (aka VACUUM FULL). I'm not sure that's such a 
great idea, as the progress estimation presumably needs to be 
significantly different.


More to the point, you can't estimate a CLUSTER unless you can estimate 
the progress of an index build. That'd be a cool feature to have as 
well, but it seems like a bad idea to mix that in with this patch.


Keep in mind that running a VACUUM FULL is presumably a LOT less common 
than regular vacuums, so I don't think leaving it out for now is that 
big a deal.



* The vacuum by autovacuum is not displayed.
I tested this by executing the following queries in a row,
but the vacuum by autovacuum is not displayed,


IIRC this is the second problem related to autovacuum... is there some 
way to regression test that? Maybe disable autovac on a table, dirty it, 
then re-enable (all with an absurdly low autovacuum naptime)?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] 9.3.9 and pg_multixact corruption

2015-09-24 Thread Jim Nasby

On 9/20/15 9:23 AM, Christoph Berg wrote:

a short update here: the customer updated the compiler to a newer
version, is now compiling using -O2 instead of -O3, and the code
generated now looks sane, so this turned out to be a compiler issue.
(Though it's unclear if the upgrade fixed it, or the different -O
level.)


Do we officially not support anything > -O2? If so it'd be nice if 
configure threw at least a warning (if not an error that you had to 
explicitly over-ride).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Thomas Munro
On Fri, Sep 25, 2015 at 9:23 AM, Feng Tian  wrote:
>
>
> On Thu, Sep 24, 2015 at 2:17 PM, Feng Tian  wrote:
>>
>>
>>
>> On Thu, Sep 24, 2015 at 1:55 PM, Peter Geoghegan  wrote:
>>>
>>> On Thu, Sep 24, 2015 at 1:53 PM, Tom Lane  wrote:
>>> > Please include the actual patch as an attachment.  We do not consider
>>> > mere
>>> > URLs to be acceptable patch submission format, because that provides no
>>> > permanent record in our archives of what was submitted.
>>>
>>> I was under the impression that this was not intended as a patch
>>> submission.
>>>
>>>
>>> --
>>> Peter Geoghegan
>>
>>
>> If there is enough interest, would be great for it to go into the official
>> contrib dir.
>> Thanks,
>>
>>
> Second thought, the extension depends on decNumber, which is either GPL, or
> ICU license.  Maybe this is trouble.

This is a very cool feature.  I would be great to get a useful class
of decimal numbers into a pass-by-value fixed sized standardised data
type.

The Intel BID library seems to have a more permissive license at first
glance.  I have heard that the Intel library is faster than the IBM
library at a variety of arithmetic and conversions (YMMV; I saw an
unpublished benchmark result that I can't share) on commodity hardware
at least, and it would be interesting to test that.  I wonder if BID
(a single integer significand field) is inherently better for software
implementations than DPD (the significand as an array of 10 bit wide
base-1000 digits called "declets", not entirely unlike our numeric's
encoding).  Those using POWER hardware might want the option to use
DPD though, because they have hardware support for that.

Perhaps ideally there could be a build option to use any of the following:

1.  The IBM decNum library
2.  The IBM DFPAL library[1] (this maps to IBM hardware if available,
or decNum otherwise)
3.  The Intel library
4.  The future built-in C language support[2] (which could use either
binary format!), currently only a proposal but already implemented by
IBM XL C and GCC (using the libraries above)

I have a suspicion that if only one of those has to be chosen, the
Intel library would be best for the majority of users based on license
+ performance.

[1] http://speleotrove.com/decimal/dfpal/dfpalugaio.html
[2] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1312.pdf,
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1781.pdf

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] 9.3.9 and pg_multixact corruption

2015-09-24 Thread Alvaro Herrera
Jim Nasby wrote:
> On 9/20/15 9:23 AM, Christoph Berg wrote:
> >a short update here: the customer updated the compiler to a newer
> >version, is now compiling using -O2 instead of -O3, and the code
> >generated now looks sane, so this turned out to be a compiler issue.
> >(Though it's unclear if the upgrade fixed it, or the different -O
> >level.)
> 
> Do we officially not support anything > -O2? If so it'd be nice if configure
> threw at least a warning (if not an error that you had to explicitly
> over-ride).

Keep in mind this is Sun OS C -- not one of the most popular compilers
in the world.  I don't know what you suggest: have a test program that
configure runs and detects whether the compiler does the wrong thing?
It doesn't seem a sane idea to maintain test cases for all known
compiler bugs ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Peter Geoghegan
On Thu, Sep 24, 2015 at 1:29 PM, Feng Tian  wrote:
> Here is an extension for 64 and 128 bit decimal types using IEEE decimal
> floating point.  The original idea/implementation is from
> http://pgxn.org/dist/pgdecimal/1.0.0/

Interesting. A default B-Tree operator class for the decimal types
would be nice.

I would worry about the implicit casts you've added. They might cause problems.

-- 
Peter Geoghegan




Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Jim Nasby

On 9/24/15 3:35 PM, Peter Geoghegan wrote:

I would worry about the implicit casts you've added. They might cause problems.


Given the cycle created between numeric->decimal and decimal->numeric, I 
can pretty much guarantee they will. In any case, I don't think implicit 
casting from numeric->decimal is a good idea since it can overflow. I'm 
not sure that the other direction is safe either... I can't remember 
offhand if casting correctly obeys typmod or not.


BTW, have you talked to Pavel about making these changes to his code? 
Seems a shame to needlessly fork it. :/

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Peter Geoghegan
On Thu, Sep 24, 2015 at 1:53 PM, Tom Lane  wrote:
> Please include the actual patch as an attachment.  We do not consider mere
> URLs to be acceptable patch submission format, because that provides no
> permanent record in our archives of what was submitted.

I was under the impression that this was not intended as a patch submission.


-- 
Peter Geoghegan




Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Feng Tian
On Thu, Sep 24, 2015 at 1:55 PM, Peter Geoghegan  wrote:

> On Thu, Sep 24, 2015 at 1:53 PM, Tom Lane  wrote:
> > Please include the actual patch as an attachment.  We do not consider
> mere
> > URLs to be acceptable patch submission format, because that provides no
> > permanent record in our archives of what was submitted.
>
> I was under the impression that this was not intended as a patch
> submission.
>
>
> --
> Peter Geoghegan
>

If there is enough interest, would be great for it to go into the official
contrib dir.
Thanks,


Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Feng Tian
On Thu, Sep 24, 2015 at 2:17 PM, Feng Tian  wrote:

>
>
> On Thu, Sep 24, 2015 at 1:55 PM, Peter Geoghegan  wrote:
>
>> On Thu, Sep 24, 2015 at 1:53 PM, Tom Lane  wrote:
>> > Please include the actual patch as an attachment.  We do not consider
>> mere
>> > URLs to be acceptable patch submission format, because that provides no
>> > permanent record in our archives of what was submitted.
>>
>> I was under the impression that this was not intended as a patch
>> submission.
>>
>>
>> --
>> Peter Geoghegan
>>
>
> If there is enough interest, would be great for it to go into the official
> contrib dir.
> Thanks,
>
>
> Second thought, the extension depends on decNumber, which is either GPL,
or ICU license.  Maybe this is trouble.


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Stefan Kaltenbrunner
On 09/24/2015 07:03 PM, Josh Berkus wrote:
> On 09/23/2015 10:25 PM, Thomas Munro wrote:
>> On Thu, Sep 24, 2015 at 1:31 PM, Joe Conway  wrote:
>>> On 09/23/2015 05:21 PM, Thomas Munro wrote:
 Do you think it would make any sense to consider evolving what we have
 already?  At the moment, we have a bug form, and when you submit it it
 does this (if I'm looking at the right thing, please correct me if I'm
 not):
> 
> I know we're big on reinventing the wheel here, but it would really be a
> better idea to use an established product than starting over from
> scratch. Writing a bug tracker is a lot of work and maintenance.
> 
>> The two most common interactions could go something like this:
>>
>> 1.  User enters bug report via form, creating an issue in NEW state
>> and creating a pgsql-bugs thread.  Someone responds by email that this
>> is expected behaviour, not a bug, not worth fixing or not a Postgres
>> issue etc using special trigger words.  The state is automatically
>> switched to WORKS_AS_DESIGNED or WONT_FIX.  No need to touch the web
>> interface: the only change from today's workflow is awareness of the
>> right wording to trigger the state change.
>>
>> 2.  User enters bug report via form, creating issue #1234 in NEW
>> state.   Someone responds by email to acknowledge that that may indeed
>> be an issue, and any response to an issue in NEW state that doesn't
>> reject it switches it to UNDER_DISCUSSION.  Maybe if a commitfest item
>> references the same thread (or somehow references the issue number?)
>> its state is changed to IN_COMMITFEST, or maybe as you say there could
>> be a way to generate the commitfest item from the issue, not sure
>> about that.  Eventually a commit log message says "Fixes bug #1234"
>> and the state automatically goes to FIXED.
> 
> I don't know debbugs, but I know that it would be possible to program RT
> to do all of the above, except add the item to the commitfest.

Well, minus the fact that the commitfest process looked
different/non-existent back in 2007/2008, this is basically how the BZ
based PoC I did back then worked (it hooked into the bug reporting form to
allow BZ to "learn" about an issue/bug/thread and keep tracking it
automatically afterwards). You could simply send it commands as part of
the mail or click in the GUI (or do nothing at all - though it would
close the bug if it found the bug id in a commit).

Even adding something to the commitfest should be fairly easy to do in
most tools because they all have hooks to send stuff via email, to
twitter, hipchat, IRC or whatnot.


Stefan




Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Josh Berkus
On 09/24/2015 12:55 PM, Tom Lane wrote:
> Stephen Frost  writes:
>> Are there any objections to pginfra standing up bugs.postgresql.org with
>> debbugs?  Obviously, it'd be more-or-less beta as we play with it, and
>> we could set it up as beta-bugs.p.o, if there's concern about that.
> 
> I agree with the idea that we don't yet want to give the impression that
> this is the official bug tracker.  However, "beta-bugs" could give the
> impression that it was specifically for bugs about 9.5beta, without
> dispelling the idea that it is official.  Maybe "bugs-test.p.o"?

I'd suggest instead just having a big banner up in the page header which
says "this system is currently beta and not yet the canonical source for
postgres bug information".  That way, if it does become the canonical
source, we won't go breaking everyone's links when we change the domain
name.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Bad row estimation with indexed func returning bool

2015-09-24 Thread Tom Lane
I wrote:
>> The implication of doing it like this would be that the default estimate
>> in the absence of any matching stats would be 0.5 (since eqsel defaults
>> to 1/ndistinct, and get_variable_numdistinct will report 2.0 for any
>> boolean-type expression it has no stats for).  That's not a huge change
>> from the existing 0.333 estimate, which seems pretty unprincipled
>> anyway ... but it would probably be enough to annoy people if we did it in
>> stable branches.  So I'd be inclined to propose changing this in HEAD and
>> maybe 9.5, but not further back.  (For non-function expressions, 0.5 is
>> the default already, so those would not change behavior.)

> I experimented with the attached patch.  The change in the default
> estimate for a function results in just one change in the standard
> regression test results, so far as I can find.

After more thought I concluded that changing the default behavior is
something not to do without a lot more testing than I have time for now.
So I modified the patch to preserve the old default estimate for
statistics-less function calls, and pushed it.

regards, tom lane




Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Feng Tian
On Thu, Sep 24, 2015 at 1:29 PM, Feng Tian  wrote:

> Hi,
>
> Here is an extension for 64 and 128 bit decimal types using IEEE decimal
> floating point.  The original idea/implementation is from
> http://pgxn.org/dist/pgdecimal/1.0.0/   Original thread for dicussion is
> at
>
>
> http://www.postgresql.org/message-id/CAFj8pRApakE6s-H2yJcXD=ubpukwa6i7rx4vuvtb4puhga5...@mail.gmail.com
>
> I reimplemented 64/128 bits instead of 32/64 bits.  The code use decNumber
> library instead of _Decimal64/128 of GCC.   Also added more operators.
>
> Compared to numeric type, decimal64 arithmetics is about 2x faster,
> decimal128 is about 1.5x faster.  However, the cast between decimal and
> float4/8 is implemented rather naively and slow.   As always, it depends on
> workload, decimal may take more, or less space, may be slower if cast is
> frequently performed.
>
> Agains, thanks to the original author okbob (Pavel).   Enjoy.
>
> Thanks,
> Feng
>
>
Ah, link.

https://github.com/vitesse-ftian/pgdecimal

Thanks,
Feng


Re: [HACKERS] Rename withCheckOptions to insertedCheckClauses

2015-09-24 Thread Dean Rasheed
On 24 September 2015 at 21:25, Tom Lane  wrote:
> I wrote:
>> - List   *ri_WithCheckOptions;
>> - List   *ri_WithCheckOptionExprs;
>> + List   *ri_InsertedCheckClauses;
>> + List   *ri_InsertedCheckClauseExprs;
>
>> The distinction between a "clause" and an "expr" is not very obvious,
>> and certainly most other places in the code use those terms pretty
>> interchangeably, so I find both the old and new names unclear here.
>> How about ri_InsertedCheckClauseStates instead for the second list?
>> And similarly if you're using "Expr" to mean ExprState anywhere else.
>
> Actually ... does struct ResultRelInfo need to carry the original WCO
> clauses at all, rather than just the exprstate list?  In most places
> we do not store expr and exprstate lists in the same node in the first
> place, so we can get away with using the same field name for corresponding
> lists in plan and planstate nodes.  That's why we don't already have a
> convention like "fooStates" for such lists.
>
> Another thought is that as long as these are lists specifically of
> WithCheckOption nodes, and not arbitrary expressions, "clause" isn't an
> especially good term for them; it implies generality that isn't there.
> And CheckClauses invites confusion with, for example, CHECK clauses of
> domain types.  So maybe better names would be "ri_InsertedCheckOptions"
> (and "ri_InsertedCheckOptionStates" if you still need that).  Or maybe
> "ri_InsertedWCOClauses" and "ri_InsertedWCOClauseStates".  I'm less sure
> about whether this is an improvement, though.
>

None of this renaming seems like an improvement to me.

ri_InsertedCheckClauses is misleading because they're not clauses,
they're WithCheckOption nodes carrying various pieces of information,
only one of which is the clause to check.

ri_InsertedCheckOptions is a bit better from that perspective, but it
still seems messy for the executor to carry the knowledge that these
checks were inserted by the rewriter. In the executor they're just
checks to be executed, and "WithCheck" reflects their original source
better than "InsertedCheck".

ri_InsertedCheckOptionStates is inconsistent with the names of the
other lists of expr states on that node -- ri_TrigWhenExprs and
ri_ConstraintExprs.

Also, these were added in 9.4, so introducing this many differences
between 9.4 and 9.5+ will make back-patching harder.

The original objection to the name WithCheckOptions was in relation to
the Query struct, and the fact that this field isn't the parsed
representation of WITH CHECK OPTION's on the query. I think that can
be cured with a suitable comment.
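
For illustration only, such a comment on the existing field in parsenodes.h
could read something like this (my wording, not from the patch):

	List	   *withCheckOptions;	/* WithCheckOption nodes added by the
									 * rewriter from view WITH CHECK OPTION
									 * clauses and RLS policies; never present
									 * in the output of the raw parser */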

Regards,
Dean




[HACKERS] Manual bitswizzling -> LOCKBIT_ON

2015-09-24 Thread Thomas Munro
Hi

While studying lmgr code, I noticed that there are a couple of places
that use 1 << x to convert a LOCKMODE to a LOCKMASK instead of the
macro that is used elsewhere.  Should that be changed for consistency,
as in the attached?
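
For readers without lock.h at hand, the macro in question is essentially the
following (paraphrased from memory of src/include/storage/lock.h; the call
sites below are illustrative, not quotes from the patch), so the change is
purely cosmetic:

#define LOCKBIT_ON(lockmode)  (1 << (lockmode))
#define LOCKBIT_OFF(lockmode) (~(1 << (lockmode)))

/* a hand-rolled test such as ... */
if (holdMask & (1 << lockmode))
	...
/* ... would become, with identical behavior: */
if (holdMask & LOCKBIT_ON(lockmode))
	...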

-- 
Thomas Munro
http://www.enterprisedb.com


use-lockbit-on-macro.patch
Description: Binary data



Re: [HACKERS] Parallel Seq Scan

2015-09-24 Thread Amit Kapila
On Thu, Sep 24, 2015 at 12:00 AM, Robert Haas  wrote:
>
> On Thu, Sep 3, 2015 at 6:21 AM, Amit Kapila 
wrote:
> > [ new patches ]
>
>
> It looks to me like there would be trouble if an initPlan or subPlan
> were kept below a Funnel, or as I guess we're going to call it, a
> Gather node.  That's because a SubPlan doesn't actually have a pointer
> to the node tree for the sub-plan in it.  It just has an index into
> PlannedStmt.subplans.  But create_parallel_worker_plannedstmt sets the
> subplans list to NIL.  So that's not gonna work.  Now maybe there's no
> way for an initPlan or a subPlan to creep down under the funnel, but I
> don't immediately see what would prevent it.
>

I think initPlan will work with the existing patches, as we are always
executing it in the master and then sending the result to the workers.
Refer to the code below from the funnel patch:

ExecFunnel()
{
..
+	/*
+	 * Evaluate the InitPlan and pass the PARAM_EXEC params, so that
+	 * values can be shared with worker backend.  This is different from
+	 * the way InitPlans are evaluated (lazy evaluation) at other places
+	 * as instead of sharing the InitPlan to all the workers and let them
+	 * execute, we pass the values which can be directly used by worker
+	 * backends.
+	 */
+	serialized_param_exec = ExecAndFormSerializeParamExec(econtext,
+							node->ss.ps.plan->lefttree->allParam);
}

For SubPlan, as mentioned in yesterday's mail, it still needs to be dealt
with by the patch, but I think if we assign the subplans to the planned
statement we are passing to the worker, it should work.  Here we need to
prevent unsafe subplans from being pushed down; some more thought is
required to see what exactly needs to be done.
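
A minimal sketch of that idea (hypothetical - the variable names below are
made up and this is not from the patch): in whatever function builds the
worker's PlannedStmt, carry the outer statement's subplan list over instead
of setting it to NIL, since SubPlan nodes only store an index into that list:

/* hypothetical: in the code that constructs the worker's PlannedStmt */
pstmt->subplans = copyObject(parent_stmt->subplans);	/* instead of NIL */
/* parallel-unsafe subplans would still need to be rejected before this */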


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] TEXT vs VARCHAR join qual push down diffrence, bug or expected?

2015-09-24 Thread Tom Lane
Jeevan Chalke  writes:
> On Wed, Sep 23, 2015 at 10:15 PM, Tom Lane  wrote:
>> After a bit more thinking and experimentation, I propose the attached
>> patch.

> I had a look over the patch and reviewed it. It is in excellent state to
> check-in.

After further thought I decided that the base case for
Const/Param/non-foreign-Vars wasn't quite right either.  If we don't like
the collation we should just set the state to UNSAFE not fail immediately,
because it might appear in a context where collation doesn't matter.
An example is "var IS NOT NULL".

So I've committed the attached modification of that patch.

regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 81cb2b4..697de60 100644
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***
*** 17,27 
   * We do not consider that it is ever safe to send COLLATE expressions to
   * the remote server: it might not have the same collation names we do.
   * (Later we might consider it safe to send COLLATE "C", but even that would
!  * fail on old remote servers.)  An expression is considered safe to send only
!  * if all collations used in it are traceable to Var(s) of the foreign table.
!  * That implies that if the remote server gets a different answer than we do,
!  * the foreign table's columns are not marked with collations that match the
!  * remote table's columns, which we can consider to be user error.
   *
   * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
   *
--- 17,28 
   * We do not consider that it is ever safe to send COLLATE expressions to
   * the remote server: it might not have the same collation names we do.
   * (Later we might consider it safe to send COLLATE "C", but even that would
!  * fail on old remote servers.)  An expression is considered safe to send
!  * only if all operator/function input collations used in it are traceable to
!  * Var(s) of the foreign table.  That implies that if the remote server gets
!  * a different answer than we do, the foreign table's columns are not marked
!  * with collations that match the remote table's columns, which we can
!  * consider to be user error.
   *
   * Portions Copyright (c) 2012-2015, PostgreSQL Global Development Group
   *
*** typedef struct foreign_glob_cxt
*** 69,77 
   */
  typedef enum
  {
! 	FDW_COLLATE_NONE,			/* expression is of a noncollatable type */
  	FDW_COLLATE_SAFE,			/* collation derives from a foreign Var */
! 	FDW_COLLATE_UNSAFE			/* collation derives from something else */
  } FDWCollateState;
  
  typedef struct foreign_loc_cxt
--- 70,81 
   */
  typedef enum
  {
! 	FDW_COLLATE_NONE,			/* expression is of a noncollatable type, or
!  * it has default collation that is not
!  * traceable to a foreign Var */
  	FDW_COLLATE_SAFE,			/* collation derives from a foreign Var */
! 	FDW_COLLATE_UNSAFE			/* collation is non-default and derives from
!  * something other than a foreign Var */
  } FDWCollateState;
  
  typedef struct foreign_loc_cxt
*** foreign_expr_walker(Node *node,
*** 272,284 
  else
  {
  	/* Var belongs to some other table */
! 	if (var->varcollid != InvalidOid &&
! 		var->varcollid != DEFAULT_COLLATION_OID)
! 		return false;
! 
! 	/* We can consider that it doesn't set collation */
! 	collation = InvalidOid;
! 	state = FDW_COLLATE_NONE;
  }
  			}
  			break;
--- 276,299 
  else
  {
  	/* Var belongs to some other table */
! 	collation = var->varcollid;
! 	if (collation == InvalidOid ||
! 		collation == DEFAULT_COLLATION_OID)
! 	{
! 		/*
! 		 * It's noncollatable, or it's safe to combine with a
! 		 * collatable foreign Var, so set state to NONE.
! 		 */
! 		state = FDW_COLLATE_NONE;
! 	}
! 	else
! 	{
! 		/*
! 		 * Do not fail right away, since the Var might appear
! 		 * in a collation-insensitive context.
! 		 */
! 		state = FDW_COLLATE_UNSAFE;
! 	}
  }
  			}
  			break;
*** foreign_expr_walker(Node *node,
*** 288,303 
  
  /*
   * If the constant has nondefault collation, either it's of a
!  * non-builtin type, or it reflects folding of a CollateExpr;
!  * either way, it's unsafe to send to the remote.
   */
! if (c->constcollid != InvalidOid &&
! 	c->constcollid != DEFAULT_COLLATION_OID)
! 	return false;
! 
! /* Otherwise, we can consider that it doesn't set collation */
! collation = InvalidOid;
! state = FDW_COLLATE_NONE;
  			}
  			break;
  		case T_Param:
--- 303,318 
  
  /*
   * If the constant has nondefault collation, either it's of a
!  * non-builtin type, or it reflects folding of a CollateExpr.
!  * It's unsafe to send to the remote unless it's used in a
!  * 

Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Tomas Vondra



On 09/24/2015 05:18 PM, Tom Lane wrote:

Robert Haas  writes:

Of course, if we can postpone sizing the hash table until after the
input size is known, as you suggest, then that would be better still
(but not back-patch material).


AFAICS, it works that way today as long as the hash fits in memory
(ie, single-batch).  We load into a possibly seriously undersized hash
table, but that won't matter for performance until we start probing it.
At the conclusion of loading, MultiExecHash will call
ExecHashIncreaseNumBuckets which will re-hash into a better-sized hash
table.  I doubt this can be improved on much.

It would be good if we could adjust the numbuckets choice at the
conclusion of the input phase for the multi-batch case as well.
The code appears to believe that wouldn't work, but I'm not sure if
it's right about that, or how hard it'd be to fix if so.


So you suggest using a small hash table even when we expect batching?

That would be rather difficult to do because of the way we derive 
buckets and batches from the hash value - they must not overlap. The 
current code simply assumes that once we start batching the number of 
bits needed for buckets does not change anymore.
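
For reference, here is a simplified sketch of how the two numbers are
currently derived from the hash value (modelled on ExecHashGetBucketAndBatch
in nodeHash.c) - growing nbuckets after batching has started would make the
two bit ranges overlap:

/*
 * Simplified sketch, modelled on ExecHashGetBucketAndBatch in nodeHash.c:
 * the low log2(nbuckets) bits of the hash value select the bucket, and the
 * next log2(nbatch) bits select the batch.
 */
static void
get_bucket_and_batch(uint32 hashvalue, uint32 nbuckets, int log2_nbuckets,
                     uint32 nbatch, int *bucketno, int *batchno)
{
    *bucketno = hashvalue & (nbuckets - 1);
    *batchno = (nbatch > 1) ?
        (int) ((hashvalue >> log2_nbuckets) & (nbatch - 1)) : 0;
}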


It's possible to rework of course - the initial version of the patch 
actually did just that (although it was broken in other ways).


But I think the real problem here is the batching itself - if we 
overestimate and start batching (while we could actually run with a 
single batch), we've already lost.


But what about computing the expected number of batches, but always
starting execution assuming no batching? And only if we actually fill
work_mem do we start batching, using the expected number of batches?


I.e.

1) estimate nbatches, but use nbatches=1

2) run until exhausting work_mem

3) start batching, with the initially estimated number of batches
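
In pseudocode the idea would be roughly this (an illustrative sketch only;
all helper names are hypothetical):

/* Illustrative sketch of the proposal; helper names are hypothetical. */
nbatch_planned = estimated_number_of_batches;   /* remember the estimate */
nbatch = 1;                                     /* but start single-batch */

while ((tuple = next_inner_tuple()) != NULL)
{
    hash_table_insert(tuple);
    if (nbatch == 1 && hash_table_bytes() > work_mem_bytes)
        nbatch = nbatch_planned;    /* only now switch to batching */
}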


regards


--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multivariate statistics / patch v7

2015-09-24 Thread Tomas Vondra

Hi,

On 09/24/2015 06:43 PM, Josh Berkus wrote:

Tomas,


attached is v7 of the multivariate stats patch. The main improvement is
major refactoring of the clausesel.c portion - splitting the awfully
long spaghetti-style functions into smaller pieces, making it much more
understandable etc.


So presumably v7 handles varlena attributes as well, yes?   I have a
destruction test case for correlated column stats, so I'd like to test
your patch on it.


Yes, it should handle varlena OK. Let me know if you need help with 
that, and I'd like to hear feedback - whether it fixed your test case or 
not, etc.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Josh Berkus
On 09/23/2015 10:25 PM, Thomas Munro wrote:
> On Thu, Sep 24, 2015 at 1:31 PM, Joe Conway  wrote:
>> On 09/23/2015 05:21 PM, Thomas Munro wrote:
>>> Do you think it would make any sense to consider evolving what we have
>>> already?  At the moment, we have a bug form, and when you submit it it
>>> does this (if I'm looking at the right thing, please correct me if I'm
>>> not):

I know we're big on reinventing the wheel here, but it would really be a
better idea to use an established product than starting over from
scratch. Writing a bug tracker is a lot of work and maintenance.

> The two most common interactions could go something like this:
> 
> 1.  User enters bug report via form, creating an issue in NEW state
> and creating a pgsql-bugs thread.  Someone responds by email that this
> is expected behaviour, not a bug, not worth fixing or not a Postgres
> issue etc using special trigger words.  The state is automatically
> switched to WORKS_AS_DESIGNED or WONT_FIX.  No need to touch the web
> interface: the only change from today's workflow is awareness of the
> right wording to trigger the state change.
> 
> 2.  User enters bug report via form, creating issue #1234 in NEW
> state.   Someone responds by email to acknowledge that that may indeed
> be an issue, and any response to an issue in NEW state that doesn't
> reject it switches it to UNDER_DISCUSSION.  Maybe if a commitfest item
> references the same thread (or somehow references the issue number?)
> its state is changed to IN_COMMITFEST, or maybe as you say there could
> be a way to generate the commitfest item from the issue, not sure
> about that.  Eventually a commit log message says "Fixes bug #1234"
> and the state automatically goes to FIXED.

I don't know debbugs, but I know that it would be possible to program RT
to do all of the above, except add the item to the commitfest.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb_set array append hack?

2015-09-24 Thread Dmitry Dolgov
>> For that matter, we should probably disallow NULL path elements also,
shouldn't we?
> I'd say yes.

Well, here is the new `setPath` function with this modification. Is this
what you meant?


non_integer_in_path2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-09-24 Thread Fujii Masao
On Wed, Sep 23, 2015 at 12:24 AM, Syed, Rahila  wrote:
> Hello,
>
> Please find attached patch with bugs reported by Thom and Sawada-san solved.

The regression test failed on my machine, so you need to update the
regression test,
I think.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Tom Lane
Tomas Vondra  writes:
> But what about computing the expected number of batches, but always
> starting execution assuming no batching? And only if we actually fill
> work_mem do we start batching, using the expected number of batches?

Hmm.  You would likely be doing the initial data load with a "too small"
numbuckets for single-batch behavior, but if you successfully loaded all
the data then you could resize the table at little penalty.  So yeah,
that sounds like a promising approach for cases where the initial rowcount
estimate is far above reality.

But I kinda thought we did this already, actually.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Stephen Frost
* Josh Berkus (j...@agliodbs.com) wrote:
> I know we're big on reinventing the wheel here, but it would really be a
> better idea to use an established product than starting over from
> scratch. Writing a bug tracker is a lot of work and maintenance.

I tend to agree.

> > The two most common interactions could go something like this:
> > 
> > 1.  User enters bug report via form, creating an issue in NEW state
> > and creating a pgsql-bugs thread.  Someone responds by email that this
> > is expected behaviour, not a bug, not worth fixing or not a Postgres
> > issue etc using special trigger words.  The state is automatically
> > switched to WORKS_AS_DESIGNED or WONT_FIX.  No need to touch the web
> > interface: the only change from today's workflow is awareness of the
> > right wording to trigger the state change.
> > 
> > 2.  User enters bug report via form, creating issue #1234 in NEW
> > state.   Someone responds by email to acknowledge that that may indeed
> > be an issue, and any response to an issue in NEW state that doesn't
> > reject it switches it to UNDER_DISCUSSION.  Maybe if a commitfest item
> > references the same thread (or somehow references the issue number?)
> > its state is changed to IN_COMMITFEST, or maybe as you say there could
> > be a way to generate the commitfest item from the issue, not sure
> > about that.  Eventually a commit log message says "Fixes bug #1234"
> > and the state automatically goes to FIXED.
> 
> I don't know debbugs, but I know that it would be possible to program RT
> to do all of the above, except add the item to the commitfest.

debbugs does most of the above by default, no programming needed...  I'm
sure we could get it to integrate with the commitfest and have a git
commit hook which sends the appropriate email to it also.

That the emacs folks are using it makes me *much* more interested in the
idea of getting debbugs up and running..

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Joe Conway
On 09/24/2015 10:08 AM, Stephen Frost wrote:
> debbugs does most of the above by default, no programming needed...  I'm
> sure we could get it to integrate with the commitfest and have a git
> commit hook which sends the appropriate email to it also.
> 
> That the emacs folks are using it makes me *much* more interested in the
> idea of getting debbugs up and running..

I'm not familiar with debbugs myself, but given that description it
sounds to me like it would be worth giving it a try.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Tom Lane
I promised myself I'd stay out of this discussion, but ...

Josh Berkus  writes:
> I know we're big on reinventing the wheel here, but it would really be a
> better idea to use an established product than starting over from
> scratch. Writing a bug tracker is a lot of work and maintenance.

Yeah; "let's write our own bug tracker" is a good way to make sure nothing
comes of this.  On the other hand, "let's get all the Postgres hackers to
change their habits" is an equally good way to make sure nothing comes of
this.  (We tried that once already, with I-forget-which tracker; the
results were, um, forgettable.)  I think the only approach that has any
chance of success is to connect up some existing tracker software to more
or less our existing work patterns.  So somebody who wants to make this
happen needs to sit down and do that.

FWIW, I concur with the opinions that it needs to be an OSS tracker.
This project has been around for twenty years and I have every expectation
that it will survive for another twenty.  I have no confidence in any
closed-source product still being available (and free) in 2035.  We need
something with an active development/support community, too, so it doesn't
end up being us supporting it on our ownsome ten years out.  Other than
that, I'm agnostic as to what gets picked.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Thomas Munro
On Fri, Sep 25, 2015 at 10:25 AM, Jim Nasby  wrote:
> On 9/24/15 3:35 PM, Peter Geoghegan wrote:
>>
>> I would worry about the implicit casts you've added. They might cause
>> problems.
>
>
> Given the cycle created between numeric->decimal and decimal->numeric, I can
> pretty much guarantee they will. In any case, I don't think implicit casting
> from numeric->decimal is a good idea since it can overflow. I'm not sure
> that the other direction is safe either... I can't remember offhand if
> casting correctly obeys typmod or not.

FWIW it looks like DB2 promotes DECIMAL to DECFLOAT, not the other way around.

https://www-304.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/com.ibm.db2z10.doc.sqlref/src/tpc/db2z_promotionofdatatypes.dita

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Robert Haas
On Thu, Sep 24, 2015 at 1:58 PM, Tomas Vondra
 wrote:
> Meh, you're right - I got the math wrong. It's 1.3% in both cases.
>
> However the question still stands - why should we handle the over-estimate
> in one case and not the other? We're wasting the same fraction of memory in
> both cases.

Well, I think we're going around in circles here.  It doesn't seem
likely that either of us will convince the other.

But for the record, I agree with you that in the scenario you lay out,
it's the about the same problem in both cases.  I could argue that
it's slightly different because of [ tedious and somewhat tenuous
argument omitted ], but I'll spare you that.  However, consider the
alternative scenario where, on the same machine, perhaps even in the
same query, we perform two hash joins, one of which involves hashing a
small table (say, 2MB) and one of which involves hashing a big table
(say, 2GB).  If the small query uses twice the intended amount of
memory, probably nothing bad will happen.  If the big query does the
same thing, a bad outcome is much more likely.  Say the machine has
16GB of RAM.  Well, a 2MB over-allocation is not going to break the
world.  A 2GB over-allocation very well might.

I really don't see why this is a controversial proposition.  It seems as
clear as daylight from here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-09-24 Thread Robert Haas
On Thu, Sep 24, 2015 at 2:31 PM, Amit Kapila  wrote:
> I have fixed most of the review comments raised in this mail
> as well as other e-mails and rebased the patch on commit-
> 020235a5.  Even though I have fixed many of the things, quite a few
> comments are yet to be handled.  This patch is mainly a rebased version
> to ease the review.  We can continue to discuss the leftover things and
> I will address those in subsequent patches.

Thanks for the update.  Here are some more review comments:

1. parallel_seqscan_degree is still not what we should have here.  As
previously mentioned, I think we should have a GUC for the maximum
degree of parallelism in a query generally, not the maximum degree of
parallel sequential scan.

2. fix_node_funcids() can go away because of commit
9f1255ac859364a86264a67729dbd1a36dd63ff2.

3. cost_patialseqscan is still misspelled.  I pointed this out before, too.

4. Much more seriously than any of the above,
create_parallelscan_paths() generates plans that are badly broken:

rhaas=# explain select * from pgbench_accounts where filler < random()::text;
   QUERY PLAN
-
 Funnel on pgbench_accounts  (cost=0.00..35357.73 rows=333 width=97)
   Filter: ((filler)::text < (random())::text)
   Number of Workers: 10
   ->  Partial Seq Scan on pgbench_accounts  (cost=0.00..35357.73
rows=333 width=97)
 Filter: ((filler)::text < (random())::text)
(5 rows)

This is wrong both because random() is parallel-restricted and thus
can't be executed in a parallel worker, and also because enforcing a
volatile qual twice is no good.

rhaas=# explain select * from pgbench_accounts where aid % 1 = 0;
  QUERY PLAN
---
 Funnel on pgbench_accounts  (cost=0.00..28539.55 rows=5 width=97)
   Filter: ((aid % 1) = 0)
   Number of Workers: 10
   ->  Partial Seq Scan on pgbench_accounts  (cost=0.00..28539.55
rows=5 width=97)
 Filter: ((aid % 1) = 0)
(5 rows)

This will work, but it's a bad plan because we shouldn't need to
re-apply the filter condition in the parallel leader after we've
already checked it in the worker.

rhaas=# explain select * from pgbench_accounts a where a.bid not in
(select bid from pgbench_branches);
QUERY PLAN
---
 Funnel on pgbench_accounts a  (cost=2.25..26269.07 rows=500 width=97)
   Filter: (NOT (hashed SubPlan 1))
   Number of Workers: 10
   ->  Partial Seq Scan on pgbench_accounts a  (cost=2.25..26269.07
rows=500 width=97)
 Filter: (NOT (hashed SubPlan 1))
 SubPlan 1
   ->  Seq Scan on pgbench_branches  (cost=0.00..2.00 rows=100 width=4)
   SubPlan 1
 ->  Seq Scan on pgbench_branches  (cost=0.00..2.00 rows=100 width=4)
(9 rows)

This will not work, because the subplan isn't available inside the
worker.  Trying it without the EXPLAIN crashes the server.  This is
more or less the same issue as one of the known issues you already
mentioned, but I mention it again here because I think this case is
closely related to the previous two.

Basically, you need to have some kind of logic for deciding which
things need to go below the funnel and which on the funnel itself.
The stuff that's safe should get pushed down, and the stuff that's not
safe should get attached to the funnel.  The unsafe stuff is whatever
contains references to initplans or subplans, or anything that
contains parallel-restricted functions ... and there might be some
other stuff, too, but at least those things.
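
To make that split concrete, a check along these lines could be used (an
illustrative sketch, not code from the patch; func_parallel_restricted() is
an assumed helper, and treating every Param as unsafe is deliberately
conservative):

static bool
contains_parallel_unsafe_walker(Node *node, void *context)
{
    /* Report whether anything here must be evaluated above the Funnel. */
    if (node == NULL)
        return false;
    if (IsA(node, SubPlan) || IsA(node, Param))
        return true;
    if (IsA(node, FuncExpr) &&
        func_parallel_restricted(((FuncExpr *) node)->funcid))
        return true;
    return expression_tree_walker(node, contains_parallel_unsafe_walker,
                                  context);
}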

Instead of preventing initplans or subplans from getting pushed down
into the funnel, we could instead try to teach the system to push them
down.  However, that's very complicated; e.g. a subplan that
references a CTE isn't safe to push down, and a subplan that
references another subplan must be pushed down if that other subplan
is pushed down, and an initplan that contains volatile functions can't
be pushed down because each worker would execute it separately and
they might not all get the same answer, and an initplan that
references a temporary table can't be pushed down because it can't be
referenced from a worker.  All in all, it seems better not to go there
right now.

Now, when you fix this, you're quickly going to run into another
problem, which is that as you have this today, the funnel node does
not actually enforce its filter condition, so the EXPLAIN plan is a
dastardly lie.  And when you try to fix that, you're going to run into
a third problem, which is that the stuff the funnel node needs in
order to evaluate its filter condition might not be in the partial seq
scan's target list.  So you need to fix both 

Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Pavel Stehule
2015-09-25 0:25 GMT+02:00 Jim Nasby :

> On 9/24/15 3:35 PM, Peter Geoghegan wrote:
>
>> I would worry about the implicit casts you've added. They might cause
>> problems.
>>
>
> Given the cycle created between numeric->decimal and decimal->numeric, I
> can pretty much guarantee they will. In any case, I don't think implicit
> casting from numeric->decimal is a good idea since it can overflow. I'm not
> sure that the other direction is safe either... I can't remember offhand if
> casting correctly obeys typmod or not.
>
> BTW, have you talked to Pavel about making these changes to his code?
> Seems a shame to needlessly fork it. :/
>

Yes, he talked with me, and I gave my agreement to continue/enhance/fork
this project as necessary.

Regards

Pavel


> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


Re: [HACKERS] TEXT vs VARCHAR join qual push down diffrence, bug or expected?

2015-09-24 Thread Jeevan Chalke
On Thu, Sep 24, 2015 at 10:22 PM, Tom Lane  wrote:

> Jeevan Chalke  writes:
> > On Wed, Sep 23, 2015 at 10:15 PM, Tom Lane  wrote:
> >> After a bit more thinking and experimentation, I propose the attached
> >> patch.
>
> > I had a look over the patch and reviewed it. It is in excellent state to
> > check-in.
>
> After further thought I decided that the base case for
> Const/Param/non-foreign-Vars wasn't quite right either.  If we don't like
> the collation we should just set the state to UNSAFE not fail immediately,
> because it might appear in a context where collation doesn't matter.
> An example is "var IS NOT NULL".
>

Makes sense.


>
> So I've committed the attached modification of that patch.
>
> Thanks


> regards, tom lane
>
>


-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] DBT-3 with SF=20 got failed

2015-09-24 Thread Robert Haas
On Thu, Sep 24, 2015 at 12:40 PM, Tomas Vondra
 wrote:
> There are two machines - one with 32GB of RAM and work_mem=2GB, the other
> one with 256GB of RAM and work_mem=16GB. The machines are hosting about the
> same data, just scaled accordingly (~8x more data on the large machine).
>
> Let's assume there's a significant over-estimate - we expect to get about
> 10x the actual number of tuples, and the hash table is expected to almost
> exactly fill work_mem. Using the 1:3 ratio (as in the query at the beginning
> of this thread) we'll use ~512MB and ~4GB for the buckets, and the rest is
> for entries.
>
> Thanks to the 10x over-estimate, ~64MB and 512MB would be enough for the
> buckets, so we're wasting ~448MB (13% of RAM) on the small machine and
> ~3.5GB (~1.3%) on the large machine.
>
> How does it make any sense to address the 1.3% and not the 13%?

One of us is confused, because from here it seems like 448MB is 1.3%
of 32GB, not 13%.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread David Fetter
On Thu, Sep 24, 2015 at 12:10:07PM -0600, Ryan Pedela wrote:
> Kam Lasater wrote:
> > I'd suggest: Github Issues, Pivotal Tracker or Redmine (probably in
> > that order). There are tens to hundreds of other great ones out there,
> > I'm sure one of them would also work.
> 
> Why not just use Github issues?

Is Github issues something we can run ourselves?

If not, it's a proprietary system which has a proprietor whose
existence even next month is not guaranteed, and whose interests are
not guaranteed to align with ours into an indefinite future.

In some very important sense, it does not matter what features a
system has if it isn't one we can control.  At a minimum, this means
we need to run it in its entirety on resources we control, and we need
to be able to patch any piece of it on our own say-so.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] CustomScan support on readfuncs.c

2015-09-24 Thread Kouhei Kaigai
Hi,

I tried to define two additional callbacks to support CustomScan
on readfuncs.c.

First of all, we need to pay attention to how the output of
TextOutCustomScan is treated when additional text output is generated.
Only the custom-scan provider knows what should be displayed, so
all we can do is invoke a new callback: TextReadCustomScan().
Because the private fields are displayed next to the common
fields of CustomScan, this callback is invoked once the pg_strtok
pointer has reached the last field of CustomScan. Then the custom
scan provider allocates a CustomScan, or a structure embedding
CustomScan; only the CSP knows the exact size to be allocated.
It can fetch tokens using pg_strtok and reconstruct its private
fields, but it does not need to restore the common fields, because
that is done by _readCustomScan.
If no callback is defined, _readCustomScan allocates a plain
CustomScan object and reads only the known fields.

Now, let's step back a bit. The next issue is how to reproduce
the "methods" pointer from the text representation.
I look up the methods table using a pair of library
and symbol names; that is probably the most straightforward way.
The "methods" field is emitted before all the private fields
generated by TextOutCustomScan, so it can be reconstructed
prior to TextReadCustomScan.
To support this feature, I had to add an interface contract
that requires extensions to put the library and symbol names in the
CustomScanMethods table.
Although the INIT_CUSTOM_SCAN_METHODS() macro can set up these
fields, the extension author needs to pay attention to this.
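
For illustration, the lookup could look roughly like this (a sketch only,
not code from the attached patch, assuming load_external_function() is an
acceptable way to resolve the symbol):

/*
 * Illustrative sketch only: resolve the "methods" pointer from the library
 * and symbol names emitted by TextOutCustomScan.  load_external_function()
 * loads the library if needed and returns the address of the named symbol.
 */
static const CustomScanMethods *
LookupCustomScanMethods(char *library_name, char *symbol_name)
{
    return (const CustomScanMethods *)
        load_external_function(library_name, symbol_name, true, NULL);
}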

In addition to these enhancements, a new NodeCopyCustomScan
callback eliminates a restriction: currently a custom-scan provider
cannot define its own structure that embeds CustomScan.
Since only the CSP knows the exact size of the structure, this callback
is intended to allocate a new one and copy the private fields;
it does not need to copy the common fields.

These three callbacks (one existing, two new) make CustomScan
nodes work with copyObject, nodeToString and stringToNode.


I also noticed that some static functions would be valuable for
extensions, to avoid reinventing the same routines.
How about exposing these functions?
- _outToken
- _outBitmapset
- nullable_string
- _readBitmapset


The attached patch is conceptual: it compiles but is not tested yet.
I'll try to make a test case soon; however, it makes sense to get
feedback early even while it is still at the concept-design stage.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


custom-scan-on-readfuncs.v1.patch
Description: custom-scan-on-readfuncs.v1.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: use foreign keys to improve join estimates v1

2015-09-24 Thread David Rowley
On 24 September 2015 at 23:57, Tomas Vondra 
wrote:

>
> 2) find_best_match_foreign_key
> --
>
> I think the comment before the function needs rephrasing (seems a bit
> broken to me). I do like the approach in general, although it changes the
> semantics a bit. The original code only considered "fully matching" fkeys,
> while the new code simply takes the longest match.
>
>
>
Oops, I did not code this at all the way I had originally pictured it. I
guess the evidence of that is in the function comment, which I wrote before
coding the function.
I of course intended to only allow full matches.
A full patch which fixes this is attached. This also should fix the clause
duplication trick that I talked about.


> The comment before the function mentions it's possible to confuse the code
> with duplicate quals. Can you give an example?
>
>
Something like: SELECT * FROM a LEFT JOIN b ON a.id=b.id and a.id=b.id AND
a.id=b.id AND a.id2 = b.id2 AND a.id3 = b.id3;

Note a.id = b.id repeated 3 times.

Where a foreign key exists on (a.id) ref (b.id), and also (a.id2, a.id3)
ref (b.id2, b.id3). In this case we match 3 quals for the FK with 1 key,
and 2 quals with the FK with 2 keys. But this is now fixed in the attached.

I used an outer join here as they won't be transformed into eclass members
and back to quals again. INNER JOIN wouldn't allow the duplicates to
materialise again.



>
> 3) clauselist_join_selectivity
> --
>
> I think this is actually broken, i.e. it does not handle the cases it was
> handling before - namely the cases where more than 3 tables are joined
> (fkjoin2.sql).
>
> Imagine a "fact" table referencing two dimensions, using 2-column fkeys:
>
> create table a (a1 int, a2 int, unique (a1,a2));
> create table b (b1 int, b2 int, unique (b1,b2));
> create table f (a1 int, a2 int, b1 int, b2 int);
>
> insert into a select i,i from generate_series(0,99) s(i);
> insert into b select i,i from generate_series(0,99) s(i);
> insert into f select i/10,i/10,i/10,i/10
> from generate_series(0,999) s(i);
>
> alter table f add foreign key (a1,a2) references a(a1,a2);
> alter table f add foreign key (b1,b2) references b(b1,b2);
>
> Each dimension has 1M rows, fact has 10M rows (and references each row in
> dimensions 10x).
>
> Now, let's see this simple star join:
>
> explain select * from f join a using (a1,a2)
> join b using (b1,b2);
>
> This should return all 10M rows, and originally this was estimated like
> this:
>
>                              QUERY PLAN
> ---
>  Hash Join  (cost=4.00..573853.57 rows=1175 width=16)
>    Hash Cond: ((f.b1 = b.b1) AND (f.b2 = b.b2))
>    ->  Hash Join  (cost=2.00..363955.16 rows=1175 width=16)
>          Hash Cond: ((f.a1 = a.a1) AND (f.a2 = a.a2))
>          ->  Seq Scan on f  (cost=0.00..154056.75 rows=1175 width=16)
>          ->  Hash  (cost=14425.00..14425.00 rows=100 width=8)
>                ->  Seq Scan on a  (cost=0.00..14425.00 rows=100 width=8)
>    ->  Hash  (cost=14425.00..14425.00 rows=100 width=8)
>          ->  Seq Scan on b  (cost=0.00..14425.00 rows=100 width=8)
>
> so pretty much perfectly accurate, while the new code does this:
>
>                              QUERY PLAN
> --
>  Hash Join  (cost=4.00..573853.57 rows=10 width=16)
>    Hash Cond: ((f.b1 = b.b1) AND (f.b2 = b.b2))
>    ->  Hash Join  (cost=2.00..363955.16 rows=1175 width=16)
>          Hash Cond: ((f.a1 = a.a1) AND (f.a2 = a.a2))
>          ->  Seq Scan on f  (cost=0.00..154056.75 rows=1175 width=16)
>          ->  Hash  (cost=14425.00..14425.00 rows=100 width=8)
>                ->  Seq Scan on a  (cost=0.00..14425.00 rows=100 width=8)
>    ->  Hash  (cost=14425.00..14425.00 rows=100 width=8)
>          ->  Seq Scan on b  (cost=0.00..14425.00 rows=100 width=8)
>
> I think this is due to this check in clauselist_join_selectivity:
>
>if (bms_get_singleton_member(sjinfo->min_righthand, ) &&
>bms_get_singleton_member(sjinfo->min_lefthand, ))
>
> which pretty much says "only work with joins on two base relations". So as
> long as you have a joinrel, (e.g. a fact + one of the dimensions), it's
> game over.
>
> I think the restriction is unnecessary, because when estimating joins, we
> effectively take cardinality of a cartesian product of all the base
> relations, and then apply selectivities for all the join quals (in a
> pairwise manner).
>
> So for the three tables we take
>
>card(join) = card(f) * card(a) * card(b) * sel(f,a) * sel(f,b)
>
> and we can consider the foreign keys in the pairwise selectivities.
>
>
I looked at this again, and I'm not all that sure it's possible to assume
that 1.0 /  is valid when there's more than one relation at either
side 

Re: [HACKERS] Parallel Seq Scan

2015-09-24 Thread Amit Kapila
On Fri, Sep 25, 2015 at 8:50 AM, Robert Haas  wrote:
>
> On Thu, Sep 24, 2015 at 2:31 PM, Amit Kapila 
wrote:
> > I have fixed most of the review comments raised in this mail
> > as well as other e-mails and rebased the patch on commit-
> > 020235a5.  Even though I have fixed many of the things, quite a few
> > comments are yet to be handled.  This patch is mainly a rebased version
> > to ease the review.  We can continue to discuss the leftover things and
> > I will address those in subsequent patches.
>
> Thanks for the update.  Here are some more review comments:
>
> 1. parallel_seqscan_degree is still not what we should have here.  As
> previously mentioned, I think we should have a GUC for the maximum
> degree of parallelism in a query generally, not the maximum degree of
> parallel sequential scan.
>

Agreed, and upthread I have asked for your opinion after proposing some
suitable names.

Some suitable names could be:
degree_of_parallelism, max_degree_of_parallelism,
degree_of_query_parallelism, max_degree_of_query_parallelism


> 2. fix_node_funcids() can go away because of commit
> 9f1255ac859364a86264a67729dbd1a36dd63ff2.
>

Agreed.

> 3. cost_patialseqscan is still misspelled.  I pointed this out before,
too.
>

In the latest patch (parallel_seqscan_partialseqscan_v18.patch) that I
posted yesterday, this was fixed.  Am I missing something, or are you by
any chance referring to the wrong version of the patch?

> 4. Much more seriously than any of the above,
> create_parallelscan_paths() generates plans that are badly broken:
>
> rhaas=# explain select * from pgbench_accounts where filler <
random()::text;
>QUERY PLAN
>
-
>  Funnel on pgbench_accounts  (cost=0.00..35357.73 rows=333 width=97)
>Filter: ((filler)::text < (random())::text)
>Number of Workers: 10
>->  Partial Seq Scan on pgbench_accounts  (cost=0.00..35357.73
> rows=333 width=97)
>  Filter: ((filler)::text < (random())::text)
> (5 rows)
>
> This is wrong both because random() is parallel-restricted and thus
> can't be executed in a parallel worker, and also because enforcing a
> volatile qual twice is no good.
>

Yes, the patch needs more work in terms of dealing with parallel-restricted
expressions/functions.  One idea which I have explored previously is to
push down only the safe clauses to workers (via the partialseqscan node) and
execute the restricted clauses in the master (via the Funnel node).  My
analysis is as follows:

Usage of restricted functions in quals -
During the create_plan() phase, separate out the quals that need to be
executed at the funnel node from the quals that need to be executed on
the partial seq scan node (do something similar to what is done in
create_indexscan_plan for index and non-index quals).

Basically, the PartialSeqScan node can contain two different lists of quals,
one for non-restrictive quals and the other for restrictive quals, and then
the Funnel node can retrieve the restrictive quals from the partialseqscan
node, assuming the partialseqscan node is its left child.
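
For illustration, the node shape could be something like this (a
hypothetical sketch; the field names are mine, not from the patch):

/*
 * Hypothetical sketch only: a PartialSeqScan carrying two qual lists, so
 * that the Funnel above it can pull out the parallel-restricted quals and
 * evaluate them in the master backend.
 */
typedef struct PartialSeqScan
{
    Scan        scan;               /* regular scan fields */
    List       *safe_quals;         /* parallel-safe, executed in workers */
    List       *restricted_quals;   /* evaluated at the Funnel node */
} PartialSeqScan;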

Now, I think the above is only possible under the assumption that the
partialseqscan node is always the left child of the funnel node. However,
that is not true, because a gating node (Result node) can be added between
the two, and tomorrow there could be more cases where other nodes are added
between them.  If we consider the case of aggregation, the situation becomes
even more complex, because all the quals should be executed before partial
aggregation.

Unless there is a good way to achieve this partial execution of quals,
I think it is better to prohibit parallel plans if there is any restrictive
clause.  Yet another way could be to not push down a qual if it contains
restricted functions and instead execute it at the Funnel node, but I think
that will be quite costly due to the additional flow of tuples from the
worker backends.


Usage of restricted functions in the target list -
One way could be that if the target list contains any restricted function,
then the parallel worker always sends the complete tuple and the target
list is evaluated by the master backend during funnel node execution.
I think some of the restrictions that apply to quals apply here as well,
especially if there are other nodes like aggregation in between the Funnel
and the partialseqscan.

> rhaas=# explain select * from pgbench_accounts where aid % 1 = 0;
>   QUERY PLAN
>
---
>  Funnel on pgbench_accounts  (cost=0.00..28539.55 rows=5 width=97)
>Filter: ((aid % 1) = 0)
>Number of Workers: 10
>->  Partial Seq Scan on pgbench_accounts  (cost=0.00..28539.55
> rows=5 width=97)
>  Filter: ((aid % 1) = 0)
> (5 rows)
>
> This will work, but it's a bad plan because we shouldn't need to
> re-apply the filter 

Re: [HACKERS] Rename withCheckOptions to insertedCheckClauses

2015-09-24 Thread Tom Lane
I wrote:
> - List   *ri_WithCheckOptions;
> - List   *ri_WithCheckOptionExprs;
> + List   *ri_InsertedCheckClauses;
> + List   *ri_InsertedCheckClauseExprs;

> The distinction between a "clause" and an "expr" is not very obvious,
> and certainly most other places in the code use those terms pretty
> interchangeably, so I find both the old and new names unclear here.
> How about ri_InsertedCheckClauseStates instead for the second list?
> And similarly if you're using "Expr" to mean ExprState anywhere else.

Actually ... does struct ResultRelInfo need to carry the original WCO
clauses at all, rather than just the exprstate list?  In most places
we do not store expr and exprstate lists in the same node in the first
place, so we can get away with using the same field name for corresponding
lists in plan and planstate nodes.  That's why we don't already have a
convention like "fooStates" for such lists.

Another thought is that as long as these are lists specifically of
WithCheckOption nodes, and not arbitrary expressions, "clause" isn't an
especially good term for them; it implies generality that isn't there.
And CheckClauses invites confusion with, for example, CHECK clauses of
domain types.  So maybe better names would be "ri_InsertedCheckOptions"
(and "ri_InsertedCheckOptionStates" if you still need that).  Or maybe
"ri_InsertedWCOClauses" and "ri_InsertedWCOClauseStates".  I'm less sure
about whether this is an improvement, though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Tom Lane
Feng Tian  writes:
> Ah, link.
> https://github.com/vitesse-ftian/pgdecimal

Please include the actual patch as an attachment.  We do not consider mere
URLs to be acceptable patch submission format, because that provides no
permanent record in our archives of what was submitted.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread David Rowley
On 25 September 2015 at 08:29, Feng Tian  wrote:

> Compared to the numeric type, decimal64 arithmetic is about 2x faster and
> decimal128 is about 1.5x faster.  However, the cast between decimal and
> float4/8 is implemented rather naively and is slow.  As always, it depends
> on the workload: decimal may take more or less space, and may be slower if
> casts are performed frequently.
>

Are you able to share the processor vendor, and perhaps some other specs of
the machine you obtained these results from?

Regards

David Rowley

--
 David Rowley   http://www.2ndQuadrant.com/

 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-24 Thread Tom Lane
Josh Berkus  writes:
> On 09/24/2015 12:55 PM, Tom Lane wrote:
>> I agree with the idea that we don't yet want to give the impression that
>> this is the official bug tracker.  However, "beta-bugs" could give the
>> impression that it was specifically for bugs about 9.5beta, without
>> dispelling the idea that it is official.  Maybe "bugs-test.p.o"?

> I'd suggest instead just having a big banner up in the page header which
> says "this system is currently beta and not yet the canonical source for
> postgres bug information".  That way, if it does become the canonical
> source, we won't go breaking everyone's links when we change the domain
> name.

Works for me, if it's not too hard to do.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Decimal64 and Decimal128

2015-09-24 Thread Josh Berkus
On 09/24/2015 02:23 PM, Feng Tian wrote:
> If there is enough interest, it would be great for this to go into the
> official contrib dir.
> Thanks, 
> 
> 
> On second thought, the extension depends on decNumber, which is under
> either the GPL or the ICU license.  Maybe this is trouble.
>  

Yes.  Please just build an external extension and submit it to PGXN.
Thanks!

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Doubt in pgbench TPS number

2015-09-24 Thread Tatsuo Ishii
Today I got an interesting output from pgbench.
The scenario is executing SELECT pg_sleep(0.1).

$ cat test.sql 
select pg_sleep(0.1);

$ pgbench -C -n -p 11002 -c 10 -T 30 -f test.sql  test
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 30 s
number of transactions actually processed: 2883
latency average: 104.058 ms
tps = 95.799114 (including connections establishing)
tps = 124.487149 (excluding connections establishing)

The interesting thing is the number for "excluding connections
establishing". 124.487149 tps with 10 clients corresponds to about 0.080
seconds per transaction for each client. Since the query executes
pg_sleep(0.1), that number should be equal to or greater than 0.1. Maybe
some tolerance is expected, but a 20% error seems too high to me.

Note that if "-C" is not present, the TPS number seems to be sane.

$ pgbench -n -p 11002 -c 10 -T 30 -f test.sql  test
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 30 s
number of transactions actually processed: 2970
latency average: 101.010 ms
tps = 98.692514 (including connections establishing)
tps = 98.747053 (excluding connections establishing)
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doubt in pgbench TPS number

2015-09-24 Thread Fabien COELHO


Hello Tatsuo-san,


$ pgbench -C -n -p 11002 -c 10 -T 30 -f test.sql  test
tps = 95.799114 (including connections establishing)
tps = 124.487149 (excluding connections establishing)

The interesting thing is the number for "excluding connections
establishing". 124.487149 tps with 10 clients corresponds to about 0.080
seconds per transaction for each client. Since the query executes
pg_sleep(0.1), that number should be equal to or greater than 0.1. Maybe
some tolerance is expected, but a 20% error seems too high to me.


Indeed.


Note that if "-C" is not present, the TPS number seems to be sane.


Hmmm... I never use -C. The formula seems ok:

tps_exclude = normal_xacts / (time_include -
(INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));

conn_total_time is the cumulated time spent by all threads.

A quick look at the logic reveals some minor issues (conn_time is passed
as an argument to doCustom, although it is already available as a "thread"
field, which is strange). I spotted "threads" used instead of "thread" in an
accumulation, but it is not related to this computation.


Hmmm. I'll have a slower look...

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers