[HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
From: Amit Kapila [amit.kap...@huawei.com], via pgsql-hackers-ow...@postgresql.org
Sent: Monday, October 15, 2012 7:28 PM
To: robertmh...@gmail.com; j...@agliodbs.com
Cc: pgsql-hackers@postgresql.org
Subject: [HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]

On Saturday, October 13, 2012 1:24 PM Amit Kapila wrote:
On Tue, 26 Jun 2012 17:04:42 -0400 Robert Haas wrote:
>> I see you posted up a follow-up email asking Tom what he had in mind.
>> Personally, I don't think this needs incredibly complicated testing.
>> I think you should just test a workload involving inserting and/or
>> updating rows with lots of trailing NULL columns, and then another
>> workload with a table of similar width that... doesn't. If we can't
>> find a regression - or, better, we find a win in one or both cases -
>> then I think we're done here.

> As per the last discussion of this patch, performance data needs to be
> provided before its review can proceed further.
> So, following your suggestion and the discussion of this patch, I have
> collected the performance data below.
> Results were taken with the following configuration:
> 1. Schema: UNLOGGED TABLE with 2,000,000 records, all columns of INT type.
> 2. shared_buffers = 10GB
> 3. All performance results were taken with a single connection.
> 4. Performance was measured for the INSERT operation (insert into temptable select * from inittable).
> Platform details:
> Operating System: Suse-Linux 10.2 x86_64
> Hardware: 4 cores (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
> RAM: 24GB

Further to the performance data, I have completed my review of the patch.
Please find the patch addressing the review comments attached to this mail.
IMO it is now ready for a committer.

With Regards,
Amit Kapila.
Truncate-trailing-null-columns-from-heap-rows.v2.patch Description: Truncate-trailing-null-columns-from-heap-rows.v2.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hash_search and out of memory
On Thu, Oct 18, 2012 at 8:35 AM, Tom Lane wrote:
> I wrote:
>> Hitoshi Harada writes:
>>> If OOM happens during expand_table() in hash_search_with_hash_value()
>>> for RelationCacheInsert,
>
> the palloc-based allocator does throw errors. I think that when that
> was designed, we were thinking that palloc-based hash tables would be
> thrown away anyway after an error, but of course that's not true for
> long-lived tables such as the relcache hash table.
>
> I'm not terribly comfortable with trying to use a PG_TRY block to catch
> an OOM error - there are too many ways that could break, and this code
> path is by definition not very testable. I think moving up the
> expand_table action is probably the best bet. Will you submit a patch?

Here it is. I factored out the bucket-finding code to re-calculate it
after expansion.

Thanks,
--
Hitoshi Harada

hashoom.patch
Description: Binary data
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
I wrote:
> Or we could go back to using ARRAY here --- that should be safe since
> ARRAY is already fully reserved.

Ugh ... no, that doesn't work, because ARRAY[...] is allowed in c_expr
and hence b_expr. So the ambiguity would still be there. We'd need a
different fully-reserved keyword to go this way.

regards, tom lane
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility
On Thu, Oct 18, 2012 at 10:03 PM, Hannu Krosing wrote:
> Hmm. Maybe we should think of implementing this as REMOTE TABLE, that
> is a table which gets no real data stored locally, but all inserts go
> through WAL and are replayed as real inserts on the slave side.

FWIW, MySQL calls this exact concept the "black hole" storage engine.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
Marco Nenciarini writes:
> Please find the attached refreshed patch (v2) which fixes the loose ends
> you found.

I've started looking at this patch, and the first thing I notice is that
the syntax doesn't work. It's ambiguous, and this:

 %left JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL
 /* kluge to keep xml_whitespace_option from causing shift/reduce conflicts */
 %right PRESERVE STRIP_P
+%nonassoc ELEMENT
 %%

is not in any way an acceptable fix. All that that will do is cause an
arbitrary choice to be made when it's not clear what to do. Half the
time the arbitrary choice will be wrong. Consider for example

regression=# create table t1 (f1 int[] default 4! element references t2);
ERROR: column "element" does not exist

The parser has resolved the ambiguity about whether "!" is an infix or
postfix operator by assuming it's infix. (Yeah, I realize we've "fixed"
some similar cases with precedence hacks, but they are cases that were
forced on us by brain-dead syntax choices in the SQL standard. We don't
need to go there for syntax we're making up ourselves.)

We could get around that by making ELEMENT a fully reserved word, but I
don't think that's a really acceptable solution. ELEMENT is reserved
according to more recent versions of the SQL standard, but only as a
built-in function name, and in any case reserving it is very likely to
break people's existing applications.

Another possibility is to forget about the column-constraint ELEMENT
REFERENCES syntax, and only support the table-constraint syntax with
ELEMENT inside the column list --- I've not checked, but I think that
syntax doesn't have any ambiguity problems.

Or we could go back to using ARRAY here --- that should be safe since
ARRAY is already fully reserved. Or we could choose some other syntax.
I'm wondering about dropping the use of a keyword entirely, and instead
using '[]' decoration.
This wouldn't work too badly in the table constraint case:

 FOREIGN KEY (foo, bar[]) REFERENCES t (x, y)

but I'm less sure where to put the decoration for the column constraint
case. Thoughts?

regards, tom lane
Re: [HACKERS] Deprecating RULES
On Thu, Oct 18, 2012 at 6:10 PM, Josh Berkus wrote:
> Daniel,
>
>> I'm not going to disagree with that; I only feel it's reasonable to
>> ask those who react so strongly against deprecation why they think
>> what they do, and to receive a clinical response, because not everyone
>> has seen those use cases. My level of interest in deprecation is only
>> as far as "if those who have to deal with the RULES implementation
>> don't want to work on it anymore in favor of other things, I think the
>> pain to users of deprecation is, from my vantage point, manageable if
>> given some time."
>
> Note that you have heard from one of the people maintaining RULES, who
> doesn't find them problematic to maintain (Tom). Note that the original
> hackers calling for deprecation do not work on RULEs except where they
> touch other features.
>
> And I'll say again: if you want a full list of use-cases for RULEs, you
> need to go further than the -hackers list. This is a small, insular
> community which does not represent the majority of PostgreSQL users.
> You have gone to Heroku's users, but given the nature of your user base,
> they seem like the least likely group of people to use RULEs. That's
> like me polling a bunch of Data Warehousing geeks and then declaring
> that we don't really need SERIALIZABLE.

I have tried very assiduously to avoid generalizing, even though perhaps
I have failed. I made a false assumption that nobody wanted to work on
RULES, and I am very sorry for that.

What I do not like, and stand by, is the terse dismissal of even the
idea of deprecation, with a feeling that it is entirely unnecessary to
explain that assessment in any detail for dozens of emails. I don't
think that's a healthy thing, especially for a community where
(traditionally) deprecation comes up so seldom. I don't think this took
place so much in the other deprecation threads that spawned in this
one's wake. I can only offer the data I have.
Please do not over-read what I have been trying to communicate, or
conflate my position with those of other individuals, if you feel there
is a chance of that. Let us please consider the matter resolved unless
you feel you have more pointers for me -- anyone can send them to me
individually, perhaps. I am not soldiering for the deprecation of RULES,
but rather questioning the community's approach to a mere suggestion of
certain kinds of change. Perhaps this is a one-off, because discussion
in the other threads has been seemingly healthier.

--
fdr
Re: [HACKERS] Deprecating RULES
Daniel,

> I'm not going to disagree with that; I only feel it's reasonable to
> ask those who react so strongly against deprecation why they think
> what they do, and to receive a clinical response, because not everyone
> has seen those use cases. My level of interest in deprecation is only
> as far as "if those who have to deal with the RULES implementation
> don't want to work on it anymore in favor of other things, I think the
> pain to users of deprecation is, from my vantage point, manageable if
> given some time."

Note that you have heard from one of the people maintaining RULES, who
doesn't find them problematic to maintain (Tom). Note that the original
hackers calling for deprecation do not work on RULEs except where they
touch other features.

And I'll say again: if you want a full list of use-cases for RULEs, you
need to go further than the -hackers list. This is a small, insular
community which does not represent the majority of PostgreSQL users.
You have gone to Heroku's users, but given the nature of your user base,
they seem like the least likely group of people to use RULEs. That's
like me polling a bunch of Data Warehousing geeks and then declaring
that we don't really need SERIALIZABLE.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Deprecating RULES
On Thu, Oct 18, 2012 at 1:55 PM, Andrew Dunstan wrote:
> On 10/18/2012 01:11 PM, Daniel Farina wrote:
>
>> Here's another use case from my history with RULES that didn't seem
>> to pan out so well: in my recollection, one way to use rules is to
>> retarget operations that happen against a view and move them to a
>> table, and as I recall, to make this work as expected one had to
>> have a very wordy RULE (for UPDATEs) with a litany of (fairly simple)
>> equality and not-null conditions to make it work as one would expect
>> (to not under-constrain the UPDATE). This became a maintenance
>> headache whenever attributes were added to the underlying relation.
>
> Yes, but you also get a similar headache with a trigger. Unless you're
> VERY careful you can get a trigger failure by adding an attribute, and
> an almost guaranteed one by removing an attribute. It's true that the
> language for specifying the operations is more expressive, but no
> matter what mechanism you use, changing the shape of the objects can
> get you into trouble.
>
> I've never said that rules are perfect, nor that they should be used
> whenever possible. What I have said is that there are known cases where
> they are the best solution currently available. I still think that.

I'm not going to disagree with that; I only feel it's reasonable to ask
those who react so strongly against deprecation why they think what they
do, and to receive a clinical response, because not everyone has seen
those use cases. My level of interest in deprecation is only as far as
"if those who have to deal with the RULES implementation don't want to
work on it anymore in favor of other things, I think the pain to users
of deprecation is, from my vantage point, manageable if given some
time."
I also want to be very clear that I know my vantage point is skewed, but
I feel that exposing what assessment of user activity I can to -hackers
is important, and the best I can do when it comes to considering topics
like these.

--
fdr
Re: [HACKERS] pg_stat_lwlocks view - lwlocks statistics, round 2
On Thu, Oct 18, 2012 at 10:36 PM, Robert Haas wrote:
> Sadly, the situation on Windows doesn't look so good. I don't remember
> the exact numbers but I think it was something like 40 or 60 or 80
> times slower on the Windows box one of my colleagues tested than it is
> on Linux.

Do you happen to know the hardware and Windows version? Windows'
QueryPerformanceCounter, which instr_time.h uses, should use RDTSC-based
timing when the hardware supports it, just like Linux. I don't know
whether Windows can avoid the syscall overhead, though.

> Maybe it's worth finding a platform where pg_test_timing reports that
> timing is very slow and then measuring how much impact this has on
> something like a pgbench or pgbench -S workload.

This can easily be tested on Linux by changing to the hpet or acpi_pm
clocksource. There probably still are platforms that do worse than
those, but probably not by orders of magnitude.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
Re: [HACKERS] Deprecations in authentication
On Thu, 2012-10-18 at 12:38 -0400, Tom Lane wrote:
> I think the argument that it causes user confusion is a fairly strong
> one, though.

What is confusing, IMO, is changing the hba syntax all the time.
Re: [HACKERS] Deprecations in authentication
On Thu, 2012-10-18 at 13:20 +0200, Magnus Hagander wrote:
> In particular, we made a couple of changes over several releases back
> in the authentication config, that we should perhaps consider
> finishing by removing the old stuff now?
>
> 1. krb5 authentication. We've had gssapi since 8.3 (which means in all
> supported versions). krb5 has been deprecated, also since 8.3. Time to
> remove it?
>
> 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old
> syntax deprecated but still mapping to the new one. Has it been there
> long enough that we should start throwing an error for ident on unix?

The hba syntax changes between 8.3 and 8.4 continue to annoy me to this
day, so I'd like to avoid these in the future, especially if they are
for mostly cosmetic reasons. I think any change should be backward
compatible with all supported versions, or alternatively with 8.4, since
that's incompatible with 8.3 anyway. (Those two will be the same before
9.3 goes out.)

So, in my opinion, krb5 could be removed, assuming that gssapi is a full
substitute. But ident-over-unix-sockets should stay, at least until 9.0
is EOL.
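For concreteness, the two deprecated spellings under discussion look like this in pg_hba.conf (illustrative fragment only; host/database/user fields are placeholders):

```
# Deprecated since 8.3; "gss" is the replacement spelling:
#host   all   all   0.0.0.0/0   krb5
host    all   all   0.0.0.0/0   gss

# "ident" on a unix-domain socket was renamed to "peer" in 9.1;
# the old spelling still maps to the new one:
#local  all   all               ident
local   all   all               peer
```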
Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans
On Thu, Oct 18, 2012 at 5:30 PM, Claudio Freire wrote:
> Backward:
>
>  QUERY PLAN
> --
>  GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual time=28.190..157708.405 rows=9001 loops=1)
>    -> Index Only Scan Backward using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4) (actual time=28.178..135282.317 rows=9001 loops=1)
>         Index Cond: ((aid >= 1000) AND (aid <= 2))
>         Heap Fetches: 0
>  Total runtime: 160735.539 ms
>
> I/O throughput averages 12MB/s (a small increase), and the 3-second
> difference seems related to it (it's consistent).
> I/O utilization averages 88% (an important increase).
>
> This last result makes me think deeper prefetching could be
> potentially beneficial (it would result in read merges), but it's
> rather hard to implement without a skiplist of leaf pages. Maybe the
> backward-sequential pattern could be detected. I'll have to tinker
> with that.

Fun. That didn't take long. With the attached anti-sequential scan
patch, and effective_io_concurrency=8:

 QUERY PLAN
 -
 GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual time=26.964..84299.789 rows=9001 loops=1)
   -> Index Only Scan Backward using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4) (actual time=26.955..62761.774 rows=9001 loops=1)
        Index Cond: ((aid >= 1000) AND (aid <= 2))
        Heap Fetches: 0
 Total runtime: 87170.355 ms

I/O throughput 22MB/s (twice as fast).
I/O utilization 95% (I was expecting 100% but... hey... good enough).

With e_i_c=24, it gets to 100% utilization and 30MB/s (that's 3 times
faster). So, I'd like to know what you think, but maybe for
back-sequential scans, prefetch should be set to a multiple (e.g. 24x)
of e_i_c, in order to exploit read request merges.

postgresql-git-bt_prefetch_backseq.diff
Description: Binary data
Re: [HACKERS] Skip checkpoint on promoting from streaming replication
On 18 October 2012 21:22, Alvaro Herrera wrote:
> This patch seems to have been neglected by both its submitter and the
> reviewer. Also, Simon said he was going to set it
> returned-with-feedback on his last reply, but I see it as needs-review
> still in the CF app. Is this something that is going to be reconsidered
> and resubmitted for the next commitfest? If so, please close it up in
> the current one.

I burned time on the unlogged table problems, so I haven't got round to
this yet. I'm happier than I was with this. I'm also conscious that this
is very important and that there are no later patch dependencies, so
there's no rush to commit it and every reason to make sure it happens
without any mistakes. It will be there for 9.3.

--
Simon Riggs                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Deprecating RULES
> Well, it'd be nice to be able to rewrite a query referring to a table
> to still refer to that same table, but you can't, because you get
> infinite recursion.

If that were possible, it would be quite easy to express row/column
level security policies with it.

> If you could do that, it'd presumably be a WHOLE lot faster than
> triggers. Because frankly, if triggers were infinitely fast, I don't
> think we'd be having this conversation. But they're not. They're slow;
> really slow, and while we may be able to improve that somewhat by some
> means, they're basically always going to be slow. Being able to
> rewrite queries inside the server is useful, and rules are not a very
> good solution to that problem, but right now they're the only thing
> we've got.

Moreover, there are no triggers fired on selects.

Best regards
--
Luben Karavelov
Re: [HACKERS] embedded list
Alvaro Herrera writes:
> Oops. I mentioned this explicitly somewhere in the discussion. I
> assumed you had seen that, and that you would have complained had you
> found it objectionable.

Sorry, I've been too busy to pay very much attention to this patch.

>> I think we should remove the head argument at least from dlist_delete,
>> and probably also dlist_insert_after and dlist_insert_before.

> There are more functions that get the list head just to run the check.
> Can I assume that you don't propose removing the argument from those?
> (dlist_next_node and dlist_prev_node, I think, are the only ones.)

Yeah, I wondered whether to do the same for those. But it's less of an
issue there, because in practice the caller is almost certainly going to
also need to do dlist_has_next or dlist_has_prev respectively, and those
require the list header.

On the other hand, applying the same principle to slists, you could
argue that slist_has_next and slist_next_node should not require the
head pointer (since requiring it throws away an advantage of slists).
If we wanted to remove the head pointer from those, there would be some
value in not having the head argument in dlist_next_node/dlist_prev_node,
for symmetry with slist_next_node. I'm not as excited about these since
it seems relatively less likely to matter. What do you think?

regards, tom lane
Re: [HACKERS] Deprecating RULES
On 10/18/2012 01:11 PM, Daniel Farina wrote:
> Here's another use case from my history with RULES that didn't seem
> to pan out so well: in my recollection, one way to use rules is to
> retarget operations that happen against a view and move them to a
> table, and as I recall, to make this work as expected one had to
> have a very wordy RULE (for UPDATEs) with a litany of (fairly simple)
> equality and not-null conditions to make it work as one would expect
> (to not under-constrain the UPDATE). This became a maintenance
> headache whenever attributes were added to the underlying relation.

Yes, but you also get a similar headache with a trigger. Unless you're
VERY careful you can get a trigger failure by adding an attribute, and
an almost guaranteed one by removing an attribute. It's true that the
language for specifying the operations is more expressive, but no
matter what mechanism you use, changing the shape of the objects can
get you into trouble.

I've never said that rules are perfect, nor that they should be used
whenever possible. What I have said is that there are known cases where
they are the best solution currently available. I still think that.

cheers

andrew
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
On 2012-10-18 20:08, Tom Lane wrote:
> Alvaro Herrera writes:
>> Boszormenyi Zoltan wrote:
>>> this is the latest one, fixing a bug in the accounting of
>>> per-statement lock timeout handling and tweaking some comments.
>> Tom, are you able to give this patch some more time on this commitfest?
>
> I'm still hoping to get to it, but I've been spending a lot of time on
> bug fixing rather than patch review lately :-(. If you're hoping to
> close out the current CF soon, maybe we should just slip it to the
> next one.

Fine by me. Thanks.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de http://www.postgresql.at/
Re: [HACKERS] embedded list
Tom Lane wrote:
> Alvaro Herrera writes:
>> Here's the final version. I think this is ready to go in.
>
> I got around to reviewing this today. I'm pretty seriously annoyed at
> the definition of dlist_delete: it should *not* require the list header.
> The present coding simply throws away one of the primary advantages of
> a doubly-linked list over a singly-linked list, namely that you don't
> have to have your hands on the list header in order to unlink a node.
> This isn't merely academic either, as I see that the patch to catcache
> code actually added a field to struct catctup to support making the
> list header available. That's a complete waste of 8 bytes (on a 64-bit
> machine) per catalog cache entry. The only thing it buys for us is
> the ability to run dlist_check, which is something that isn't even
> compiled (not even in an Assert build), and which doesn't actually do
> that much useful even if it is compiled --- for instance, there's no way
> to verify that the nodes were actually in the list claimed.

Oops. I mentioned this explicitly somewhere in the discussion. I assumed
you had seen that, and that you would have complained had you found it
objectionable. (It's hard enough to figure out whether people don't
respond because they don't have a problem with something, or just
because they didn't see it.)

On the other hand, it's convenient to remove them, because in
predicate.c there are plenty of SHM_QUEUE node removals, which are
clearly easier to port over to dlist if we don't have to figure out
exactly which list each node is in. (Maybe in other SHM_QUEUE users as
well, but that's the most complex of the bunch.)

> I think we should remove the head argument at least from dlist_delete,
> and probably also dlist_insert_after and dlist_insert_before.

There are more functions that get the list head just to run the check.
Can I assume that you don't propose removing the argument from those?
(dlist_next_node and dlist_prev_node, I think, are the only ones.)
--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Prefetch index pages for B-Tree index scans
I've noticed, while doing some reporting queries, that index scans fail
to saturate server resources on compute-intensive queries. The problem
is that, just after fetching a page, postgres starts computing stuff
before fetching the next. This results in an I/O - compute - I/O -
compute alternation that leaves both CPU and disk idle part of the time.

I've also noticed that earlier patches attempted to implement prefetch
of index pages, yet they don't seem to have been committed. I'm
wondering why. The patches themselves were quite complex, attempting to
prefetch heap tuples in addition to index tuples. I can see how this
would be beneficial, but heap prefetch is notoriously more complicated,
and with the advent of index-only scans, maybe index-page-only
prefetching would be of use.

To test this hypothesis, I wrote the attached patch. pgbench doesn't
seem to see any performance impact (expected, since pgbench uses really
tiny queries that touch a single page). Using pgbench's data with a
scale of 1000, I tried some other queries that were more taxing of
index-only scans. This was done on my personal computer, with a really
lame I/O subsystem (compared to database servers), and with 9.2.1 rather
than git, but I think it should be significant anyway. I will try to
verify on RAID-ed disks, but I'm in short supply of test systems at the
moment.

Pgbench's biggest index (on pgbench_accounts.aid) is, not surprisingly,
quite sequential, since it has just been created. So I tested both
forward and backward index scans, to get an idea of how the patch
impacts sequential and non-sequential workloads. I haven't been able to
test truly random ones yet - I'm trying to load up a dump from a
production database that's both big and messy enough to test.
A few things worth noting: the patch avoids doing prefetch on
single-page index access, and only prefetches when block numbers aren't
sequentially increasing (only when scanning the index nonsequentially
will prefetch be attempted), since I noticed the fadvise call in those
cases was being counterproductive.

So, here we go. The base query I used is:

explain analyze select aid, count(*) from pgbench_accounts where aid
between 1000 and 2 group by aid order by aid;

For backward scans, just order by aid desc. The full command used is:

sudo bash -c 'echo 3 > /proc/sys/vm/drop_caches' ; pg_ctl start -l
${PGLOG} ; sleep 5 ; ( sleep 10 ; echo 'set effective_io_concurrency
to 0;' ; echo 'explain analyze select aid, count(*) from
pgbench_accounts where aid between 1000 and 2 group by aid order by
aid;' ) | psql -h localhost -p 5433 pgbench ; sleep 5 ; pg_ctl stop

The server is started and stopped every time to make sure the shared
cache is empty, and the sleeps are there to keep the backlash I/O from
dropping caches from influencing the benchmark.
Results:

With effective_io_concurrency set to 0 (patch disabled):

Forward:
 QUERY PLAN
 GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual time=47.552..31113.353 rows=9001 loops=1)
   -> Index Only Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4) (actual time=47.542..13197.982 rows=9001 loops=1)
        Index Cond: ((aid >= 1000) AND (aid <= 2))
        Heap Fetches: 0
 Total runtime: 33648.500 ms

I/O throughput averages 60MB/s (clearly sequential).
I/O utilization averages around 30%.

Backward:
 QUERY PLAN
 --
 GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual time=10.279..159704.590 rows=9001 loops=1)
   -> Index Only Scan Backward using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4) (actual time=10.266..132853.382 rows=9001 loops=1)
        Index Cond: ((aid >= 1000) AND (aid <= 2))
        Heap Fetches: 0
 Total runtime: 163202.869 ms

I/O throughput averages 11MB/s (clearly not fully random, but not
sequential either).
I/O utilization averages 68%.

With effective_io_concurrency set to 1 (patch enabled):

Forward:
 QUERY PLAN
 GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual time=47.582..30474.222 rows=9001 loops=1)
   -> Index Only Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4) (actual time=47.571..12208.340 rows=9001 loops=1)
        Index Cond: ((aid >= 1000) AND (aid <= 2))
        Heap Fetches: 0
 Total runtime: 32875.6
Re: [HACKERS] Skip checkpoint on promoting from streaming replication
This patch seems to have been neglected by both its submitter and the
reviewer. Also, Simon said he was going to set it returned-with-feedback
on his last reply, but I see it as needs-review still in the CF app. Is
this something that is going to be reconsidered and resubmitted for the
next commitfest? If so, please close it up in the current one.

Thanks.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] embedded list
Alvaro Herrera writes:
> Here's the final version. I think this is ready to go in.

I got around to reviewing this today. I'm pretty seriously annoyed at
the definition of dlist_delete: it should *not* require the list header.
The present coding simply throws away one of the primary advantages of a
doubly-linked list over a singly-linked list, namely that you don't have
to have your hands on the list header in order to unlink a node. This
isn't merely academic either, as I see that the patch to the catcache
code actually added a field to struct catctup to support making the list
header available. That's a complete waste of 8 bytes (on a 64-bit
machine) per catalog cache entry. The only thing it buys for us is the
ability to run dlist_check, which is something that isn't even compiled
(not even in an Assert build), and which doesn't actually do that much
useful even if it is compiled --- for instance, there's no way to verify
that the nodes were actually in the list claimed.

I think we should remove the head argument at least from dlist_delete,
and probably also dlist_insert_after and dlist_insert_before.

regards, tom lane
Re: [HACKERS] Deprecating RULES
On Wed, Oct 17, 2012 at 7:25 PM, Tom Lane wrote:
> Josh Berkus writes:
>> I would tend to say "well, they're not hurting anyone, why not keep
>> them?" Except that we're gathering an increasing number of features
>> (RETURNING, FDWs, CTEs, Command triggers) which don't work well together
>> with RULEs.
>
> Really? On what do you base that claim? The only one of those that I
> might believe is command triggers, but AFAIK we only have/plan command
> triggers for DDL, so there's no overlap.
>
> I'm fairly annoyed by the entire tenor of this conversation, because
> the people who are hollering the loudest seem to be people who have
> never actually touched any of the rules code, but nonetheless seem
> prepared to tell those of us who have what to spend our time on.
>
> Now having said that, I would definitely like to see rules in their
> current form go away eventually. But not without a substitute.
> Triggers are not a complete replacement, and no amount of wishful
> thinking makes them so.
>
> Perhaps it would be more profitable to try to identify the pain points
> that make people so eager to get rid of rules, and then see if we could
> alleviate them. One big problem I know about offhand is the
> multiple-evaluation risk, which seems at least in principle fixable.
> What others are there?

Well, it'd be nice to be able to rewrite a query referring to a table to still refer to that same table, but you can't, because you get infinite recursion. If you could do that, it'd presumably be a WHOLE lot faster than triggers. Because frankly, if triggers were infinitely fast, I don't think we'd be having this conversation. But they're not. They're slow; really slow, and while we may be able to improve that somewhat by some means, they're basically always going to be slow.

Being able to rewrite queries inside the server is useful, and rules are not a very good solution to that problem, but right now they're the only thing we've got.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_lwlocks view - lwlocks statistics, round 2
On Tue, Oct 16, 2012 at 11:31 AM, Satoshi Nagayasu wrote:
> A flight-recorder must not be disabled. Collecting
> performance data must be a top priority for the DBA.

This analogy is inapposite, though, because a flight recorder rarely crashes the aircraft. If it did, people might have second thoughts about the "never disable the flight recorder" rule.

I have had a couple of different excuses to look into the overhead of timing lately, and it does indeed seem that on many modern Linux boxes even extremely frequent gettimeofday calls produce only very modest amounts of overhead. Sadly, the situation on Windows doesn't look so good. I don't remember the exact numbers, but I think it was something like 40 or 60 or 80 times slower on the Windows box one of my colleagues tested than it is on Linux. And it turns out that that overhead really is measurable and does matter if you do it in a code path that gets run frequently. Of course I am enough of a Linux geek that I don't use Windows myself and curse my fate when I do have to use it, but the reality is that we have a huge base of users who only use PostgreSQL at all because it runs on Windows, and we can't just throw those people under the bus. I think that older platforms like HP/UX likely have problems in this area as well, although I confess to not having tested.

That having been said, if we're going to do this, this is probably the right approach, because it only calls gettimeofday() in the case where the lock acquisition is contended, and that is a lot cheaper than calling it in all cases. Maybe it's worth finding a platform where pg_test_timing reports that timing is very slow and then measuring how much impact this has on something like a pgbench or pgbench -S workload. We might find that it is in fact negligible. I'm pretty certain that it will be almost if not entirely negligible on Linux, but that's not really the case we need to worry about.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] BUG #7521: Cannot disable WAL log while using pg_dump
You may disable full_page_writes, but as you can see from my previous post, disabling it did not do the trick. My ZFS dataset's USED property continues to increase.

On Wed, Oct 17, 2012 at 3:55 PM, ichbinrene wrote:
> I'm experiencing the exact same issue:
>
> PostgreSQL 9.1.6 on i686-pc-linux-gnu, compiled by gcc (GCC) 4.4.6 20120305
> (Red Hat 4.4.6-4), 32-bit
> CentOS release 6.3 (Final)
>
> I might also turn full_page_writes off, but I fear for data integrity in
> case of a crash.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrect behaviour when using a GiST index on points
On Thu, Oct 11, 2012 at 07:17:28AM -0400, Noah Misch wrote:
> On Tue, Oct 02, 2012 at 01:58:40PM -0400, Noah Misch wrote:
> > > > On Mon, Aug 27, 2012 at 7:43 PM, Tom Lane wrote:
> > > >> There's also the big-picture question of whether we should just get rid
> > > >> of fuzzy comparisons in the geometric types instead of trying to hack
> > > >> indexes to work around them.
> >
> > In any event, I think we should entertain a patch to make the GiST operator
> > class methods bug-compatible with corresponding operators. Even if we decide
> > to change operator behavior in HEAD, the back branches could use it.
>
> We have broad agreement that the specific implementation of fuzz in geometric
> comparison operators is shoddy, but nobody has voiced interest in designing a
> concrete improvement. I propose adding a TODO item "Remove or improve
> rounding in geometric comparison operators", endorsing Alexander's design, and
> reviewing his patch. Objections?

TODO added, and here's a review:

The patch adds no regression tests; it should add tests illustrating the problems it fixes.

I audited the other indexable geometric operators for similar problems. This passage in gist_point_consistent_internal(), which handles (point,point) operators, caught my suspicion:

    case RTSameStrategyNumber:
        if (isLeaf)
        {
            result = FPeq(key->low.x, query->x) &&
                     FPeq(key->low.y, query->y);
        }
        else
        {
            result = (query->x <= key->high.x &&
                      query->x >= key->low.x &&
                      query->y <= key->high.y &&
                      query->y >= key->low.y);
        }
        break;

A leaf entry reachable from an internal entry may fall exactly on the internal-entry bounding box. Since we would accept a fuzzy match at the leaf level, I think we must also accept a fuzzy match at the internal level.
> *** a/src/backend/access/gist/gistproc.c
> --- b/src/backend/access/gist/gistproc.c
> ***
> *** 1326,1331 gist_point_consistent(PG_FUNCTION_ARGS)
> --- 1327,1333
>       bool       *recheck = (bool *) PG_GETARG_POINTER(4);
>       bool        result;
>       StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
> +     BOX        *query, *key;

This function now has "query" variables within subsidiary blocks redundant with and masking this one. Avoid doing that.

>       switch (strategyGroup)
>       {
> ***
> *** 1337,1348 gist_point_consistent(PG_FUNCTION_ARGS)
>               *recheck = false;
>               break;
>           case BoxStrategyNumberGroup:
> !             result = DatumGetBool(DirectFunctionCall5(
> !                                         gist_box_consistent,
> !                                         PointerGetDatum(entry),
> !                                         PG_GETARG_DATUM(1),
> !                                         Int16GetDatum(RTOverlapStrategyNumber),
> !                                         0, PointerGetDatum(recheck)));
>               break;
>           case PolygonStrategyNumberGroup:
>           {
> --- 1339,1356
>               *recheck = false;
>               break;
>           case BoxStrategyNumberGroup:
> !             /*
> !              * This code repeats logic of on_ob which uses simple comparison
> !              * rather than FP* functions.
> !              */
> !             query = PG_GETARG_BOX_P(1);
> !             key = DatumGetBoxP(entry->key);
> !
> !             *recheck = false;
> !             result = key->high.x >= query->low.x &&
> !                      key->low.x <= query->high.x &&
> !                      key->high.y >= query->low.y &&
> !                      key->low.y <= query->high.y;

For leaf entries, this correctly degenerates to on_pb(). For internal entries, it must, but does not, implement box_overlap(). (The fuzzy box_overlap() would be fine.) I recommend making gist_point_consistent()'s treatment of boxes resemble its treatment of circles and polygons; that eases verifying their correctness. Call gist_box_consistent. Then, for leaf entries, call box_contain_pt(). GiST "consistent" functions often validate the strategy number, but the circle, polygon and box branches of gist_point_consistent silently assum
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility
On Thu, Oct 18, 2012 at 2:56 PM, Hannu Krosing wrote: > * works as table on INSERTS up to inserting logical WAL record describing > the > insert but no data is inserted locally. > > with all things that follow from the local table having no data > - unique constraints don't make sense > - indexes make no sense > - updates and deletes hit no data > - etc. . . Yep, I think I was understanding those aspects. I think I disagree that "indexes make no sense." I think that it would be meaningful to have an index type for this, one that is a pointer at WAL records, to enable efficiently jumping to the right WAL log to start accessing a data stream, given an XID. That's a fundamentally different sort of index than we have today (much the way that hash indexes, GiST indexes, and BTrees differ from one another). I'm having a hard time thinking about what happens if you have cascaded replication, and want to carry records downstream. In that case, the XIDs from the original system aren't miscible with the XIDs in a message queue on a downstream database, and I'm not sure what we'd want to do. Keep the original XIDs in a side attribute, maybe? It seems weird, at any rate. Or perhaps data from foreign sources has got to go into a separate queue/'sorta-table', and thereby have two XIDs, the "source system XID" and the "when we loaded it in locally XID." -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?" -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Truncate if exists
On Tue, Oct 16, 2012 at 2:12 PM, Stafford, David x78061 wrote:
> On Mon, 15 Oct 2012 10:21:18 -0700, Robert Haas wrote:
>> Yeah, I think the functionality that we need is pretty much there
>> already today. What we need to do is to get the syntax to a point
>> where people can write the code they want to write without getting
>> tangled up by it.
>>
>> I think the invention of DO was a big step in the right direction
>> ...
>> With DO, you can write the logic you want
>> as an SQL statement, it's just a clunky and awkward SQL statement. In
>> my view the goal ought to be to refine that mechanism to remove the
>> clunkiness and awkwardness, rather than to invent something completely
>> new.
>
> As someone who has worked with a number of databases now, I find that none
> of them really gets this DDL integration completely right. What I'd like
> to see is 1) a predicate to easily test things about the schema (does this
> table, column, index, schema, etc. exist? does it have the right type?)
> and 2) a way to conditionally execute DDL (and DML, which should fall
> right out, but it isn't really what this discussion is covering). I would
> propose extending the current EXISTS / NOT EXISTS predicate as follows:
>
> [NOT] EXISTS TABLE tab [ ( col [type] [ , col [type]]... ) ]
> [NOT] EXISTS COLUMN tab.col [type]
> [NOT] EXISTS INDEX tab.idx [ ( col [ , col]... ) ]
> [NOT] EXISTS CONSTRAINT tab.cstrt      -- not sure what else might be useful here
> [NOT] EXISTS ( select )                -- this is the current EXISTS predicate, of course
> [NOT] EXISTS ANY ROWS FROM tab [join] [WHERE predicate] [GROUP BY col [ , col]...]
>                                        -- exactly the same as (select 1 FROM etc.),
>                                        -- only because I like it better
>
> (the latter [which by no means am I nuts over; it's just that when
> extending EXISTS I can't stop neatening it up to my personal preferences]
> could be extended with [NOT] EXISTS MORE THAN n ROWS FROM and [NOT] EXISTS
> EXACTLY n ROWS FROM.)
> > There is a new SQL statement: IF predicate true-statement [ELSE > false-statement]. > > To actually execute this new IF statement, the executor would need an IF node > that evaluates the predicate (with ANDs and ORs, just like all SQL > predicates) and then executes the rest only if the predicate came out TRUE > (or NOT FALSE; I forget which is usually used, and the difference with NULL > could be useful, as long as it matches other predicates). This moves one > more bit of procedural logic into the executor. > > Another wrinkle is that the dependent statement might not compile, due to > missing tables or whatnot. Actually executing it while it doesn't compile is > an error, but we want to defer that error until we actually decide we need to > execute it. Also, it's probably good to try compiling it again at that > point. So my thought would be to try planning the dependent statement(s); if > they compile, hook them to the IF node; if not, hook a DEFERRED node to the > IF node. The DEFERRED node has the parse tree (or raw string, whatever makes > sense) of the statement; on execution it tries again to plan that statement; > if it succeeds, run it; if not, error out. > > I'd also add a SEQUENCE node to the executor. It just runs its children in > order (could be n-ary, or if fixed arity nodes are what is in the > planner/executor today, could be binary, first left, then right, and right > could be another SEQUENCE). The DEFERRED node means that a CREATE statement > could precede use of what is created in the same sequence and all could get > planned (with some deferral to execution time) in advance and run in one > lump. This implements DO at the executor level. > > The biggest concepts left from plpgsql are looping and variables. Most > variables could be modeled as a single row value; SQL already can update a > row, so the planning of assignments and calculations of scalars (and arrays, > I think) already fits into things the planner knows about. 
Table variables > (which I don't know that plpgsql supports, but someday it should) are less > defined. Adding plpgsql's loops to the executor would let whole functions > run under one trip through the executor. This is beyond just improving the > DDL support for scripts. > > I have written a number of database upgrade scripts. Over time we've made > them less fragile, by checking for the existence of tables, indexes, and most > recently, columns. The usual sequence is: > 1) check the existence of an index; check that the first few columns are > correct; if not, drop the index > 2) repeat for other indexes that have changed definition over time > 3) check the existence of the table; create with current layout if it is > missing > 4) chec
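For comparison, the existence-check-then-DDL sequences described in these upgrade scripts are what the DO idiom discussed upthread expresses today. This is a hedged sketch; the table, column, and index names are purely illustrative.

```sql
DO $$
BEGIN
    -- drop an index whose definition has changed over time
    IF EXISTS (SELECT 1 FROM pg_indexes
               WHERE schemaname = 'public'
                 AND indexname = 'accounts_old_idx') THEN
        DROP INDEX accounts_old_idx;
    END IF;

    -- add a column only if it is missing
    IF NOT EXISTS (SELECT 1 FROM information_schema.columns
                   WHERE table_name = 'accounts'
                     AND column_name = 'archived_at') THEN
        ALTER TABLE accounts ADD COLUMN archived_at timestamptz;
    END IF;
END
$$;
```

The proposal above would collapse each of those catalog subselects into a one-line predicate such as `NOT EXISTS COLUMN accounts.archived_at`, which is exactly the "clunkiness removal" Robert describes.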
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility
On 10/18/2012 08:36 PM, Claudio Freire wrote:
> The CREATE QUEUE command, in fact, could be creating such a channel.
> The channel itself won't be WAL-only, just the messages going through
> it. This (I think) would solve locking issues.

Hmm. Maybe we should think of implementing this as REMOTE TABLE, that is, a table which gets no real data stored locally, but all inserts go through WAL and are replayed as real inserts on the slave side.

Then if you want matviews or partitioned tables, you just attach triggers to the table on the slave side to do them. This would be tangential to their use as pure queues, which would happen at the level of plugins to logical replication.

-- Hannu

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility
On 10/18/2012 07:33 PM, Josh Berkus wrote:
> Simon,
>
>> It's hard to work out how to reply to this because it's just so off
>> base. I don't agree with the restrictions you think you see at all,
>> saying it politely rather than giving a one word answer.
>
> You have inside knowledge of Hannu's design.

Actually, Simon has currently no more knowledge of this specific design than you do - I posted this on this list as soon as I had figured it out as a possible solution to a specific problem: supporting full pgQ/Londiste functionality in WAL-based logical replication with minimal overhead. (Well, actually I let it settle a few weeks, but I did not discuss this off-list before.)

Simon may have a better grasp of it thanks to having done work on the BDR/Logical Replication design and thus having a better, or at least more recent, understanding of the issues involved in Logical Replication.

When mapping londiste/Slony message capture to logical WAL, the WAL already _is_ the event queue for replication. NOT LOGGED tables make it also usable for non-replication things using the same mechanisms. (The equivalent in a trigger-based system would be a log trigger which captures the insert event and then cancels the insert.)

> I am merely going from his description *on this list*, because that's
> all I have to go on. He requested comments, so here I am, commenting.
> I'm *hoping* that it's merely the description which is poor and not the
> conception of the feature. *As Hannu described the feature* it sounds
> useless and obscure, and miles away from powering any kind of general
> queueing mechanism.
If we describe a queue as something you put stuff in at one end and get it out in the same or some other specific order at the other end, then WAL _is_ a queue when you use it for replication (if you just write to it, it is a "Log"; if you write and read, it is a "Queue").

That is, the WAL already is a form of persistent and ordered (that is how WAL works) stream of messages ("WAL records") that are generated on the "master" and replayed on one or more consumers (called "slaves" in the case of simple replication). All it takes to make this scenario work is keeping track of the LSN, or simply the log position, on the slave side.

What you seem to be wanting is support for cooperative consumers, that is, multiple consumers on the same queue working together and sharing the work of processing the incoming events. This can be easily achieved using a single ordered event stream and extra bookkeeping structures on the consumer side (look at the cooperative consumer samples in skytools).

What I suggested was an optimisation for the case where you know that you will never need the data on the master side and are only interested in it on the slave side. By writing rows/events/messages only to the log (or stream or queue), you avoid the need to later clean them up on the master by either DELETE or TRUNCATE or rotating tables.

For both physical and logical streaming the WAL _is_ the queue of events that were recorded on the master and need to be replayed on the slave. Thanks to introducing logical replication, it now makes sense to have actions recorded _only_ in this queue, and this is what the whole RFC was about.

I recommend that you familiarize yourself a bit with skytools/pgQ to get a better feel for the things I am talking about. Londiste is just one application built on a general event logging, transport and transform/replay (that is what I'd call queueing :) ) system, pgQ. pgQ does have its roots in Slony (and earlier) replication systems, but it is by no means _only_ a replication system.
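The slave-side bookkeeping described above can be as small as one table per set of consumers. This is a hedged sketch, not pgQ's actual schema; all names are illustrative, and the position is stored as plain text (e.g. an LSN string such as '0/16B3740') rather than any particular type.

```sql
CREATE TABLE consumer_position (
    consumer_name text PRIMARY KEY,
    last_position text NOT NULL    -- LSN / log position already processed
);

-- a consumer resumes from its own recorded position ...
SELECT last_position
  FROM consumer_position
 WHERE consumer_name = 'matview_refresher';

-- ... and advances it after processing a batch of events
UPDATE consumer_position
   SET last_position = '0/16B3740'
 WHERE consumer_name = 'matview_refresher';
```

Cooperative consumers add one more layer of the same kind of bookkeeping (claiming batches of events), but the master-side stream itself stays a single ordered log either way.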
The LOG ONLY tables are _not_ needed for pure replication (like Slony), but they make replication + queueing type solutions like skytools/pgQ much more efficient, as they do away with the need to maintain the queued data on the master side where it will never be needed (just to repeat this once more).

> Or anything we discussed at the clustering meetings. And, again, if
> you didn't want comments, you shouldn't have posted an RFC.

I did want comments, and as far as I know I do not see you as hostile :)

I do understand that what you mean by QUEUE (and especially as a MESSAGE QUEUE) is different from what I described. You seem to want specifically an implementation of cooperative consumers for a generic queue. The answer is yes, it is possible to build this on WAL, or on the table-based event logs/queues of londiste/slony. It just takes a little extra management on the receiving side to do the record locking and distribution between cooperating consumers.

>> All we're discussing is moving a successful piece of software into
>> core, which has been discussed for years at the international
>> technical meetings we've both been present at. I think an open
>> viewpoint on the feasibility of that would be reasonable, especially
>> when it comes from one of the original designers.
>
> When I ask you for technical clarification or bring up potential
> problems with a 2Q feature, you consistently treat it as a personal attack
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 18 October 2012 10:20, Andres Freund wrote:
> On Thursday, October 18, 2012 06:12:02 AM Kevin Grittner wrote:
>> Kevin Grittner wrote:
>> > Hmm. The comment is probably better now, but I've been re-checking
>> > the code, and I think my actual code change is completely wrong.
>> > Give me a bit to sort this out.
>>
>> I'm having trouble seeing a way to make this work without rearranging
>> the code for concurrent drop to get to a state where it has set
>> indisvalid = false, made that visible to all processes, and ensured
>> that all scans of the index are complete -- while indisready is still
>> true. That is the point where TransferPredicateLocksToHeapRelation()
>> could be safely called. Then we would need to set indisready = false,
>> make that visible to all processes, and ensure that all access to the
>> index is complete. I can't see where it works to set both flags at
>> the same time. I want to sleep on it to see if I can come up with any
>> other way, but right now that's the only way I'm seeing to make DROP
>> INDEX CONCURRENTLY compatible with SERIALIZABLE transactions. :-(
>
> In a nearby bug I had to restructure the code in a way that's similar to
> this anyway, so that seems fine. Maybe you can fix the bug on top of the
> two attached patches?

First patch and first test committed. Working on second patch/test.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility
On Thu, Oct 18, 2012 at 2:33 PM, Josh Berkus wrote: >> I should also add that this is an switchable sync/asynchronous >> transactional queue, whereas LISTEN/NOTIFY is a synchronous >> transactional queue. > > Thanks for explaining. New here, I missed half the conversation, but since it's been brought up and (to me wrongfully) dismissed, I'd like to propose: NOTIFY [ALL|ONE] [REMOTE|LOCAL|CLUSTER|DOWNSTREAM] ASYNCHRONOUSLY LISTEN [REMOTE|LOCAL|CLUSTER|UPSTREAM] too for good measure. That ought to work out fine as SQL constructs go, implementation aside. That's not enough for matviews, but it is IMO a good starting point. All you need after that, are triggers for notifying automatically upon insert, and some mechanism to attach triggers to a channel for the receiving side. Since channels are limited to short strings, maybe a different kind of object (but with similar manipulation syntax) ought to be created. The CREATE QUEUE command, in fact, could be creating such a channel. The channel itself won't be WAL-only, just the messages going through it. This (I think) would solve locking issues. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
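For reference, the existing local-only primitives that these proposed REMOTE/CLUSTER/DOWNSTREAM qualifiers would extend look like this today; the channel name and payload are illustrative.

```sql
-- session A: subscribe to a channel
LISTEN queue_events;

-- session B: deliver a payload to all local listeners at commit
NOTIFY queue_events, 'row 42 inserted';

-- or, equivalently, from inside a function or trigger body:
SELECT pg_notify('queue_events', 'row 42 inserted');
```

The limitations Claudio alludes to are real: NOTIFY payloads are short strings, delivery is to currently connected local listeners only, and undelivered notifications are not persisted, which is why a durable channel-like object would need something closer to the CREATE QUEUE proposal.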
Re: [HACKERS] [v9.3] Row-Level Security
Kohei KaiGai escribió:

> The revised patch fixes the problem that Dean pointed out.

Robert, would you be able to give this latest version of the patch a look? (KaiGai, does it still apply cleanly? If not, please submit a rebased version.)

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SP-GiST for ranges based on 2d-mapping and quad-tree
Jeff Davis escribió:

> On Sat, 2012-08-18 at 18:10 +0400, Alexander Korotkov wrote:
> >
> > Thanks! There is a separate patch for adjacent. I've reworked adjacent
> > check in order to make it more clear.
>
> I am taking a look at this patch now. A few quick comments:
> * I tried some larger tests and they seemed to work. I haven't reviewed
> the spgist code changes in detail though.

Jeff, Heikki, any input on the subsequent version of this patch?

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statistics and selectivity estimation for ranges
Heikki, would you be able to give this patch a look and perhaps commit it?

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
Alexander Korotkov escribió:

> > 4. It looks like the randomization is happening while trying to compare
> > the penalties. I think it may be more readable to separate those two
> > steps; e.g.
> >
> >     /* create a mapping whether randomization is on or not */
> >     for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
> >         offsets[i - FirstOffsetNumber] = i;
> >
> >     if (randomization)
> >         /* randomize offsets array */
> >
> >     for (i = 0; i < maxoff; i++)
> >     {
> >         offset = offsets[i];
> >         ...
> >     }
> >
> > That's just an idea; if you think it's more readable as-is (or if I am
> > misunderstanding) then let me know.
>
> Actually, the current implementation comes from the idea of creating
> possibly less overhead when randomization is off. I'll try to measure
> overhead in the worst case. If it is low enough then your proposal looks
> reasonable to me.

Were you able to do these measurements? If not, I'll defer to your and Jeff's judgement on what's the best next step here. Jeff, do you think we need more review of this patch?

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
Alvaro Herrera writes: > Boszormenyi Zoltan escribió: >> this is the latest one, fixing a bug in the accounting >> of per-statement lock timeout handling and tweaking >> some comments. > Tom, are you able to give this patch some more time on this commitfest? I'm still hoping to get to it, but I've been spending a lot of time on bug fixing rather than patch review lately :-(. If you're hoping to close out the current CF soon, maybe we should just slip it to the next one. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
Boszormenyi Zoltan escribió: > Hi, > > this is the latest one, fixing a bug in the accounting > of per-statement lock timeout handling and tweaking > some comments. Tom, are you able to give this patch some more time on this commitfest? (If not, I think it would be fair to boot it to CF3; this is final in a series, there's nothing that depends on it, and there's been good movement on it; there's plenty of time before the devel cycle closes.) -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility
Simon,

> It's hard to work out how to reply to this because it's just so off
> base. I don't agree with the restrictions you think you see at all,
> saying it politely rather than giving a one word answer.

You have inside knowledge of Hannu's design. I am merely going from his description *on this list*, because that's all I have to go on. He requested comments, so here I am, commenting. I'm *hoping* that it's merely the description which is poor and not the conception of the feature. *As Hannu described the feature* it sounds useless and obscure, and miles away from powering any kind of general queueing mechanism. Or anything we discussed at the clustering meetings.

And, again, if you didn't want comments, you shouldn't have posted an RFC.

> All we're discussing is moving a successful piece of software into
> core, which has been discussed for years at the international
> technical meetings we've both been present at. I think an open
> viewpoint on the feasibility of that would be reasonable, especially
> when it comes from one of the original designers.

When I ask you for technical clarification or bring up potential problems with a 2Q feature, you consistently treat it as a personal attack and are emotionally defensive instead of answering my technical questions. This, in turn, frustrates the heck out of me (and others) because we can't get the technical questions answered. I don't want you to justify yourself, I want a clear technical spec.

I'm asking these questions because I'm excited about ReplicationII, and I want it to be the best feature it can possibly be. Or, as we tell many new contributors, "We wouldn't bring up potential problems and ask lots of questions if we weren't interested in the feature."
Now, on to the technical questions:

>> QUEUE emphasizes the aspect of a logged-only table that it accepts
>> "records" in a certain order, persists these, and then guarantees
>> that they can be read out in exactly the same order - all this being
>> guaranteed by existing WAL mechanisms.
>>
>> It is not meant to be a full implementation of an application level
>> queuing system, though, but just the capture, persisting and
>> distribution parts.
>>
>> Using this as an "application level queue" needs a set of interface
>> functions to extract the events and also to keep track of the processed
>> events. As there is no general consensus what these should be (like
>> whether processing the same event twice is allowed) this part is left
>> for specific queue consumer implementations.

While implementations vary, I think you'll find that the set of operations required for a full-featured application queue is remarkably similar across projects. Personally, I've worked with celery, Redis, AMQ, and RabbitMQ, as well as a custom solution on top of pgQ. The design, as you've described it, makes several of these requirements unreasonably convoluted to implement.

It sounds to me like the needs of internal queueing and application queueing may be hopelessly divergent. That was always possible, and maybe the answer is to forget about application queueing and focus on making this mechanism work for replication and for matviews, the two features we *know* we want it for. Which don't need the application queueing features I described AFAIK.

> The two halves of the queue are the TAIL/entry point and the HEAD/exit
> point. As you point out these could be on different servers,
> wherever the logical changes flow to, but could also be on the same
> server. When the head and tail are on the same server, the MESSAGE
> QUEUE syntax seems appropriate, but I agree that calling it that when
> it's just a head or just a tail seems slightly misleading.
Yeah, that's why I was asking for clarification; the way Hannu described it, it sounded like it *couldn't* be read on the insert node, but only on a replica. > We do, I think, want a full queue implementation in core. We also want > to allow other queue implementations to interface with Postgres, so we > probably want to allow "first half" only as well. Meaning we want both > head and tail separately in core code. The question is whether we > require both head and tail in core before we allow commit, to which I > would say I think adding the tail first is OK, and adding the head > later when we know exactly the design. I'm just pointing out that some of the requirements of the design for the replication queue may conflict with a design for a full-featured application queue. I don't quite follow you on what you mean by "head" vs. "tail". Explain? > Having said that, the LOGGING ONLY syntax makes me shiver. Better name? I suck at names. Sorry. > I should also add that this is a switchable sync/asynchronous > transactional queue, whereas LISTEN/NOTIFY is a synchronous > transactional queue. Thanks for explaining. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Deprecations in authentication
On 10/18/2012 04:43 AM, Simon Riggs wrote: On 18 October 2012 12:20, Magnus Hagander wrote: Since Simon stirred up a hornets' nest suggesting deprecation of a number of features, I figured I'd take it one step further and suggest removal of some previously deprecated features :) I'm laughing at the analogy that angry and unintelligent agents responded to my proposals, but there was no stirring action from me. I believe the stirring occurred when you dropped the idea in the proverbial bucket. It is not possible to drop even the tiniest pebble into any ideology of our community without some plague-causing flying insects swarming, just in case. You and I included. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579
Re: [HACKERS] Deprecating RULES
On Thu, Oct 18, 2012 at 6:46 AM, Andrew Dunstan wrote: > > On 10/17/2012 07:25 PM, Tom Lane wrote: > >> >> I'm fairly annoyed by the entire tenor of this conversation, because >> the people who are hollering the loudest seem to be people who have >> never actually touched any of the rules code, but nonetheless seem >> prepared to tell those of us who have what to spend our time on. > > > +1 > > I too have been quite annoyed. Sorry that I'm an offender. I also did not like the way the conversation was going for some time; for me, I felt like I didn't understand a lot of the terse rejections that materialized immediately on behalf of users that I personally cannot identify, and I felt those rejections weren't couched in neutral language that encouraged clarification, either. I'm glad things have moved beyond that. > The biggest pain people have mentioned is that they don't work with COPY. I > am in fact about to start working on a project which will probably alleviate > that pain point. I'm not going to say much more, and I would not have said > anything right now except that there is this sudden rush to deprecate rules, > or announce a future removal of the feature. However, I hope to have a > proposal to put to the community by about the end of November. I have encountered this as a papercut. Here's another use case from my history with RULES that didn't pan out so well: In my recollection, one way to use rules is to retarget operations that happen against a view and move them to a table, and as I recall, making this work as one expected required a very wordy RULE (for UPDATEs) with a litany of (fairly simple) equality and not-null conditions (to not under-constrain the UPDATE). This became a maintenance headache whenever attributes were added to the underlying relation. 
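A minimal sketch of the pattern just described (table, view, and column names here are invented purely for illustration):

```sql
-- Hypothetical schema.
CREATE TABLE accounts (id int PRIMARY KEY, owner text, balance numeric);
CREATE VIEW accounts_v AS SELECT id, owner, balance FROM accounts;

-- Retarget UPDATEs against the view onto the underlying table.  Every
-- column must be mentioned explicitly in SET and qualified in WHERE to
-- keep the UPDATE from being under-constrained, so each new attribute
-- on "accounts" means revisiting this rule by hand.
CREATE RULE accounts_v_upd AS
    ON UPDATE TO accounts_v
    DO INSTEAD
    UPDATE accounts
       SET owner = NEW.owner,
           balance = NEW.balance
     WHERE accounts.id = OLD.id
       AND accounts.id IS NOT NULL;
```

This is only a sketch of the maintenance burden fdr describes, not his actual schema.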
It was also quite complex, as I recall, when one wanted to maintain an interface but normalize the underlying table and split writes into two or more places. It has been quite some time; does that sound like a correct rendering of the problem? -- fdr
Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Tue, Oct 16, 2012 at 9:31 PM, Heikki Linnakangas wrote: > On 15.10.2012 19:31, Fujii Masao wrote: >> >> On Mon, Oct 15, 2012 at 11:27 PM, Heikki Linnakangas >> wrote: >>> >>> On 15.10.2012 13:13, Heikki Linnakangas wrote: Oh, I didn't remember that we've documented the specific structs that we pass around. It's quite bogus anyway to explain the messages the way we do currently, as they are actually dependent on the underlying architecture's endianess and padding. I think we should refactor the protocol to not transmit raw structs, but use pq_sentint and friends to construct the messages. This was discussed earlier (see http://archives.postgresql.org/message-id/4fe2279c.2070...@enterprisedb.com), I think there's consensus that 9.3 would be a good time to do that as we changed the XLogRecPtr format anyway. >>> >>> >>> >>> This is what I came up with. The replication protocol is now >>> architecture-independent. The WAL format itself is still >>> architecture-independent, of course, but this is useful if you want to >>> e.g >>> use pg_receivexlog to back up a server that runs on a different platform. >>> >>> I chose the int64 format to transmit timestamps, even when compiled with >>> --disable-integer-datetimes. >>> >>> Please review if you have the time.. >> >> >> Thanks for the patch! >> >> When I ran pg_receivexlog, I encountered the following error. > > > Yeah, clearly I didn't test this near enough... > > I fixed the bugs you bumped into, new version attached. Thanks for updating the patch! We should remove the check of integer_datetime by pg_basebackup background process and pg_receivexlog? Currently, they always check it, and then if its setting value is not the same between a client and server, they fail. Thanks to the patch, ISTM this check is no longer required. + pq_sendint64(&reply_message, GetCurrentIntegerTimestamp()); In XLogWalRcvSendReply() and XLogWalRcvSendHSFeedback(), GetCurrentTimestamp() is called twice. 
I think that we can skip the latter call if integer-datetime is enabled because the return values of GetCurrentTimestamp() and GetCurrentIntegerTimestamp() are in the same format. It's worth reducing the number of GetCurrentTimestamp() calls, I think. elog(DEBUG2, "sending write %X/%X flush %X/%X apply %X/%X", -(uint32) (reply_message.write >> 32), (uint32) reply_message.write, -(uint32) (reply_message.flush >> 32), (uint32) reply_message.flush, -(uint32) (reply_message.apply >> 32), (uint32) reply_message.apply); +(uint32) (writePtr >> 32), (uint32) writePtr, +(uint32) (flushPtr >> 32), (uint32) flushPtr, +(uint32) (applyPtr >> 32), (uint32) applyPtr); elog(DEBUG2, "write %X/%X flush %X/%X apply %X/%X", -(uint32) (reply.write >> 32), (uint32) reply.write, -(uint32) (reply.flush >> 32), (uint32) reply.flush, -(uint32) (reply.apply >> 32), (uint32) reply.apply); +(uint32) (writePtr >> 32), (uint32) writePtr, +(uint32) (flushPtr >> 32), (uint32) flushPtr, +(uint32) (applyPtr >> 32), (uint32) applyPtr); Isn't it worth logging not only the WAL location but also the replyRequested flag in these debug messages? The rest of the patch looks good to me. >> + hdrlen = sizeof(int64) + sizeof(int64) + >> sizeof(int64); >> + hdrlen = sizeof(int64) + sizeof(int64) + >> sizeof(char); >> >> These should be macros, to avoid calculation overhead? > > > The compiler will calculate this at compilation time, it's going to be a > constant at runtime. Yes, you're right. Regards, -- Fujii Masao
Re: [HACKERS] Bug in -c CLI option of pg_dump/pg_restore
On Thu, 2012-10-18 at 12:19 -0300, Alvaro Herrera wrote: > Robert Haas escribió: > > On Tue, Oct 16, 2012 at 10:31 AM, Guillaume Lelarge > > wrote: > > > Any comments on this? > > > > I'm not sure I'd want to back-patch this, since it is a behavior > > change, but I do think it's probably a good idea to change it for 9.3. > > Hm, but the bug is said to happen only in 9.2, so if we don't backpatch > we would leave 9.2 alone exhibiting this behavior. > Yeah, Alvaro got it right. The behaviour changed in 9.2. This patch needs to be applied on 9.2 and master, nothing else. If the patch is good enough though... -- Guillaume http://blog.guillaume.lelarge.info http://www.dalibo.com
Re: [HACKERS] Deprecating RULES
On 10/17/2012 04:25 PM, Tom Lane wrote: ...Now having said that, I would definitely like to see rules in their current form go away eventually. But not without a substitute. Triggers are not a complete replacement, and no amount of wishful thinking makes them so. ... Perhaps it would be more profitable to try to identify the pain points that make people so eager to get rid of rules, and then see if we could alleviate them. Alternately/additionally identify the deficiencies in triggers that drive users to prefer rules. For example, a common need is to update a log table whenever updates are made to a main table. Using rules to accomplish this is very easy to understand and write, even for most beginners. (Understanding it properly, including its limitations and dangers, is another issue, of course.) It is also easy to maintain. If you drop the table, the rule is cleaned up as well. With triggers you need to select from a variety of available languages, write a function in that language and write a trigger that calls that function. Dropping the table will remove the trigger, but the user must remember to delete the function as well, if desired. Nothing insurmountable but inconvenient compared to the use of a rule. Per the documentation "PostgreSQL only allows the execution of a user-defined function for the triggered action. The standard allows the execution of a number of other SQL commands..." There may be valid reasons why implementing that part of the SQL standard in PostgreSQL is difficult or unwise but removing that limitation on triggers would eliminate one annoyance that pushes users toward rules. Cheers, Steve
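The log-table comparison above can be made concrete with a short sketch (schema and names invented for illustration; the rule version carries the well-known multiple-evaluation caveats):

```sql
-- Hypothetical schema.
CREATE TABLE main (id int PRIMARY KEY, val text);
CREATE TABLE main_log (id int, val text, logged_at timestamptz);

-- Rule version: one statement, and it vanishes with DROP TABLE main.
CREATE RULE main_log_upd AS
    ON UPDATE TO main
    DO ALSO
    INSERT INTO main_log VALUES (OLD.id, OLD.val, now());

-- Trigger version: requires a function in some PL, which outlives the
-- table unless dropped separately.
CREATE FUNCTION main_log_fn() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO main_log VALUES (OLD.id, OLD.val, now());
    RETURN NEW;
END;
$$;
CREATE TRIGGER main_log_trg
    BEFORE UPDATE ON main
    FOR EACH ROW EXECUTE PROCEDURE main_log_fn();
```

Neither version is from the thread; it just illustrates the beginner-ergonomics gap Steve describes.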
Re: [HACKERS] xlog filename formatting functions in recovery
On Thu, Oct 18, 2012 at 9:22 AM, Alvaro Herrera wrote: > Daniel, I assume you are submitting an updated version based on the > feedback that has been provided. I will mark this patch returned with > feedback in the current CF; please submit the next version to CF3. Thank you for reminding me. So the approach that seems basically reasonable is to add an overload for these functions to accept a timeline number, if I read the feedback correctly? -- fdr
Re: [HACKERS] Deprecations in authentication
Robert Haas writes: > On Thu, Oct 18, 2012 at 7:20 AM, Magnus Hagander wrote: >> 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old >> syntax deprecated but still mapping to the new one. Has it been there >> long enough that we should start throwing an error for ident on unix? > Definitely not. I see no reason to change that, well, really ever. > But certainly not after just two releases. It seems to me like a > useful convenience that does no real harm. I think the argument that it causes user confusion is a fairly strong one, though. regards, tom lane
[HACKERS] Very minor feature suggestion
It might be nice for psql to have a 'htmlcaption' boolean pset option that would wrap the provided title/caption, if any, in a caption tag in the HTML report output, when using html format. Motivation: When I use: \pset title 'Some title' or \C 'Some title' psql emits the text: Title is "Some title". even when using html format. This seems more like a diagnostic/annotation (like 'Title is unset', which is what you get from a plain "\pset title") than report output. For casual use, even "\echo Some title" is more pleasant, except for post-processing by scripts, which people are no doubt doing. When using html format, it would arguably be better for the title to be wrapped in a caption tag inside the table itself. You couldn't change the default behavior, but the html caption would be a nice option. Regards, Kevin Murphy
Re: [HACKERS] xlog filename formatting functions in recovery
Daniel, I assume you are submitting an updated version based on the feedback that has been provided. I will mark this patch returned with feedback in the current CF; please submit the next version to CF3. Thanks to the (rather numerous!) reviewers. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Re: [PATCH 3/8] Add support for a generic wal reading facility dubbed XLogReader
This patch doesn't seem to be going anywhere, sadly. Since we're a bit late in the commitfest and this patch hasn't seen any activity for a long time, I'll mark it as returned-with-feedback. I hope one or both versions are resubmitted (with additional fixes?) for the next commitfest, and that the discussion continues to determine which of the two approaches is the best. Thanks. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] tuplesort memory usage: grow_memtuples
On 16 October 2012 21:47, Peter Geoghegan wrote: > The same basic strategy for sizing the tuplesort memtuples array > also exists in tuplestore. I wonder if we should repeat this there? I > suppose that that could follow later. Incidentally, the basis of this remark is commit 2689abf0, where Tom decided to keep the two in sync. That's a precedent for what we need to do here, I suppose. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Deprecations in authentication
On Thu, Oct 18, 2012 at 7:20 AM, Magnus Hagander wrote: > Since Simon stirred up a hornets' nest suggesting deprecation of a > number of features, I figured I'd take it one step further and suggest > removal of some previously deprecated features :) > > In particular, we made a couple of changes over several releases back > in the authentication config, that we should perhaps consider > finishing by removing the old stuff now? > > 1. krb5 authentication. We've had gssapi since 8.3 (which means in all > supported versions). krb5 has been deprecated, also since 8.3. Time to > remove it? That seems like a sufficiently long deprecation window, but is gssapi a full substitute for krb5? I don't really have a strong opinion on this, not being a user myself. > 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old > syntax deprecated but still mapping to the new one. Has it been there > long enough that we should start throwing an error for ident on unix? Definitely not. I see no reason to change that, well, really ever. But certainly not after just two releases. It seems to me like a useful convenience that does no real harm. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
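For readers following along, the rename under discussion amounts to a one-word change on a "local" line in pg_hba.conf (the lines below are illustrative, not from any server in the thread):

```
# pg_hba.conf fragment -- since 9.1 the local-socket method is spelled
# "peer"; the old spelling "ident" on a "local" line still maps to it.
# TYPE  DATABASE  USER  METHOD
local   all       all   peer     # preferred spelling (9.1+)
# local all       all   ident    # deprecated spelling, same behavior
```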
Re: [HACKERS] PATCH: pgbench - aggregation of info written into log
Tomas Vondra wrote: > Also, I've realized the last interval may not be logged at all - I'll > take a look into this in the next version of the patch. I didn't see any later version of this patch posted anywhere. I guess it'll have to wait until the next commitfest. Please fix the remaining issues and resubmit. Thanks to Robert Haas, Jeff Janes and Pavel Stehule for the reviews. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction
Tomas Vondra wrote: > Hi, > > thanks for the review. I'll look into that in ~2 weeks, once the > pgconf.eu > is over. Excellent. Please submit the updated version to the upcoming commitfest when you have it. I'm marking this patch Returned with Feedback. Many thanks to Shigeru Hanada for the review and benchmark. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] tuplesort memory usage: grow_memtuples
Greg Stark escribió: > On Tue, Oct 16, 2012 at 9:47 PM, Peter Geoghegan > wrote: > > The patch will now been marked "ready for committer". Does this need > > doc changes, in light of what is arguably a behavioural difference? > > You only mentioned release notes. > > I'm happy to look at this one, probably next week at pgconf.eu. It seems like > a > reasonable size patch to get back into things. > > That's assuming my committer bits haven't lapsed and people are ok > with me stepping back into things? Feel free. Your committer bits should still work. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Global Sequences
On 18 October 2012 16:15, Tom Lane wrote: > But I'd want to see a pretty > bulletproof argument why overriding *only* nextval is sufficient > (and always will be) before accepting a hook for just nextval. If we > build an equivalent amount of functionality piecemeal it's going to > be a lot uglier than if we recognize we need this type of concept > up front. We discussed that we need only nextval() and setval() elsewhere, but adding others is fairly cheap so we can chew on that when we have a patch to discuss. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Global Sequences
On 18 October 2012 16:08, Robert Haas wrote: > To make > it even better, add some generic options that can be passed through to > the underlying handler. Agreed. >> Or maybe better, invent a level of indirection like a "sequence access >> method" (comparable to index access methods) that provides a compatible >> set of substitute functions for sequence operations. If you want to >> override nextval() for a sequence, don't you likely also need to >> override setval(), currval(), etc? Not to mention overriding ALTER >> SEQUENCE's behavior. > > This might be better, but it's also possibly more mechanism than we > truly need here. But then again, if we're going to end up with more > than a handful of handlers, we probably do want to do this. Let me have a play and see what comes out simplest. Somewhere in the middle seems about right. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Bug in -c CLI option of pg_dump/pg_restore
Robert Haas escribió: > On Tue, Oct 16, 2012 at 10:31 AM, Guillaume Lelarge > wrote: > > Any comments on this? > > I'm not sure I'd want to back-patch this, since it is a behavior > change, but I do think it's probably a good idea to change it for 9.3. Hm, but the bug is said to happen only in 9.2, so if we don't backpatch we would leave 9.2 alone exhibiting this behavior. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] First draft of snapshot snapshot building design document
On Thursday, October 18, 2012 04:47:12 PM Robert Haas wrote: > On Tue, Oct 16, 2012 at 7:30 AM, Andres Freund wrote: > > On Thursday, October 11, 2012 01:02:26 AM Peter Geoghegan wrote: > >> The design document [2] really just explains the problem (which is the > >> need for catalog metadata at a point in time to make sense of heap > >> tuples), without describing the solution that this patch offers with > >> any degree of detail. Rather, [2] says "How we build snapshots is > >> somewhat intricate and complicated and seems to be out of scope for > >> this document", which is unsatisfactory. I look forward to reading the > >> promised document that describes this mechanism in more detail. > > > > Here's the first version of the promised document. I hope it answers most > > of the questions. > > > > Input welcome! > > I haven't grokked all of this in its entirety, but I'm kind of > uncomfortable with the relfilenode -> OID mapping stuff. I'm > wondering if we should, when logical replication is enabled, find a > way to cram the table OID into the XLOG record. It seems like that > would simplify things. > > If we don't choose to do that, it's worth noting that you actually > need 16 bytes of data to generate a unique identifier for a relation, > as in database OID + tablespace OID + relfilenode# + backend ID. > Backend ID might be ignorable because WAL-based logical replication is > going to ignore temporary relations anyway, but you definitely need > the other two. ... Hm. I should take a look at the way temporary tables are represented. As you say, it is not going to matter for WAL decoding, but still... > Another thing to think about is that, like catalog snapshots, > relfilenode mappings have to be time-relativized; that is, you need to > know what the mapping was at the proper point in the WAL sequence, not > what it is now. In practice, the risk here seems to be minimal, > because it takes a while to churn through 4 billion OIDs. 
However, I > suspect it pays to think about this fairly carefully because if we do > ever run into a situation where the OID counter wraps during a time > period comparable to the replication lag, the bugs will be extremely > difficult to debug. I think with rollbacks + restarts we might even be able to see the same relfilenode earlier. > Anyhow, adding the table OID to the WAL header would chew up a few > more bytes of WAL space, but it seems like it might be worth it to > avoid having to think very hard about all of these issues. I don't think it's necessary to change wal logging here. The relfilenode mapping is now looked up using the timetravel snapshot we've built, using (spcNode, relNode) as the key, so the time-relativized lookup is "builtin". If we screw that up, way much more is broken anyway. Two problems are left: 1) (reltablespace, relfilenode) is not unique in pg_class because InvalidOid is stored for relfilenode if it's a shared or nailed table. That's not a problem for the lookup because we've already checked the relmapper before that, so we never look those up anyway. But it violates documented requirements of syscache.c. Even after some looking I haven't found any problem that could cause. 2) We need to decide whether a HEAP[1-2]_* record did catalog changes when building/updating snapshots. Unfortunately we also need to do this *before* we've built the first snapshot. For now treating all tables as catalog modifying before we built the snapshot seems to work fine. I think encoding the oid in the xlog header wouldn't help all that much here, because I am pretty sure we want to have the set of "catalog tables" be extensible at some point... Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
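A sketch of the catalog side of the lookup being discussed, at current time only (the point of the thread is that the decoder must run the equivalent under a timetravel snapshot; the filenode value below is invented):

```sql
-- Map a (tablespace, filenode) pair back to a table OID.  Per the
-- caveat above, relfilenode is 0 for mapped (shared/nailed) relations,
-- which must be resolved via the relmapper first -- such rows can
-- never match an on-disk filenode here.
SELECT oid, relname
  FROM pg_class
 WHERE reltablespace = 0          -- 0 means the database's default tablespace
   AND relfilenode   = 16384;     -- example filenode, invented
```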
Re: [HACKERS] hash_search and out of memory
I wrote: > Hitoshi Harada writes: >> If OOM happens during expand_table() in hash_search_with_hash_value() >> for RelationCacheInsert, > What OOM? expand_table is supposed to return without doing anything > if it can't expand the table. If that's not happening, that's a bug > in the hash code. Oh, wait, I take that back --- the palloc-based allocator does throw errors. I think that when that was designed, we were thinking that palloc-based hash tables would be thrown away anyway after an error, but of course that's not true for long-lived tables such as the relcache hash table. I'm not terribly comfortable with trying to use a PG_TRY block to catch an OOM error - there are too many ways that could break, and this code path is by definition not very testable. I think moving up the expand_table action is probably the best bet. Will you submit a patch? regards, tom lane
Re: [HACKERS] Review for pg_dump: Sort overloaded functions in deterministic order
Joel Jacobson wrote: > Hi Joachim, > > Attached, please find new patch. Test unchanged. This was committed, as discussed in the original patch's thread. It would be great if reviewers could reply to the email that submits the patch, instead of creating a thread of their own. It helps keep things better organized. Thanks for the review. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [PATCH] pg_dump: Sort overloaded functions in deterministic order
Joel Jacobson wrote: > On Wed, Oct 17, 2012 at 11:43 PM, Alvaro Herrera > wrote: > > Uh, the patch you posted keeps the pg_get_function_identity_arguments > > call in dumpFunc, but there is now also a new one in getFuncs. Do we > > need to remove the second one? > > It could be done, but unfortunately we cannot use the value computed > in dumpFunc(), > because getFuncs() is called before dumpFunc(). Right, I got that from the discussion. > What could be done is to keep the changes in getFuncs(), and also > change dumpFunc() > to use the value computed in getFuncs(), but I think the gain is small > in relation > to the complexity of changing dumpFunc(), as we would still need to > make the two other > function calls in the SQL query in dumpFunc() to pg_get_function_arguments() > and > pg_get_function_result(). Changing pg_dump is complex enough whatever the change, yes. I have not touched this. > > Here's an updated patch for your consideration. I was about to push > > this when I noticed the above. The only change here is that the extra > > code that tests for new remoteVersions in the second "else if" branch of > > getFuncs and getAggregates has been removed, since it cannot ever be > > reached. > > Looks really good. Thanks, pushed it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)
On 18 October 2012 16:18, Christopher Browne wrote: > A "shim" adds complexity, but retains the "upgrade across versions" > use case, and reduces the need to keep supporting elder versions of > Slony. Right. Upgrading across major versions is likely to continue to remain a very important use-case for Slony. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Wed, Oct 17, 2012 at 8:46 PM, Amit Kapila wrote: >> On Monday, October 15, 2012 3:43 PM Heikki Linnakangas wrote: >> On 13.10.2012 19:35, Fujii Masao wrote: >> > On Thu, Oct 11, 2012 at 11:52 PM, Heikki Linnakangas >> > wrote: >> >> Ok, thanks. Committed. >> > >> > I found one typo. The attached patch fixes that typo. >> >> Thanks, fixed. >> >> > ISTM you need to update the protocol.sgml because you added >> > the field 'replyRequested' to WalSndrMessage and StandbyReplyMessage. > > >> >> > Is it worth adding the same mechanism (send back the reply immediately >> > if walsender request a reply) into pg_basebackup and pg_receivexlog? >> >> Good catch. Yes, they should be taught about this too. I'll look into >> doing that too. > > If you have not started and you don't have objection, I can pickup this to > complete it. > > For both (pg_basebackup and pg_receivexlog), we need to get a timeout > parameter from user in command line, as > there is no conf file here. New Option can be -t (parameter name can be > recvtimeout). > > The main changes will be in function ReceiveXlogStream(), it is a common > function for both > Pg_basebackup and pg_receivexlog. Handling will be done in same way as we > have done in walreceiver. > > Suggestions/Comments? Before implementing the timeout parameter, I think that it's better to change both pg_basebackup background process and pg_receivexlog so that they send back the reply message immediately when they receive the keepalive message requesting the reply. Currently, they always ignore such keepalive message, so status interval parameter (-s) in them always must be set to the value less than replication timeout. We can avoid this troublesome parameter setting by introducing the same logic of walreceiver into both pg_basebackup background process and pg_receivexlog. 
Regards, -- Fujii Masao
Re: [HACKERS] [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)
On Thu, Oct 18, 2012 at 9:49 AM, Peter Geoghegan wrote:
> On 16 October 2012 15:26, Jan Wieck wrote:
>> This means that the transition time from the existing, trigger based
>> approach to the new WAL based mechanism will see both technologies in
>> parallel, which is no small thing to support.
>
> So, you're talking about a shim between the two in order to usefully
> support inter-version replication, or are you just thinking about
> making a clean break in compatibility for Postgres versions prior to
> 9.3 in a new release branch?

It's early to assume either.

In Slony 2.0, we accepted that we were breaking compatibility with versions of Postgres before 8.3; we accepted that because there were considerable 'manageability' benefits (e.g. - system catalogues no longer hacked up, so pg_dump works against all nodes, and some dramatically reduced locking). But that had the attendant cost that we have had to continue fixing bugs on 1.2, to a degree, even until now, because people on Postgres versions earlier than 8.3 have no way to use version 2.0.

Those merits and demerits apply pretty clearly to this. It would be somewhat attractive for a "version 2.3" (or, more likely, "3.0", to indicate the break from earlier versions) to make the clean break to the new-in-PG-9.3 facilities. It is attractive in that we could:

a) Safely remove the trigger-based log capture apparatus (or, at least, I'm assuming so), and

b) Consciously upgrade to take advantage of all the latest cool stuff found in Postgres 9.3. (I haven't got any particular features in mind; perhaps we add RANGE comparators for xid to 9.3, and make extensive use of xid_range types? That would be something that couldn't reasonably get hacked to work in anything before 9.2...)

c) Drop out any special cases having to do with support of versions 8.3, 8.4, 9.0, 9.1, and 9.2.
But, of course, we'd be leaving everyone running 8.3 thru 9.2 behind, if we did so, and would correspondingly shackle ourselves to supporting the 2.x branches for still longer. And this would mean that this Slony "3.0" would expressly NOT support one of our intended use cases, namely to support upgrading from elder versions of Postgres. A "shim" adds complexity, but retains the "upgrade across versions" use case, and reduces the need to keep supporting elder versions of Slony. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
Re: [HACKERS] Global Sequences
Robert Haas writes: > On Tue, Oct 16, 2012 at 1:29 PM, Tom Lane wrote: >> Or maybe better, invent a level of indirection like a "sequence access >> method" (comparable to index access methods) that provides a compatible >> set of substitute functions for sequence operations. If you want to >> override nextval() for a sequence, don't you likely also need to >> override setval(), currval(), etc? Not to mention overriding ALTER >> SEQUENCE's behavior. > This might be better, but it's also possibly more mechanism than we > truly need here. But then again, if we're going to end up with more > than a handful of handlers, we probably do want to do this. It's definitely a lot of mechanism, and if we can get away with something simpler that's fine with me. But I'd want to see a pretty bulletproof argument why overriding *only* nextval is sufficient (and always will be) before accepting a hook for just nextval. If we build an equivalent amount of functionality piecemeal it's going to be a lot uglier than if we recognize we need this type of concept up front. regards, tom lane
Re: [HACKERS] Bug in -c CLI option of pg_dump/pg_restore
On Tue, Oct 16, 2012 at 10:31 AM, Guillaume Lelarge wrote: > Any comments on this? I'm not sure I'd want to back-patch this, since it is a behavior change, but I do think it's probably a good idea to change it for 9.3. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Global Sequences
On Tue, Oct 16, 2012 at 1:29 PM, Tom Lane wrote: > In particular, the reason proposing a hook first seems backwards is that > if we have a catalog-level representation that some sequences are local > and others not, we should be using that to drive the determination of > whether to call a substitute function --- and maybe which one to call. > For instance, I could see attaching a function OID to each sequence > and then having nextval() call that function, instead of a hook per se. Yeah, I like that. That makes it easy to configure your database so that some sequences have special behavior (which the database designer can set up however they like) and others can be just vanilla, and the plugin doesn't have to try to figure out which ones are which (which was my first concern in reading Simon's original proposal). To make it even better, add some generic options that can be passed through to the underlying handler. So something like: ALTER SEQUENCE wump SET HANDLER (nextval my_magical_nextval, setval my_magical_setval) OPTIONS (any_label_you_want_the_handlers_to_get 'some_text_associated_with_the_label', another_label 'some_more_text'); That way you could say, for example, that sequence wump should get its values from coordinator node 172.24.16.93 and that the global identifier for this sequence is UUID e15ea6e6-43d5-4f65-8efd-cf28a14a2d70. That way you can avoid having to make any assumptions about how local sequence names on particular nodes are mapped onto global names. > Or maybe better, invent a level of indirection like a "sequence access > method" (comparable to index access methods) that provides a compatible > set of substitute functions for sequence operations. If you want to > override nextval() for a sequence, don't you likely also need to > override setval(), currval(), etc? Not to mention overriding ALTER > SEQUENCE's behavior. This might be better, but it's also possibly more mechanism than we truly need here. 
But then again, if we're going to end up with more than a handful of handlers, we probably do want to do this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Deprecations in authentication
On 18 October 2012 12:43, Simon Riggs wrote: > On 18 October 2012 12:20, Magnus Hagander wrote: > >> Since Simon stirred up a hornets nest suggesting deprecation of a >> number of features, I figured I'd take it one step further and suggest >> removal of some previously deprecated features :) > > I'm laughing at the analogy that angry and unintelligent agents > responded to my proposals, but there was no stirring action from me. Hmm, this looks like a stirring action in itself, so I withdraw and apologise. You are right that some people are angry and so IMHO it was wrong of me to try to joke about that. My point was only that I had acted in good faith, rather than to deliberately cause annoyance. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] First draft of snapshot snapshot building design document
On Tue, Oct 16, 2012 at 7:30 AM, Andres Freund wrote: > On Thursday, October 11, 2012 01:02:26 AM Peter Geoghegan wrote: >> The design document [2] really just explains the problem (which is the >> need for catalog metadata at a point in time to make sense of heap >> tuples), without describing the solution that this patch offers with >> any degree of detail. Rather, [2] says "How we build snapshots is >> somewhat intricate and complicated and seems to be out of scope for >> this document", which is unsatisfactory. I look forward to reading the >> promised document that describes this mechanism in more detail. > > Here's the first version of the promised document. I hope it answers most of > the questions. > > Input welcome! I haven't grokked all of this in its entirety, but I'm kind of uncomfortable with the relfilenode -> OID mapping stuff. I'm wondering if we should, when logical replication is enabled, find a way to cram the table OID into the XLOG record. It seems like that would simplify things. If we don't choose to do that, it's worth noting that you actually need 16 bytes of data to generate a unique identifier for a relation, as in database OID + tablespace OID + relfilenode# + backend ID. Backend ID might be ignorable because WAL-based logical replication is going to ignore temporary relations anyway, but you definitely need the other two. There's nothing, for example, to keep you from having two relations with the same value in pg_class.relfilenode in the same database but in different tablespaces. It's unlikely to happen, because for new relations we set OID = relfilenode, but a subsequent rewrite can bring it about if the stars align just right. (Such situations are, of course, a breeding ground for bugs, which might make you question whether our current scheme for assigning relfilenodes has much of anything to recommend it.) 
Another thing to think about is that, like catalog snapshots, relfilenode mappings have to be time-relativized; that is, you need to know what the mapping was at the proper point in the WAL sequence, not what it is now. In practice, the risk here seems to be minimal, because it takes a while to churn through 4 billion OIDs. However, I suspect it pays to think about this fairly carefully because if we do ever run into a situation where the OID counter wraps during a time period comparable to the replication lag, the bugs will be extremely difficult to debug. Anyhow, adding the table OID to the WAL header would chew up a few more bytes of WAL space, but it seems like it might be worth it to avoid having to think very hard about all of these issues. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [BUG] False indication in pg_stat_replication.sync_state
On Thu, Oct 18, 2012 at 5:42 PM, Kyotaro HORIGUCHI wrote:
> Hello. My colleague found that pg_stat_replication.sync_state
> shows a false state under some conditions.
>
> This prevents Pacemaker from completing a fail-over that could
> safely be done.
>
> The point is in walsender.c, pg_stat_get_wal_senders() below (as
> of REL9_2_1):
>
> 1555:        if (walsnd->pid != 0)
> 1556:        {
> 1557:            /*
> 1558:             * Treat a standby such as a pg_basebackup background process
> 1559:             * which always returns an invalid flush location, as an
> 1560:             * asynchronous standby.
> 1561:             */
> !1562:           sync_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ?
> 1563:                0 : walsnd->sync_standby_priority;
>
> Here, XLogRecPtrIsInvalid(walsnd->flush) is defined as
> (walsnd->flush.xrecoff == 0), which becomes true as usual at every
> WAL 'file' (not segment) boundary. xrecoff == 0 is certainly
> invalid as the start point of a WAL *RECORD*, but should be
> considered valid in a replication stream. This check was introduced
> in 9.2.0 and in the minor-version update between 9.1.4 and 9.1.5.
>
> | DEBUG: write 4/0 flush 3/FEFFEC68 apply 3/FEFFEC68
> | DEBUG: write 4/0 flush 4/0 apply 3/FEFFEC68
> | DEBUG: HOGE: flush = 3/FEFFEC68 sync_priority[0] = 1
> | DEBUG: write 4/111C0 flush 4/0 apply 3/FEFFECC0
> !| DEBUG: HOGE: flush = 4/0 sync_priority[0] = 0
>
> This value of zero in sync_priority[0] makes sync_status 'async'
> erroneously and confuses Pacemaker.
>
> # The log lines marked with 'HOGE' above were printed by applying the
> # patch at the bottom of this message and invoking 'select
> # sync_state from pg_stat_replication' periodically. To increase
> # the chance to see the symptom, sleep 1 second at 'file'
> # boundaries :-)
>
> Heikki's recent(?) commit
> 0ab9d1c4b31622e9176472b4276f3e9831e3d6ba, which changes the format
> of XLogRecPtr from a logid:xrecoff struct to a 64-bit linear address,
> would fix the false indication.
> But I suppose this patch won't be
> applied to the existing 9.1.x and 9.2.x because of the modification
> to the streaming protocol.
>
> As far as I can see from the patch, it wouldn't change the meaning of
> XLogRecPtr to change XLogRecPtrIsInvalid from (xrecoff == 0) to
> (xrecoff == 0 && xlogid == 0). But this change affects a rather
> wide portion of the WAL-handling code, even though what is needed here
> is only to stop the false indication.
>
> On the other hand, pg_basebackup seems to return 0/0 as the flush and
> apply positions, so it seems enough to add xlogid == 0 to
> the condition. The attached patch for REL9_2_1 does this and
> yields the following result:
>
> | DEBUG: write 2/FEFFFD48 flush 2/FEFFFD48 apply 2/FEFF7AB0
> | DEBUG: write 3/0 flush 2/FEFFFD48 apply 2/FEFF7E88
> | DEBUG: write 3/0 flush 3/0 apply 2/FEFFFD48
> | DEBUG: HOGE: flush = 2/FEFFFD48 sync_priority[0] = 1
> | DEBUG: write 3/E338 flush 3/0 apply 2/FE80
> !| DEBUG: HOGE: flush = 3/0 sync_priority[0] = 1
>
> I think this patch should be applied for 9.2.2 and 9.1.7.

Looks good to me, though I don't think the source code comment needs to be updated in the way the patch does.

Regards,

--
Fujii Masao
Re: [HACKERS] September 2012 commitfest
Amit Kapila wrote: > For the Patch, Trim trailing NULL columns, I have provided the performance > data required > and completed the review. There are only a few review comments which can be > addressed. > So is it possible that I complete them and mark it as "Ready For Committer", > or what else can be the way to proceed for this patch > if the author doesn't respond? Sure, you can do that. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] hash_search and out of memory
Hitoshi Harada writes: > If OOM happens during expand_table() in hash_search_with_hash_value() > for RelationCacheInsert, What OOM? expand_table is supposed to return without doing anything if it can't expand the table. If that's not happening, that's a bug in the hash code. regards, tom lane
Re: [HACKERS] [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)
On 16 October 2012 15:26, Jan Wieck wrote: > This means that the transition time from the existing, trigger based > approach to the new WAL based mechanism will see both technologies in > parallel, which is no small thing to support. So, you're talking about a shim between the two in order to usefully support inter-version replication, or are you just thinking about making a clean break in compatibility for Postgres versions prior to 9.3 in a new release branch? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Deprecating RULES
On 10/17/2012 07:25 PM, Tom Lane wrote:
> I'm fairly annoyed by the entire tenor of this conversation, because
> the people who are hollering the loudest seem to be people who have
> never actually touched any of the rules code, but nonetheless seem
> prepared to tell those of us who have what to spend our time on.

+1

I too have been quite annoyed. Now having said that, I would definitely like to see rules in their current form go away eventually. But not without a substitute. Triggers are not a complete replacement, and no amount of wishful thinking makes them so.

> Perhaps it would be more profitable to try to identify the pain points
> that make people so eager to get rid of rules, and then see if we
> could alleviate them. One big problem I know about offhand is the
> multiple-evaluation risk, which seems at least in principle fixable.
> What others are there?

Yeah. That's by far the best approach. It has the merit of being positive rather than just taking something away that people do use, even if it's only a relatively small number of users.

The biggest pain people have mentioned is that they don't work with COPY. I am in fact about to start working on a project which will probably alleviate that pain point. I'm not going to say much more, and I would not have said anything right now except that there is this sudden rush to deprecate rules, or announce a future removal of the feature. However, I hope to have a proposal to put to the community by about the end of November.

On that point, it's also worth noting that FDWs provide a nice workaround: instead of doing a straight COPY TO mytable FROM myfile, you set up the source as a foreign table using file_fdw or my file_text_array_fdw, and then do INSERT INTO mytable SELECT ... FROM my_foreign_table; Maybe this too would be worth mentioning in the docs, maybe in the rules section with an xref from the copy section.

So, please, hold off for a little bit.
I don't mind putting warnings in the docs, but I'd really rather we waited on any announcement of a future possible deprecation, or log warnings that using rules will cause zombies to eat your brainz. cheers andrew
Re: [HACKERS] Deprecations in authentication
Simon Riggs wrote: > On 18 October 2012 12:20, Magnus Hagander wrote: > > > Since Simon stirred up a hornets nest suggesting deprecation of a > > number of features, I figured I'd take it one step further and suggest > > removal of some previously deprecated features :) > > I'm laughing at the analogy that angry and unintelligent agents > responded to my proposals, but there was no stirring action from me. We may all be stupid individually, but it's the swarm that matters. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
Andres Freund wrote:
> On Thursday, October 18, 2012 06:12:02 AM Kevin Grittner wrote:
>> I'm having trouble seeing a way to make this work without
>> rearranging the code for concurrent drop to get to a state where
>> it has set indisvalid = false, made that visible to all processes,
>> and ensured that all scans of the index are complete -- while
>> indisready is still true. That is the point where
>> TransferPredicateLocksToHeapRelation() could be safely called.
>> Then we would need to set indisready = false, make that visible to
>> all processes, and ensure that all access to the index is
>> complete. I can't see where it works to set both flags at the same
>> time.

> In a nearby bug I had to restructure the code in a way that's
> similar to this anyway, so that seems fine. Maybe you can fix the
> bug on top of the two attached patches?

Perfect; these two patches provide a spot in the code which is exactly
right for handling the predicate lock adjustments. Attached is a patch
which applies on top of the two you sent.

Thanks!

-Kevin

*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
***************
*** 1320,1325 **** index_drop(Oid indexId, bool concurrent)
--- 1320,1337 ----
   * In the concurrent case we make sure that nobody can be looking at the
   * indexes by dropping the index in multiple steps, so we don't need a full
   * fledged AccessExlusiveLock yet.
+  *
+  * All predicate locks on the index are about to be made invalid. Promote
+  * them to relation locks on the heap. For correctness the index must not
+  * be seen with indisvalid = true during query planning after the move
+  * starts, so that the index will not be used for a scan after the
+  * predicate lock move, as this could create new predicate locks on the
+  * index which would not ensure a heap relation lock. Also, the index must
+  * not be seen during execution of a heap tuple insert with indisready =
+  * false before the move is complete, since the conflict with the
+  * predicate lock on the index gap could be missed before the lock on the
+  * heap relation is in place to detect a conflict based on the heap tuple
+  * insert.
   */
  heapId = IndexGetRelation(indexId, false);
  if (concurrent)
***************
*** 1439,1444 **** index_drop(Oid indexId, bool concurrent)
--- 1451,1464 ----
  }

  /*
+  * No more predicate locks will be acquired on this index, and we're
+  * about to stop doing inserts into the index which could show
+  * conflicts with existing predicate locks, so now is the time to move
+  * them to the heap relation.
+  */
+ TransferPredicateLocksToHeapRelation(userIndexRelation);
+
+ /*
   * now we are sure that nobody uses the index for queries, they just
   * might have it opened for updating it. So now we can unset
   * ->indisready and wait till nobody could update the index anymore.
***************
*** 1507,1518 **** index_drop(Oid indexId, bool concurrent)
  	userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
  	userIndexRelation = index_open(indexId, AccessExclusiveLock);
  }
!
! /*
!  * All predicate locks on the index are about to be made invalid. Promote
!  * them to relation locks on the heap.
!  */
! TransferPredicateLocksToHeapRelation(userIndexRelation);

  /*
   * Schedule physical removal of the files
--- 1527,1534 ----
  	userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
  	userIndexRelation = index_open(indexId, AccessExclusiveLock);
  }
! else
! 	TransferPredicateLocksToHeapRelation(userIndexRelation);

  /*
   * Schedule physical removal of the files
Re: [HACKERS] Re: DROP INDEX CONCURRENTLY is not really concurrency safe & leaves around undroppable indexes
On 18 October 2012 12:56, Abhijit Menon-Sen wrote: > At 2012-09-25 01:46:18 +0200, and...@2ndquadrant.com wrote: >> >> The attached patch fixes this issue. Haven't looked at the other one >> in detail yet. > > Here are tests for both bugs. They currently fail with HEAD. > > Note that the first test now uses PREPARE instead of the SELECTs in the > original example, which no longer works after commit #96cc18 because of > the re-added AcceptInvalidationMessages calls (archaeology by Andres). Thanks, I'll apply these now. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Re: DROP INDEX CONCURRENTLY is not really concurrency safe & leaves around undroppable indexes
At 2012-09-25 01:46:18 +0200, and...@2ndquadrant.com wrote:
>
> The attached patch fixes this issue. Haven't looked at the other one
> in detail yet.

Here are tests for both bugs. They currently fail with HEAD.

Note that the first test now uses PREPARE instead of the SELECTs in the
original example, which no longer works after commit #96cc18 because of
the re-added AcceptInvalidationMessages calls (archaeology by Andres).

--
Abhijit

diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 75e33bc..d964aaf 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -14,3 +14,5 @@ test: fk-contention
 test: fk-deadlock
 test: fk-deadlock2
 test: eval-plan-qual
+test: drop-index-concurrently-1
+test: drop-index-concurrently-2
diff --git a/src/test/isolation/specs/drop-index-concurrently-1.spec b/src/test/isolation/specs/drop-index-concurrently-1.spec
new file mode 100644
index 000..83c44ab
--- /dev/null
+++ b/src/test/isolation/specs/drop-index-concurrently-1.spec
@@ -0,0 +1,33 @@
+setup
+{
+  CREATE TABLE test_dc(id serial primary key, data int);
+  INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100);
+  CREATE INDEX test_dc_data ON test_dc(data);
+}
+
+teardown
+{
+  DROP TABLE test_dc;
+}
+
+session "s1"
+step "noseq" { SET enable_seqscan = false; }
+step "prepi" { PREPARE getrow_idx AS SELECT * FROM test_dc WHERE data=34; }
+step "preps" { PREPARE getrow_seq AS SELECT * FROM test_dc WHERE data::text=34::text; }
+step "begin" { BEGIN; }
+step "explaini" { EXPLAIN (COSTS OFF) EXECUTE getrow_idx; }
+step "explains" { EXPLAIN (COSTS OFF) EXECUTE getrow_seq; }
+step "selecti" { EXECUTE getrow_idx; }
+step "selects" { EXECUTE getrow_seq; }
+step "end" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "select2" { SELECT * FROM test_dc WHERE data=34; }
+step "insert2" { INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100); }
+step "end2" { COMMIT; }
+
+session "s3"
+step "drop" { DROP INDEX CONCURRENTLY test_dc_data; }
+
+permutation "noseq" "prepi" "preps" "begin" "explaini" "explains" "select2" "drop" "insert2" "end2" "selecti" "selects" "end"
diff --git a/src/test/isolation/expected/drop-index-concurrently-1.out b/src/test/isolation/expected/drop-index-concurrently-1.out
new file mode 100644
index 000..c42ac91
--- /dev/null
+++ b/src/test/isolation/expected/drop-index-concurrently-1.out
@@ -0,0 +1,36 @@
+Parsed test spec with 3 sessions
+
+starting permutation: noseq prepi preps begin explaini explains select2 drop insert2 end2 selecti selects end
+step noseq: SET enable_seqscan = false;
+step prepi: PREPARE getrow_idx AS SELECT * FROM test_dc WHERE data=34;
+step preps: PREPARE getrow_seq AS SELECT * FROM test_dc WHERE data::text=34::text;
+step begin: BEGIN;
+step explaini: EXPLAIN (COSTS OFF) EXECUTE getrow_idx;
+QUERY PLAN
+
+Index Scan using test_dc_data on test_dc
+  Index Cond: (data = 34)
+step explains: EXPLAIN (COSTS OFF) EXECUTE getrow_seq;
+QUERY PLAN
+
+Seq Scan on test_dc
+  Filter: ((data)::text = '34'::text)
+step select2: SELECT * FROM test_dc WHERE data=34;
+id             data
+
+34             34
+step drop: DROP INDEX CONCURRENTLY test_dc_data;
+step insert2: INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100);
+step end2: COMMIT;
+step selecti: EXECUTE getrow_idx;
+id             data
+
+34             34
+134            34
+step selects: EXECUTE getrow_seq;
+id             data
+
+34             34
+134            34
+step end: COMMIT;
+step drop: <... completed>
diff --git a/src/test/isolation/specs/drop-index-concurrently-2.spec b/src/test/isolation/specs/drop-index-concurrently-2.spec
new file mode 100644
index 000..273f735
--- /dev/null
+++ b/src/test/isolation/specs/drop-index-concurrently-2.spec
@@ -0,0 +1,21 @@
+setup
+{
+  CREATE TABLE test_dc(id serial primary key, data int);
+  CREATE INDEX test_dc_data ON test_dc(data);
+}
+
+session "s1"
+setup { BEGIN; }
+step "explain" { EXPLAIN (COSTS OFF) SELECT * FROM test_dc WHERE data=34343; }
+step "rollback" { ROLLBACK; }
+step "droptab" { DROP TABLE test_dc; }
+step "selecti" { SELECT indexrelid::regclass, indisvalid, indisready FROM pg_index WHERE indexrelid = 'test_dc_data'::regclass; }
+step "dropi" { DROP INDEX test_dc_data; }
+
+session "s2"
+step "drop" { DROP INDEX CONCURRENTLY test_dc_data; }
+
+session "s3"
+step "cancel" { SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE query = 'DROP INDEX CONCURRENTLY test_dc_data;'; }
+
+permutation "explain" "drop" "cancel" "rollback" "droptab" "selecti" "dropi"
diff --git a/src/test/isolation/expected/drop-index-concurrently-2.out b/src/test/isolation/expected/drop-index-concurrently-2.out
new file mode 100644
index 000..4802777
--- /dev/null
+++ b/src/test/isolation/expected/drop-index-concurrently-2.out
@@ -0,0 +1,24 @@
+Parsed test spec
Re: [HACKERS] Deprecations in authentication
On 18 October 2012 12:20, Magnus Hagander wrote: > Since Simon stirred up a hornets nest suggesting deprecation of a > number of features, I figured I'd take it one step further and suggest > removal of some previously deprecated features :) I'm laughing at the analogy that angry and unintelligent agents responded to my proposals, but there was no stirring action from me. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Deprecations in authentication
On 18 October 2012 12:37, Magnus Hagander wrote: > On Thu, Oct 18, 2012 at 1:32 PM, Simon Riggs wrote: >> On 18 October 2012 12:20, Magnus Hagander wrote: >> >>> 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old >>> syntax deprecated but still mapping to the new one. Has it been there >>> long enough that we should start throwing an error for ident on unix? >> >> Any reason to remove? Having two names for same thing is a happy place >> for users with bad/fond memories. It costs little and no errors are >> associated with using the old name (are there?). > > The only real reason for that one would be confusion. e.g. using ident > over tcp is for most people very insecure, whereas ident over unix > sockets is very secure. there are exceptions to both those, but for > the majority of cases we are using the same name for one thing that > has very good security and one that has very bad. And confusion when > it comes to security is usually not a good thing. I'll go with that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Deprecations in authentication
On Thu, Oct 18, 2012 at 1:32 PM, Simon Riggs wrote: > On 18 October 2012 12:20, Magnus Hagander wrote: > >> 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old >> syntax deprecated but still mapping to the new one. Has it been there >> long enough that we should start throwing an error for ident on unix? > > Any reason to remove? Having two names for same thing is a happy place > for users with bad/fond memories. It costs little and no errors are > associated with using the old name (are there?). The only real reason for that one would be confusion. E.g., using ident over TCP is for most people very insecure, whereas ident over unix sockets is very secure. There are exceptions to both of those, but for the majority of cases we are using the same name for one thing that has very good security and one that has very bad. And confusion when it comes to security is usually not a good thing. The krb5 one is more about maintaining code, but there is not much cost to keeping ident-over-unix, that's true. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Deprecations in authentication
On 18 October 2012 12:20, Magnus Hagander wrote: > 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old > syntax deprecated but still mapping to the new one. Has it been there > long enough that we should start throwing an error for ident on unix? Any reason to remove? Having two names for the same thing is a happy place for users with bad/fond memories. It costs little, and no errors are associated with using the old name (are there?). -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Deprecations in authentication
Since Simon stirred up a hornet's nest suggesting deprecation of a number of features, I figured I'd take it one step further and suggest removal of some previously deprecated features :) In particular, we made a couple of changes over several releases back in the authentication config that we should perhaps consider finishing by removing the old stuff now? 1. krb5 authentication. We've had gssapi since 8.3 (which means in all supported versions). krb5 has been deprecated, also since 8.3. Time to remove it? 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old syntax deprecated but still mapping to the new one. Has it been there long enough that we should start throwing an error for ident on unix? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
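For readers following along, the two deprecations map onto pg_hba.conf lines roughly as follows (illustrative sample lines only, not from any real configuration):

```
# Deprecated (pre-9.1 spelling): "ident" on a local line actually meant peer auth
local   all   all                 ident

# Current spelling since 9.1:
local   all   all                 peer

# krb5, deprecated since 8.3, and its GSSAPI replacement:
host    all   all   0.0.0.0/0     krb5
host    all   all   0.0.0.0/0     gss
```

The method names (ident, peer, krb5, gss) are the real pg_hba.conf keywords; the database/user/address columns here are placeholders.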
Re: [HACKERS] Global Sequences
Simon Riggs writes: > Not sure how it is cleaner when we have to have trigger stuff hanging > around to make one object pretend to be another. That also creates a > chain of dependency which puts this into the future, rather than now. Yes, that part isn't cleaner at all. The part where we have a separate object to deal with I like better, and I tried to reconcile the two viewpoints. Note that the event trigger would come installed and disabled; the user would only have to activate it: ALTER EVENT TRIGGER distribute_my_cluster ENABLE; Still the same issue. > The goal is make-sequences-work, not to invent something new that > might be cooler or more useful. If we create something new, then we > need to consider the references Daniel described, but that is a whole > different thing and already accessible if you need/want that. So OK, I withdraw my consensus proposal. I tried. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] [PATCH] pg_dump: Sort overloaded functions in deterministic order
On Wed, Oct 17, 2012 at 11:43 PM, Alvaro Herrera wrote: > Uh, the patch you posted keeps the pg_get_function_identity_arguments > call in dumpFunc, but there is now also a new one in getFuncs. Do we > need to remove the second one? It could be done, but unfortunately we cannot use the value computed in dumpFunc(), because getFuncs() is called before dumpFunc(). The patch currently only affects getFuncs(), it doesn't touch dumpFunc(). What could be done is to keep the changes in getFuncs(), and also change dumpFunc() to use the value computed in getFuncs(), but I think the gain is small in relation to the complexity of changing dumpFunc(), as we would still need to make the two other function calls in the SQL query in dumpFunc() to pg_get_function_arguments() and pg_get_function_result(). > Here's an updated patch for your consideration. I was about to push > this when I noticed the above. The only change here is that the extra > code that tests for new remoteVersions in the second "else if" branch of > getFuncs and getAggregates has been removed, since it cannot ever be > reached. Looks really good.
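A sketch of the kind of query getFuncs() could issue (illustrative, not the patch's exact SQL): pg_get_function_identity_arguments() produces a canonical argument-type string, which makes (proname, funcargs) a deterministic sort key for overloaded functions.

```sql
-- Sketch: fetch each function's identity argument string once in getFuncs(),
-- so overloaded functions sort deterministically by (name, argument types)
-- rather than by OID, which can differ between dumps.
SELECT p.tableoid, p.oid, p.proname,
       pg_catalog.pg_get_function_identity_arguments(p.oid) AS funcargs
FROM pg_catalog.pg_proc p
ORDER BY p.proname,
         pg_catalog.pg_get_function_identity_arguments(p.oid);
```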
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On Thursday, October 18, 2012 06:12:02 AM Kevin Grittner wrote: > Kevin Grittner wrote: > > Hmm. The comment is probably better now, but I've been re-checking > > the code, and I think my actual code change is completely wrong. > > Give me a bit to sort this out. > > I'm having trouble seeing a way to make this work without rearranging > the code for concurrent drop to get to a state where it has set > indisvalid = false, made that visible to all processes, and ensured > that all scans of the index are complete -- while indisready is still > true. That is the point where TransferPredicateLocksToHeapRelation() > could be safely called. Then we would need to set indisready = false, > make that visible to all processes, and ensure that all access to the > index is complete. I can't see where it works to set both flags at > the same time. I want to sleep on it to see if I can come up with any > other way, but right now that's the only way I'm seeing to make DROP > INDEX CONCURRENTLY compatible with SERIALIZABLE transactions. :-( In a nearby bug I had to restructure the code in a way that's similar to this anyway, so that seems fine. Maybe you can fix the bug on top of the two attached patches? Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 037daa464d3fc63dbc943b13dd90f477a4fc9aba Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Tue, 25 Sep 2012 01:41:29 +0200 Subject: [PATCH 1/2] Fix concurrency issues in concurrent index drops Previously a DROP INDEX CONCURRENTLY started with unsetting indisvalid *and* indisready. That's problematic if some transaction is still looking at the index and another transaction makes changes. See the example below. Now we do the drop in three stages, just as a concurrent index build. First unset indisvalid, wait, unset indisready, wait, drop index. 
Example: Session 1: CREATE TABLE test_drop_concurrently(id serial primary key, data int); INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10); CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data); BEGIN; EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343; SELECT * FROM test_drop_concurrently WHERE data = 34343; (1 row) Session 2: BEGIN; SELECT * FROM test_drop_concurrently WHERE data = 34343; Session 3: DROP INDEX CONCURRENTLY test_drop_concurrently_data; (in-progress) Session 2: INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10); COMMIT; Session 1: SELECT * FROM test_drop_concurrently WHERE data = 34343; (1 row) SET enable_bitmapscan = false; SET enable_indexscan = false; SELECT * FROM test_drop_concurrently WHERE data = 34343; (2 rows) --- src/backend/catalog/index.c | 99 --- 1 file changed, 84 insertions(+), 15 deletions(-) diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 464950b..b39536e 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -1316,6 +1316,10 @@ index_drop(Oid indexId, bool concurrent) * table lock strong enough to prevent all queries on the table from * proceeding until we commit and send out a shared-cache-inval notice + that will make them update their index lists. + * + * In the concurrent case we make sure that nobody can be looking at the + * indexes by dropping the index in multiple steps, so we don't need a full + * fledged AccessExclusiveLock yet. */ heapId = IndexGetRelation(indexId, false); if (concurrent) @@ -1336,7 +1340,19 @@ index_drop(Oid indexId, bool concurrent) /* * Drop Index concurrently is similar in many ways to creating an index - * concurrently, so some actions are similar to DefineIndex() + * concurrently, so some actions are similar to DefineIndex() just in the + * reverse order. 
+ * + * First we unset indisvalid so queries starting afterwards don't use the + * index to answer queries anymore. We have to keep indisready = true + * though so transactions that are still using the index can continue to + * see valid index contents. E.g. when they are using READ COMMITTED mode, + * and another transaction that started later makes changes and + * commits, they need to see those new tuples in the index. + * + * After all transactions that could possibly have used it for queries + * ended we can unset indisready and wait till nobody could be updating it + * anymore. */ if (concurrent) { @@ -1355,21 +1371,14 @@ elog(ERROR, "cache lookup failed for index %u", indexId); indexForm = (Form_pg_index) GETSTRUCT(tuple); - indexForm->indisvalid = false; /* make unusable for queries */ - indexForm->indisready = false; /* make invisible to changes */ + indexForm->indisvalid = false; /* make unusable for new queries */ + /* we keep indisready == true so it still gets updated */ simple_heap_update(i
[HACKERS] [BUG] False indication in pg_stat_replication.sync_state
Hello. My colleague found that pg_stat_replication.sync_state shows a false state under some conditions. This prevents Pacemaker from completing a fail-over that could safely be done. The point is in walsender.c, pg_stat_get_wal_senders(), below (as of REL9_2_1): 1555:if (walsnd->pid != 0) 1556:{ 1557: /* 1558: * Treat a standby such as a pg_basebackup background process 1559: * which always returns an invalid flush location, as an 1560: * asynchronous standby. 1561: */ ! 1562: sync_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ? 1563: 0 : walsnd->sync_standby_priority; Here, XLogRecPtrIsInvalid(walsnd->flush) is defined as (walsnd->flush.xrecoff == 0), which becomes true at every WAL 'file' (not segment) boundary. xrecoff == 0 is certainly invalid as the start point of a WAL *RECORD*, but should be considered valid in a replication stream. This check was introduced in 9.2.0 and in the minor upgrade between 9.1.4 and 9.1.5. | DEBUG: write 4/0 flush 3/FEFFEC68 apply 3/FEFFEC68 | DEBUG: write 4/0 flush 4/0 apply 3/FEFFEC68 | DEBUG: HOGE: flush = 3/FEFFEC68 sync_priority[0] = 1 | DEBUG: write 4/111C0 flush 4/0 apply 3/FEFFECC0 !| DEBUG: HOGE: flush = 4/0 sync_priority[0] = 0 This zero value of sync_priority[0] makes sync_state 'async' erroneously and confuses Pacemaker. # The log lines marked with 'HOGE' above were printed by applying the # patch at the bottom of this message and invoking 'select # sync_state from pg_stat_replication' periodically. To increase # the chance of seeing the symptom, sleep 1 second at 'file' # boundaries :-) Heikki's recent(?) commit 0ab9d1c4b31622e9176472b4276f3e9831e3d6ba, which changes the format of XLogRecPtr from a logid:xrecoff struct to a 64-bit linear address, would fix the false indication. But I suppose that patch won't be applied to the existing 9.1.x and 9.2.x because of the modification to the streaming protocol. 
As far as I can see from the patch, it wouldn't change the meaning of XLogRecPtr to change XLogRecPtrIsInvalid from (xrecoff == 0) to (xrecoff == 0 && xlogid == 0). But this change affects a rather wide portion of the WAL-handling code, whereas all that is needed here is to stop the false indication. On the other hand, pg_basebackup seems to return 0/0 as its flush and apply positions, so it seems enough to add xlogid == 0 to the condition. The patch attached for REL9_2_1 does this and yields the following result. | DEBUG: write 2/FEFFFD48 flush 2/FEFFFD48 apply 2/FEFF7AB0 | DEBUG: write 3/0 flush 2/FEFFFD48 apply 2/FEFF7E88 | DEBUG: write 3/0 flush 3/0 apply 2/FEFFFD48 | DEBUG: HOGE: flush = 2/FEFFFD48 sync_priority[0] = 1 | DEBUG: write 3/E338 flush 3/0 apply 2/FE80 !| DEBUG: HOGE: flush = 3/0 sync_priority[0] = 1 I think this patch should be applied for 9.2.2 and 9.1.7. regards, -- Kyotaro Horiguchi NTT Open Source Software Center = The patch for this test. diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 064ddd5..19f79d1 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -618,6 +618,10 @@ ProcessStandbyReplyMessage(void) reply.flush.xlogid, reply.flush.xrecoff, reply.apply.xlogid, reply.apply.xrecoff); + if (reply.write.xrecoff == 0 || + reply.flush.xrecoff == 0) + sleep(1); + /* * Update shared state for this WalSender process based on reply data from * standby. @@ -1561,7 +1565,10 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS) */ sync_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ? 
0 : walsnd->sync_standby_priority; - + elog(DEBUG1, "HOGE: flush = %X/%X sync_priority[%d] = %d", +walsnd->flush.xlogid, walsnd->flush.xrecoff, +i, sync_priority[i]); + if (walsnd->state == WALSNDSTATE_STREAMING && walsnd->sync_standby_priority > 0 && (priority == 0 || diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 064ddd5..1d4cbc4 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -1555,11 +1555,11 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS) if (walsnd->pid != 0) { /* - * Treat a standby such as a pg_basebackup background process - * which always returns an invalid flush location, as an + * Treat a standby such as a pg_basebackup background process which + * always returns 0/0 (InvalidXLogRecPtr) as flush location, as an * asynchronous standby. */ - sync_priority[i] = XLogRecPtrIsInvalid(walsnd->f
[HACKERS] Database object names and libpq in UTF-8 locale on Windows
Hello, Using PostgreSQL 9.2.1 on Windows, I am facing a strange character set issue with a UTF-8 database. Maybe this is expected, but I want to be sure that I am not missing something. On Windows, I have created a database with: ENCODING = 'UTF-8' LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United States.1252' I am using libpq with a simple C program. The program handles UTF-8 strings. While it's not a traditional "UNICODE" Windows application using WCHAR, it should be possible to send and get back UTF-8 data with PostgreSQL. Interacting with the Windows system in the system locale is another story, but from a pure C / SQL / libpq point of view, as long as the PostgreSQL client encoding is properly defined as UTF-8, it should work, and it mostly does: - I can use UTF-8 string constants in my queries. - I can pass UTF-8 data to the database with parameterized queries. - I can fetch UTF-8 data from the database. - I can create db object names with UTF-8 characters. But the db object names must be specified with double quotes: when I do not use quoted db object names, I get a strange problem. The table is created, I can use it in my program, I can even use it in the pgAdmin query tool, but in the pgAdmin db browser, there is no table name displayed in the treeview... Further, when selecting schema information from pg_class, I can see that the table exists, but there is nothing displayed in the relname attribute... It appears that the problem disappears when using a "C" collation and char type: ENCODING = 'UTF-8' LC_COLLATE = 'C' LC_CTYPE = 'C' I suspect this has something to do with the fact that non-quoted identifiers are converted to lowercase, and because my LC_CTYPE is English_United States.1252, the conversion to lowercase fails... But: - why does PostgreSQL accept to create the table with invalid UTF-8 characters? - why can I make queries in the query tool with the UTF-8 table name? 
(note that show client_encoding in the query tool gives me "UNICODE" - is this the same as "UTF-8"?) So is there a bug, or do I have to use a C collation and char type for UTF-8 databases on Windows? I know that UTF-8 is not supported by the Windows C/POSIX library (setlocale, and co). Does it mean that it's not realistic to use UTF-8 encoding for PostgreSQL databases on Windows...? Thanks for reading. Sebastien FLAESCH Four Js Development Tools
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility
On 17 October 2012 11:26, Hannu Krosing wrote: > LOGGED ONLY TABLE is a very technical description of the realisation - I'd > prefer it to work as much like a table as possible, similar to how VIEW > currently works - for all usages that make sense, you can simply > substitute it for a TABLE > > QUEUE emphasizes the aspect of a logged only table that it accepts > "records" in a certain order, persists these and then guarantees > that they can be read out in exactly the same order - all this being > guaranteed by existing WAL mechanisms. > > It is not meant to be a full implementation of an application level queuing > system though but just the capture, persisting and distribution parts > > Using this as an "application level queue" needs a set of interface > functions to extract the events and also to keep track of the processed > events. As there is no general consensus on what these should be (like if > processing the same event twice is allowed) this part is left for specific > queue consumer implementations. The two halves of the queue are the TAIL/entry point and the HEAD/exit point. As you point out these could be on different servers, wherever the logical changes flow to, but could also be on the same server. When the head and tail are on the same server, the MESSAGE QUEUE syntax seems appropriate, but I agree that calling it that when it's just a head or just a tail seems slightly misleading. I guess the question is whether we provide a full implementation or just the first half. We do, I think, want a full queue implementation in core. We also want to allow other queue implementations to interface with Postgres, so we probably want to allow "first half" only as well. Meaning we want both head and tail separately in core code. The question is whether we require both head and tail in core before we allow commit, to which I would say I think adding the tail first is OK, and adding the head later when we know exactly the design. 
Having said that, the LOGGING ONLY syntax makes me shiver. Better name? I should also add that this is a switchable sync/asynchronous transactional queue, whereas LISTEN/NOTIFY is a synchronous transactional queue. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility
On 17 October 2012 21:25, Josh Berkus wrote: > >> It is not meant to be a full implementation of an application level queuing >> system though but just the capture, persisting and distribution parts >> >> Using this as an "application level queue" needs a set of interface >> functions to extract the events and also to keep track of the processed >> events. As there is no general consensus on what these should be (like if >> processing the same event twice is allowed) this part is left for specific >> queue consumer implementations. > > Well, but AFAICT, you've already prohibited features through your design > which are essential to application-level queues, and are implemented by, > for example, pgQ. > > 1. your design only allows the queue to be read on replicas, not on the > node where the item was inserted. > > 2. if you can't UPDATE or DELETE queue items -- or LOCK them -- how on > earth would a client know which items they have executed and which they > haven't? > > 3. Double-down on #2 in a multithreaded environment. It's hard to work out how to reply to this because it's just so off base. I don't agree with the restrictions you think you see at all, saying it politely rather than giving a one-word answer. The problem here is you phrase these things with too much certainty, seeing only barriers. The "how on earth?" vibe is not appropriate at all. It's perfectly fine to ask for answers to those difficult questions, but don't presume that there are no answers, or that you know with certainty they are even hard ones. By phrasing things in such a closed way the only way forwards is through you, which does not help. All we're discussing is moving a successful piece of software into core, which has been discussed for years at the international technical meetings we've both been present at. I think an open viewpoint on the feasibility of that would be reasonable, especially when it comes from one of the original designers. 
I apologise for making a personal comment, but this does affect the technical discussion. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Global Sequences
On 16 October 2012 18:29, Tom Lane wrote: > Or maybe better, invent a level of indirection like a "sequence access > method" (comparable to index access methods) that provides a compatible > set of substitute functions for sequence operations. If you want to > override nextval() for a sequence, don't you likely also need to > override setval(), currval(), etc? Not to mention overriding ALTER > SEQUENCE's behavior. Agreed, though with the exact API as discussed in the portion of the thread with Markus. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services