Re: [HACKERS] [PATCH] pg_upgrade: support for btrfs copy-on-write clones

2013-11-15 Thread Heikki Linnakangas

On 05.10.2013 16:57, Oskari Saarenmaa wrote:

05.10.2013 16:38, Bruce Momjian wrote:

On Fri, Oct  4, 2013 at 10:42:46PM +0300, Oskari Saarenmaa wrote:

Thanks for the offers, but it looks like ZFS doesn't actually implement
a similar file level clone operation.  See
https://github.com/zfsonlinux/zfs/issues/405 for discussion on a feature
request for it.

ZFS does support cloning entire datasets which seem to be similar to
btrfs subvolume snapshots and could be used to set up a new data
directory for a new $PGDATA.   This would require the original $PGDATA
to be a dataset/subvolume of its own and quite a bit different logic
(than just another file copy method in pg_upgrade) to initialize the new
version's $PGDATA as a snapshot/clone of the original.  The way this
would work is that the original $PGDATA dataset/subvolume gets cloned to
a new location after which we move the files out of the way of the new
PG installation and run pg_upgrade in link mode.  I'm not sure if
there's a good way to integrate this into pg_upgrade or if it's just
something that could be documented as a fast way to run pg_upgrade
without touching original files.

With btrfs tooling the sequence would be something like this:

   btrfs subvolume snapshot /srv/pg92 /srv/pg93
   mv /srv/pg93/data /srv/pg93/data92
   initdb /srv/pg93/data
   pg_upgrade --link \
   --old-datadir=/srv/pg93/data92 \
   --new-datadir=/srv/pg93/data


Does btrfs support file system snapshots?  If so, shouldn't people just
take a snapshot of the old data directory before the upgrade, rather
than using cloning?


Yeah, it's possible to clone an existing subvolume, but this requires
that $PGDATA is a subvolume of its own and would be a bit difficult to
integrate into existing pg_upgrade scripts.

The BTRFS_IOC_CLONE ioctl operates on file level and can be used to
clone files anywhere in a btrfs filesystem.
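
For reference, here is a minimal sketch (not part of the patch) of what a
per-file clone through that ioctl looks like; it assumes <linux/btrfs.h>
provides BTRFS_IOC_CLONE and trims error handling to the essentials:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/btrfs.h>        /* BTRFS_IOC_CLONE */

    static int
    clone_file(const char *src, const char *dst)
    {
        int     src_fd = open(src, O_RDONLY);
        int     dst_fd = open(dst, O_CREAT | O_WRONLY | O_TRUNC, 0600);

        if (src_fd < 0 || dst_fd < 0)
            return -1;
        /* share src's extents with dst; no file data is copied */
        if (ioctl(dst_fd, BTRFS_IOC_CLONE, src_fd) < 0)
        {
            perror("BTRFS_IOC_CLONE");
            return -1;
        }
        close(src_fd);
        close(dst_fd);
        return 0;
    }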


Hmm, you can also do

cp --reflink -r data92 data-tmp
pg_upgrade --link --old-datadir=data-tmp --new-datadir=data93
rm -rf data-tmp

That BTRFS_IOC_CLONE ioctl seems so hacky that I'd rather not get that 
in our source tree. cp --reflink is much more likely to get that magic 
incantation right, since it gets a lot more attention and testing than 
pg_upgrade.


I'm not in favor of adding filesystem-specific tricks into pg_upgrade. 
It would be nice to list these tricks in the docs, though.


- Heikki




Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread Andres Freund
On 2013-11-15 14:22:30 +1300, David Rowley wrote:
 On Fri, Nov 15, 2013 at 3:12 AM, Andres Freund and...@2ndquadrant.com wrote:
 
  Hi,
 
  On 2013-11-13 22:55:43 +1300, David Rowley wrote:
   Here
   http://www.postgresql.org/message-id/24278.1352922...@sss.pgh.pa.us
   there was some talk about init_sequence being a bottleneck when many
   sequences are used in a single backend.
  
   The attached I think implements what was talked about in the above link,
   which for me seems to double the speed of a currval() loop over 30000
   sequences. It goes from about 7 seconds to 3.5 on my laptop.
 
  I think it'd be a better idea to integrate the sequence caching logic
  into the relcache. There's a comment about it:
   * (We can't
   * rely on the relcache, since it's only, well, a cache, and may decide to
   * discard entries.)
  but that's not really accurate anymore. We have the infrastructure for
  keeping values across resets and we don't discard entries.
 
 
 I just want to check this idea against an existing TODO item to move
 sequences into a single table, as by the sounds of it this binds
 sequences even more closely to being relations.

 This had been on the back of my mind while implementing the hash table
 stuff for init_sequence, and again when doing my benchmarks, where I created
 30000 sequences and went through the pain of having a path on my file
 system with 30000 8kB files.

Well. But in which real-world use cases is that actually the bottleneck?

 1. The search_path stuff makes this a bit more complex. It sounds like this
 would require some duplication of the search_path logic.

I'd assumed that if we were to do this, the sequences themselves would
still continue to live in pg_class; just, instead of a relfilenode
containing their state, it would be stored in an extra table.

 2. There is also the problem with tracking object dependency.
 
 Currently:
 create sequence t_a_seq;
 create table t (a int not null default nextval('t_a_seq'));
 alter sequence t_a_seq owned by t.a;
 drop table t;
 drop sequence t_a_seq; -- already deleted by drop table t
 ERROR:  sequence t_a_seq does not exist
 
 Moving sequences to a single table sounds like a special case for this
 logic.

There should already be code handling such dependencies.

4) Scalability problems: The one block sequences use can already be a
major contention issue when you have parallel inserts to the same
table, a workload which, unlike a couple thousand unrelated sequences,
I actually think is more realistic. So we'd need to force 1 sequence tuple
per block, which we currently cannot do.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread Andres Freund
On 2013-11-15 19:12:15 +1300, David Rowley wrote:
 On Fri, Nov 15, 2013 at 3:23 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  Andres Freund and...@2ndquadrant.com writes:
   I think it'd be a better idea to integrate the sequence caching logic
   into the relcache. There's a comment about it:
* (We can't
* rely on the relcache, since it's only, well, a cache, and may decide
  to
* discard entries.)
   but that's not really accurate anymore. We have the infrastructure for
   keeping values across resets and we don't discard entries.
 
  We most certainly *do* discard entries, if they're not open when a cache
  flush event comes along.
 
  I suppose it'd be possible to mark a relcache entry for a sequence
  as locked-in-core, but that doesn't attract me at all.  A relcache
  entry is a whole lot larger than the amount of state we really need
  to keep for a sequence.
 
  One idea is to have a hashtable for the sequence-specific data,
  but to add a link field to the relcache entry that points to the
  non-flushable sequence hashtable entry.  That would save the second
  hashtable lookup as long as the relcache entry hadn't been flushed
  since last use, while not requiring any violence to the lifespan
  semantics of relcache entries.  (Actually, if we did that, it might
  not even be worth converting the list to a hashtable?  Searches would
  become a lot less frequent.)
 
 
 Unless I've misunderstood something it looks like this would mean giving
 heapam.c and relcache.c knowledge of sequences.
 Currently relation_open is called from open_share_lock in sequence.c. The
 only way I can see to do this would be to add something like
 relation_open_sequence() in heapam.c which means we'd need to invent
 RelationIdGetSequenceRelation() and use that instead
 of RelationIdGetRelation() and somewhere along the line have it pass back
 the SeqTableData struct which would be tagged onto RelIdCacheEnt.

Looks like indexes already go through that path, and store data in the
relcache, without requiring heapam.c to know all that much.

 I think it can be done but I don't think it will look pretty.
 Perhaps if there was a more generic way... Would tagging some void
 *rd_extra onto the RelationData be a better way? And just have sequence.c
 make use of that for storing the SeqTableData.

I'd rather have it properly typed in ->rd_sequence, maybe in a union
against the index members.
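
A rough sketch of the shape under discussion; the field and type names
here are hypothetical, not from any patch:

    /* sequence state, kept in a separate non-flushable hash table */
    typedef struct SeqTableData
    {
        Oid     relid;          /* hash key: OID of the sequence */
        int64   last_value;     /* state needed by nextval()/currval() */
        bool    last_valid;
    } SeqTableData;

    /* extra relcache field: a cached pointer into that hash table */
    struct RelationData
    {
        /* ... existing relcache fields ... */
        SeqTableData *rd_sequence;  /* NULL until first use; safe to reset
                                     * on relcache flush, since the hash
                                     * entry itself survives */
    };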

 Also I'm wondering what we'd do with all these pointers when someone does
 DISCARD SEQUENCES; would we have to invalidate the relcache, or would it
 just be a matter of looping over it, freeing the sequence data and setting
 the pointers to NULL?

Good question. Have a 'discard_count' member and a global variable that
starts at zero and gets incremented every time somebody discards?
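
Something along those lines, with invented names, might look like:

    static uint32 seq_discard_count = 0;    /* bumped by DISCARD SEQUENCES */

    typedef struct SeqEntry
    {
        Oid     relid;
        int64   last_value;
        uint32  discard_count;  /* seq_discard_count when entry was filled */
    } SeqEntry;

    /* entries created before the last DISCARD SEQUENCES are stale */
    static bool
    seq_entry_valid(const SeqEntry *entry)
    {
        return entry->discard_count == seq_discard_count;
    }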

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Cannot allocate memory

2013-11-15 Thread Heng Zhi Feng (zh...@hsr.ch)
Hi,

We have a problem where PostgreSQL will restart or shut down when called 
through the PDAL write filter. This happened after we applied pgtune to the 
postgresql.conf.

These are the settings on the machine:
Virtual Machine - Ubuntu 13.10
1.92GB Memory
2 Parallel Processors

And these are the configurations from pgtune:
#--
# CUSTOMIZED OPTIONS
#--
default_statistics_target = 50   # pgtune wizard 2013-11-14
maintenance_work_mem = 112MB   # pgtune wizard 2013-11-14
constraint_exclusion = on   # pgtune wizard 2013-11-14
checkpoint_completion_target = 0.9 # pgtune wizard 2013-11-14
effective_cache_size = 1408MB   # pgtune wizard 2013-11-14
work_mem = 11MB  # pgtune wizard 2013-11-14
wal_buffers = 8MB # pgtune wizard 2013-11-14
checkpoint_segments = 16 # pgtune wizard 2013-11-14
shared_buffers = 448MB# pgtune wizard 2013-11-14
max_connections = 80  # pgtune wizard 2013-11-14

I have also set shmmax to a higher value to adapt to the new configuration, 
but it does not seem to solve the problem.

Below is a snippet of the postgresql.log:
2013-11-15 11:02:35 CET LOG:  could not fork autovacuum worker process: Cannot 
allocate memory
2013-11-15 11:02:36 CET LOG:  could not send data to client: Broken pipe
2013-11-15 11:02:36 CET LOG:  unexpected EOF on client connection

Thanks

Zhi Feng


Re: [HACKERS] Logging WAL when updating hintbit

2013-11-15 Thread Sawada Masahiko
On Thu, Nov 14, 2013 at 7:51 PM, Florian Weimer fwei...@redhat.com wrote:
 On 11/14/2013 07:02 AM, Sawada Masahiko wrote:

 I attached patch adds new wal_level 'all'.


 Shouldn't this be a separate setting?  It's useful for storage which
 requires rewriting a partially written sector before it can be read again.


Thank you for the comment.
Actually, I had thought about adding a separate parameter.
If so, would this feature log enough information at all wal_level
settings (e.g., minimal)?
I also thought that the relation between wal_level and the new feature
might confuse users.

Regards,

---
Sawada Masahiko




Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread Heikki Linnakangas

On 15.11.2013 07:47, David Rowley wrote:

On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas hlinnakan...@vmware.com
wrote:

I think that means that we should just completely replace the list with
the hash table. The difference with a small N is lost in noise, so there's
no point in keeping the list as a fast path for small N. That'll make the
patch somewhat simpler.


Attached is a much more simple patch which gets rid of the initial linked
list.


Thanks, committed with minor copy-editing. I dialed down the initial 
size of the hash table from 1000 to 16, that ought to be enough.


I ran a quick performance test of my own, based on the script you sent. 
I modified it a bit to eliminate the PL/pgSQL overhead, making it more 
heavily bottlenecked by the nextval/currval overhead. Results:


                    unpatched  patched
nextval, 10000 seqs     36772     2426
currval, 1 seq           1176     1069
currval, 2 seqs           865      857
currval, 4 seqs           742      759
currval, 5 seqs           718      711
currval, 10 seqs          680      668
currval, 100 seqs         871      656
currval, 1000 seqs       3507      700
currval, 10000 seqs     34742     1224

The performance when you touch only a few sequences is unchanged. When 
you touch a lot of them, you gain. Just as you would expect.


Attached is the test script I used. After running the test, I realized 
that there's a little flaw in the test methodology, but it doesn't 
invalidate the results. I used the same backend for all the test runs, 
so even when currval() is called repeatedly for a single sequence, the 
linked list (or hash table, with the patch) nevertheless contains 
entries for all 10000 sequences. However, the sequences actually used by 
the test are always at the front of the list, because the nextval() 
calls were made in the same order. But with the unpatched version, if 
you called currval() repeatedly on the last-initialized sequence, 
instead of the first-initialized one, you would get horrible 
performance, even when you touch only a single sequence.
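
For context, the unpatched init_sequence() located entries with a linear
scan of roughly this shape, which is why a sequence's position in the list
mattered:

    SeqTable    elm;

    /* walk the backend-local list of sequences touched so far */
    for (elm = seqtab; elm != NULL; elm = elm->next)
    {
        if (elm->relid == relid)
            break;              /* cost grows with position in the list */
    }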


Regarding the more grandiose ideas of using the relcache or rewriting 
the way sequences are stored altogether: this patch might become 
obsolete if we do any of that stuff, but that's ok. The immediate 
performance problem has been solved now, but those other ideas might be 
worth pursuing for other reasons.


- Heikki
CREATE or replace FUNCTION create_seq(n integer) RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
drop table if exists testseqs;
create table testseqs(seqoid oid, n int4);
WHILE n > 0 LOOP
EXECUTE 'CREATE SEQUENCE testseq' || n;
	insert into testseqs select oid, n from pg_class where relname=('testseq' || n);
  n := n - 1;
END LOOP;
END
$$;

CREATE or replace FUNCTION drop_seq() RETURNS void language plpgsql as
$$
declare
   n int4;
begin
   for n in select testseqs.n from testseqs loop
 execute 'drop sequence testseq' || n;
   end loop;
end;
$$;


CREATE or replace FUNCTION nextval_seq(nseqs integer, niter integer) RETURNS bigint
LANGUAGE sql
AS $$
select count(*) from (
  select nextval(seqoid), n, g from testseqs, generate_series(1, niter) g where n <= nseqs
) foo;
$$;


CREATE or replace FUNCTION currval_seq(nseqs integer, niter integer) RETURNS bigint
LANGUAGE sql
AS $$
select count(*) from (
  select currval(seqoid), n, g from testseqs, generate_series(1, niter) g where n <= nseqs
) foo;
$$;


\timing on
--select create_seq(10000);

\echo warmup
select nextval_seq(10000, 100);

\echo calling nextval on 10000 distinct sequences, 100 times
select nextval_seq(10000, 100);

\echo calling currval on 1 sequence, 100 times
select currval_seq(1, 100);

\echo calling currval on 2 sequences, 50 times
select currval_seq(2, 50);

\echo calling currval on 4 sequences, 25 times
select currval_seq(4, 25);

\echo calling currval on 5 sequences, 20 times
select currval_seq(5, 20);

\echo calling currval on 10 sequences, 10 times
select currval_seq(10, 10);

\echo calling currval on 100 sequences, 1 times
select currval_seq(100, 1);

\echo calling currval on 1000 sequences, 1000 times
select currval_seq(1000, 1000);

\echo calling currval on 10000 distinct sequences, 100 times
select currval_seq(10000, 100);





Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread Andres Freund
On 2013-11-15 12:31:54 +0200, Heikki Linnakangas wrote:
 On 15.11.2013 07:47, David Rowley wrote:
 On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:
 
 I think that means that we should just completely replace the list with
 the hash table. The difference with a small N is lost in noise, so there's
 no point in keeping the list as a fast path for small N. That'll make the
 patch somewhat simpler.
 
 Attached is a much more simple patch which gets rid of the initial linked
 list.
 
 Thanks, committed with minor copy-editing. I dialed down the initial size of
 the hash table from 1000 to 16, that ought to be enough.

I am slightly confused by the use of HASH_CONTEXT without setting
ctl->hcxt; that works, but seems more like an accident.
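
For comparison, a minimal sketch of passing HASH_CONTEXT deliberately,
assuming the table is meant to live in TopMemoryContext:

    HASHCTL     ctl;

    MemSet(&ctl, 0, sizeof(ctl));
    ctl.keysize = sizeof(Oid);
    ctl.entrysize = sizeof(SeqTableData);
    ctl.hash = oid_hash;
    ctl.hcxt = TopMemoryContext;    /* explicit, instead of implied */
    seqhashtab = hash_create("Sequence values", 16, &ctl,
                             HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);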

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread David Rowley
On Fri, Nov 15, 2013 at 11:31 PM, Heikki Linnakangas 
hlinnakan...@vmware.com wrote:

 Thanks, committed with minor copy-editing. I dialed down the initial size
 of the hash table from 1000 to 16, that ought to be enough.


Great. Thanks for committing.

Regards

David Rowley


 - Heikki



Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-11-15 Thread Haribabu kommi
On 14 November 2013 23:59 Fujii Masao wrote:
 On Thu, Nov 14, 2013 at 9:08 PM, Haribabu kommi
 haribabu.ko...@huawei.com wrote:
  Please find attached the patch, for adding a new option for
  pg_basebackup, to specify a different directory for pg_xlog.
 
 Sounds good! Here are the review comments:
 
 +printf(_("--xlogdir=XLOGDIR   location for the
 transaction log directory\n"));
 
 This message is not aligned well.

Fixed.
 
 -if (!streamwal || strcmp(filename +
 strlen(filename) - 8, "/pg_xlog") != 0)
 +if ((!streamwal && (strcmp(xlog_dir, "") == 0))
 +|| strcmp(filename + strlen(filename) -
 8, "/pg_xlog") != 0)
 
 You need to update the source code comment.

Corrected the source code comment. Please check once.
 
 +#ifdef HAVE_SYMLINK
 +if (symlink(xlog_dir, linkloc) != 0)
 +{
 +fprintf(stderr, _("%s: could not create symbolic link
 \"%s\": %s\n"),
 +progname, linkloc, strerror(errno));
 +exit(1);
 +}
 +#else
 +fprintf(stderr, _("%s: symlinks are not supported on this
 platform "
 +  "cannot use xlogdir"));
 +exit(1);
 +#endif
 +}
 +}
 
 Is it better to call pg_free() at the end? Even if we don't do that, it
 would be almost harmless, though.

Added pg_free to free up the linkloc.

 Don't we need to prevent users from specifying the same directory in
 both --pgdata and --xlogdir?

I feel no need to prevent it; even if the user specifies the same directory 
for both --pgdata and --xlogdir, all the transaction log files will be 
created in the base directory instead of the pg_xlog directory. 

Regards,
Hari babu.


UserSpecifiedxlogDir_v2.patch
Description: UserSpecifiedxlogDir_v2.patch



Re: [HACKERS] The number of character limitation of custom script on pgbench

2013-11-15 Thread Sawada Masahiko
On Wed, Nov 13, 2013 at 10:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Eisentraut pete...@gmx.net writes:
 On 11/13/13, 6:18 AM, Sawada Masahiko wrote:
 The custom script feature of pgbench allows only BUFSIZ
 (i.e., 1024 bytes) or less as the length of a SQL statement.
 So I'm thinking following solution.
 (1) to increase buffer size
 (2) to change to variable buffer size
 (3) to return ERROR with information

 I'd go for #2.  But at least an error.

 #2 definitely.  I've run into this limitation myself recently, and
 so have other people.  It's time to fix it.


I attached the patch which solves this problem, and have submitted it to CF3.
I changed how the SQL is read from the custom script file.

Regards,

---
Sawada Masahiko


fix_limitaion_custom_script_pgbench.patch
Description: Binary data



Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2013-11-15 Thread Simon Riggs
On 14 November 2013 03:41, Amit Kapila amit.kapil...@gmail.com wrote:

 I have gone through the mail chain of this thread and tried to find
 the different concerns or open ends for this patch.

Not enough. This feature is clearly being suggested as a way to offer
Postgres in embedded mode for users by a back door. Doing that forces
us to turn off many of the server's features and we will take a huge
step backwards in features, testing, maintainability of code and
wasted community time.

No administrative hassles is just a complete fiction. Admin will
become a huge burden for any user in this mode, which will bite the
community and cause us to waste much time redesigning the server to
operate on a single session.

-1 from me


 4. Secondary connections for data access

Proposal
---
A single-user connection database with no administrative hassles

Concerns
-
As this proposal will not allow any data it stores to be accessed
 by another connection, all forms of replication are excluded and
 all maintenance actions force the database to be
unavailable for a period of time. Those two things are barriers of
 the most major kind to anybody working in an enterprise with connected
 data and devices.

Suggestions for its use or for making it usable

a. a usable & scriptable --single mode is justification enough.
 Having to wait for hours just to enter one more command because --single
 doesn't support any scripts sucks. Especially in
recovery situations.
b. it's worth having this particular thing because it makes
 pg_upgrade more robust.
c. some competing solutions already provide similar solution
 (http://www.firebirdsql.org/manual/fbmetasecur-embedded.html).
d. we need to make sure that this isn't foreclosing the option of
 having a multi-process environment with a single user connection.  I
 don't see that it is, but it might be wise to sketch
exactly how that case would work before accepting this.

Why is it not feasible to run a normal server with 1 connection?

Are we really following what Firebird is doing? Why?

 6. Restricting operation's in single backend mode

Serializable transactions could skip all the SSI predicate locking
 and conflict checking when in single-connection mode. With only one
 connection the transactions could never overlap, so
there would be no chance of serialization anomalies when running
 snapshot isolation.

It could be of use if someone had code they wanted to run under
 both normal and single-connection modes. For single-connection only,
 they could just choose REPEATABLE READ to
get exactly the same semantics.

This is an example of my concern that we would begin optimising for
the case of single user mode and encourage its use by users. This
shows that the feature is not being suggested just for recovery.

PostgreSQL has been designed from the ground up to support
concurrency. If we begin optimising for single user mode it will take
years to unpick our work and eventually we'll have a conflict and
someone will say we can't do that because it will be a problem in
single user mode.


 7. Proposal related to maintenance activities

For maintenance activities, in the longer run, we can have a
 postmaster process that isn't listening on any ports, but is managing
 background processes in addition to a single child backend.

 As per my understanding, to complete this patch we need to
 a. complete the work for #1, #2, #5
 b. #6 and #7 can be done as enhancements after the initial feature is 
 committed
 c. need to decide what should we do for #3 and #4.

Huh? Multi-process mode already works. Why would we need a
solution for the problem that single-process mode uses only one
process? Don't use it.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-11-15 Thread Magnus Hagander
On Fri, Nov 15, 2013 at 12:10 PM, Haribabu kommi
haribabu.ko...@huawei.com wrote:

 On 14 November 2013 23:59 Fujii Masao wrote:
  On Thu, Nov 14, 2013 at 9:08 PM, Haribabu kommi
  haribabu.ko...@huawei.com wrote:
   Please find attached the patch, for adding a new option for
   pg_basebackup, to specify a different directory for pg_xlog.
 
  Sounds good! Here are the review comments:
 
   +printf(_("--xlogdir=XLOGDIR   location for the
   transaction log directory\n"));
 
  This message is not aligned well.

 Fixed.

   -if (!streamwal || strcmp(filename +
   strlen(filename) - 8, "/pg_xlog") != 0)
   +if ((!streamwal && (strcmp(xlog_dir, "") == 0))
   +|| strcmp(filename + strlen(filename) -
   8, "/pg_xlog") != 0)
 
  You need to update the source code comment.

 Corrected the source code comment. Please check once.

   +#ifdef HAVE_SYMLINK
   +if (symlink(xlog_dir, linkloc) != 0)
   +{
   +fprintf(stderr, _("%s: could not create symbolic link
   \"%s\": %s\n"),
   +progname, linkloc, strerror(errno));
   +exit(1);
   +}
   +#else
   +fprintf(stderr, _("%s: symlinks are not supported on this
   platform "
   +  "cannot use xlogdir"));
   +exit(1);
   +#endif
   +}
 
  Is it better to call pg_free() at the end? Even if we don't do that, it
  would be almost harmless, though.

 Added pg_free to free up the linkloc.

  Don't we need to prevent users from specifying the same directory in
  both --pgdata and --xlogdir?

 I feel no need to prevent it; even if the user specifies the same
 directory for both --pgdata and --xlogdir, all the transaction log files
 will be created in the base directory instead of the pg_xlog directory.


Given how easy it would be to prevent that, I think we should. It would be
an easy misunderstanding to specify that when you actually want it in
wherever/pg_xlog. Specifying that would be redundant in the first place,
but people can do that; it would also be very easy to do it by mistake,
and you'd end up with something that's really bad, including a recursive
symlink.

I also think it would probably be worthwhile to support this in tar format
as well, but I guess that's a separate patch really. There's really no
reason we shouldn't support WAL streaming into a separate tar file. But -
separate issue.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2013-11-15 Thread Andres Freund
On 2013-11-15 09:51:28 -0200, Simon Riggs wrote:
 On 14 November 2013 03:41, Amit Kapila amit.kapil...@gmail.com wrote:
 
  I have gone through the mail chain of this thread and tried to find
  the different concerns or open ends for this patch.
 
 Not enough. This feature is clearly being suggested as a way to offer
 Postgres in embedded mode for users by a back door.

I think fixing single user mode to work halfway reasonably is enough
justification for the feature. Having to deal with it when solving
critical issues is just embarrassing.

 Doing that forces
 us to turn off many of the server's features and we will take a huge
 step backwards in features, testing, maintainability of code and
 wasted community time.

I think the patch as proposed actually reduces maintenance overhead
since we don't have to deal with the strange separate codepaths for
single user mode.

But: I very, very much agree with the other concerns around this. This
should be a patch to fix single user mode, not one to make postgres into
a single process database. It's not, and trying to make it by using
single user mode for it will start to hinder development of normal
postgres because we suddenly need to be concerned about performance and
features in situations where we previously weren't.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2013-11-15 Thread Dimitri Fontaine
Andres Freund and...@2ndquadrant.com writes:
 I think fixing single user mode to work halfway reasonable is enough
 justification for the feature. Having to deal with that when solving
 critical issues is just embarassing.

+1

 But: I very, very much agree with the other concerns around this. This
 should be a patch to fix single user mode, not one to make postgres into
 a single process database. It's not, and trying to make it by using
 single user mode for it will start to hinder development of normal
 postgres because we suddenly need to be concerned about performance and
 features in situations where we previously weren't.

+1

Maybe what needs to happen to this patch is a way to restrict its usage
to --single. I'm thinking that postgres --single could maybe be made to
fork the server process underneath the psql controller client process
transparently.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] init_sequence spill to hash table

2013-11-15 Thread Heikki Linnakangas

On 15.11.2013 12:44, Andres Freund wrote:

On 2013-11-15 12:31:54 +0200, Heikki Linnakangas wrote:

On 15.11.2013 07:47, David Rowley wrote:

On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas hlinnakan...@vmware.com
wrote:

I think that means that we should just completely replace the list with
the hash table. The difference with a small N is lost in noise, so there's
no point in keeping the list as a fast path for small N. That'll make the
patch somewhat simpler.


Attached is a much more simple patch which gets rid of the initial linked
list.


Thanks, committed with minor copy-editing. I dialed down the initial size of
the hash table from 1000 to 16, that ought to be enough.


I am slightly confused by the use of HASH_CONTEXT without setting
ctl->hcxt, that works, but seems more like an accident.


Ugh. Accident indeed. Thanks, fixed!

- Heikki




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2013-11-15 Thread Simon Riggs
On 15 November 2013 09:00, Andres Freund and...@2ndquadrant.com wrote:

 This
 should be a patch to fix single user mode, not one to make postgres into
 a single process database.

+1

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2013-11-15 Thread Heikki Linnakangas

On 30.10.2013 19:11, Andres Freund wrote:

On 2013-10-30 22:39:20 +0530, Abhijit Menon-Sen wrote:

At 2013-10-30 11:04:36 -0400, t...@sss.pgh.pa.us wrote:



As a compromise, perhaps we can unconditionally round the size up to be
a multiple of 2MB? […]


That sounds reasonably painless to me.


Here's a patch that does that and adds a DEBUG1 log message when we try
with MAP_HUGETLB, fail, and fall back to ordinary mmap.


But it's in no way guaranteed that the smallest hugepage size is
2MB. It'll be on current x86 hardware, but not on any other platform...


Sure, but there's no big harm done. We're just trying to avoid hitting a 
kernel bug, and as a bonus, we avoid wasting some memory that would 
otherwise be lost due to the kernel rounding the allocation. If the 
smallest hugepage size is smaller than 2MB, we round up the allocation 
unnecessarily, but that doesn't seem serious.
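
A sketch of the try-and-fall-back logic being described, assuming a 2MB
(current x86) smallest huge page size:

    #define HUGE_PAGE_SIZE  (2 * 1024 * 1024)

    /* round the request up to a full huge page, then try MAP_HUGETLB */
    size = (size + HUGE_PAGE_SIZE - 1) & ~((Size) (HUGE_PAGE_SIZE - 1));
    ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
               MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (ptr == MAP_FAILED)
    {
        elog(DEBUG1, "mmap with MAP_HUGETLB failed, falling back: %m");
        ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    }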



I spent some time whacking this around; new patch version attached. I 
moved the mmap() code into a new function, which leaves 
PGSharedMemoryCreate more readable.


I modified the patch so that it throws an error if you set 
huge_tlb_pages=on, and the platform doesn't support MAP_HUGETLB (i.e. 
non-Linux, or EXEC_BACKEND). 'try' is the default, so this only affects 
you if you explicitly set it to 'on'. I think that's the right behavior; 
if you explicitly ask for it, and you don't get it, that should be an 
error. But I'm not wedded to the idea if someone objects; a log message 
might also be reasonable: LOG: huge TLB pages are not supported on this 
platform, but huge_tlb_pages was 'on'


The error message on failed allocation, if huge_tlb_pages=on, needs 
updating:


$ bin/postmaster -D data
FATAL:  could not map anonymous shared memory: Cannot allocate memory
HINT:  This error usually means that PostgreSQL's request for a shared 
memory segment exceeded available memory or swap space. To reduce the 
request size (currently 189390848 bytes), reduce PostgreSQL's shared 
memory usage, perhaps by reducing shared_buffers or max_connections.


The reason the allocation failed in this case was that I used 
huge_tlb_pages=on, but had not configured the kernel for huge pages. The 
hint is quite misleading in that case, it should advise to configure the 
kernel, or turn off huge_tlb_pages.


The documentation needs some work. I think it's pretty user-unfriendly 
to link to https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt. 
It gives a lot of details, and although it explains stuff that is 
relevant, like setting the nr_hugepages sysctl, it also contains a lot 
of stuff that is not relevant to us, like how to mount hugetlbfs. Can we 
do better than that? Is there a better guide somewhere on how to set the 
kernel settings? If not, we should include step-by-step instructions in 
our manual.


The Managing Kernel Resources section in the user manual should also 
be updated to mention how to enable huge pages.


Also, now that I changed huge_tlb_pages='on' to fail on platforms where 
it's not supported at all, the docs need to be updated to reflect it.


- Heikki
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 77a9303..7a60ad0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1052,6 +1052,41 @@ include 'filename'
   </listitem>
  </varlistentry>
 
+ <varlistentry id="guc-huge-tlb-pages" xreflabel="huge_tlb_pages">
+  <term><varname>huge_tlb_pages</varname> (<type>enum</type>)</term>
+  <indexterm>
+   <primary><varname>huge_tlb_pages</> configuration parameter</primary>
+  </indexterm>
+  <listitem>
+   <para>
+    Enables/disables the use of huge TLB pages. Valid values are
+    <literal>try</literal> (the default), <literal>on</literal>,
+    and <literal>off</literal>.
+   </para>
+
+   <para>
+    At present, this feature is supported only on Linux. The setting
+    is ignored on other systems.
+   </para>
+
+   <para>
+    The use of huge TLB pages results in smaller page tables and
+    less CPU time spent on memory management, increasing performance. For
+    more details, see
+    <ulink url="https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt">hugepages.txt</ulink>
+    in the Linux kernel documentation.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>try</literal>,
+    the server will try to use huge pages, but fall back to using
+    normal allocation if that fails. With <literal>on</literal>, failure
+    to use huge pages will prevent the server from starting up. With
+    <literal>off</literal>, huge pages will not be used.
+   </para>
+  </listitem>
+ </varlistentry>
+
  <varlistentry id="guc-temp-buffers" xreflabel="temp_buffers">
   <term><varname>temp_buffers</varname> (<type>integer</type>)</term>
   <indexterm>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index b604407..3ccd2c2 100644
--- a/src/backend/port/sysv_shmem.c
+++ 

Re: [HACKERS] LISTEN / NOTIFY enhancement request for Postgresql

2013-11-15 Thread Pavel Golub
Hello, Dimitri.

You wrote:

DF Bruce Momjian br...@momjian.us writes:
   • . is used to separate names in a path
   • * is used to match any name in a path
   • > is used to recursively match any destination starting from this name
 
 For example, using the example above, these subscriptions are possible:
 
 Subscription          Meaning
 PRICE.>               Any price for any product on any exchange
 PRICE.STOCK.>         Any price for a stock on any exchange
 PRICE.STOCK.NASDAQ.*  Any stock price on NASDAQ
 PRICE.STOCK.*.IBM     Any IBM stock price on any exchange
 
 
 My request is to implement the same or similar feature in Postgresql.

 This does seem useful and pretty easy to implement.  Should we add a
 TODO?

DF I think we should consider the ltree syntax in that case, as documented
DF in the following link:

DF   http://www.postgresql.org/docs/9.3/interactive/ltree.html

Great idea! Thanks for link.

DF Regards,
DF -- 
DF Dimitri Fontaine
DF http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support





-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com





Re: [HACKERS] [PATCH] SQL assertions prototype

2013-11-15 Thread Heikki Linnakangas

On 15.11.2013 05:30, Peter Eisentraut wrote:

Various places in the constraint checking code say something like, if we
ever implement assertions, here is where it should go.  I've been
fiddling with filling in those gaps for some time now, and the other day
I noticed, hey, this actually kind of works, so here it is.  Let's see
whether this architecture is sound.


Cool!


A constraint trigger performs the actual checking.  For the
implementation of the trigger, I've used some SPI hacking for now; that
could probably be refined.  The attached patch has documentation, tests,
psql support.  Missing pieces are pg_dump support, dependency
management, and permission checking (the latter marked in the code).


A fundamental problem with this is that it needs to handle isolation 
reliably, so that the assertion cannot be violated when two concurrent 
backends do things. Consider the example from the manual, which checks 
that a table has at least one row. Now, if the table has two rows to 
begin with, and in one backend you delete one row, and concurrently in 
another backend you delete the other row, and then commit both 
transactions, the assertion is violated.


In other words, the assertions need to be checked in serializable mode. 
Now that we have a real serializable mode, I think that's actually feasible.



PS. The patch doesn't check that the assertion holds when it's created:

postgres=# create table foo (i int4);
CREATE TABLE
postgres=# create assertion myassert check ((select count(*) from foo) > 0);

CREATE ASSERTION

- Heikki




Re: [HACKERS] pg_upgrade misreports full disk

2013-11-15 Thread Robert Haas
On Wed, Nov 13, 2013 at 10:28 AM, Peter Eisentraut pete...@gmx.net wrote:
 When pg_upgrade encounters a full disk while copying relation files,
 it reports this as:

 error while copying relation xyz (...): Success

 because it doesn't set errno in some error cases.  In other places we
 treat short writes as ENOSPC, so here is a patch to do that for
 pg_upgrade as well.
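
 The convention referred to looks roughly like this; write() can return a
 short count without setting errno, so the short write is attributed to
 disk space:

     errno = 0;
     if (write(dst_fd, buffer, nbytes) != nbytes)
     {
         /* a short write usually means the disk filled up */
         if (errno == 0)
             errno = ENOSPC;
         return -1;              /* caller reports strerror(errno) */
     }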

+1 for committing this and back-patching it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] Sort contents entries in reference documentation

2013-11-15 Thread Robert Haas
On Wed, Nov 13, 2013 at 4:40 AM, Colin 't Hart colinth...@gmail.com wrote:
 Hi,

 While looking at the documentation on SELECT I noticed that the
 entries in reference.sgml aren't sorted correctly -- psql \h does have
 them in the correct order.

 Attached a trivial patch to fix this.

Committed and back-patched to 9.3.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] The number of character limitation of custom script on pgbench

2013-11-15 Thread Alvaro Herrera
Sawada Masahiko wrote:

 Yes, I also think #2 is good.
 I will implement the patch.

I remember this was recently discussed on the Spanish list.  Please see
this email:

http://archives.postgresql.org/message-id/48589.192.168.207.54.1382570043.squir...@webmail.etecsa.cu

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Stephen Frost
* Tomas Vondra (t...@fuzzy.cz) wrote:
 On 15 Listopad 2013, 1:00, David Rowley wrote:
  more focused on trying to draw a bit of attention to commit
  061b88c732952c59741374806e1e41c1ec845d50, which uses strncpy and does not
  properly set the last byte to 0 afterwards. I think this case could just
  be replaced with strlcpy, which does all this hard work for us.
 
 Hmm, you mean this piece of code?
 
strncpy(saved_argv0, argv[0], MAXPGPATH);
 
 IMHO you're right that's probably broken, unless there's some checking
 happening before the call.

Agreed, that looks like a place we should be using strlcpy() instead.
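
Presumably a one-line change; strlcpy() truncates if needed and always
NUL-terminates:

    /* before: last byte can stay non-zero if argv[0] fills the buffer */
    strncpy(saved_argv0, argv[0], MAXPGPATH);
    /* after */
    strlcpy(saved_argv0, argv[0], MAXPGPATH);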

Robert, what do you think?

Thanks,

Stephen




Re: [HACKERS] SSL renegotiation

2013-11-15 Thread Alvaro Herrera
So I committed this patch without backpatching anything.  There was some
discussion about the exact strategy for backpatching the behavior
change, but no final agreement; the suggestions were

0. Backpatch as an ERROR.  If it causes failures, people are supposed to
change their apps or something.

1. Don't backpatch the ERROR bit at all, so that if the renegotiation
fails we would silently continue just as currently

2. Do spit out the message, but only as a WARNING.  This may end up
causing log disks to fill up.

3. Add it as an ERROR, but make it possible to disable it, presumably
via a new GUC.  So people can see their security problems and hopefully
fix them, but if they don't, then they can shut it up via server
configuration.  This would entail a GUC variable that exists in existing
branches but not HEAD (we could avoid an upgradability problem of
postgresql.conf files by having a no-op phantom GUC variable in HEAD).

I was reminded of this once more because I just saw a spurious
renegotiation failure in somebody's production setup.  Kind of like a
recurring nightmare which I thought I had already eradicated.

Opinions?  Also, should we wait longer for the new renegotiation code to
be more battle-tested?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SSL renegotiation

2013-11-15 Thread Stephen Frost
Alvaro,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 1. Don't backpatch the ERROR bit at all, so that if the renegotiation
 fails we would silently continue just as currently

I'm leaning towards the above at this point.

 I was reminded of this once more because I just saw a spurious
 renegotiation failure in somebody's production setup.  Kind of like a
 recurring nightmare which I thought I had already erradicated.

I saw one yesterday. :(

 Opinions?  Also, should we wait longer for the new renegotiation code to
 be more battle-tested?

I've got a better environment to test this in now and given that I saw
it just yesterday, I'm very interested in addressing it.  I grow tired
of seeing these renegotiation errors.

Thanks!

Stephen




Re: [HACKERS] Race condition in b-tree page deletion

2013-11-15 Thread Peter Eisentraut
On 11/13/13, 11:04 AM, Heikki Linnakangas wrote:
 Here's a patch implementing this scheme.

Compiler warnings:

nbtpage.c: In function ‘_bt_pagedel’:
nbtpage.c:1695:24: warning: ‘metad’ may be used uninitialized in this function 
[-Wmaybe-uninitialized]
nbtpage.c:1414:18: note: ‘metad’ was declared here
nbtpage.c:1730:117: warning: ‘metapg’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
nbtpage.c:1413:7: note: ‘metapg’ was declared here






Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Kevin Grittner
Tomas Vondra t...@fuzzy.cz wrote:
 On 15 Listopad 2013, 1:00, David Rowley wrote:
 more focused on trying to draw a bit of attention to commit
 061b88c732952c59741374806e1e41c1ec845d50 which uses strncpy and
 does not properly set the last byte to 0 afterwards. I think
 this case could just be replaced with strlcpy which does all
 this hard work for us.

 Hmm, you mean this piece of code?

   strncpy(saved_argv0, argv[0], MAXPGPATH);

 IMHO you're right that's probably broken, unless there's some
 checking happening before the call.


I agree, and there is no such checking.  Fix pushed.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Andres Freund
On 2013-11-15 09:24:59 -0500, Stephen Frost wrote:
 * Tomas Vondra (t...@fuzzy.cz) wrote:
  On 15 Listopad 2013, 1:00, David Rowley wrote:
   more focused on trying to draw a bit of attention to commit
   061b88c732952c59741374806e1e41c1ec845d50 which uses strncpy and does not
   properly set the last byte to 0 afterwards. I think this case could just
   be
   replaced with strlcpy which does all this hard work for us.
  
  Hmm, you mean this piece of code?
  
 strncpy(saved_argv0, argv[0], MAXPGPATH);
  
  IMHO you're right that's probably broken, unless there's some checking
  happening before the call.
 
 Agreed, that looks like a place we should be using strlcpy() instead.

I don't mind fixing it, but I think anything but s/strncpy/strlcpy/ is
over the top. Translating such strings is just a waste of translator's
time.
If you really worry about paths being longer than MAXPGPATH, there's
lots, and lots of things to do that are, far, far more critical than
this.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Turning recovery.conf into GUCs

2013-11-15 Thread Jaime Casanova
On Fri, Nov 15, 2013 at 9:28 AM, Peter Eisentraut pete...@gmx.net wrote:
 On 11/13/13, 12:17 AM, Jaime Casanova wrote:
 I have rebased Michael Paquier's patch and did a few changes:

 * changed standby.enabled filename to recovery.trigger
 * make archive_command a SIGHUP parameter again
 * make restore_command a SIGHUP parameter
 * rename restore_command variable to XLogRestoreCommand to match
 XLogArchiveCommand

 Please check for compiler warnings in pg_basebackup:

 pg_basebackup.c:1109:1: warning: ‘escapeConnectionParameter’ defined but not 
 used [-Wunused-function]
 pg_basebackup.c:1168:1: warning: ‘escape_quotes’ defined but not used 
 [-Wunused-function]


Those are functions that are no longer used, but Josh considered that they
could become useful before release.
I can put them inside #ifdef _NOT_USED_ blocks, or just remove
them now and re-add them if/when we find some use for them.

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: 24x7 Support and Training
Phone: +593 4 5107566 Cell: +593 987171157




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Andres Freund
On 2013-11-15 04:21:50 +0100, Tomas Vondra wrote:
 Hmm, you mean this piece of code?
 
strncpy(saved_argv0, argv[0], MAXPGPATH);
 
 IMHO you're right that's probably broken, unless there's some checking
 happening before the call.

FWIW, argv[0] is pretty much guaranteed to be shorter than MAXPGPATH since
MAXPGPATH is the longest a path can be, and argv[0] is either the executable's
name (if executed via PATH) or the path to the executable.
Now, you could probably write a program to execve() a binary with argv[0]
being longer, but in that case you can also just put garbage in there.
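
For illustration, such a caller could look like this sketch (the binary
path and the string are made up):

    #include <unistd.h>

    extern char **environ;

    int
    main(void)
    {
        /* argv[0] is whatever the caller says it is */
        char *argv[] = { "a-string-far-longer-than-MAXPGPATH...", NULL };

        execve("/usr/local/pgsql/bin/postgres", argv, environ);
        return 1;               /* execve() returns only on failure */
    }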

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logging WAL when updating hintbit

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 1:02 AM, Sawada Masahiko wrote: 
 I attached patch adds new wal_level 'all'.

Compiler warning:

pg_controldata.c: In function ‘wal_level_str’:
pg_controldata.c:72:2: warning: enumeration value ‘WAL_LEVEL_ALL’ not handled 
in switch [-Wswitch]





Re: [HACKERS] additional json functionality

2013-11-15 Thread Merlin Moncure
On Thu, Nov 14, 2013 at 1:54 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
 On 11/14/2013 08:17 PM, Merlin Moncure wrote:
 On Thu, Nov 14, 2013 at 11:34 AM, David E. Wheeler
 da...@justatheory.com wrote:
 On Nov 14, 2013, at 7:07 AM, Merlin Moncure mmonc...@gmail.com wrote:

 This is exactly what needs to be done, full stop (how about: hstore).
 It really comes down to this: changing the serialization behaviors
 that have been in production for 2 releases (three if you count the
 extension) is bad enough, but making impossible some legal json
 constructions which are currently possible is an unacceptable
 compatibility break.  It's going to break applications I've currently
 put into production with no clear workaround.  This is quite frankly
 not ok, and I'm calling foul.  The RFC may claim that these
 constructions are dubious but that's irrelevant.  It's up to the
 parser to decide that and when serializing you are not in control of
 the parser.
 The current JSON type preserves key order and duplicates. But is it 
 documented that this is a feature, or something to be guaranteed?
 It doesn't, but the row_to_json function has a very clear mechanism of
 action.  And, 'not being documented' is not the standard for latitude
 to make arbitrary changes to existing function behaviors.
 the whole hash*() function family was changed based on an undocumented
 premise, so we do have a precedent.

 In my experience, no JSON parser guarantees key order or duplication.
 I found one in about two seconds.  http://docs.python.org/2/library/json.html

 object_pairs_hook, if specified, will be called with the result of
 every JSON object decoded with an ordered list of pairs. The return
 value of object_pairs_hook will be used instead of the dict. This
 feature can be used to implement custom decoders that rely on the
 order that the key and value pairs are decoded (for example,
 collections.OrderedDict() will remember the order of insertion). If
 object_hook is also defined, the object_pairs_hook takes priority.

 That makes the rest of your argument moot.  Plus, I quite clearly am
 dealing with parsers that do.
 I am sure you could also devise an json encoding scheme
 where white space is significant ;)

 The question is, how much of it should json *type* support.

 As discussed in the other thread, most of your requirements
 would be met by having json/row/row set-to-text serializer
 functions which output json-formatted text.

No, that would not work, putting aside the fact that it would require
rewriting heaps of code.  What I do now inside the json wrapping
routines is create things like

{
  x: [
{dynamic object},
{dynamic object},
...
  ],
  y: ...,
  ...
}

The only way to do it is to build 'dynamic object' into json in
advance of the outer xxx_to_json call.  The 'dynamic object' is
created out of a json builder that takes a paired array -- basically a
variant of Andrew's 'json_build' upthread.  If the 'json serializer'
outputted text, the 'outer' to_json call would then re-escape the
object.  I can't use hstore for that purpose precisely because of the
transformations it does on the object.

Stepping back, I'm using json serialization as a kind of 'supercharged
crosstab'.  To any client that can parse json, json serialization
completely displaces crosstabbing -- it's superior in every way.  I
am, if you may, kind of leading research efforts in the area and I can
tell you with absolute certainty that breaking this behavior is a
mistake.

Forcing hstore-ish output mechanisms removes the ability to handle
certain important edge cases that work just fine today. If that
ability was taken away, it would be a very bitter pill for me to
swallow and would have certain ramifications for me professionally; I
went out on a pretty big limb and pushed pg/json aggressively (over
strenuous objection) in an analytics product which is now in the final
stages of beta testing.  I would hate to see the conclusion of the
case study be Ultimately we had to migrate the code back to Hibernate
due to compatibility issues.

Here are the options on the table:
1) convert existing json type to binary flavor (notwithstanding objections)
2) maintain side by side types, one representing binary, one text.
unfortunately, i think the text one must get the name 'json' due to
unfortunate previous decision.
3) merge the behaviors into a single type and get the best of both
worlds (as suggested upthread).

I think we need to take a *very* hard look at #3 before exploring #1
or #2: I haven't thought it through yet, but it may be possible to handle
this in such a way that will be mostly transparent to the end user and
may have other benefits, such as a faster path for serialization.

merlin




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 FWIW, argv0 is pretty much guaranteed to be shorter than MAXPGPATH since
 MAXPGPATH is the longest a path can be, and argv[0] is either the executable's
 name (if executed via PATH) or the path to the executable.

Err, it's the longest that *we* think the path can be..  That's not the
same as actually being the longest that a path can be, which depends on
the filesystem and OS...  It's not hard to get past our 1024 limit:

sfrost@beorn:/really/long/path echo $PWD | wc -c
1409

 Now, you could probably write a program to exeve() a binary with argv[0]
 being longer, but in that case you can also just put garbage in there.

We shouldn't blow up in that case either, really.

Thanks,

Stephen




Re: [HACKERS] Turning recovery.conf into GUCs

2013-11-15 Thread Peter Eisentraut
On 11/13/13, 12:17 AM, Jaime Casanova wrote:
 I have rebased Michael Paquier's patch and did a few changes:
 
 * changed standby.enabled filename to recovery.trigger
 * make archive_command a SIGHUP parameter again
 * make restore_command a SIGHUP parameter
 * rename restore_command variable to XLogRestoreCommand to match
 XLogArchiveCommand

Please check for compiler warnings in pg_basebackup:

pg_basebackup.c:1109:1: warning: ‘escapeConnectionParameter’ defined but not 
used [-Wunused-function]
pg_basebackup.c:1168:1: warning: ‘escape_quotes’ defined but not used 
[-Wunused-function]





Re: [HACKERS] ERROR during end-of-xact/FATAL

2013-11-15 Thread Robert Haas
On Wed, Nov 13, 2013 at 11:04 AM, Noah Misch n...@leadboat.com wrote:
 So, in short, ERROR + ERROR*10 = PANIC, but FATAL + ERROR*10 = FATAL.
 That's bizarre.

 Quite so.

 Given that that's where we are, promoting an ERROR during FATAL
 processing to PANIC doesn't seem like it's losing much; we're
 essentially already doing that in the (probably more likely) case of a
 persistent ERROR during ERROR processing.  But since PANIC sucks, I'd
 rather go the other direction: let's make an ERROR during ERROR
 processing promote to FATAL.  And then let's do what you write above:
 make sure that there's a separate on-shmem-exit callback for each
 critical shared memory resource and that we call all of those during FATAL
 processing.

 Many of the factors that can cause AbortTransaction() to fail can also cause
 CommitTransaction() to fail, and those would still PANIC if the transaction
 had an xid.  How practical might it be to also escape from an error during
 CommitTransaction() with a FATAL instead of PANIC?  There's more to fix up in
 that case (sinval, NOTIFY), but it may be within reach.  If such a technique
 can only reasonably fix abort, though, I have doubts it buys us enough.

The critical stuff that's got to happen after
RecordTransactionCommit() appears to be ProcArrayEndTransaction() and
AtEOXact_Inval(). Unfortunately, the latter is well after the point
when we're supposed to only be doing non-critical resource cleanup,
notwithstanding which it appears to be critical.

So here's a sketch.  Hoist the preparatory logic in
RecordTransactionCommit() - smgrGetPendingDeletes,
xactGetCommittedChildren, and xactGetCommittedInvalidationMessages up
into the caller and do it before setting TRANS_COMMIT.  If any of that
stuff fails, we'll land in AbortTransaction() which must cope.  As
soon as we exit the commit critical section, set a flag somewhere
(where?) indicating that we have written our commit record; when that
flag is set, (a) promote any ERROR after that point through the end of
commit cleanup to FATAL and (b) if we enter AbortTransaction(), don't
try to RecordTransactionAbort().
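
A rough sketch of that control flow (the flag name and hook points are
hypothetical, just to pin the idea down):

    /* hypothetical backend-local flag; would live somewhere in xact.c */
    static bool CommitRecordWritten = false;

    /* in CommitTransaction(), just after leaving the commit critical section */
    CommitRecordWritten = true;

    /* (a) in the error-raising path: promote post-commit errors */
    if (CommitRecordWritten && elevel == ERROR)
        elevel = FATAL;

    /*
     * (b) in AbortTransaction(): never write an abort record once the
     * commit record has already been written.
     */
    if (!CommitRecordWritten)
        RecordTransactionAbort(false);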

I can't see that the notification stuff requires fixup in this case;
AFAICS, it is just adjusting backend-local state, and it's OK to
disregard any problems there during a FATAL exit.  Do you see
something to the contrary?  But invalidation messages are a problem:
if we commit and exit without sending our queued-up invalidation
messages, Bad Things Will Happen.  Perhaps we could arrange things so
that in that case only, we just PANIC.   That would allow most write
transactions to get by with FATAL, promoting to PANIC only in the case
of transactions that have modified system catalogs and only until the
invalidations have actually been sent.  Avoiding the PANIC in that
case seems to require some additional wizardry which is not entirely
clear to me at this time.

I think we'll have to approach the various problems in this area
stepwise, or we'll never make any progress.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2013-11-15 Thread Merlin Moncure
On Fri, Nov 15, 2013 at 6:06 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 But: I very, very much agree with the other concerns around this. This
 should be a patch to fix single user mode, not one to make postgres into
 a single process database. It's not, and trying to make it by using
 single user mode for it will start to hinder development of normal
 postgres because we suddenly need to be concerned about performance and
 features in situations where we previously weren't.

 +1

 Maybe what needs to happen to this patch is a way to restrict its usage
 to --single. I'm thinking that postgres --single maybe could be made to
 fork the server process underneath the psql controller client process
 transparently.

I personally would prefer not to do that.  My main non-administrative
interest in this mode is doing things like benchmarking protocol
overhead.  I'm OK with not supporting (and optimizing) for single user
code paths but I don't like the idea of building walls that serve no
purpose other than to make it difficult for other people to mess around.
Just document strenuously that this mode is not intended for
application bundling and move on...

merlin




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Andres Freund
On 2013-11-15 09:53:24 -0500, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  FWIW, argv0 is pretty much guaranteed to be shorter than MAXPGPATH since
  MAXPGPATH is the longest a path can be, and argv[0] is either the 
  executable's
  name (if executed via PATH) or the path to the executable.
 
 Err, it's the longest that *we* think the path can be..  That's not the
 same as actually being the longest that a path can be, which depends on
 the filesystem and OS...  It's not hard to get past our 1024 limit:

Sure, there can be longer paths, but postgres doesn't support them. In a
*myriad* of places. It's just not worth spending code on it.

Just about any of the places that use MAXPGPATH are vulnerable or
produce confusing error messages if it's exceeded. And there are about
zero complaints about it.

  Now, you could probably write a program to execve() a binary with argv[0]
  being longer, but in that case you can also just put garbage in there.
 
 We shouldn't blow up in that case either, really.

Good luck.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 Sure, there can be longer paths, but postgres doesn't support them. In a
 *myriad* of places. It's just not worth spending code on it.

 Just about any of the places that use MAXPGPATH are vulnerable or
 produce confusing error messages if it's exceeded. And there are about
 zero complaints about it.

Confusing error messages are one thing, segfaulting is another.

Thanks,

Stephen



Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Andres Freund
On 2013-11-15 10:04:12 -0500, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  Sure, there can be longer paths, but postgres doesn't support them. In a
  *myriad* of places. It's just not worth spending code on it.
 
  Just about any of the places that use MAXPGPATH are vulnerable or
  produce confusing error messages if it's exceeded. And there are about
  zero complaints about it.
 
 Confusing error messages are one thing, segfaulting is another.

I didn't argue against s/strncpy/strlcpy/. That's clearly a sensible
fix.
I am arguing about introducing additional code and error messages about
it that need to be translated. And starting to do so in isolationtester
of all places.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Turning recovery.conf into GUCs

2013-11-15 Thread Michael Paquier
On Fri, Nov 15, 2013 at 11:38 PM, Jaime Casanova ja...@2ndquadrant.com wrote:
 those are functions that are no longer used but Josh considered they
 could become useful before release.
 I can put them inside #ifdef _NOT_USED_ decorations or just remove
 them now and if/when we find some use for them re-add them.
Why not simply remove them? They will be kept in the git history either way.
-- 
Michael




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Alvaro Herrera
David Rowley escribió:
 On Fri, Nov 15, 2013 at 12:33 PM, Tomas Vondra t...@fuzzy.cz wrote:

  Be careful with 'Name' data type - it's not just a simple string buffer.
  AFAIK it needs to work with hashing etc. so the zeroing is actually needed
  here to make sure two values produce the same result. At least that's how
  I understand the code after a quick check - for example this is from the
  same jsonfuncs.c you mentioned:
 
  memset(fname, 0, NAMEDATALEN);
  strncpy(fname, NameStr(tupdesc->attrs[i]->attname), NAMEDATALEN);
  hashentry = hash_search(json_hash, fname, HASH_FIND, NULL);
 
  So the zeroing is on purpose, although if strncpy does that then the
  memset is probably superfluous.

This code should probably be using namecpy().  Note namecpy() doesn't
memset() after strncpy() and has survived the test of time, which
strongly suggests that the memset is indeed superfluous.
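
For reference, namecpy() amounts to roughly the following (per
src/backend/utils/adt/name.c) -- note there is no trailing memset:

    /* sketch of namecpy(): copy one Name to another */
    int
    namecpy(Name n1, Name n2)
    {
        if (!n1 || !n2)
            return -1;
        StrNCpy(NameStr(*n1), NameStr(*n2), NAMEDATALEN);
        return 0;
    }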

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Database disconnection and switch inside a single bgworker

2013-11-15 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes:
 Currently, bgworkers offer the possibility to connect to a given
 database using BackgroundWorkerInitializeConnection in bgworker.h, but
 there is actually no way to disconnect from a given database inside
 the same bgworker process.

That's isomorphic to having a backend switch to a different database,
which occasionally gets requested, but there is no real likelihood
that we'll ever implement it.  The problem is, how can you be sure you
have flushed all the database-specific state that's been built up?
The relcache and catcaches are only the tip of the iceberg; we've
got caches all over the place.  And once you had flushed that data,
you'd have to recreate it --- but the code for doing so is intimately
intertwined with connection startup tasks that you'd most likely not
want to repeat.

And, once you'd done all that work, what would you have?  A database
switch methodology that would save a fork(), but not much else.
The time to warm up the caches wouldn't be any better than in a
fresh process.

The cost/benefit ratio for making this work just doesn't look very
promising.  That's why autovacuum is built the way it is.

regards, tom lane




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I didn't argue against s/strncpy/strlcpy/. That's clearly a sensible
 fix.
 I am arguing about introducing additional code and error messages about
 it that need to be translated. And starting to do so in isolationtester
 of all places.

I agree with Andres on this.  Commit
7cb964acb794078ef033cbf2e3a0e7670c8992a9 is the very definition of
overkill, and I don't want to see us starting to plaster the source
code with things like this.  Converting strncpy to strlcpy seems
appropriate --- and sufficient.

regards, tom lane




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Kevin Grittner
Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 This code should probably be using namecpy().  Note namecpy()
 doesn't memset() after strncpy() and has survived the test of
 time, which strongly suggests that the memset is indeed
 superfluous.

That argument would be more persuasive if I could find any current
usage of the namecpy() function anywhere in the source code.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Alvaro Herrera
Kevin Grittner escribió:
 Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 
  This code should probably be using namecpy().  Note namecpy()
  doesn't memset() after strncpy() and has survived the test of
  time, which strongly suggests that the memset is indeed
  superfluous.
 
 That argument would be more persuasive if I could find any current
 usage of the namecpy() function anywhere in the source code.

Well, its cousin namestrcpy is used in a lot of places.  That one uses a
regular C string as source; namecpy uses a Name as source, so they are
slightly different but the coding is pretty much the same.

There is a difference in using the macro StrNCpy instead of the strncpy
library function directly.  ISTM this makes sense because Name is known
to be zero-terminated at NAMEDATALEN, which a random C string is not.
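
For reference, StrNCpy (src/include/c.h) is roughly the macro below;
unlike bare strncpy() it always writes a terminating NUL:

    #define StrNCpy(dst, src, len) \
        do \
        { \
            char   *_dst = (dst); \
            Size    _len = (len); \
     \
            if (_len > 0) \
            { \
                strncpy(_dst, (src), _len); \
                /* the final byte is always overwritten with NUL */ \
                _dst[_len - 1] = '\0'; \
            } \
        } while (0)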

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I didn't argue against s/strncpy/strlcpy/. That's clearly a sensible
  fix.
  I am arguing about introducing additional code and error messages about
  it that need to be translated. And starting to do so in isolationtester
  of all places.
 
 I agree with Andres on this.  Commit
 7cb964acb794078ef033cbf2e3a0e7670c8992a9 is the very definition of
 overkill, and I don't want to see us starting to plaster the source
 code with things like this.  Converting strncpy to strlcpy seems
 appropriate --- and sufficient.

Personally, I'd like to see better handling like this -- but done in a way
which minimizes impact to code and translators.  A function like
namecpy() (which I agree with Kevin about -- curious that it's not used...)
that handled the check, errmsg and exit seems reasonable to me, for the
userland binaries (and perhaps the postmaster when doing command-line
checking of, e.g., -D) that need it.
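
Something along these lines, say (the name and message wording are
hypothetical; it assumes strlcpy() from libpgport and the usual _()
gettext shorthand):

    /*
     * Hypothetical frontend helper: a checked path copy that reports and
     * exits instead of silently truncating.
     */
    static void
    copy_path_or_die(char *dst, const char *src, size_t dstsize)
    {
        if (strlcpy(dst, src, dstsize) >= dstsize)
        {
            fprintf(stderr, _("path \"%s\" exceeds the %zu-byte limit\n"),
                    src, dstsize - 1);
            exit(EXIT_FAILURE);
        }
    }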

Still, I'm not offering to go do it, so take my feelings on it with that
in mind. :)

Thanks,

Stephen



Re: [HACKERS] SSL renegotiation

2013-11-15 Thread Andres Freund
On 2013-11-15 10:43:23 -0500, Tom Lane wrote:
 +1 to waiting awhile.  I think if we don't see any problems in
 HEAD, then back-patching as-is would be the best solution.
 The other alternatives are essentially acknowledging that you're
 back-patching something you're afraid isn't production ready.
 Let's not go there.

Agreed. Both on just backpatching it unchanged and waiting for the fix
to prove itself a bit.

 Another reason I'm not in a hurry is that the problem we're trying
 to solve doesn't seem to be causing real-world trouble.  So by
 awhile, I'm thinking let's let it get through 9.4 beta testing.

Well, there have been a bunch of customer complaints about it, AFAIR
that's what made Alvaro look into it in the first place. So it's not a
victimless bug.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SSL renegotiation

2013-11-15 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 So I committed this patch without backpatching anything. ...
 ... should we wait longer for the new renegotiation code to
 be more battle-tested?

+1 to waiting awhile.  I think if we don't see any problems in
HEAD, then back-patching as-is would be the best solution.
The other alternatives are essentially acknowledging that you're
back-patching something you're afraid isn't production ready.
Let's not go there.

Another reason I'm not in a hurry is that the problem we're trying
to solve doesn't seem to be causing real-world trouble.  So by
awhile, I'm thinking let's let it get through 9.4 beta testing.

regards, tom lane




Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
I tried again this morning using gin-packed-postinglists-16.patch and
gin-fast-scan.6.patch. No crashes during index building.

Pg was compiled with debug enabled in both cases. The data is a ~0.1%
random sample of production data (10,000,000 records for the test) with the
below structure.

   Table "public.kp"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | bigint  | not null
 string | text    | not null
 score1 | integer |
 score2 | integer |
 score3 | integer |
 score4 | integer |
Indexes:
    "kp_pkey" PRIMARY KEY, btree (id)
    "kp_string_key" UNIQUE CONSTRAINT, btree (string)
    "textsearch_gin_idx" gin (to_tsvector('simple'::regconfig, string))
        WHERE score1 IS NOT NULL



All data is in Pg buffer cache for these timings. Words like "the" and
"and" are very common (~9% of entries, each) and a word like "hotel" is
much less common (~0.2% of entries). Below is the query tested:

  SELECT id,string
FROM kp
   WHERE score1 IS NOT NULL
 AND to_tsvector('simple', string) @@ to_tsquery('simple', ?)
 -- ? is substituted with the query strings
ORDER BY score1 DESC, score2 ASC
LIMIT 1000;

 Limit  (cost=56.04..56.04 rows=1 width=37) (actual time=250.010..250.032 rows=142 loops=1)
   ->  Sort  (cost=56.04..56.04 rows=1 width=37) (actual time=250.008..250.017 rows=142 loops=1)
         Sort Key: score1, score2
         Sort Method: quicksort  Memory: 36kB
         ->  Bitmap Heap Scan on kp  (cost=52.01..56.03 rows=1 width=37) (actual time=249.711..249.945 rows=142 loops=1)
               Recheck Cond: ((to_tsvector('simple'::regconfig, string) @@ '''hotel'' & ''and'' & ''the'''::tsquery) AND (score1 IS NOT NULL))
               ->  Bitmap Index Scan on textsearch_gin_idx  (cost=0.00..52.01 rows=1 width=0) (actual time=249.681..249.681 rows=142 loops=1)
                     Index Cond: (to_tsvector('simple'::regconfig, string) @@ '''hotel'' & ''and'' & ''the'''::tsquery)
 Total runtime: 250.096 ms



Times are from \timing on.

MASTER
===
the:   888.436 ms   926.609 ms   885.502 ms
and:   944.052 ms   937.732 ms   920.050 ms
hotel:  53.992 ms57.039 ms65.581 ms
and & the & hotel: 260.308 ms   248.275 ms   248.098 ms

These numbers roughly match what we get with Pg 9.2. The time savings
between 'the' and 'and & the & hotel' is mostly heap lookups for the score
and the final sort.



The size of the index on disk is about 2% smaller in the patched version.

PATCHED
===
the:  1055.169 ms 1081.976 ms  1083.021 ms
and:   912.173 ms  949.364 ms   965.261 ms
hotel:  62.591 ms   64.341 ms62.923 ms
and & the & hotel: 268.577 ms  259.293 ms   257.408 ms
hotel & and & the: 253.574 ms  258.071 ms  250.280 ms

I was hoping that the 'and & the & hotel' case would improve with this
patch to be closer to the 'hotel' search, as I thought that was the kind of
thing it targeted. Unfortunately, it did not. I actually applied the
patches, compiled, initdb/load data, and ran it again thinking I made a
mistake.

Reordering the terms 'hotel & and & the' doesn't change the result.



On Fri, Nov 15, 2013 at 1:51 AM, Alexander Korotkov aekorot...@gmail.com wrote:

 On Fri, Nov 15, 2013 at 3:25 AM, Rod Taylor r...@simple-knowledge.com wrote:

 I checked out master and put together a test case using a small
 percentage of production data for a known problem we have with Pg 9.2 and
 text search scans.

 A small percentage in this case means 10 million records randomly
 selected; production has a few billion records.


 Tests ran for master successfully and I recorded timings.



 Applied the patch included here to master along with
 gin-packed-postinglists-14.patch.
 Run make clean; ./configure; make; make install.
 make check (All 141 tests passed.)

 initdb, import dump


 The GIN index fails to build with a segfault.


 Thanks for testing. See fixed version in thread about packed posting lists.

 --
 With best regards,
 Alexander Korotkov.



Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Kevin Grittner
Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Kevin Grittner escribió:

 That argument would be more persuasive if I could find any current
 usage of the namecpy() function anywhere in the source code.

 Well, its cousin namestrcpy is used in a lot of places.  That one uses a
 regular C string as source; namecpy uses a Name as source, so they are
 slightly different but the coding is pretty much the same.

Fair enough.

 There is a difference in using the macro StrNCpy instead of the strncpy
 library function directly.  ISTM this makes sense because Name is known
 to be zero-terminated at NAMEDATALEN, which a random C string is not.

Is the capital T in the second #undef in this pg_locale.c code intended?:

#ifdef WIN32
/*
 * This Windows file defines StrNCpy. We don't need it here, so we undefine
 * it to keep the compiler quiet, and undefine it again after the file is
 * included, so we don't accidentally use theirs.
 */
#undef StrNCpy
#include <shlwapi.h>
#ifdef StrNCpy
#undef STrNCpy
#endif
#endif

--
Kevin Grittner    EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SSL renegotiation

2013-11-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-11-15 10:43:23 -0500, Tom Lane wrote:
 Another reason I'm not in a hurry is that the problem we're trying
 to solve doesn't seem to be causing real-world trouble.  So by
 awhile, I'm thinking let's let it get through 9.4 beta testing.

 Well, there have been a bunch of customer complaints about it, afair
 that's what made Alvaro look into it in the first place. So it's not a
 victimless bug.

OK, then maybe end-of-beta is too long.  But how much testing will it get
during development?  I know I never use SSL on development installs.
How many hackers do?

regards, tom lane




Re: [HACKERS] [PATCH] Add transforms feature

2013-11-15 Thread Dimitri Fontaine
Hi,

Peter Eisentraut pete...@gmx.net writes:
 Rebased patch.  No changes except that merge conflicts were resolved,
 and I had to add some Data::Dumper tweaks to the regression tests so
that the results came out in a consistent order on different versions of
 Perl.

I just spent some time reading that patch, here's my first round of
review:

  - Documentation style seems to be different from the man page
or reference docs style that we use elsewhere, and is instead
deriving the general case from examples. Reads strange.

  - The internal datatype argument and return type discussion for
function argument looks misplaced, but I don't have a better
proposition for that.

  - The code looks good, I didn't spot any problem when reading it.
Given your Jenkins installation I didn't try (yet at least) to use
the patch and compile and run it locally.

Interesting note might be the addition of two new system caches, one
for the PL languages and another one for the transform functions. I
agree with the need, just wanted to raise awareness about that in
case there's something to be said on that front.

  - Do we need an ALTER TRANSFORM command?

Usually we have at least an Owner for the new objects and a command
to change the owner. Then should we be able to change the
function(s) used in a transform?

  - Should transform live in a schema?

At first sight, no reason why, but see next point about a use case
that we might be able to solve doing that.

  - SQL Standard has something different named the same thing,
targeting client-side types apparently. Is there any reason why we
would want to stay away from using the same name for something
really different in PostgreSQL?

On the higher level design, the big question here is about selective
behavior. As soon as you CREATE TRANSFORM FOR hstore LANGUAGE plperl
then any plperl function will now receive its hstore arguments as a
proper perl hash rather than a string.

Any pre-existing plperl function with hstore arguments or return type
then needs to be upgraded to handle the new types nicely, and some of
those might not be under the direct control of the DBA running the
CREATE TRANSFORM command, when using some plperl extensions for example.

A mechanism allowing for the transform to only be used in some functions
but not others might be useful. The simplest such mechanism I can think
of is modeled against the PL/Java classpath facility as specified in the
SQL standard: you attach a classpath per schema.

We could design transforms to only activate in the schema they are
created, thus allowing plperl functions to co-exist where some will
receive proper hash for hstores and other will continue to get plain
text.

Should using the schema to that end be frowned upon, then we need a way
to register each plperl function against using or not using the
transform facility, defaulting to not using anything. Maybe something
like the following:

  CREATE FUNCTION foo(hash hstore, x ltree)
 RETURNS hstore
 LANGUAGE plperl
 USING TRANSFORM FOR hstore, ltree
  AS $$ … $$;

Worst case, that I really don't think we need, would be addressing that
per-argument:

  CREATE FUNCTION foo (hash hstore WITH TRANSFORM, kv hstore) …

I certainly hope we don't need that, and sure can't imagine use cases
for that level of complexity at the time of writing this review.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] SSL renegotiation

2013-11-15 Thread Andres Freund
On 2013-11-15 10:58:19 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-11-15 10:43:23 -0500, Tom Lane wrote:
  Another reason I'm not in a hurry is that the problem we're trying
  to solve doesn't seem to be causing real-world trouble.  So by
  awhile, I'm thinking let's let it get through 9.4 beta testing.
 
  Well, there have been a bunch of customer complaints about it, afair
  that's what made Alvaro look into it in the first place. So it's not a
  victimless bug.
 
 OK, then maybe end-of-beta is too long.  But how much testing will it get
 during development?  I know I never use SSL on development installs.
 How many hackers do?

I guess few. And even fewer will actually have connections that live
long enough to experience renegotiations :/.

I wonder how hard it'd be to rig the buildfarm code to generate ssl
certificates and use them during installcheck. If we'd additionally set
a low renegotiation limit...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] writable FDWs / update targets confusion

2013-11-15 Thread Albe Laurenz
Tomas Vondra wrote:
 I'm working on adding write support to one of my FDWs. Adding INSERT went
 pretty fine, but when adding DELETE/UPDATE I got really confused about how
 the update targets are supposed to work.
 
 My understanding of how it's supposed to work is this:
 
  (1) AddForeignUpdateTargets adds columns that serve as ID of the record
  (e.g. postgres_fdw adds 'ctid')
 
  (2) planning the inner foreign scan handles the new column appropriately
  (e.g. scans system columns, as in case of 'ctid' etc.)
 
  (3) IterateForeignScan will see the column in the tuple descriptor, will
  set it just like any other column, etc.
 
  (4) ExecForeignDelete will fetch the new column and do something with it
 
 However no matter what I do, I can't get the steps (3) and (4) working this
 way.

I have no idea either.

 And looking at postgres_fdw it seems to me it does not really set the ctid
 into the tuple as a column, but just does this:
 
 if (ctid)
 tuple->t_self = *ctid;
 
 which I can't really do because I need to use INT8 and not TID. But even
 if I do this,

What exactly did you do?
Did you try
   tuple->t_self = myInt8;

That would write 8 bytes into a 6-byte variable, thus scribbling
past the end, right?

 Interestingly, if I do this in ExecForeignDelete
 
 static TupleTableSlot *
 myExecForeignDelete(EState *estate,
   ResultRelInfo *resultRelInfo,
   TupleTableSlot *slot,
   TupleTableSlot *planSlot)
 {
 
 bool isNull;
 MyModifyState state = (MyModifyState) resultRelInfo->ri_FdwState;
 int64 ctid;
 
 Datum datum = ExecGetJunkAttribute(planSlot,
state->ctidAttno, &isNull);
 
 ctid = DatumGetInt64(datum);
 
 elog(WARNING, "ID = %ld", ctid);
 
 if (isNull)
 elog(ERROR, "ctid is NULL");
 
 /* FIXME not yet implemented */
 return NULL;
 }
 
 I do get (isNull=FALSE) but the ctid evaluates to some random number, e.g.
 
 WARNING:  ID = 44384788 (44384788)
 WARNING:  ID = 44392980 (44392980)
 
 and so on.

Maybe that's the effect of writing past the end of the variable.

 So what did I get wrong? Is it possible to use arbitrary hidden columns as
 junk columns (documentation seems to suggest that)? What is the right
 way to do that / what did I get wrong?

I would like to know an answer to this as well.

I don't think that assigning to tuple-t_self will work, and I
know too little about the executor to know if there's any way
to fit a ctid that is *not* an ItemPointerData into a TupleTableSlot
so that it will show up as a resjunk TargetEntry in ExecForeignUpdate.

Tom, could you show us a rope if there is one?

Yours,
Laurenz Albe



Re: [HACKERS] Minmax indexes (timings)

2013-11-15 Thread Andres Freund
On 2013-11-15 17:11:46 +0100, Erik Rijkers wrote:
 I've been messing with minmax indexes some more so here are some results of 
 that.
 
 Perhaps someone finds these timings useful.
 
 
 Centos 5.7, 32 GB memory, 2 quadcores.
 
 '--prefix=/var/data1/pg_stuff/pg_installations/pgsql.minmax' 
 '--with-pgport=6444' '--enable-depend' '--enable-cassert'
 '--enable-debug' '--with-perl' '--with-openssl' '--with-libxml' 
 '--enable-dtrace'

Just some general advice: doing timings with --enable-cassert isn't that
meaningful - it often can distort results significantly.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Minmax indexes (timings)

2013-11-15 Thread Kevin Grittner
Erik Rijkers e...@xs4all.nl wrote:

 Perhaps someone finds these timings useful.

 '--enable-cassert'

Assertions can really distort the timings, and not always equally
for all code paths.  Any chance of re-running those tests without
that?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 2:50 AM, Amit Kapila wrote:
 Find the rebased version attached with this mail.

Doesn't build:

openjade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . -c 
/usr/share/sgml/docbook/stylesheet/dsssl/modular/catalog -d stylesheet.dsl -t 
sgml -i output-html -V html-index postgres.sgml
openjade:reference.sgml:61:3:E: cannot find alter_system.sgml; tried 
ref/alter_system.sgml, ./alter_system.sgml, ./alter_system.sgml, 
/usr/local/share/sgml/alter_system.sgml, /usr/share/sgml/alter_system.sgml
openjade:config.sgml:164:27:X: reference to non-existent ID SQL-ALTERSYSTEM
make[3]: *** [HTML.index] Error 1
make[3]: *** Deleting file `HTML.index'
osx -D. -x lower -i include-xslt-index postgres.sgml postgres.xmltmp
osx:reference.sgml:61:3:E: cannot find alter_system.sgml; tried 
ref/alter_system.sgml, ./alter_system.sgml, 
/usr/local/share/sgml/alter_system.sgml, /usr/share/sgml/alter_system.sgml
osx:config.sgml:164:27:X: reference to non-existent ID SQL-ALTERSYSTEM
make[3]: *** [postgres.xml] Error 1

New file missing in patch?




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 3:17 AM, Peter Geoghegan wrote:
 Attached patch allows pg_stat_statements to store arbitrarily long
 query texts, using an external, transparently managed lookaside file.

Compiler warnings with fortify settings:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
-fpic -I. -I. -I../../src/include -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE 
-I/usr/include/libxml2   -c -o pg_stat_statements.o pg_stat_statements.c -MMD 
-MP -MF .deps/pg_stat_statements.Po
pg_stat_statements.c: In function ‘entry_reset’:
pg_stat_statements.c:1827:11: warning: ignoring return value of ‘ftruncate’, 
declared with attribute warn_unused_result [-Wunused-result]
pg_stat_statements.c: In function ‘garbage_collect_query_texts’:
pg_stat_statements.c:1779:11: warning: ignoring return value of ‘ftruncate’, 
declared with attribute warn_unused_result [-Wunused-result]
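
For what it's worth, the usual way to satisfy warn_unused_result here is to
test the return value and at least log the failure.  A sketch -- fd and
filename are hypothetical stand-ins for whatever the module actually uses:

    /* check ftruncate() instead of discarding its result */
    if (ftruncate(fd, (off_t) 0) != 0)
        ereport(LOG,
                (errcode_for_file_access(),
                 errmsg("could not truncate file \"%s\": %m", filename)));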





Re: [HACKERS] GIN improvements part 1: additional information

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 6:00 AM, Alexander Korotkov wrote:
 Sorry, I posted buggy version of patch. Attached version is correct.

This patch crashes the hstore and pg_trgm regression tests.





Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 7:09 AM, KONDO Mitsumasa wrote:
 I created a patch that improves disk reads and OS file caching. It can
 optimize the kernel readahead parameter using buffer access strategy and
 posix_fadvise() in various disk-read situations.

Various compiler warnings:

tablecmds.c: In function ‘copy_relation_data’:
tablecmds.c:9120:3: warning: passing argument 5 of ‘smgrread’ makes pointer 
from integer without a cast [enabled by default]
In file included from tablecmds.c:79:0:
../../../src/include/storage/smgr.h:94:13: note: expected ‘char *’ but argument 
is of type ‘int’

bufmgr.c: In function ‘ReadBuffer_common’:
bufmgr.c:455:4: warning: passing argument 5 of ‘smgrread’ from incompatible 
pointer type [enabled by default]
In file included from ../../../../src/include/storage/buf_internals.h:22:0,
 from bufmgr.c:45:
../../../../src/include/storage/smgr.h:94:13: note: expected ‘char *’ but 
argument is of type ‘BufferAccessStrategy’

md.c: In function ‘mdread’:
md.c:680:2: warning: passing argument 2 of ‘BufferHintIOAdvise’ makes integer 
from pointer without a cast [enabled by default]
In file included from md.c:34:0:
../../../../src/include/storage/fd.h:79:12: note: expected ‘off_t’ but argument 
is of type ‘char *’





Re: [HACKERS] GIN improvements part 1: additional information

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 8:58 PM, Peter Eisentraut pete...@gmx.net wrote:

 On 11/14/13, 6:00 AM, Alexander Korotkov wrote:
  Sorry, I posted buggy version of patch. Attached version is correct.

 This patch crashes the hstore and pg_trgm regression tests.


What exact version did you try, 14 or 16? If it was 16, could you please
post a stacktrace, because it doesn't crash for me.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Transaction-lifespan memory leak with plpgsql DO blocks

2013-11-15 Thread Yeb Havinga

On 2013-11-14 22:23, Tom Lane wrote:


So after some experimentation I came up with version 2, attached.


Thanks for looking into this! I currently do not have access to a setup 
to try the patch. A colleague of mine will look into this next week.


Thanks again,
Yeb Havinga




Re: [HACKERS] writable FDWs / update targets confusion

2013-11-15 Thread Tom Lane
Albe Laurenz laurenz.a...@wien.gv.at writes:
 Tom, could you show us a rope if there is one?

What is it you actually need to fetch?

IIRC, the idea was that most FDWs would do the equivalent of fetching the
primary-key columns to use in an update.  If that's what you need, then
AddForeignUpdateTargets should identify those columns and generate Vars
for them.  postgres_fdw is probably not a good model since it's using
ctid (a non-portable thing) and piggybacking on the existence of a tuple
header field to put that in.

If you're dealing with some sort of hidden tuple identity column that
works like CTID but doesn't fit in six bytes, there may not be any good
solution in the current state of the FDW support.  As I mentioned, we'd
batted around the idea of letting FDWs define a system column with some
other datatype besides TID, but we'd not figured out all the nitty
gritty details in time for 9.3.
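
For the key-column approach, a minimal sketch of the AddForeignUpdateTargets
callback, assuming (hypothetically) that attribute 1 of the foreign table is
the row's int8 key:

    static void
    myAddForeignUpdateTargets(Query *parsetree,
                              RangeTblEntry *target_rte,   /* unused here */
                              Relation target_relation)
    {
        /* hypothetical: attno 1 is the remote key column */
        Form_pg_attribute attr = RelationGetDescr(target_relation)->attrs[0];
        Var        *var;
        TargetEntry *tle;

        var = makeVar(parsetree->resultRelation,
                      1,
                      attr->atttypid,
                      attr->atttypmod,
                      attr->attcollation,
                      0);
        tle = makeTargetEntry((Expr *) var,
                              list_length(parsetree->targetList) + 1,
                              pstrdup(NameStr(attr->attname)),
                              true);        /* resjunk */
        parsetree->targetList = lappend(parsetree->targetList, tle);
    }

ExecForeignDelete() can then look up the junk column's resno once with
ExecFindJunkAttributeInTlist() and read it back per row with
ExecGetJunkAttribute(), much as in the code upthread.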

regards, tom lane




Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 6:57 PM, Rod Taylor p...@rbt.ca wrote:

 I tried again this morning using gin-packed-postinglists-16.patch and
 gin-fast-scan.6.patch. No crashes.

 It is about a 0.1% random sample of production data (10,000,000 records)
 with the below structure. Pg was compiled with debug enabled in both cases.

    Table "public.kp"
  Column |  Type   | Modifiers
 --------+---------+-----------
  id     | bigint  | not null
  string | text    | not null
  score1 | integer |
  score2 | integer |
  score3 | integer |
  score4 | integer |
 Indexes:
     "kp_pkey" PRIMARY KEY, btree (id)
     "kp_string_key" UNIQUE CONSTRAINT, btree (string)
     "textsearch_gin_idx" gin (to_tsvector('simple'::regconfig, string))
         WHERE score1 IS NOT NULL



 This is a query tested. All data is in Pg buffer cache for these timings.
 Words like "the" and "and" are very common (~9% of entries, each) and a
 word like "hotel" is much less common (~0.2% of entries).

   SELECT id,string
 FROM kp
WHERE score1 IS NOT NULL
  AND to_tsvector('simple', string) @@ to_tsquery('simple', ?)
  -- ? is substituted with the query strings
 ORDER BY score1 DESC, score2 ASC
 LIMIT 1000;

  Limit  (cost=56.04..56.04 rows=1 width=37) (actual time=250.010..250.032 rows=142 loops=1)
    ->  Sort  (cost=56.04..56.04 rows=1 width=37) (actual time=250.008..250.017 rows=142 loops=1)
          Sort Key: score1, score2
          Sort Method: quicksort  Memory: 36kB
          ->  Bitmap Heap Scan on kp  (cost=52.01..56.03 rows=1 width=37) (actual time=249.711..249.945 rows=142 loops=1)
                Recheck Cond: ((to_tsvector('simple'::regconfig, string) @@ '''hotel'' & ''and'' & ''the'''::tsquery) AND (score1 IS NOT NULL))
                ->  Bitmap Index Scan on textsearch_gin_idx  (cost=0.00..52.01 rows=1 width=0) (actual time=249.681..249.681 rows=142 loops=1)
                      Index Cond: (to_tsvector('simple'::regconfig, string) @@ '''hotel'' & ''and'' & ''the'''::tsquery)
  Total runtime: 250.096 ms



 Times are from \timing on.

 MASTER
 ===
 the:   888.436 ms   926.609 ms   885.502 ms
 and:   944.052 ms   937.732 ms   920.050 ms
 hotel:  53.992 ms57.039 ms65.581 ms
 and & the & hotel: 260.308 ms   248.275 ms   248.098 ms

 These numbers roughly match what we get with Pg 9.2. The time savings
 between 'the' and 'and & the & hotel' is mostly heap lookups for the score
 and the final sort.



 The size of the index on disk is about 2% smaller in the patched version.

 PATCHED
 ===
 the:  1055.169 ms 1081.976 ms  1083.021 ms
 and:   912.173 ms  949.364 ms   965.261 ms
 hotel:  62.591 ms   64.341 ms62.923 ms
 and & the & hotel: 268.577 ms  259.293 ms   257.408 ms
 hotel & and & the: 253.574 ms  258.071 ms  250.280 ms

 I was hoping that the 'and & the & hotel' case would improve with this
 patch to be closer to the 'hotel' search, as I thought that was the kind of
 thing it targeted. Unfortunately, it did not. I actually applied the
 patches, compiled, initdb/load data, and ran it again thinking I made a
 mistake.

 Reordering the terms 'hotel & and & the' doesn't change the result.


Oh, in this patch the new consistent method isn't implemented for the
tsvector opclass, only for arrays. Will be fixed soon.
BTW, was the index 2% smaller or 2 times smaller? If it's 2% smaller then I
need to know more about your dataset :)

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] GIN improvements part 1: additional information

2013-11-15 Thread Peter Eisentraut
On 11/15/13, 12:24 PM, Alexander Korotkov wrote:
 On Fri, Nov 15, 2013 at 8:58 PM, Peter Eisentraut pete...@gmx.net wrote:
 
 On 11/14/13, 6:00 AM, Alexander Korotkov wrote:
  Sorry, I posted buggy version of patch. Attached version is correct.
 
  This patch crashes the hstore and pg_trgm regression tests.
 
 
 What exact version did you try 14 or 16? If it was 16, could you please
 post a stacktrace, because it doesn't crash for me.

The one that's the latest in the commitfest: 
http://www.postgresql.org/message-id/capphfdveq-jhe_2z9pvw22bp6h_o8xoaxcbjagf87gs4p4j...@mail.gmail.com

If you have a newer one, please add it there.





[HACKERS] pre-commit triggers

2013-11-15 Thread Andrew Dunstan


Attached is a patch to provide a new event trigger that will fire on 
transaction commit. I have tried to make certain that it fires at a 
sufficiently early stage in the commit process that some of the evils 
mentioned in previous discussions on this topic aren't relevant.


The triggers don't fire if there is no real XID, so only actual data 
changes should cause the trigger to fire. They also don't fire in single 
user mode, so that if you do something stupid like create a trigger that 
unconditionally raises an error you have a way to recover.


This is intended to be somewhat similar to the same feature in the 
Firebird database, and the initial demand came from a client migrating 
from that system to Postgres.


cheers

andrew
diff --git a/doc/src/sgml/event-trigger.sgml b/doc/src/sgml/event-trigger.sgml
index ac31332..3bbf1a4 100644
--- a/doc/src/sgml/event-trigger.sgml
+++ b/doc/src/sgml/event-trigger.sgml
@@ -12,7 +12,7 @@
    <productname>PostgreSQL</> also provides event triggers.  Unlike regular
    triggers, which are attached to a single table and capture only DML events,
    event triggers are global to a particular database and are capable of
-   capturing DDL events.
+   capturing DDL events or transaction commits.
   </para>
 
   <para>
@@ -29,8 +29,9 @@
     occurs in the database in which it is defined. Currently, the only
     supported events are
     <literal>ddl_command_start</>,
-    <literal>ddl_command_end</>
-    and <literal>sql_drop</>.
+    <literal>ddl_command_end</>,
+    <literal>sql_drop</>, and
+    <literal>transaction_commit</>.
     Support for additional events may be added in future releases.
    </para>
 
@@ -65,6 +66,15 @@
    </para>
 
    <para>
+    A <literal>transaction_commit</> trigger is called at the end of a
+    transaction, just before any deferred triggers are fired, unless
+    no data changes have been made by the transaction, or
+    <productname>PostgreSQL</> is running in Single-User mode. This is so
+    that you can recover from a badly specified <literal>transaction_commit</>
+    trigger.
+   </para>
+
+   <para>
     Event triggers (like other functions) cannot be executed in an aborted
     transaction.  Thus, if a DDL command fails with an error, any associated
     <literal>ddl_command_end</> triggers will not be executed.  Conversely,
@@ -77,8 +87,13 @@
    </para>
 
    <para>
-    For a complete list of commands supported by the event trigger mechanism,
-    see <xref linkend="event-trigger-matrix">.
+    A <literal>transaction_commit</> trigger is also not called in an
+    aborted transaction.
+   </para>
+
+   <para>
+    For a complete list of commands supported by the event trigger
+    mechanism, see <xref linkend="event-trigger-matrix">.
    </para>
 
    <para>
@@ -101,6 +116,11 @@
     to intercept. A common use of such triggers is to restrict the range of
     DDL operations which users may perform.
    </para>
+
+   <para>
+    <literal>transaction_commit</> triggers do not currently support
+    <literal>WHEN</literal> clauses.
+   </para>
   </sect1>
 
   <sect1 id="event-trigger-matrix">
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 0591f3f..74fc04c 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1825,6 +1825,16 @@ CommitTransaction(void)
 	Assert(s->parent == NULL);
 
 	/*
+	 * First fire any pre-commit triggers, so if they in turn cause any
+	 * deferred triggers etc to fire this will be picked up below.
+	 * Only fire them, though, if we have a real transaction ID and
+	 * we're not running standalone. Not firing when standalone provides
+	 * a way to recover from setting up a bad transaction trigger.
+	 */
+	if (s->transactionId != InvalidTransactionId && IsUnderPostmaster)
+		PreCommitTriggersFire();
+
+	/*
 	 * Do pre-commit processing that involves calling user-defined code, such
 	 * as triggers.  Since closing cursors could queue trigger actions,
 	 * triggers could open cursors, etc, we have to keep looping until there's
@@ -2030,6 +2040,16 @@ PrepareTransaction(void)
 	Assert(s->parent == NULL);
 
 	/*
+	 * First fire any pre-commit triggers, so if they in turn cause any
+	 * deferred triggers etc to fire this will be picked up below.
+	 * Only fire them, though, if we have a real transaction ID and
+	 * we're not running standalone. Not firing when standalone provides
+	 * a way to recover from setting up a bad transaction trigger.
+	 */
+	if (s->transactionId != InvalidTransactionId && IsUnderPostmaster)
+		PreCommitTriggersFire();
+
+	/*
 	 * Do pre-commit processing that involves calling user-defined code, such
 	 * as triggers.  Since closing cursors could queue trigger actions,
 	 * triggers could open cursors, etc, we have to keep looping until there's
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 328e2a8..f93441f 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -153,7 +153,8 @@

Re: [HACKERS] Sequence Access Method WIP

2013-11-15 Thread Heikki Linnakangas

On 14.11.2013 22:10, Simon Riggs wrote:

On 16 January 2013 00:40, Simon Riggs si...@2ndquadrant.com wrote:


SeqAm allows you to specify a plugin that alters the behaviour for
sequence allocation and resetting, aimed specifically at clustering
systems.

New command options on end of statement allow syntax
   CREATE SEQUENCE foo_seq
   USING globalseq


Production version of this, ready for commit to PG 9.4

Includes test extension which allows sequences without gaps - gapless.

Test using seq_test.psql after creating extension.

No dependencies on other patches.


It's pretty hard to review this without seeing the other
implementation you're envisioning to use this API. But I'll try:


I wonder if the level of abstraction is right. The patch assumes that 
the on-disk storage of all sequences is the same, so the access method 
can't change that. But then it leaves the details of actually updating 
the on-disk block, WAL-logging and all, as the responsibility of the 
access method. Surely that's going to look identical in all the seqams, 
if they all use the same on-disk format. That also means that the 
sequence access methods can't be implemented as plugins, as plugins 
can't do WAL-logging.


The comment in seqam.c says that there's a private column reserved for 
access method-specific data, called am_data, but that seems to be the 
only mention of am_data in the patch.


- Heikki




Re: [HACKERS] trivial one-off memory leak in guc-file.l ParseConfigFile

2013-11-15 Thread Heikki Linnakangas

On 22.09.2013 22:40, didier wrote:

Hi

fix a small memory leak in guc-file.l ParseConfigFile

AbsoluteConfigLocation() returns a strdup'd string but it's never freed or
referenced outside ParseConfigFile.

Courtesy Valgrind and Noah Misch MEMPOOL work.


I spotted and fixed this some time ago while fixing another leak, see 
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=138184adc5f7c60c184972e4d23f8cdb32aed77d. 
I didn't realize you had already reported it back then.


So, I've marked this as committed in the commitfest app. But thanks for 
the report anyway.


- Heikki




Re: [HACKERS] Sequence Access Method WIP

2013-11-15 Thread Andres Freund
On 2013-11-15 20:08:30 +0200, Heikki Linnakangas wrote:
 It's pretty hard to review this without seeing the other
 implementation you're envisioning to use this API. But I'll try:

We've written a distributed sequence implementation against it, so it
works at least for that use case.

While certainly not release worthy, it still might be interesting to
look at.
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=contrib/bdr/bdr_seq.c;h=c9480c8021882f888e35080f0e7a7494af37ae27;hb=bdr

bdr_sequencer_fill_sequence pre-allocates ranges of values,
bdr_sequence_alloc returns them upon nextval().

 I wonder if the level of abstraction is right. The patch assumes that the
 on-disk storage of all sequences is the same, so the access method can't
 change that.  But then it leaves the details of actually updating the on-disk
 block, WAL-logging and all, as the responsibility of the access method.
 Surely that's going to look identical in all the seqams, if they all use the
 same on-disk format. That also means that the sequence access methods can't
 be implemented as plugins, as plugins can't do WAL-logging.

Well, it exposes log_sequence_tuple() - together with the added am
private column of pg_sequence that allows doing quite a bit of different
things. I think unless we really implement pluggable RMGRs - which I
don't really see coming - we cannot be radically different.

 The comment in seqam.c says that there's a private column reserved for
 access method-specific data, called am_data, but that seems to be the only
 mention of am_data in the patch.

It's amdata, not am_data :/. Directly at the end of pg_sequence.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] GIN improvements part 1: additional information

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 9:56 PM, Peter Eisentraut pete...@gmx.net wrote:

 On 11/15/13, 12:24 PM, Alexander Korotkov wrote:
  On Fri, Nov 15, 2013 at 8:58 PM, Peter Eisentraut pete...@gmx.net wrote:
 
  On 11/14/13, 6:00 AM, Alexander Korotkov wrote:
   Sorry, I posted buggy version of patch. Attached version is
 correct.
 
  This patch crashes the hstore and pg_trgm regression tests.
 
 
  What exact version did you try 14 or 16? If it was 16, could you please
  post a stacktrace, because it doesn't crash for me.

 The one that's the latest in the commitfest:
 http://www.postgresql.org/message-id/capphfdveq-jhe_2z9pvw22bp6h_o8xoaxcbjagf87gs4p4j...@mail.gmail.com

 If you have a newer one, please add it there.


Ok, I updated the information in the commitfest app.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Sequence Access Method WIP

2013-11-15 Thread Simon Riggs
On 15 November 2013 15:08, Heikki Linnakangas hlinnakan...@vmware.com wrote:

 I wonder if the level of abstraction is right.

That is the big question and not something to shy away from.

What is presented is not the first thought, by a long way. Andres'
contribution to the patch is mainly around this point, so the seq am
is designed with the needs of the main use case in mind.

I'm open to suggested changes but I would say that practical usage
beats changes suggested for purity.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Sequence Access Method WIP

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 3:10 PM, Simon Riggs wrote:
 On 16 January 2013 00:40, Simon Riggs si...@2ndquadrant.com wrote:
 
 SeqAm allows you to specify a plugin that alters the behaviour for
 sequence allocation and resetting, aimed specifically at clustering
 systems.

 New command options on end of statement allow syntax
   CREATE SEQUENCE foo_seq
   USING globalseq
 
 Production version of this, ready for commit to PG 9.4

This patch doesn't apply anymore.

Also, you set this to returned with feedback in the CF app.  Please
verify whether that was intentional.





Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 9:40 PM, Michael Paquier wrote:
 Hi all,
 
 Please find attached updated patches for the support of REINDEX
 CONCURRENTLY, renamed 2.0 for the occasion:
 - 20131114_1_index_drop_comments.patch, patch that updates some
 comments in index_drop. This updates only a couple of comments in
 index_drop but has not been committed yet. It should be IMO...
 - 20131114_2_WaitForOldsnapshots_refactor.patch, a refactoring patch
 providing a single API that can be used to wait for old snapshots
 - 20131114_3_reindex_concurrently.patch, providing the core feature.
 Patch 3 needs to have patch 2 applied first. Regression tests,
 isolation tests and documentation are included with the patch.

The third patch needs to be rebased, because of a conflict in index.c.





Re: [HACKERS] Sequence Access Method WIP

2013-11-15 Thread Simon Riggs
On 15 November 2013 15:48, Peter Eisentraut pete...@gmx.net wrote:
 On 11/14/13, 3:10 PM, Simon Riggs wrote:
 On 16 January 2013 00:40, Simon Riggs si...@2ndquadrant.com wrote:

 SeqAm allows you to specify a plugin that alters the behaviour for
 sequence allocation and resetting, aimed specifically at clustering
 systems.

 New command options on end of statement allow syntax
   CREATE SEQUENCE foo_seq
   USING globalseq

 Production version of this, ready for commit to PG 9.4

 This patch doesn't apply anymore.

Yes, a conflicting patch was just committed; will repost.

 Also, you set this to returned with feedback in the CF app.  Please
 verify whether that was intentional.

Not sure that was me, if so, corrected.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-15 Thread Adrian Klaver
First review of the above patch as listed in the current CommitFest, as 
well as the subsequent ECDH patch in the thread below:


http://www.postgresql.org/message-id/1383782378-7342-1-git-send-email-mark...@gmail.com

Platform OpenSuse 12.2

Both patches applied cleanly.

Configured:

./configure --with-python --with-openssl 
--prefix=/home/aklaver/pgsqlTest --with-pgport=5462 --enable-cassert



make and make check ran without error.

The descriptions of the GUCs show up in the documentation but I am not 
seeing the GUCs themselves in postgresql.conf, so I could test no 
further. It is entirely possible I am missing a step and would 
appreciate enlightenment.



The general premise seems sound, allowing the DBA control over the type 
of cipher used.


Thanks,
--
Adrian Klaver
adrian.kla...@gmail.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
2%.

It's essentially sentence fragments from 1 to 5 words in length. I wasn't
expecting it to be much smaller.

10 recent value selections:

 white vinegar reduce color running
 vinegar cure uti
 cane vinegar acidity depends parameter
 how remedy fir clogged shower
 use vinegar sensitive skin
 home remedies removing rust heating
 does non raw apple cider
 home remedies help maintain healthy
 can vinegar mess up your
 apple cide vineger ph balance

regards,

Rod



On Fri, Nov 15, 2013 at 12:51 PM, Alexander Korotkov
aekorot...@gmail.comwrote:

 On Fri, Nov 15, 2013 at 6:57 PM, Rod Taylor p...@rbt.ca wrote:

 I tried again this morning using gin-packed-postinglists-16.patch and
 gin-fast-scan.6.patch. No crashes.

 It is about a 0.1% random sample of production data (10,000,000 records)
 with the below structure. Pg was compiled with debug enabled in both cases.

        Table "public.kp"
  Column |  Type   | Modifiers
 --------+---------+-----------
  id     | bigint  | not null
  string | text    | not null
  score1 | integer |
  score2 | integer |
  score3 | integer |
  score4 | integer |
 Indexes:
     "kp_pkey" PRIMARY KEY, btree (id)
     "kp_string_key" UNIQUE CONSTRAINT, btree (string)
     "textsearch_gin_idx" gin (to_tsvector('simple'::regconfig, string))
         WHERE score1 IS NOT NULL



 This is the query tested. All data is in the Pg buffer cache for these timings.
 Words like "the" and "and" are very common (~9% of entries, each) and a
 word like "hotel" is much less common (~0.2% of entries).

   SELECT id,string
 FROM kp
WHERE score1 IS NOT NULL
  AND to_tsvector('simple', string) @@ to_tsquery('simple', ?)
  -- ? is substituted with the query strings
 ORDER BY score1 DESC, score2 ASC
 LIMIT 1000;

  Limit  (cost=56.04..56.04 rows=1 width=37) (actual time=250.010..250.032
 rows=142 loops=1)
    ->  Sort  (cost=56.04..56.04 rows=1 width=37) (actual
 time=250.008..250.017 rows=142 loops=1)
          Sort Key: score1, score2
          Sort Method: quicksort  Memory: 36kB
          ->  Bitmap Heap Scan on kp  (cost=52.01..56.03 rows=1 width=37)
 (actual time=249.711..249.945 rows=142 loops=1)
                Recheck Cond: ((to_tsvector('simple'::regconfig, string)
 @@ '''hotel'' & ''and'' & ''the'''::tsquery) AND (score1 IS NOT NULL))
                ->  Bitmap Index Scan on textsearch_gin_idx
 (cost=0.00..52.01 rows=1 width=0) (actual time=249.681..249.681 rows=142
 loops=1)
                      Index Cond: (to_tsvector('simple'::regconfig,
 string) @@ '''hotel'' & ''and'' & ''the'''::tsquery)
  Total runtime: 250.096 ms



 Times are from \timing on.

 MASTER
 ===
 the:   888.436 ms   926.609 ms   885.502 ms
 and:   944.052 ms   937.732 ms   920.050 ms
 hotel:  53.992 ms57.039 ms65.581 ms
 and & the & hotel: 260.308 ms   248.275 ms   248.098 ms

 These numbers roughly match what we get with Pg 9.2. The time savings
 between 'the' and 'and & the & hotel' is mostly heap lookups for the score
 and the final sort.



 The size of the index on disk is about 2% smaller in the patched version.

 PATCHED
 ===
 the:  1055.169 ms 1081.976 ms  1083.021 ms
 and:   912.173 ms  949.364 ms   965.261 ms
 hotel:  62.591 ms   64.341 ms62.923 ms
 and & the & hotel: 268.577 ms  259.293 ms   257.408 ms
 hotel & and & the: 253.574 ms  258.071 ms  250.280 ms

 I was hoping that the 'and & the & hotel' case would improve with this
 patch to be closer to the 'hotel' search, as I thought that was the kind of
 thing it targeted. Unfortunately, it did not. I actually applied the
 patches, compiled, initdb/load data, and ran it again thinking I made a
 mistake.

 Reordering the terms 'hotel & and & the' doesn't change the result.


 Oh, in this patch the new consistent method isn't implemented for the tsvector
 opclass, only for arrays. Will be fixed soon.
 BTW, was the index 2% smaller or 2 times smaller? If it's 2% smaller then I
 need to know more about your dataset :)

 --
 With best regards,
 Alexander Korotkov.




Re: [HACKERS] Cannot allocate memory

2013-11-15 Thread Kevin Grittner
Heng Zhi Feng (zh...@hsr.ch) zh...@hsr.ch wrote:

 Virtual Machine – Ubuntu 13.10
 1.92GB Memory
 2 Parallel Processors

 work_mem = 11MB

 shared_buffers = 448MB
 max_connections = 80

 2013-11-15 11:02:35 CET LOG:  could not fork autovacuum worker process: 
 Cannot allocate memory
 2013-11-15 11:02:36 CET LOG:  could not send data to client: Broken pipe
 2013-11-15 11:02:36 CET LOG:  unexpected EOF on client connection

Before you start PostgreSQL, what does `free -m` show?

On such a tiny machine, some of the usual advice needs to be
modified a bit.  Sure, people say to start with shared_buffers at
25% of machine RAM, but if the (virtual) machine has so little RAM
that the OS is already taking a significant percentage, I would say
to go with 25% of what is free (excluding OS cache).  Likewise, the
advice I usually give to start with work_mem at 25% of machine RAM
divided by max_connections should be based on *available* RAM.  So
4MB to 5MB is probably going to be more appropriate than 11MB.  You
will probably need to reduce temp_buffers to 2MB or less -- right
now 1/3 of your machine RAM could be tied up in space reserved for
caching temporary table data, not released until connections close.

Since this VM is tight on resources and only has two cores, you
might want to use pgbouncer, configured in transaction mode with a
pool limited to something like 5 connections, so that you can
increase work_mem and avoid over-taxing the resources you have.

http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
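
To make that arithmetic concrete -- a rough sketch only, assuming roughly
1.4GB of the RAM is actually available once the OS is accounted for
(substitute your own `free -m` numbers):

  SELECT round(1400 * 0.25) AS shared_buffers_mb,     -- 25% of available RAM
         round(1400 * 0.25 / 80, 1) AS work_mem_mb;   -- divided by max_connections = 80

That yields roughly 350MB and 4.4MB, in line with the 4MB to 5MB suggested
above.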

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 11:18 PM, Rod Taylor rod.tay...@gmail.com wrote:

 2%.

 It's essentially sentence fragments from 1 to 5 words in length. I wasn't
 expecting it to be much smaller.

 10 recent value selections:

  white vinegar reduce color running
  vinegar cure uti
  cane vinegar acidity depends parameter
  how remedy fir clogged shower
  use vinegar sensitive skin
  home remedies removing rust heating
  does non raw apple cider
  home remedies help maintain healthy
  can vinegar mess up your
  apple cide vineger ph balance


I didn't get why it's not significantly smaller. Is it possible to share
a dump?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
On Fri, Nov 15, 2013 at 2:26 PM, Alexander Korotkov aekorot...@gmail.comwrote:

 On Fri, Nov 15, 2013 at 11:18 PM, Rod Taylor rod.tay...@gmail.com wrote:

 2%.

 It's essentially sentence fragments from 1 to 5 words in length. I wasn't
 expecting it to be much smaller.

 10 recent value selections:

  white vinegar reduce color running
  vinegar cure uti
  cane vinegar acidity depends parameter
  how remedy fir clogged shower
  use vinegar sensitive skin
  home remedies removing rust heating
  does non raw apple cider
  home remedies help maintain healthy
  can vinegar mess up your
  apple cide vineger ph balance


 I didn't get why it's not significantly smaller. Is it possible to share
 a dump?


Sorry, I reported that incorrectly. It's not something I was actually
looking for and didn't pay much attention to at the time.

The patched index is 58% of the 9.4 master size. 212 MB instead of 365 MB.
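
For anyone wanting to double-check that kind of comparison, a minimal way to
measure on-disk index size against each build (both functions are standard):

  SELECT pg_size_pretty(pg_relation_size('textsearch_gin_idx'));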


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 11:39 PM, Rod Taylor rod.tay...@gmail.com wrote:




 On Fri, Nov 15, 2013 at 2:26 PM, Alexander Korotkov 
 aekorot...@gmail.comwrote:

 On Fri, Nov 15, 2013 at 11:18 PM, Rod Taylor rod.tay...@gmail.comwrote:

 2%.

 It's essentially sentence fragments from 1 to 5 words in length. I
 wasn't expecting it to be much smaller.

 10 recent value selections:

  white vinegar reduce color running
  vinegar cure uti
  cane vinegar acidity depends parameter
  how remedy fir clogged shower
  use vinegar sensitive skin
  home remedies removing rust heating
  does non raw apple cider
  home remedies help maintain healthy
  can vinegar mess up your
  apple cide vineger ph balance


 I didn't get why it's not significantly smaller. Is it possible to share
  a dump?


 Sorry, I reported that incorrectly. It's not something I was actually
 looking for and didn't pay much attention to at the time.

 The patched index is 58% of the 9.4 master size. 212 MB instead of 365 MB.


Good. That meets my expectations :)
You mentioned that both master and patched versions were compiled with debug.
Was cassert enabled?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-15 Thread Marko Kreen
On Fri, Nov 15, 2013 at 11:16:25AM -0800, Adrian Klaver wrote:
 The descriptions of the GUCs show up in the documentation but I am
 not seeing the GUCs themselves in postgresql.conf, so I could test
 no further. It is entirely possible I am missing a step and would
 appreciate enlightenment.

Sorry, I forgot to update the sample config.

ssl-prefer-server-cipher-order-v2.patch
- Add GUC to sample config
- Change default value to 'true', per comments from Alvaro and Magnus.

ssl-ecdh-v2.patch
- Add GUC to sample config
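
A quick sanity check once a patched server is running -- hypothetical output,
assuming the v2 defaults described above:

  SHOW ssl_prefer_server_ciphers;   -- expected: on
  SHOW ssl_ecdh_curve;              -- expected: prime256v1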

-- 
marko

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 77a9303..56bfa01 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -883,6 +883,18 @@ include 'filename'
   </listitem>
  </varlistentry>
 
+ <varlistentry id="guc-ssl-prefer-server-ciphers" xreflabel="ssl_prefer_server_ciphers">
+  <term><varname>ssl_prefer_server_ciphers</varname> (<type>bool</type>)</term>
+  <indexterm>
+   <primary><varname>ssl_prefer_server_ciphers</> configuration parameter</primary>
+  </indexterm>
+  <listitem>
+   <para>
+    Specifies whether to prefer client or server ciphersuite.
+   </para>
+  </listitem>
+ </varlistentry>
+
  <varlistentry id="guc-password-encryption" xreflabel="password_encryption">
   <term><varname>password_encryption</varname> (<type>boolean</type>)</term>
   <indexterm>
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 7f01a78..2094674 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -112,6 +112,9 @@ static bool ssl_loaded_verify_locations = false;
 /* GUC variable controlling SSL cipher list */
 char	   *SSLCipherSuites = NULL;
 
+/* GUC variable: if false, prefer client ciphers */
+bool	   SSLPreferServerCiphers;
+
 /* ------------------------------------------------------------ */
 /*						 Hardcoded values						*/
 /* ------------------------------------------------------------ */
@@ -845,6 +848,10 @@ initialize_SSL(void)
 	if (SSL_CTX_set_cipher_list(SSL_context, SSLCipherSuites) != 1)
 		elog(FATAL, "could not set the cipher list (no valid ciphers available)");
 
+	/* Let server choose order */
+	if (SSLPreferServerCiphers)
+		SSL_CTX_set_options(SSL_context, SSL_OP_CIPHER_SERVER_PREFERENCE);
+
 	/*
 	 * Load CA store, so we can verify client certificates if needed.
 	 */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 54d8078..0ec5ddf 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -127,6 +127,7 @@ extern char *temp_tablespaces;
 extern bool ignore_checksum_failure;
 extern bool synchronize_seqscans;
 extern char *SSLCipherSuites;
+extern bool SSLPreferServerCiphers;
 
 #ifdef TRACE_SORT
 extern bool trace_sort;
@@ -801,6 +802,15 @@ static struct config_bool ConfigureNamesBool[] =
 		check_ssl, NULL, NULL
 	},
 	{
+		{"ssl_prefer_server_ciphers", PGC_POSTMASTER, CONN_AUTH_SECURITY,
+			gettext_noop("Give priority to server ciphersuite order."),
+			NULL
+		},
+		&SSLPreferServerCiphers,
+		true,
+		NULL, NULL, NULL
+	},
+	{
 		{"fsync", PGC_SIGHUP, WAL_SETTINGS,
 			gettext_noop("Forces synchronization of updates to disk."),
 			gettext_noop("The server will use the fsync() system call in several places to make 
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34a2d05..a190e5f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -86,6 +86,7 @@
 #ssl_key_file = 'server.key'		# (change requires restart)
 #ssl_ca_file = ''			# (change requires restart)
 #ssl_crl_file = ''			# (change requires restart)
+#ssl_prefer_server_ciphers = on		# (change requires restart)
 #password_encryption = on
 #db_user_namespace = off
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 56bfa01..3785052 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -895,6 +895,19 @@ include 'filename'
   </listitem>
  </varlistentry>
 
+ <varlistentry id="guc-ssl-ecdh-curve" xreflabel="ssl_ecdh_curve">
+  <term><varname>ssl_ecdh_curve</varname> (<type>string</type>)</term>
+  <indexterm>
+   <primary><varname>ssl_ecdh_curve</> configuration parameter</primary>
+  </indexterm>
+  <listitem>
+   <para>
+    Specifies the name of the EC curve which will be used in ECDH key exchanges.
+    Default is <literal>prime256v1</>.
+   </para>
+  </listitem>
+ </varlistentry>
+
  <varlistentry id="guc-password-encryption" xreflabel="password_encryption">
   <term><varname>password_encryption</varname> (<type>boolean</type>)</term>
   <indexterm>
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 2094674..8d688f2 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -69,6 +69,9 @@
 #if SSLEAY_VERSION_NUMBER >= 0x0907000L
 #include <openssl/conf.h>
 #endif
+#if (OPENSSL_VERSION_NUMBER >= 0x0090800fL) && !defined(OPENSSL_NO_ECDH)
+#include <openssl/ec.h>
+#endif
 

Re: [HACKERS] additional json functionality

2013-11-15 Thread David E. Wheeler
On Nov 15, 2013, at 6:35 AM, Merlin Moncure mmonc...@gmail.com wrote:

 Here are the options on the table:
 1) convert existing json type to binary flavor (notwithstanding objections)
 2) maintain side by side types, one representing binary, one text.
 unfortunately, i think the text one must get the name 'json' due to
 unfortunate previous decision.
 3) merge the behaviors into a single type and get the best of both
 worlds (as suggested upthread).
 
 I think we need to take a *very* hard look at #3 before exploring #1
  or #2: Haven't thought it through yet but it may be possible to handle
 this in such a way that will be mostly transparent to the end user and
 may have other benefits such as a faster path for serialization.

If it’s possible to preserve order and still get the advantages of binary 
representation --- which are substantial (see 
http://theory.so/pg/2013/10/23/testing-nested-hstore/ and 
http://theory.so/pg/2013/10/25/indexing-nested-hstore/ for a couple of 
examples) --- without undue maintenance overhead, then great.

I am completely opposed to duplicate key preservation in JSON, though. It has 
caused us a fair number of headaches at $work.
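
A small illustration of how the duplicates leak through -- a sketch assuming
the current text-based json type, where json_each() walks the value as parsed:

  SELECT * FROM json_each('{"a": 1, "a": 2}'::json);
  --  key | value
  -- -----+-------
  --  a   | 1
  --  a   | 2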

Best,

David



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump insert with column names speedup

2013-11-15 Thread Peter Eisentraut
src/bin/pg_dump/pg_dump.c:1604: trailing whitespace.
+   staticStmt = createPQExpBuffer(); 
src/bin/pg_dump/pg_dump.c:1612: trailing whitespace.
+   else 
src/bin/pg_dump/pg_dump.c:1614: trailing whitespace.
+   if (column_inserts) 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minmax indexes

2013-11-15 Thread Jeff Janes
On Fri, Nov 8, 2013 at 12:11 PM, Alvaro Herrera alvhe...@2ndquadrant.comwrote:

 Erik Rijkers wrote:
  On Thu, September 26, 2013 00:34, Erik Rijkers wrote:
   On Wed, September 25, 2013 22:34, Alvaro Herrera wrote:
  
   [minmax-5.patch]
  
   I have the impression it's not quite working correctly.

 Here's a version 7 of the patch, which fixes these bugs and adds
 opclasses for a bunch more types (timestamp, timestamptz, date, time,
 timetz), courtesy of Martín Marqués.  It's also been rebased to apply
 cleanly on top of today's master branch.

 I have also added a selectivity function, but I'm not positive that it's
 very useful yet.




I tested it with the attached script, but broke out of the for loop after 5
iterations (when it had 300,000,005 rows inserted).

Then I did an analyze, and got an error message below:

jjanes=# analyze;
ERROR:  could not truncate file base/16384/16388_vm to 488 blocks: it's
only 82 blocks now

16388 is the index's relfilenode.

Here is the backtrace upon entry to the truncate that is going to fail:

#0  mdtruncate (reln=0x23c91b0, forknum=VISIBILITYMAP_FORKNUM, nblocks=488)
at md.c:858
#1  0x0048eb4a in mmRevmapTruncate (rmAccess=0x26ad878,
heapNumBlocks=1327434) at mmrevmap.c:360
#2  0x0048d37a in mmvacuumcleanup (fcinfo=<value optimized out>) at
minmax.c:1264
#3  0x0072dcef in FunctionCall2Coll (flinfo=<value optimized out>,
collation=<value optimized out>, arg1=<value optimized out>,
arg2=<value optimized out>) at fmgr.c:1323
#4  0x0048c1e5 in index_vacuum_cleanup (info=<value optimized out>,
stats=0x0) at indexam.c:715
#5  0x0052a7ce in do_analyze_rel (onerel=0x7f59798589e8,
vacstmt=0x23b0bd8, acquirefunc=0x5298d0 <acquire_sample_rows>,
relpages=1327434,
inh=0 '\000', elevel=13) at analyze.c:634
#6  0x0052b320 in analyze_rel (relid=<value optimized out>,
vacstmt=0x23b0bd8, bstrategy=<value optimized out>) at analyze.c:267
#7  0x0057cba7 in vacuum (vacstmt=0x23b0bd8, relid=<value optimized
out>, do_toast=1 '\001', bstrategy=<value optimized out>,
for_wraparound=0 '\000', isTopLevel=<value optimized out>) at
vacuum.c:249
#8  0x00663177 in standard_ProcessUtility (parsetree=0x23b0bd8,
queryString=<value optimized out>, context=<value optimized out>,
params=0x0,
dest=<value optimized out>, completionTag=<value optimized out>) at
utility.c:682
#9  0x7f598290b791 in pgss_ProcessUtility (parsetree=0x23b0bd8,
queryString=0x23b0220 "analyze \n;", context=PROCESS_UTILITY_TOPLEVEL,
params=0x0,
dest=0x23b0f18, completionTag=0x7fffd3442f30 "") at
pg_stat_statements.c:825
#10 0x0065fcf7 in PortalRunUtility (portal=0x24195e0,
utilityStmt=0x23b0bd8, isTopLevel=1 '\001', dest=0x23b0f18,
completionTag=0x7fffd3442f30 "")
at pquery.c:1187
#11 0x00660c6d in PortalRunMulti (portal=0x24195e0, isTopLevel=1
'\001', dest=0x23b0f18, altdest=0x23b0f18, completionTag=0x7fffd3442f30 "")
at pquery.c:1318
#12 0x00661323 in PortalRun (portal=0x24195e0,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x23b0f18,
altdest=0x23b0f18,
completionTag=0x7fffd3442f30 "") at pquery.c:816
#13 0x0065dbb4 in exec_simple_query (query_string=0x23b0220
"analyze \n;") at postgres.c:1048
#14 0x0065f259 in PostgresMain (argc=<value optimized out>,
argv=<value optimized out>, dbname=0x2347be8 "jjanes", username=<value
optimized out>)
at postgres.c:3992
#15 0x0061b7d0 in BackendRun (argc=<value optimized out>,
argv=<value optimized out>) at postmaster.c:4085
#16 BackendStartup (argc=<value optimized out>, argv=<value optimized out>)
at postmaster.c:3774
#17 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at
postmaster.c:1585
#18 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>)
at postmaster.c:1240
#19 0x005b5e90 in main (argc=3, argv=0x2346cd0) at main.c:196


Cheers,

Jeff


minmax_test3.sh
Description: Bourne shell script

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2013-11-15 Thread Andres Freund
On 2013-11-15 20:47:26 +0100, Andres Freund wrote:
 Hi,
 
 On 2013-09-27 00:55:45 +0200, Andres Freund wrote:
  So what's todo? The file header tells us:
   * - revive pure-spinlock implementation
   * - abstract away atomic ops, we really only need a few.
   *   - CAS
   *   - LOCK XADD
   * - convert PGPROC->lwWaitLink to ilist.h slist or even dlist.
   * - remove LWLockWakeup dealing with MyProc
   * - overhaul the mask offsets, make SHARED/EXCLUSIVE_LOCK_MASK wider, 
  MAX_BACKENDS
 
 So, here's the next version of this patchset:
 1) I've added an abstracted atomic ops implementation. Needs a fair
amount of work, also submitted as a separate CF entry. (Patch 1 & 2)
 2) I've converted PGPROC->lwWaiting into a dlist. That makes a fair bit
of code easier to read and reduces the size of the patchset. Also
fixes a bug in the xlog-scalability code. (Patch 3)
 3) Alvaro and I updated the comments in lwlock.c. (Patch 4)
 
 I think 2) should be committable pretty soon. It's imo a pretty clear
 win in readability. 1) will need a good bit of more work.
 
 With regard to the scalable lwlock work, what's most needed now is a
 good amount of testing.
 
 Please note that you need to 'autoreconf' after applying the patchset. I
 don't have a compatible autoconf version on this computer causing the
 diff to be humongous if I include those changes.

Please also note that due to the current state of the atomics
implementation this likely will only work on a somewhat recent gcc and
that the performance might be slightly worse than in the previous
version because the atomic add is implemented using the CAS fallback.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2013-11-15 Thread Andres Freund
Hi,

On 2013-11-15 11:40:17 +0900, Michael Paquier wrote:
 - 20131114_3_reindex_concurrently.patch, providing the core feature.
 Patch 3 needs to have patch 2 applied first. Regression tests,
 isolation tests and documentation are included with the patch.

Have you addressed my concurrency concerns from the last version?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 12:26 PM, Alexander Korotkov wrote:
 Revised version of patch is attached.

This doesn't build:

ginget.c: In function ‘scanPage’:
ginget.c:1108:2: warning: implicit declaration of function 
‘GinDataLeafPageGetPostingListEnd’ [-Wimplicit-function-declaration]
ginget.c:1108:9: warning: assignment makes pointer from integer without a cast 
[enabled by default]
ginget.c:1109:18: error: ‘GinDataLeafIndexCount’ undeclared (first use in this 
function)
ginget.c:1109:18: note: each undeclared identifier is reported only once for 
each function it appears in
ginget.c::3: error: unknown type name ‘GinDataLeafItemIndex’
ginget.c::3: warning: implicit declaration of function ‘GinPageGetIndexes’ 
[-Wimplicit-function-declaration]
ginget.c::57: error: subscripted value is neither array nor pointer nor 
vector
ginget.c:1112:12: error: request for member ‘pageOffset’ in something not a 
structure or union
ginget.c:1115:38: error: request for member ‘iptr’ in something not a structure 
or union
ginget.c:1118:230: error: request for member ‘pageOffset’ in something not a 
structure or union
ginget.c:1119:16: error: request for member ‘iptr’ in something not a structure 
or union
ginget.c:1123:233: error: request for member ‘pageOffset’ in something not a 
structure or union
ginget.c:1136:3: warning: implicit declaration of function 
‘ginDataPageLeafReadItemPointer’ [-Wimplicit-function-declaration]
ginget.c:1136:7: warning: assignment makes pointer from integer without a cast 
[enabled by default]



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Sat, Nov 16, 2013 at 12:10 AM, Peter Eisentraut pete...@gmx.net wrote:

 On 11/14/13, 12:26 PM, Alexander Korotkov wrote:
  Revised version of patch is attached.

 This doesn't build:

 ginget.c: In function ‘scanPage’:
 ginget.c:1108:2: warning: implicit declaration of function
 ‘GinDataLeafPageGetPostingListEnd’ [-Wimplicit-function-declaration]
 ginget.c:1108:9: warning: assignment makes pointer from integer without a
 cast [enabled by default]
 ginget.c:1109:18: error: ‘GinDataLeafIndexCount’ undeclared (first use in
 this function)
 ginget.c:1109:18: note: each undeclared identifier is reported only once
 for each function it appears in
 ginget.c::3: error: unknown type name ‘GinDataLeafItemIndex’
 ginget.c::3: warning: implicit declaration of function
 ‘GinPageGetIndexes’ [-Wimplicit-function-declaration]
 ginget.c::57: error: subscripted value is neither array nor pointer
 nor vector
 ginget.c:1112:12: error: request for member ‘pageOffset’ in something not
 a structure or union
 ginget.c:1115:38: error: request for member ‘iptr’ in something not a
 structure or union
 ginget.c:1118:230: error: request for member ‘pageOffset’ in something not
 a structure or union
 ginget.c:1119:16: error: request for member ‘iptr’ in something not a
 structure or union
 ginget.c:1123:233: error: request for member ‘pageOffset’ in something not
 a structure or union
 ginget.c:1136:3: warning: implicit declaration of function
 ‘ginDataPageLeafReadItemPointer’ [-Wimplicit-function-declaration]
 ginget.c:1136:7: warning: assignment makes pointer from integer without a
 cast [enabled by default]


This patch applies on top of the gin packed posting lists patch; it doesn't
compile separately.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Extra functionality to createuser

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 4:35 PM, Christopher Browne wrote:
 On Thu, Nov 14, 2013 at 5:41 AM, Sameer Thakur samthaku...@gmail.com wrote:
 So I think the -g option is failing

 Right you are.

 I was missing a g: in the getopt_long() call.

 Attached is a revised patch that handles that.


src/bin/scripts/createuser.c:117: indent with spaces.
+   case 'g':
src/bin/scripts/createuser.c:118: indent with spaces.
+   roles = pg_strdup(optarg);


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Errors on missing pg_subtrans/ files with 9.3

2013-11-15 Thread Robert Haas
On Wed, Nov 13, 2013 at 12:29 PM, J Smith dark.panda+li...@gmail.com wrote:
 Looks like we got another set of errors overnight. Here's the log file
 from the errors. (Log file scrubbed slightly to remove private data,
 but still representative of the problem I believe.)

 Nov 13 05:34:34 dev postgres[6084]: [4-1] user=dev,db=dev ERROR:
 could not access status of transaction 6337381
 Nov 13 05:34:34 dev postgres[6084]: [4-2] user=dev,db=dev DETAIL:
 Could not open file pg_subtrans/0060: No such file or directory.
 Nov 13 05:34:34 dev postgres[6084]: [4-3] user=dev,db=dev CONTEXT:
 SQL statement SELECT 1 FROM ONLY typhon.collection_batches x
 WHERE id OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x
 Nov 13 05:34:34 dev postgres[6084]: [4-4] user=dev,db=dev STATEMENT:
 update listings set deactivated_at=$1 where id=$2 and lock_version=$3
 Nov 13 05:34:34 dev postgres[6076]: [4-1] user=dev,db=dev ERROR:
 could not access status of transaction 6337381
 Nov 13 05:34:34 dev postgres[6076]: [4-2] user=dev,db=dev DETAIL:
 Could not open file pg_subtrans/0060: No such file or directory.
 Nov 13 05:34:34 dev postgres[6076]: [4-3] user=dev,db=dev CONTEXT:
 SQL statement SELECT 1 FROM ONLY typhon.collection_batches x
 WHERE id OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x
 Nov 13 05:34:34 dev postgres[6076]: [4-4] user=dev,db=dev STATEMENT:
 update listings set deactivated_at=$1 where id=$2 and lock_version=$3
 Nov 13 05:34:34 dev postgres[6087]: [4-1] user=dev,db=dev ERROR:
 could not access status of transaction 6337381
 Nov 13 05:34:34 dev postgres[6087]: [4-2] user=dev,db=dev DETAIL:
 Could not open file pg_subtrans/0060: No such file or directory.
 Nov 13 05:34:34 dev postgres[6087]: [4-3] user=dev,db=dev CONTEXT:
 SQL statement SELECT 1 FROM ONLY typhon.collection_batches x
 WHERE id OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x
 Nov 13 05:34:34 dev postgres[6087]: [4-4] user=dev,db=dev STATEMENT:
 update listings set deactivated_at=$1 where id=$2 and lock_version=$3
 Nov 13 05:34:34 dev postgres[6086]: [4-1] user=dev,db=dev ERROR:
 could not access status of transaction 6337381
 Nov 13 05:34:34 dev postgres[6086]: [4-2] user=dev,db=dev DETAIL:
 Could not open file pg_subtrans/0060: No such file or directory.
 Nov 13 05:34:34 dev postgres[6086]: [4-3] user=dev,db=dev CONTEXT:
 SQL statement SELECT 1 FROM ONLY typhon.collection_batches x
 WHERE id OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x
 Nov 13 05:34:34 dev postgres[6086]: [4-4] user=dev,db=dev STATEMENT:
 update listings set deactivated_at=$1 where id=$2 and lock_version=$3
 Nov 13 05:34:34 dev postgres[6088]: [4-1] user=dev,db=dev ERROR:
 could not access status of transaction 6337381
 Nov 13 05:34:34 dev postgres[6088]: [4-2] user=dev,db=dev DETAIL:
 Could not open file pg_subtrans/0060: No such file or directory.
 Nov 13 05:34:34 dev postgres[6088]: [4-3] user=dev,db=dev CONTEXT:
 SQL statement SELECT 1 FROM ONLY typhon.collection_batches x
 WHERE id OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x
 Nov 13 05:34:34 dev postgres[6088]: [4-4] user=dev,db=dev STATEMENT:
 update listings set deactivated_at=$1 where id=$2 and lock_version=$3
 Nov 13 05:34:34 dev postgres[6085]: [4-1] user=dev,db=dev ERROR:
 could not access status of transaction 6337381
 Nov 13 05:34:34 dev postgres[6085]: [4-2] user=dev,db=dev DETAIL:
 Could not open file pg_subtrans/0060: No such file or directory.
 Nov 13 05:34:34 dev postgres[6085]: [4-3] user=dev,db=dev CONTEXT:
 SQL statement SELECT 1 FROM ONLY typhon.collection_batches x
 WHERE id OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x
 Nov 13 05:34:34 dev postgres[6085]: [4-4] user=dev,db=dev STATEMENT:
 update listings set deactivated_at=$1 where id=$2 and lock_version=$3

 Several processes all seemed to hit the problem at the same moment,
 and all of them refer to the same transaction ID. Again, the file
 pg_subtrans/0060 doesn't exist, and the only file that does exist is
 pg_subtrans/005A which appears to be a zeroed-out file 245760 bytes in
 length.

 Still don't have a clue as to how I can reproduce the problem. It
 seems that in all cases the error occurred during either an UPDATE to
 a table_X or an INSERT to table_Y. In all cases, the error occurred in
 a manner identical to those shown in the log above, the only
 difference being either an UPDATE on table_X or an INSERT on table_Y.

 Not sure what direction I should head to now. Perhaps some aggressive
 logging would help, so we can see the queries surrounding the
 problems? I could reconfigure things to capture all statements and set
 up monit or something to send an alert when the problem resurfaces,
 for instance.

 Cheers all.

I think what would help the most is if you could arrange to obtain a
stack backtrace at the point when the error is thrown.  Maybe put a
long sleep call in just before the error happens, and when it gets
stuck there, attach gdb and run bt full.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] additional json functionality

2013-11-15 Thread Merlin Moncure
On Fri, Nov 15, 2013 at 1:51 PM, David E. Wheeler da...@justatheory.com wrote:
 On Nov 15, 2013, at 6:35 AM, Merlin Moncure mmonc...@gmail.com wrote:

 Here are the options on the table:
 1) convert existing json type to binary flavor (notwithstanding objections)
 2) maintain side by side types, one representing binary, one text.
 unfortunately, i think the text one must get the name 'json' due to
 unfortunate previous decision.
 3) merge the behaviors into a single type and get the best of both
 worlds (as suggested upthread).

 I think we need to take a *very* hard look at #3 before exploring #1
  or #2: Haven't thought it through yet but it may be possible to handle
 this in such a way that will be mostly transparent to the end user and
 may have other benefits such as a faster path for serialization.

 If it’s possible to preserve order and still get the advantages of binary 
 representation --- which are substantial (see 
 http://theory.so/pg/2013/10/23/testing-nested-hstore/ and 
 http://theory.so/pg/2013/10/25/indexing-nested-hstore/ for a couple of 
 examples) --- without undue maintenance overhead, then great.

 I am completely opposed to duplicate key preservation in JSON, though. It has 
 caused us a fair number of headaches at $work.

Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:

*) 'json' type internally has a binary as well a text representation.
The text representation is basically the current type behavior
(deduplicated unordered).  The binary representation is the hstore-ish
variant.  The text mode is discarded when it's deemed no longer
appropriate to be needed, and, once gone, can never be rebuilt as it
was.

*) only the binary internal representation ever gets stored to disk
(or anything else).

*) the text mode is preferred for output if it is there.  otherwise, a
deduplicated, reordered text representation is generated

*) When literal text is cast to json, the binary structure is built
up and kept alongside the binary mode.   So, if you went: 'select '{"a":
1, "a": 2}'::json', you'd get the same thing back.  (This is how
it works now.)  But, if you went: 'insert into foo select '{"a": 1,
"a": 2}'::json returning *', you'd get {"a": 2} back essentially
(although technically that would be a kind of race).

*) When the json is stored to table, the text representation gets
immediately discarded on the basis that it's no longer the true
representation of the data.

*) Ditto when making any equality operation (not as sure on this point).

*) Ditto when doing any operation that mutates the structure in any
way. the text representation is immutable except during serialization
and if it gets invalidated it gets destroyed.

*) New API function: json_simplify(); or some such.  It reorders and
dedups from user's point of view (but really just kills off the text
representation)

*) once the text mode is gone, you get basically the proposed 'hstore' behavior.

*) serialization functions are generally used in contexts that do not
store anything but get output as query data.  They create *only* the
text mode.  However, if the resultant json is stored anywhere, the
text mode is destroyed and replaced with binary variant.  This is both
a concession to the current behavior and an optimization of
'serialization-in-query' for which I think the binary mode is pessimal
performance wise.  so, xxx_to_json serialization functions work
exactly as they do now which fixes my problem essentially.

*) if you are unhappy with duplicates in the above, just get used to
calling json_simplify() on the serialized json (or deal with it on the
client side).

This is all pretty glossy, but maybe there is a way forward...
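
To illustrate with a hypothetical psql session -- json_simplify() does not
exist today, and this is only what the rules sketched above would imply:

  SELECT '{"a": 1, "a": 2}'::json;                 -- text mode kept: {"a": 1, "a": 2}
  SELECT json_simplify('{"a": 1, "a": 2}'::json);  -- text mode dropped: {"a": 2}
  INSERT INTO foo SELECT '{"a": 1, "a": 2}'::json RETURNING *;
                                                   -- stored, so binary form: {"a": 2}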

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] additional json functionality

2013-11-15 Thread Andrew Dunstan


On 11/15/2013 03:25 PM, Merlin Moncure wrote:

On Fri, Nov 15, 2013 at 1:51 PM, David E. Wheeler da...@justatheory.com wrote:

On Nov 15, 2013, at 6:35 AM, Merlin Moncure mmonc...@gmail.com wrote:


Here are the options on the table:
1) convert existing json type to binary flavor (notwithstanding objections)
2) maintain side by side types, one representing binary, one text.
unfortunately, i think the text one must get the name 'json' due to
unfortunate previous decision.
3) merge the behaviors into a single type and get the best of both
worlds (as suggested upthread).

I think we need to take a *very* hard look at #3 before exploring #1
or #2: Haven't thought it through yet but it may be possible to handle
this in such a way that will be mostly transparent to the end user and
may have other benefits such as a faster path for serialization.

If it’s possible to preserve order and still get the advantages of binary 
representation --- which are substantial (see 
http://theory.so/pg/2013/10/23/testing-nested-hstore/ and 
http://theory.so/pg/2013/10/25/indexing-nested-hstore/ for a couple of 
examples) --- without undue maintenance overhead, then great.

I am completely opposed to duplicate key preservation in JSON, though. It has 
caused us a fair number of headaches at $work.

Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:

*) 'json' type internally has a binary as well a text representation.
The text representation is basically the current type behavior
(deduplicated unordered).  The binary representation is the hstore-ish
variant.  The text mode is discarded when it's deemed no longer
appropriate to be needed, and, once gone, can never be rebuilt as it
was.

*) only the binary internal representation ever gets stored to disk
(or anything else).

*) the text mode is preferred for output if it is there.  otherwise, a
deduplicated, reordered text representation is generated

*) When literal text is cast to json, the binary structure is built
up and kept alongside the binary mode.   So, if you went: 'select '{"a":
1, "a": 2}'::json', you'd get the same thing back.  (This is how
it works now.)  But, if you went: 'insert into foo select '{"a": 1,
"a": 2}'::json returning *', you'd get {"a": 2} back essentially
(although technically that would be a kind of race).

*) When the json is stored to table, the text representation gets
immediately discarded on the basis that it's no longer the true
representation of the data.

*) Ditto when making any equality operation (not as sure on this point).

*) Ditto when doing any operation that mutates the structure in any
way. the text representation is immutable except during serialization
and if it gets invalidated it gets destroyed.

*) New API function: json_simplify(); or some such.  It reorders and
dedups from user's point of view (but really just kills off the text
representation)

*) once the text mode is gone, you get basically the proposed 'hstore' behavior.

*) serialization functions are generally used in contexts that do not
store anything but get output as query data.  They create *only* the
text mode.  However, if the resultant json is stored anywhere, the
text mode is destroyed and replaced with binary variant.  This is both
a concession to the current behavior and an optimization of
'serialization-in-query' for which I think the binary mode is pessimal
performance wise.  so, xxx_to_json serialization functions work
exactly as they do now which fixes my problem essentially.

*) if you are unhappy with duplicates in the above, just get used to
calling json_simplify() on the serialized json (or deal with it on the
client side).

This is all pretty glossy, but maybe there is a way forward...




It's making my head hurt, to be honest, and it sounds like a recipe for 
years and years of inconsistencies and bugs.


I don't want to have two types, but I think I'd probably rather have two 
clean types than this. I can't imagine it being remotely acceptable to 
have behaviour depend on whether or not something was ever stored, which 
is what this looks like.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] postgres_fdw, remote triggers and schemas

2013-11-15 Thread Thom Brown
Hi,

I've observed an issue whereby a parent table with a trigger that
redirects inserts to a child table fails to run the trigger
successfully if written to using a foreign table:

Example:

Database 1:

CREATE TABLE parent (id int, content text);

CREATE TABLE child () INHERITS (parent);

CREATE OR REPLACE FUNCTION redirect_func ()
  RETURNS trigger AS $$
BEGIN
  INSERT INTO child VALUES (NEW.*);
  RETURN NULL;
END; $$ language plpgsql;

CREATE TRIGGER parent_trig
  BEFORE INSERT ON parent
  FOR EACH ROW EXECUTE PROCEDURE redirect_func();


Database 2:

CREATE FOREIGN TABLE foreign_parent (id int, content text)
  SERVER local_pg_db
  OPTIONS (table_name 'parent');


Then...

postgres=# INSERT INTO foreign_parent VALUES (2, 'test2');
ERROR:  relation "child" does not exist
CONTEXT:  Remote SQL command: INSERT INTO public.parent(id, content)
VALUES ($1, $2)
PL/pgSQL function public.redirect_func() line 3 at SQL statement

I've run that remote SQL command in isolation on database 1 and it
completes successfully.

It appears that this is caused by the relation reference in the
trigger function not being explicit about the schema, as if I remove
public from the search_path, I can generate this issue on database 1
with the same statement.  The search_path only contains 'pg_catalog'
on the foreign table connection.

Is this unintended, or is it something users should fix themselves by
being explicit about relation schemas in trigger functions?  Should
the schema search path instead pick up whatever the default would be
for the user being used for the connection?

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] additional json functionality

2013-11-15 Thread Merlin Moncure
On Fri, Nov 15, 2013 at 2:37 PM, Andrew Dunstan and...@dunslane.net wrote:

 On 11/15/2013 03:25 PM, Merlin Moncure wrote:

 On Fri, Nov 15, 2013 at 1:51 PM, David E. Wheeler da...@justatheory.com
 wrote:

 On Nov 15, 2013, at 6:35 AM, Merlin Moncure mmonc...@gmail.com wrote:

 Here are the options on the table:
 1) convert existing json type to binary flavor (notwithstanding
 objections)
 2) maintain side by side types, one representing binary, one text.
 unfortunately, i think the text one must get the name 'json' due to
 unfortunate previous decision.
 3) merge the behaviors into a single type and get the best of both
 worlds (as suggested upthread).

 I think we need to take a *very* hard look at #3 before exploring #1
  or #2: Haven't thought it through yet but it may be possible to handle
 this in such a way that will be mostly transparent to the end user and
 may have other benefits such as a faster path for serialization.

 If it’s possible to preserve order and still get the advantages of binary
 representation --- which are substantial (see
 http://theory.so/pg/2013/10/23/testing-nested-hstore/ and
 http://theory.so/pg/2013/10/25/indexing-nested-hstore/ for a couple of
 examples) --- without undue maintenance overhead, then great.

 I am completely opposed to duplicate key preservation in JSON, though. It
 has caused us a fair number of headaches at $work.

 Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:

 *) 'json' type internally has a binary as well a text representation.
 The text representation is basically the current type behavior
 (deduplicated unordered).  The binary representation is the hstore-ish
 variant.  The text mode is discarded when it's deemed no longer
 appropriate to be needed, and, once gone, can never be rebuilt as it
 was.

 *) only the binary internal representation ever gets stored to disk
 (or anything else).

 *) the text mode is preferred for output if it is there.  otherwise, a
 deduplicated, reordered text representation is generated

 *) When literal text is cast to json, the binary structure is built
 up and kept alongside the binary mode.   So, if you went: 'select '{"a":
 1, "a": 2}'::json', you'd get the same thing back.  (This is how
 it works now.)  But, if you went: 'insert into foo select '{"a": 1,
 "a": 2}'::json returning *', you'd get {"a": 2} back essentially
 (although technically that would be a kind of race).

 *) When the json is stored to table, the text representation gets
 immediately discarded on the basis that it's no longer the true
 representation of the data.

 *) Ditto when making any equality operation (not as sure on this point).

 *) Ditto when doing any operation that mutates the structure in any
 way. the text representation is immutable except during serialization
 and if it gets invalidated it gets destroyed.

 *) New API function: json_simplify(); or some such.  It reorders and
 dedups from user's point of view (but really just kills off the text
 representation)

 *) once the text mode is gone, you get basically the proposed 'hstore'
 behavior.

 *) serialization functions are generally used in contexts that do not
 store anything but get output as query data.  They create *only* the
 text mode.  However, if the resultant json is stored anywhere, the
 text mode is destroyed and replaced with binary variant.  This is both
 a concession to the current behavior and an optimization of
 'serialization-in-query' for which I think the binary mode is pessimal
 performance wise.  so, xxx_to_json serialization functions work
 exactly as they do now which fixes my problem essentially.

 *) if you are unhappy with duplicates in the above, just get used to
 calling json_simplify() on the serialized json (or deal with it on the
 client side).

 This is all pretty glossy, but maybe there is a way forward...



 It's making my head hurt, to be honest, and it sounds like a recipe for
 years and years of inconsistencies and bugs.

 I don't want to have two types, but I think I'd probably rather have two
 clean types than this. I can't imagine it being remotely acceptable to have
 behaviour depend in whether or not something was ever stored, which is what
 this looks like.

Well, maybe so.  My main gripe with the 'two types' solutions is that:
1) current type is already in core (that is, not an extension). In
hindsight, I think this was a huge mistake.
2) current type has grabbed the 'json' type name and the 'json_xxx' API.
3) current type is getting used all over the place

'Two types' means that (AIUI) you can't mess around with the existing
API too much. And the new type (due out in 2016?) will be something of
a second citizen.  The ramifications of dealing with the bifurcation
is what makes *my* head hurt.  Every day the json stuff is getting
more and more widely adopted.  9.4 isn't going to drop until 2014 best
case and it won't be widely deployed in the enterprise until 2015 and
beyond.  So you're going to have a huge code base 

Re: [HACKERS] additional json functionality

2013-11-15 Thread Josh Berkus
On 11/15/2013 12:25 PM, Merlin Moncure wrote:
 Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:
 
 *) 'json' type internally has a binary as well a text representation.
 The text representation is basically the current type behavior

<snip long detailed explanation of behavior-dependent type>

That's not at all workable.  Users would be completely unable to predict
or understand the JSON type and how it acts.  That's not just violating
POLS; that's bashing POLS' head in with a shovel.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] additional json functionality

2013-11-15 Thread Merlin Moncure
On Fri, Nov 15, 2013 at 2:54 PM, Josh Berkus j...@agliodbs.com wrote:
 On 11/15/2013 12:25 PM, Merlin Moncure wrote:
 Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:

 *) 'json' type internally has a binary as well a text representation.
 The text representation is basically the current type behavior

 <snip long detailed explanation of behavior-dependent type>

 That's not at all workable.  Users would be completely unable to predict
 or understand the JSON type and how it acts.  That's not just violating
 POLS; that's bashing POLS' head in with a shovel.

All right: make a new type then, and leave the current one alone please.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] additional json functionality

2013-11-15 Thread Andres Freund
On 2013-11-15 12:54:53 -0800, Josh Berkus wrote:
 On 11/15/2013 12:25 PM, Merlin Moncure wrote:
  Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:
  
  *) 'json' type internally has a binary as well a text representation.
  The text representation is basically the current type behavior
 
 <snip long detailed explanation of behavior-dependent type>
 
 That's not at all workable.  Users would be completely unable to predict
 or understand the JSON type and how it acts.  That's not just violating
 POLS; that's bashing POLS' head in with a shovel.

It's also not currently possible to implement such a behaviour inside a
type's functions. You'd need core code cooperation.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Errors on missing pg_subtrans/ files with 9.3

2013-11-15 Thread J Smith
On Fri, Nov 15, 2013 at 3:21 PM, Robert Haas robertmh...@gmail.com wrote:

 I think what would help the most is if you could arrange to obtain a
 stack backtrace at the point when the error is thrown.  Maybe put a
 long sleep call in just before the error happens, and when it gets
 stuck there, attach gdb and run bt full.


That could potentially be doable. Perhaps I could use something like
google-coredumper or something similar to have a core dump generated
if the error comes up? Part of the problem is that the error is so
sporadic that it's going to be tough to say when the next one will
occur. For instance, we haven't changed our load on the server, yet
the error hasn't occurred since Nov 13, 15:01. I'd also like to avoid
blocking on the server with sleep or anything like that unless
absolutely necessary, as there are other services we have in
development that are using other databases on this cluster. (I can as
a matter of last resort, of course, but if google-coredumper can do
the job I'd like to give that a shot first.)

Any hints on where I could insert something like this? Should I try
putting it into the section of elog.c dealing with ENOENT errors, or
try to find a spot closer to where the file itself is being opened? I
haven't looked at Postgres internals for a while now so I'm not quite
sure of the best location for this sort of thing.

Cheers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw, remote triggers and schemas

2013-11-15 Thread Tom Lane
Thom Brown t...@linux.com writes:
 I've observed an issue whereby a parent table with a trigger that
 redirects inserts to a child table fails to run the trigger
 successfully if written to using a foreign table:

That trigger is making unsafe assumptions about what search_path
it's run under.  If you don't want to schema-qualify the reference
to child, consider attaching a SET search_path clause to the
trigger function definition.
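
For example, with the function from the original report -- a minimal sketch,
either form of which pins the search_path the trigger body runs under:

  ALTER FUNCTION redirect_func() SET search_path = public;

  -- or, equivalently, at creation time:
  CREATE OR REPLACE FUNCTION redirect_func()
    RETURNS trigger
    LANGUAGE plpgsql
    SET search_path = public
  AS $$
  BEGIN
    INSERT INTO child VALUES (NEW.*);
    RETURN NULL;
  END; $$;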

 Is this unintended, or is it something users should fix themselves by
 being explicit about relation schemas in trigger functions?  Should
 the schema search path instead pick up whatever the default would be
 for the user being used for the connection?

postgres_fdw intentionally runs the remote session with a very minimal
search_path (I think just pg_catalog, in fact).  I would argue that
any trigger that breaks because of that is broken anyway, since it
would fail --- possibly with security implications --- if some ordinary
user modified the search path.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump insert with column names speedup

2013-11-15 Thread David Rowley
On Sat, Nov 16, 2013 at 9:04 AM, Peter Eisentraut pete...@gmx.net wrote:

 src/bin/pg_dump/pg_dump.c:1604: trailing whitespace.
 +   staticStmt = createPQExpBuffer();
 src/bin/pg_dump/pg_dump.c:1612: trailing whitespace.
 +   else
 src/bin/pg_dump/pg_dump.c:1614: trailing whitespace.
 +   if (column_inserts)


The trailing white space is fixed in the attached patch.

Regards

David Rowley


pg_dump_colinsert_v0.3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

