Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-16 Thread David Rowley
On Wed, Jan 15, 2014 at 5:39 AM, Florian Pflug f...@phlo.org wrote:

The notnullcount machinery seems to apply to both STRICT and non-STRICT
 transfer function pairs. Shouldn't that be constrained to STRICT transfer
 function pairs? For non-STRICT pairs, it's up to the transfer functions to
 deal with NULL inputs however they please, no?


The reason I had to track notnullcount for non-strict functions was that:

select sum(n) over (order by i rows between current row and unbounded
following) from (values(1,1),(2,NULL)) v(i,n);

would otherwise return

1
0

sum is not strict, so I guess we need to track notnullcount for non-strict
functions.

See the code around if (peraggstate->notnullcount == 0)
in retreat_windowaggregate().
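
For clarity, the output that should be produced here (and which tracking
notnullcount preserves) is presumably NULL, not 0, for the second row:

1
NULL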


 The logic around movedaggbase in eval_windowaggregates() seems a bit
 convoluted. Couldn't the if be moved before the code that pulls
 aggregatedbase up to frameheadpos using the inverse aggregation function?


I had a look at this and even tried moving the code to before the inverse
transitions, but it looks like that would only work if I added more tests
to the if condition to ensure that we're not about to perform inverse
transitions. To me it just seemed more bulletproof the way it is rather
than duplicating logic and having to ensure that it stays duplicated. That
said, I don't think I've fully got my head around why the original
code is valid in the first place. I would have imagined that it should
contain a check for FRAMEOPTION_START_UNBOUNDED_FOLLOWING, but if that
sounds like complete rubbish then I'll put it down to my head still being
fried from work.



 Also, could we easily check whether the frames corresponding to the
 individual rows are all either identical or disjoint, and don't use the
 inverse transfer function then? Currently, for a frame which contains
 either just the current row, or all the current row's peers, I think we'd
 use the inverse transfer function to fully un-add the old frame, and then
 add back the new frame.


I didn't know there was a situation where this could happen. Could you give
me an example query? I'll run it through the debugger to have a look at
what's going on. But sure, if this is possible and I understand what you
mean, then I guess it would be a good optimisation to detect this and throw
away the previous results and start fresh.
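
For what it's worth, my guess at the kind of query meant here (each row's
frame being just the current row, so consecutive frames never overlap) is
something like:

select sum(n) over (order by i rows between current row and current row)
from (values(1,1),(2,2),(3,3)) v(i,n);

but I may well be misunderstanding.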

Thanks for all of your reviewing on this so far.

Regards

David Rowley


 best regards,
 Florian Pflug




Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread knizhnik
I wonder if the kernel could provide a weaker version of fsync() which 
does not force all pending data to be written immediately but just 
serves as a write barrier, guaranteeing
that all write operations preceding the fsync() will be completed before any 
of the subsequent operations.


This would allow implementing weaker transaction models which do not 
satisfy all ACID requirements (results of a committed transaction can 
be lost in case of power failure or OS crash) but still preserve database 
consistency. That is acceptable for many applications and can provide much 
better performance.


Right now it is possible to implement something like this at the application 
level using an asynchronous write process, redirecting all write/sync 
operations to that process.
But such a process can become a bottleneck, reducing the scalability of the 
system, and the communication channels with this process can cause 
significant memory/CPU overhead.
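
A minimal sketch of this application-level approach (toy code, not
PostgreSQL internals; the file name and request format are invented for
illustration):

#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct wreq { off_t off; size_t len; char data[512]; };

int main(void)
{
    int pfd[2];
    pid_t pid;

    if (pipe(pfd) != 0)
        return 1;

    pid = fork();
    if (pid == 0)                       /* the single writer process */
    {
        struct wreq r;
        int fd = open("datafile", O_WRONLY | O_CREAT, 0600);

        close(pfd[1]);
        while (read(pfd[0], &r, sizeof(r)) == (ssize_t) sizeof(r))
        {
            pwrite(fd, r.data, r.len, r.off);   /* applied in arrival order */
            fdatasync(fd);              /* durability point; could be batched */
        }
        close(fd);
        _exit(0);
    }

    /* "backend" side: a write is just queueing a request, no fsync() here */
    close(pfd[0]);
    {
        struct wreq r = { 0, 5, "hello" };

        write(pfd[1], &r, sizeof(r));
    }
    close(pfd[1]);
    waitpid(pid, NULL, 0);
    return 0;
}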


In most DBMSes, including PostgreSQL, the transaction log and the database 
data are located in separate files. So such a write barrier would have to be 
associated not with one file, but with a set of files or maybe the whole 
file system.  I wonder if there are any principal problems in 
implementing or using such a file-system write barrier?






Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2014-01-16 Thread Heikki Linnakangas

On 01/16/2014 03:25 AM, Peter Geoghegan wrote:

I think you should consider breaking off the relcache parts of my
patch and committing them, because they're independently useful. If we
are going to have a lot of conflicts that need to be handled by a
heap_delete(), there is no point in inserting non-unique index tuples
for what is not yet conclusively a proper (non-promise) tuple. Those
should always come last. And even without upsert, strictly inserting
into unique indexes first seems like a useful thing relative to the
cost. Unique violations are the cause of many aborted transactions,
and there is no need to ever bloat non-unique indexes of the same slot
when that happens.


Makes sense. Can you extract that into a separate patch, please?

I was wondering if that might cause deadlocks if an existing index is 
changed from unique to non-unique, or vice versa, as the ordering would 
change. But we don't have a DDL command to change that, so the question 
is moot.


- Heikki




Re: [HACKERS] Display oprcode and its volatility in \do+

2014-01-16 Thread Rushabh Lathia
Hi,

I have reviewed your patch.

-- Patch got applied cleanly (using patch -p1)
-- Make & Make install works fine
-- make check looks good

I did a code walk-through and it looks good. I also did some manual testing
and haven't found any issue with the implementation.

I also personally felt the Function and Volatility columns are nice to have
in \do+.

Marking it as ready for committer.

Regards,
Rushabh Lathia
www.EnterpriseDB.com



On Fri, Jan 10, 2014 at 7:12 AM, Marko Tiikkaja ma...@joh.to wrote:

 Hi,

 Here's a patch for $SUBJECT, displaying information which I find quite
 tedious to locate using alternative methods.  Hopefully someone else does,
 too.  Or doesn't.  Not sure.


 Regards,
 Marko Tiikkaja






-- 
Rushabh Lathia


Re: [HACKERS] PoC: Partial sort

2014-01-16 Thread Jeremy Harris

On 22/12/13 20:26, Alexander Korotkov wrote:

On Sat, Dec 14, 2013 at 6:30 PM, Jeremy Harris j...@wizmail.org wrote:


On 14/12/13 12:54, Andres Freund wrote:

Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre-sorted?
I think we might need to switch to a different algorithm to really
benefit from mostly pre-sorted input.



Eg:  http://www.postgresql.org/message-id/5291467e.6070...@wizmail.org

Maybe Alexander and I should bash our heads together.



Partial sort patch is mostly optimizer/executor improvement rather than
improvement of sort algorithm itself.


I finally got as far as understanding Alexander's cleverness, and it
does make the performance advantage (on partially-sorted input) of the
merge-sort irrelevant.

There's a slight tradeoff possible between the code complexity of
the chunking code front-ending the sorter and just using the
enhanced sorter.  The chunking does reduce the peak memory usage
quite nicely too.

The implementation of the chunker does O(n) compares using the
keys of the feed-stream index, to identify the chunk boundaries.
Would it be possible to get this information from the Index Scan?
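
As a self-contained illustration of the chunking idea (plain C arrays
standing in for the executor's tuple stream; not the actual patch code):
the input arrives already ordered on the leading key, so only each run of
equal leading keys needs sorting on the trailing key:

#include <stdio.h>
#include <stdlib.h>

typedef struct { int k1; int k2; } Row;

static int cmp_k2(const void *a, const void *b)
{
    return ((const Row *) a)->k2 - ((const Row *) b)->k2;
}

int main(void)
{
    /* already ordered on k1, as an index scan would deliver it */
    Row rows[] = {{1, 9}, {1, 3}, {1, 7}, {2, 5}, {2, 1}, {3, 4}};
    size_t n = sizeof(rows) / sizeof(rows[0]);
    size_t start = 0;

    for (size_t i = 1; i <= n; i++)
    {
        /* chunk boundary: leading key changes (or input is exhausted) */
        if (i == n || rows[i].k1 != rows[start].k1)
        {
            qsort(rows + start, i - start, sizeof(Row), cmp_k2);
            start = i;
        }
    }

    for (size_t i = 0; i < n; i++)
        printf("%d %d\n", rows[i].k1, rows[i].k2);
    return 0;
}
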
--
Cheers,
   Jeremy





Re: [HACKERS] patch: option --if-exists for pg_dump

2014-01-16 Thread Jeevan Chalke
Hi Pavel,

I have reviewed the patch and here are my concerns and notes:

POSITIVES:
---
1. Patch applies with some white-space errors.
2. make / make install / make check is smooth. No issues as such.
3. Feature looks good as well.
4. NO concern on overall design.
5. Good work.


NEGATIVES:
---

Here are the points which I found in the review and would like you to pay
attention to.

1.
+It use conditional commands (with literalIF EXISTS/literal

Grammar mistakes. use = uses

2.
@@ -55,7 +55,8 @@ static ArchiveHandle *_allocAH(const char *FileSpec,
const ArchiveFormat fmt,
 const int compression, ArchiveMode mode, SetupWorkerPtr
setupWorkerPtr);
 static void _getObjectDescription(PQExpBuffer buf, TocEntry *te,
  ArchiveHandle *AH);
-static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, RestoreOptions
*ropt, bool isData, bool acl_pass);
+static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, RestoreOptions
*ropt,
+bool isData, bool acl_pass);
 static char *replace_line_endings(const char *str);
 static void _doSetFixedOutputState(ArchiveHandle *AH);
 static void _doSetSessionAuth(ArchiveHandle *AH, const char *user);
@@ -234,6 +235,7 @@ RestoreArchive(Archive *AHX)
boolparallel_mode;
TocEntry   *te;
OutputContext sav;
+

AH->stage = STAGE_INITIALIZING;

@@ -2961,7 +3005,8 @@ _getObjectDescription(PQExpBuffer buf, TocEntry *te,
ArchiveHandle *AH)
 }

 static void
-_printTocEntry(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt, bool
isData, bool acl_pass)
+_printTocEntry(ArchiveHandle *AH, TocEntry *te, RestoreOptions *ropt, bool
isData,
+ bool acl_pass)
 {
/* ACLs are dumped only during acl pass */
if (acl_pass)

Above changes are NOT at all related to the patch. Please remove them even
though they are clean-up like changes. Don't mix them with actual changes.

3.
+   if (strncmp(te->dropStmt + desc_len + 5, " IF EXISTS", 9) != 0)

" IF EXISTS" has 10 characters, NOT 9.

4.
+   if (strncmp(te->dropStmt + desc_len + 5, " IF EXISTS", 9) != 0)
+   ahprintf(AH, "DROP %s IF EXISTS %s",
+ te->desc,
+ te->dropStmt + 6 + desc_len);

Here you have used strncmp, starting at te->dropStmt + X,
where X = desc_len + 5. While adding the text back you used X = 6 + desc_len.
The first time you used 5 because you included the leading space in the
comparison, but for adding back we want the position past the space and thus
you used 6. That's correct, but a little bit confusing. Why not simply use
+   if (strstr(te->dropStmt, " IF EXISTS") != NULL)
to check whether the drop statement has IF EXISTS or not, like you did at some
other place? That would remove my concern 3 and the above confusion as well
(see the small sketch after this list).
What do you think?

5.
+   }
+
+   else

Extra line before else part. Please remove it for consistency.

6.
+   printf(_("  --if-exists  use IF EXISTS when dropping objects\n"));  (pg_dump)
+   printf(_("  --if-exists  don't report error if cleaned object doesn't exist\n"));  (pg_dumpall)
+   printf(_("  --if-exists  use IF EXISTS when dropping objects\n"));  (pg_restore)

Please use the same message for all three.

7.
printf(_("  --binary-upgrade for use by upgrade utilities only\n"));
printf(_("  --column-inserts dump data as INSERT commands with column names\n"));
+   printf(_("  --if-exists  don't report error if cleaned object doesn't exist\n"));
printf(_("  --disable-dollar-quoting disable dollar quoting, use SQL standard quoting\n"));
printf(_("  --disable-triggers   disable triggers during data-only restore\n"));

Please maintain order like pg_dump and pg_restore. Also at variable
declaration
and at options parsing mechanism.

8.
+   if (if_exists && !outputClean)
+   exit_horribly(NULL, "option --if-exists requires -c/--clean option\n");

Do we really want to exit when -c is not provided? Can't we simply ignore
--if-exists in that case (perhaps emitting a warning)?
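
Regarding point 4 above, a tiny standalone sketch (illustrative strings
only, not the pg_dump code) of why the strstr() form is less fragile than
hand-counted offsets:

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *stmts[] = { "DROP TABLE foo;", "DROP TABLE IF EXISTS foo;" };

    for (int i = 0; i < 2; i++)
    {
        const char *drop = stmts[i];
        size_t desc_len = strlen("TABLE");

        /* offset/strncmp style: "+ 5" must be kept in sync with "DROP " by hand */
        int a = strncmp(drop + 5 + desc_len, " IF EXISTS", strlen(" IF EXISTS")) == 0;

        /* strstr style: no offset bookkeeping at all */
        int b = strstr(drop, " IF EXISTS") != NULL;

        printf("%s -> %d %d\n", drop, a, b);
    }
    return 0;
}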

Marking Waiting on author.

Thanks


-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2014-01-16 Thread Peter Geoghegan
On Thu, Jan 16, 2014 at 12:35 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Makes sense. Can you extract that into a separate patch, please?

Okay.

On an unrelated note, here are results for a benchmark that compares
the two patches for an insert heavy workload:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/insert-heavy-cmp/

I should point out that this is a sympathetic case for the exclusion
approach; there is only one unique index involved, and the heap tuples
were relatively wide:

pg@gerbil:~/pgbench-tools/tests$ cat tpc-b-upsert.sql
\set nbranches 10
\set naccounts 10
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom delta -5000 5000
with rej as(insert into pgbench_accounts(aid, bid, abalance, filler)
values(:aid, :bid, :delta, 'filler') on duplicate key lock for update
returning rejects aid, abalance) update pgbench_accounts set abalance
= pgbench_accounts.abalance + rej.abalance from rej where
pgbench_accounts.aid = rej.aid;

(This benchmark used an unlogged table, if only because to do
otherwise would severely starve this particular server of I/O).
-- 
Peter Geoghegan




Re: [HACKERS] Display oprcode and its volatility in \do+

2014-01-16 Thread Marko Tiikkaja

On 1/16/14 9:53 AM, Rushabh Lathia wrote:

I have reviewed your patch.

-- Patch got applied cleanly (using patch -p1)
-- Make & Make install works fine
-- make check looks good

I did a code walk-through and it looks good. I also did some manual testing
and haven't found any issue with the implementation.

I also personally felt the Function and Volatility columns are nice to have
in \do+.

Marking it as ready for committer.


Thanks for reviewing!


Regards,
Marko Tiikkaja




Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-16 Thread David Rowley
On Thu, Jan 16, 2014 at 9:45 AM, Florian Pflug f...@phlo.org wrote:

 I'm currently thinking the best way forward is to get a basic patch
 without any NUMERIC stuff committed, and to revisit this after that's done.


Attached is a version to that effect. The number of inverse transition
functions is down to 8 from 36.

I managed to keep avg(interval) in there as it's all integer-based division.
sum for all int types remains, plus sum for interval and cash. count(exp)
and count(*) also naturally remain.

sum(bigint) became a bit weird as it uses numeric types internally, so I
had to keep the do_numeric_discard() function to support it. Previously
this function did NaNCount-- if it was removing a NaN numeric, but since I
got rid of the NaNCounter, as it was no longer required, I just decided to
throw an error if we encounter a numeric with NaN. I thought this was OK as
the only place the function is called, the numeric value being passed
is built from a bigint, which obviously can't be NaN. It may be better to
just get rid of do_numeric_discard() and stick the sub_var(&(state->sumX),
newval, &(state->sumX)); into int8_avg_accum_inv(). I don't have a
preference as to which, as I think there are reasons for both ways. Perhaps
even the name int8_avg_accum_inv() is misleading as we never use it for
avg(bigint). Anyway, I'll leave it as is for now as I just can't decide
which way is better.
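
To illustrate the general inverse-transition idea in isolation (a toy,
self-contained sketch, not the patch's code, and deliberately ignoring the
numeric/NaN issues above):

#include <stdio.h>
#include <stdint.h>

typedef struct { int64_t sum; int64_t count; } SumState;

static void sum_accum(SumState *s, int32_t v)   { s->sum += v; s->count++; }
static void sum_discard(SumState *s, int32_t v) { s->sum -= v; s->count--; }

int main(void)
{
    SumState s = {0, 0};
    int32_t frame1[] = {10, 20, 30};

    for (int i = 0; i < 3; i++)
        sum_accum(&s, frame1[i]);          /* frame is {10,20,30}, sum 60 */

    sum_discard(&s, 10);                   /* frame slides: remove 10 ...  */
    sum_accum(&s, 40);                     /* ... and add 40: {20,30,40}   */

    printf("%lld\n", (long long) s.sum);   /* 90, no re-aggregation needed */
    return 0;
}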

Florian, I've not had a chance to look at your updated v2.2 patch yet, sorry.
I'll try and get some time tomorrow evening.

Regards

David Rowley


 best regards,
 Florian Pflug




inverse_transition_functions_v2.3.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] [RFC] overflow checks optimized away

2014-01-16 Thread Heikki Linnakangas

On 11/29/2013 04:04 AM, Greg Stark wrote:

Attached is what I have so far. I have to say I'm starting to come
around to Tom's point of view. This is a lot of hassle for not much
gain. I've noticed a number of other overflow checks that the llvm
optimizer is not picking up on, so even after this patch it's not clear
that all the signed overflow checks that depend on -fwrapv will be
gone.

This patch still isn't quite finished though.


In addition to what you have in the patch, Coverity is complaining about 
the overflow checks in int4abs (which is just like int8abs), and in 
DetermineTimeZoneOffset.


- Heikki




Re: [HACKERS] drop duplicate buffers in OS

2014-01-16 Thread Aidan Van Dyk
Can we just get the backend that dirties the page to do the posix_fadvise
DONTNEED?

Or have another helper that sweeps the shared buffers and does this
post-first-dirty?
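
(For reference, the kernel call being discussed is roughly the following;
drop_os_cache is a hypothetical helper name, sketch only:)

#include <fcntl.h>
#include <sys/types.h>

/* Hint the kernel to drop this clean, already-written range from the OS
 * page cache; shared_buffers keeps its own authoritative copy. */
static int
drop_os_cache(int fd, off_t offset, off_t len)
{
    return posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
}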

a.


On Wed, Jan 15, 2014 at 1:34 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Jan 15, 2014 at 1:53 AM, KONDO Mitsumasa
 kondo.mitsum...@lab.ntt.co.jp wrote:
  I created a patch that can drop duplicate buffers in the OS using a
  usage_count algorithm. I have been developing this patch since last summer.
  This feature seems to be a hot discussion topic, so I am submitting it
  earlier than I had scheduled.
 
  When usage_count is high in shared_buffers, those buffers are hard to drop
  from shared_buffers. However, these buffers aren't required in the file
  cache, because they aren't accessed by postgres (postgres accesses
  shared_buffers). So I created an algorithm that drops from the file cache
  pages which have a high usage_count in shared_buffers and are in a clean
  state in the OS. If the file cache is in a clean state in the OS, executing
  posix_fadvise DONTNEED can free it from the file cache without writing to
  physical disk. This algorithm will solve the double-buffering problem and
  can use memory more efficiently.
 
  I am testing the DBT-2 benchmark now...

 The thing about this is that our usage counts for shared_buffers don't
 really work right now; it's common for everything, or nearly
 everything, to have a usage count of 5.  So I'm reluctant to rely on
 that for much of anything.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company






-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a
slave.


Re: [HACKERS] Deprecations in authentication

2014-01-16 Thread Magnus Hagander
On Wed, Jan 15, 2014 at 6:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Magnus Hagander mag...@hagander.net writes:
  One thing I noticed - in MSVC, the config parameter krb5 (equivalent of
  the removed --with-krb5) enabled *both* krb5 and gssapi, and there is no
  separate config parameter for gssapi. Do we want to rename that one to
  gss, or do we want to keep it as krb5? Renaming it would break
  otherwise working environments, but it's kind of weird to leave it...

 +1 for renaming --- anybody who's building with krb5 and expecting to,
 you know, actually *get* krb5 would probably rather find out about this
 change at build time instead of down the road a ways.

 A compromise position would be to introduce a gss parameter while leaving
 krb5 in place as a deprecated (perhaps undocumented?) synonym for it.
 But I think that's basically confusing.


Yeah, I'm not sure it actually helps much.


Andrew - is this going to cause any issues wrt the buildfarm, by any chance?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Peter Eisentraut
On 1/15/14, 11:23 PM, Stephen Frost wrote:
 * Peter Eisentraut (pete...@gmx.net) wrote:
 In my mind, a conf.d directory is an extension of a single-file
 configuration, and so it should be handled that way.
 
 I'm apparently out on some funny limb with this thought, but I'll throw
 it out there anyway- in my head, the 'postgresql.auto.conf' thing that
 essentially ends up included as part of 'postgresql.conf' should be
 handled the same way a single 'postgresql.conf' or 'conf.d' directory
 is.

Then one might as well argue that the pg_db_role_setting table be
relocated to /etc.  It's the same facility, only on a slightly different
level.  The fact that postgresql.auto.conf looks the same as a
plain-text configuration file is an implementation detail.  We could
have chosen some binary format instead.





Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Alvaro Herrera
Peter Eisentraut wrote:

 In my mind, a conf.d directory is an extension of a single-file
 configuration, and so it should be handled that way.

+1 on this.  This means

1. it is to be read automatically by the server without need for an
   include_dir conf.d option in the main postgresql.conf.

2. it is to be placed alongside postgresql.conf, not necessarily in
   PGDATA.

3. it might or might not be writable by the running server; it's an
   operating-system-owned configuration location.

4. there is no point in disabling it, and thus we offer no mechanism
   to do that.

5. its entries override what is set in postgresql.conf, and are in turn
   overridden by what is set in postgresql.auto.conf.


Taken together, these properties guarantee that it's a useful mechanism
to be used by system-level deployment/configuration tools such as Puppet
et al.


I also think, but this is a secondary point, that initdb should write
its modified settings into a file in conf.d instead of generating a
custom postgresql.conf.
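
(For context, the opt-in form of this that exists today is the include_dir
directive; point 1 above would make something like the following implicit
instead of requiring it in postgresql.conf:)

# postgresql.conf
include_dir 'conf.d'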

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Filter error log statements by sqlstate

2014-01-16 Thread Jeevan Chalke
Hi Oskari,

The patch looks good to me now. I have found no issues either. It is good to
go in now.

However, few small suggestions:

1. Whenever we know that a variable contains only 32 bits, better to define
it as uint32 and not just int (m_sqlstate in the get_sqlstate_error_level()
function). int may differ in size on different platforms.
2. Also, always cast the result to the actual return type of the function.

These are my views. The current code has absolutely no issues as such (at
least on my machine).

Assigning back to you for your views on the above points. If you agree, resend
the patch with the changes; otherwise feel free to mark it as Ready for
Committer.

Thanks


-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Stephen Frost
Peter,

* Peter Eisentraut (pete...@gmx.net) wrote:
 Then one might as well argue that the pg_db_role_setting table be
 relocated to /etc.  It's the same facility, only on a slightly different
 level.  The fact that postgresql.auto.conf looks the same as a
 plain-text configuration file is an implementation detail.  We could
 have chosen some binary format instead.

I argued quite a bit that we should be looking at moving more things
into the catalog tables and pulling them out of postgresql.conf,
reducing that down to just what's required to be there, rather than this
nasty hack where SQL commands are modifying text config files.

Thanks,

Stephen




Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Andres Freund
On 2014-01-16 08:34:23 -0500, Stephen Frost wrote:
 Peter,
 
 * Peter Eisentraut (pete...@gmx.net) wrote:
  Then one might as well argue that the pg_db_role_setting table be
  relocated to /etc.  It's the same facility, only on a slightly different
  level.  The fact that postgresql.auto.conf looks the same as a
  plain-text configuration file is an implementation detail.  We could
  have chosen some binary format instead.
 
 I argued quite a bit that we should be looking at moving more things
 into the catalog tables and pulling them out of postgresql.conf,
 reducing that down to just what's required to be there, rather than this
 nasty hack where SQL commands are modifying text config files.

Given that the majority didn't seem to be convinced by this and that the
feature was written differently this isn't a convincing argument for the
location of the file given the current feature, no?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Robert Haas
On Wed, Jan 15, 2014 at 7:54 AM, Magnus Hagander mag...@hagander.net wrote:
 On Wed, Jan 15, 2014 at 3:20 AM, Simon Riggs si...@2ndquadrant.com wrote:
 We've discussed previously the negative impact of large bulk
 operations, especially wrt WAL writes. Patch here allows maintenance
 operations to have their WAL generation slowed down as a replication
 lag prevention feature.

 I believe there was originally intended to be some work on I/O rate
 limiting, but that hasn't happened and is in some ways orthogonal to
 this patch and we will likely eventually want both.

 Single new parameter works very similarly to vacuum_cost_delay

 wal_rate_limit_delay = Xms


 Seems like a really bad name if we are only slowing down some commands -
 that seems to indicate we're slowing down all of them. I think it should be
 something that indicates that it only affects the maintenance commands.

And why should it only affect the maintenance commands anyway, and who
decides what's a maintenance command?

I thought Heroku suggested something like this previously, and their
use case was something along the lines of we need to slow the system
down enough to do a backup so we can delete some stuff before the disk
fills.  For that, it seems likely to me that you would just want to
slow everything down.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] Relocation of tablespaces in pg_basebackup

2014-01-16 Thread Alvaro Herrera
Please keep the --help and the options in the SGML table in alphabetical
order within their respective sections.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Andres Freund
On 2014-01-16 09:06:30 -0500, Robert Haas wrote:
  Seems like a really bad name if we are only slowing down some commands -
  that seems to indicate we're slowing down all of them. I think it should be
  something that indicates that it only affects the maintenance commands.
 
 And why should it only affect the maintenance commands anyway, and who
 decides what's a maintenance command?

I think implementing it for everything might have some use, but it's a
much, much more complex task. You can't simply do rate limiting in
XLogInsert() or somesuch - we're holding page locks, buffer pins, other
locks... I don't see why that needs to be done in the same feature.

I don't really see much difficulty in determining what's a utility
command and what not for the purpose of this? All utility commands which
create WAL in O(table_size) or worse.

 I thought Heroku suggested something like this previously, and their
 use case was something along the lines of we need to slow the system
 down enough to do a backup so we can delete some stuff before the disk
 fills.  For that, it seems likely to me that you would just want to
 slow everything down.

I think the usecase for this more along the lines of not slowing normal
operations or cause replication delays to standbys unduly, while
performing maintenance operations on relations.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-01-16 Thread Euler Taveira
On 08-11-2013 06:20, Dilip kumar wrote:
 On 08 November 2013 13:38, Jan Lentfer
 
 
 For this use case, would it make sense to queue work (tables) in order of 
 their size, starting on the largest one?
 
 For the case where you have tables of varying size this would lead to a 
 reduced overall processing time, as it prevents large (read: long processing 
 time) tables from being processed in the last step, while processing large 
 tables first and filling up processing slots/jobs as they become free with 
 smaller tables one after the other would save overall execution time.
 Good point, I have made the change and attached the modified patch.
 
You didn't submit it for a CF, did you? Is it too late for this CF?


-- 
   Euler Taveira   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




Re: [HACKERS] [PATCH] Relocation of tablespaces in pg_basebackup

2014-01-16 Thread Alvaro Herrera
Eyeballing this patch, three thoughts:

1. I wonder whether ilist.c/h should be moved to src/common so that
frontend code could use it.

2. I wonder whether ilist.c should gain the ability to have
singly-linked lists with a pointer to the tail node for appending to the
end.  This code would use it, and also the code doing postgresql.conf
parsing which has head/tail pointers all over the place.  I'm sure there
are other uses.

3. How many definitions of atooid() do we have now?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Relocation of tablespaces in pg_basebackup

2014-01-16 Thread Andres Freund
Hi,

On 2014-01-16 11:25:59 -0300, Alvaro Herrera wrote:
 Eyeballing this patch, three thoughts:
 
 1. I wonder whether ilist.c/h should be moved to src/common so that
 frontend code could use it.

Sounds like a good idea. There are some debugging checks that elog(), but
that should be easily fixable.

 2. I wonder whether ilist.c should gain the ability to have
 singly-linked lists with a pointer to the tail node for appending to the
 end.  This code would use it, and also the code doing postgresql.conf
 parsing which has head/tail pointers all over the place.  I'm sure there
 are other uses.

I am not generally averse to it, but neither of those use cases seems to
warrant that. They should just use a doubly linked list; it's not like
the memory/runtime overhead is significant. And the implementation
overhead doesn't count either if they use ilist.h.
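
As a sketch, assuming ilist.h does get moved so frontend code can use it,
appending to and walking a list of tablespace mappings (struct and function
names below are invented for illustration) would look roughly like:

#include "postgres_fe.h"
#include "lib/ilist.h"

typedef struct TablespaceMapping
{
	char		old_dir[MAXPGPATH];
	char		new_dir[MAXPGPATH];
	dlist_node	node;
} TablespaceMapping;

static dlist_head mappings = DLIST_STATIC_INIT(mappings);

static void
add_mapping(TablespaceMapping *m)
{
	/* O(1) append without keeping a separate tail pointer */
	dlist_push_tail(&mappings, &m->node);
}

static void
print_mappings(void)
{
	dlist_iter	iter;

	dlist_foreach(iter, &mappings)
	{
		TablespaceMapping *m = dlist_container(TablespaceMapping, node, iter.cur);

		printf("%s -> %s\n", m->old_dir, m->new_dir);
	}
}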

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Changeset Extraction v7.0 (was logical changeset generation)

2014-01-16 Thread Robert Haas
On Wed, Jan 15, 2014 at 3:39 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-01-15 13:28:25 -0500, Robert Haas wrote:
 - I think you should just regard ReplicationSlotCtlLock as protecting
 the name and active flags of every slot.  ReplicationSlotCreate()
 would then not need to mess with the spinlocks at all, and
 ReplicationSlotAcquire and ReplicationSlotDrop get a bit simpler too I
 think.  Functions like ReplicationSlotsComputeRequiredXmin() can
 acquire this lock in shared mode and only lock the slots that are
 actually active, instead of all of them.

 I first thought you meant that we should get rid of the spinlock, but
 after rereading I think you just mean that ->name, ->active, ->in_use
 are only allowed to change while holding the lwlock exclusively so we
 don't need to spinlock in those cases? If so, yes, that works for me.

Yeah, that's about what I had in mind.

 - If you address /* FIXME: apply sanity checking to slot name */, then
 I think that also addresses /* XXX: do we want to use truncate
 identifier instead? */. In other words, let's just error out if the
 name is too long.  I'm not sure what other sanity checking is needed
 here; maybe just restrict it to 7-bit characters to avoid problems
 with encodings for individual databases varying.

 Yea, erroring out seems like a good idea. But I think we need to
 restrict slot names a bit more than that, given they are used as
 filenames... We could instead name the files using the slot's offset,
 but I'd prefer to not go that route.

OK.  Well, add some code, then.  :-)

 - ReplicationSlotAcquire probably needs to ignore slots that are not active.
 Not sure what you mean? If the slot isn't in_use we'll skip it in the loop.

active != in_use.

I suppose your point is that the slot can't be in_use if it's not also
active.  Maybe it would be better to get rid of active/in_use and have
three states: REPLSLOT_CONNECTED, REPLSLOT_NOT_CONNECTED,
REPLSLOT_FREE.  Or something like that.

 - If there's a coding rule that slot->database can't be changed while
 the slot is active, then the check to make sure that the user isn't
 trying to bind to a slot with a mis-matching database could be done
 before the code described in the previous point, avoiding the need to
 go back and release the resource.

 I don't think slot->database should be allowed to change at all...

Well, it can if the slot is dropped and a new one created.

 - I think the critical section in ReplicationSlotDrop is bogus.  If
 DeleteSlot() fails, we scarcely need to PANIC.  The slot just isn't
 gone.

 Well, if delete slot fails, we don't really know at which point it
 failed which means that the on-disk state might not correspond to the
 in-memory state. I don't see a point in adding code trying to handle
 that case correctly...

Deleting the slot should be an atomic operation.  There's some
critical point before which the slot will be picked up by recovery and
after which it won't.  You either did that operation, or not, and can
adjust the in-memory state accordingly.

 - cancel_before_shmem_exit is only guaranteed to remove the
 most-recently-added callback.

 Yea :(. I think that's safe for the current usages but seems mighty
 fragile... Not sure what to do about it. Just register KillSlot once and
 keep it registered?

Yep.  Use a module-private flag to decide whether it needs to do anything.

 - A comment in KillSlot wonders whether locking is required.  I say
 yes.  It's safe to take lwlocks and spinlocks during shmem exit, and
 failing to do so seems like a recipe for subtle corner-case bugs.

 I agree that it's safe to use spinlocks, but lwlocks? What if we are
 erroring out while holding an lwlock? Code that runs inside a
 TransactionCommand is protected against that, but if not ProcKill()
 which invokes LWLockReleaseAll() runs pretty late in the teardown
 process...

Hmm.  I guess it'd be fine to decide that a connected slot can be
marked not-connected without the lock.  I think you'd want a rule that
a slot can't be freed except when it's not-connected; otherwise, you
might end up marking the slot not-connected after someone else had
already recycled it for an unrelated purpose (drop slot, create new
slot).

 - There seems to be no interface to acquire or release slots from
 either SQL or the replication protocol, nor any way for a client of
 this code to update its slot details.

 I don't think either ever needs to do that - START_TRANSACTION SLOT slot
 ...; and decoding_slot_*changes will acquire/release for them while
 active. What would the usecase be to allow them to be acquired from SQL?

My point isn't so much about SQL as that with just this patch I don't
see any way for anyone to ever acquire a slot for anything, ever.  So
I think there's a piece missing, or three.

 The slot details get updates by the respective replication code. For
 streaming rep, that should happen via reply and feedback messages. For
 changeset extraction it happens when 

Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-01-16 Thread Robert Haas
On Thu, Jan 16, 2014 at 12:07 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Thu, Jan 16, 2014 at 12:49 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jan 15, 2014 at 7:28 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 Unpatched
 ---
 testname                                 | wal_generated |     duration
 ------------------------------------------+---------------+-----------------
  one short and one long field, no change  |    1054923224 |  33.101135969162

 After pgrb_delta_encoding_v4
 -

 testname                                 | wal_generated |     duration
 ------------------------------------------+---------------+-----------------
  one short and one long field, no change  |     877859144 |  30.6749138832092


 Temporary Changes
 (Revert Max Chunksize = 4 and logic of finding longer match)
 -

 testname                                 | wal_generated |     duration
 ------------------------------------------+---------------+-----------------
  one short and one long field, no change  |     677337304 |  25.4048750400543

 Sure, but watch me not care.

 If we're interested in taking advantage of the internal
 compressibility of tuples, we can do a lot better than this patch.  We
 can compress the old tuple and the new tuple.  We can compress
 full-page images.  We can compress inserted tuples.  But that's not
 the point of this patch.

 The point of *this* patch is to exploit the fact that the old and new
 tuples are likely to be very similar, NOT to squeeze out every ounce
 of compression from other sources.

Okay, got your point.
Another minor thing: in the latest patch which I sent yesterday, I have
modified it such that during the formation of chunks, if there is data at the
end of the string which doesn't have the special pattern and is less than the
max chunk size, we also consider that as a chunk. The reason for doing this
is that, say we have a 104-byte string which contains no special bit pattern;
then it will just have one 64-byte chunk and will leave the remaining bytes,
which might miss the chance of compressing that data.

Yeah, that sounds right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Changeset Extraction v7.0 (was logical changeset generation)

2014-01-16 Thread Andres Freund
Hi,

On 2014-01-16 09:34:51 -0500, Robert Haas wrote:
  - ReplicationSlotAcquire probably needs to ignore slots that are not 
  active.
  Not sure what you mean? If the slot isn't in_use we'll skip it in the loop.
 
 active != in_use.
 
 I suppose your point is that the slot can't be in_use if it's not also
 active.

Yes. There's asserts to that end...

 Maybe it would be better to get rid of active/in_use and have
 three states: REPLSLOT_CONNECTED, REPLSLOT_NOT_CONNECTED,
 REPLSLOT_FREE.  Or something like that.

Hm. Color me unenthusiastic. If you feel strongly I can change it, but
otherwise not.

  - If there's a coding rule that slot-database can't be changed while
  the slot is active, then the check to make sure that the user isn't
  trying to bind to a slot with a mis-matching database could be done
  before the code described in the previous point, avoiding the need to
  go back and release the resource.
 
  I don't think slot-database should be allowed to change at all...
 
 Well, it can if the slot is dropped and a new one created.

Well. That obviously requires the lwlock to be acquired...

  - I think the critical section in ReplicationSlotDrop is bogus.  If
  DeleteSlot() fails, we scarcely need to PANIC.  The slot just isn't
  gone.
 
  Well, if delete slot fails, we don't really know at which point it
  failed which means that the on-disk state might not correspond to the
  in-memory state. I don't see a point in adding code trying to handle
  that case correctly...
 
 Deleting the slot should be an atomic operation.  There's some
 critical point before which the slot will be picked up by recovery and
 after which it won't.  You either did that operation, or not, and can
 adjust the in-memory state accordingly.

I am not sure I understand that point. We can either update the
in-memory bit before performing the on-disk operations or
afterwards. Either way, there's a way to be inconsistent if the disk
operation fails somewhere in between (it might fail but still have
deleted the file/directory!). The normal way to handle that in other
places is PANICing when we don't know so we recover from the on-disk
state.
I really don't see the problem here? Code doesn't get more robust by
doing s/PANIC/ERROR/, rather the contrary. It takes extra smarts to only
ERROR, often that's not warranted.

  - A comment in KillSlot wonders whether locking is required.  I say
  yes.  It's safe to take lwlocks and spinlocks during shmem exit, and
  failing to do so seems like a recipe for subtle corner-case bugs.
 
  I agree that it's safe to use spinlocks, but lwlocks? What if we are
  erroring out while holding an lwlock? Code that runs inside a
  TransactionCommand is protected against that, but if not ProcKill()
  which invokes LWLockReleaseAll() runs pretty late in the teardown
  process...
 
 Hmm.  I guess it'd be fine to decide that a connected slot can be
 marked not-connected without the lock.

I now acquire the spinlock since that has to work, or we have much worse
problems... That guarantees that other backends see the value as well.

 I think you'd want a rule that
 a slot can't be freed except when it's not-connected; otherwise, you
 might end up marking the slot not-connected after someone else had
 already recycled it for an unrelated purpose (drop slot, create new
 slot).

Yea, that rule is there. Otherwise we'd get in great trouble.

  - There seems to be no interface to acquire or release slots from
  either SQL or the replication protocol, nor any way for a client of
  this code to update its slot details.
 
  I don't think either ever needs to do that - START_TRANSACTION SLOT slot
  ...; and decoding_slot_*changes will acquire/release for them while
  active. What would the usecase be to allow them to be acquired from SQL?
 
 My point isn't so much about SQL as that with just this patch I don't
 see any way for anyone to ever acquire a slot for anything, ever.  So
 I think there's a piece missing, or three.

The slot is acquired by code using the slot. So when START_TRANSACTION
SLOT ... (in contrast to a START_TRANSACTION without SLOT) is sent,
walsender.c does a ReplicationSlotAcquire(cmd->slotname) in
StartReplication() and releases it after it has finished.

  The slot details get updates by the respective replication code. For
  streaming rep, that should happen via reply and feedback
  messages. For changeset extraction it happens when
  LogicalConfirmReceivedLocation() is called; the walsender interface
  does that using reply messages, the SQL interface calls it when
  finished (unless you use the _peek_ functions).
 
 Right, but where is this code?  I don't see this updating the reply
 and feedback message processing code to touch slots.  Did I miss that?

It's in wal_decoding: logical changeset extraction walsender interface
currently :(. Splitting the streaming replication part of that patch off
isn't easy...

Greetings,

Andres Freund



Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-01-16 Thread Alvaro Herrera
Euler Taveira wrote:
 On 08-11-2013 06:20, Dilip kumar wrote:
  On 08 November 2013 13:38, Jan Lentfer
  
  
  For this use case, would it make sense to queue work (tables) in order of 
  their size, starting on the largest one?
  
  For the case where you have tables of varying size this would lead to a 
  reduced overall processing time, as it prevents large (read: long 
  processing time) tables from being processed in the last step, while 
  processing large tables first and filling up processing slots/jobs as they 
  become free with smaller tables one after the other would save overall 
  execution time.
  Good point, I have made the change and attached the modified patch.
  
 You didn't submit it for a CF, did you? Is it too late for this CF?

Not too late.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Display oprcode and its volatility in \do+

2014-01-16 Thread Tom Lane
Marko Tiikkaja ma...@joh.to writes:
 On 1/16/14 9:53 AM, Rushabh Lathia wrote:
 I also personally felt the Function and Volatility columns are nice to have
 in \do+.

FWIW, I'm on board with the idea of printing the oprcode, but adding
volatility here seems like probably a waste of valuable terminal width.
I think that the vast majority of operators have immutable or at worst
stable underlying functions, so this doesn't seem like the first bit
of information I'd need about the underlying function.  And why print
this but not, say, security, owner, source code, or other columns
shown in \df?  ISTM the value of this addition is to give you what
you need to go look in \df, not to try to substitute for that.
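
(For reference, the sort of manual lookup meant here is along these lines:)

SELECT o.oprname, o.oprcode, p.provolatile
FROM pg_operator o
JOIN pg_proc p ON p.oid = o.oprcode
WHERE o.oprname = '+';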

regards, tom lane




Re: [HACKERS] Display oprcode and its volatility in \do+

2014-01-16 Thread Marko Tiikkaja

On 1/16/14 4:22 PM, Tom Lane wrote:

Marko Tiikkaja ma...@joh.to writes:

On 1/16/14 9:53 AM, Rushabh Lathia wrote:

I also personally felt the Function and Volatility columns are nice to have
in \do+.


FWIW, I'm on board with the idea of printing the oprcode, but adding
volatility here seems like probably a waste of valuable terminal width.
I think that the vast majority of operators have immutable or at worst
stable underlying functions, so this doesn't seem like the first bit
of information I'd need about the underlying function.


Completely unscientifically, 50% of the time I've wanted to know the 
oprcode has been because I wanted to know its volatility (exactly 
because of the stable oprcodes we have).  It seemed like a useful 
addition, but I don't feel that strongly about it.



Regards,
Marko Tiikkaja




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I don't really see much difficulty in determining what's a utility
 command and what not for the purpose of this? All utility commands which
 create WAL in O(table_size) or worse.

By that definition, INSERT, UPDATE, and DELETE can all be utility
commands.  So would a full-table-scan SELECT, if it's unfortunate
enough to run into a lot of hint-setting or HOT-pruning work.

I think possibly a more productive approach to this would be to treat
it as a session-level GUC setting, rather than hard-wiring it to affect
certain commands and not others.

regards, tom lane




Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Christian Kruse
Hi Alvaro,

On 16/01/14 10:21, Alvaro Herrera wrote:
 1. it is to be read automatically by the server without need for an
include_dir conf.d option in the main postgresql.conf.

+1

 4. there is no point in disabling it, and thus we offer no mechanism
to do that.

Not only is there „no point“ in disabling it, it makes this feature
nearly useless. One can't rely on it if a distro may disable
it. There are so many distros out there that it will never be a reliable
feature if it can be disabled.

 5. its entries override what is set in postgresql.conf, and are in turn
overridden by what is set in postgresql.auto.conf.

+1

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Display oprcode and its volatility in \do+

2014-01-16 Thread Tom Lane
Marko Tiikkaja ma...@joh.to writes:
 On 1/16/14 4:22 PM, Tom Lane wrote:
 FWIW, I'm on board with the idea of printing the oprcode, but adding
 volatility here seems like probably a waste of valuable terminal width.
 I think that the vast majority of operators have immutable or at worst
 stable underlying functions, so this doesn't seem like the first bit
 of information I'd need about the underlying function.

 Completely unscientifically, 50% of the time I've wanted to know the 
 oprcode has been because I wanted to know its volatility (exactly 
 because of the stable oprcodes we have).  It seemed like a useful 
 addition, but I don't feel that strongly about it.

Hm.  Personally, I've lost count of the number of times I've had to
resort to select ... from pg_operator because \do lacked an oprcode
column, but I don't remember that many or indeed any were because
I wanted to check the volatility.

Anybody else have an opinion?

regards, tom lane




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 01/16/2014 05:39 PM, Andres Freund wrote:
 Do you see a reasonable way to implement this generically for all
 commands? I don't.

 In suitable safe places, check if you've written too much WAL, and sleep 
 if so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of 
 CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe 
 vacuum_delay_point() is a better analogy. Hopefully you don't need to 
 sprinkle them in too many places to be useful.

We probably don't really need to implement it for all commands; I think
everyone can agree that, say, ALTER TABLE RENAME COLUMN isn't going to
emit enough WAL to really matter.  My point was that we should try to hit
everything that potentially *could* generate lots of WAL, rather than
arbitrarily deciding that some are utility commands and some are not.

For INSERT/UPDATE/DELETE, likely you could do this with a single 
CHECK_FOR_WAL_BUDGET() added at a strategic spot in nodeModifyTable.c.

regards, tom lane




Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 Peter Eisentraut wrote:
  In my mind, a conf.d directory is an extension of a single-file
  configuration, and so it should be handled that way.
 
 +1 on this.  This means
 
 1. it is to be read automatically by the server without need for an
include_dir conf.d option in the main postgresql.conf.

I am not thrilled with the idea that we're claiming ownership of the
directory which postgresql.conf happens to reside in and you had better
not have any 'conf.d' directory in there unless it's the one for PG.

 2. it is to be placed alongside postgresql.conf, not necessarily in
PGDATA.

I certainly agree with this idea, but I feel it should be configurable.
The path which is passed in should be relative to the location of
postgresql.conf.

 3. it might or might not be writable by the running server; it's an
operating-system-owned configuration location.

Works well enough, I suppose.

 4. there is no point in disabling it, and thus we offer no mechanism
to do that.

No.  There should be a way to disable it.

 5. its entries override what is set in postgresql.conf, and are in turn
overridden by what is set in postgresql.auto.conf.

For my part, I'd rather my configurations be deterministic and would
prefer that we error on obvious misconfigurations where values are set
and then reset.

 Taken together, these properties guarantee that it's a useful mechanism
 to be used by system-level deployment/configuration tools such as Puppet
 et al.

It will be a useful mechanism for puppet even with the ability to
disable it in postgresql.conf (you're probably managing that with puppet
also, or keeping it at the system default, which will likely include the
include_dir option by default, but yes, you'd need to check that) and
without the override-previous-definition option.

 I also think, but this is a secondary point, that initdb should write
 its modified settings into a file in conf.d instead of generating a
 custom postgresql.conf.

This makes sense.

Thanks,

Stephen




Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 Given that the majority didn't seem to be convinced by this and that the
 feature was written differently this isn't a convincing argument for the
 location of the file given the current feature, no?

I'll start by pointing out that the only reason this ALTER SYSTEM thing
exists is because there are options that we can't set in the catalogs.

Now, I can't claim to conjecture on what the majority opinion is when it
disagreed with my own.  I remain of the opinion that having settings
which admins are familiar with being overridden by some file in the data
directory is going to cause confusion and problems.  Therefore, I feel
that the 'auto' file should be where postgresql.conf lives because
admins will at least have some hope of finding it there (and then
realizing they need to disable that functionality and lock down the
postgresql.conf permissions to be root-owned...).

Having the log files be sufficiently detailed to help an admin is better
than nothing, but it certainly won't be as good as having the config
files in the same directory when the admin is trying to figure out why
the cluster won't start any more.

Thanks,

Stephen




Re: [HACKERS] Add force option to dropdb

2014-01-16 Thread salah jubeh

If the user owns objects, that will prevent this from working also.  I
have the feeling that adding DROP OWNED BY and/or REASSIGN OWNED BY
calls to this utility would be a bit excessive, but who knows.

Please find attached a first attempt at dropping the client connections.

I have added an option -k, --kill instead of force, since killing client
connections does not guarantee a forced drop.
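
Roughly, the attached patch issues this sequence before the DROP ('mydb'
stands in for the target database; the patch also saves and restores the
previous datconnlimit if the drop fails):

-- block new connections, then kick out existing ones, then drop
UPDATE pg_database SET datconnlimit = 0 WHERE datname = 'mydb';
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'mydb';
DROP DATABASE mydb;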

Regards




On Tuesday, January 14, 2014 8:06 PM, Alvaro Herrera alvhe...@2ndquadrant.com 
wrote:
 
salah jubeh wrote:

 For the sake of completeness:
 1. I think I also need to temporarily disallow connecting to the database,
 is that right?
 2. Are there other factors that can hinder dropping the database?

If the user owns objects, that will prevent this from working also.  I
have the feeling that adding DROP OWNED BY and/or REASSIGN OWNED BY
calls to this utility would be a bit excessive, but who knows.


 3. Should I write two patches, one for pg_version >= 9.2 and one for 
 pg_version < 9.2?  

No point -- nothing gets applied to branches older than current
development anyway.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--- dropdb_orig.c	2014-01-16 15:21:51.059243617 +0100
+++ dropdb.c	2014-01-16 16:38:57.419414200 +0100
@@ -32,6 +32,7 @@
 		{"echo", no_argument, NULL, 'e'},
 		{"interactive", no_argument, NULL, 'i'},
 		{"if-exists", no_argument, &if_exists, 1},
+		{"kill", no_argument, NULL, 'k'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{NULL, 0, NULL, 0}
 	};
@@ -48,6 +49,8 @@
 	enum trivalue prompt_password = TRI_DEFAULT;
 	bool		echo = false;
 	bool		interactive = false;
+	bool		kill = false;
+	char *database_conn_limit = "-1";
 
 	PQExpBufferData sql;
 
@@ -59,7 +62,7 @@
 
 	handle_help_version_opts(argc, argv, "dropdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWei", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeik", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -84,6 +87,9 @@
 			case 'i':
 				interactive = true;
 				break;
+			case 'k':
+				kill = true;
+				break;
 			case 0:
 				/* this covers the long options */
 				break;
@@ -121,8 +127,6 @@
 
 	initPQExpBuffer(&sql);
 
-	appendPQExpBuffer(&sql, "DROP DATABASE %s%s;\n",
-					  (if_exists ? "IF EXISTS " : ""), fmtId(dbname));
 
 	/* Avoid trying to drop postgres db while we are connected to it. */
 	if (maintenance_db == NULL && strcmp(dbname, "postgres") == 0)
@@ -131,11 +135,64 @@
 	conn = connectMaintenanceDatabase(maintenance_db,
 			host, port, username, prompt_password, progname);
 
+	/* disallow database connections and terminate client connections */
+	if (kill)
+	{
+		appendPQExpBuffer(&sql, "SELECT datconnlimit FROM pg_database WHERE datname= '%s';", fmtId(dbname));
+		result = executeQuery(conn, sql.data, progname, echo);
+		// get the datconnlimit to do cleanup in case of dropdb fail
+		if (PQntuples(result) == 1)
+		{
+			database_conn_limit = PQgetvalue(result, 0, 0);
+		} else
+		{
+			fprintf(stderr, _("%s: database removal failed: %s\n"),
+					progname, dbname);
+			PQclear(result);
+			PQfinish(conn);
+			exit(1);
+		}
+		PQclear(result);
+		resetPQExpBuffer(&sql);
+		// new connections are not allowed
+		appendPQExpBuffer(&sql, "UPDATE pg_database SET datconnlimit = 0 WHERE datname= '%s';\n", fmtId(dbname));
+		if (echo)
+			printf("%s", sql.data);
+		result = PQexec(conn, sql.data);
+		if (PQresultStatus(result) != PGRES_COMMAND_OK)
+		{
+			fprintf(stderr, _("%s: database removal failed: %s\n"),
+					progname, dbname);
+			PQclear(result);
+			PQfinish(conn);
+			exit(1);
+		}
+		PQclear(result);
+		resetPQExpBuffer(&sql);
+		// terminate client connections
+		appendPQExpBuffer(&sql, "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname ='%s';", fmtId(dbname));
+		result = executeQuery(conn, sql.data, progname, echo);
+		PQclear(result);
+		resetPQExpBuffer(&sql);
+	}
+
+	appendPQExpBuffer(&sql, "DROP DATABASE %s%s;\n",
+					  (if_exists ? "IF EXISTS " : ""), fmtId(dbname));
+
 	if (echo)
 		printf("%s", sql.data);
 	result = PQexec(conn, sql.data);
 	if (PQresultStatus(result) != PGRES_COMMAND_OK)
 	{
+		// cleanup: reset datconnlimit
+		if (kill)
+		{
+			resetPQExpBuffer(&sql);
+			appendPQExpBuffer(&sql, "UPDATE pg_database SET datconnlimit = %s WHERE datname= '%s';", database_conn_limit, fmtId(dbname));
+			if (echo)
+				printf("%s", sql.data);
+			result = PQexec(conn, sql.data);
+		}
 		fprintf(stderr, _("%s: database removal failed: %s"),
 				progname, PQerrorMessage(conn));
 		PQfinish(conn);
@@ -159,6 +216,7 @@
 	printf(_("  -i, --interactive         prompt before deleting anything\n"));
 	printf(_("  -V, --version             output version information, then exit\n"));
 	printf(_("  --if-exists               don't report error if database doesn't exist\n"));
+	printf(_("  -k, --kill                kill client connections\n"));
 	printf(_("  -?, --help                show this help, then exit\n"));
 	printf(_("\nConnection 

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Jeremy Harris

On 14/01/14 22:23, Dave Chinner wrote:

On Tue, Jan 14, 2014 at 11:40:38AM -0800, Kevin Grittner wrote:

To quantify that, in a production setting we were seeing pauses of
up to two minutes with shared_buffers set to 8GB and default dirty

^

page settings for Linux, on a machine with 256GB RAM and 512MB

   ^
There's your problem.

By default, background writeback doesn't start until 10% of memory
is dirtied, and on your machine that's 25GB of RAM. That's way too
high for your workload.

It appears to me that we are seeing large memory machines much more
commonly in data centers - a couple of years ago 256GB RAM was only
seen in supercomputers. Hence machines of this size are moving from
the "tweaking settings for supercomputers is OK" class to the "tweaking
settings for enterprise servers is not OK" class.

Perhaps what we need to do is deprecate dirty_ratio and
dirty_background_ratio as the default values, move to the byte-based
values as the defaults, and cap them appropriately.  e.g.
10/20% of RAM for small machines down to a couple of GB for large
machines


<whisper>  Perhaps the kernel needs a dirty-amount control measured
in time units rather than pages (it being up to the kernel to
measure the achievable write rate)...
--
Cheers,
   Jeremy


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Alvaro Herrera
Stephen Frost wrote:
 * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
  Peter Eisentraut wrote:
   In my mind, a conf.d directory is an extension of a single-file
   configuration, and so it should be handled that way.
  
  +1 on this.  This means
  
  1. it is to be read automatically by the server without need for an
 include_dir conf.d option in the main postgresql.conf.
 
 I am not thrilled with the idea that we're claiming ownership of the
 directory which postgresql.conf happens to reside in and you had better
 not have any 'conf.d' directory in there unless it's the one for PG.

Please explain.  I don't see a reason not to.  Are you saying you expect
to put, say, Apache configuration files in the same directory as
PostgreSQL's?  That doesn't sound really tenable to me, and it doesn't
sounds like what any sane admin would do.

In /etc you have /etc/postfix/ where you have Postfix files, and
/etc/exim4 where you have Exim files, and /etc/apache2 or whatever.  And
if each of those software packages uses a conf.d or conf-enabled
directory, they use a hardcoded name for a place inside their own
subdirectory, not one directly in /etc.

What you don't have is /etc/exim4.conf and /etc/postfix.conf and
/etc/apache.conf, where each of those files specifies that I would
please like to have extra files loaded from /etc/exim.conf.d and
/etc/apache.conf.d, and that packages please figure out by parsing my
config file in whatever random format I have invented to figure out
where snippets are to be stored.

  2. it is to be placed alongside postgresql.conf, not necessarily in
 PGDATA.
 
 I certainly agree with this idea- but I feel it should be configurable.
 The path which is passed in should be relative to the location of
 postgresql.conf.

What would be the point of having it be configurable?

  4. there is no point in disabling it, and thus we offer no mechanism
 to do that.
 
 No.  There should be a way to disable it.

What's your rationale for this?

  5. its entries override what is set in postgresql.conf, and are in turn
 overridden by what is set in postgresql.auto.conf.
 
 For my part, I'd rather my configurations be deterministic and would
 prefer that we error on obvious misconfigurations where values are set
 and then reset.

Any convention we establish will make configurations deterministic.
Random variations such as making the conf.d directory be configurable
would make them non-deterministic.  ISTM you're contradicting yourself
here.

  Taken together, these properties guarantee that it's an useful mechanism
  to be used by system-level deployment/configuration tools such as Puppet
  et al.
 
 It will be a useful mechanism for puppet even with the ability to
 disable it in postgresql.conf (you're probably managing that with puppet
 also, or keeping it at the system default which will likely include the
 include_dir option by default- but, yes, you'd need to check that) and
 without the override previous definition option.

Meh.  External packages (say, postgis) will be able to add their own
files into conf.d without having to check what its actual location is,
and without having to check whether the thing is enabled in the first
place.  If we make it disable-able, packages will not be able to do
that; or rather, they will be able to add files, but they will have no
effect.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Andres Freund
On 2014-01-16 10:35:20 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I don't really see much difficulty in determining what's a utility
  command and what not for the purpose of this? All utility commands which
  create WAL in O(table_size) or worse.
 
 By that definition, INSERT, UPDATE, and DELETE can all be utility
 commands.  So would a full-table-scan SELECT, if it's unfortunate
 enough to run into a lot of hint-setting or HOT-pruning work.

Well, I said *utility* command. So, everything processed by
ProcessUtility() that generates WAL like that.

 I think possibly a more productive approach to this would be to treat
 it as a session-level GUC setting, rather than hard-wiring it to affect
 certain commands and not others.

Do you see a reasonable way to implement this generically for all
commands? I don't.
We shouldn't let the need for generic resource control stop us from
providing some form of resource control for a consistent subset of
commands.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Simon Riggs
On 16 January 2014 16:56, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-01-16 17:47:51 +0200, Heikki Linnakangas wrote:
 On 01/16/2014 05:39 PM, Andres Freund wrote:
 On 2014-01-16 10:35:20 -0500, Tom Lane wrote:
 I think possibly a more productive approach to this would be to treat
 it as a session-level GUC setting, rather than hard-wiring it to affect
 certain commands and not others.
 
 Do you see a reasonable way to implement this generically for all
 commands? I don't.

 In suitable safe places, check if you've written too much WAL, and sleep if
 so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of
 CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe
 vacuum_delay_point() is a better analogy. Hopefully you don't need to
 sprinkle them in too many places to be useful.

 That'd be most places doing XLogInsert() if you want to be
 consistent. Each needing to be analyzed as to whether we're blocking important
 resources, and determining where in the callstack we can do the
 CHECK_FOR_WAL_BUDGET().

 I don't think the added power of allowing bulk DML to be metered is
 worth the increased implementation cost.

 I think the usecases that would want this for DML probably also want this
 to work for unlogged, temp tables. That'd require a much more generic
 resource control framework.

Thank you everyone for responding so positively to this proposal. It
is something many users have complained about.

In time, I think we might want both WAL Rate Limiting and I/O Rate
Limiting. IMHO I/O rate limiting is harder and so this proposal is
restricted solely to WAL rate limiting.

I'm comfortable with a session level parameter. I was mulling over a
wal_rate_limit_scope parameter but I think that is too much.
I will follow Craig's proposal to define this in terms of MB/s, adding
that I would calculate it from the beginning of the statement to now, so
it is an averaged rate. A setting of zero means unlimited. The rate would
be per-session (in this release) rather than a globally calculated
setting.
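
For illustration only, here is a minimal sketch of the averaged-rate check
I have in mind; the GUC name, the per-statement counters, the assumption of
integer (microsecond) timestamps, and the 10ms nap are all assumptions of
mine, not code from the patch:

/* Sketch only: every name below is hypothetical. */
static double wal_rate_limit = 0;	/* MB/s; 0 means unlimited */
static TimestampTz stmt_start_time;	/* set at start of statement */
static uint64 stmt_wal_bytes;		/* WAL bytes written by this statement */

static void
check_for_wal_rate_limit(void)
{
	double		elapsed_secs;
	double		allowed_bytes;

	if (wal_rate_limit <= 0)
		return;					/* unlimited */

	/* assuming integer timestamps, i.e. microseconds since statement start */
	elapsed_secs = (GetCurrentTimestamp() - stmt_start_time) / 1000000.0;
	allowed_bytes = wal_rate_limit * 1024.0 * 1024.0 * elapsed_secs;

	/*
	 * If we've written more WAL than the averaged rate allows since the
	 * statement started, nap briefly; the next call re-evaluates the budget.
	 */
	if ((double) stmt_wal_bytes > allowed_bytes)
		pg_usleep(10000L);		/* 10ms */
}

Whether to sleep in small fixed naps or compute the exact catch-up time is
debatable; the point is just that the check stays cheap enough to sprinkle
around the way vacuum_delay_point() is.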

I would like to call it CHECK_FOR_WAL_RATE_LIMIT()

That seems like a cheap enough call to allow it to be added in more
places, so my expanded list of commands touched are
 CLUSTER/VACUUM FULL
 ALTER TABLE (only in phase 3 table rewrite)
 ALTER TABLE SET TABLESPACE
 CREATE INDEX
which are already there, plus new ones suggested/implied
 COPY
 CREATE TABLE AS SELECT
 INSERT/UPDATE/DELETE

Please let me know if I missed something (rather than debating what is
or is not a maintenance command).

Any other thoughts before I cut v2 ?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-01-16 17:47:51 +0200, Heikki Linnakangas wrote:
 In suitable safe places, check if you've written too much WAL, and sleep if
 so.

 That'd be most places doing XLogInsert() if you want to be
 consistent.

See my other response.  There's no need for consistency, only to be sure
that code paths that might generate a *lot* of WAL include such checks.
We've gotten along fine without any formal methodology for where to put
CHECK_FOR_INTERRUPTS() or vacuum_delay_point() calls, and this seems like
just more of the same.

  I think the usecases that would want this for DML probably also want this
  to work for unlogged, temp tables.

Huh?  Unlogged tables generate *zero* WAL, by definition.

If your point is that WAL creation isn't a particularly good resource
consumption metric, that's an argument worth considering, but it seems
quite orthogonal to whether we can implement such throttling reasonably.
And in any case, you didn't provide such an argument.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Andres Freund
On 2014-01-16 11:12:55 -0500, Tom Lane wrote:
 If we were to do this, I'd argue that the location of this hard-wired
 config directory ought to be selected by a configure option.
 in fact, I'd argue that the default behavior with no such option be
 that there's no such hard-wired directory.  That puts it entirely
 on the packager to decide what makes sense as a location.  In the end
 it's going to be the packager's decision anyway --- it's just a matter
 of how painful we make it for him to configure it to his distro's
 conventions.

I don't think a configure option is going to help much to reach that
goal - unless the distribution only allows a single cluster and a single
version to be installed anything but a path relative to $PGDATA won't
work well if configured statically.

So, if we want to support anything but a) relative to pgdata b)
relative to postgresql.conf it needs to be configurable at startup
time. Possibly with an added initdb switch to set the location.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Andres Freund
On 2014-01-16 17:47:51 +0200, Heikki Linnakangas wrote:
 On 01/16/2014 05:39 PM, Andres Freund wrote:
 On 2014-01-16 10:35:20 -0500, Tom Lane wrote:
 I think possibly a more productive approach to this would be to treat
 it as a session-level GUC setting, rather than hard-wiring it to affect
 certain commands and not others.
 
 Do you see a reasonable way to implement this generically for all
 commands? I don't.
 
 In suitable safe places, check if you've written too much WAL, and sleep if
 so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of
 CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe
 vacuum_delay_point() is a better analogy. Hopefully you don't need to
 sprinkle them in too many places to be useful.

That'd be most places doing XLogInsert() if you want to be
consistent. Each needing to be analyzed as to whether we're blocking important
resources, and determining where in the callstack we can do the
CHECK_FOR_WAL_BUDGET().

I don't think the added power of allowing bulk DML to be metered is
worth the increased implementation cost.

I think the usecases that would want this for DML probably also want this
to work for unlogged, temp tables. That'd require a much more generic
resource control framework.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 1. it is to be read automatically by the server without need for an
 include_dir conf.d option in the main postgresql.conf.

 I am not thrilled with the idea that we're claiming ownership of the
 directory which postgresql.conf happens to reside in and you had better
 not have any 'conf.d' directory in there unless it's the one for PG.

If we were to do this, I'd argue that the location of this hard-wired
config directory ought to be selected by a configure option.   And
in fact, I'd argue that the default behavior with no such option be
that there's no such hard-wired directory.  That puts it entirely
on the packager to decide what makes sense as a location.  In the end
it's going to be the packager's decision anyway --- it's just a matter
of how painful we make it for him to configure it to his distro's
conventions.

On the whole though, I find the argument that we can't configure this
in postgresql.conf to be exceedingly weak.  In particular, the idea
that you can puppet-ify a configuration without any knowledge of the
distro you're targeting is ridiculous on its face, as is the idea
that we/postgres can dictate configuration file location conventions
to packagers who have distro rules to follow.  There isn't going to
be a one location to rule them all, and that means the argument
that a location determined by postgresql.conf is too unstable does
not really hold water.

Another point here is that hard-wiring a config directory location
into the executables completely breaks many scenarios for running
multiple clusters with the same executables.  Yeah, maybe you'd
like to share most of the same config across all your clusters.
But then again, maybe you wouldn't.  The proposed behavior would
make it practically impossible for a test cluster to not pick up
random subsets of the system primary cluster's configuration.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Andres Freund
On 2014-01-16 11:19:28 -0500, Tom Lane wrote:
  I think the usecases that would want this for DML probably also want this
  to work for unlogged, temp tables.
 
 Huh?  Unlogged tables generate *zero* WAL, by definition.

Yes. That's my point. If we provide it as a generic resource control -
which what's being discussed here sounds to me - it should be generic.

If we provide as a measure to prevent standbys from getting out of date
due to maintenance commands, then it only needs to cover those.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Simon Riggs
On 16 January 2014 17:22, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-01-16 11:19:28 -0500, Tom Lane wrote:
  I think the usecases that would want this for DML probably also want this
  to work for unlogged, temp tables.

 Huh?  Unlogged tables generate *zero* WAL, by definition.

 Yes. That's my point. If we provide it as a generic resource control -
 which what's being discussed here sounds to me - it should be generic.

 If we provide as a measure to prevent standbys from getting out of date
 due to maintenance commands, then it only needs to cover those.

Agreed, but it won't happen in this release. I/O resource control to
follow in later releases.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Stephen Frost
Alvaro,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 Please explain.  I don't see a reason not to.  Are you saying you expect
 to put, say, Apache configuration files in the same directory as
 PostgreSQL's?  That doesn't sound really tenable to me, and it doesn't
 sounds like what any sane admin would do.

That doesn't mean we get to dictate it.

 In /etc you have /etc/postfix/ where you have Postfix files, and
 /etc/exim4 where you have Exim files, and /etc/apache2 or whatever.  And
 if each of those software packages uses a conf.d or conf-enabled
 directory, they use a hardcoded name for a place inside their own
 subdirectory, not one directly in /etc.

I missed the part of this where you point out that Apache on Debian has
some kind of problem because it's possible for an admin to remove the
'IncludeDir sites-enabled' line from apache2.conf.

 What you don't have is /etc/exim4.conf and /etc/postfix.conf and
 /etc/apache.conf, where each of those files specifies that I would
 please like to have extra files loaded from /etc/exim.conf.d and
 /etc/apache.conf.d, and that packages please figure out by parsing my
 config file in whatever random format I have invented to figure out
 where snippets are to be stored.

Right, there's absolutely zero cases of /etc/*.conf files these days.
Nor are there any /etc/*.conf.d directories.

/sarcasm

Sure, maybe we all feel that there should be an /etc/pam directory
instead of /etc/pam.conf and /etc/pam.d, but that doesn't make it so,
and we're only talking about a subset (perhaps a majority, but it's
certainly not a totality) of PG users- not everyone is using Debian or
RH or even *Linux* or GNU tools to run PG with.

   2. it is to be placed alongside postgresql.conf, not necessarily in
  PGDATA.
  
  I certainly agree with this idea- but I feel it should be configurable.
  The path which is passed in should be relative to the location of
  postgresql.conf.
 
 What would be the point of having it be configurable?

To allow admins to configure it?  Perhaps their policy is that it must
be postgresql.conf.d, or that they want it served off of an NFS mount
instead of being pulled from /etc, or maybe they only back up /srv but
they don't want to change where the package manager dumps the
postgresql.conf that's used for starting up the cluster?

   4. there is no point in disabling it, and thus we offer no mechanism
  to do that.
  
  No.  There should be a way to disable it.
 
  What's your rationale for this?

I don't want PG randomly guessing at directories to go poking around in
for config files.  It's also documentation.

   5. its entries override what is set in postgresql.conf, and are in turn
  overridden by what is set in postgresql.auto.conf.
  
  For my part, I'd rather my configurations be deterministic and would
  prefer that we error on obvious misconfigurations where values are set
  and then reset.
 
 Any convention we establish will make configurations deterministic.
 Random variations such as making the conf.d directory be configurable
 would make them non-deterministic.  ISTM you're contradicting yourself
 here.

Perhaps you see it that way, but I certainly don't.  Having conf.d be
configurable doesn't mean that we have one file with work_mem = 1GB
and another with work_mem = 5GB where it's not at all clear which one
is actually what gets used.

  It will be a useful mechanism for puppet even with the ability to
  disable it in postgresql.conf (you're probably managing that with puppet
  also, or keeping it at the system default which will likely include the
  include_dir option by default- but, yes, you'd need to check that) and
  without the override previous definition option.
 
 Meh.  External packages (say, postgis) will be able to add their own
 files into conf.d without having to check what its actual location is,
 and without having to check whether the thing is enabled in the first
 place.  If we make it disable-able, packages will not be able to do
 that; or rather, they will be able to add files, but they will have no
 effect.

Which cluster directory is the packager of PostGIS going to put his
config snippets into, exactly?  Even if you force the hard-code conf.d
idea, the packager will still need to figure out what clusters exist.

Although, if you could make the conf.d directory configurable, and have
more than one, then we actually could have a system-wide directory and
then a per-cluster directory, which *would* make things easier for
packagers.  But, oh wait, don't forget that the system-wide one would
still need to be major-version-specific...

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Andres Freund
On 2014-01-16 17:20:19 +0100, Simon Riggs wrote:
 I'm comfortable with a session level parameter. I was mulling over a
 wal_rate_limit_scope parameter but I think that is too much.
 I will follow Craig's proposal to define this in terms of MB/s, adding
 that I would calc from beginning of statement to now, so it is
 averaged rate. Setting of zero means unlimited. The rate would be
 per-session (in this release) rather than a globally calculated
 setting.
 
 I would like to call it CHECK_FOR_WAL_RATE_LIMIT()
 
 That seems like a cheap enough call to allow it to be added in more
 places, so my expanded list of commands touched are
  CLUSTER/VACUUM FULL
  ALTER TABLE (only in phase 3 table rewrite)
  ALTER TABLE SET TABLESPACE
  CREATE INDEX
 which are already there, plus new ones suggested/implied
  COPY
  CREATE TABLE AS SELECT
  INSERT/UPDATE/DELETE
 
 Please let me know if I missed something (rather than debating what is
 or is not a maintenance command).

If we're going to do this for DML - which I am far from convinced of -
we also should do it for SELECT, since that can generate significant
amounts of WAL with checksums turned on.
Otherwise stuff like INSERT ... SELECT, UPDATE FROM et al. will behave
very confusingly, since suddenly they will not only block the WAL
generated by the INSERT but also the SELECT.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 So, if we want to support anything but a) relative to pgdata b)
 relative to postgresql.conf it needs to be configurable at startup
 time. Possibly with an added initdb switch to set the location.

How would an initdb switch be better than an option in postgresql.conf?
That sounds very backwards to me.  If you meant a *postmaster*/pg_ctl
option, that would make some sense to me, for folks who would like to
avoid having a single postgresql.conf at all...  Don't know if that
would work today or not tho.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Andres Freund
On 2014-01-16 11:35:00 -0500, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  So, if we want to support anything but a) relative to pgdata b)
  relative to postgresql.conf it needs to be configurable at startup
  time. Possibly with an added initdb switch to set the location.
 
 How would an initdb switch be better than an option in postgresql.conf?
 That sounds very backwards to me.  If you meant a *postmaster*/pg_ctl
 option, that would make some sense to me, for folks who would like to
 avoid having a single postgresql.conf at all...  Don't know if that
 would work today or not tho.

Oh, I was thinking of initdb option being complementary to a
postgresql.conf option, just setting the path in the generated
postgresql.conf. That way distributions wouldn't have to muck around in
the generated config anymore.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 On 2014-01-16 11:35:00 -0500, Stephen Frost wrote:
  * Andres Freund (and...@2ndquadrant.com) wrote:
   So, if we want to support anything but a) relative to pgdata b)
   relative to postgresql.conf it needs to be configurable at startup
   time. Possibly with an added initdb switch to set the location.
  
  How would an initdb switch be better than an option in postgresql.conf?
  That sounds very backwards to me.  If you meant a *postmaster*/pg_ctl
  option, that would make some sense to me, for folks who would like to
  avoid having a single postgresql.conf at all...  Don't know if that
  would work today or not tho.
 
 Oh, I was thinking of initdb option being complementary to a
 postgresql.conf option, just setting the path in the generated
 postgresql.conf. That way distributions wouldn't have to muck around in
 the generated config anymore.

Oh, sure, I can see that being useful.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Simon Riggs
On 16 January 2014 17:29, Andres Freund and...@2ndquadrant.com wrote:

 Please let me know if I missed something (rather than debating what is
 or is not a maintenance command).

 If we're going to do this for DML - which I am far from convinced of -
 we also should do it for SELECT, since that can generate significant
 amounts of WAL with checksums turned on.
 Otherwise stuff like INSERT ... SELECT, UPDATE FROM et al. will behave
  very confusingly, since suddenly they will not only block the WAL
 generated by the INSERT but also the SELECT.

Good point, but rather happily I can say "thought of that" and refer
you to the other patch, which limits SELECT's ability to dirty pages
and thus, with checksums enabled, will limit the generation of WAL.
(And no, they aren't the same thing).

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Agreed, but it won't happen in this release. I/O resource control to
 follow in later releases.

This release?  TBH, I think this patch has arrived too late to be
considered for 9.4.  It's a fairly major feature and clearly not
without controversy, so I don't think that posting it for the first
time the day before CF4 starts meets the guidelines we all agreed
to back in Ottawa.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Relocation of tablespaces in pg_basebackup

2014-01-16 Thread Steeve Lennmark
Alvaro,
On Thu, Jan 16, 2014 at 3:20 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Please keep the --help and the options in the SGML table in alphabetical
 order within their respective sections.


Ah, I failed to see that there were sub-groups and thought the options
were not alphabetically ordered. This patch fixes that.

--
Steeve Lennmark


0005-pg_basebackup-relocate-tablespace.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Heavily modified big table bloat even in auto vacuum is running

2014-01-16 Thread tirtho
Is this now fixed? If so, where do I find the patch for Postgres 9.2.2?

Thanks

- Tirthankar



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Heavily-modified-big-table-bloat-even-in-auto-vacuum-is-running-tp5774274p5787477.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Simon Riggs
On 16 January 2014 17:49, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Agreed, but it won't happen in this release. I/O resource control to
 follow in later releases.

 This release?  TBH, I think this patch has arrived too late to be
 considered for 9.4.  It's a fairly major feature and clearly not
 without controversy, so I don't think that posting it for the first
 time the day before CF4 starts meets the guidelines we all agreed
 to back in Ottawa.

I think its majorly useful, but there really isn't much to the
implementation and I expect its beneath the level of being defined as
a major feature. (I think it may have taken less time to write than
this current conversation has lasted).

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Relocation of tablespaces in pg_basebackup

2014-01-16 Thread Steeve Lennmark
Alvaro,
On Thu, Jan 16, 2014 at 3:25 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Eyeballing this patch, three thoughts:

 1. I wonder whether ilist.c/h should be moved to src/common so that
 frontend code could use it.


That would be nice, I guess lack of helpers is why a lot of stuff is
using dumputils.h from src/bin/pg_dump.

$ git grep -l dumputils.h src/bin/{psql,scripts}
src/bin/psql/command.c
src/bin/psql/copy.c
src/bin/psql/describe.c
src/bin/scripts/clusterdb.c
src/bin/scripts/createdb.c
src/bin/scripts/createuser.c
src/bin/scripts/dropdb.c
src/bin/scripts/dropuser.c
src/bin/scripts/reindexdb.c
src/bin/scripts/vacuumdb.c


 3. How many definitions of atooid() do we have now?


$ git grep '#define atooid' |wc -l
  11

I found no obvious .h to include though.
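
(For reference, the duplicated definition in question is, as far as I can
tell, just this one-liner, so a shared header would be trivial:)

#define atooid(x)  ((Oid) strtoul((x), NULL, 10))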

--
Steeve Lennmark


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Heikki Linnakangas

On 01/16/2014 05:39 PM, Andres Freund wrote:

On 2014-01-16 10:35:20 -0500, Tom Lane wrote:

I think possibly a more productive approach to this would be to treat
it as a session-level GUC setting, rather than hard-wiring it to affect
certain commands and not others.


Do you see a reasonable way to implement this generically for all
commands? I don't.


In suitable safe places, check if you've written too much WAL, and sleep 
if so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of 
CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe 
vacuum_delay_point() is a better analogy. Hopefully you don't need to 
sprinkle them in too many places to be useful.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for fail-back without fresh backup

2014-01-16 Thread Jeff Janes
On Thu, Nov 21, 2013 at 2:43 PM, Andres Freund and...@2ndquadrant.com wrote:

 On 2013-11-21 14:40:36 -0800, Jeff Janes wrote:
  But if the transaction would not have otherwise generated WAL (i.e. a
  select that did not have to do any HOT pruning, or an update with zero
 rows
  matching the where condition), doesn't it now have to flush and wait when
  it would otherwise not?

 We short circuit that if there's no xid assigned. Check
 RecordTransactionCommit().


It looks like that only short-circuits the flush if both there is no xid
assigned, and !wrote_xlog.  (line 1054 of xact.c)

I do see stalls on fdatasync on flush from select statements which had no
xid, but did generate xlog due to HOT pruning, I don't see why WAL logging
hint bits would be different.

Cheers,

Jeff


Re: [HACKERS] Patch for fail-back without fresh backup

2014-01-16 Thread Andres Freund
On 2014-01-16 09:25:51 -0800, Jeff Janes wrote:
 On Thu, Nov 21, 2013 at 2:43 PM, Andres Freund and...@2ndquadrant.com wrote:
 
  On 2013-11-21 14:40:36 -0800, Jeff Janes wrote:
   But if the transaction would not have otherwise generated WAL (i.e. a
   select that did not have to do any HOT pruning, or an update with zero
  rows
   matching the where condition), doesn't it now have to flush and wait when
   it would otherwise not?
 
  We short circuit that if there's no xid assigned. Check
  RecordTransactionCommit().
 
 
 It looks like that only short-circuits the flush if both there is no xid
 assigned, and !wrote_xlog.  (line 1054 of xact.c)

Hm. Indeed. Why don't we just always use the async commit behaviour for
that? I don't really see any significant dangers from doing so?

It's also rather odd to use the sync rep mechanisms in such
scenarios... The if() really should test markXidCommitted instead of
wrote_xlog.

 I do see stalls on fdatasync on flush from select statements which had no
 xid, but did generate xlog due to HOT pruning, I don't see why WAL logging
 hint bits would be different.

Are the stalls at commit or while the select is running? If wal_buffers
is filled too fast, which can easily happen if loads of pages are hinted
and wal logged, that will happen independently from
RecordTransactionCommit().

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-16 Thread Florian Pflug
On Jan16, 2014, at 09:07 , David Rowley dgrowle...@gmail.com wrote:
 On Wed, Jan 15, 2014 at 5:39 AM, Florian Pflug f...@phlo.org wrote:
 The notnullcount machinery seems to apply to both STRICT and non-STRICT 
 transfer function pairs. Shouldn't that be constrained to STRICT transfer 
 function pairs? For non-STRICT pairs, it's up to the transfer functions to 
 deal with NULL inputs however they please, no?
 
 The reason I had to track the notnullcount was because for non-strict was 
 because:
 
 select sum(n) over (order by i rows between current row and unbounded 
 following) from (values(1,1),(2,NULL)) v(i,n);
 
 would otherwise return
 1
 0
 
 sum is not strict, so I guess we need to track notnullcount for non-strict 
 functions.
 See the code around if (peraggstate-notnullcount == 0) in 
 retreat_windowaggregate(). 

Yeah, I figured that was the reason, but you can't fix it that way. See 
http://www.postgresql.org/message-id/8e857d95-cba4-4974-a238-9dd7f61be...@phlo.org
 for a detailed explanation why. My 2.2 patch allows pairs of non-strict 
forward and strict inverse transfer functions exactly because of this - i.e., 
basically it decrees that if there's an inverse transfer function, the strict 
setting of the *inverse* function determines the aggregate's NULL behaviour. The 
forward transfer function is then never called
for NULL inputs, but it *is* called with the NULL state for the first non-NULL 
input, and *must* then return a non-NULL state (hence it's technically not 
strict, it's strict only in the inputs, not in the state).
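
To illustrate that contract outside of the fmgr machinery (this is only a
sketch in plain C, not code from the patch; the struct and function names
are made up):

#include <stdlib.h>

typedef struct SumState
{
	long	sum;
	long	count;		/* lets a final function return NULL for zero rows */
} SumState;

/* forward: never called for NULL input, but may be handed a NULL state */
static SumState *
sum_forward(SumState *state, long value)
{
	if (state == NULL)			/* first non-NULL input */
	{
		state = malloc(sizeof(SumState));
		state->sum = 0;
		state->count = 0;
	}
	state->sum += value;
	state->count++;
	return state;				/* must not return NULL */
}

/* inverse: effectively strict - only called with non-NULL state and input */
static SumState *
sum_inverse(SumState *state, long value)
{
	state->sum -= value;
	state->count--;
	return state;
}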

BTW, I just realized I failed to update CREATE AGGREGATE's logic when I did 
that in 2.2. That doesn't matter for the built-in aggregates since these aren't 
created with CREATE AGGREGATE, but it's obviously wrong. I'll post a fixed 
patch.

 The logic around movedaggbase in eval_windowaggregates() seems a bit 
 convoluted. Couldn't the if be moved before the code that pulls 
 aggregatedbase up to frameheadpos using the inverse aggregation function?

 I had a look at this and even tried moving the code to before the inverse 
 transitions, but it looks like that would only work if I added more tests to 
 the if condition to ensure that we're not about to perform inverse 
 transitions. To me it just seemed more bullet proof the way it is rather than 
 duplicating logic and having to ensure that it stays duplicated. But saying 
 that I don't think I've fully got my head around why the original code is 
 valid in the first place. I would have imagined that it should contain a 
 check for FRAMEOPTION_START_UNBOUNDED_FOLLOWING, but if that sounds like 
 complete rubbish then I'll put it down to my head still being fried from work.

Ok, I get this now. That code really just is asking "is the previous row's 
frame the same as the current row's frame" in that "if" where you added 
movedagg. What confused me was the rather weird way that condition is 
expressed, which it turns out is due to the fact that at the point of the "if", 
we don't know the new frame's end. Now, we could use update_frametailpos() to 
find that, but that's potentially costly, especially if the tuple store was 
spilled to disk. So instead, the code relies on the fact that only if the frame 
end is "n FOLLOWING/PRECEDING" can the current row lie within the previous 
row's frame without the two frames' ends necessarily being the same.

For added confusion, that "if" never explicitly checks whether the frames' 
*heads* coincide - the previous code seems to have relied on the impossibility 
of having aggregatedbase <= current < aggregatedupto after re-initializing 
the aggregation, because then aggregatedbase = aggregatedupto = 0. That's why 
you can't just move the "if" before the frameheadpos == aggregatedbase check. 
But you can *if* you also check whether aggregatedbase == frameheadpos in the 
"if" - which is clearer than relying on that rather subtle assumption anyway.

BTW, your patch will also fail badly if the frame head ever moves backwards 
- the invariant that aggregatedbase == frameheadpos after the inverse 
transition loop will then be violated. Now, this should never happen, because 
we require that the offset in "n PRECEDING/FOLLOWING" is constant, but we 
should probably still check for this and elog().

That check was implicit in old code, because it advanced the tuplestore mark, 
so if the frame head moved backwards, re-scanning the frame would have failed. 
That brings me to another todo - as it stands, that mark gets never advanced if 
we're doing inverse aggregation. To fix that, we need a call to 
WinSetMarkPosition() somewhere in eval_windowaggregates().

After doing these things, eval_windowaggregates ended up calling 
initialize_windowframeaggregates at a single place again, so I inlined it back 
into eval_windowaggregates().  I also made it use temp_slot_1 in the inverse 
aggregation loop, since that seemed safe - the code assumes some invariants 
about aggregatedbase and agg_row_slot.

Updated patch 

Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Jeff Janes
On Thu, Jan 16, 2014 at 8:19 AM, Tom Lane t...@sss.pgh.pa.us wrote:


  I think the usecases that would want this for DML probably also want this
  to work for unlogged, temp tables.

 Huh?  Unlogged tables generate *zero* WAL, by definition.


Transactions that only change unlogged tables still generate commit records
to WAL.

I don't think that amount of WAL is particularly relevant to this
discussion, but I was recently surprised by it, so wanted to publicize it.
 (It was causing a lot of churn in my WAL due to interaction with
archive_timeout)

Cheers,

Jeff


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Alvaro Herrera
Tom Lane wrote:
 Stephen Frost sfr...@snowman.net writes:
  * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
  1. it is to be read automatically by the server without need for an
  include_dir conf.d option in the main postgresql.conf.
 
  I am not thrilled with the idea that we're claiming ownership of the
  directory which postgresql.conf happens to reside in and you had better
  not have any 'conf.d' directory in there unless it's the one for PG.
 
 If we were to do this, I'd argue that the location of this hard-wired
 config directory ought to be selected by a configure option.   And
 in fact, I'd argue that the default behavior with no such option be
 that there's no such hard-wired directory.  That puts it entirely
 on the packager to decide what makes sense as a location.

I fail to see the sense in this.  Surely the only useful option is to
use a relative path: an absolute path is completely useless because then
there's no way to have multiple clusters using the same executables, as
you say.  Since it has to be relative to *something*, there aren't that
many options: we could make it relative to the directory where the
executables reside (which would be absolutely pointless), or relative to
the data directory (which we already ruled out for reasons well
understood), or relative to the directory containing the other
configuration files.

 On the whole though, I find the argument that we can't configure this
 in postgresql.conf to be exceedingly weak.  In particular, the idea
 that you can puppet-ify a configuration without any knowledge of the
 distro you're targeting is ridiculous on its face,

Surely puppetification is not the only use case for conf.d.  For
example, suppose for a minute that a module providing an extension wants
to offer a program to initialize its configuration file into some
configuration directory (say, a hypothetical PostGIS package drops a
postgis.conf into conf.d with initial values for all the config
options it provides).  If you're not administering postgresql.conf with
puppet, you'd like to be able to do this to the clusters you have
without having to figure out *where* to put the files (that is, without
having to parse and possibly modify postgresql.conf).  Just dump it in
the cluster's conf.d.

 as is the idea that we/postgres can dictate configuration file
 location conventions to packagers who have distro rules to follow.

I don't see that this is ridiculous at all.  Configuration files always
go together; a distro rule that some config files go in some place and
other configuration files go somewhere else, sounds more ridiculous to
me, actually.

 There isn't going to be a one location to rule them all, and that
 means the argument that a location determined by postgresql.conf is
 too unstable does not really hold water.

postgresql.conf's location already rules them all.  Remember, my
proposal upthread was to have conf.d be located in the same directory
that holds postgresql.conf.


 Another point here is that hard-wiring a config directory location
 into the executables completely breaks many scenarios for running
 multiple clusters with the same executables.

Therefore the proposal is not to hardwire the location in the
executables.

A server with multiple clusters using the same executables needs a way
to specify different ports for each (at the very least); therefore they
already have separate postgresql.conf.  Each directory containing such
file can contain a conf.d subdirectory with extra stuff perfectly well.


Stephen Frost wrote:

 * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:

  In /etc you have /etc/postfix/ where you have Postfix files, and
  /etc/exim4 where you have Exim files, and /etc/apache2 or whatever.  And
  if each of those software packages uses a conf.d or conf-enabled
  directory, they use a hardcoded name for a place inside their own
  subdirectory, not one directly in /etc.
 
 I missed the part of this where you point out that Apache on Debian has
 some kind of problem because it's possible for an admin to remove the
 'IncludeDir sites-enabled' line from apache2.conf.

I don't see that this is parallel.  For one thing, I don't think you
would normally have multiple apache2.conf files.  On the other hand,
even if you had more than one, surely the one that's managed by the
packaging system, if there is one, is not going to want to have that
line changed, because that would immediately break all modules installed
by packages.


  What you don't have is /etc/exim4.conf and /etc/postfix.conf and
  /etc/apache.conf, where each of those files specifies that I would
  please like to have extra files loaded from /etc/exim.conf.d and
  /etc/apache.conf.d, and that packages please figure out by parsing my
  config file in whatever random format I have invented to figure out
  where snippets are to be stored.
 
 Right, there's absolutely zero cases of /etc/*.conf files these days.
 Nor are there any /etc/*.conf.d directories.
 
 /sarcasm

I 

Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 Stephen Frost wrote:
  I missed the part of this where you point out that Apache on Debian has
  some kind of problem because it's possible for an admin to remove the
  'IncludeDir sites-enabled' line from apache2.conf.
 
 I don't see that this is parallel.  For one thing, I don't think you
 would normally have multiple apache2.conf files.  In the other hand,
 even if you had more than one, surely the one that's managed by the
 packaging system, if there is one, is not going to want to have that
 line changed, because that would immediately break all modules installed
 by packages.

My point above was that having it configurable does *not* make it a
burden on packagers or tool authors.  If it did, the Debian folks would
be complaining to ASF to have the option removed and instead required.
Indeed, the fact that it's configurable, in my view, is how we ended up
with sites-available and sites-enabled in the first place.  Perhaps we'd
have gotten there without it, but it sure feels like having the option
was a help and not a hindrence.

 I just checked my system, and it seems to me that most cases that have
 .conf files and something.conf.d directly in /etc are more because of
 historical reasons than because they have made a conscious, contemporary
 decision to do that.  Do we want to replicate choices made in the
 1990's?

There are people who are still running OS's from the 1990's and forcing
what we feel is the 'right' answer on users is helping ourselves more
than helping our users, imv.

 Yeah, so instead of making a rational choice for people using all
 platforms, mirroring the rational choice made by Linux/Debian/GNU, lets
 make an irrational choice so that everybody will be constantly bitten by
 it, regardless of platform.

This argument that it only works if it's hard-coded simply doesn't fly
when you look at the existing examples in Debian.

  To allow admins to configure it?  Perhaps their policy is that it must
  be postgresql.conf.d
 
 This example sounds rather weak.  I mean, if they *must* have
 /etc/postgresql/9.2/main/postgresql.conf.d instead of
 /etc/postgresql/9.2/main/conf.d, why aren't they forced to have 

You're being completely Debian-centric here.  It's possible they've got
their own /etc directory structure which doesn't match your
expectations- and that's *fine* until we start forcing the issue on
them.  Perhaps they prefer to have /etc/postgresql.conf and
/etc/postgresql.conf.d, why should we be preventing them from doing so
just because the OS's and distro's that we run don't work that way?

  or that they want it served off of an NFS mount instead of being
  pulled from /etc,
 
 So conf.d would be in an NFS mount but postgresql.conf would not?

That's correct- and it wouldn't be mounted at
/etc/postgresql/9.2/main/conf.d, and please don't start in with the
symlinks argument- *that* would be a mess.

  I don't want PG randomly guessing at directories to go poking around in
  for config files.  It's also documentation.
 
 Documentation is fine.  We can have a comment at the top of
 postgresql.conf saying also see files in postgresql.conf.d/ or some
 such.  I can hardly thing that opening and reading a subdirectory named
 conf.d or postgresql.conf.d, *in the same directory as postgresql.conf*
 is randomly poking.

It's completely random when we have no idea where the postgresql.conf is
nor what other files or directories exist there today.  Perhaps it's
100% true that, nowhere on this planet, a postgresql.conf file and a
conf.d directory/file/socket/whatever share a directory, but I wouldn't
bet on it.  Hell, I'm half tempted to go make sure just to be sure to
prove anyone wrong who argues that case. ;)

Remember- these would not just be new installs, someone simply running
pg_upgrade to move to 9.4 would suddenly find their database server
trying to open a non-existant directory wherever their postgresql.conf
file resides.

Another issue is how does this scale up to directories for pg_hba.conf
or pg_ident.conf or friends- are we going to hard-code directories for
those too?

 If you want to argue that multiple lines setting the same variable
 should raise an error instead of last definition wins, please do
 propose that separately.

I've brought it up a number of times before- having the lines be in
different files just makes it that much worse.

  Which cluster directory is the packager of PostGIS going to put his
  config snippets into, exactly?  Even if you force the hard-code conf.d
  idea, the packager will still need to figure out what clusters exist.
 
 This was a thinko on my part, I admit, which doesn't invalidate the rest
 of my argumentation.  I think the way to think about this would be
 something like having a mode in pg_ctlcluster (for example) that enables
 postgis for a certain cluster, which means it drops a certain file in
 the cluster's conf.d directory, as I described above.

Boy does that ever sound hokey.

  

Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-16 Thread Heikki Linnakangas

On 01/16/2014 01:02 PM, David Rowley wrote:

sum(bigint) became a bit weird as it uses numeric types internally, so I
had to keep the do_numeric_discard() function to support it.


It's pretty weird that we have implemented sum(bigint) that way. I 
understand that the result is a numeric so that it won't overflow, but 
implementing it by converting every value to numeric is naive.


I propose that we reimplement sum(bigint) in a more efficient way: For 
the internal state, let's use an int8 and a numeric overflow field. The 
transition function adds to the int8 variable, and checks for overflow. 
On overflow, increment the numeric field by one. In the final function, 
multiply the numeric by 2^64, and add the residual int8 value.
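
Something along these lines, perhaps (a sketch only, outside the fmgr and
aggregate plumbing; __builtin_add_overflow is just a GCC/Clang shortcut for
the overflow test, an int64 counter stands in for the numeric field, and all
names here are mine):

#include <stdint.h>

typedef struct Int8SumState
{
	int64_t		residual;	/* running sum, wrapped modulo 2^64 */
	int64_t		overflows;	/* net wrap-arounds; may go negative */
} Int8SumState;

static void
int8_sum_step(Int8SumState *state, int64_t value)
{
	int64_t		wrapped;

	if (__builtin_add_overflow(state->residual, value, &wrapped))
	{
		/*
		 * The true sum differs from the wrapped value by exactly 2^64:
		 * positive overflow means we owe one extra 2^64, negative overflow
		 * means one less.
		 */
		state->overflows += (value > 0) ? 1 : -1;
	}
	state->residual = wrapped;
}

/*
 * Final function (conceptually): result = overflows * 2^64 + residual,
 * computed in numeric so it cannot overflow.
 */

That keeps the per-row work at plain int64 arithmetic and only touches
numeric once, at the very end.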


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 Another point here is that hard-wiring a config directory location
 into the executables completely breaks many scenarios for running
 multiple clusters with the same executables.

 Therefore the proposal is not to hardwire the location in the
 executables.

At least some people seem to be advocating that, even if I misunderstood
whether you were.

I'm fine if the proposal is that postgresql.conf include include_dir
conf.d by default (where that's read as relative to postgresql.conf's own
directory).  Even better if it's not terribly difficult for a packager to
change that, because I think some will want to.  We could possibly reduce
the need for packagers to change it if we made it be
include_dir postgresql.d, because conf.d is a damn generic name for
something that might be in the same /etc directory as configs for other
packages.
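
For illustration (not from the original mail): with the relative-to-postgresql.conf
reading described above, enabling this in a stock postgresql.conf would just be a
matter of one directive; the directory name here is only a placeholder for whatever
is finally agreed on.

    # read files ending in '.conf' from this directory, resolved relative
    # to the directory that holds postgresql.conf
    include_dir = 'conf.d'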

regards, tom lane

PS: off topic, but isn't ParseConfigDirectory leaking the result
of AbsoluteConfigLocation?  In both normal and error paths?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Josh Berkus
On 01/16/2014 10:46 AM, Tom Lane wrote:
 I'm fine if the proposal is that postgresql.conf include include_dir
 conf.d by default (where that's read as relative to postgresql.conf's own
 directory).  Even better if it's not terribly difficult for a packager to
 change that, because I think some will want to.  We could possibly reduce
 the need for packagers to change it if we made it be
 include_dir postgresql.d, because conf.d is a damn generic name for
 something that might be in the same /etc directory as configs for other
 packages.

FWIW, this is what I was proposing.  We have an include_dir
postgresql.conf.d currently in the stock postgresql.conf, but it's
disabled (commented out) by default.  I'd just like it enabled by
default, and to pass a suggestion to the packagers that they pick an
appropriate directory and enable it by default.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Stephen Frost
* Josh Berkus (j...@agliodbs.com) wrote:
 On 01/16/2014 10:46 AM, Tom Lane wrote:
  I'm fine if the proposal is that postgresql.conf include include_dir
  conf.d by default (where that's read as relative to postgresql.conf's own
  directory).  Even better if it's not terribly difficult for a packager to
  change that, because I think some will want to.  We could possibly reduce
  the need for packagers to change it if we made it be
  include_dir postgresql.d, because conf.d is a damn generic name for
  something that might be in the same /etc directory as configs for other
  packages.
 
 FWIW, this is what I was proposing.  We have an include_dir
 postgresql.conf.d currently in the stock postgresql.conf, but it's
 disabled (commented out) by default.  I'd just like it enabled by
 default, and to pass a suggestion to the packagers that they pick an
 appropriate directory and enable it by default.

To this point- I've asked the Debian packager Martin Pitt to review and
comment on this thread when he gets a chance.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] new json funcs

2014-01-16 Thread Peter Eisentraut
On 1/3/14, 9:00 PM, Andrew Dunstan wrote:
 
 Here is a patch for the new json functions I mentioned a couple of
 months ago. These are:
 
 json_to_record
 json_to_recordset
 json_object
 json_build_array
 json_build_object
 json_object_agg
 
 So far there are no docs, but the way these work is illustrated in the
 regression tests - I hope to have docs within a few days.

Compiler warnings:

json.c: In function ‘json_object_two_arg’:
json.c:2210:5: warning: unused variable ‘count’ [-Wunused-variable]

jsonfuncs.c: In function ‘json_to_record’:
jsonfuncs.c:1955:16: warning: unused variable ‘tuple’ [-Wunused-variable]
jsonfuncs.c:1953:18: warning: variable ‘rec’ set but not used 
[-Wunused-but-set-variable]

Also, please run your patch through git diff --check.  I have noticed
that several of your patches have hilarious whitespace, maybe
something with your editor.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-16 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 I propose that we reimplement sum(bigint) in a more efficient way: For 
 the internal state, let's use an int8 and a numeric overflow field. The 
 transition function adds to the int8 variable, and checks for overflow. 
 On overflow, increment the numeric field by one. In the final function, 
 multiply the numeric by 2^64, and add the residual int8 value.

It'd probably be sufficient to handle it as two int64 fields (handmade
128-bit arithmetic, or maybe even not so handmade if that ever gets
reasonably common among C compilers).  You're assuming the final output
is still numeric, right?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why conf.d should be default, and auto.conf and recovery.conf should be in it

2014-01-16 Thread Josh Berkus
On 01/16/2014 07:32 AM, Christian Kruse wrote:
 Hi Alvaro,
 
 On 16/01/14 10:21, Alvaro Herrera wrote:
 1. it is to be read automatically by the server without need for an
include_dir conf.d option in the main postgresql.conf.
 
 +1
 
 4. there is no point in disabling it, and thus we offer no mechanism
to do that.
 
 Not only is there „no point“ in disabling it, it makes this feature
 nearly useless. One can't rely on it if the distro may disable
 it. There are so many out there, it will never be a reliable feature
 if it can be disabled.

It would make *my* life vastly easier if we could mandate things like
the presence and relative directory of a conf.d.  However, if Apache
can't do it, we certainly can't.  Ultimately, we cannot impose things on
distributions which they are unwilling to support; Debian, for one, will
happily fork PostgreSQL rather than accept directory assignments which
don't meet their standards.

Also, enough people install PostgreSQL from source or using custom
packages to make for a high degree of variation anyway.

That's why I was just advocating changing the *defaults*, not mandating
anything.  Actual directory locations and usage should be configurable
by distros, packagers and users.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for fail-back without fresh backup

2014-01-16 Thread Jeff Janes
On Thu, Jan 16, 2014 at 9:37 AM, Andres Freund and...@2ndquadrant.com wrote:

 On 2014-01-16 09:25:51 -0800, Jeff Janes wrote:
  On Thu, Nov 21, 2013 at 2:43 PM, Andres Freund and...@2ndquadrant.com
 wrote:
 
   On 2013-11-21 14:40:36 -0800, Jeff Janes wrote:
But if the transaction would not have otherwise generated WAL (i.e. a
select that did not have to do any HOT pruning, or an update with
 zero
   rows
matching the where condition), doesn't it now have to flush and wait
 when
it would otherwise not?
  
   We short circuit that if there's no xid assigned. Check
   RecordTransactionCommit().
  
 
  It looks like that only short-circuits the flush if both there is no xid
  assigned, and !wrote_xlog.  (line 1054 of xact.c)

 Hm. Indeed. Why don't we just always use the async commit behaviour for
 that? I don't really see any significant dangers from doing so?


I think the argument is that drawing the next value from a sequence can
generate xlog that needs to be flushed, but doesn't assign an xid.

I would think the sequence should flush that record before it hands out the
value, not before the commit, but...





 It's also rather odd to use the sync rep mechanisms in such
 scenarios... The if() really should test markXidCommitted instead of
 wrote_xlog.

  I do see stalls on fdatasync on flush from select statements which had no
  xid, but did generate xlog due to HOT pruning, I don't see why WAL
 logging
  hint bits would be different.

 Are the stalls at commit or while the select is running? If wal_buffers
 is filled too fast, which can easily happen if loads of pages are hinted
 and wal logged, that will happen independently from
 RecordTransactionCommit().


In the real world, I'm not sure what the distribution is.

But in my present test case, they are coming almost exclusively from
RecordTransactionCommit.

I use pgbench -T10 in a loop to generate dirty data and checkpoints (with
synchronous_commit on but with a BBU), and then to probe the consequences I
use:

pgbench -T10 -S -n --startup='set synchronous_commit='$f

(where --startup is an extension to pgbench proposed a few months ago)

Running the select-only query with synchronous_commit off almost completely
isolates it from the checkpoint drama that otherwise has a massive effect
on it.  With synchronous_commit=on, it goes from 6000 tps normally to 30
tps during the checkpoint sync; with synchronous_commit=off it might dip to
4000 or so during the worst of it.

(To be clear, this is about the pruning, not the logging of the hint bits)
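
For illustration (not from the original mail), the test harness described above
looks roughly like the sketch below; the database name and the loop are made up,
and --startup is the proposed pgbench extension mentioned earlier, not a stock
option.

    # keep generating dirty data and checkpoints in the background
    while true; do pgbench -T10 bench; done &

    # probe read-only throughput under each synchronous_commit setting
    for f in on off; do
        pgbench -T10 -S -n --startup="set synchronous_commit=$f" bench
    done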

Cheers,

Jeff


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Peter Eisentraut
The documentation doesn't build:

openjade:config.sgml:1416:11:E: end tag for VARIABLELIST omitted, but OMITTAG 
NO was specified
openjade:config.sgml:1390:5: start tag was here



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for fail-back without fresh backup

2014-01-16 Thread Andres Freund
On 2014-01-16 11:01:29 -0800, Jeff Janes wrote:
 I think the argument is that drawing the next value from a sequence can
 generate xlog that needs to be flushed, but doesn't assign an xid.

Then that should assign an xid. Which would yield correct behaviour with
async commit where it's currently *not* causing a WAL flush at all
unless a page boundary is crossed.

I've tried arguing that way before...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-16 Thread Heikki Linnakangas

On 01/16/2014 08:59 PM, Tom Lane wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

I propose that we reimplement sum(bigint) in a more efficient way: For
the internal state, let's use an int8 and a numeric overflow field. The
transition function adds to the int8 variable, and checks for overflow.
On overflow, increment the numeric field by one. In the final function,
multiply the numeric by 2^64, and add the residual int8 value.


It'd probably be sufficient to handle it as two int64 fields (handmade
128-bit arithmetic, or maybe even not so handmade if that ever gets
reasonably common among C compilers).


True. That would be sufficient for summing 2^64 int8s of INT64_MAX. That 
sounds like enough, especially considering that count() will 
overflow after that too.



 You're assuming the final output is still numeric, right?


Yep.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-16 Thread Andres Freund
On 2014-01-16 21:07:33 +0200, Heikki Linnakangas wrote:
 On 01/16/2014 08:59 PM, Tom Lane wrote:
 Heikki Linnakangas hlinnakan...@vmware.com writes:
 I propose that we reimplement sum(bigint) in a more efficient way: For
 the internal state, let's use an int8 and a numeric overflow field. The
 transition function adds to the int8 variable, and checks for overflow.
 On overflow, increment the numeric field by one. In the final function,
 multiply the numeric by 2^64, and add the residual int8 value.
 
 It'd probably be sufficient to handle it as two int64 fields (handmade
 128-bit arithmetic, or maybe even not so handmade if that ever gets
 reasonably common among C compilers).
 
 True. That would be sufficient for summing 2^64 int8s of INT64_MAX. That
 sounds like enough, especially considering that that count() will overflow
 after that too.

You'll have to handle adding negative values and underflow as
well.
Maybe it's instead sufficient to just have a flag indicating that you're
working with a state that hasn't overflowed so far and use plain int8
math as long as that's the case, and entirely fall back to the current
path once overflowed. That will probably be slightly faster and easily
handle the majority of cases since overflowing int8 ought to be pretty
rare in the real world.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for fail-back without fresh backup

2014-01-16 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 I think the argument is that drawing the next value from a sequence can
 generate xlog that needs to be flushed, but doesn't assign an xid.

 I would think the sequence should flush that record before it hands out the
 value, not before the commit, but...

IIRC the argument was that we'd flush WAL before any use of the value
could make it to disk.  Which is true if you're just inserting it into
a table; perhaps less so if the client is doing something external to
the database with it.  (But it'd be reasonable to say that clients
who want a guaranteed-good serial for such purposes should have to
commit the transaction that created the value.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-16 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 You'll have to handle adding negative values and underflow as
 well.

Right.

 Maybe it's instead sufficient to just have flag indicating that you're
 working with a state that hasn't overflowed so far and just plain int8
 math as long as that's the case, and entirely fall back to the current
 path once overflowed. That will probably be slightly faster and easily
 handle the majority of cases since overflowing int8 ought to be pretty
 rare in the real world.

Dunno, I think that a transition state containing both an int64 and
a (presumably separately palloc'd) numeric will be a real PITA.
And it will not be faster, because the principal drag on performance
will just be the overflow test, which you have to do either way.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dblink performance regression

2014-01-16 Thread Josh Berkus
On 12/07/2013 05:50 PM, Joe Conway wrote:
 On 12/07/2013 05:41 PM, Fabrízio de Royes Mello wrote:
 
 On Sat, Dec 7, 2013 at 11:20 PM, Michael Paquier 
 michael.paqu...@gmail.com
 wrote:

 IMHO it is more elegant to create a procedure to encapsulate the code
 to avoid redundancy.
 Yep, perhaps something like PQsetClientEncodingIfDifferent or
 similar would make sense.
 
 Well I think at this first moment we can just create a procedure
 inside the dblink contrib and not touch libpq.
 
 Maybe a libpq function could be done for 9.4, but not for back branches.
 
 I don't think it makes sense to create a new function in dblink either
 -- we're only talking about two lines of added redundancy which is
 fewer lines of code than a new function would add. But if we create
 PQsetClientEncodingIfDifferent() (or whatever) we can remove those
 extra lines in 9.4 ;-)

Hey, since we're about to do 9.3.3: was this patch ever committed?


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Relocation of tablespaces in pg_basebackup

2014-01-16 Thread Peter Eisentraut
You appear to be generating your patches with git diff --no-prefix.
Don't do that, leave the a/ and b/ in.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-16 Thread Andres Freund
On 2014-01-16 14:23:47 -0500, Tom Lane wrote:
  Maybe it's instead sufficient to just have flag indicating that you're
  working with a state that hasn't overflowed so far and just plain int8
  math as long as that's the case, and entirely fall back to the current
  path once overflowed. That will probably be slightly faster and easily
  handle the majority of cases since overflowing int8 ought to be pretty
  rare in the real world.
 
 Dunno, I think that a transition state containing both an int64 and
 a (presumably separately palloc'd) numeric will be a real PITA.

Yea, not sure myself. I just dislike the idea of having a good part of a
128bit math implementation for a single transition function.

Another alternative would be a configure check for compiler/native
128bit math and fall back to the current implementation if none is
provided... That should give decent performance with a pretty low amount
of code for most platforms.
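
For illustration (not from the original mail): such a probe could simply test for
the compiler's __int128 (e.g. autoconf's AC_CHECK_TYPES([__int128])), and where it
is available the transition step collapses to something like the sketch below.
PG_HAVE_INT128 and the function name are made up.

    #include <stdint.h>

    #ifdef PG_HAVE_INT128               /* hypothetical symbol set by configure */
    typedef __int128 int128_sum_state;

    static void
    int8_sum_step_128(int128_sum_state *state, int64_t newval)
    {
        /* a 128-bit accumulator cannot overflow before 2^64 int8 values
         * have been added, so no overflow check is needed on this path */
        *state += newval;
    }
    #else
    /* fall back to the existing numeric-based transition function */
    #endif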

 And it will not be faster, because the principal drag on performance
 will just be the overflow test, which you have to do either way.

Well, you don't need to check the second variable for lots of
operations. Say, the current sum is 0 and you add a -1. With the two
variables scheme that requires checking the second variable,
manipulating it etc.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Review: [HACKERS] ECPG infrastructure changes part 1, was: Re: ECPG fixes

2014-01-16 Thread Alvaro Herrera
Boszormenyi Zoltan escribió:

 All patches are attached again for completeness.

Thanks.  I pushed a commit comprising patches 09 through 14.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Backup throttling

2014-01-16 Thread Andres Freund
Hi,

On 2014-01-15 18:52:32 -0300, Alvaro Herrera wrote:
 Another thing I found a bit strange was the use of the latch.  What this
 patch does is create a separate latch which is used for the throttling.
 This means that if the walsender process receives a signal, it will not
 wake up if it's sleeping in throttling.  Perhaps this is okay: as Andres
 was quoted upthread as saying, maybe this is not a problem because the
 sleep times are typically short anyway.  But we're pretty much used to
 the idea that whenever a signal is sent, processes act on it
 *immediately*.  Maybe some admin will not feel comfortable about waiting
 some extra 20ms when they cancel their base backups.  In any case,
 having a secondary latch to sleep on in a process seems weird.  Maybe
  this should be using MyWalSnd->latch somehow.

Yes, this definitely should reuse MyWalSnd->latch.

slightly related: we should start to reuse procLatch for walsenders
instead of having a separate latch someday.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Display oprcode and its volatility in \do+

2014-01-16 Thread Tom Lane
I wrote:
 Anybody else have an opinion?

Given the lack of other votes, I pushed this without a volatility column,
and with some other changes --- mostly, I kept the Description column
last, because that's how all the other \d commands do it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-16 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-01-16 14:23:47 -0500, Tom Lane wrote:
 Dunno, I think that a transition state containing both an int64 and
 a (presumably separately palloc'd) numeric will be a real PITA.

 Yea, not sure myself. I just dislike the idea of having a good part of a
 128bit math implementation for a single transition function.

Not sure how you figure that we need very much new code beyond the
overflow test.

 Well, you don't need to check the second variable for lots of
 operations. Say, the current sum is 0 and you add a -1. With the two
 variables scheme that requires checking the second variable,
 manipulating it etc.

I'm envisioning just

    state->lowhalf += input;
    if (overflowed_up)
        state->highhalf++;
    else if (overflowed_down)
        state->highhalf--;

The only thing that might take a moment's thought, or extra cycles in the
normal case, is extending the overflow test so that it can tell whether
we need to increment or decrement the upper half.

[ thinks a bit... ]  Note that I'm supposing that the state is defined
as (highhalf * 2^64) + lowhalf, not that we need the two variables to
be exactly a 128-bit twos-complement value, which is what I think
you're talking about.
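
For illustration (not from the original mail), a minimal sketch of that scheme,
with the state defined as highhalf * 2^64 + lowhalf as above; the low half is kept
unsigned so the wraparound used to detect carry/borrow is well defined in C, and
the final function (producing a numeric) is omitted. Names are made up.

    #include <stdint.h>

    typedef struct Int8SumState
    {
        int64_t  highhalf;      /* how many times the low half has wrapped */
        uint64_t lowhalf;       /* low 64 bits of the running sum */
    } Int8SumState;

    static void
    int8_sum_step(Int8SumState *state, int64_t newval)
    {
        uint64_t    old = state->lowhalf;

        state->lowhalf += (uint64_t) newval;    /* unsigned wraparound is defined */

        if (newval >= 0)
        {
            if (state->lowhalf < old)           /* carried past 2^64 */
                state->highhalf++;
        }
        else if (state->lowhalf > old)          /* borrowed below zero */
            state->highhalf--;
    }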

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Display oprcode and its volatility in \do+

2014-01-16 Thread Marko Tiikkaja

On 1/16/14, 9:32 PM, Tom Lane wrote:

Given the lack of other votes, I pushed this without a volatility column,
and with some other changes --- mostly, I kept the Description column
last, because that's how all the other \d commands do it.


Thanks!  And looks like I missed the documentation as well, sorry about 
that. :-(



Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dblink performance regression

2014-01-16 Thread Joe Conway

yes

Joe

Sent with AquaMail for Android
http://www.aqua-mail.com


On January 16, 2014 2:32:55 PM Josh Berkus j...@agliodbs.com wrote:


On 12/07/2013 05:50 PM, Joe Conway wrote:
 On 12/07/2013 05:41 PM, Fabrízio de Royes Mello wrote:
  On Sat, Dec 7, 2013 at 11:20 PM, Michael Paquier  
michael.paqu...@gmail.com

 wrote:

 IMHO it is more elegant to create a procedure to encapsulate the code
 to avoid redundancy.
 Yep, perhaps something like PQsetClientEncodingIfDifferent or
 similar would make sense.
  Well I think at this first moment we can just create a procedure
 inside the dblink contrib and not touch libpq.
 Maybe a libpq function could be done for 9.4, but not for back branches.
 I don't think it makes sense to create a new function in dblink either
 -- we're only talking about two lines of added redundancy which is
 fewer lines of code than a new function would add. But if we create
 PQsetClientEncodingIfDifferent() (or whatever) we can remove those
 extra lines in 9.4 ;-)

Hey, since we're about to do 9.3.3: was this patch ever committed?


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Review: [HACKERS] ECPG infrastructure changes part 1, was: Re: ECPG fixes

2014-01-16 Thread Alvaro Herrera
Alvaro Herrera escribió:
 Boszormenyi Zoltan escribió:
 
  All patches are attached again for completeness.
 
 Thanks.  I pushed a commit comprising patches 09 through 14.

Now also pushed 15 to 17.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Backup throttling

2014-01-16 Thread Peter Geoghegan
On Thu, Jan 16, 2014 at 12:03 PM, Andres Freund and...@2ndquadrant.com wrote:
 slightly related: we should start to reuse procLatch for walsenders
 instead of having a separate latch someday.

+1. The potential for bugs from failing to account for this within
signal handlers seems like a concern. I think that every process
should use the process latch, except for the archiver which uses a
local latch because it pointedly does not touch shared memory. I think
I wrote a comment that made it into the latch header comments
encouraging this, but never saw to it that it was universally adhered
to.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] ALTER TABLE ... SET TABLESPACE pg_default

2014-01-16 Thread Stephen Frost
Greetings,

  Harking back to 10 years ago when tablespaces were added, it looks
  like we originally figured that users didn't need permissions to
  create tables in the database's default tablespace, per 2467394e.  That strikes me
  as perfectly fair.  Unfortunately, the later addition of
  ALTER TABLE ... SET TABLESPACE (af4de814) didn't get the memo about
  the default tablespace being special in this regard and refuses to let
  a user move their tables into the default tablespace, even though they
  can do so via 'CREATE TABLE ... AS SELECT * FROM ...'.

  Barring objections, I'll add the same conditional around the AclCheck
  in ATPrepSetTableSpace() as exists in DefineRelation() to allow users
  to ALTER TABLE ... SET TABLESPACE into the database's default
  tablespace and backpatch accordingly.
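
For illustration (not from the original mail), the asymmetry as seen by an
unprivileged user whose table currently lives in some other tablespace; object
names are made up and the error text is approximate.

    ALTER TABLE mytab SET TABLESPACE pg_default;
    -- ERROR:  permission denied for tablespace pg_default

    -- yet the same data can already be moved there by rewriting the table:
    CREATE TABLE mytab_copy AS SELECT * FROM mytab;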

Thanks,

Stephen


signature.asc
Description: Digital signature


[HACKERS] ALTER TABLESPACE ... MOVE ALL TO ...

2014-01-16 Thread Stephen Frost
Greetings,

  It's a day late and I'm a dollar short, but attached is a (very) minor
  patch to allow users to more easily move their various objects from
  one tablespace to another.  Included are docs and a regression test;
  I'm happy to improve on both should folks send me suggestions.

  As we use tablespaces quite a bit, this can be extremely handy for us
  and I expect others will find it useful too.

  Thoughts?

Thanks,

Stephen
diff --git a/doc/src/sgml/ref/alter_tablespace.sgml b/doc/src/sgml/ref/alter_tablespace.sgml
new file mode 100644
index 7d3ee2c..fcf4155 100644
*** a/doc/src/sgml/ref/alter_tablespace.sgml
--- b/doc/src/sgml/ref/alter_tablespace.sgml
*** PostgreSQL documentation
*** 12,18 
  
   refnamediv
refnameALTER TABLESPACE/refname
!   refpurposechange the definition of a tablespace/refpurpose
   /refnamediv
  
   indexterm zone=sql-altertablespace
--- 12,18 
  
   refnamediv
refnameALTER TABLESPACE/refname
!   refpurposechange the definition of a tablespace or affect objects of a tablespace/refpurpose
   /refnamediv
  
   indexterm zone=sql-altertablespace
*** ALTER TABLESPACE replaceablename/repl
*** 25,30 
--- 25,31 
  ALTER TABLESPACE replaceablename/replaceable OWNER TO replaceablenew_owner/replaceable
  ALTER TABLESPACE replaceablename/replaceable SET ( replaceable class=PARAMETERtablespace_option/replaceable = replaceable class=PARAMETERvalue/replaceable [, ... ] )
  ALTER TABLESPACE replaceablename/replaceable RESET ( replaceable class=PARAMETERtablespace_option/replaceable [, ... ] )
+ ALTER TABLESPACE replaceablename/replaceable MOVE ALL TO replaceablenew_tablespace/replaceable [ NOWAIT ]
  /synopsis
   /refsynopsisdiv
  
*** ALTER TABLESPACE replaceablename/repl
*** 32,46 
titleDescription/title
  
para
!commandALTER TABLESPACE/command changes the definition of
!a tablespace.
/para
  
para
!You must own the tablespace to use commandALTER TABLESPACE/.
 To alter the owner, you must also be a direct or indirect member of the new
 owning role.
 (Note that superusers have these privileges automatically.)
/para
   /refsect1
  
--- 33,60 
titleDescription/title
  
para
!commandALTER TABLESPACE/command can be used to change the definition of
!a tablespace or to migrate all of the objects in the current database which
!are owned by the user out of a given tablespace.
/para
  
para
!You must own the tablespace to change the definition of a tablespace.
 To alter the owner, you must also be a direct or indirect member of the new
 owning role.
 (Note that superusers have these privileges automatically.)
+ 
+Users may use ALTER TABLESPACE ... MOVE ALL, but they must have CREATE
+rights on the new tablespace and only objects, directly or indirectly, owned
+by the user will be moved.  Note that the superuser is considered an owner
+of all objects and therefore an ALTER TABLESPACE ... MOVE ALL issued by the
+superuser will move all objects in the current database which are in the
+tablespace.
+ 
+System catalogs will not be moved by this command- individuals wishing to
+move a whole database should use ALTER DATABASE, or call ALTER TABLE on the
+individual system catalogs.  Note that relations in literalinformation_schema/literal
+will be moved, just as any other normal database objects.
/para
   /refsect1
  
*** ALTER TABLESPACE replaceablename/repl
*** 94,99 
--- 108,137 
   /para
  /listitem
 /varlistentry
+ 
+varlistentry
+ termreplaceable class=parameternew_tablespace/replaceable/term
+ listitem
+  para
+   The name of the tablespace to move objects into.  The user must have
+   CREATE rights on the new tablespace to move objects into that
+   tablespace, unless the tablespace being moved into is the default
+   tablespace for the database connected to.
+  /para
+ /listitem
+/varlistentry
+ 
+varlistentry
+ termreplaceable class=parameterNOWAIT/replaceable/term
+ listitem
+  para
+   The NOWAIT option causes the ALTER TABLESPACE command to fail immediately
+   if it is unable to acquire the necessary lock on all of the objects being
+   moved.
+  /para
+ /listitem
+/varlistentry
+ 
/variablelist
   /refsect1
  
*** ALTER TABLESPACE index_space RENAME TO f
*** 112,117 
--- 150,162 
  programlisting
  ALTER TABLESPACE index_space OWNER TO mary;
  /programlisting/para
+ 
+   para
+Move all of the objects which I own from the default tablespace to
+the literalfast_raid/literal tablespace:
+ programlisting
+ ALTER TABLESPACE pg_default MOVE ALL TO fast_raid;
+ /programlisting/para
   /refsect1
  
   refsect1
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
new file mode 100644
index 

Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Jan Kara
On Wed 15-01-14 21:37:16, Robert Haas wrote:
 On Wed, Jan 15, 2014 at 8:41 PM, Jan Kara j...@suse.cz wrote:
  On Wed 15-01-14 10:12:38, Robert Haas wrote:
  On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara j...@suse.cz wrote:
   Filesystems could in theory provide facility like atomic write (at least 
   up
   to a certain size say in MB range) but it's not so easy and when there 
   are
   no strong usecases fs people are reluctant to make their code more 
   complex
   unnecessarily. OTOH without widespread atomic write support I understand
   application developers have similar stance. So it's kind of chicken and 
   egg
   problem. BTW, e.g. ext3/4 has quite a bit of the infrastructure in place
   due to its data=journal mode so if someone on the PostgreSQL side wanted 
   to
   research on this, knitting some experimental ext4 patches should be 
   doable.
 
  Atomic 8kB writes would improve performance for us quite a lot.  Full
  page writes to WAL are very expensive.  I don't remember what
  percentage of write-ahead log traffic that accounts for, but it's not
  small.
OK, and do you need atomic writes on per-IO basis or per-file is enough?
  It basically boils down to - is all or most of IO to a file going to be
  atomic or it's a smaller fraction?
 
 The write-ahead log wouldn't need it, but data files writes would.  So
 we'd need it a lot, but not for absolutely everything.
 
 For any given file, we'd either care about writes being atomic, or we
 wouldn't.
  OK, when you say that either all writes to a file should be atomic or
none of them should be, then can you try the following:
chattr +j file

  will turn on data journalling for the file on an ext3/ext4 filesystem.
Currently it *won't* guarantee atomicity in all cases, but the
performance will be very similar to what it would be if it did. You might also
want to increase the filesystem journal size with 'tune2fs -J size=XXX /dev/yyy',
where XXX is the desired journal size in MB. The default is 128 MB I think, but
with intensive data journalling you might want to have that in the GB range. I'd
be interested in hearing what impact turning on 'atomic write' support in
PostgreSQL and using data journalling on ext4 has.
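
For illustration (not from the original mail), the experiment could be wired up
roughly as below; the device, journal size and data directory paths are made up,
and resizing the journal with tune2fs requires the filesystem to be unmounted (or
mounted read-only).

    # enlarge the ext4 journal to 1 GB
    tune2fs -J size=1024 /dev/sdb1

    # turn on per-file data journalling for one database's relation files
    find /var/lib/pgsql/9.3/data/base/16384 -type f -exec chattr +j {} +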

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Jan Kara
On Wed 15-01-14 10:12:38, Robert Haas wrote:
 On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara j...@suse.cz wrote:
  Filesystems could in theory provide facility like atomic write (at least up
  to a certain size say in MB range) but it's not so easy and when there are
  no strong usecases fs people are reluctant to make their code more complex
  unnecessarily. OTOH without widespread atomic write support I understand
  application developers have similar stance. So it's kind of chicken and egg
  problem. BTW, e.g. ext3/4 has quite a bit of the infrastructure in place
  due to its data=journal mode so if someone on the PostgreSQL side wanted to
  research on this, knitting some experimental ext4 patches should be doable.
 
 Atomic 8kB writes would improve performance for us quite a lot.  Full
 page writes to WAL are very expensive.  I don't remember what
 percentage of write-ahead log traffic that accounts for, but it's not
 small.
  OK, and do you need atomic writes on per-IO basis or per-file is enough?
It basically boils down to - is all or most of IO to a file going to be
atomic or it's a smaller fraction?

As Dave notes, unless there is HW support (which is coming with newest
solid state drives), ext4/xfs will have to implement this by writing data
to a filesystem journal and after transaction commit checkpointing them to
a final location. Which is exactly what you do with your WAL logs so
it's not clear it will be a performance win. But it is easy enough to code
for ext4 that I'm willing to try...

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Dave Chinner
On Wed, Jan 15, 2014 at 07:31:15PM -0500, Tom Lane wrote:
 Dave Chinner da...@fromorbit.com writes:
  On Wed, Jan 15, 2014 at 07:08:18PM -0500, Tom Lane wrote:
  No, we'd be happy to re-request it during each checkpoint cycle, as
  long as that wasn't an unduly expensive call to make.  I'm not quite
  sure where such requests ought to live though.  One idea is to tie
  them to file descriptors; but the data to be written might be spread
  across more files than we really want to keep open at one time.
 
  It would be a property of the inode, as that is how writeback is
  tracked and timed. Set and queried through a file descriptor,
  though - it's basically the same context that fadvise works
  through.
 
 Ah, got it.  That would be fine on our end, I think.
 
  We could probably live with serially checkpointing data
  in sets of however-many-files-we-can-have-open, if file descriptors are
  the place to keep the requests.
 
  Inodes live longer than file descriptors, but there's no guarantee
  that they live from one fd context to another. Hence my question
  about persistence ;)
 
 I plead ignorance about what an fd context is.

open-to-close life time.

fd = open("some/file", ...);
...
close(fd);

is a single context. If multiple fd contexts of the same file
overlap in lifetime, then the inode is constantly referenced and the
inode won't get reclaimed so the value won't get lost. However, is
there is no open fd context, there are no external references to the
inode so it can get reclaimed. Hence there's not guarantee that the
inode is present and the writeback property maintained across
close-to-open timeframes.

 We're ahead of the game as long as it usually works.

*nod*

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Jeff Layton
On Wed, 15 Jan 2014 21:37:16 -0500
Robert Haas robertmh...@gmail.com wrote:

 On Wed, Jan 15, 2014 at 8:41 PM, Jan Kara j...@suse.cz wrote:
  On Wed 15-01-14 10:12:38, Robert Haas wrote:
  On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara j...@suse.cz wrote:
   Filesystems could in theory provide facility like atomic write (at least 
   up
   to a certain size say in MB range) but it's not so easy and when there 
   are
   no strong usecases fs people are reluctant to make their code more 
   complex
   unnecessarily. OTOH without widespread atomic write support I understand
   application developers have similar stance. So it's kind of chicken and 
   egg
   problem. BTW, e.g. ext3/4 has quite a bit of the infrastructure in place
   due to its data=journal mode so if someone on the PostgreSQL side wanted 
   to
   research on this, knitting some experimental ext4 patches should be 
   doable.
 
  Atomic 8kB writes would improve performance for us quite a lot.  Full
  page writes to WAL are very expensive.  I don't remember what
  percentage of write-ahead log traffic that accounts for, but it's not
  small.
OK, and do you need atomic writes on per-IO basis or per-file is enough?
  It basically boils down to - is all or most of IO to a file going to be
  atomic or it's a smaller fraction?
 
 The write-ahead log wouldn't need it, but data files writes would.  So
 we'd need it a lot, but not for absolutely everything.
 
 For any given file, we'd either care about writes being atomic, or we 
 wouldn't.
 

Just getting caught up on this thread. One thing that you're just now
getting to here is that the different types of files in the DB have
different needs.

It might be good to outline each type of file (WAL, data files, tmp
files), what sort of I/O patterns are typically done to them, and what
sort of special needs they have (atomicity or whatever). Then we
could treat each file type as a separate problem, which may make some
of these problems easier to solve.

For instance, typically a WAL would be fairly sequential I/O, whereas
the data files are almost certainly random. It may make sense to
consider DIO for some of these use-cases, even if it's not suitable
everywhere.

For tempfiles, it may make sense to consider housing those on tmpfs.
They wouldn't go to disk at all that way, but if there is mem pressure
they could get swapped out (maybe this is standard practice already --
I don't know).
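
For illustration (not from the original mail), a crude way to try the tmpfs idea
with stock PostgreSQL is to point the existing temp_tablespaces machinery at a
RAM-backed mount; paths and sizes are made up, and since a tablespace on tmpfs
vanishes at reboot this is strictly an experiment, not a recommendation.

    # as root: RAM-backed mount for temporary files
    mkdir -p /mnt/pgtemp
    mount -t tmpfs -o size=4G,mode=0700 tmpfs /mnt/pgtemp
    chown postgres:postgres /mnt/pgtemp

    -- then, as a superuser in psql:
    CREATE TABLESPACE temp_space LOCATION '/mnt/pgtemp';
    -- and in postgresql.conf:
    --   temp_tablespaces = 'temp_space'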

  As Dave notes, unless there is HW support (which is coming with newest
  solid state drives), ext4/xfs will have to implement this by writing data
  to a filesystem journal and after transaction commit checkpointing them to
  a final location. Which is exactly what you do with your WAL logs so
  it's not clear it will be a performance win. But it is easy enough to code
  for ext4 that I'm willing to try...
 
 Yeah, hardware support would be great.
 


-- 
Jeff Layton jlay...@redhat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-16 Thread Theodore Ts'o
On Wed, Jan 15, 2014 at 10:35:44AM +0100, Jan Kara wrote:
 Filesystems could in theory provide facility like atomic write (at least up
 to a certain size say in MB range) but it's not so easy and when there are
 no strong usecases fs people are reluctant to make their code more complex
 unnecessarily. OTOH without widespread atomic write support I understand
 application developers have similar stance. So it's kind of chicken and egg
 problem. BTW, e.g. ext3/4 has quite a bit of the infrastructure in place
 due to its data=journal mode so if someone on the PostgreSQL side wanted to
 research on this, knitting some experimental ext4 patches should be doable.

For the record, a researcher (plus his PhD student) at HP Labs actually
implemented a prototype based on ext3 which created an atomic write
facility.  It was good up to about 25% of the ext4 journal size (so, a
couple of MB), and it was used to research using persistent memory by
creating a persistent heap using standard in-memory data structures as
a replacement for using a database.

The results of their research showed that ext3 plus
atomic write plus standard Java associative arrays beat using SQLite.

It was a research prototype, so they didn't handle OOM kill
conditions, and they also didn't try benchmarking against a real
database instead of a toy database such as SQLite, but if someone
wants to experiment with atomic write, there are patches against ext3
that we can probably get from HP Labs.

  - Ted


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.3.2 Client-only installation per docs fails creating directory.

2014-01-16 Thread Mike Blackwell
Per the docs (15.4 Installation Procedure, Client-only installation), after
running make, the following should work:

gmake -C src/bin install
gmake -C src/include install
gmake -C src/interfaces install
gmake -C doc install


However, it fails on the first command:

[postgresql-9.3.2]$ sudo gmake -C src/bin install
gmake: Entering directory `/opt/postgresql-9.3.2/src/bin'
gmake -C initdb install
gmake[1]: Entering directory `/opt/postgresql-9.3.2/src/bin/initdb'
gmake -C ../../../src/port all
[successful stuff cut here]
/bin/mkdir -p '/usr/local/pgsql-9.3.2/bin'
/usr/bin/install -c  psql '/usr/local/pgsql-9.3.2/bin/psql'
/usr/bin/install -c -m 644 ./psqlrc.sample
'/usr/local/pgsql-9.3.2/share/psqlrc.sample'
/usr/bin/install: cannot create regular file
`/usr/local/pgsql-9.3.2/share/psqlrc.sample': No such file or directory
gmake[1]: *** [install] Error 1
gmake[1]: Leaving directory `/opt/postgresql-9.3.2/src/bin/psql'
gmake: *** [install-psql-recurse] Error 2
gmake: Leaving directory `/opt/postgresql-9.3.2/src/bin'

The directory 'share' does not exist, which seems to be the issue.  Is there
a missing dependency somewhere?

It appears the doc install correctly creates 'share', so installing src/bin
last works.
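
For illustration (not from the original mail), two workarounds that follow from
the observation above; the installation prefix is the one from the transcript.

    # either run the pieces in an order that creates 'share' before src/bin ...
    gmake -C doc install
    gmake -C src/include install
    gmake -C src/interfaces install
    gmake -C src/bin install

    # ... or simply pre-create the missing directory first
    mkdir -p /usr/local/pgsql-9.3.2/share
    gmake -C src/bin install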

__
Mike Blackwell | Technical Analyst, Distribution Services/Rollout
Management | RR Donnelley
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
mike.blackw...@rrd.com
http://www.rrdonnelley.com




Re: [HACKERS] [PATCH] Filter error log statements by sqlstate

2014-01-16 Thread Tom Lane
Oskari Saarenmaa o...@ohmu.fi writes:
 [ 0001-Filter-error-log-statements-by-sqlstate.patch ]

I looked at this patch.  It took me some time to understand that what
it actually does has got approximately nothing to do with what one might
first expect: rather than suppressing the entire log message about some
error, it only suppresses printing of the triggering SQL statement
(if that's available to begin with).  The proposed documentation is
certainly not clear enough on that point, and the API which appears to
allow changing the error severity level associated with a SQLSTATE isn't
exactly helping to clarify things either.

Also, the patch claims it allows logging of statements that otherwise
wouldn't get logged, but AFAICS that's wrong, because we'll never get to
this code at all if errstart decides we're not going to log the message.

I find it hard to follow exactly what the use-case for this is; could
you make a defense of why we should carry a feature like this?

In any case, I'm finding it hard to believe that filtering by individual
SQLSTATEs is a practical API.  When we've discussed providing better log
filtering in the past, that approach was invariably dismissed on the
grounds that it would be far too unwieldy to use --- any DBA attempting to
use it in anger would end up with a huge and ever-incomplete list of
SQLSTATEs he'd need to filter.  A long time ago I suggested that filtering
by SQLSTATE class (the first two characters of SQLSTATE) might be useful,
but I'm not sure I still believe that, and anyway it's not what's
implemented here.

I'm concerned also about the potential performance impact of this patch,
if used with a SQLSTATE list as long as I think they're likely to get in
practice.  Have you done any performance testing?

As far as the code goes, bit manipulations on uint64s are a pretty crummy
substitute for defining a struct with a couple of fields; and the way
you're choosing to sort the array seems mighty inefficient, as well as
probably wrong in detail (why is the loop ignoring the first array
element?); and rather than make fragile assumptions about the maximum
length of an elevel name, why not just leave the user's string as-is?
But I wouldn't advise worrying about code style issues until we decide if
we're accepting the feature.  Right now my inclination is to reject it.
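
(For illustration, not from the original mail: the struct with a couple of fields
mentioned above could be as simple as the sketch below; the names are made up.)

    typedef struct
    {
        char    sqlstate[5];    /* five-character SQLSTATE, not NUL-terminated */
        int     elevel;         /* action/level associated with this SQLSTATE */
    } SqlstateFilterEntry;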

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Filter error log statements by sqlstate

2014-01-16 Thread Peter Eisentraut
Please fix the compiler warning:

elog.c: In function ‘check_log_sqlstate_error’:
elog.c:3789:41: warning: ‘new_newval_end’ may be used uninitialized in
this function [-Wuninitialized]


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ECPG FETCH readahead, was: Re: ECPG fixes

2014-01-16 Thread Alvaro Herrera
Boszormenyi Zoltan escribió:
 Rebased patches after the regression test and other details were fixed
 in the infrastructure part.

This thread started in 2010, and various pieces have been applied
already and some others have changed in nature.  Would you please post a
new patchset, containing rebased patches that still need application, in
a new email thread to be linked in the commitfest entry?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

