Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Brendan Jurd
On 15 June 2013 14:43, Craig Ringer cr...@2ndquadrant.com wrote:
 The #1 question I see on Stack Overflow has to be confusion about
 pg_hba.conf, mostly from people who have no idea it exists, don't understand
 how to configure it, etc.

The totally non-obvious name of the file probably has something to do
with that.  It should be called 'auth.conf'.

Cheers,
BJ




Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Craig Ringer
On 06/15/2013 02:08 PM, Brendan Jurd wrote:
 On 15 June 2013 14:43, Craig Ringer cr...@2ndquadrant.com wrote:
 The #1 question I see on Stack Overflow has to be confusion about
 pg_hba.conf, mostly from people who have no idea it exists, don't understand
 how to configure it, etc.
 The totally non-obvious name of the file probably has something to do
 with that.  It should be called 'auth.conf'.
Not convinced; since it only controls one facet of auth - it doesn't
define users, passwords, grants, etc - that'd probably be just as confusing.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Brendan Jurd
On 15 June 2013 16:18, Craig Ringer cr...@2ndquadrant.com wrote:
 On 06/15/2013 02:08 PM, Brendan Jurd wrote:
 On 15 June 2013 14:43, Craig Ringer cr...@2ndquadrant.com wrote:
 The #1 question I see on Stack Overflow has to be confusion about
 pg_hba.conf, mostly from people who have no idea it exists, don't understand
 how to configure it, etc.
 The totally non-obvious name of the file probably has something to do
 with that.  It should be called 'auth.conf'.
 Not convinced; since it only controls one facet of auth - it doesn't
 define users, passwords, grants, etc ...

When somebody is setting up postgres for the first time, and they list
the contents of the config directory, you want them to have some idea
what each of the files is for.  If they see something called
'auth.conf', they'll get the right general idea.  An understanding of
the nuances (like that it doesn't control user accounts) will come
once they open up the file -- which they may well do, because it is
called 'auth.conf', and 'auth' is a thing you want to configure.

If they see something called 'pg_hba.conf', they may very reasonably
assume that it is some internal/advanced stuff that they don't need to
worry about just yet, because what the heck is a 'pg_hba'?  The 'pg'
is unnecessary and the 'hba' is an internal jargon term that we've
ill-advisedly allowed to leak out into the filename.

If you really feel that 'auth.conf' is too imprecise, maybe something
like 'conn-auth.conf' would be more your style.

Cheers,
BJ




Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Stefan Drees

On 2013-06-15 08:44 CEST, Brendan Jurd wrote:

On 15 June 2013 16:18, Craig Ringer ... wrote:

On 06/15/2013 02:08 PM, Brendan Jurd wrote:

On 15 June 2013 14:43, Craig Ringer ... wrote:

The #1 question I see on Stack Overflow has to be confusion about
pg_hba.conf, mostly from people who have no idea it exists, don't understand
how to configure it, etc.

The totally non-obvious name of the file probably has something to do
with that.  It should be called 'auth.conf'.

Not convinced; since it only controls one facet of auth - it doesn't
define users, passwords, grants, etc ...


When somebody is setting up postgres for the first time, and they list
the contents of the config directory, you want them to have some idea


as they may not have read up to section 19.1 "The pg_hba.conf File" inside 
chapter 19 "Client Authentication" of part III "Server Administration" :-?, 
which states (as of 9.2.4):


Client authentication is controlled by a configuration file, which 
traditionally is named pg_hba.conf and is stored in the database 
cluster's data directory. (HBA stands for host-based authentication.) A 
default pg_hba.conf file is installed when the data directory is 
initialized by initdb. It is possible to place the authentication 
configuration file elsewhere, however; see the hba_file configuration 
parameter. ...



;-) Thanks to hyperlinks this is quite close to the start, but I was 
surprised that I did not find it by skimming the text and following the 
hyperlinks, but only by already knowing the filename and entering it 
(pg_hba.conf) into the "Search Documentation" text field in the top 
right corner of http://www.postgresql.org/docs/9.2/interactive/index.html.
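
(For what it's worth, a running server will also tell you where its 
client-authentication file lives; a minimal SQL sketch, assuming you can 
already connect as a superuser:

  SHOW hba_file;
  -- or equivalently:
  SELECT setting FROM pg_settings WHERE name = 'hba_file';

so at least the file is discoverable from inside the database, if not 
from the docs.)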


Maybe we could find a better place for the whatever-its-name-will-be file 
inside the part of the docs that even people in a TL;DR mood might read? A 
paragraph or two, spiced up with some of the catchy StackOverflow-inspired 
terms that people who need to configure this authentication aspect might 
expect, could also go into INSTALL-like docs, or be directly reachable on 
the hyperlinked path from part I "Tutorial", chapter 1 "Getting Started", 
section 1.1 "Installation", all the way down to chapter 15 "Installation 
from Source Code". But of course only if this is wanted behavior.


If I read section 1.1 "Installation" (again 9.2.4), I have the 
impression that in our case it mostly conveys the message "you are the 
site admin, deal with it, read the docs" -- or am I not reading it 
right? (I am a non-native English reader.)



what each of the files is for.  If they see something called
'auth.conf', they'll get the right general idea.  An understanding of
the nuances (like that it doesn't control user accounts) will come
once they open up the file -- which they may well do, because it is
called 'auth.conf', and 'auth' is a thing you want to configure.


That may well be; I do not know how people who prefer reading folder 
and file names over the manuals written for them grok text, as I read 
the docs, promised ;-)



If they see something called 'pg_hba.conf', they may very reasonably
assume that it is some internal/advanced stuff that they don't need to
worry about just yet, because what the heck is a 'pg_hba'?  The 'pg'
is unnecessary and the 'hba' is an internal jargon term that we've
ill-advisedly allowed to leak out into the filename.


Around 1995, when I started using Postgres95, it sure took some time to 
find that pg_hba.conf file, but I then perceived it to be very well 
documented, and also felt a bit guilty, as its name occurred in the 
INSTALL file, cf. 
ftp://ftp-archives.postgresql.org/pub/source/v7.2/postgresql-7.2.tar.gz 
and the INSTALL file therein, buried inside step 1 of "If You Are 
Upgrading" ...



If you really feel that 'auth.conf' is too imprecise, maybe something
like 'conn-auth.conf' would be more your style.


I think you guys did and still do a fantastic job with PostgreSQL and 
esp. its documentation, but in this case I doubt that any renaming of 
config files will really have an impact on usability in the shady area 
of TL;DR -- at least for the next twenty years or so -- as it still 
holds that from a false start (e.g. not reading the documentation that 
was written) anything may follow.


But as usability is a practical concern, I (as a user) would be +0 on 
renaming it, if the scenario we actually want is people who fail to find 
it under the old name finding and then editing it under the new one.


All the best,
Stefan.





Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Sawada Masahiko
On Fri, Jun 14, 2013 at 10:15 PM, Amit Kapila amit.kap...@huawei.com wrote:
 On Friday, June 14, 2013 2:42 PM Samrat Revagade wrote:
 Hello,

 We have already started a discussion on pgsql-hackers for the problem of
 taking fresh backup during the failback operation here is the link for that:


 http://www.postgresql.org/message-id/CAF8Q-Gxg3PQTf71NVECe-6OzRaew5pWhk7yQtb
 jgwrfu513...@mail.gmail.com

 Let me again summarize the problem we are trying to address.

 When the master fails, last few WAL files may not reach the standby. But
 the master may have gone ahead and made changes to its local file system
 after  flushing WAL to the local storage.  So master contains some file
 system level changes that standby does not have.  At this point, the data
 directory of  master is ahead of standby's data directory.
 Subsequently, the standby will be promoted as new master.  Later when the
 old master wants to be a standby of the new master, it can't just join the
 setup since there is inconsistency in between these two servers. We need
 to take the fresh backup from the new master.  This can happen in both the
 synchronous as well as asynchronous replication.

 Fresh backup is also needed in case of clean switch-over because in the
 current HEAD, the master does not wait for the standby to receive all the
 WAL
 up to the shutdown checkpoint record before shutting down the connection.
 Fujii Masao has already submitted a patch to handle clean switch-over case,
 but the problem is still remaining for failback case.

 The process of taking fresh backup is very time consuming when databases
 are of very big sizes, say several TB's, and when the servers are connected
 over a relatively slower link.  This would break the service level
 agreement of disaster recovery system.  So there is need to improve the
 process of
 disaster recovery in PostgreSQL.  One way to achieve this is to maintain
 consistency between master and standby which helps to avoid need of fresh
 backup.

 So our proposal on this problem is that we must ensure that master should
 not make any file system level changes without confirming that the
 corresponding WAL record is replicated to the standby.

   How will you take care of extra WAL on old master during recovery. If it
 plays the WAL which has not reached new-master, it can be a problem.

You mean that it is possible for the old master's data to be ahead of the
new master's data, so there would be inconsistent data between those
servers at failback, right?
If so, that inconsistency is not possible, because if you use the GUC option
as he proposes (i.e., failback_safe_standby_mode = remote_flush), then
while the old master is working fine, no file-system-level changes are
made before the WAL is replicated.

--
Regards,

---
Sawada Masahiko




Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Joshua D. Drake


On 06/14/2013 11:18 PM, Craig Ringer wrote:


On 06/15/2013 02:08 PM, Brendan Jurd wrote:

On 15 June 2013 14:43, Craig Ringer cr...@2ndquadrant.com wrote:

The #1 question I see on Stack Overflow has to be confusion about
pg_hba.conf, mostly from people who have no idea it exists, don't understand
how to configure it, etc.

The totally non-obvious name of the file probably has something to do
with that.  It should be called 'auth.conf'.

Not convinced; since it only controls one facet of auth - it doesn't
define users, passwords, grants, etc - that'd probably be just as confusing.



Yeah, this one is not making the grade. pg_hba is just that, host-based 
auth, but I think we are bikeshedding now.


JD





Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Joshua D. Drake


On 06/14/2013 11:44 PM, Brendan Jurd wrote:


If they see something called 'pg_hba.conf', they may very reasonably
assume that it is some internal/advanced stuff that they don't need to
worry about just yet, because what the heck is a 'pg_hba'?  The 'pg'


Only the uneducated. Look, I am not trying to be an ass, but seriously: 
read the docs. I will argue vigorously against the idea of us designing 
a system around people NOT reading the docs.


JD






Re: [HACKERS] MD5 aggregate

2013-06-15 Thread Dean Rasheed
On 13 June 2013 10:35, Dean Rasheed dean.a.rash...@gmail.com wrote:
 Hi,

 Attached is a patch implementing a new aggregate function md5_agg() to
 compute the aggregate MD5 sum across a number of rows. This is
 something I've wished for a number of times. I think the primary use
 case is to do a quick check that 2 tables, possibly on different
 servers, contain the same data, using a query like

   SELECT md5_agg(foo.*::text) FROM (SELECT * FROM foo ORDER BY id) foo;

 or

   SELECT md5_agg(foo.*::text ORDER BY id) FROM foo;

There seem to be 2 separate directions that this could go, which
really meet different requirements:

1). Produce an unordered sum for SQL to compare 2 tables regardless of
the order in which they are scanned. A possible approach to this might
be something like an aggregate

md5_total(text/bytea) returns text

that returns the sum of the md5 values of each input value, treating
each md5 value as an unsigned 128-bit integer, and then producing the
hexadecimal representation of the final sum. This should out-perform a
solution based on numeric addition, and in typical cases, the result
wouldn't be much longer than a regular md5 sum, and so would be easy
to eyeball for differences.
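
As a point of comparison, here is a rough sketch of the kind of
order-independent check that is possible today with only built-in
functions -- much weaker than summing full 128-bit MD5 values, since it
only sums the first 32 bits of each row's hash ('foo' is just an example
table):

  SELECT sum(('x' || left(md5(foo::text), 8))::bit(32)::int)
  FROM foo;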

2). Produce an ordered MD5 sum compatible with COPY, whose result
would match that of running unix md5sum on the COPY output. Given all
the possible COPY options that would affect the result (format,
delimiters, headers, quoting, escaping, ...), I think that such a
thing would only reasonably be possible as an extension to the COPY
command itself.

I guess in its simplest form this would just be a new option MD5 to
COPY that would cause it to pipe its output to the md5 aggregator and
then send the final sum to the COPY destination at the end (e.g.,
COPY foo TO STDOUT MD5 would produce the ordered MD5 sum of the data
in foo).


I still think my original md5_agg() has its uses, since what it
produces is comparable with external md5 sums, and is directly
available to SQL, but (1) is probably the most useful for quickly
comparing 2 tables. I'm much less convinced about the value of (2),
but on the face of it, it doesn't seem like it would be hard to
implement.

Thoughts?

Regards,
Dean




Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Craig Ringer
On 06/15/2013 03:53 PM, Joshua D. Drake wrote:

 Yeah this one is not making the grade. pg_hba is just that host based
 auth but I think we are bikeshedding now.

Agreed... Even as I posted, I realised I shouldn't have mentioned the
last point, since everything else has been ignored.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Martijn van Oosterhout
On Sat, Jun 15, 2013 at 12:43:10PM +0800, Craig Ringer wrote:
 Bloat
 --
 
 Table bloat. Table bloat has been a major issue with PostgreSQL
 users/admins for years. Anyone care to explain to me in a simple
 paragraph how to find out if you have table or index bloat issues in
 your database and what to do about it? (Maybe we need
 pg_catalog.pg_index_bloat and pg_catalog.pg_table_bloat views
 including FILLFACTOR correction?)
 
 I think I'll draft up a patch to add exactly that.

Nice list btw. I monitor this by using the excellent check_postgres
Nagios plugin, which has stuff to check for things like this.
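
(A minimal sketch of the kind of first-pass check that is already possible
from plain SQL -- dead-tuple counters are only a rough proxy for bloat and
say nothing about index bloat or fillfactor:

  SELECT relname, n_live_tup, n_dead_tup,
         round(100.0 * n_dead_tup / greatest(n_live_tup + n_dead_tup, 1), 1)
           AS dead_pct
  FROM pg_stat_user_tables
  ORDER BY n_dead_tup DESC
  LIMIT 10;

which is exactly the sort of thing a proper pg_table_bloat view could do
more accurately.)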

Which makes me think that it might be possible to add some other checks
like this, in for example pg_ctl.  A big fat warning 'your data may be
eaten' might get noticed at startup.

(A minor annoyance is that in recent versions of PostgreSQL you have to
give check_postgres admin rights, otherwise it can't warn you about
idle-in-transaction problems.)
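
(For reference, the condition being checked can be expressed directly in
SQL on 9.2 and later:

  SELECT pid, usename, now() - xact_start AS xact_age
  FROM pg_stat_activity
  WHERE state = 'idle in transaction'
  ORDER BY xact_age DESC;

the privileges issue above is presumably about seeing other users' sessions.)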

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] pluggable compression support

2013-06-15 Thread Andres Freund
On 2013-06-14 21:56:52 -0400, Robert Haas wrote:
 I don't think we need it.  I think what we need is to decide is which
 algorithm is legally OK to use.  And then put it in.
 
 In the past, we've had a great deal of speculation about that legal
 question from people who are not lawyers.  Maybe it would be valuable
 to get some opinions from people who ARE lawyers.  Tom and Heikki both
 work for real big companies which, I'm guessing, have substantial
 legal departments; perhaps they could pursue getting the algorithms of
 possible interest vetted.  Or, I could try to find out whether it's
 possible do something similar through EnterpriseDB.

I personally don't think the legal argument holds all that much water
for snappy and lz4. But then the opinion of a European non-lawyer doesn't
hold much weight either.
Both are widely used by a large number of open and closed projects, some of
which have patent grant clauses in their licenses. E.g. Hadoop and
Cassandra use lz4, and I'd be surprised if the companies behind those
have opened themselves up to litigation.

I think we should preliminarily decide which algorithm to use before we
get lawyers involved. I'd be surprised if they can make such an analysis
faster than we can rule out one of them via benchmarks.

Will post an updated patch that includes lz4 as well.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Craig Ringer

On 06/15/2013 05:57 PM, Martijn van Oosterhout wrote:
 On Sat, Jun 15, 2013 at 12:43:10PM +0800, Craig Ringer wrote:
 Bloat --
 
 Table bloat. Table bloat has been a major issue with PostgreSQL 
 users/admins for years. Anyone care to explain to me in a simple 
 paragraph how to find out if you have table or index bloat issues
 in your database and what to do about it? (Maybe we need 
 pg_catalog.pg_index_bloat and pg_catalog.pg_table_bloat
 views including FILLFACTOR correction?)
 
 I think I'll draft up a patch to add exactly that.
 
 Nice list btw. I monitor this by using the excellent check_progres 
 nagios plugin, which has stuff to check for things like this.

It's been a vital tool for me too. It probably serves as a reasonable
guide for some things we could really usefully expose in system views.
Once in core we can document them in the main user manual, making them
reasonably discoverable.

Argh. I think my TODO has some kind of horrible disease, it keeps
growing uncontrollably.

 Which makes me think that it might be possible to add some other
 checks like this, in for example pg_ctl.  A big fat warning 'your
 data may be eaten' might get noticed at startup.

The users who have this kind of issue aren't the ones running pg_ctl.
They'll usually be using launchd, systemctl, upstart, sysv init
scripts, etc ... whatever, something that sends the warning straight
to the system logs that they likely never read.

I don't have tons of sympathy for these people, but I do think making
fsync=off so easy to set without understanding it is kind of like
handing a grenade to a toddler.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Cédric Villemain

Andrew Dunstan and...@dunslane.net wrote:


On 06/14/2013 08:35 AM, Peter Eisentraut wrote:
 On 6/13/13 9:20 PM, amul sul wrote:
 Agree, only if we consider these contrib module is always gonna
deployed with the postgresql.
 But, what if user going to install such module elsewhere i.e. not
from contrib directory of pg source.
 Why would anyone do that?



Maybe they wouldn't.

I do think we need to make sure that we have at least buildfarm
coverage
of pgxs module building and testing. I have some coverage of a few
extensions I have written, which exercise that, so maybe that will
suffice. If not, maybe we need to have one module that only builds via
pgxs and is build after an install (i.e. not via the standard contrib
build).

I agree; I found it very useful to have all the provided extensions build with 
PGXS, for debugging it.
It also offers a good set of natural regression tests.

I don't really like the directory layout we use for these modules
anyway, so I'm not sure they constitute best practice for extension
builders. Lately I have been using an extension skeleton that looks
something like this:

 License
 Readme.md
 META.json (for pgxn)
 extension.control
 Makefile
 doc/extension.md (soft linked to ../Readme.md)

This makes it mandatory to have a MODULEDIR defined, or a rule to rename it 
with the extension name suffixed.

 src/extension.c
 sql/extension.sql

It is (was) the default place for regression tests... I am not sure it is a 
good thing to shuffle that.
Also, you don't do 'c/source.c'.



--
Sent from my phone; please excuse the brevity.




Re: [HACKERS] pluggable compression support

2013-06-15 Thread Hannu Krosing
On 06/15/2013 03:56 AM, Robert Haas wrote:
 On Fri, Jun 14, 2013 at 8:45 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-06-14 17:35:02 -0700, Josh Berkus wrote:
 No. I think as long as we only have pglz and one new algorithm (even if
 that is lz4 instead of the current snappy) we should just always use the
 new algorithm. Unless I missed it nobody seemed to have voiced a
 contrary position?
 For testing/evaluation the guc seems to be sufficient.
 Then it's not pluggable, is it?  It's upgradable compression
 support, if anything.  Which is fine, but let's not confuse people.
 The point is that it's pluggable on the storage level in the sense of
 that several different algorithms can coexist and new ones can
 relatively easily added.
 That part is what seems to have blocked progress for quite a while
 now. So fixing that seems to be the interesting thing.

 I am happy enough to do the work of making it configurable if we want it
 to be... But I have zap interest of doing it and throw it away in the
 end because we decide we don't need it.
 I don't think we need it.  I think what we need is to decide is which
 algorithm is legally OK to use.  And then put it in.
If it were truly pluggable - that is, just load a .dll, set a GUC and start
using it - then the selection of algorithms would be much
wider, as several slow-but-high-compression ones under the GPL could be
used as well, similar to how PostGIS currently works.

gzip and bzip2 are the first two that come to mind, but I am sure there
are more.

 In the past, we've had a great deal of speculation about that legal
 question from people who are not lawyers.  Maybe it would be valuable
 to get some opinions from people who ARE lawyers. 
Making a truly pluggable compression API delegates this question
to end users.

Delegation is good, as it lets you get more done :)


-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Andrew Dunstan


On 06/15/2013 06:24 AM, Cédric Villemain wrote:

Andrew Dunstan and...@dunslane.net a écrit :


On 06/14/2013 08:35 AM, Peter Eisentraut wrote:

On 6/13/13 9:20 PM, amul sul wrote:

Agree, only if we consider these contrib module is always gonna

deployed with the postgresql.

But, what if user going to install such module elsewhere i.e. not

from contrib directory of pg source.

Why would anyone do that?



Maybe they wouldn't.

I do think we need to make sure that we have at least buildfarm
coverage
of pgxs module building and testing. I have some coverage of a few
extensions I have written, which exercise that, so maybe that will
suffice. If not, maybe we need to have one module that only builds via
pgxs and is build after an install (i.e. not via the standard contrib
build).

I agree, I found very useful to have all the provided extensions build with 
PGXS to debug it.
It also offers a good set of natural regression tests.


I don't really like the directory layout we use for these modules
anyway, so I'm not sure they constitute best practice for extension
builders. Lately I have been using an extension skeleton that looks
something like this:

 License
 Readme.md
 META.json (for pgxn)
 extension.control
 Makefile
 doc/extension.md (soft linked to ../Readme.md)

This makes mandatory to have a MODULEDIR defined or a rule to rename it with 
the extension name suffixed.



Of course, for extension foo this would actually be foo.md. It installs 
just fine like that. The makefile template has:


DOCS = $(wildcard doc/*.md)




 src/extension.c
 sql/extension.sql

It is (was) the default place for regression testsI am not sure it is a 
good thing to shuffle that.
Also, you don't do 'c/source.c'




The sql here is the sql to install the extension, not part of the build 
nor part of the tests.


Some time ago I fixed pg_regress to honor --inputdir and --outputdir 
properly, so my Makefile template has this:


   REGRESS_OPTS = --inputdir=test --outputdir=test \
                  --load-extension=$(EXTENSION)
   ...
   override pg_regress_clean_files = test/results/ \
                  test/regression.diffs test/regression.out tmp_check/ log/


That keeps the testing stuff out of the way quite nicely.

You might not like this pattern, but I find it much saner than what we 
currently use. I certainly don't claim it's perfect.



cheers

andrew





Re: [HACKERS] pluggable compression support

2013-06-15 Thread Hannu Krosing
On 06/15/2013 02:20 AM, Andres Freund wrote:
 On 2013-06-14 17:12:01 -0700, Josh Berkus wrote:
 On 06/14/2013 04:01 PM, Andres Freund wrote:
 It still contains a guc as described in the above message to control the
 algorithm used for compressing new tuples but I think we should remove
 that guc after testing.
 Did you add the storage attribute?
 No. I think as long as we only have pglz and one new algorithm (even if
 that is lz4 instead of the current snappy) we should just always use the
 new algorithm. Unless I missed it nobody seemed to have voiced a
 contrary position?
 For testing/evaluation the guc seems to be sufficient.
If not significantly harder than what you currently do, I'd prefer a
true pluggable compression support which is

a) dynamically configurable, say by using a GUC

and

b) self-describing, that is, the compressed data should have enough
info to determine how to decompress it.

additionally it *could* have the property Simon proposed earlier
of *uncompressed* pages having some predetermined size, so we
could retain optimisations of substring() even on compressed TOAST
values.

the latter of course could also be achieved by adding offset
column to toast tables as well.

One more idea - if we are already changing the toast table structure, we
could introduce the notion of a compression block, which could run over
several storage pages for much improved compression compared
to compressing only a single page at a time.

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Cédric Villemain
  I don't really like the directory layout we use for these modules
  anyway, so I'm not sure they constitute best practice for extension
  builders. Lately I have been using an extension skeleton that looks
  
  something like this:
   License
   Readme.md
   META.json (for pgxn)
   extension.control
   Makefile
   doc/extension.md (soft linked to ../Readme.md)
  
  This makes mandatory to have a MODULEDIR defined or a rule to rename it
  with the extension name suffixed.
 
 Of course, for extension foo this would actually be foo.md. It installs
 just fine like that. The makefile template has:
 
  DOCS = $(wildcard doc/*.md)

Oh! yes, I missed the soft link.

   src/extension.c
   sql/extension.sql
  
  It is (was) the default place for regression testsI am not sure it is
  a good thing to shuffle that. Also, you don't do 'c/source.c'
 
 The sql here is the sql to install the extension, not part of the build
 nor part of the tests.

I am interested in this topic: since we have extensions, we invite users to 
increase their usage of them, so going a step forward with a better layout is 
definitely something to do.

What do you suggest for the previous usage? We have a hard rule to try to put 
libdir in *.sql.in files, for example.

 Some time ago I fixed pg_regress to honor --inputdir and --outputdir
 properly, so my Makefile template has this:
 
 REGRESS_OPTS = --inputdir=test --outputdir=test \
--load-extension=$(EXTENSION)
 ...
 override pg_regress_clean_files = test/results/
 test/regression.diffs test/regression.out tmp_check/ log/
 
 
 That keeps the testing stuff out of the way quite nicely.
 
 You might not like this pattern, but I find it much saner that what we
 currently use. I certainly don't claim it's perfect.

As I said, going a step forward with a better layout is definitely something 
to do. I have no strong opinion on what the ideal layout is; 'your' layout and 
the pgxn layout are both good, and I won't vote against suggesting their use 
(and improving PGXS to match those suggestions).

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: 24x7 Support - Development, Expertise and Training




Re: [HACKERS] pluggable compression support

2013-06-15 Thread Andres Freund
On 2013-06-15 13:25:49 +0200, Hannu Krosing wrote:
 On 06/15/2013 02:20 AM, Andres Freund wrote:
  On 2013-06-14 17:12:01 -0700, Josh Berkus wrote:
  On 06/14/2013 04:01 PM, Andres Freund wrote:
  It still contains a guc as described in the above message to control the
  algorithm used for compressing new tuples but I think we should remove
  that guc after testing.
  Did you add the storage attribute?
  No. I think as long as we only have pglz and one new algorithm (even if
  that is lz4 instead of the current snappy) we should just always use the
  new algorithm. Unless I missed it nobody seemed to have voiced a
  contrary position?
  For testing/evaluation the guc seems to be sufficient.

 If not significantly harder than what you currently do, I'd prefer a
 true pluggable compression support which is

 a) dynamically configurable , say by using a GUG

 and

 b) self-describing, that is, the compressed data should have enough
 info to determine how to decompress it.

Could you perhaps actually read the description and the discussion
before making wild suggestions? Possibly even the patch.
Compressed toast datums now *do* have an identifier of the compression
algorithm used. That's how we can discern between pglz and whatever
algorithm we come up with.

But those identifiers should be *small* (since they are added to all
Datums) and they need to be stable, even across pg_upgrade. So I think
making this user configurable would be a grave error at this point.

 additionally it *could* have the property Simon proposed earlier
 of *uncompressed* pages having some predetermined size, so we
 could retain optimisations of substring() even on compressed TOAST
 values.

We are not changing the toast format here, so I don't think that's
applicable. That's a completely separate feature.

 the latter of course could also be achieved by adding offset
 column to toast tables as well.

 One more idea - if we are already changing toast table structure, we
 could introduce a notion of compress block, which could run over
 several storage pages for much improved compression compared
 to compressing only a single page at a time.

We aren't changing the toast table structure. And we can't easily do so;
think of pg_upgrade.
Besides, toast already stores compressed datums across several chunks. What
would be beneficial would be to compress in a way that lets you compress
several datums together, but that's several magnitudes more complex and
is unrelated to this feature.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pluggable compression support

2013-06-15 Thread Andres Freund
On 2013-06-15 13:11:47 +0200, Hannu Krosing wrote:
 If it were truly pluggable - that is just load a .dll, set a GUG and start
 using it

Ok. I officially rechristen the patchset to 'extensible compression
support'.

 - then the selection of algorithms would be much
 wider as several slow-but-high-compression ones under GPL could be
 used as well, similar to how currently PostGIS works.

 gzip and bzip2 are the first two that came in mind, but I am sure there
 are more.

gzip barely has a higher compression ratio than lz4 and is a magnitude
slower decompressing, so I don't think it's interesting.
I personally think bzip2 is too slow to be useful, even for
decompression. What might be useful is something like lzma, but its
implementation is so complex that I don't really want to touch it.

  In the past, we've had a great deal of speculation about that legal
  question from people who are not lawyers.  Maybe it would be valuable
  to get some opinions from people who ARE lawyers. 
 Making a truly pluggable compression API delegates this question
 to end users.
 
 Delegation is good, as it lets you get done more :)

No. It increases the feature's complexity by a magnitude. That's not
good. And it means that practically nobody but a few expert users will benefit
from it, so I am pretty strongly opposed.

You suddenly need to solve the question of how the identifiers for
compression formats are allocated and preserved across pg_upgrade and
across machines.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pluggable compression support

2013-06-15 Thread Hannu Krosing
On 06/15/2013 01:56 PM, Andres Freund wrote:
 On 2013-06-15 13:25:49 +0200, Hannu Krosing wrote:
 On 06/15/2013 02:20 AM, Andres Freund wrote:
 On 2013-06-14 17:12:01 -0700, Josh Berkus wrote:
 On 06/14/2013 04:01 PM, Andres Freund wrote:
 It still contains a guc as described in the above message to control the
 algorithm used for compressing new tuples but I think we should remove
 that guc after testing.
 Did you add the storage attribute?
 No. I think as long as we only have pglz and one new algorithm (even if
 that is lz4 instead of the current snappy) we should just always use the
 new algorithm. Unless I missed it nobody seemed to have voiced a
 contrary position?
 For testing/evaluation the guc seems to be sufficient.
 If not significantly harder than what you currently do, I'd prefer a
 true pluggable compression support which is
 a) dynamically configurable , say by using a GUG
 and
 b) self-describing, that is, the compressed data should have enough
 info to determine how to decompress it.
 Could you perhaps actually read the the description and the discussion
 before making wild suggestions? Possibly even the patch.
 Compressed toast datums now *do* have an identifier of the compression
 algorithm used. 
 That's how we can discern between pglz and whatever
 algorithm we come up with.
Claiming that the algorithm will be one of only two (the current one and
"whatever algorithm we come up with") suggests that it is
only one bit, which is undoubtedly too little for a pluggable
compression API :)


 But those identifiers should be *small* (since they are added to all
 Datums) 
If there is any alignment at all between the datums, then
one byte will be lost in the noise (remember: nobody will ever need
more than 256 compression algorithms).
OTOH, if you plan to put these format markers in the compressed
stream and change the compression algorithm while reading it, I am lost.
 and they need to be stable, even across pg_upgrade. So I think
 making this user configurable would be grave error at this point.
at this point in what sense ?

 additionally it *could* have the property Simon proposed earlier
 of *uncompressed* pages having some predetermined size, so we
 could retain optimisations of substring() even on compressed TOAST
 values.
 We are not changing the toast format here, so I don't think that's
 applicable. That's a completely separate feature.

 the latter of course could also be achieved by adding offset
 column to toast tables as well.
 One more idea - if we are already changing toast table structure, we
 could introduce a notion of compress block, which could run over
 several storage pages for much improved compression compared
 to compressing only a single page at a time.
 We aren't changing the toast table structure. And we can't easily do so,
 think of pg_upgrade.
Where is the page for features rejected because of pg_upgrade? ;)
 Besides, toast always has compressed datums over several chunks. What
 would be beneficial would be to compress in a way that you can compress
 several datums together, but that's several magnitudes more complex and
 is unrelated to this feature.

 Greetings,

 Andres Freund



-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] pluggable compression support

2013-06-15 Thread Hannu Krosing
On 06/15/2013 02:02 PM, Andres Freund wrote:
 On 2013-06-15 13:11:47 +0200, Hannu Krosing wrote:
 If it were truly pluggable - that is just load a .dll, set a GUG and start
 using it
 Ok. I officially rechristen the patchset to 'extensible compression
 support'.

 - then the selection of algorithms would be much
 wider as several slow-but-high-compression ones under GPL could be
 used as well, similar to how currently PostGIS works.
 gzip and bzip2 are the first two that came in mind, but I am sure there
 are more.
 gzip barely has a higher compression ratio than lz4 and is a magnitude
 slower decompressing, so I don't think it's interesting.
 I personally think bzip2 is too slow to be useful, even for
 decompression. 
With low compression settings, gzip and bzip2 seem to decompress at the
same speed:
http://pokecraft.first-world.info/wiki/Quick_Benchmark:_Gzip_vs_Bzip2_vs_LZMA_vs_XZ_vs_LZ4_vs_LZO

(an interesting thing there is memory usage, but I guess it is just an
artefact of outer layers around the algorithm)

And whether better compression translates to more speed depends heavily on
disk speeds:

http://www.citusdata.com/blog/64-zfs-compression claims quite big
performance increases from using gzip, even with its slow decompression


 What might be useful is something like lzma, but it's
 implementation is so complex that I don't really want to touch it.

 In the past, we've had a great deal of speculation about that legal
 question from people who are not lawyers.  Maybe it would be valuable
 to get some opinions from people who ARE lawyers. 
 Making a truly pluggable compression API delegates this question
 to end users.

 Delegation is good, as it lets you get done more :)
 No. It increases the features complexity by a magnitude. That's not
 good. And it means that about nobody but a few expert users will benefit
 from it, so I am pretty strongly opposed.

 You suddently need to solve the question of how the identifiers for
 compression formats are allocated and preserved across pg_upgrade and
 across machines.
This is similar to something we already need to do for any non-builtin type
OID.

 Greetings,

 Andres Freund



-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] pluggable compression support

2013-06-15 Thread Andres Freund
On 2013-06-15 14:11:54 +0200, Hannu Krosing wrote:
  Could you perhaps actually read the the description and the discussion
  before making wild suggestions? Possibly even the patch.
  Compressed toast datums now *do* have an identifier of the compression
  algorithm used. 
  That's how we can discern between pglz and whatever
  algorithm we come up with.
 Claiming that the algorithm will be one of only two (current and
 whatever algorithm we come up with ) suggests that it is
 only one bit, which is undoubtedly too little for having a pluggable
 compression API :)

No, I am thinking 127 + 2 possible algorithms for now. If we need more,
the space used for the algorithm identifier can be extended transparently at
that point.

  But those identifiers should be *small* (since they are added to all
  Datums) 

 if there will be any alignment at all between the datums, then
 one byte will be lost in the noise (remember: nobody will need
 more than 256 compression algorithms)

No. There's no additional alignment involved here.

 OTOH, if you plan to put these format markers in the compressed
 stream and change the compression algorithm while reading it, I am
 lost.

The marker *needs* to be in the compressed stream. When decompressing a
datum we only get passed the varlena.

  and they need to be stable, even across pg_upgrade. So I think
  making this user configurable would be grave error at this point.

 at this point in what sense ?

If we add another algorithm with different tradeoffs, we can have a column
attribute for choosing the algorithm. If there proves to be a need to
add more configurability, we can do that at that point.

Greetings,

Andres Freund




Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Amit kapila

On Saturday, June 15, 2013 1:19 PM Sawada Masahiko wrote:
On Fri, Jun 14, 2013 at 10:15 PM, Amit Kapila amit.kap...@huawei.com wrote:
 On Friday, June 14, 2013 2:42 PM Samrat Revagade wrote:
 Hello,

 We have already started a discussion on pgsql-hackers for the problem of
 taking fresh backup during the failback operation here is the link for that:


 http://www.postgresql.org/message-id/CAF8Q-Gxg3PQTf71NVECe-6OzRaew5pWhk7yQtb
 jgwrfu513...@mail.gmail.com

 Let me again summarize the problem we are trying to address.


   How will you take care of extra WAL on old master during recovery. If it
 plays the WAL which has not reached new-master, it can be a problem.

 you means that there is possible that old master's data ahead of new
 master's data.

  What I mean to say is that the WAL of the old master can be ahead of the new
master's. I understood that the data files of the old master can't be ahead,
but I think the WAL can be ahead.

 so there is inconsistent data between those server when fail back. right?
 if so , there is not possible inconsistent. because if you use GUC option
 as his propose (i.g., failback_safe_standby_mode = remote_flush),
 when old master is working fine, all file system level changes aren't
 done  before WAL replicated.

Will the proposed patch take care, in some way, that the old master's WAL is 
also not ahead?
If yes, I think I am missing some point.
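
(To illustrate what "ahead" means here -- a minimal check one can run by
hand, assuming both servers are still reachable:

  -- on the old master, before it failed or was shut down:
  SELECT pg_current_xlog_location();
  -- on the new master, i.e. the promoted standby:
  SELECT pg_last_xlog_replay_location();

If the first location is ahead of the second, the old master has WAL that
the new master never received.)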

With Regards,
Amit Kapila.



Re: [HACKERS] pluggable compression support

2013-06-15 Thread Amit kapila
On Saturday, June 15, 2013 3:50 PM Andres Freund wrote:
On 2013-06-14 21:56:52 -0400, Robert Haas wrote:
 I don't think we need it.  I think what we need is to decide is which
 algorithm is legally OK to use.  And then put it in.

 In the past, we've had a great deal of speculation about that legal
 question from people who are not lawyers.  Maybe it would be valuable
 to get some opinions from people who ARE lawyers.  Tom and Heikki both
 work for real big companies which, I'm guessing, have substantial
 legal departments; perhaps they could pursue getting the algorithms of
 possible interest vetted.  Or, I could try to find out whether it's
 possible do something similar through EnterpriseDB.

 I personally don't think the legal arguments holds all that much water
 for snappy and lz4. But then the opinion of a european non-lawyer doesn't
 hold much either.
 Both are widely used by a large number open and closed projects, some of
 which have patent grant clauses in their licenses. E.g. hadoop,
 cassandra use lz4, and I'd be surprised if the companies behind those
 have opened themselves to litigation.

 I think we should preliminarily decide which algorithm to use before we
 get lawyers involved. I'd surprised if they can make such a analysis
 faster than we can rule out one of them via benchmarks.
  
I have also tried to use snappy for the patch "Performance Improvement by reducing 
WAL for Update Operation".
It has shown very good results and performed very well for all the tests asked 
for by Heikki.
Results are at the link below:
http://www.postgresql.org/message-id/009001ce2c6e$9bea4790$d3bed6b0$@kap...@huawei.com

I think if we can get snappy into core, it can be used for more things. 
I wanted to try it for FPW (full-page writes) as well.

With Regards,
Amit Kapila.




Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Tom Lane
I wrote:
 Richard Poole rich...@2ndquadrant.com writes:
 This behaviour appears in 6ac7facdd3990baf47efc124e9d7229422a06452 as a
 side-effect of speeding things up by getting rid of setitimer() calls;
 it's not obvious what's a good way to fix it without losing the benefits
 of that commit.

 Ugh.  It doesn't sound very practical to try to guarantee that every
 single kernel call in the backend is set up to recover from EINTR,
 even though ideally they should all be able to cope.

On reflection though, we *do* need to make them cope, because even
without lazy SIGALRM disable, any such place is still at risk.  We
surely must allow for the possibility of SIGHUP arriving at any instant,
for example.

Have you identified the specific place that's giving you trouble?

regards, tom lane




Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Sawada Masahiko
On Sat, Jun 15, 2013 at 10:34 PM, Amit kapila amit.kap...@huawei.com wrote:

 On Saturday, June 15, 2013 1:19 PM Sawada Masahiko wrote:
 On Fri, Jun 14, 2013 at 10:15 PM, Amit Kapila amit.kap...@huawei.com wrote:
 On Friday, June 14, 2013 2:42 PM Samrat Revagade wrote:
 Hello,

 We have already started a discussion on pgsql-hackers for the problem of
 taking fresh backup during the failback operation here is the link for 
 that:


 http://www.postgresql.org/message-id/CAF8Q-Gxg3PQTf71NVECe-6OzRaew5pWhk7yQtb
 jgwrfu513...@mail.gmail.com

 Let me again summarize the problem we are trying to address.


   How will you take care of extra WAL on old master during recovery. If it
 plays the WAL which has not reached new-master, it can be a problem.

 you means that there is possible that old master's data ahead of new
 master's data.

   I mean to say is that WAL of old master can be ahead of new master. I 
 understood that
   data files of old master can't be ahead, but I think WAL can be ahead.

 so there is inconsistent data between those server when fail back. right?
 if so , there is not possible inconsistent. because if you use GUC option
 as his propose (i.g., failback_safe_standby_mode = remote_flush),
 when old master is working fine, all file system level changes aren't
 done  before WAL replicated.

 Would the propose patch will take care that old master's WAL is also not 
 ahead in some way?
 If yes, I think i am missing some point.

Yes, it can happen that the old master's WAL is ahead of the new master's WAL,
as you said. But I think that we can solve that by deleting all WAL files when
the old master starts as the new standby.
Thoughts?

Regards,

---
Sawada Masahiko




Re: [HACKERS] [PATCH] Revive line type

2013-06-15 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Complete the implementations of line_in, line_out, line_recv,
 line_send.  Remove comments and error messages about the line type not
 being implemented.  Add regression tests for existing line operators
 and functions.
 ---
 This just revives existing functionality, doesn't add anything new.
 One thing that the original code did not settle was how to convert a
 line in form Ax+By+C=0 to the two-points output form.  Obviously, you
 can just pick to random points on the line, but I wonder whether there
 is a more standard solution.

ISTM printing a line as two points is an unfortunate representational
choice altogether.  If the internal form is Ax+By+C=0, shouldn't the
text form expose A,B,C directly?  What you've got here necessarily
suffers a lot of roundoff error during I/O conversions, and so it seems
pretty likely to fail dump-and-reload tests.
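
(For reference, the conversions at issue are just standard analytic geometry:
a line through (x1,y1) and (x2,y2) corresponds to A = y2 - y1, B = x1 - x2,
C = x2*y1 - x1*y2, while going the other way requires picking two arbitrary
points satisfying Ax + By + C = 0 -- and that second step is where the
round-off creeps in.)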

regards, tom lane




Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Andres Freund
On 2013-06-15 10:45:28 -0400, Tom Lane wrote:
 I wrote:
  Richard Poole rich...@2ndquadrant.com writes:
  This behaviour appears in 6ac7facdd3990baf47efc124e9d7229422a06452 as a
  side-effect of speeding things up by getting rid of setitimer() calls;
  it's not obvious what's a good way to fix it without losing the benefits
  of that commit.
 
  Ugh.  It doesn't sound very practical to try to guarantee that every
  single kernel call in the backend is set up to recover from EINTR,
  even though ideally they should all be able to cope.
 
 On reflection though, we *do* need to make them cope, because even
 without lazy SIGALRM disable, any such place is still at risk.  We
 surely must allow for the possibility of SIGHUP arriving at any instant,
 for example.

All the signal handlers we register set SA_RESTART, including SIGHUP's, except
the one for SIGALRM... I wonder if we can rejigger things so we don't
need that? I am not sure if there's still a reason for that decision
inside the backend.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 00:01, Josh Berkus j...@agliodbs.com wrote:
 Alvaro,

 This sounds really interesting, and I can see the possibilities.
 However ...

 Value changes in columns that are part of a minmax index, and tuple insertion
 in summarized pages, would invalidate the stored min/max values.  To support
 this, each minmax index has a validity map; a range can only be considered 
 in a
 scan if it hasn't been invalidated by such changes (A range not considered 
 in
 the scan needs to be returned in whole regardless of the stored min/max 
 values,
 that is, it cannot be pruned per query quals).  The validity map is very
 similar to the visibility map in terms of performance characteristics: quick
 enough that it's not contentious, allowing updates and insertions to proceed
 even when data values violate the minmax index conditions.  An invalidated
 range can be made valid by re-summarization (see below).

 This begins to sound like these indexes are only useful on append-only
 tables.  Not that there aren't plenty of those, but ...

The index is basically using the index-only scan mechanism. The
"only useful on append-only tables" comment would/should apply also to
index-only scans. I can't see a reason to raise that specifically for
this index type.


 Re-summarization is relatively expensive, because the complete page range has
 to be scanned.

 Why?  Why can't we just update the affected pages in the index?

Again, same thing as index-only scans. For IOS, we reset the
visibility info at vacuum. The route proposed here follows exactly the
same timing, same mechanism. I can't see a reason for any difference
between the two.


  To avoid this, a table having a minmax index would be
 configured so that inserts only go to the page(s) at the end of the table; 
 this
 avoids frequent invalidation of ranges in the middle of the table.  We 
 provide
 a table reloption that tweaks the FSM behavior, so that summarized pages are
 not candidates for insertion.

 We haven't had an index type which modifies table insertion behavior
 before, and I'm not keen to start now; imagine having two indexes on the
 same table each with their own, conflicting, requirements.  This is
 sounding a lot more like a candidate for our prospective pluggable
 storage manager.  Also, the above doesn't help us at all with UPDATEs.

 If we're going to start adding reloptions for specific table behavior,
 I'd rather think of all of the optimizations we might have for a
 prospective append-only table and bundle those, rather than tying it
 to whether a certain index exists or not.

I agree that the FSM behaviour shouldn't be linked to index existence.

IMHO that should be a separate table parameter, WITH (fsm_mode = append)

Index only scans would also benefit from that.
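
(A sketch of how that might look at the SQL level -- purely hypothetical,
since the reloption doesn't exist yet, and 'events' is an invented table
whose 'ts' column correlates with insertion order:

  ALTER TABLE events SET (fsm_mode = append);
  -- range queries over the correlated column are the target use case:
  SELECT count(*) FROM events WHERE ts >= '2013-06-01' AND ts < '2013-06-02';

The reloption would steer new rows away from already-summarized pages, so
the block-range summaries stay valid.)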


 Also, I hate the name ... if this feature goes ahead, I'm going to be
 lobbying to change it.  But that's pretty minor compared to the update
 issues.

This feature has already had 3 different names. I don't think the name
is crucial, but it makes sense to give it a name up front. So if you
want to lobby for that then you'd need to come up with a name soon, so
poor Alvaro can cope with name #4.

(There's no consistency in naming from any other implementation either).

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-06-15 10:45:28 -0400, Tom Lane wrote:
 On reflection though, we *do* need to make them cope, because even
 without lazy SIGALRM disable, any such place is still at risk.  We
 surely must allow for the possibility of SIGHUP arriving at any instant,
 for example.

 All signal handlers we register, including SIGHUP, set SA_RESTART, except
 the one for SIGALRM... I wonder if we can rejigger things so we don't
 need that? I am not sure if there's still a reason for that decision
 inside the backend.

Hmm.  Personally I'd rather go in the other direction:
http://www.postgresql.org/message-id/12819.1183306...@sss.pgh.pa.us
but maybe the path of least resistance is to set SA_RESTART for SIGALRM
too.  Given our current usage of it, there seems no worse harm in
allowing kernel calls to restart after a SIGALRM than any other signal.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 00:07, Tom Lane t...@sss.pgh.pa.us wrote:

 We've talked a lot about index-organized tables in the past.  How much
 of the use case for this would be subsumed by a feature like that?

IOTs would only work for one specific organisation, acting like a
multi-column btree. So you could use IOTs for this but only in a
limited way.

MinMax indexes cover multiple columns, possibly even all columns if
desired. So MMIs have much greater usefulness, as well as not
requiring a full reload of the data when you decide you want one.

Personally, I think IOTs are not worth pursuing. IOTs for Postgres
would require major changes to the definition of the heap, so we're a
long way from implementing them directly and coping with all of the
detailed implementation horrors. Grouped Item Tuple indexes are
theoretically equivalent in terms of size and number of block accesses
required to access data. Heikki's earlier work on that could easily be
revived. IOTs would be a multi-year effort with conflicting use cases.

Note that SQL Server and Sybase use block range indexes for primary
indexes, neatly avoiding the hassle Oracle walked into when they chose
to do IOTs. I think we should learn that same lesson ourselves.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 15 June 2013 00:01, Josh Berkus j...@agliodbs.com wrote:
 If we're going to start adding reloptions for specific table behavior,
 I'd rather think of all of the optimizations we might have for a
 prospective append-only table and bundle those, rather than tying it
 to whether a certain index exists or not.

 I agree that the FSM behaviour shouldn't be linked to index existence.
 IMHO that should be a separate table parameter, WITH (fsm_mode = append)
 Index only scans would also benefit from that.

-1 ... I cannot believe that such a parameter would ever get turned on
in production by anyone.  If your table has a significant update rate,
the resulting table bloat would make such behavior completely
infeasible.  If you have few enough updates to make such a behavior
practical, then you can live with the expensive index updates instead.

I also find the analogy to index-only scans to be bogus, because those
didn't require any user tuning.

There's a nearby thread complaining bitterly about our willingness to
create hard-to-use, hard-to-tune features.  In its current form,
this will be another one of those.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Andres Freund
On 2013-06-15 11:29:45 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-06-15 10:45:28 -0400, Tom Lane wrote:
  On reflection though, we *do* need to make them cope, because even
  without lazy SIGALRM disable, any such place is still at risk.  We
  surely must allow for the possibility of SIGHUP arriving at any instant,
  for example.
 
  All signal handlers we register, including SIGHUP, set SA_RESTART, except
  the one for SIGALRM... I wonder if we can rejigger things so we don't
  need that? I am not sure if there's still a reason for that decision
  inside the backend.
 
 Hmm.  Personally I'd rather go in the other direction:
 http://www.postgresql.org/message-id/12819.1183306...@sss.pgh.pa.us
 but maybe the path of least resistance is to set SA_RESTART for SIGALRM
 too.  Given our current usage of it, there seems no worse harm in
 allowing kernel calls to restart after a SIGALRM than any other signal.

I am not actually objecting to that reasoning; I think it would be rather
useful to get there.
But I don't think it's realistic to do that at this point in the
9.3 cycle. It seems like we would fight bugs around that for quite a
while. We have a large number of syscall sites where we don't retry on
EINTR/EAGAIN. And, as you note, that's not even talking about third
party code.
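
To make concrete what each unprotected call site would need, here is a minimal
sketch of the EINTR retry pattern (illustrative only, not actual backend code):

#include <errno.h>
#include <unistd.h>

static ssize_t
read_retry_on_eintr(int fd, void *buf, size_t count)
{
	ssize_t		rc;

	/*
	 * Without SA_RESTART any signal can interrupt the call, so retry on
	 * EINTR.  (EAGAIN on a non-blocking fd would additionally need to wait
	 * for readiness before retrying.)
	 */
	do
	{
		rc = read(fd, buf, count);
	} while (rc < 0 && errno == EINTR);

	return rc;
}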

I'd be happy if we decide to go that way at the beginning of the 9.4
cycle. I.e. do it right now on HEAD for all syscalls. That way we have a
chance to find most of the bad callsites before releasing.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] request a new feature in fuzzystrmatch

2013-06-15 Thread Josh Berkus
Liming,

I'm putting this in the commitfest regardless, so that it at least gets
a review.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-06-15 11:29:45 -0400, Tom Lane wrote:
 Hmm.  Personally I'd rather go in the other direction:
 http://www.postgresql.org/message-id/12819.1183306...@sss.pgh.pa.us

 I am not actually objecting that reasoning, I think it would be rather
 useful to get there.
 But I don't think it's realistic to do that at this point in the
 9.3 cycle. It seems like we would fight bugs around that for quite a
 while. We have a large number of syscall sites where we don't retry on
 EINTR/EAGAIN. And, as you note, that's not even talking about third
 party code.

Yeah, it's the issue of third-party code within the backend (perl,
python, etc etc etc etc) that really makes complete EINTR-proofing seem
a bit impractical.

Back in the day I was also worried about platforms that didn't have
SA_RESTART, but that's probably pretty much the empty set by now (is
anyone aware of a modern platform on which configure fails to set
HAVE_POSIX_SIGNALS?).  Also, our switch to latches for sleeping purposes
should have ameliorated the issue of signals failing to wake processes
when we wanted them to.

Let's turn on SA_RESTART for SIGALRM in HEAD and 9.3 and see what beta
testing says.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hard to Use WAS: Hard limit on WAL space

2013-06-15 Thread Greg Stark
 fsync=off

synchronous_commit=off replaced fsync=off in almost every use case
where fsync=off might have been useful. The only remaining use case is
for the initial build of a database. In that case what the user really
wants is to turn off WAL logging entirely though. Having a WAL log and
not fsyncing it is kind of pointless. I guess it lets you replicate
the database but it doesn't let you use the WAL log for recovery
locally.

 Bloat
 --

 Table bloat. Table bloat has been a major issue with PostgreSQL users/admins
 for years. Anyone care to explain to me in a simple paragraph how to find
 out if you have table or index bloat issues in your database and what to do
 about it? (Maybe we need pg_catalog.pg_index_bloat and
 pg_catalog.pg_table_bloat views including FILLFACTOR correction?)

A nice view that exposes a summary of information from the fsm per
table would be pretty handy.

In general there's a lot of data tied up in things like the fsm that
could be usefully exposed to users.



-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Postgres 9.3beta2 coming week after next

2013-06-15 Thread Tom Lane
FYI, the core committee is planning to wrap 9.3beta2 on Monday June 24
for release Thursday the 27th.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Alvaro Herrera
Craig Ringer wrote:
 On 06/13/2013 11:16 AM, Peter Eisentraut wrote:
  This has served no purpose except to
 
  1. take up space
  2. confuse users
  3. produce broken external extension modules that take contrib as an example
  4. break builds of PostgreSQL when users try to fix 3. by exporting USE_PGXS
 
 Weak -1 from me; I find being able to build contrib modules using PGXS
 from the source tree extremely useful in development. It allows me to
 easily install an extension from a newer Pg version into an older
 server, test against older servers, etc.

This use case seems too narrow to me to justify the burden of keeping
PGXS-enabled makefiles in contrib.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Jeff Davis
On Fri, 2013-06-14 at 18:27 +0200, Andres Freund wrote:
 I'd like to see a comment around the memcpys in XLogSaveBufferForHint()
 that mentions that they are safe in a non std buffer due to
 XLogCheckBuffer setting an appropriate hole/offset. Or make an explicit
 change of the copy algorithm there.

Done.

 Btw, if you touch that code, I'd vote for renaming XLOG_HINT to XLOG_FPI
 or something like that. I find the former name confusing...

Also done.

Patch attached. Also, since we branched, I think this should be
back-patched to 9.3 as well.

Regards,
Jeff Davis

*** a/src/backend/access/hash/hash.c
--- b/src/backend/access/hash/hash.c
***
*** 287,293  hashgettuple(PG_FUNCTION_ARGS)
  			/*
  			 * Since this can be redone later if needed, mark as a hint.
  			 */
! 			MarkBufferDirtyHint(buf);
  		}
  
  		/*
--- 287,293 
  			/*
  			 * Since this can be redone later if needed, mark as a hint.
  			 */
! 			MarkBufferDirtyHint(buf, true);
  		}
  
  		/*
*** a/src/backend/access/heap/pruneheap.c
--- b/src/backend/access/heap/pruneheap.c
***
*** 262,268  heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
  		{
  			((PageHeader) page)->pd_prune_xid = prstate.new_prune_xid;
  			PageClearFull(page);
! 			MarkBufferDirtyHint(buffer);
  		}
  	}
  
--- 262,268 
  		{
  			((PageHeader) page)->pd_prune_xid = prstate.new_prune_xid;
  			PageClearFull(page);
! 			MarkBufferDirtyHint(buffer, true);
  		}
  	}
  
*** a/src/backend/access/nbtree/nbtinsert.c
--- b/src/backend/access/nbtree/nbtinsert.c
***
*** 413,421  _bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
  	 * crucial. Be sure to mark the proper buffer dirty.
  	 */
  	if (nbuf != InvalidBuffer)
! 		MarkBufferDirtyHint(nbuf);
  	else
! 		MarkBufferDirtyHint(buf);
  }
  			}
  		}
--- 413,421 
  	 * crucial. Be sure to mark the proper buffer dirty.
  	 */
  	if (nbuf != InvalidBuffer)
! 		MarkBufferDirtyHint(nbuf, true);
  	else
! 		MarkBufferDirtyHint(buf, true);
  }
  			}
  		}
*** a/src/backend/access/nbtree/nbtree.c
--- b/src/backend/access/nbtree/nbtree.c
***
*** 1052,1058  restart:
  				opaque->btpo_cycleid == vstate->cycleid)
  			{
  				opaque->btpo_cycleid = 0;
! 				MarkBufferDirtyHint(buf);
  			}
  		}
  
--- 1052,1058 
  				opaque->btpo_cycleid == vstate->cycleid)
  			{
  				opaque->btpo_cycleid = 0;
! 				MarkBufferDirtyHint(buf, true);
  			}
  		}
  
*** a/src/backend/access/nbtree/nbtutils.c
--- b/src/backend/access/nbtree/nbtutils.c
***
*** 1789,1795  _bt_killitems(IndexScanDesc scan, bool haveLock)
  	if (killedsomething)
  	{
  		opaque->btpo_flags |= BTP_HAS_GARBAGE;
! 		MarkBufferDirtyHint(so->currPos.buf);
  	}
  
  	if (!haveLock)
--- 1789,1795 
  	if (killedsomething)
  	{
  		opaque->btpo_flags |= BTP_HAS_GARBAGE;
! 		MarkBufferDirtyHint(so->currPos.buf, true);
  	}
  
  	if (!haveLock)
*** a/src/backend/access/rmgrdesc/xlogdesc.c
--- b/src/backend/access/rmgrdesc/xlogdesc.c
***
*** 82,92  xlog_desc(StringInfo buf, uint8 xl_info, char *rec)
  		appendStringInfo(buf, "restore point: %s", xlrec->rp_name);
  
  	}
! 	else if (info == XLOG_HINT)
  	{
  		BkpBlock   *bkp = (BkpBlock *) rec;
  
! 		appendStringInfo(buf, "page hint: %s block %u",
  		 relpathperm(bkp->node, bkp->fork),
  		 bkp->block);
  	}
--- 82,92 
  		appendStringInfo(buf, "restore point: %s", xlrec->rp_name);
  
  	}
! 	else if (info == XLOG_FPI)
  	{
  		BkpBlock   *bkp = (BkpBlock *) rec;
  
! 		appendStringInfo(buf, "full-page image: %s block %u",
  		 relpathperm(bkp->node, bkp->fork),
  		 bkp->block);
  	}
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 7681,7692  XLogRestorePoint(const char *rpName)
   * records. In that case, multiple copies of the same block would be recorded
   * in separate WAL records by different backends, though that is still OK from
   * a correctness perspective.
-  *
-  * Note that this only works for buffers that fit the standard page model,
-  * i.e. those for which buffer_std == true
   */
  XLogRecPtr
! XLogSaveBufferForHint(Buffer buffer)
  {
  	XLogRecPtr	recptr = InvalidXLogRecPtr;
  	XLogRecPtr	lsn;
--- 7681,7689 
   * records. In that case, multiple copies of the same block would be recorded
   * in separate WAL records by different backends, though that is still OK from
   * a correctness perspective.
   */
  XLogRecPtr
! XLogSaveBufferForHint(Buffer buffer, bool buffer_std)
  {
  	XLogRecPtr	recptr = InvalidXLogRecPtr;
  	XLogRecPtr	lsn;
***
*** 7708,7714  XLogSaveBufferForHint(Buffer buffer)
  	 * and reset rdata for any actual WAL record insert.
  	 */
  	rdata[0].buffer = buffer;
! 	rdata[0].buffer_std = true;
  
  	/*
  	 * Check buffer while not holding an exclusive lock.
--- 7705,7711 
  	 * and reset rdata for any 

Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Joe Conway

On 06/15/2013 11:28 AM, Alvaro Herrera wrote:
 Craig Ringer wrote:
 On 06/13/2013 11:16 AM, Peter Eisentraut wrote:
 This has served no purpose except to
 
 1. take up space 2. confuse users 3. produce broken external
 extension modules that take contrib as an example 4. break
 builds of PostgreSQL when users try to fix 3. by exporting
 USE_PGXS
 
 Weak -1 from me; I find being able to build contrib modules using
 PGXS from the source tree extremely useful in development. It
 allows me to easily install an extension from a newer Pg version
 into an older server, test against older servers, etc.
 
 This use case seems too narrow to me to justify the burden of
 keeping PGXS-enabled makefiles in contrib.

What was the burden of it?

Joe

-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Minmax indexes

2013-06-15 Thread Simon Riggs
On 15 June 2013 16:39, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 15 June 2013 00:01, Josh Berkus j...@agliodbs.com wrote:
 If we're going to start adding reloptions for specific table behavior,
 I'd rather think of all of the optimizations we might have for a
 prospective append-only table and bundle those, rather than tying it
 to whether a certain index exists or not.

 I agree that the FSM behaviour shouldn't be linked to index existence.
 IMHO that should be a separate table parameter, WITH (fsm_mode = append)
 Index only scans would also benefit from that.

 -1 ... I cannot believe that such a parameter would ever get turned on
 in production by anyone.

It's an extremely common use case, especially with larger databases.

 If your table has a significant update rate,
 the resulting table bloat would make such behavior completely
 infeasible.  If you have few enough updates to make such a behavior
 practical, then you can live with the expensive index updates instead.

Depends. What I had in mind was that append mode would allow updates
normally within the last 1 GB of a table, but relocate updates should
they occur in earlier segments. That would cover the majority of use
cases I see.

 I also find the analogy to index-only scans to be bogus, because those
 didn't require any user tuning.

I don't think anybody is suggesting there is a requirement for user
tuning on this type of index.

But I think having one parameter would make it the same as say, GIN
indexes which also have a single tuning parameter.

I think the idea that index-only scans "just work" is itself bogus. A
mild random update pattern will reduce the benefit of IOS scans
considerably, which is hard to understand or control, especially
without any way of measuring it. We could eliminate that problem if we
chose, but it's not even documented as a potential issue for some
reason, so it's hard to discuss. Not being able to see or control a
known problem is not the same thing as "doesn't require user tuning".

 There's a nearby thread complaining bitterly about our willingness to
 create hard-to-use, hard-to-tune features.  In its current form,
 this will be another one of those.

Not based upon anything mentioned here so far, or that I'm aware of.

I think it would be interesting to see some anonymous voting on
feature complaints, so we can see what people really think without
needing to risk offending the main authors of each of the various
parts of our software.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread David E. Wheeler
On Jun 15, 2013, at 4:12 AM, Andrew Dunstan and...@dunslane.net wrote:

   REGRESS_OPTS = --inputdir=test --outputdir=test \
  --load-extension=$(EXTENSION)
   ...
   override pg_regress_clean_files = test/results/
   test/regression.diffs test/regression.out tmp_check/ log/
 
 
 That keeps the testing stuff out of the way quite nicely.

I don't suppose there could be a way for the makefile to notice the --outputdir 
option and add those files to the clean target itself, could there? Having it 
hard-coded is slightly annoying. Maybe it could ask pg_regress where to find 
them?

Thanks,

David

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Batch API for After Triggers

2013-06-15 Thread Simon Riggs
On 9 June 2013 14:56, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 On 8 June 2013 22:25, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 There are also difficulties in semantics, since when
 we have OLD and NEW at row level we know we are discussing the same
 row. With sets of OLD and NEW we'd need to be able to link the
 relations back together somehow, which couldn't be done by PK since
 that could change. So we'd need to invent some semantics for a
 linking identifier of some description. Which once we've done it
 would be used by people to join them back together again, which is
 what we already had in the first place. So my take is that it sounds
 expressive, but definitely not performant.

 I have used a feature like this in other database products, and can
 say from experience that these relations in a statement trigger can
 be very useful without the linkage you propose.  I can see how the
 linkage could potentially be useful, but if that is the only
 hang-up, we would be adding a powerful feature without it.

What I'm trying to do is tune FKs, which are currently implemented as
after row triggers. Hence why I'm looking at ways to speed up after
row triggers.

Suggesting I work on after statement triggers would imply you think
that FKs should/could be rewritten as after statement triggers. Is
that what you mean?

Doing it that way would mean rewriting a lot of code and would still
have problems 2 and 3 identified above. I can't imagine anybody would
go for that, but if you have a sketch of how it might work we can
consider it.

 Since my objective is performance, not standards, I don't see a reason
 to work on that, yet. I might have time to work on it later, let's see.

 This seems like it has some overlap with the delta relations I will
 need to generate for incremental maintenance of materialized views,
 so we should coordinate on those efforts if they happen to occur
 around the same time.

IMHO you should use Andres' logical changeset extraction for
incremental maintenance of mat views. Doing mat view maintenance
using triggers would be choosing the lower-performance option.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Tom Lane
I wrote:
 ... Also, our switch to latches for sleeping purposes
 should have ameliorated the issue of signals failing to wake processes
 when we wanted them to.

 Let's turn on SA_RESTART for SIGALRM in HEAD and 9.3 and see what beta
 testing says.

I experimented with this a bit on my old HPUX box (which is one of the
ones where SA_RESTART signals don't terminate a select()), and soon
found an annoying behavior:

regression=# \timing
Timing is on.
regression=# set statement_timeout TO 2000;
SET
Time: 4.983 ms
regression=# select generate_series(1,100);
ERROR:  canceling statement due to statement timeout
Time: 2015.015 ms
regression=# select pg_sleep(10);
ERROR:  canceling statement due to statement timeout
Time: 3009.932 ms

This happens because pg_sleep() is implemented with a loop around
pg_usleep(1000000), and the latter is no longer terminated by a SIGALRM.

It seems like it'd be a good idea to replace the pg_sleep implementation
with something involving WaitLatch, which would not only ensure prompt
response to SIGALRM (and other signals, eg SIGQUIT), but would eliminate
useless process wakeups during a sleep.
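
A rough sketch of such a loop, assuming the 9.3-era latch API and reusing the
endtime/GetNowFloat() bookkeeping already present in pg_sleep() (illustrative
only, not a tested patch):

	for (;;)
	{
		float8		secs_left = endtime - GetNowFloat();

		if (secs_left <= 0.0)
			break;

		(void) WaitLatch(&MyProc->procLatch,
						 WL_LATCH_SET | WL_TIMEOUT,
						 (long) (secs_left * 1000.0));
		ResetLatch(&MyProc->procLatch);
		CHECK_FOR_INTERRUPTS();	/* react promptly to SIGALRM, SIGINT, etc. */
	}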

In general, we might want to consider replacing long sleep intervals
with WaitLatch operations.  I thought for a bit about trying to turn
pg_usleep itself into a WaitLatch call; but it's also used in frontend
code where that wouldn't work, and anyway it's not clear this would be
a good thing for short sleeps.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] C++ compiler

2013-06-15 Thread Gaetano Mendola
I have read Peter Eisentraut's blog entry about Moving to C++, and I fully
agree with what he wrote.

Is there any interest or work in progress in making the entire PostgreSQL
code base compilable by a C++ compiler?

Regards
Gaetano Mendola

-- 
cpp-today.blogspot.com


Re: [HACKERS] Batch API for After Triggers

2013-06-15 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:
 On 9 June 2013 14:56, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 On 8 June 2013 22:25, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 There are also difficulties in semantics, since when
 we have OLD and NEW at row level we know we are discussing the same
 row. With sets of OLD and NEW we'd need to be able to link the
 relations back together somehow, which couldn't be done by PK since
 that could change. So we'd need to invent some semantics for a
 linking identifier of some description. Which once we've done it
 would be used by people to join them back together again, which is
 what we already had in the first place. So my take is that it sounds
 expressive, but definitely not performant.

 I have used a feature like this in other database products, and can
 say from experience that these relations in a statement trigger can
 be very useful without the linkage you propose.  I can see how the
 linkage could potentially be useful, but if that is the only
 hang-up, we would be adding a powerful feature without it.

 What I'm trying to do is tune FKs, which are currently implemented as
 after row triggers. Hence why I'm looking at ways to speed up after
 row triggers.

 Suggesting I work on after statement triggers would imply you think
 that FKs should/could be rewritten as after statement triggers. Is
 that what you mean?

Well, I didn't make any suggestion -- I was asking questions, and
then when you commented on a hypothetical feature I responded.

That said, I was asking the question because I used SQL Server for
years when it had triggers but no foreign key definitions.  There
was a pattern for enforcing foreign key relationships in AFTER EACH
STATEMENT triggers using the old and new relations (with no linkage
between particular updated rows) which we used so heavily I could
write the triggers for a given foreign key mechanically in just a
minute or two.  This technique had good performance and seems to
have all the semantics you're looking for, so I was wondering
whether that might be a better approach to this problem.  It sure
seems like it would have fewer moving parts.  I know it was very
reliable with S2PL concurrency control, but there may be problems
with adapting it to MVCC that I'm not seeing without a deeper look.

If you're interested, I could try to prod those areas of my memory
to recall the pattern, or find examples of it somewhere.

 Doing it that way would mean rewriting a lot of code and would still
 have problems 2 and 3 identified above. I can't imagine anybody would
 go for that, but if you have a sketch of how it might work we can
 consider it.

I don't, but if there is interest I could probably sketch the outlines.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-15 Thread Fabien COELHO


Hello Greg,

I've done some more testing with the test patch. I have not seen any 
spike at the end of the throttled run.


The attached version 11 patch does ensure that throttle-added sleeps are
not included in latency measures (-r) and that throttling is performed
right at the beginning of a transaction. There is an ugly goto to do that.


I think there is still a latent bug in the code around "listen", which should
be set back to 0 in other places, but this bug is already there.


--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 1303217..2fc6f6a 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -205,6 +211,7 @@ typedef struct
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		throttled;  /* whether current transaction was throttled */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -222,6 +229,10 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+int64   throttle_trigger;  /* previous/next throttling (us) */
+	int64   throttle_lag;  /* total transaction lag behind throttling */
+	int64   throttle_lag_max;  /* max transaction lag */
+
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -230,6 +241,8 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
@@ -355,6 +368,8 @@ usage(void)
 		 -n   do not run VACUUM before tests\n
 		 -N   do not update tables \pgbench_tellers\ and \pgbench_branches\\n
 		 -r   report average latency per command\n
+		 -R SPEC, --rate SPEC\n
+		  target rate in transactions per second\n
 		 -s NUM   report this scale factor in output\n
 		 -S   perform SELECT-only transactions\n
 	   -t NUM   number of transactions each client runs (default: 10)\n
@@ -898,17 +913,56 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 {
 	PGresult   *res;
 	Command   **commands;
+	bool		do_throttle = false;
 
 top:
 	commands = sql_files[st->use_file];
 
+	/* handle throttling once per transaction by inserting a sleep.
+	 * this is simpler than doing it at the end.
+	 */
+	if (throttle_delay && ! st->throttled)
+	{
+		/* compute delay to approximate a Poisson distribution
+		 * 1000000 = 13.8 .. 0 multiplier
+		 *  100000 = 11.5 .. 0
+		 *   10000 =  9.2 .. 0
+		 *    1000 =  6.9 .. 0
+		 * if transactions are too slow or a given wait shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64 wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
+
+		thread->throttle_trigger += wait;
+
+		st->until = thread->throttle_trigger;
+		st->sleeping = 1;
+		st->throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling "INT64_FORMAT" us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{			/* are we sleeping? */
 		instr_time	now;
+		int64 now_us;
 
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (throttle_delay && st->state == 0)
+			{
+				/* measure lag of throttled transaction */
+				int64 lag = now_us - st->until;
+				thread->throttle_lag += lag;
+				if (lag > thread->throttle_lag_max)
+					thread->throttle_lag_max = lag;
+			}
+		}
 		else
 			return true;		/* Still sleeping, nothing to do here */
 	}
@@ -1095,6 +1149,9 @@ top:
 			st->state = 0;
 			st->use_file = (int) getrand(thread, 0, num_files - 1);
 			commands = sql_files[st->use_file];
+			st->throttled = false;
+			st->listen = 0;
+			do_throttle = (throttle_delay > 0);
 		}
 	}
 
@@ -1113,6 +1170,12 @@ top:
 		INSTR_TIME_ACCUM_DIFF(*conn_time, end, start);
 	}
 
+	if (do_throttle) {
+		/* delay throttling after reopening the connection */
+		do_throttle = false;
+		goto top;
+	}
+
 	/* Record transaction start time if logging is enabled */
 	if (logfile && st->state == 0)
 		INSTR_TIME_SET_CURRENT(st->txn_begin);
@@ -2017,7 +2080,8 @@ process_builtin(char *tb)
 static void
 printResults(int ttype, int normal_xacts, int nclients,
 			 TState *threads, int nthreads,
-			 instr_time total_time, instr_time conn_total_time)
+			 instr_time 

Re: [HACKERS] pg_restore -l with a directory archive

2013-06-15 Thread Fujii Masao
On Sat, Jun 15, 2013 at 1:40 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 When I ran pg_restore -l with a directory archive as input, I found that
 its format is wrongly reported as UNKNOWN.

 $ pg_dump -F d -f hoge
 $ pg_restore -l hoge
 ;
 ; Archive created at Sat Jun 15 01:38:14 2013
 ; dbname: postgres
 ; TOC Entries: 9
 ; Compression: -1
 ; Dump Version: 1.12-0
 ; Format: UNKNOWN
 ; Integer: 4 bytes
 ; Offset: 8 bytes
 ; Dumped from database version: 9.3beta1
 ; Dumped by pg_dump version: 9.3beta1
 ;
 ;
 ; Selected TOC Entries:


 In this case, the format should be reported as DIRECTORY.
 The attached patch fixes this problem.

Committed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add visibility map information to pg_freespace.

2013-06-15 Thread Simon Riggs
On 14 June 2013 15:22, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Kyotaro HORIGUCHI wrote:
 Hello,

 I've added visibility map information to pg_freespace for my
 utility.

 This makes sense to me.  I only lament the fact that this makes the
 module a misnomer.  Do we want to 1) rename the module (how
 inconvenient), 2) create a separate module for this (surely not
 warranted), or 3) accept it and move on?

I'm not sure why this is suggested as being part of pg_freespace and
not part of pageinspect (which is where all the other inspection
tools live).

If I wanted to see the vismap (and I do...) then I'd like to see the
whole vismap, not just the part that relates to blocks currently in
cache.

If you do want that, you can just join the two things together
(function to see vismap joined to pg_freespace).

(Having said that, I don't have a major objection to it being in
pg_freespace as well).

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-15 Thread Jeff Davis
On Sun, 2013-03-24 at 20:15 -0400, Nicholas White wrote:

 I've redone the leadlag function changes to use datumCopy as you
 suggested. However, I've had to remove the NOT_USED #ifdef around
 datumFree so I can use it to avoid building up large numbers of copies
 of Datums in the memory context while a query is executing. I've
 attached the revised patch...
 
Comments:

WinGetResultDatumCopy() calls datumCopy, which will just copy in the
current memory context. I think you want it in the per-partition memory
context, otherwise the last value in each partition will stick around
until the query is done (so many partitions could be a problem). That
should be easy enough to do by switching to the
winobj->winstate->partcontext memory context before calling datumCopy,
and then switching back.
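
Concretely, the suggested change would look something like this sketch (the
local variable names are placeholders; partcontext is the per-partition
context mentioned above):

	MemoryContext oldcontext;

	oldcontext = MemoryContextSwitchTo(winobj->winstate->partcontext);
	copied = datumCopy(value, typbyval, typlen);
	MemoryContextSwitchTo(oldcontext);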

For that matter, why store the datum again at all? You can just store
the offset of the last non-NULL value in that partition, and then fetch
it using WinGetFuncArgInPartition(), right? We really want to avoid any
per-tuple allocations.

Alternatively, you might look into setting a mark when you get a
non-NULL value. Then you could just always fetch the oldest one.
Unfortunately, I think that only works with const_offset=true... so the
previous idea might be better.

I'll leave it to someone else to review the grammar changes.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Andres Freund
On 2013-06-15 11:36:54 -0700, Jeff Davis wrote:
 On Fri, 2013-06-14 at 18:27 +0200, Andres Freund wrote:
  I'd like to see a comment around the memcpys in XLogSaveBufferForHint()
  that mentions that they are safe in a non std buffer due to
  XLogCheckBuffer setting an appropriate hole/offset. Or make an explicit
  change of the copy algorithm there.
 
 Done.
 Also done.

Thanks! Looks good to me.

 Patch attached. Also, since we branched, I think this should be
 back-patched to 9.3 as well.

Absolutely.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-15 Thread Andrew Dunstan


On 06/15/2013 02:43 PM, David E. Wheeler wrote:

On Jun 15, 2013, at 4:12 AM, Andrew Dunstan and...@dunslane.net wrote:


   REGRESS_OPTS = --inputdir=test --outputdir=test \
  --load-extension=$(EXTENSION)
   ...
   override pg_regress_clean_files = test/results/
   test/regression.diffs test/regression.out tmp_check/ log/


That keeps the testing stuff out of the way quite nicely.

I don't suppose there could be a way for the makefile to notice the --outputdir 
option and add those files to the clean target itself, could there? Having it 
hard-coded is slightly annoying. Maybe it could ask pg_regress where to find 
them?




That doesn't sound like a promising line of development to me. Better 
would be to provide a PGXS option to specify where tests are based, and 
set the clean target accordingly.


Then instead of the above you'd just be able to say something like

MODULETEST = test
REGRESS_OPTS = --load-extension=$(EXTENSION)

Which would be a good deal cleaner.

cheers

andrew







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2013-06-15 Thread Alvaro Herrera
Nicholas White escribió:

 For the parsing changes, it seems I can either make RESPECT and IGNORE
 reserved keywords, or add a lookahead to construct synthetic RESPECT NULLS
 and IGNORE NULLS keywords. The grammar wouldn't compile if RESPECT and
 IGNORE were just normal unreserved keywords due to ambiguities after a
 function definition (e.g. select abs(1) respect; - which is currently a
 valid statement).

Well, making them reserved keywords is not that great, so maybe the
lookahead thingy is better.  However, this patch introduces the third
and fourth uses of the "save the lookahead token" pattern in the
default switch cases.  Can we refactor that bit somehow, to avoid so
many duplicates?  (For a minute I thought that Andrew Gierth's WITH
ORDINALITY patch would add another one, but it seems not.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pluggable compression support

2013-06-15 Thread Robert Haas
On Sat, Jun 15, 2013 at 8:11 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 Claiming that the algorithm will be one of only two (current and
 whatever algorithm we come up with ) suggests that it is
 only one bit, which is undoubtedly too little for having a pluggable
 compression API :)

See 
http://www.postgresql.org/message-id/20130607143053.gj29...@alap2.anarazel.de

 But those identifiers should be *small* (since they are added to all
 Datums)
 if there will be any alignment at all between the datums, then
 one byte will be lost in the noise (remember: nobody will need
 more than 256 compression algorithms)
 OTOH, if you plan to put these format markers in the compressed
 stream and change the compression algorithm while reading it, I am lost.

The above-linked email addresses this point as well: there are bits
available in the toast pointer.  But there aren't MANY bits without
increasing the storage footprint, so trying to do something that's
more general than we really need is going to cost us in terms of
on-disk footprint.  Is that really worth it?  And if so, why?  I don't
find the idea of a trade-off between compression/decompression speed
and compression ratio to be very exciting.  As Andres says, bzip2 is
impractically slow for ... almost everything.  If there's a good
BSD-licensed algorithm available, let's just use it and be done.  Our
current algorithm has lasted us a very long time; I see no reason to
think we'll want to change this again for another 10 years, and by
that time, we may have redesigned the storage format altogether,
making the limited extensibility of our current TOAST pointer format
moot.
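
For illustration only: reserving a couple of spare high bits of an existing
32-bit size field for a compression-method id could look like the sketch
below.  The field semantics and macro names are invented here, not the actual
TOAST pointer layout.

#define VARCOMPRESS_BITS	0xC0000000
#define VARCOMPRESS_SHIFT	30

/* two bits = up to four methods, with id 0 meaning the current pglz format */
#define VARDATA_COMPRESS_ID(extsize) \
	(((extsize) & VARCOMPRESS_BITS) >> VARCOMPRESS_SHIFT)
#define VARDATA_EXTSIZE(extsize) \
	((extsize) & ~VARCOMPRESS_BITS)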

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Alvaro Herrera
Tom Lane wrote:

 In general, we might want to consider replacing long sleep intervals
 with WaitLatch operations.  I thought for a bit about trying to turn
 pg_usleep itself into a WaitLatch call; but it's also used in frontend
 code where that wouldn't work, and anyway it's not clear this would be
 a good thing for short sleeps.

How about having a #ifdef !FRONTEND code path that uses the latch, and
sleep otherwise?  And maybe use plain sleep for short sleeps in the
backend also, to avoid the latch overhead.  I notice we already have
three implementations of pg_usleep.
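
A sketch of that shape (the function name and the 10 ms cutoff are arbitrary,
and the 9.3-era latch API is assumed):

void
pg_usleep_sketch(long microsec)
{
#ifndef FRONTEND
	/* backend, longer nap: sleep on the process latch so signals end it early */
	if (microsec >= 10000L)
	{
		(void) WaitLatch(&MyProc->procLatch, WL_LATCH_SET | WL_TIMEOUT,
						 microsec / 1000L);
		ResetLatch(&MyProc->procLatch);
		return;
	}
#endif
	/* frontend callers and short backend naps: plain select()-based sleep */
	{
		struct timeval delay;

		delay.tv_sec = microsec / 1000000L;
		delay.tv_usec = microsec % 1000000L;
		(void) select(0, NULL, NULL, NULL, &delay);
	}
}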

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pluggable compression support

2013-06-15 Thread Robert Haas
On Sat, Jun 15, 2013 at 8:22 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 You suddently need to solve the question of how the identifiers for
 compression formats are allocated and preserved across pg_upgrade and
 across machines.
 This is something similar we already need to do for any non-builtin type
 OID.

That's true, but that code has already been written.  And it's not
trivial.  The code involved is CREATE/ALTER/DROP TYPE plus all the
corresponding pg_dump mechanism.  To do what you're proposing here,
we'd need CREATE/ALTER/DROP COMPRESSION METHOD, and associated pg_dump
--binary-upgrade support.  I think Andres is entirely right to be
skeptical about that.  It will make this project about 4 times as hard
for almost no benefit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 Tom Lane wrote:
  In general, we might want to consider replacing long sleep intervals
  with WaitLatch operations.  I thought for a bit about trying to turn
  pg_usleep itself into a WaitLatch call; but it's also used in frontend
  code where that wouldn't work, and anyway it's not clear this would be
  a good thing for short sleeps.
 
 How about having a #ifdef !FRONTEND code path that uses the latch, and
 sleep otherwise?  And maybe use plain sleep for short sleeps in the
 backend also, to avoid the latch overhead.  I notice we already have
 three implementations of pg_usleep.

Is there really serious overhead from using latches..?  I thought much
of the point of that approach was specifically to minimize overhead...

Thanks,

Stephen




Re: [HACKERS] stray SIGALRM

2013-06-15 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 Is there really serious overhead from using latches..?

IIRC there's at least one gettimeofday() involved in WaitLatch.  There
are platforms on which that already costs more than you want to spend
for, say, CommitDelay.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] askpass program for libpq

2013-06-15 Thread Daniel Farina
On Fri, May 17, 2013 at 2:03 PM, Daniel Farina dan...@heroku.com wrote:
 On Wed, Jan 9, 2013 at 5:17 AM, Peter Eisentraut pete...@gmx.net wrote:
 I would like to have something like ssh-askpass for libpq.  The main
 reason is that I don't want to have passwords in plain text on disk,
 even if .pgpass is read protected.  By getting the password from an
 external program, I can integrate libpq tools with the host system's key
 chain or wallet thing, which stores passwords encrypted.

 I'm thinking about adding a new connection option askpass with
 environment variable PGASKPASS.  One thing I haven't quite figured out
 is how to make this ask for passwords only if needed.  Maybe it needs
 two connection options, one to say which program to use and one to say
 whether to use it.

 Ideas?

 Okay, I have a patch that does something *like* (but not the same) as
 this, and whose implementation is totally unreasonable, but it's
 enough to get a sense of how the whole thing feels.  Critically, it
 goes beyond askpass, instead allowing a shell-command based hook for
 arbitrary interpretation and rewriting of connection info...such as
 the 'host' libpq keyword.  I have called it, without much thought, a
 'resolver'.  In this way, it's closer to the libpq 'service' facility,
 except with addition of complete control of the interpretation of
 user-provided notation.

Hello everyone,

I'm sort of thinking of attacking this problem again, does anyone have
an opinion or any words of (en/dis)couragement to continue?  The
implementation I posted is bogus but is reasonable to feel around
with, but I'm curious besides its obvious defects as to what the
temperature of opinion is.

Most generally, I think the benefits are strongest in dealing with:

* Security: out-of-band secrets will just prevent people from pasting
  important stuff all over the place, as I see despairingly often
  today.

* Client-side Proxies: pgbouncer comes to mind, a variation being used
  on production applications right now that uses full-blown
  preprocessing of the user environment (only possible in a
  environment with certain assumptions like Heroku)
  https://github.com/gregburek/heroku-buildpack-pgbouncer seems very
  promising and effective, but it'd be nice to confer the same
  benefits to everyone else, too.

* HA: one of the most annoying problems in HA is naming things.  Yes,
  this could be solved with other forms of common dynamic binding DNS
  or Virtual IP (sometimes), but these both are pretty complicated and
  carry baggage and pitfalls, but as long as there is dynamic binding
  of the credentials, I'm thinking it may make sense to have dynamic
  binding of net locations, too.

* Cross-server references

  This is basically the issues seen in HA and Security, but on
  (horrible) steroids: the spate of features making Postgres work
  cross-server (older features like dblink, but now also new ones like
  FDWs and Writable FDWs) make complex interconnection between servers
  more likely and problematic, especially if one has standbys where
  there is a delay in catalog propagation from a primary to standby
  with new connection info.

  So, an out of band way where one can adjust the dynamic binding
  seems useful there.

Knowing those, am I barking up the wrong tree?  Can I do something
else entirely?  I've considered DNS and SSL certs, but these seem
much, much harder and limited, too.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] askpass program for libpq

2013-06-15 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 Okay, I have a patch that does something *like* (but not the same) as
 this, and whose implementation is totally unreasonable, but it's
 enough to get a sense of how the whole thing feels.  Critically, it
 goes beyond askpass, instead allowing a shell-command based hook for
 arbitrary interpretation and rewriting of connection info...such as
 the 'host' libpq keyword.  I have called it, without much thought, a
 'resolver'.  In this way, it's closer to the libpq 'service' facility,
 except with addition of complete control of the interpretation of
 user-provided notation.

 Hello everyone,

 I'm sort of thinking of attacking this problem again, does anyone have
 an opinion or any words of (en/dis)couragement to continue?  The
 implementation I posted is bogus but is reasonable to feel around
 with, but I'm curious besides its obvious defects as to what the
 temperature of opinion is.

 Most generally, I think the benefits are strongest in dealing with:

 * Security: out-of-band secrets will just prevent people from pasting
   important stuff all over the place, as I see despairingly often
   today.

 * Client-side Proxies: pgbouncer comes to mind, a variation being used
   on production applications right now that uses full-blown
   preprocessing of the user environment (only possible in a
   environment with certain assumptions like Heroku)
   https://github.com/gregburek/heroku-buildpack-pgbouncer seems very
   promising and effective, but it'd be nice to confer the same
   benefits to everyone else, too.

 * HA: one of the most annoying problems in HA is naming things.  Yes,
   this could be solved with other forms of common dynamic binding DNS
   or Virtual IP (sometimes), but these both are pretty complicated and
   carry baggage and pitfalls, but as long as there is dynamic binding
   of the credentials, I'm thinking it may make sense to have dynamic
   binding of net locations, too.

 * Cross-server references

   This is basically the issues seen in HA and Security, but on
   (horrible) steroids: the spate of features making Postgres work
   cross-server (older features like dblink, but now also new ones like
   FDWs and Writable FDWs) make complex interconnection between servers
   more likely and problematic, especially if one has standbys where
   there is a delay in catalog propagation from a primary to standby
   with new connection info.

   So, an out of band way where one can adjust the dynamic binding
   seems useful there.

TBH, I see no clear reason to think that a connection-string rewriter
solves any of those problems.  At best it would move them somewhere else.
Nor is it clear that any of this should be libpq's business, as opposed
to something an application might do before invoking libpq.  Also,
I think a facility dependent on invoking a shell command is (a) wide
open for security problems, and (b) not likely to be portable to
Windows.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] askpass program for libpq

2013-06-15 Thread Daniel Farina
On Sat, Jun 15, 2013 at 8:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 Okay, I have a patch that does something *like* (but not the same) as
 this, and whose implementation is totally unreasonable, but it's
 enough to get a sense of how the whole thing feels.  Critically, it
 goes beyond askpass, instead allowing a shell-command based hook for
 arbitrary interpretation and rewriting of connection info...such as
 the 'host' libpq keyword.  I have called it, without much thought, a
 'resolver'.  In this way, it's closer to the libpq 'service' facility,
 except with addition of complete control of the interpretation of
 user-provided notation.

 Hello everyone,

 I'm sort of thinking of attacking this problem again, does anyone have
 an opinion or any words of (en/dis)couragement to continue?  The
 implementation I posted is bogus but is reasonable to feel around
 with, but I'm curious besides its obvious defects as to what the
 temperature of opinion is.

 Most generally, I think the benefits are strongest in dealing with:

 * Security: out-of-band secrets will just prevent people from pasting
   important stuff all over the place, as I see despairingly often
   today.

 * Client-side Proxies: pgbouncer comes to mind, a variation being used
   on production applications right now that uses full-blown
   preprocessing of the user environment (only possible in a
   environment with certain assumptions like Heroku)
   https://github.com/gregburek/heroku-buildpack-pgbouncer seems very
   promising and effective, but it'd be nice to confer the same
   benefits to everyone else, too.

 * HA: one of the most annoying problems in HA is naming things.  Yes,
   this could be solved with other forms of common dynamic binding DNS
   or Virtual IP (sometimes), but these both are pretty complicated and
   carry baggage and pitfalls, but as long as there is dynamic binding
   of the credentials, I'm thinking it may make sense to have dynamic
   binding of net locations, too.

 * Cross-server references

   This is basically the issues seen in HA and Security, but on
   (horrible) steroids: the spate of features making Postgres work
   cross-server (older features like dblink, but now also new ones like
   FDWs and Writable FDWs) make complex interconnection between servers
   more likely and problematic, especially if one has standbys where
   there is a delay in catalog propagation from a primary to standby
   with new connection info.

   So, an out of band way where one can adjust the dynamic binding
   seems useful there.

 TBH, I see no clear reason to think that a connection-string rewriter
 solves any of those problems.  At best it would move them somewhere else.

Yes, that's exactly what I want to achieve: moving them somewhere else
that can be held in common by client applications.

 Nor is it clear that any of this should be libpq's business, as opposed
 to something an application might do before invoking libpq.

Yes, it's unclear.  I have only arrived at seriously exploring this
after trying my very best to meditate on other options.  In addition,
sometimes 'the application' is Postgres, and with diverse access paths
like FDWs and dblink, which may not be so easy to adjust, and it would
seem strange to adjust them in a way that can't be shared in common
with regular non-backend-linked client applications.

Also, it seems like a very high bar to set for an application to make
use of environment keychains or environment-specific high-availability
retargeting.  This general approach you mention is used
in Greg Burek's heroku-pgbouncer buildpack, but it took significant
work to iron out (and probably still needs more ironing, although it
seems to work great) and only serves the Heroku-verse.

Basically, the needs seem very similar, so abstracting seems to me
profitable.

This is not that different in principle from pgpass and the services
file in that regard, except taking the final step to delegate their
function...and deliver full control over notation.
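
For comparison, the indirection that already exists today looks roughly
like this (the entries below are made-up examples of the standard
pg_service.conf and .pgpass formats):

    # ~/.pg_service.conf -- looked up via the 'service' conninfo keyword
    [orders-db]
    host=db1.internal
    port=5432
    dbname=orders

    # ~/.pgpass -- hostname:port:database:username:password
    db1.internal:5432:orders:app_user:sekret

    $ psql "service=orders-db user=app_user"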

Although I don't know much about it, I seem to recall that VMWare felt
inclined to instigate some kind of vaguely related solution to solve a
similar problem in a custom libpq.  After initial recoil and over a
year of contemplation, I think the reasons are better justified
than I originally thought and it'd be nice to de-weirdify such
approaches.

 Also, I think a facility dependent on invoking a shell command is
 (a) wide open for security problems, and (b) not likely to be
 portable to Windows.

Yeah, those things occurred to me; I think a dlopen-based mechanism is
a more likely solution than the shell-command one.  The latter just
let me get started quickly to experiment.  Would a rigorous proposal
about how to do that help the matter?  I mostly wanted to get the
temperature before thinking about Real Mechanisms.
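
For the dlopen flavor, I imagine something shaped roughly like the sketch
below; the symbol name and calling convention are invented here just to
show the moving parts, not a proposal for the actual API:

    /* Hypothetical resolver hook -- none of these names exist in libpq. */
    #include <dlfcn.h>
    #include <stddef.h>

    typedef int (*pq_resolver_fn) (const char *conninfo,
                                   char *rewritten, size_t rewritten_len);

    static int
    run_resolver(const char *sopath, const char *conninfo,
                 char *rewritten, size_t rewritten_len)
    {
        void       *handle = dlopen(sopath, RTLD_NOW | RTLD_LOCAL);
        pq_resolver_fn resolve;

        if (handle == NULL)
            return -1;
        resolve = (pq_resolver_fn) dlsym(handle, "pq_resolve");
        if (resolve == NULL)
        {
            dlclose(handle);
            return -1;
        }
        /* e.g. rewrite "host=orders-logical" into "host=10.0.0.5 port=6432" */
        return resolve(conninfo, rewritten, rewritten_len);
    }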


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] request a new feature in fuzzystrmatch

2013-06-15 Thread Josh Berkus
Liming,

I've added your proposal into the commitfest so that you can get a
review in this commitfest.  However, you need to generate a context diff
and submit it to this mailing list for it to really get reviewed.
That's a requirement.
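
One common way to produce a context diff from a git checkout is with
patchutils' filterdiff; the branch and file names below are just
placeholders:

    $ git diff master..my-feature-branch > feature.patch
    $ filterdiff --format=context feature.patch > feature.context.diff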

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for fail-back without fresh backup

2013-06-15 Thread Amit kapila
On Saturday, June 15, 2013 8:29 PM Sawada Masahiko wrote:
On Sat, Jun 15, 2013 at 10:34 PM, Amit kapila amit.kap...@huawei.com wrote:

 On Saturday, June 15, 2013 1:19 PM Sawada Masahiko wrote:
 On Fri, Jun 14, 2013 at 10:15 PM, Amit Kapila amit.kap...@huawei.com wrote:
 On Friday, June 14, 2013 2:42 PM Samrat Revagade wrote:
 Hello,

 We have already started a discussion on pgsql-hackers for the problem of
 taking a fresh backup during the failback operation; here is the link for
 that:


 http://www.postgresql.org/message-id/CAF8Q-Gxg3PQTf71NVECe-6OzRaew5pWhk7yQtb
 jgwrfu513...@mail.gmail.com

 Let me again summarize the problem we are trying to address.


   How will you take care of extra WAL on the old master during recovery?
   If it replays the WAL which has not reached the new master, it can be a problem.

 you mean that it is possible that the old master's data is ahead of the new
 master's data.

   What I mean to say is that the WAL of the old master can be ahead of
   the new master's.  I understood that the data files of the old master
   can't be ahead, but I think the WAL can be ahead.

 so there is inconsistent data between those servers when failing back, right?
 if so, that inconsistency is not possible, because if you use the GUC option
 as he proposes (i.e., failback_safe_standby_mode = remote_flush),
 then while the old master is working fine, no file-system-level changes are
 done before the WAL is replicated.

 Would the proposed patch take care that the old master's WAL is also not
 ahead in some way?
 If yes, I think I am missing some point.

 yes, it can happen that the old master's WAL is ahead of the new master's
 WAL, as you said,
 but I think that we can solve that by deleting all WAL files when the old
 master starts as the new standby.

I think ideally it should reset the WAL location to the point where the new
master forked off.
In such a scenario it would be difficult for a user who wants to get a dump
of some data in the old master which hasn't gone to the new master.  I am
not sure whether such a need exists for real users, but if it does, then
this solution will have some drawbacks.
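
To illustrate the divergence (the LSNs below are made up): after promotion,
the new master records the point where it forked off in a timeline history
file, and any WAL the old master generated past that point on the old
timeline is exactly the part that must not be replayed:

    $ cat $PGDATA/pg_xlog/00000002.history
    1	0/5000090	no recovery target specified

    # The old master may still have WAL beyond 0/5000090 on timeline 1 that
    # the new master never received; that is the WAL which would need to be
    # discarded (or rewound) before it rejoins as a standby.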

With Regards,
Amit Kapila.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] request a new feature in fuzzystrmatch

2013-06-15 Thread Liming Hu

On 6/15/2013 9:51 PM, Josh Berkus wrote:

Liming,

I've added your proposal in to the commitfest so that you can get a
review this commitfest.  However, you need to generate a context diff
and submit it to this mailing list for it to really get reviewed.
That's a requirement.


Hi Josh,

Thanks. Where can I see the proposal? Thanks.

Liming


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers