[HACKERS] Proposal: ALTER EXTENSION SET OPTION
Hi all;

One of the annoyances we currently deal with regarding analytics extensions in a PG environment with mixed versions is that there is no facility right now to conditionally support certain modifications to functions that might be required to make certain features work properly. The case that comes to mind right now is marking some functions parallel safe for PostgreSQL 9.6 and above while taking no action for 9.5. Currently we have a few options:

1. Mark on a best-effort basis
2. Drop support for 9.5 and below
3. Come up with some much more complicated version graph

It would be very nice to be able to define some options which could be set for extensions but don't affect their arguments or return types, such as marking functions parallel safe, and then have scripts which run to apply these options. My thinking is one would have a syntax for a few specified options, such as:

ALTER EXTENSION foo SET OPTION PARALLEL SAFETY;

or, if the extension supports the reverse:

ALTER EXTENSION foo UNSET OPTION PARALLEL SAFETY;

Over time more options could be added, but each would be mapped to a file convention. In this case, we could expect:

foo--[version]--set-parallel-safety.sql
and
foo--[version]--unset-parallel-safety.sql

-- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
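To make the file convention concrete, a script under the proposed naming scheme might contain nothing but ALTER FUNCTION statements. (The SET OPTION syntax and the script name are part of this proposal, not existing PostgreSQL features; the function names below are invented for illustration.)

```sql
-- foo--1.2--set-parallel-safety.sql
-- Hypothetical script under the proposed convention, run by:
--   ALTER EXTENSION foo SET OPTION PARALLEL SAFETY;
-- The ALTER FUNCTION ... PARALLEL SAFE syntax itself exists since 9.6.
ALTER FUNCTION foo_sum(integer, integer) PARALLEL SAFE;
ALTER FUNCTION foo_rollup(text) PARALLEL SAFE;
```

On 9.5 and below the file would simply be absent, so the extension installs cleanly without the 9.6-only markings.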
Re: [HACKERS] proposal: schema variables
Some thoughts on this. On Thu, Nov 2, 2017 at 4:48 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Nico Williams <n...@cryptonector.com> writes: > > With access controls, GUCs could become schema variables, and settings > > from postgresql.conf could move into the database itself (which I think > > would be nice). > > People re-propose some variant of that every so often, but it never works, > because it ignores the fact that some of the GUCs' values are needed > before you can access system catalogs at all, or in places where relying > on system catalog access would be a bad idea. > I think the basic point one should get here is that no matter the unification, you still have some things in the db and some things out. I would rather look at how the GUC could be improved on a functional/use case level before we look at the question of a technical solution. One major use case today would be restricting how high various users can set something like work_mem or the like. As it stands, there isn't really a way to control this with any granularity. So some of the proposals regarding granting access to a session variable would be very handy in granting access to a GUC variable. > > Sure, we could have two completely different configuration mechanisms > so that some of the variables could be "inside the database", but that > doesn't seem like a net improvement to me. The point of the Grand Unified > Configuration mechanism was to be unified, after all. > +1 > > I'm on board with having a totally different mechanism for session > variables. The fact that people have been abusing GUC to store > user-defined variables doesn't make it a good way to do that. 
> What about having a more clunky syntax, such as:

SET VARIABLE foo = 'bar';

Perhaps one can have a short form of:

SET VAR foo = 'bar';

vs

SET foo = 'bar'; -- GUC

> > regards, tom lane > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Oracle to PostGre
As a brief note, this is probably not the best list for this. You would do better to ask questions like this on -general where you have more application developers and so forth. This is more of an SQL question so asking people who are hacking the codebase may not be the best way to get it answered. Also, it is Postgres or PostgreSQL. People will assume you are totally new if you call it Postgre. On Wed, Nov 1, 2017 at 12:55 PM, Brahmam Eswar <brahmam1...@gmail.com> wrote: > Hi, > > App is moving to Postgre from Oracel . After migrating the store procedure > is throwing an error with collection type. > > *Oracle :* > > create or replace PROCEDURE"PROC1" > ( > > , REQ_CURR_CODE IN VARCHAR2 > , IS_VALID OUT VARCHAR2 > , ERROR_MSG OUT VARCHAR2 > ) AS > > >TYPE INV_LINES_RT IS RECORD( > VENDOR_NUM AP.CREATURE_TXN_LINE_ITEMS.VENDOR_NUM%TYPE, > VENDOR_SITE_CODE AP.CREATURE_TXN_LINE_ITEMS. > VENDOR_SITE_CODE%TYPE, > INVOICE_NUM AP.CREATURE_TXN_LINE_ITEMS.INVOICE_NUM%TYPE, > TXN_CNT NUMBER >); >TYPE INV_LINES_T IS TABLE OF INV_LINES_RT; >L_INV_LINES INV_LINES_T; >IS_MULTI_VENDOR FINO_APRVL_BUS_CHANN_DEFAULTS. > MULTI_VENDOR_REQ_ALLOWED%TYPE; >BUS_CHANNEL_RECORD FINO_APRVL_BUS_CHANN_DEFAULTS%ROWTYPE; > CAL_APRVL_AMT_BY_TOTAL_AMT FINO_APRVL_BUS_CHANN_DEFAULTS. 
> CAL_APR_AMT_BY_INV%TYPE; > > > Postgre : > > create or replace FUNCTION AP.VALIDATE_CRTR_LINE_ITEMS > ( > REQ_CURR_CODE IN VARCHAR, > IS_VALID OUT VARCHAR, > ERROR_MSG OUT VARCHAR > ) AS $$ > > DECLARE > > INV_LINES_T ap.validate_crtr_line_items$inv_lines_rt ARRAY; > L_INV_LINES INV_LINES_T%TYPE; > L_INV_LINES$temporary_record ap.validate_crtr_line_items$inv_lines_rt; > IS_MULTI_VENDOR AP.FINO_APRVL_BUS_CHANN_DEFAULTS.MULTI_VENDOR_REQ_ > ALLOWED%TYPE; > BUS_CHANNEL_RECORD ap.fino_aprvl_bus_chann_defaults%ROWTYPE; > CAL_APRVL_AMT_BY_TOTAL_AMT AP.FINO_APRVL_BUS_CHANN_ > DEFAULTS.CAL_APR_AMT_BY_INV%TYPE; > > > but it's throwing an error as : 0 SQLState: 42P01 Message: ERROR: > relation "l_inv_lines" does not exist > -- > Thanks & Regards, > Brahmeswara Rao J. > When you ask on -general, please include the query which is actually causing the problem. My guess is that either you didn't declare the type properly or there is some other error in your function, but the information provided is not sufficient to answer it. Best of luck asking on -general. -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
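For what it is worth, one common way to port an Oracle RECORD plus TABLE OF collection to PL/pgSQL is a composite type plus an array of that type; the 42P01 error above suggests the array variable is being referenced as if it were a relation. A rough sketch only — column types and the source table are guessed from the Oracle snippet:

```sql
-- Composite type standing in for Oracle's INV_LINES_RT record
CREATE TYPE ap.inv_lines_rt AS (
    vendor_num       text,
    vendor_site_code text,
    invoice_num      text,
    txn_cnt          numeric
);

CREATE OR REPLACE FUNCTION ap.validate_crtr_line_items(
    req_curr_code varchar,
    OUT is_valid varchar,
    OUT error_msg varchar
) AS $$
DECLARE
    l_inv_lines ap.inv_lines_rt[];   -- stands in for TABLE OF INV_LINES_RT
BEGIN
    -- Fill the collection with a query instead of selecting FROM the
    -- variable itself (which PostgreSQL would parse as a missing relation,
    -- hence "relation ... does not exist").
    SELECT array_agg((vendor_num, vendor_site_code, invoice_num, 1)::ap.inv_lines_rt)
      INTO l_inv_lines
      FROM ap.creature_txn_line_items;

    is_valid := 'Y';
    error_msg := NULL;
END;
$$ LANGUAGE plpgsql;
```

The point of the sketch is the declaration pattern, not the business logic: arrays of composite types replace Oracle's local collection types, and they are populated with queries or array functions rather than treated as tables.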
Re: [HACKERS] WIP: Restricting pg_rewind to data/wal dirs
Attached is the patch as submitted for commitfest. Please note, I am not averse to adding an additional --include-path directive if that avoids backwards-compatibility problems. However, the patch is complex enough that I would really prefer review of the rest of it to start first. This doesn't strike me as perfectly trivial, and I think it is easier to review pieces separately. I have already started on the --include-path directive and could probably get it attached to a later version of the patch very soon.

I would also like to address a couple of important points here:

1. I think default restrictions plus additional paths is the best, safest way forward. Excluding shell globs doesn't solve the "I need this particular config file" case very well, particularly if we want to support this outside of an internal environment. Shell globs also tend to be overly inclusive, so if you exclude based on them, you run the risk that your rewind is corrupt for being overly exclusive.

2. I would propose any additional paths be specified using an --include-path directive. This could be specified multiple times and could point to a file or directory which would be added to the paths rewound. I have a patch for this mostly done, but I am concerned that these sorts of changes result in a combination of changes that are easier to review separately than together. So if needed, I can add it here or in a separate patch following.

3. I think it would be a mistake to tie backup solutions in non-replicated environments to replication use cases, and vice versa. Things like replication slots (used for streaming backups) have different considerations in different environments. Rather than build the same infrastructure first, I think it is better to support different use cases well and then build common infrastructure to support them. I am not against building common infrastructure for pg_rewind and pg_basebackup.
I am very much against having the core guarantees be exactly the same. Best Wishes, Chris Travers On Sat, Oct 28, 2017 at 1:22 PM, Chris Travers <chris.trav...@adjust.com> wrote: > Hi; > > There are still some cleanup bits needed here but I wanted to get feedback > on my general direction. > > I hope to submit for commit fest soon if the general feedback is good. > Tests are passing (with adjustments intended for change of behaviour in one > test script). I want to note that Crimson Thompson (also with Adjust) has > provided some valuable discussion on this code. > > The Problem: > > pg_rewind makes no real guarantees regarding the final state of non-data > files or directories. It checks to see if the timeline has incremented > (and therefore guarantees that if successful the data directories are on > the same timeline) but for non-data files, these are clobbered if we rewind > and left intact if not. These other files include postgresql.auto.conf, > replication slots, and can include log files. > > Copying logs over to the new slave is something one probably never wants > to do (same with replication slots), and the behaviours here can mask > troubleshooting regarding why a particular master failed, cause wal > segments to build up, automatic settings changes, and other undesirable > behaviours. Together these make pg_rewind very difficult to use properly > and push tasks to replication management tooling that the management tools > are not in a good position to handle correctly. > > Designing the Fix: > > Two proposed fixes have been considered and one selected: one was to > whitelist directories and rewind only those; the other was to allow shell > globs to be provided that could be used to exclude files. The whitelisting > solution was chosen for a number of reasons. > > When pg_rewind "succeeds" but not quite correctly the result is usually a > corrupted installation which requires a base backup to replace it anyway.
> In a recovery situation, sometimes pressures occur which render human > judgment less effective, and therefore glob exclusion strikes me as > something which would probably do more harm than good, but maybe I don't > understand the use case (comments as to why some people look at the other > solution as preferable would be welcome). > > In going with the whitelisting solution, we chose to include all > directories with WAL-related information. This allows more predictable > interaction with other parts of the replication chain. Consequently not > only do we copy pg_wal and pg_xact but also commit timestamps and so forth. > > The Solution: > > The solution is a whitelist of directories specified which are the only > ones which are synchronised. The relevant part of this patch is: > > +/* List of directories to synchronize: > > + * base data dirs (a
Re: [HACKERS] Patch: restrict pg_rewind to whitelisted directories
On Tue, Oct 31, 2017 at 1:38 PM, Robert Haas <robertmh...@gmail.com> wrote: > On Mon, Oct 30, 2017 at 6:44 PM, Chris Travers <chris.trav...@adjust.com> > wrote: > > The attached patch is cleaned up and filed for the commit fest this next > > month: > > It's generally better to post the patch on the same message as the > discussion thread, or at least link back to the discussion thread from > the new one. Otherwise people may have trouble understanding the > history. > Fair point. Mostly concerned about the WIP marker there. This is my first time around this. I will then retire this thread and attach the patch on the other one. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
[HACKERS] Anyone have experience benchmarking very high effective_io_concurrency on NVME's?
Hi; After Andres's excellent talk at PGConf we tried benchmarking effective_io_concurrency on some of our servers and found that those which have a number of NVMe storage volumes could not fill the I/O queue even at the maximum setting (1000). Before we start benchmarking patched versions of PostgreSQL to bump this setting up to ever higher values, I am wondering if anyone has done this yet and, if so, whether you would be willing to share results. -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
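For anyone wanting to reproduce the setup, the relevant knob (within the current hard-coded ceiling of 1000) can be set session-wide or scoped to a tablespace; the tablespace name below is hypothetical:

```sql
-- Session-wide, at the current maximum:
SET effective_io_concurrency = 1000;

-- Or scoped to the NVMe-backed tablespace only, so slower volumes
-- keep their own setting:
ALTER TABLESPACE nvme_ts SET (effective_io_concurrency = 1000);
```

The per-tablespace form is probably the more realistic comparison point on servers mixing NVMe and spinning storage.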
[HACKERS] Patch: restrict pg_rewind to whitelisted directories
The attached patch is cleaned up and filed for the commit fest this next month: Here's the full commit message via Mercurial. I will likely have a new branch per version of the patch since that's the closest thing to a rebase in this version control system.

changeset: 60492:47f87a2d2fa1
tag: mine/pg_rewind_restrict_dirs
parent: 60446:e638ba9c3c11
user: Chris Travers <chris.trav...@gmail.com>
date: Mon Oct 30 12:25:18 2017 +0100
files: doc/src/sgml/ref/pg_rewind.sgml src/bin/pg_rewind/copy_fetch.c src/bin/pg_rewind/fetch.c src/bin/pg_rewind/fetch.h src/bin/pg_rewind/libpq_fetch.c src/bin/pg_rewind/pg_rewind.c src/bin/pg_rewind/t/003_extrafiles.pl
description: Restrict pg_rewind to whitelisted directories.

This is intended to be a minimum working version and in fact builds and passes tests. Note that tests for extra files have been changed to reflect new behavior and additional debugging information added to the output in case of failure. The patch iterates through a series of set directories to synchronize them only. This improves predictability of the complete state of the system after a rewind. One important outstanding question here is whether we need to ensure the possibility of backing up other files if they exist via an --include-path command line switch (this would not be a glob). In the thread discussing this patch, Michael Paquier has expressed concern about configuration files created by extensions or other components not being copied. I could add such a switch, but the patch is long enough, and it is unclear to what extent this is needed at present, so I am leaving it at the reviewer's discretion whether I should add this here or submit a second patch later to add the ability to add additional paths to the filemap. Either way, it is worth noting that I expect to have a subsequent patch either incorporated here or in a further submission that takes this and adds the ability to include additional directories or files via a command line flag.
This will *not* be a shell glob but one directory or file per invocation of the switch (similar to -t in pg_dump). -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin pg_rewind_restrict_dirs.v2.patch Description: Binary data
Re: [HACKERS] WIP: Restricting pg_rewind to data/wal dirs
On Mon, Oct 30, 2017 at 11:36 AM, Michael Paquier <michael.paqu...@gmail.com > wrote: > On Mon, Oct 30, 2017 at 10:15 AM, Chris Travers > <chris.trav...@adjust.com> wrote: > > This also brings up a fairly major concern more generally about control > by > > the way. A lot of cases where pg_rewind is called, the user doesn't > > necessarily have much control on how it is called. Moreover in many of > > these cases, the user is probably not in a position to understand the > > internals well enough to grasp what to check after. > > Likely they are not. > So I am submitting the current patch to commit fest. I am open to adding a new --include-path option in this patch, but I am worried about atomicity concerns, and I am still not really sure what the impact is for you (I haven't heard whether you expect this file to be there before the rewind, i.e. whether it would be on both master and slave, or just on the slave). So there is the question of under what specific circumstances this would break for you. > > > Right, but there is a use case difference between "I am taking a backup > of a > > server" and "I need to get the server into rejoin the replication as a > > standby." > > The intersection of the first and second categories is not empty. Some > take backups and use them to deploy standbys. > Sure, but it is not complete either. > > > A really good example of where this is a big problem is with replication > > slots. On a backup I would expect you want replication slots to be > copied > > in. > > I would actually expect the contrary, and please note that replication > slots are not taken in a base backup, which is what the documentation > says as well: > https://www.postgresql.org/docs/10/static/protocol-replication.html > "pg_dynshmem, pg_notify, pg_replslot, pg_serial, pg_snapshots, > pg_stat_tmp, and pg_subtrans are copied as empty directories (even if > they are symbolic links)." > Many of these are emptied on server restart but pg_replslot is not.
What is the logic for excluding it from backups other than to avoid problems with replication provisioning? I mean, if I have a replication slot for taking backups with barman (and I am not actually doing replication), why would I *not* want that in my base backup if I might have to restore to a new machine somewhere? > > Some code I have with 9.6's pg_basebackup removes replication > slots manually, as this logic is new in 10 ;) > > >> The pattern that base backups now use is an exclude list. In the > >> future I would rather see base backups and pg_rewind using the same > >> infrastructure for both things: > >> - pg_rewind should use the replication protocol with BASE_BACKUP. > >> Having it rely on root access now is crazy. > >> - BASE_BACKUP should have an option where it is possible to exclude > >> custom paths. > >> What you are proposing here would make both diverge more, which is in > >> my opinion not a good approach. > > > > How does rep mgr or other programs using pg_rewind know what to exclude? > > Good question. Answers could come from folks such as David Steele > (pgBackRest) or Marco (barman) whom I am attaching in CC. > Two points that further occur to me: Shell globs seem to me to be foot guns in this area. I think special paths should be one path per invocation of the option, not "--exclude=pg_w*", since this is just asking for problems as time goes on and things get renamed. It also seems to me that adding specific paths is far safer than removing specific paths. > -- > Michael > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] WIP: Restricting pg_rewind to data/wal dirs
On Mon, Oct 30, 2017 at 10:57 AM, Michael Paquier <michael.paqu...@gmail.com > wrote: > On Mon, Oct 30, 2017 at 9:43 AM, Chris Travers <chris.trav...@adjust.com> > wrote: > > Are there any cases right now where you have features added by > extensions that write to directories which are required for a rewind? > > In some of the stuff I maintain, I actually have one case now of a > configuration file included with include_if_exists by postgresql.conf > and is expected to be generated by a component that my code doing the > rewind has no direct access on... I can control how pg_rewind kicks > in, but I think that you would actually break silently the current > code, which is scary especially if I don't control the code where > pg_rewind is called when Postgres 11 gets integrated into the product > I am thinking about, and even more scary if there is no way to include > something. > Ok, so there is an argument that there needs to be a way to include additional paths in this patch. One important question I would have in these cases is whether you expect these to be set only on the master. If not, then is this a problem, and if not, then how do you handle the fail-back problem etc.? This also brings up a fairly major concern more generally about control, by the way. In a lot of cases where pg_rewind is called, the user doesn't necessarily have much control over how it is called. Moreover, in many of these cases, the user is probably not in a position to understand the internals well enough to grasp what to check after. > > > The problem with an exclude list is that we cannot safely exclude > > configuration files or logs (because these could be named anything), > right? > > You have the exact same problem with base backups now, and any > configuration files created by extensions are a per-case problem... > Right, but there is a use case difference between "I am taking a backup of a server" and "I need to get the server to rejoin replication as a standby."
A really good example of where this is a big problem is with replication slots. On a backup I would expect you want replication slots to be copied in. However when setting yourself up as a slave you most certainly do not want to just copy these over from the master unless you have infinite disk space. I would argue that under *no* circumstance should pg_rewind *ever* copy replication slots. But pg_base_backup really *should* (and when provisioning a new slave you should clear these as soon as it is set up). The pattern that base backups now use is an exclude list. In the > future I would rather see base backups and pg_rewind using the same > infrastructure for both things: > - pg_rewind should use the replication protocol with BASE_BACKUP. > Having it rely on root access now is crazy. > - BASE_BACKUP should have an option where it is possible to exclude > custom paths. > What you are proposing here would make both diverge more, which is in > my opinion not a good approach. > How does rep mgr or other programs using pg_rewind know what to exclude? Again I am not convinced setting up a replica and taking a backup for disaster recovery are the same use case or have the same requirements. -- > Michael > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
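As an aside, the manual cleanup step described above ("clear these as soon as it is set up") can be done from SQL on the freshly provisioned standby, assuming any slots copied over from the master are inactive:

```sql
-- Drop replication slots inherited from the master after provisioning,
-- so they do not pin WAL on the new standby indefinitely.
SELECT pg_drop_replication_slot(slot_name)
  FROM pg_replication_slots
 WHERE NOT active;
```

This is exactly the kind of step that differs between the backup use case (where you may want the slots preserved) and the standby-provisioning use case (where you almost never do).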
Re: [HACKERS] WIP: Restricting pg_rewind to data/wal dirs
First, thanks for your thoughts on this, and I am interested in probing them more. On Mon, Oct 30, 2017 at 9:04 AM, Michael Paquier <michael.paqu...@gmail.com> wrote: > On Sat, Oct 28, 2017 at 4:22 AM, Chris Travers <chris.trav...@adjust.com> > wrote: > > The Solution: > > The solution is a whitelist of directories specified which are the only > ones > > which are synchronised. The relevant part of this patch is: > > > > +/* List of directories to synchronize: > > + * base data dirs (and ablespaces) > > + * wal/transaction data > > + * and that is it. > > + * > > + * This array is null-terminated to make > > + * it easy to expand > > + */ > > > > +const char *rewind_dirs[] = { > > +"base", > > +"global", > > +"pg_commit_ts", > > +"pg_logical", > > +"pg_multixact", > > +"pg_serial", > > +"pg_subtrans", > > +"pg_tblspc", > > +"pg_twophase", > > +"pg_wal", > > +"pg_xact", > > +NULL > > +}; > > The problem with a white list, which is one reason why I do not favour > it, is in the case where a feature adds in the data folder a new path > for its data, particularly in the case where this is critical for a > base backup. If this folder is not added in this list, then a rewind > may be silently corrupted as well. There are also a set of directories > in this list that we do not care about, pg_serial being one. > pg_subtrans is a second, as it gets zeroed on startup. > The problem with an exclude list is that we cannot safely exclude configuration files or logs (because these could be named anything), right? Are there any cases right now where you have features added by extensions that write to directories which are required for a rewind? I am asking because I would like to see if this is the minimum working change or if this change is fundamentally broken currently and must be extended to allow custom paths to be sync'd as well. > > And if you think about it, pg_rewind is actually the *same* processing > as a base backup taken using the replication protocol. 
So instead of > this patch I would recommend using excludeFiles and excludeDirContents > by moving this list to a common header where frontend and backend can > refer to. Note that basebackup.h includes replnodes.h, so my thoughts > is that you should split the current header with something like > basebackup_internal.h which is backend-only, and have pg_rewind fetch > the list of directories to automatically ignore as those ones. You can > also simplify a bit the code of pg_rewind a bit so as things like > postmaster.opts. On top of that, there is visibly no need to tweak the > SQL query fetching the directory list (which is useful for debugging > as shaped now actually), but just change process_source_file so as its > content is not added in the file map. > I am not sure it *should* be the same, however. In a backup we probably want to backup the postgresql.auto.conf, but on a failover, I don't think we want to clobber configuration. We certainly do not want to sometimes clobber configuration but not other times (which is what happens right now in some cases). And under no circumstances do we want to clobber logs on a failed server with logs on a working server. That's asking for serious problems in my view. If you think about it, there's a huge difference in use case in backing up a database cluster (Including replication slots, configs in the dir etc) and re-syncing the data so that replication can resume, and I think there are some dangers that come up when assuming these should be closely tied together. > > Then there is a second case where you do not want a set of folders to > be included, but those can be useful by default, an example here being > pg_wal where one might want to have an empty path to begin with. A > third case is a set of folders that you do not care about, but depends > on the deployment, being here "log" for example. 
For those you could > just add an --exclude-path option which simply piles up paths into a > linked list when called multiple times. This could happen with a > second patch. > Agreed. And one could add an "--include-path" option to allow for unusual cases where you want extra directories, such as replication slots, or the like. I think another patch could also specifically empty and perhaps create a replication slot, allowing one to bring things back up via streaming replication more safely. > > > From there we iterate over this array for directory-based approaches in > copy_fetch.c and with one query per directory in libpq
Re: [HACKERS] proposal: schema variables
On Sat, Oct 28, 2017 at 4:56 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote: > >> > The creating database objects and necessary infrastructure is the most > simple task of this project. I'll be more happy if there are zero > intersection because variables and GUC are designed for different purposes. > But due SET keyword the intersection there is. > > When I thinking about it, I have only one, but important reason, why I > prefer design new type of database object -the GUC are stack based with > different default granularity - global, database, user, session, function. > This can be unwanted behave for variables - it can be source of hard to > detected bugs. I afraid so this behave can be too messy for usage as > variables. > > @1 I have not clean opinion about it - not sure if rights are good enough > - probably some user limits can be more practical - but can be hard to > choose result when some user limits and GUC will be against > I was mostly thinking that users can probably set things like work_mem and possibly this might be a problem. > @2 With variables typed custom GUC are not necessary > I don't know about that. For example with the geoip2lookup extension it is nice that you could set the preferred language for translation on a per user basis or the mmdb path on a per-db basis. > @3 Why you need it? It is possible with set_config function now. > Yeah you could do it safely with set_config and a CTE, but suppose I have: with a (Id, value) as (values (1::Int, 'foo'), (2, 'bar'), (3, 'baz')) SELECT set_config('custom_val', value) from a where id = 2; What is the result out of this? I would *expect* that this would probably run set_config 3 times and filter the output. 
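For reference, set_config actually takes a third argument (is_local) controlling whether the value reverts at transaction end, and custom settings need a dotted name; a hedged restatement of the snippet above (the `myapp.` prefix is an assumption for illustration):

```sql
-- set_config(setting_name, new_value, is_local): is_local = false keeps
-- the value for the session; true reverts it at transaction end.
-- How many times it fires inside a query's target list is not guaranteed,
-- which is exactly the hazard being discussed.
WITH a(id, value) AS (VALUES (1, 'foo'), (2, 'bar'), (3, 'baz'))
SELECT set_config('myapp.custom_val', value, false)
  FROM a
 WHERE id = 2;

SELECT current_setting('myapp.custom_val');
```

Whether the planner evaluates set_config once (after filtering) or for every row is an implementation detail, which is what makes volatile functions with side effects in target lists a poor fit for variable assignment.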
> Regards
> Pavel
-- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] proposal: schema variables
On Thu, Oct 26, 2017 at 9:21 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote: > Hi, > > I propose a new database object - a variable. The variable is persistent > object, that holds unshared session based not transactional in memory value > of any type. Like variables in any other languages. The persistence is > required for possibility to do static checks, but can be limited to session > - the variables can be temporal. > > My proposal is related to session variables from Sybase, MSSQL or MySQL > (based on prefix usage @ or @@), or package variables from Oracle (access > is controlled by scope), or schema variables from DB2. Any design is coming > from different sources, traditions and has some advantages or > disadvantages. The base of my proposal is usage schema variables as session > variables for stored procedures. It should to help to people who try to > port complex projects to PostgreSQL from other databases. > > The Sybase (T-SQL) design is good for interactive work, but it is weak > for usage in stored procedures - the static check is not possible. Is not > possible to set some access rights on variables. > > The ADA design (used on Oracle) based on scope is great, but our > environment is not nested. And we should to support other PL than PLpgSQL > more strongly. > > There is not too much other possibilities - the variable that should be > accessed from different PL, different procedures (in time) should to live > somewhere over PL, and there is the schema only. > > The variable can be created by CREATE statement: > > CREATE VARIABLE public.myvar AS integer; > CREATE VARIABLE myschema.myvar AS mytype; > > CREATE [TEMP] VARIABLE [IF NOT EXISTS] name AS type > [ DEFAULT expression ] [[NOT] NULL] > [ ON TRANSACTION END { RESET | DROP } ] > [ { VOLATILE | STABLE } ]; > > It is dropped by command DROP VARIABLE [ IF EXISTS] varname. > > The access rights is controlled by usual access rights - by commands > GRANT/REVOKE. 
The possible rights are: READ, WRITE > > The variables can be modified by SQL command SET (this is taken from > standard, and it natural) > > SET varname = expression; > > Unfortunately we use the SET command for different purpose. But I am > thinking so we can solve it with few tricks. The first is moving our GUC to > pg_catalog schema. We can control the strictness of SET command. In one > variant, we can detect custom GUC and allow it, in another we can disallow > a custom GUC and allow only schema variables. A new command LET can be > alternative. > > The variables should be used in queries implicitly (without JOIN) > > SELECT varname; > > The SEARCH_PATH is used, when varname is located. The variables can be > used everywhere where query parameters are allowed. > > I hope so this proposal is good enough and simple. > > Comments, notes? > I have a question on this. Since one can issue set commands on arbitrary settings (and later ALTER database/role/system on settings you have created in the current session) I am wondering how much overlap there is between a sort of extended GUC with custom settings and variables. Maybe it would be simpler to treat variables and GUC settings to be similar and see what can be done to extend GUC in this way? I mean if instead we allowed restricting SET to known settings then we could have a CREATE SETTING command which would behave like this and then use SET the same way across both. In essence I am wondering if this really needs to be as separate from GUC as you are proposing. If done this way then: 1. You could issue grant or revoke on GUC settings, allowing some users but not others to set things like work_mem for their queries 2. You could specify allowed types in custom settings. 3. In a subsequent stage you might be able to SELECT INTO setting_name FROM ; allowing access to setting writes based on queries. 
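For context, part of this already works today: custom settings can be given per-role and per-database defaults via ALTER ROLE / ALTER DATABASE, just without any access control. The extension and setting names below are hypothetical, as is the GRANT syntax sketched in the comment:

```sql
-- What the existing GUC machinery already allows for custom settings:
ALTER ROLE reporting SET geoip2lookup.preferred_lang = 'de';
ALTER DATABASE appdb SET geoip2lookup.mmdb_path = '/srv/geoip/GeoLite2-City.mmdb';

-- What the proposal above would add on top (hypothetical syntax):
-- GRANT WRITE ON SETTING work_mem TO analyst;
```

The gap the proposal targets is the first point in the list above: there is currently no way to say that one role may raise work_mem and another may not.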
> regards > > Pavel > > > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
[HACKERS] WIP: Restricting pg_rewind to data/wal dirs
Hi; There are still some cleanup bits needed here but I wanted to get feedback on my general direction. I hope to submit for commit fest soon if the general feedback is good. Tests are passing (with adjustments intended for the change of behaviour in one test script). I want to note that Crimson Thompson (also with Adjust) has provided some valuable discussion on this code. The Problem: pg_rewind makes no real guarantees regarding the final state of non-data files or directories. It checks to see if the timeline has incremented (and therefore guarantees that, if successful, the data directories are on the same timeline), but non-data files are clobbered if we rewind and left intact if not. These other files include postgresql.auto.conf, replication slots, and can include log files. Copying logs over to the new slave is something one probably never wants to do (same with replication slots), and the behaviours here can hamper troubleshooting of why a particular master failed, cause WAL segments to build up, silently change settings, and produce other undesirable behaviours. Together these make pg_rewind very difficult to use properly and push tasks onto replication management tooling that such tooling is not in a good position to handle correctly. Designing the Fix: Two fixes were considered and one selected: one was to whitelist directories and rewind only those; the other was to allow shell globs to be provided that could be used to exclude files. The whitelisting solution was chosen for a number of reasons. When pg_rewind "succeeds" but not quite correctly, the result is usually a corrupted installation which requires a base backup to replace it anyway. 
In a recovery situation, sometimes pressures occur which render human judgment less effective, and therefore glob exclusion strikes me as something which would probably do more harm than good, but maybe I don't understand the use case (comments as to why some people look at the other solution as preferable would be welcome). In going with the whitelisting solution, we chose to include all directories with WAL-related information. This allows more predictable interaction with other parts of the replication chain. Consequently we copy not only pg_wal and pg_xact but also commit timestamps and so forth. The Solution: The solution is a specified whitelist of directories which are the only ones synchronised. The relevant part of this patch is: +/* List of directories to synchronize: + * base data dirs (and tablespaces) + * wal/transaction data + * and that is it. + * + * This array is null-terminated to make + * it easy to expand + */ + +const char *rewind_dirs[] = { +"base", +"global", +"pg_commit_ts", +"pg_logical", +"pg_multixact", +"pg_serial", +"pg_subtrans", +"pg_tblspc", +"pg_twophase", +"pg_wal", +"pg_xact", +NULL +}; From there we iterate over this array for directory-based approaches in copy_fetch.c and with one query per directory in libpq_fetch.c. This also means shifting the basic interface from PQexec to PQexecParams. It would be possible to move to binary formats too, but this was not done here in order to simplify code review (that could be a separate independent patch at a later time). Testing Done: The extra-files tests correctly test this change in behaviour. The tests have been modified, and diagnostics in cases of failures expanded. The other tests provide good coverage of whether pg_rewind does what it is supposed to do. Cleanup still required: I accidentally left Carp::Always in this Perl module. It will be fixed. 
I renamed one of the functions to have a more descriptive name but have not yet removed the old function. Feedback is very welcome. pg_rewind is a very nice piece of software. I am hoping that these sorts of changes will help ensure that it is easier to use and provides more predictable results. -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin pg_rewind_restrict.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add Roman numeral conversion to to_number
On Sun, Sep 17, 2017 at 6:43 PM, David Fetter <da...@fetter.org> wrote: > On Sat, Sep 16, 2017 at 10:42:49PM +, Douglas Doole wrote: > > Oliver, I took a look at your tests and they look thorough to me. > > > > One recommendation, instead of having 3999 separate selects to test every > > legal roman numeral, why not just do something like this: > > > > do $$ > > declare > > i int; > > rn text; > > rn_val int; > > begin > > for i in 1..3999 loop > > rn := trim(to_char(i, 'rn')); > > rn_val := to_number(rn, 'rn'); > > if (i <> rn_val) then > > raise notice 'Mismatch: i=% rn=% rn_val=%', i, rn, rn_val; > > end if; > > end loop; > > raise notice 'Tested roman numerals 1..3999'; > > end; > > $$; > > > > It's a lot easier to maintain than separate selects. > > Why not just one SELECT, as in: > > SELECT i, to_char(i, 'rn'), to_number(to_char(i, 'rn'), 'rn'); > FROM generate_series(1,3999) i > Question: What is our definition of a legal Roman numeral? For example sometimes IXX appears in the corpus to refer to 19 even though our standardised notation would be XIX. > > Best, > David. > -- > David Fetter <david(at)fetter(dot)org> http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] pg_rewind proposed scope and interface changes
On Wed, Sep 13, 2017 at 6:28 AM, Michael Paquier <michael.paqu...@gmail.com> wrote: > On Tue, Sep 12, 2017 at 11:52 PM, Chris Travers > <chris.trav...@adjust.com> wrote: > > Additionally the wal, xact, timestamp and logical directories must be > > processed in some way. > > To what does the term "logical directories" refer to? > > > * if --wal=sync the directories are processed the way they are today > > * if --wal=clear then the contents of the directories are cleared and > > replication is assumed to be used to bring the system up after. Note > this > > will need to come with warning about the need for replication slots. > > Hm. I am not sure in what --wal=clear is helpful. Keeping around WAL > segments from the point of the last checkpoint where WAL forked up to > the point where WAL has forked is helpful, because you don't need to > copy again those WAL segments, be they come from an archive or from > streaming. Copying a set of WAL segments during the rewind of the new > timeline is helpful as well because you don't need to do the copy > again. One configuration where this is helpful is that there is > already an archive local to the target server available with the > segments of the new timeline available. > so maybe clear should just clear diverged wal files. That makes some sense. > > > Base, global, pg_tablespace > > > > With > > pg_wal, pg_xact, pg_commit_ts, pg_logical added if wal strategy is set to > > sync. > > Skipping some directories in a way similar to what a base backup does > would be nicer I think. We already have a list of those in > basebackup.c in the shape of excludeDirContents and excludeFiles. I > think that it would be a good idea to export those into a header that > pg_rewind could include, and refer to in order to exclude them when > fetching a set of files. At the end of the day, a rewind is a kind of > base backup in itself, and this patch would already serve well a lot > of people. 
> I think there are several reasons not to just do this based on base backup. For a base backup there may be some things we restore that we don't want to restore with pg_rewind. Configuration files, logs, and replication slots are the ones that immediately come to mind. Given that configuration files can have arbitrary names and locations, expecting third-party failover tooling like repmgr or Patroni to pick up on such policies strikes me as asking a bit much. I think we are better off with a program that solves one problem well and solves it right, and that problem is rewinding a system so that it can follow a replica that was previously following it. > > Having on top of that a way to exclude a wanted set of files and the > log directory (for example: should we look at log_directory and > exclude it from the fetched paths if it is not an absolute path?), > which is smart enough to take care of not removing paths critical for > a rewind like anything in base/, then you are good to go with a > full-blown tool that I think would serve the purposes you are looking > for. > If you are going to go this route, I think you need to look at the config files themselves, parse them, and possibly look at the command line by which PostgreSQL is started. Otherwise I don't know how you expect this to be configured. > -- > Michael > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
[HACKERS] pg_rewind proposed scope and interface changes
Hi all; After working with pg_rewind a bit more it has become clear to me that some aspects of the ideal scope of the program are a bit unclear. This is in part because in the majority of cases where you use the utility, a small amount of data loss is accepted as a tradeoff for continuity. Basic Responsibility: pg_rewind’s core responsibility is to restore data directories so that they are compatible with following another server. Expanded Options: As with pg_basebackup, WAL files and related data may need to be specified separately. Pre-run checks: PG_VERSION must be the same on target and source or else we stop. We then check the timelines, and if they have not diverged we do nothing. Processed directories: pg_rewind must *always* process the global, base, and pg_tablespace directories regardless of the WAL strategy. Additionally the wal, xact, timestamp, and logical directories must be processed in some way: * if --wal=sync the directories are processed the way they are today * if --wal=clear then the contents of the directories are cleared and replication is assumed to be used to bring the system up after. Note this will need to come with a warning about the need for replication slots. No other files will be clobbered. This means that the basic processing list will look like this: base, global, pg_tablespace, with pg_wal, pg_xact, pg_commit_ts, pg_logical added if the WAL strategy is set to sync. Proposed patch roadmap: 1. Improve WAL handling by allowing unlinking of all regular files in those directories during the rewind process. Make --wal=sync the default for now because that is backwards compatible. Longer term we may consider switching the default. 2. Restrict main operation to the directories listed above. Problems that we will not try to solve: * Rewinding past table creation orphans the table file. This is a complex topic on its own and probably needs a separate utility. Thoughts? 
-- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: pg_rewind to skip config files
On Thu, Sep 7, 2017 at 2:47 PM, Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote: > After reading this discussion, I agree that pg_rewind needs to become > smarter in order to be fully useful in production environments; right > now there are many warts and corner cases that did not seem to have been > considered during the initial development (which I think is all right, > taking into account that its mere concept was complicated enough; so we > need not put any blame on the original developers, rather the contrary). > Agreed with this assessment. As a solution to the problem of "base backups take too long to take and transfer," the overall design and its corner cases make a lot of sense. > > I think we need to make the program simple to use (i.e. not have the > user write shell globs for the common log file naming patterns) while > remaining powerful (i.e. not forcibly copy any files that do not match > hardcoded globs). I would add that well-defined tasks are a key aspect of powerful software in my view, and here the well-defined task is to restore data state to a particular usable timeline point taken from another system. If that is handled well, that opens up new uses and makes some problems that are difficult right now much easier to solve. > Is the current dry-run mode enough to give the user > peace of mind regarding what would be done in terms of testing globs > etc? If not, maybe the debug mode needs to be improved (for instance, > have it report the file size for each file that would be copied; > otherwise you may not notice it's going to copy the 20GB log file until > it's too late ...) > The dry-run facility solves one problem in one circumstance, namely a manually invoked run of the software along with the question of "will this actually rewind?" I suppose software developers might be able to use it to back up and restore things that are to be clobbered (but is anyone on the software development side likely to?). 
I don't see anything in that corner that can be improved without over-engineering the solution. There are two reasons I am skeptical that a dry-run mode will ever be "enough." The first is that pg_rewind is often integrated into auto-failover/failback tools, and the chance of running it in a dry-run mode before it is automatically triggered is more or less nil. These are also the cases where you won't notice it does something bad until much later. The second is that there are at least some corner cases we may need to define as outside the responsibility of pg_rewind. The one that comes to mind here is if I am rewinding back past the creation of a small table. I don't see an easy or safe way to address that from inside pg_rewind without a lot of complication. It might be better to have a dedicated tool for that. > > Now, in order for any of this to happen, there will need to be a > champion to define what the missing pieces are, write all those patches > and see them through the usual (cumbersome, slow) procedure. Are you, > Chris, willing to take the lead on that? > Yeah. I think the first step would be a list of the corner cases and a proposal for how I think it should work. Then maybe a roadmap of patches, and then submitting them as they become complete. > -- > Álvaro Herrera https://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: pg_rewind to skip config files
One more side to this which is relevant to other discussions. If I am rewinding back to before when a table was created, the current algorithm as well as any proposed algorithms will delete the reference to the relfilenode in the catalogs but not the file itself. I don't see how an undo subsystem would fix this. Is this a reason to rethink the idea that maybe a pg_fsck utility might be useful that could be run immediately after a rewind? -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: pg_rewind to skip config files
On Tue, Sep 5, 2017 at 2:40 PM, Vladimir Borodin <r...@simply.name> wrote: > > 5 сент. 2017 г., в 14:04, Michael Paquier <michael.paqu...@gmail.com> > написал(а): > > For example, in archive_command we put WALs for archiving from > pg_xlog/pg_wal into another directory inside PGDATA and than another cron > task makes real archiving. This directory ideally should be skipped by > pg_rewind, but it would not be handled by proposed change. > > > I would be curious to follow the reasoning for such a two-phase > archiving (You basically want to push it in two places, no? But why > not just use pg_receivexlog then?). This is complicated to handle from > the point of view of availability and backup reliability + durability. > > > We do compress WALs and send them over network. Doing it via > archive_command in single thread is sometimes slower than new WALs are > written under heavy load. > How would this work when it comes to rewinding against a file directory? > > -- > May the force be with you… > https://simply.name > > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: pg_rewind to skip config files
On Tue, Sep 5, 2017 at 1:04 PM, Michael Paquier <michael.paqu...@gmail.com> wrote: > On Tue, Sep 5, 2017 at 7:54 PM, Vladimir Borodin <r...@simply.name> wrote: > > 5 сент. 2017 г., в 12:31, Chris Travers <chris.trav...@adjust.com> > > написал(а): > > > > I think the simplest solution for now is to skip any files ending in > .conf, > > .log, and serverlog. > > This is not a portable solution. Users can include configuration files > with the names they want. So the current patch as proposed is > definitely not something worth it. > Actually that is exactly why I think the long-term solution is to figure out what we need to copy and not copy anything we don't recognise. That means the following directories as far as I can see: * base * global * pg_xlog/pg_wal * pg_clog/pg_xact * pg_commit_ts * pg_twophase * pg_snapshots? Are there any other directories I am missing? At any rate, I think the current state makes it very difficult to test rewind adequately, and it makes it extremely difficult to use in a non-trivial environment because you have to handle replication slots, configuration files, and so forth yourself, and you have to be aware that these *may* or *may not* be consistently clobbered by a rewind, so you have to have some way of applying another set of files in following a rewind. If nothing else we ought to *at least* special case the recovery.conf and the postgresql.auto.conf, and pg_replslot because these are always located there and should never be clobbered. > > > For example, in archive_command we put WALs for archiving from > > pg_xlog/pg_wal into another directory inside PGDATA and than another cron > > task makes real archiving. This directory ideally should be skipped by > > pg_rewind, but it would not be handled by proposed change. > > I would be curious to follow the reasoning for such a two-phase > archiving (You basically want to push it in two places, no? But why > not just use pg_receivexlog then?). 
This is complicated to handle from > the point of view of availability and backup reliability + durability. > > > While it is definitely an awful idea the user can easily put something > > strange (i.e. logs) to any important directory in PGDATA (i.e. into base > or > > pg_wal). Or how for example pg_replslot should be handled (I asked about > it > > a couple of years ago [1])? It seems that a glob/regexp for things to > skip > > is a more universal solution. > > > > [1] > > https://www.postgresql.org/message-id/flat/8DDCCC9D-450D- > 4CA2-8CF6-40B382F1F699%40simply.name > > Well, keeping the code simple is not always a bad thing. Logs are an > example that can be easily countered, as well as archives in your > case. > > -- > Michael > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: pg_rewind to skip config files
On Tue, Sep 5, 2017 at 12:54 PM, Vladimir Borodin <r...@simply.name> wrote: > > On 5 Sep 2017, at 12:31, Chris Travers <chris.trav...@adjust.com> > wrote: > > I think the simplest solution for now is to skip any files ending in > .conf, .log, and serverlog. > > > Why don’t you want to solve the problem once? It is a bit harder to get > consensus on a way how to do it, but it seems that there are no reasons to > make temporary solution here. > > For example, in archive_command we put WALs for archiving from > pg_xlog/pg_wal into another directory inside PGDATA and than another cron > task makes real archiving. This directory ideally should be skipped by > pg_rewind, but it would not be handled by proposed change. > Ok, let's back up a bit in terms of what I see as the proper long-term fix. Simple code, by the way, is important, but at least as important are programs which solve simple, well-defined problems. The current state is: 1. pg_rewind makes *no guarantee* as to whether differences in logs, config files, etc. are clobbered. They may be (if a rewind is needed) or not (if the timelines haven't diverged). Therefore the behavior of these sorts of files across an invocation of pg_rewind is not really well defined. That's a fairly big issue in an operational environment. 2. There are files which *may* be copied (i.e. are copied if the timelines have diverged) which *may* have side effects on replication topology, WAL archiving, etc. Replication slots are a good example. The problem I think pg_rewind should solve is "give me a consistent data environment from the timeline on that server." I would think that access to the xlog/clog files would indeed be relevant to that. If I were rewriting the application now I would include those. Just because something can be handled separately doesn't mean it should be, and I would prefer not to assume that archiving is properly set up and working. 
> > > Long run, it would be nice to change pg_rewind from an opt-out approach to > an approach of processing the subdirectories we know are important. > > > While it is definitely an awful idea the user can easily put something > strange (i.e. logs) to any important directory in PGDATA (i.e. into base or > pg_wal). Or how for example pg_replslot should be handled (I asked about it > a couple of years ago [1])? It seems that a glob/regexp for things to skip > is a more universal solution. > I am not convinced it is a universal solution unless you take an arbitrary number of regexes and loop through checking all of them. Then the chance of getting something catastrophically wrong in a critical environment goes way up, and you may end up in an inconsistent state at the end. Simple code is good. A program that solves simple problems reliably (and in simple ways) is better. The problem I see is that pg_rewind gets incorporated into other tools which don't really provide the user with before or after hooks, and therefore it isn't really fair to say, for example, that repmgr has the responsibility to copy server logs out if present, or to make sure that configuration files are not in the directory. The universal solution is to touch only the files that we know are needed and therefore work simply and reliably in a demanding environment. > > [1] https://www.postgresql.org/message-id/flat/8DDCCC9D-450D-4CA2-8CF6-40B382F1F699%40simply.name > > > -- > May the force be with you… > https://simply.name > > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: pg_rewind to skip config files
On Mon, Sep 4, 2017 at 3:38 PM, Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote: > Chris Travers wrote: > > On Mon, Sep 4, 2017 at 12:23 PM, Michael Paquier < > michael.paqu...@gmail.com> > > wrote: > > > > > On Mon, Sep 4, 2017 at 7:21 PM, Michael Paquier > > > <michael.paqu...@gmail.com> wrote: > > > > A simple idea would be to pass as a parameter a regex on which we > > > > check files to skip when scanning the directory of the target > remotely > > > > or locally. This needs to be used with care though, it would be easy > > > > to corrupt an instance. > > > > > > I actually shortcut that with a strategy similar to base backups: logs > > > are on another partition, log_directory uses an absolute path, and > > > PGDATA has no reference to the log path. > > > > Yeah, it is quite possible to move all these out of the data directory, > but > > bad things can happen when you accidentally copy configuration or logs > over > > those on the target and expecting that all environments will be properly > > set up to avoid these problems is not always a sane assumption. > > I agree that operationally it's better if these files weren't in PGDATA > to start with, but from a customer support perspective, things are > frequently not already setup like that, so failing to support that > scenario is a loser. > > I wonder how portable fnmatch() is in practice (which we don't currently > use anywhere). A shell glob seems a more natural interface to me for > this than a regular expression. > I think the simplest solution for now is to skip any files ending in .conf, .log, and serverlog. Long run, it would be nice to change pg_rewind from an opt-out approach to an approach of processing the subdirectories we know are important. 
It is worth noting further that if you rewind in the wrong way, in a cascading replication environment, you can accidentally change your replication topology if you clobber the recovery.conf from another replica and there is no way to ensure that this file is not in the data directory since it MUST be put there. Best Wishes, Chris Travers > > -- > Álvaro Herrerahttps://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: pg_rewind to skip config files
On Mon, Sep 4, 2017 at 12:23 PM, Michael Paquier <michael.paqu...@gmail.com> wrote: > On Mon, Sep 4, 2017 at 7:21 PM, Michael Paquier > <michael.paqu...@gmail.com> wrote: > > A simple idea would be to pass as a parameter a regex on which we > > check files to skip when scanning the directory of the target remotely > > or locally. This needs to be used with care though, it would be easy > > to corrupt an instance. > > I actually shortcut that with a strategy similar to base backups: logs > are on another partition, log_directory uses an absolute path, and > PGDATA has no reference to the log path. > Yeah, it is quite possible to move all these out of the data directory, but bad things can happen when you accidentally copy configuration or logs over those on the target, and expecting that all environments will be properly set up to avoid these problems is not always a sane assumption. Consequently, I think it would be good to fix this in the tool. The fundamental question is whether there is any reason someone would actually want to copy config files over. -- > Michael > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: pg_rewind to skip config files
Ok so I have a proof of concept patch here. This is proof of concept only. It does not change documentation or the like. The purpose of the patch is discussion on the "do we want this" side. The patch is fairly trivial, but I have not added test cases or changed docs yet. Intention of the patch: pg_rewind is an important backbone tool for recovering data directories following a switchover. However, currently it is over-inclusive as to what it copies. This patch excludes any file ending in "serverlog", ".conf", and ".log" because these are never directly related to the rewind itself and add a great deal of complexity to switchovers. .conf files are excluded for two major reasons. The first is that often we may want to put postgresql.conf and other files in the data directory, and if we change these during switchover this can change, for example, the port the database is running on or other things that can break production or testing environments. This is usually a problem with testing environments, but it could happen with production environments as well. A much larger concern with .conf files though is the recovery.conf. This file MUST be put in the data directory, and it helps determine the replication topology regarding cascading replication and the like. If you rewind from an upstream replica, you suddenly change the replication topology, and that can have wide-ranging impacts. I think we are much better off insisting that .conf files should be copied separately, because the scenarios where you want to do so are more limited and the concern is often separate from rewinding the timeline itself. The second major exclusion added is files ending in "serverlog" and ".log". I can find no good reason why server logs from the source should *ever* clobber those on the target. If you do this, you lose historical information relating to problems and introduce management issues. 
Backwards-incompatibility scenarios: If somehow one has a workflow that depends on copying .conf files, this would break that. I cannot think of any cases where anyone would actually want to do this, but that doesn't mean they aren't out there. If people really want to, then they need to copy the configuration files they want separately. Next Steps: If people like this idea I will add test cases and edit documentation as appropriate. -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin pg_rewind_log_conf_patch.patch Description: Binary data
[HACKERS] Proposal: pg_rewind to skip config files
In some experiments with pg_rewind and repmgr I noticed that local testing is complicated by the fact that pg_rewind appears to copy configuration files from the source to the target directory. I would propose to make a modest patch to exclude postgresql.conf, pg_hba.conf, and pg_ident.conf from the file tree traversal. Any feedback before I create a proof of concept? -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: global index
On Fri, Aug 25, 2017 at 12:15 PM, Petr Jelinek <petr.jeli...@2ndquadrant.com > wrote: > On 25/08/17 10:28, Chris Travers wrote: > > > > > > On Thu, Aug 24, 2017 at 9:44 PM, Andres Freund <and...@anarazel.de > > <mailto:and...@anarazel.de>> wrote: > > > > Hi, > > > > On 2017-08-18 12:12:58 +0300, Ildar Musin wrote: > > > While we've been developing pg_pathman extension one of the most > frequent > > > questions we got from our users was about global index support. We > cannot > > > provide it within an extension. And I couldn't find any recent > discussion > > > about someone implementing it. So I'm thinking about giving it a > shot and > > > start working on a patch for postgres. > > > > FWIW, I personally think for constraints the better approach is to > make > > the constraint checking code cope with having to check multiple > > indexes. Initially by just checking all indexes, over the longer term > > perhaps pruning the set of to-be-checked indexes based on the values > in > > the partition key if applicable. The problem with creating huge > global > > indexes is that you give away some the major advantages of > partitioning: > > - dropping partitions now is slow / leaves a lof of garbage again > > - there's no way you can do this with individual partitions being > remote > > or such > > - there's a good chunk of locality loss in global indexes > > > > The logic we have for exclusion constraints checking can essentially > be > > extended to do uniqueness checking over multiple partitions. > Depending > > on the desired deadlock behaviour one might end up doing speculative > > insertions in addition. The foreign key constraint checking is > fairly > > simple, essentially one "just" need to remove the ONLY from the > > generated check query. 
> > > > +1 (or +as much as I am allowed to get away with really ;) ) > > > > > To be clear, this would still require a high-level concept of a global > > index and the only question is whether it gets stored as multiple > > partitions against partitioned tables vs stored in one giant index, > right? > > > No, just global constraints. For example, if you consider unique index > to be implementation detail of a unique constraint, there is nothing > stopping us to use multiple such indexes (one per partition) as > implementation detail to single global unique constraint. No need for > global index at all. > Ok so in this case a global constraint needs to track partitioned indexes, right? How does this differ, in practice from a "global but partitioned index?" This seems like splitting hairs but I am trying to see if there is disagreement here that goes beyond language. For example, could I have a global, partial unique constraint the way I can do things with a single table currently (something like create global unique index foo_id_idxuf on foo(Id) where id > 12345)? Is this something the discussion here would foreclose? It seems to me that you can get both of these (and more) by adding the concept of a global index which means: 1. Index is on parent table 2. Index is inherited to child tables and managed on parent. 3. Writes to children that hit inherited unique index ALSO must check (with exclusion constraints etc) ALL other tables in the inheritance tree of the index. That would also have a few important side benefits: 1. Easier management of indexes where all partitions (or other children since there are other uses for table inheritance than partitioning) must be indexed 2. Ability to have partial unique indexes enforced consistently across an inheritance tree. An alternative might be to generalise partial unique indexes into partial unique constraints. 
(alter table foo add unique (bar) where id > 12345) > > -- > Petr Jelinek http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Thoughts on unit testing?
On Thu, Aug 24, 2017 at 9:07 AM, Torsten Zuehlsdorff < mailingli...@toco-domains.de> wrote: > > > On 13.08.2017 21:19, Peter Geoghegan wrote: > >> On Thu, Aug 10, 2017 at 2:53 PM, Thomas Munro >> <thomas.mu...@enterprisedb.com> wrote: >> >>> The current regression tests, isolation tests and TAP tests are very >>> good (though I admit my experience with TAP is limited), but IMHO we >>> are lacking support for C-level unit testing. Complicated, fiddly >>> things with many states, interactions, edge cases etc can be hard to >>> get full test coverage on from the outside. Consider >>> src/backend/utils/mmgr/freepage.c as a case in point. >>> >> I don't want to see full test coverage of the code btw. I think that undermines the benefit from testing. I would like to see better test coverage though for reasons below. > >> It is my understanding that much of the benefit of unit testing comes >> from maintainability. >> > > I never had this understanding. I write tests to test expected behavior > and not the coded one. If possible i separate the persons writing > unit-tests from the ones writing the function itself. Having someone really > read the documentation or translate the expectation into a test-case, makes > sure, the function itself works well. > Right. I would go so far as to say I test to the documentation, not to the code. Usually when I test my own code, I write docs, then code, then I re-read the docs, and write test cases from the docs. In my not-so humble opinion, the point of tests is to make sure you don't break other people's stuff, people who are reading the docs and using them when they write their code. > > Also it really safes time in the long run. Subtle changes / bugs can be > caught which unit-tests. Also a simple bug-report can be translated into a > unit-test make sure that the bugfix really works and that no regression > will happen later. I literally saved ones a week of work with a single > unit-test. 
> Also, comparing to the docs means we have a way to reason about where misbehaviour is occurring, which means better maintainability and less code entropy in the course of fixing bugs. > > There are many other advantages, but to earn them the code need to be > written to be testable. And this is often not the case. Most literature > advises to Mocking, mixins or other techniques, which most times just > translate into "this code is not written testable" or "the technique / > language / concept / etc is not very good in being testable". > Agreed here. Usually you end up with better, more stable, more carefully designed components as a result. But as I say, documentation, design, and testing are actually harder to get right than coding. So with the above being said, the fact is that a lot of advanced stuff can be done by writing C libraries that get preloaded into PostgreSQL. These C libraries risk being broken by cases where behaviour does not match documentation. So if I want to translate an oid into a db name or vice versa, I might call the internal functions to do this. Having a stable C API would be of great help in ensuring longer-term maintainability of such libraries. So I think it would be good to add some unit tests here. Of course it also means we get to decide what functionality is sufficiently stable to test and guarantee, and that saves both time in maintenance, and it improves the safety of moving forward. But I still think the question of what to test ought to be geared around "what are we willing to try to guarantee as behaviour for some years, not just to ourselves but to third parties." > > Greetings, > Torsten > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: global index
On Thu, Aug 24, 2017 at 9:44 PM, Andres Freund <and...@anarazel.de> wrote: > Hi, > > On 2017-08-18 12:12:58 +0300, Ildar Musin wrote: > > While we've been developing pg_pathman extension one of the most frequent > > questions we got from our users was about global index support. We cannot > > provide it within an extension. And I couldn't find any recent discussion > > about someone implementing it. So I'm thinking about giving it a shot and > > start working on a patch for postgres. > > FWIW, I personally think for constraints the better approach is to make > the constraint checking code cope with having to check multiple > indexes. Initially by just checking all indexes, over the longer term > perhaps pruning the set of to-be-checked indexes based on the values in > the partition key if applicable. The problem with creating huge global > indexes is that you give away some the major advantages of partitioning: > - dropping partitions now is slow / leaves a lof of garbage again > - there's no way you can do this with individual partitions being remote > or such > - there's a good chunk of locality loss in global indexes > > The logic we have for exclusion constraints checking can essentially be > extended to do uniqueness checking over multiple partitions. Depending > on the desired deadlock behaviour one might end up doing speculative > insertions in addition. The foreign key constraint checking is fairly > simple, essentially one "just" need to remove the ONLY from the > generated check query. > To be clear, this would still require a high-level concept of a global index and the only question is whether it gets stored as multiple partitions against partitioned tables vs stored in one giant index, right? Also for foreign key constraints, does it make sense, for backwards-compatibility reasons to introduce a new syntax for checking all child tables? 
> > Greetings, > > Andres Freund > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: global index
On Aug 21, 2017 07:47, "Simon Riggs" wrote: On 18 August 2017 at 15:40, Alvaro Herrera wrote: > Ildar Musin wrote: > >> While we've been developing pg_pathman extension one of the most frequent >> questions we got from our users was about global index support. We cannot >> provide it within an extension. And I couldn't find any recent discussion >> about someone implementing it. So I'm thinking about giving it a shot and >> start working on a patch for postgres. >> >> One possible solution is to create an extended version of item pointer which >> would store relation oid along with block number and position: > > I've been playing with the index code in order to allow indirect tuples, > which are stored in a format different from IndexTupleData. > > I've been adding an "InMemoryIndexTuple" (maybe there's a better name) > which internally has pointers to both IndexTupleData and > IndirectIndexTupleData, which makes it easier to pass around the index > tuple in either format. > It's very easy to add an OID to that struct, > which then allows to include the OID in either an indirect index tuple > or a regular one. If there is a unique index then there is no need for that. Additional data to the index makes it even bigger and even less useful, so we need to count that as a further disadvantage of global indexes. I have a very clear statement from a customer recently that "We will never use global indexes", based upon their absolute uselessness in Oracle. It is worth noting that the only use case I can see where global indexes fill a functionality gap is with unique indexes which allow you to enforce uniqueness across an inheritance tree where the uniqueness is orthogonal to any partition key. I could find large numbers of uses for that.
That could also allow referential integrity to check against a root table rather than force partition explosion. Otherwise the following mostly works: create table bar (like foo including all) inherits (foo); So the gap this addresses is very real even if it is narrow. > Then, wherever we're using IndexTupleData in the index AM code, we would > replace it with InMemoryIndexTuple. This should satisfy both your use > case and mine. Global indexes are a subset of the indirect indexes use case but luckily not the only use. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] What would be difficult to make data models pluggable for making PostgreSQL a multi-model database?
On Sun, Aug 20, 2017 at 4:10 AM, MauMau <maumau...@gmail.com> wrote: > From: Chris Travers > > Why cannot you do all this in a language handler and treat as a user > defined function? > > ... > > If you have a language handler for cypher, why do you need in_region > or cast_region? Why not just have a graph_search() function which > takes in a cypher query and returns a set of records? > > The language handler is for *stored* functions. The user-defined > function (UDF) doesn't participate in the planning of the outer > (top-level) query. And they both assume that they are executed in SQL > commands. > Sure but stored functions can take arguments, such as a query string which gets handled by the language handler. There's absolutely no reason you cannot declare a function in C that takes in a Cypher query and returns a set of tuples. And you can do a whole lot with preloaded shared libraries if you need to. The planning bit is more difficult, but see below as to where I see major limits here. > > I want the data models to meet these: > > 1) The query language can be used as a top-level session language. > For example, if an app specifies "region=cypher_graph" at database > connection, it can use the database as a graph database and submit > Cypher queries without embedding them in SQL. > That sounds like a foot gun. I would probably think of those cases as being ideal for a custom background worker, similar to Mongress. Expecting to be able to switch query languages on the fly strikes me as adding totally needless complexity everywhere to be honest. Having different listeners on different ports simplifies this a lot and having, additionally, query languages for ad-hoc mixing via language handlers might be able to get most of what you want already. > > 2) When a query contains multiple query fragments of different data > models, all those fragments are parsed and planned before execution. > The planner comes up with the best plan, crossing the data model > boundary. 
To take the query example in my first mail, which joins a > relational table and the result of a graph query. The relational > planner considers how to scan the table, the graph planner considers > how to search the graph, and the relational planner considers how to > join the two fragments. > It seems like all you really need is a planner hook for user defined languages (I.e. "how many rows does this function return with these parameters" right?). Right now we allow hints but they are static. I wonder how hard this would be using preloaded, shared libraries. > > So in_region() and cast_region() are not functions to be executed > during execution phase, but are syntax constructs that are converted, > during analysis phase, into calls to another region's parser/analyzer > and an inter-model cast routine. > So basically they work like immutable functions except that you cannot index the output? > > 1. The relational parser finds in_region('cypher_graph', 'graph > query') and produces a parse node InRegion(region_name, query) in the > parse tree. > > 2. The relational analyzer looks up the system catalog to checks if > the specified region exists, then calls its parser/analyzer to produce > the query tree for the graph query fragment. The relational analyser > attaches the graph query tree to the InRegion node. > > 3. When the relational planner finds the graph query tree, it passes > the graph query tree to the graph planner to produce the graph > execution plan. > > 4. The relational planner produces a join plan node, based on the > costs/statistics of the relational table scan and graph query. The > graph execution plan is attached to the join plan node. > > The parse/query/plan nodes have a label to denote a region, so that > appropriate region's routines can be called. > It would be interesting to see how much of what you want you can get with what we currently have and what pieces are really missing. 
Am I right that if you wrote a function in C to take a Cypher query plan, and analyse it, and execute it, the only thing really missing would be feedback to the PostgreSQL planner regarding number of rows expected? > > Regards > MauMau > > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] [RFC] What would be difficult to make data models pluggable for making PostgreSQL a multi-model database?
On Sat, Aug 19, 2017 at 4:29 PM, MauMau <maumau...@gmail.com> wrote: > Hello, > > Please forgive me for asking such a stupid and rough question. > > I'm thinking of making PostgreSQL a multi-model database by supporting > data models other than the current relational model. A data model > consists of a query language (e.g. SQL for relational model, Cypher > for graph model), a parser and analyzer to transform a query into a > query tree, a planner to transform the query tree into an execution > plan, an executor, and a storage engine. > > To promote the data model development, I want to make data models > pluggable. The rough sketch is: > > 1) A data model developer implements the parser, analyzer, > transformer, planner, executor, and storage engine functions in a > shared library. > > 2) The DBA registers the data model. > > CREATE QUERY LANGUAGE Cypher ( > PARSER = > ); > > CREATE DATA MODEL graph ( > QUERY LANGUAGE = Cypher, > ANALYZER = , > TRANSFORMER = , > PLANNER = , > EXECUTOR = , > STORAGE ENGINE = , > ); > > CREATE REGION cypher_graph ( > QUERY LANGUAGE = Cypher, > DATA MODEL = graph > ); > Why cannot you do all this in a language handler and treat as a user defined function? > > The region is just a combination of a query language and a data model, > much like a locale is a combination of a language and a country. This > is because there may be multiple popular query languages for a data > model. > > 3) The application connects to the database, specifying a desired > region. The specified region's query language becomes the default > query language for the session. > > > The application can use the data of multiple data models in one query > by specifying another region and its query via in_region(). For > example, the following query combines the relational restaurant table > and a social graph to get the five chinese restaurants in Tokyo that > are most popular among friends of John and their friends. 
> > SELECT r.name, g.num_likers > FROM restaurant r, > cast_region( > in_region('cypher_graph', > 'MATCH (:Person {name:"John"})-[:IS_FRIEND_OF*1..2]-(friend), > (friend)-[:LIKES]->(restaurant:Restaurant) > RETURN restaurant.name, count(*)'), > 'relational', 'g', '(name text, num_likers int)') > WHERE r.name = g.name AND > r.city = 'Tokyo' AND r.cuisine = 'chinese' > ORDER BY g.num_likers DESC LIMIT 5 If you have a language handler for Cypher, why do you need in_region or cast_region? Why not just have a graph_search() function which takes in a Cypher query and returns a set of records? > ; > > > What do you think would be difficult to make data models pluggable, > especially related to plugging the parser, planner, executor, etc? > One possible concern is that various PostgreSQL components might be > too dependent on the data model being relational, and it would be > difficult to separate tight coupling. > I guess I am missing why the current language handler structure is not enough. Maybe I am missing something? > > Regards > MauMau > > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal: global index
I would really like to see global indexes. It would make things a lot easier for things like unique constraints across table inheritance trees. On Fri, Aug 18, 2017 at 11:12 AM, Ildar Musin <i.mu...@postgrespro.ru> wrote: > Hi hackers, > > While we've been developing pg_pathman extension one of the most frequent > questions we got from our users was about global index support. We cannot > provide it within an extension. And I couldn't find any recent discussion > about someone implementing it. So I'm thinking about giving it a shot and > start working on a patch for postgres. > > One possible solution is to create an extended version of item pointer > which would store relation oid along with block number and position: > > struct ItemPointerExt > { > Oid ip_relid; > BlockIdData ip_blkid; > OffsetNumber ip_posid; > }; > > and use it in global index (regular index will still use old version). > This will require in-depth refactoring of existing index nodes to make them > support both versions. Accordingly, we could replace ItemPointer with > ItemPointerExt in index AM to make unified API to access both regular and > global indexes. TIDBitmap will require refactoring as well to be able to > deal with relation oids. > So, to be clear, on-disk representations would be unchanged for old indexes (ensuring that pg_upgrade would not be broken), right? > > It seems to be quite an invasive patch since it requires changes in > general index routines, existing index nodes, catalog, vacuum routines and > syntax. So I'm planning to implement it step by step. As a first prototype > it could be: > > * refactoring of btree index to be able to store both regular and extended > item pointers; Do you foresee any performance implications of handling both?
> * refactoring of TIDBitmap; > * refactoring of general index routines (index_insert, index_getnext, etc) > and indexAM api; > * catalog (add pg_index.indisglobal attribute and/or a specific relkind as > discussed in [1] thread); > * syntax for global index definition. E.g., it could be oracle-like syntax: > > CREATE INDEX my_idx ON my_tbl (key) GLOBAL; > > If it goes well, then I’ll do the rest of indexes and vacuuming. If you > have any ideas or concerns I’ll be glad to hear it. > > [1] https://www.postgresql.org/message-id/c8fe4f6b-ff46-aae0-89e > 3-e936a35f0cfd%40postgrespro.ru > > Thanks! > > -- > Ildar Musin > i.mu...@postgrespro.ru > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Orphaned files in base/[oid]
On Wed, Aug 16, 2017 at 7:15 PM, Andres Freund <and...@anarazel.de> wrote: > > > I think this entirely is the wrong approach. We shouldn't add weird > check commands that require locks on pg_class, we should avoid leaving > the orphaned files in the first place. I've upthread outlined > approached how to do so. > And in the meantime I suppose there is no reason that the approach I have outlined cannot be used by an external program. I think it is sane to draw the line at having the db responsible for cleaning up after itself when something goes wrong and leave external programs to do after-the-fact review and cleanup. > > Greetings, > > Andres Freund > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Orphaned files in base/[oid]
So having thought about this a bit more, and having had some real-world experience with the script now, I have an idea that might work and some questions to make it succeed. My thinking is to add a new form of vacuum called VACUUM FSCK. This would:

1. Lock pg_class in exclusive mode (or do I need access exclusive?), as this is needed to solve the race conditions. As I see it, this brings the database to a screeching halt concurrency-wise (but unlike my script it would allow other databases to be accessed normally).

2. Read the files whose names consist only of digits out of the filesystem and compare them with oids and relfilenodes from pg_class.

3. Unlink any file not found in that list, as well as any files matching that name followed by an underscore or period.

This would mean that the following cases would not be handled: If the first segment of a relation is gone but later segments are present, we only check the first segment's name and so would not see the later ones. Same goes for visibility maps and other helper files. If you add a file to the directory with a name like 34F3A222BC, it would never get cleaned up because it contains non-digits. So this leads to the following questions:

1. Is locking pg_class enough to avoid race conditions? Is exclusive mode sufficient or do I need access exclusive mode?

2. Would it be preferable to move the file to a directory rather than unlinking it?

3. Should I perform any sort of check on the tables at the end to make sure everything is ok?

-- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
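To make the under-inclusiveness concrete: the candidate test above only admits all-digit names, so fork and segment files are reached solely through their main file, and a stray name like 34F3A222BC is never touched. A tiny shell sketch of that rule (a sketch of the proposal, not existing backend code):

```shell
#!/bin/sh
# Proposed candidate rule: only file names consisting entirely of
# digits are ever considered for cleanup; _fsm/_vm/.N files would be
# zapped only as companions of their main file, never on their own.
is_candidate() {
  case "$1" in
    *[!0-9]*) return 1 ;;   # any non-digit character disqualifies the name
    *) return 0 ;;
  esac
}
```

So if the main file 16384 is already gone, its companions 16384_fsm and 16384.1 are never even considered, which is exactly the first unhandled case above.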
Re: [HACKERS] Orphaned files in base/[oid]
On Tue, Aug 15, 2017 at 3:32 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Chris Travers <chris.trav...@adjust.com> writes: > > I wonder about a different solution. Would it be possible to special > case > > vacuum to check for and remove (or just move to where they can be > removed) > > files when vacuuming pg_class? At the point we are vacuuming pg_class, > we > > ought to be able to know that a relfilenode shouldn't be used anymore, > > right? > > I don't think so. It's not clear to me whether you have in mind "scan > pg_class, collect relfilenodes from all live tuples, then zap all files > not in that set" or "when removing a dead tuple, zap the relfilenode > it mentions". But neither one works. The first case has a race condition > against new pg_class entries. As for the second, the existence of a dead > tuple bearing relfilenode N isn't evidence that some other live tuple > can't have relfilenode N. > Ah because if the file never made it on to disk the number could be re-used. > > Another problem for the second solution is that in the case you're worried > about (ie, PANIC due to out-of-WAL-space during relation's creating > transaction), there's no very good reason to expect that the relation's > pg_class tuple ever made it to disk at all. > > A traditional low-tech answer to this has been to keep the WAL on a > separate volume from the main data store, so that it's protected from > out-of-space conditions in the main store and temp areas. The space > needs for WAL proper are generally much more predictable than the main > store, so it's easier to keep the dedicated space from overflowing. > (Stalled replication/archiving processes can be hazardous to your > health in this scenario, though, if they prevent prompt recycling of > WAL files.) > Yeah, most of our dbs here have wal on a separate volume but not this system. 
This system is also unusual in that disk usage varies wildly (and I am not 100% sure this is the only case which causes it, though I can reproduce it consistently when the WAL writer runs out of disk space, with symptoms exactly matching what I found). So for now that leaves my fallback approach as a way to fix it when I see it. I have written a shell script which does as follows:

1. Starts Postgres in single user mode with a data directory or dies (won't run if Postgres seems to be already running)

2. Gets the oid of the current database

3. Lists all files consisting of only digits in the base/[dboid] directory

4. Asks Postgres (in single user mode again) for all relfilenodes and oids of tables (in my testing both were required because there were some cases where relfilenodes were not set for some system catalogs)

5. Loops through the file nodes gathered, checks against the relfilenode entries, and zaps $f, ${f}_*, and ${f}.*. Currently for testing "zaps" has been to move to a lostnfound folder for inspection following the script.

The logic here is not perfect and is very slightly under-inclusive, but better that than the other way. Then we can start Postgres again. I cannot find a better way to avoid race conditions, I guess. At any rate it sounds like preventing the problem more generally may be something beyond what I would feel comfortable trying to do as a patch at my current level of familiarity with the source code. The full script is included inline below my signature in case it is of interest to anyone on the list.
> regards, tom lane > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin

---
#!/bin/bash
# Usage: ./cleanup.sh /path/to/datadir dbname
# Moves files in base/[dboid] with no matching pg_class entry into
# ./lostnfound for inspection. Postgres must not be running.
datadir=$1
database=$2

pg_ctl -D $datadir stop

dboidfile="$PWD/cleanupdb.oid"
reloidfile="$PWD/relfilenodes.list"

# Get the oid of the target database in single user mode
echo "COPY (select oid from pg_database where datname = current_database()) TO '$dboidfile'" | postgres --single -D $datadir $database > /dev/null
if (($?))
then
    echo "FATAL: Could not start Postgres in single user mode"
    exit 1
fi
dboid=`cat $dboidfile`

# All-digit file names in the database directory (main forks only)
filenodes=`(cd $datadir/base/$dboid; ls [0-9]*[0-9] | grep -v '\.' | sort -n)`
#echo $filenodes

# Both relfilenode and oid, since relfilenode was unset for some system catalogs
echo "COPY (select relfilenode from pg_class union select oid as relfilenode from pg_class) TO '$reloidfile'" | postgres --single -D $datadir $database > /dev/null
relfilenodes=`cat $reloidfile`
#echo $relfilenodes
if [[ -z $relfilenodes ]]
then
    echo "FATAL: did not get any relfilenodes"
    exit 2
fi

mkdir lostnfound
for f in $filenodes
do
    if [[ -z `echo $relfilenodes | grep -w $f` ]]
    then
        echo moving $f to lostnfound
        mv $datadir/base/$dboid/$f lostnfound
        mv $datadir/base/$dboid/${f}_* lostnfound 2> /dev/null
        mv $datadir/base/$dboid/${f}.* lostnfound 2> /dev/null
    fi
done

rm $dboidfile
rm $reloidfile
Re: [HACKERS] Orphaned files in base/[oid]
There's another side to this, and I am not sure it is a backend crash. Here is what I did to reproduce: two virtual disk images, 100 MB for main data and 40 MB for WAL, with work_mem set to 256MB. The idea is to test different out of space conditions.

Create table as ...; drop table; select pg_size_pretty(pg_current_xlog_location() - '0/0');

I played around with parameters to determine how different kinds of out of space errors were handled:

1. Running out of temp space was cleaned up without a server restart needed.

2. A relation running out of disk space *seemed* to get cleaned up.

3. Running out of WAL space left *both* temp and non-temp files.

I wonder about a different solution. Would it be possible to special case vacuum to check for and remove (or just move to where they can be removed) files when vacuuming pg_class? At the point we are vacuuming pg_class, we ought to be able to know that a relfilenode shouldn't be used anymore, right?

-- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Orphaned files in base/[oid]
On Mon, Aug 14, 2017 at 8:40 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > > > It would be possible to have orphaned non-temp tables if you'd suffered > a crash during the transaction that created those tables. Ordinarily > a newly-created table file wouldn't be that large, but if your workflow > created tables and shoved boatloads of data into them in the same > transaction, it's not so hard to see this becoming an issue. > I think the working theory is that these were very likely a number of very large (multi-hundred-GB) materialised views. > > The core problem with zapping non-temp table files is that you can't > do that unless you're sure you have consistent, up-to-date pg_class > data that nobody else is busy adding to. It's hard to see an external > application being able to do that safely. You certainly can't do it > at the point in the postmaster startup cycle where we currently do > the other things --- for those, we rely only on filesystem naming > conventions to identify what to zap. Yeah, that occurred to me. At this point I would settle for something I could run with Postgres in single user mode, although that is very far from ideal. So what I wonder is whether at least a short-term solution might be a utility that starts Postgres in single user mode, where we insist that PostgreSQL is otherwise not running before the run. I am certainly not feeling qualified at present for more advanced solutions, but that I might be able to do. > > regards, tom lane > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Orphaned files in base/[oid]
On Mon, Aug 14, 2017 at 6:33 PM, Andres Freund <and...@anarazel.de> wrote: > Hi, > > On 2017-08-14 14:12:22 +0200, Chris Travers wrote: > > Problem: > > The system this came up on is PostgreSQL 9.6.3 and has had repeated > trouble > > with disk space. Querying pg_database_size, as well as du on the > > subdirectory of base/ show total usage to be around 3.8TB. Summing up > the > > size of the relations in pg_class though shows around 2.1TB. > > > > Initial troubleshooting found around 150 GB of space in pg_temp which had > > never been cleared and was at least several days old. Restarting the > > server cleared these up. > > > > Poking around the base/[oid] directory, I found a large number of files > > which did not correspond with a pg_class entry. One of the apparent > > relations was nearly 1TB in size. > > > > What I think happened: > > I think various pg_temp/* and orphaned relation files (In base/[oid]) > were > > created when PostgreSQL crashed due to running out of space in various > > operations including creating materialised views. > > > > So my question is if there is a way we can safely clean these up on > server > > restart? If not does it make sense to try to create a utility that can > > connect to PostgreSQL, seek out valid files, and delete the rest? > > I think the fix here is to call RemovePgTempFiles() during > crash-restarts, instead of just full starts. The previously stated need > to be able to inspect temp files after a crash can be less impactfully > fulfilled with restart_after_crash = false. > > But that only clears temp files right? I am less concerned about the temp files because a restart clears them. The bigger issue I see are with the orphaned base files. It looks like files in base/[oid] don't get cleaned up either if I read my output correctly and it would explain why we saw 1.7TB of discrepancy between relations and database size. 
Safety-wise it seems like the best way out of that is a dump/restore, but doing that with a 2.1TB database is annoying. > Greetings, > > Andres Freund > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Orphaned files in base/[oid]
On Aug 14, 2017 14:12, "Chris Travers" <chris.trav...@adjust.com> wrote: Hi all; I am trying to track down a problem we are seeing that looks very similar to bug #12050, and would certainly consider trying to contribute a fix if we agree on one. (I am not sure we can, so absent that, the next question is whether it makes sense to create a utility to fix the problem when it comes up so that a dump/restore is not needed). The system: PostgreSQL 9.6.3 Gentoo Linux. Problem: The system this came up on is PostgreSQL 9.6.3 and has had repeated trouble with disk space. Querying pg_database_size, as well as du on the subdirectory of base/ show total usage to be around 3.8TB. Summing up the size of the relations in pg_class though shows around 2.1TB. Initial troubleshooting found around 150 GB of space in pg_temp which had never been cleared and was at least several days old. Restarting the server cleared these up. Poking around the base/[oid] directory, I found a large number of files which did not correspond with a pg_class entry. One of the apparent relations was nearly 1TB in size. What I think happened: I think various pg_temp/* and orphaned relation files (In base/[oid]) were created when PostgreSQL crashed due to running out of space in various operations including creating materialised views. So my question is if there is a way we can safely clean these up on server restart? If not does it make sense to try to create a utility that can connect to PostgreSQL, seek out valid files, and delete the rest? Ok, I have identified one case where the symptoms I am seeing can be reproduced. I am currently working on a Mac, so there may be quirks in my repro. However, when the WAL writer runs out of disk space, no cleanup is done. So I will be looking at possible solutions next. -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 <+49%20162%209037210> | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
[HACKERS] Orphaned files in base/[oid]
Hi all; I am trying to track down a problem we are seeing that looks very similar to bug #12050, and would certainly consider trying to contribute a fix if we agree on one. (I am not sure we can, so absent that, the next question is whether it makes sense to create a utility to fix the problem when it comes up so that a dump/restore is not needed). The system: PostgreSQL 9.6.3 Gentoo Linux. Problem: The system this came up on is PostgreSQL 9.6.3 and has had repeated trouble with disk space. Querying pg_database_size, as well as du on the subdirectory of base/ show total usage to be around 3.8TB. Summing up the size of the relations in pg_class though shows around 2.1TB. Initial troubleshooting found around 150 GB of space in pg_temp which had never been cleared and was at least several days old. Restarting the server cleared these up. Poking around the base/[oid] directory, I found a large number of files which did not correspond with a pg_class entry. One of the apparent relations was nearly 1TB in size. What I think happened: I think various pg_temp/* and orphaned relation files (In base/[oid]) were created when PostgreSQL crashed due to running out of space in various operations including creating materialised views. So my question is if there is a way we can safely clean these up on server restart? If not does it make sense to try to create a utility that can connect to PostgreSQL, seek out valid files, and delete the rest? -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
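The "seek out valid files" idea can be sketched as a pure filtering step. This is an illustrative sketch, not the proposed utility: it assumes the file names under base/<dboid> and the set of relfilenode values (e.g. from pg_relation_filenode()) have already been collected separately, and it only flags candidates. As the thread discusses, deciding whether deletion is actually safe is the hard part, so nothing here should be deleted while a server might be running.

```python
import re

def orphan_candidates(filenames, relfilenodes):
    """Flag files under base/<dboid> whose base number matches no known
    relfilenode.  Segment files (12345.1), free-space maps (_fsm),
    visibility maps (_vm) and init forks (_init) share the relation's
    base number; non-numeric names (e.g. pg_internal.init) are skipped."""
    pat = re.compile(r'^(\d+)(?:\.\d+|_fsm|_vm|_init)?$')
    out = []
    for name in filenames:
        m = pat.match(name)
        if m and int(m.group(1)) not in relfilenodes:
            out.append(name)
    return out
```

Note that relfilenode can differ from the pg_class OID after a rewrite, and mapped catalogs report relfilenode 0, so the relfilenode set must come from pg_relation_filenode() rather than a naive pg_class scan.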
Re: [HACKERS] Funny WAL corruption issue
On Fri, Aug 11, 2017 at 1:33 PM, Greg Stark <st...@mit.edu> wrote: > On 10 August 2017 at 15:26, Chris Travers <chris.trav...@gmail.com> wrote: > > > > > > The bitwise comparison is interesting. Remember the error was: > > > > pg_xlogdump: FATAL: error in WAL record at 1E39C/E1117FB8: unexpected > > pageaddr 1E375/61118000 in log segment 0001E39C00E1, offset > > 1146880 > ... > > Since this didn't throw a checksum error (we have data checksums > disabled but wal records ISTR have a separate CRC check), would this > perhaps indicate that the checksum operated over incorrect data? > > No checksum error and this "unexpected pageaddr" doesn't necessarily > mean data corruption. It could mean that when the database stopped logging > it was reusing a wal file and the old wal stream had a record boundary > on the same byte position. So the previous record checksum passed and > the following record checksum passes but the record header is for a > different wal stream position. > > I think you could actually hack xlogdump to ignore this condition and > keep outputting and you'll see whether the records that follow appear > to be old wal log data. I haven't actually tried this though. > For better or worse, I get a different error at the same spot if I try this (doing so involved disabling the check in the backend WAL reader): pg_xlogdump: FATAL: error in WAL record at 1E39C/E1117FB8: invalid contrecord length 4509 at 1E39C/E1117FF8 If I hack it to ignore all errors on that record, no further records come out, though it does run over the same records. This leads me to conclude there are no further valid records. > > -- > greg > -- Best Wishes, Chris Travers Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in. http://www.efficito.com/learn_more
Re: [HACKERS] Funny WAL corruption issue
On Fri, Aug 11, 2017 at 1:33 PM, Greg Stark <st...@mit.edu> wrote: > On 10 August 2017 at 15:26, Chris Travers <chris.trav...@gmail.com> wrote: > > > > > > The bitwise comparison is interesting. Remember the error was: > > > > pg_xlogdump: FATAL: error in WAL record at 1E39C/E1117FB8: unexpected > > pageaddr 1E375/61118000 in log segment 0001E39C00E1, offset > > 1146880 > ... > > Since this didn't throw a checksum error (we have data checksums > disabled but wal records ISTR have a separate CRC check), would this > perhaps indicate that the checksum operated over incorrect data? > > No checksum error and this "unexpected pageaddr" doesn't necessarily > mean data corruption. It could mean that when the database stopped logging > it was reusing a wal file and the old wal stream had a record boundary > on the same byte position. So the previous record checksum passed and > the following record checksum passes but the record header is for a > different wal stream position. > I expect to test this theory shortly. Assuming it is correct, what can we do to prevent restarts of slaves from running into it? > I think you could actually hack xlogdump to ignore this condition and > keep outputting and you'll see whether the records that follow appear > to be old wal log data. I haven't actually tried this though. > > -- > greg > -- Best Wishes, Chris Travers Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in. http://www.efficito.com/learn_more
Re: [HACKERS] Funny WAL corruption issue
On Thu, Aug 10, 2017 at 3:17 PM, Vladimir Rusinov <vrusi...@google.com> wrote: > > > On Thu, Aug 10, 2017 at 1:48 PM, Aleksander Alekseev < > a.aleks...@postgrespro.ru> wrote: > >> I just wanted to point out that a hardware issue or third party software >> issues (bugs in FS, software RAID, ...) could not be fully excluded from >> the list of suspects. According to the talk by Christophe Pettus [1] >> it's not that uncommon as most people think. > > > This still might be the case of hardware corruption, but it does not look > like one. > Yeah, I don't think so either. The systems were not restarted, only the service so I don't think this is a lie-on-write case. We have ECC with full checks, etc. It really looks like something I initiated caused it but not sure what and really not interested in trying to reproduce on a db of this size. > > Likelihood of two different persons seeing similar error message just a > year apart is low. From our practice hardware corruption usually looks like > a random single bit flip (most common - bad cpu or memory), bunch of zeroes > (bad storage), or bunch of complete garbage (usually indicates in-memory > pointer corruption). > > Chris, if you still have original WAL segment from the master and it's > corrupt copy from standby, can you do bit-by-bit comparison to see how they > are different? Also, if you can please share some hardware details. > Specifically, do you use ECC? If so, are there any ECC errors logged? Do > you use physical disks/ssd or some form of storage virtualization? > Straight on bare metal, ECC with no errors logged. SSD for both data and wal. The bitwise comparison is interesting.
Remember the error was:

pg_xlogdump: FATAL: error in WAL record at 1E39C/E1117FB8: unexpected pageaddr 1E375/61118000 in log segment 0001E39C00E1, offset 1146880

Starting with the good WAL segment (I think the record starts at 003b):

0117fb0 003b
0117fc0 7f28 e111 e39c 0001 0940 cb88 db01
0117fd0 0200 067f 4000 2249 0195
0117fe0 0001 8001 b5c3 05ff
0117ff0 0003 008c
0118000 d093 0005 0001 8000 e111 e39c 0001
0118010 0084 7fb8 e111 e39c 0001
0118020 0910 ccac 2eba 2000 0056 067f
0118030 4000 2249 0195 b5c4 08ff 0001
0118040 0002 0003 0004 0005 0006 0007 0008 0009
0118050 000a 000b 000c 000d 000e 000f 0010 0011
0118060 0012 0013 0014 0015 0016 0017 0018 0019
0118070 001a 001b 001c 001d 001e 001f 0020 0021

And the corrupt segment:

0117fb0 003b
0117fc0 7f28 e111 e39c 0001 0940 cb88 db01
0117fd0 0200 067f 4000 2249 0195
0117fe0 0001 8001 b5c3 05ff
0117ff0 0003 4079 ce05 1cce ecf9
0118000 d093 0005 0001 8000 6111 e375 0001
0118010 119d cfd4 00cc ca00 0410
0118020 1800 7c00 5923 544b dc20 914c 7a5c afec
0118030 db45 0060 b700 1910 1800 7c00 791f 2ede
0118040 c573 a110 5a88 e1e6 ab48 0034 9c00 2210
0118050 1800 7c00 4415 400d 2c7e b5e3 7c88 bcef
0118060 4666 00db 9900 0a10 1800 7c00 7d1d b355
0118070 d432 8365 de99 4dba 87c7 00ed 6200 2210

I think the divergence is interesting here. Up through 0117ff8, they are identical. Then the last half of the line differs. The first half of the next is the same (but up through 011800a this time), but the last 6 bytes differ (those six hold what appear to be the memory address causing the problem), and we only have a few bits different in the rest of the line. It looks like some data and some flags were overwritten, perhaps while the process exited. Very interesting. > Also, in absolute majority of cases corruption is caught by checksums. I > am not familiar with WAL protocol - do we have enough checksums when > writing it out and on the wire?
I suspect there are much more things > PostgreSQL can do to be more resilient, and at least detect corruptions > earlier. > Since this didn't throw a checksum error (we have data checksums disabled but wal records ISTR have a separate CRC check), would this perhaps indicate that the checksum operated over incorrect data? > > -- > Vladimir Rusinov > PostgreSQL SRE, Google Ireland > > Google Ireland Ltd.,Gordon House, Barrow Street, Dublin 4, Ireland > Registered in Dublin, Ireland > Registration Number: 368047 > -- Best Wishes, Chris Travers Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in. http://www.efficito.com/learn_more
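The bit-by-bit comparison requested above is easy to script. A minimal sketch, assuming local copies of the good and corrupt segment files are available to read:

```python
def first_divergence(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if the
    two buffers agree over their common length."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None

# Hypothetical usage against local copies of the two segment files:
#   good = open("segment.good", "rb").read()
#   bad = open("segment.corrupt", "rb").read()
#   off = first_divergence(good, bad)
```

In this thread the interesting part is not just the first divergent offset but the pattern after it, which is why the hexdump excerpts above compare whole lines rather than stopping at the first mismatch.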
Re: [HACKERS] Funny WAL corruption issue
> Yes. Exactly the same output until a certain point where pg_xlogdump dies > with an error. That is the same LSN where PostgreSQL died with an error on > restart. > One other thing that is possibly relevant after reading through the last bug report is the error pg_xlogdump gives: pg_xlogdump: FATAL: error in WAL record at 1E39C/E1117FB8: unexpected pageaddr 1E375/61118000 in log segment 0001E39C00E1, offset 1146880 > -- Best Wishes, Chris Travers Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in. http://www.efficito.com/learn_more
Re: [HACKERS] Funny WAL corruption issue
Sorry, meant to reply all. On Thu, Aug 10, 2017 at 2:19 PM, Vladimir Borodin <r...@simply.name> wrote: > Hi, Chris. > > 10 авг. 2017 г., в 15:09, Chris Travers <chris.trav...@gmail.com> > написал(а): > > Hi; > > I ran into a funny situation today regarding PostgreSQL replication and > wal corruption and wanted to go over what I think happened and what I > wonder about as a possible solution. > > Basic information is custom-build PostgreSQL 9.6.3 on Gentoo, on a ~5TB > database with variable load. Master database has two slaves and generates > 10-20MB of WAL traffic a second. The data_checksum option is off. > > > The problem occurred when I attempted to restart the service on the slave > using pg_ctl (I believe the service had been started with sys V init > scripts). On trying to restart, it gave me a nice "Invalid memory > allocation request" error and promptly stopped. > > The main logs showed a lot of messages like before the restart: > 2017-08-02 11:47:33 UTC LOG: PID 19033 in cancel request did not match > any process > 2017-08-02 11:47:33 UTC LOG: PID 19032 in cancel request did not match > any process > 2017-08-02 11:47:33 UTC LOG: PID 19024 in cancel request did not match > any process > 2017-08-02 11:47:33 UTC LOG: PID 19034 in cancel request did not match > any process > > On restart, the following was logged to stderr: > LOG: entering standby mode > LOG: redo starts at 1E39C/8B77B458 > LOG: consistent recovery state reached at 1E39C/E1117FF8 > FATAL: invalid memory alloc request size 3456458752 > LOG: startup process (PID 18167) exited with exit code 1 > LOG: terminating any other active server processes > > LOG: database system is shut down > > After some troubleshooting I found that the wal segment had become > corrupt, I copied the correct one from the master and everything came up to > present. > > So It seems like somewhere something crashed big time on the back-end and > when we tried to restart, the wal ended in an invalid way. 
> > > We have reported the same thing [1] nearly a year ago. Could you please > check with pg_xlogdump that both WALs (normal from master and corrupted) > are exactly the same until some certain LSN? > > [1] https://www.postgresql.org/message-id/20160614103415. > 5796.6885%40wrigleys.postgresql.org > Yes. Exactly the same output until a certain point where pg_xlogdump dies with an error. That is the same LSN where PostgreSQL died with an error on restart. > > > I am wondering what can be done to prevent these sorts of things from > happening in the future if, for example, a replica dies in the middle of a > wal fsync. > -- > Best Wishes, > Chris Travers > > Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor > lock-in. > http://www.efficito.com/learn_more > > > > -- > May the force be with you… > https://simply.name > > -- Best Wishes, Chris Travers Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in. http://www.efficito.com/learn_more
[HACKERS] Funny WAL corruption issue
Hi; I ran into a funny situation today regarding PostgreSQL replication and wal corruption and wanted to go over what I think happened and what I wonder about as a possible solution. Basic information is custom-built PostgreSQL 9.6.3 on Gentoo, on a ~5TB database with variable load. Master database has two slaves and generates 10-20MB of WAL traffic a second. The data_checksum option is off. The problem occurred when I attempted to restart the service on the slave using pg_ctl (I believe the service had been started with sys V init scripts). On trying to restart, it gave me a nice "Invalid memory allocation request" error and promptly stopped. The main logs showed a lot of messages like before the restart: 2017-08-02 11:47:33 UTC LOG: PID 19033 in cancel request did not match any process 2017-08-02 11:47:33 UTC LOG: PID 19032 in cancel request did not match any process 2017-08-02 11:47:33 UTC LOG: PID 19024 in cancel request did not match any process 2017-08-02 11:47:33 UTC LOG: PID 19034 in cancel request did not match any process On restart, the following was logged to stderr: LOG: entering standby mode LOG: redo starts at 1E39C/8B77B458 LOG: consistent recovery state reached at 1E39C/E1117FF8 FATAL: invalid memory alloc request size 3456458752 LOG: startup process (PID 18167) exited with exit code 1 LOG: terminating any other active server processes LOG: database system is shut down After some troubleshooting I found that the wal segment had become corrupt, I copied the correct one from the master and everything came up to present. So it seems like somewhere something crashed big time on the back-end and when we tried to restart, the wal ended in an invalid way. I am wondering what can be done to prevent these sorts of things from happening in the future if, for example, a replica dies in the middle of a wal fsync. -- Best Wishes, Chris Travers Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in. http://www.efficito.com/learn_more
Re: [HACKERS] GiST support for UUIDs
On Tue, Nov 29, 2016 at 1:13 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Pushed with that change and some other mostly-cosmetic tweaking. Thank you for addressing all those issues, Tom! I tested some exclusion constraints that are interesting to me, and everything seems to be working well. -- Chris -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GiST support for UUIDs
On Mon, Nov 28, 2016 at 4:04 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > > What I would suggest is that you forget the union hack and just use > memcmp in all the comparison functions. It's not likely to be worth > the trouble to try to get those calls to be safely aligned. The > only place where you need go to any trouble is in uuid_parts_distance, > which could be redesigned to memcpy a not-necessarily-aligned source > into a local uint64[2] array and then byte-swap if needed. Done. I only have one architecture to test on (Linux, Intel x86_64) so I must defer to others in this regard. > I don't entirely see the point of making pg_uuid_t an opaque struct when > the only interesting thing about it, namely UUID_LEN, is exposed anyway. > So my inclination would be to just do > > typedef struct pg_uuid_t > { > unsigned char data[UUID_LEN]; > } pg_uuid_t; > > in uuid.h and forget the idea of it being opaque. Done. > Also, the patch could be made a good deal smaller (and easier to review) > in the wake of commit 40b449ae8: you no longer need to convert > btree_gist--1.2.sql into btree_gist--1.3.sql, just leave it alone and add > btree_gist--1.2--1.3.sql. That way we don't have to worry about whether > the upgrade script matches the change in the base script. Done. -- Chris btree_gist_uuid_8.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
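For what it's worth, the reason plain memcmp suffices here is that UUID bytes are stored big-endian, so bytewise order matches the order of the two 64-bit halves once they are byte-swapped into host order on little-endian machines. A quick illustration of that equivalence (Python, not the patch's C):

```python
import struct

def parts(u: bytes):
    """Interpret a 16-byte UUID as two big-endian uint64 halves, i.e.
    what memcpy-into-uint64[2]-then-byteswap produces on a little-endian
    machine, as suggested for uuid_parts_distance."""
    return struct.unpack(">QQ", u)

a = bytes.fromhex("00000000000000000000000000000001")
b = bytes.fromhex("00000000000000010000000000000000")

# bytewise (memcmp) order on the raw bytes agrees with tuple order of the halves
assert (a < b) == (parts(a) < parts(b))
```

The same property is what makes it safe to drop the union hack in the comparison functions: no alignment is required to memcmp the raw bytes, and only uuid_parts_distance needs the swapped integer view.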
[HACKERS] extract text from XML
Hi, I have found a basic use case which is supported by the xml2 module, but is unsupported by the new XML API. It is not possible to correctly extract text (either from text nodes or attribute values) which contains the characters '<', '&', or '>'. xpath() (correctly) returns XML text nodes for queries targeting these node types, and there is no inverse to xmlelement(). For example: => select (xpath('/a/text()', xmlelement(name a, '<&>')))[1]::text; xpath --- &lt;&amp;&gt; (1 row) Again, not a bug; but there is no way to specify my desired intent. The xml2 module does provide such a function, xpath_string: => select xpath_string(xmlelement(name a, '<&>')::text, '/a/text()'); xpath_string -- <&> (1 row) One workaround is to return the node's text value by serializing the XML value, and textually replacing those three entities with the characters they represent, but this relies on the xpath() function not generating other entities. (My use case is importing data in XML format, and processing with Postgres into a relational format.) Perhaps a function xpath_value(text, xml) -> text[] would close the gap? (I did search and no such function seems to exist currently, outside xml2.) Thanks, Chris -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
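The entity-replacement workaround described above can be sketched like this; note that &amp; has to be handled last so a literal "&amp;lt;" in the serialized output is not decoded twice. This sketch assumes xpath() emits only these three predefined entities, which is exactly the fragile assumption the mail points out:

```python
def xpath_text_unescape(s: str) -> str:
    """Undo the three predefined entities left in serialized text-node
    output.  '&amp;' is replaced last to avoid double-decoding."""
    return s.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&")

assert xpath_text_unescape("&lt;&amp;&gt;") == "<&>"
```

A real inverse of the serialization would have to handle numeric character references as well, which is part of why a server-side xpath_value-style function would be preferable to string munging.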
Re: [HACKERS] [GENERAL] OS X 10.11.3, psql, bus error 10, 9.5.1
unfortunately, I have to admit to my disgrace, that I'm still no C programmer after all these decades of dabbling in writing code. I just used the flags because someone at some point told me that it was a good idea, turns out, it's not [always]. I shall rebuild 9.5.1 without the -fno-common flag and see if that fixes things. Thanks, Tom, for spending part of your weekend on this. Chris. > On Mar 12, 2016, at 17:58, Tom Lane <t...@sss.pgh.pa.us> wrote: > > I wrote: >> That's confusing because it implies that -fno-common is the default, >> which it evidently is not. But anyway, my diagnosis is that you're >> breaking something about the linker's behavior with that switch. > > Oh! Looking closer, the core dump happens here: > > const printTextFormat pg_utf8format; > > printTextFormat *popt = (printTextFormat *) &pg_utf8format; > > --> popt->name = "unicode"; > > So apparently, the relevant property of "-fno-common" is that it > causes "const" variables to actually get placed in read-only data. > > I think this code is new in 9.5, which'd explain why you didn't see > the failure with older PG versions. It's surely busted though. > > I shall get rid of the const-ness, as well as the lame casting away > of it, and I think I will also go make buildfarm member longfin use > "-fno-common". It is truly sad that we apparently have no test > machine that enforces that const means const ... > > regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Removing max_connection requirement on hot_standby
I'm orchestrating Postgres to behave as a leader-follower cluster. I've run into issues when I am scaling down a connection count for a cluster (scaling up is fine -- scaling down results in fatal errors). I use an open source tool I've written to orchestrate the cluster called Governor ( http://github.com/compose/governor). It would work if I weren't running hot_standby, but alas…I'm running with it. I found the code which throws the fatal is actually a pre-flight test for hot_standby written in 2009 (i.e. battle tested): https://github.com/postgres/postgres/blob/efc16ea520679d713d98a2c7bf1453c4ff7b91ec/src/backend/access/transam/xlog.c#L5312-L5321 I've tested changing this value from a FATAL to a WARN. I've compiled and tested my scenario and all appears to be correct: https://github.com/compose/postgres/commit/2bdf6b36821987aadb401e1b8590ecc5b02126d8 In researching these lines of code, it appears the original FATAL code was put in place to ensure that a hot_standby is configured as closely as possible to the leader. This change will also allow backups taken using `pg_basebackup` to work with settings that differ from the original host. Am I missing something with this change? Cheers, Chris
Re: [HACKERS] CTE optimization fence on the todo list?
I need this feature a lot. Can anyone point me to a place in the code where I can hack together a quick-and-dirty, compatibility-breaking implementation? Thanks! On Sun, May 3, 2015 at 10:03 PM, Jim Nasby jim.na...@bluetreble.com wrote: On 5/3/15 11:59 AM, Andrew Dunstan wrote: On 05/03/2015 11:49 AM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 05/01/2015 07:24 PM, Josh Berkus wrote: (A possible compromise position would be to offer a new GUC to enable/disable the optimization globally; that would add only a reasonably small amount of control code, and people who were afraid of the change breaking their apps would probably want a global disable anyway.) This could be a very bad, almost impossible to catch, behaviour break. Even if we add the GUC, we're probably going to be imposing very significant code audit costs on some users. On what grounds do you claim it'd be a behavior break? It's possible that the subquery flattening would result in less-desirable plans not more-desirable ones, but the results should still be correct. I meant w.r.t. performance. Sorry if that wasn't clear. To put this in perspective... I've seen things like this take query runtime from minutes to multiple hours or worse; bad enough that behavior break becomes a valid description. We definitely need to highlight this in the release notes, and I think the GUC would be mandatory. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CTE optimization fence on the todo list?
Has there been any movement on this in the last couple years? I could really use the ability to optimize across CTE boundaries, and it seems like a lot of other people could too.
[HACKERS] Re: [pgsql-pkg-debian] Updated libpq5 packages cause connection errors on postgresql 9.2
Hi Christoph, - Original Message - From: Christoph Berg c...@df7cb.de To: Chris Butler cbut...@zedcore.com Googling for digest too big for rsa key seems to indicate that this problem occurs when you are using (client?) certificates with short RSA keys. 512 bits is most often cited in the problem reports, something like 768 is around the minimum size that works, and of course, anything smaller than 1024 or really 1536 (or 2048) bits is too small for today's crypto standards. So the question here is if this is also the problem you saw - are you using client or server certificates with short keys? Yes, that would appear to be the case - the key we're using is only 512 bits. I'll make sure we replace the certificate before re-applying the update (which will probably be after the holidays now). What this explanation doesn't explain is why the problem occurs with 9.4's libpq5 while it works with 9.3's. The libssl version used for building these packages should really be the same, 9.3.5-2.pgdg70+1 was built just two days ago as well. For info, I can confirm that both libraries are loading the same libssl: zedcore@web2:/tmp/usr/lib/x86_64-linux-gnu$ ldd /usr/lib/x86_64-linux-gnu/libpq.so.5 | grep libssl libssl.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x7f3e8d898000) zedcore@web2:/tmp/usr/lib/x86_64-linux-gnu$ ldd ./libpq.so.5 | grep libssl libssl.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x7f5d76176000) I can see a few changes are listed in the 9.4 changelog relating to SSL, so my guess would be one of those changes has altered the behaviour of libssl when presented with a small key. -- Chris Butler Zedcore Systems Ltd Telephone: 0114 303 0666 Direct dial: 0114 303 0572 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] why does LIMIT 2 take orders of magnitude longer than LIMIT 1 in this query?
I'm on PostgreSQL 9.3. This should reproduce on any table with 100,000+ rows. The EXPLAIN ANALYZE shows many more rows getting scanned with LIMIT 2, but I can't figure out why.

Limit 1:

EXPLAIN ANALYZE
WITH base AS (
  SELECT *, ROW_NUMBER() OVER () AS rownum FROM a_big_table
),
filter AS (
  SELECT rownum, true AS thing FROM base
)
SELECT * FROM base
LEFT JOIN filter USING (rownum)
WHERE filter.thing
LIMIT 1

Result:

Limit  (cost=283512.19..283517.66 rows=1 width=2114) (actual time=0.019..0.019 rows=1 loops=1)
  CTE base
    ->  WindowAgg  (cost=0.00..188702.69 rows=4740475 width=101) (actual time=0.008..0.008 rows=1 loops=1)
          ->  Seq Scan on a_big_table  (cost=0.00..129446.75 rows=4740475 width=101) (actual time=0.003..0.003 rows=1 loops=1)
  CTE filter
    ->  CTE Scan on base base_1  (cost=0.00..94809.50 rows=4740475 width=8) (actual time=0.000..0.000 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..307677626611.24 rows=56180269915 width=2114) (actual time=0.018..0.018 rows=1 loops=1)
        Join Filter: (base.rownum = filter.rownum)
        ->  CTE Scan on base  (cost=0.00..94809.50 rows=4740475 width=2113) (actual time=0.011..0.011 rows=1 loops=1)
        ->  CTE Scan on filter  (cost=0.00..94809.50 rows=2370238 width=9) (actual time=0.002..0.002 rows=1 loops=1)
              Filter: thing
Total runtime: 0.057 ms

Limit 2:

EXPLAIN ANALYZE
WITH base AS (
  SELECT *, ROW_NUMBER() OVER () AS rownum FROM a_big_table
),
filter AS (
  SELECT rownum, true AS thing FROM base
)
SELECT * FROM base
LEFT JOIN filter USING (rownum)
WHERE filter.thing
LIMIT 2

Result:

Limit  (cost=283512.19..283523.14 rows=2 width=2114) (actual time=0.018..14162.283 rows=2 loops=1)
  CTE base
    ->  WindowAgg  (cost=0.00..188702.69 rows=4740475 width=101) (actual time=0.008..4443.359 rows=4714243 loops=1)
          ->  Seq Scan on a_big_table  (cost=0.00..129446.75 rows=4740475 width=101) (actual time=0.002..1421.622 rows=4714243 loops=1)
  CTE filter
    ->  CTE Scan on base base_1  (cost=0.00..94809.50 rows=4740475 width=8) (actual time=0.001..10214.684 rows=4714243 loops=1)
  ->  Nested Loop  (cost=0.00..307677626611.24 rows=56180269915 width=2114) (actual time=0.018..14162.280 rows=2 loops=1)
        Join Filter: (base.rownum = filter.rownum)
        Rows Removed by Join Filter: 4714243
        ->  CTE Scan on base  (cost=0.00..94809.50 rows=4740475 width=2113) (actual time=0.011..0.028 rows=2 loops=1)
        ->  CTE Scan on filter  (cost=0.00..94809.50 rows=2370238 width=9) (actual time=0.009..6595.770 rows=2357122 loops=2)
              Filter: thing
Total runtime: 14247.374 ms
Re: [HACKERS] Proposal for better support of time-varying timezone abbreviations
On Tue, Oct 7, 2014 at 5:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:

typedef struct
{
!   char        token[TOKMAXLEN + 1];   /* now always null-terminated */
    char        type;
!   int32       value;
} datetkn;

Being entirely new to this code, "now" makes me think of the current timestamp. I think this word can be removed from the comment to reduce ambiguity.

+   /* use strncmp so that we match truncated tokens */
    result = strncmp(key, position->token, TOKMAXLEN);

In your proposal you wanted to remove crufty code that deals with non-null-terminated token strings. Is this some of that crufty code? Can it be removed? -- Chris
Re: [HACKERS] 9.3 Json Array's
On 24 September 2013 at 13:46 Andrew Dunstan and...@dunslane.net wrote: Feel free to ask questions. The heart of the API is the event handlers defined in this struct in include/utils/jsonapi.h:

typedef struct JsonSemAction
{
    void       *semstate;
    json_struct_action object_start;
    json_struct_action object_end;
    json_struct_action array_start;
    json_struct_action array_end;
    json_ofield_action object_field_start;
    json_ofield_action object_field_end;
    json_aelem_action array_element_start;
    json_aelem_action array_element_end;
    json_scalar_action scalar;
} JsonSemAction;

Basically there is a handler for the start and end of each non-scalar structural element in JSON, plus a handler for scalars. There are several problems that will be posed by processing nested arrays and objects, including: * in effect you would need to construct a stack of state that could be pushed and popped True. * JSON arrays aren't a very good match for SQL arrays - they are unidimensional and heterogenous. This is true, but I think one would have to start with an assumption that the data is valid for an SQL type and then check again once one gets it done. JSON is a pretty flexible format which makes it a poor match in many cases for SQL types generally. But I think the example so far (with json_populate_recordset) is a good one, namely a best effort conversion. I'm not saying this can't be done - it will just take a bit of effort. Yeah, looking through the code, I think it will be more work than I originally thought but that just means it will take longer. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers Best Wishes, Chris Travers http://www.2ndquadrant.com PostgreSQL Services, Training, and Support
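The "stack of state" point can be illustrated with a toy walker over an already-parsed value: each container start pushes and each end pops, which is roughly what a consumer of the JsonSemAction callbacks has to do by hand. This is a Python sketch of the event shape, not the C API:

```python
def walk(node, events, stack):
    """Fire SAX-ish events for a parsed JSON value, tracking nesting
    depth with an explicit stack the way a JsonSemAction-style consumer
    would for nested objects and arrays."""
    if isinstance(node, dict):
        stack.append("object")
        events.append(("object_start", len(stack)))
        for key, val in node.items():
            events.append(("object_field_start", key))
            walk(val, events, stack)
        events.append(("object_end", len(stack)))
        stack.pop()
    elif isinstance(node, list):
        stack.append("array")
        events.append(("array_start", len(stack)))
        for val in node:
            walk(val, events, stack)
        events.append(("array_end", len(stack)))
        stack.pop()
    else:
        events.append(("scalar", node))

events, stack = [], []
walk({"tags": ["GAME", "WII"]}, events, stack)
# nesting returned to zero when the walk finished
assert stack == []
```

The heterogeneity problem shows up here too: nothing in the event stream guarantees an array's scalars share a type, so a best-effort conversion to an SQL array still has to validate after the fact.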
Re: [HACKERS] 9.3 Json Array's
On 23 September 2013 at 23:37 Adam Jelinek ajeli...@gmail.com wrote:

I am sure I am doing something wrong here, or this is an unsupported feature, but I wanted to double check. I was hoping that if I did a json_agg(x) and then copied the output of that and passed it into json_populate_recordset that I would get the record back. I know I can make things work using a CTE and other functions like json_each, but I was hoping for a simple one-liner.

Yeah, I had the same experience. It is not supported. I am looking at trying to add support for nested objects and better support for arrays. Interested in collaborating?

CREATE SCHEMA varrm;

CREATE SEQUENCE varrm.item_id_seq;

CREATE TABLE varrm.item
(item_id          bigint DEFAULT nextval('varrm.item_id_seq')
,title            text
,short_desc       text
,long_desc        text
,tags             text[]
       ^^^ That is what it chokes on.
,external_api_key text
,trans_timestamp  timestamp without time zone DEFAULT now()
,upsert_timestamp timestamp without time zone DEFAULT clock_timestamp()
,end_timestamp    timestamp without time zone DEFAULT '-12-31 23:59:59.99'::timestamp without time zone
,CONSTRAINT item_primary_key PRIMARY KEY (item_id)
);

INSERT INTO varrm.item (title, short_desc, long_desc, tags, external_api_key)
VALUES ('My Title', 'My Short Desc', 'My Very Long Desc', '{GAME, WII, PS4, ACTION, FIRST PERSON SHOOTER}', '1235467');

SELECT json_agg(t1)
FROM (SELECT title, short_desc, long_desc, tags, external_api_key FROM varrm.item) AS t1

--output is
--[{"title":"My Title","short_desc":"My Short Desc","long_desc":"My Very Long Desc","tags":["GAME","WII","PS4","ACTION","FIRST PERSON SHOOTER"],"external_api_key":null}]

SELECT '[{"title":"My Title","short_desc":"My Short Desc","long_desc":"My Very Long Desc","tags":["GAME","WII","PS4","ACTION","FIRST PERSON SHOOTER"],"external_api_key":null}]'::JSON

SELECT * FROM json_populate_recordset(null::varrm.item, '[{"title":"My Title","short_desc":"My Short Desc","long_desc":"My Very Long Desc","tags":["GAME","WII","PS4","ACTION","FIRST PERSON SHOOTER"],"external_api_key":null}]'::JSON)

/** ERROR: cannot
call json_populate_recordset on a nested object **/

I am still in the process of wrapping my head around the current JSON logic. I hope to produce a proof of concept that can later be turned into a patch. See my previous post on this topic. Again, collaboration is welcome.

Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support
Re: [HACKERS] Minmax indexes
On 16 September 2013 at 11:03 Heikki Linnakangas hlinnakan...@vmware.com wrote:

Something like this seems completely sensible to me:

create index i_accounts on accounts using minmax (ts) where valid = true;

The situation where that would be useful is if 'valid' accounts are fairly well clustered, but invalid ones are scattered all over the table. The minimum and maximum stored in the index would only concern valid accounts.

- Heikki

Here's one that occurs to me:

CREATE INDEX i_billing_id_mm ON billing(id) WHERE paid_in_full IS NOT TRUE;

Note that this would be a frequently moving target, and over years of billing the subset would be quite small compared to the full system (imagine, say, 50k rows out of 20M).

Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support
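The win from a minmax index (with or without a predicate) comes from keeping a per-block-range (min, max) summary so whole ranges can be skipped during a scan. A rough Python sketch of just the pruning idea (illustrative only; it has nothing to do with the actual on-disk format):

```python
def build_summaries(blocks):
    """One (min, max) summary per block of values."""
    return [(min(b), max(b)) for b in blocks]


def candidate_blocks(summaries, lo, hi):
    """Return indexes of blocks whose [min, max] range can possibly
    intersect the query range [lo, hi]; all other blocks are skipped
    without ever being read."""
    return [i for i, (mn, mx) in enumerate(summaries) if mx >= lo and mn <= hi]


blocks = [[1, 5, 3], [100, 120, 90], [7, 2, 4]]
summaries = build_summaries(blocks)
print(candidate_blocks(summaries, 0, 10))  # [0, 2]
```

This is also why the clustering caveat matters: if qualifying rows are scattered across every block, every summary overlaps the query range and nothing gets pruned.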
Re: [HACKERS] Proposal: json_populate_record and nested json objects
On 16 September 2013 at 14:43 Merlin Moncure mmonc...@gmail.com wrote:

Huge +1 on this. Couple random thoughts:

*) Hard to see how you would structure this as an extension as you're adjusting the behaviors of existing functions, unless you wanted to introduce new function names for testing purposes?

Yeah, and reading the source, it looks like some parts of the JSON parsing code will have to be rewritten because the nested object errors are thrown quite deeply in the parsing stage. It looks to me as if this will require some significant copying as a POC into a new file with different publicly exposed function names.

*) Would like to at least consider being able to use casting syntax as a replacement for populate_record and (the missing) populate_array for most usages.

Yes. I am trying to figure out how best to do this at present. Initially I think I would be happy to settle for casts wrapping functions which themselves just wrap the call to populate_record. What I will probably do for my POC is expose the following methods:

1. json_populate_type()
2. json_populate_array()

Then we can talk about whether to merge the changes into the core. This will be an interesting way to get into PostgreSQL hacking.

Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support
Re: [HACKERS] Proposal: json_populate_record and nested json objects
On 15 September 2013 at 18:42 Andrew Dunstan and...@dunslane.net wrote:

On 09/14/2013 10:27 PM, chris travers wrote:

Well, you could fairly easily build it as an extension as a POC. The main point of the API this is built on was to allow for extensions. The logic changes might be a bit tricky. I'll be interested to see what you come up with. If we're going to do this we should make these handle arrays as well as objects.

Yes, arrays are necessary for my use case. I probably should have been explicit that I needed that too.

cheers
andrew

Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support
[HACKERS] Proposal: json_populate_record and nested json objects
Hi all;

Currently json_populate_record and json_populate_recordset cannot work with nested json objects. This creates two fundamental problems when trying to use JSON as an interface format.

The first problem is you can't easily embed a json data type in a json object and have it populate a record. This means that storing extended attributes in the database is somewhat problematic if you accept the whole row as a json object.

The second problem is that nested data structures and json don't go together well. You can't have a composite type which has as an attribute an array of another composite type and populate this from a json object. This makes json largely an alternative to hstore for interfaces in its current shape.

I would propose handling json_populate_record and friends as such:

1. Don't throw errors initially as a pre-check if the json object is nested.
2. If one comes to a nested fragment, check the attribute type it is going into first.
2.1 If it is a json type, put the nested fragment there.
2.2 If it is a composite type (i.e. anything in pg_class), push it through another json_populate_record run.
2.3 If it is neither, then see if a json::[type] cast exists; if so, call it.
2.4 Otherwise raise an exception.

I have a few questions before I go on to look at creating a patch.

1. Are there any problems anyone spots with this approach?
2. Is anyone working on something like this?
3. Would it be preferable to build something like this first as an extension (perhaps with different function names) or first as a patch?

Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support
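The dispatch rules above can be sketched in a few lines (a Python sketch with invented type tags and helper names; the real work would be C inside json_populate_record, and the cast lookup here stands in for finding a json::[type] cast):

```python
def populate(record_type, obj, casts=None):
    """record_type maps field -> type tag: the string 'json', a nested
    dict (standing in for a composite type), or a base-type name.
    Nested fragments are handled per rules 2.1-2.4 of the proposal."""
    casts = casts or {}
    out = {}
    for field, ftype in record_type.items():
        val = obj.get(field)
        if isinstance(val, (dict, list)):        # step 2: a nested fragment
            if ftype == "json":                  # 2.1: keep the fragment as json
                out[field] = val
            elif isinstance(ftype, dict):        # 2.2: composite type -> recurse
                out[field] = populate(ftype, val, casts)
            elif ftype in casts:                 # 2.3: a json::[type] cast exists
                out[field] = casts[ftype](val)
            else:                                # 2.4: nothing applies
                raise TypeError(f"cannot populate {field} ({ftype}) from nested json")
        else:
            out[field] = val                     # scalar: plain assignment
    return out
```

For example, `populate({"a": "int", "b": {"c": "int"}, "d": "json"}, {"a": 1, "b": {"c": 2}, "d": {"x": 3}})` recurses into `b` and passes `d` through untouched, while an unhandled nested value raises.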
Re: [HACKERS] Proposal: PL/PgSQL strict_mode
A few thoughts about this.

On 14 September 2013 at 05:28 Marko Tiikkaja ma...@joh.to wrote:

Hi,

After my previous suggestion for adding a STRICT keyword got shot down[1], I've been thinking about an idea Andrew Gierth tossed out: adding a new strict mode into PL/PgSQL. In this mode, any query which processes more than one row will raise an exception. This is a bit similar to specifying INTO STRICT .. for every statement, except processing no rows does not trigger the exception.

The need for this mode comes from a few observations I make almost every day:

1) The majority of statements only deal with exactly 0 or 1 rows.
2) Checking row_count for a statement is ugly and cumbersome, so often it just isn't checked. I often use RETURNING TRUE INTO STRICT _OK for DML, but that a) requires an extra variable, and b) isn't possible if 0 rows affected is not an error in the application logic.
3) SELECT .. INTO only fetches one row and ignores the rest. Even row_count is always set to 0 or 1, so there's no way to fetch a value *and* to check that there would not have been more rows. This creates bugs which make your queries return wrong results and which could go undetected for a long time.

I am going to suggest that STRICT is semantically pretty far from what is meant here in common speech. I think STRICT here would be confusing. This would be really pretty severe for people coming from Perl or MySQL backgrounds. May I suggest SINGLE as a keyword instead? It might be worth having it attached to INSERT, UPDATE, and DELETE statements. I am thinking something like:

DELETE SINGLE FROM foo WHERE f1 > 1000;

would be clearer. Similarly one could have:

INSERT SINGLE INTO foo SELECT * FROM foo2;

and

UPDATE SINGLE foo ...

You could even use SELECT SINGLE, but I am not sure where the use case is there where unique indexes are not sufficient.
Attached is a proof-of-concept patch (no docs, probably some actual code problems too) to implement this as a compile option:

=# create or replace function footest() returns void as $$
$# #strict_mode strict
$# begin
$#   -- not allowed to delete more than one row
$#   delete from foo where f1 > 100;
$# end$$ language plpgsql;
CREATE FUNCTION
=# select footest();
ERROR:  query processed more than one row
CONTEXT:  PL/pgSQL function footest() line 5 at SQL statement

Now while I think this is a step in the right direction, I do have a couple of problems with this patch:

1) I'm not sure what would be the correct behaviour with EXECUTE. I'm tempted to just leave EXECUTE alone, as it has slightly different rules anyway.

2) If you're running in strict mode and you want to insert/update/delete more than one row, things get a bit uglier; a wCTE would work for some cases. If strict mode doesn't affect EXECUTE (see point 1 above), that could work too. Or maybe there could be a new command which runs a query, discards the results and ignores the number of rows processed.

Yeah, I am worried about this one. I am concerned that if you can't disable it on a statement-by-statement basis, then you have a problem where you end up removing the mode from the function, and then it becomes a lot harder for everyone maintaining the function to have a clear picture of what is going on. I am further worried that ugly hacked workarounds will introduce plenty of other maintenance pain points that will be worse than what you are solving.

I'll be adding this to the open commitfest in hopes of getting some feedback on this idea (I'm prepared to hear a lot of "you're crazy!"), but feel free to comment away any time you please.
Well, I don't know if my feedback above is helpful, but there it is ;-)

Regards,
Marko Tiikkaja

[1]: http://www.postgresql.org/message-id/510bf731.5020...@gmx.net

Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support
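The 0-or-1-rows contract the proposed mode enforces is easy to state outside the backend (a Python sketch with made-up names; the real enforcement would live in PL/pgSQL's executor):

```python
class TooManyRows(Exception):
    """Stands in for the 'query processed more than one row' error."""


def run_single(rows):
    """Accept a statement's result set: 0 or 1 rows pass through,
    more than one raises, mirroring the proposed strict/SINGLE mode
    (unlike INTO STRICT, zero rows is not an error)."""
    if len(rows) > 1:
        raise TooManyRows("query processed more than one row")
    return rows[0] if rows else None
```

The asymmetry is the whole point of the proposal: `run_single([])` quietly returns nothing, while two rows is treated as a logic bug rather than silently taking the first.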
Re: [HACKERS] ASYNC Privileges proposal
So I would think that if this was to go further then channels would need to be more of a first class citizen and created explicitly, with CREATE CHANNEL, DROP CHANNEL etc:

CREATE CHANNEL channame;
GRANT LISTEN ON CHANNEL channame TO rolename;
GRANT NOTIFY ON CHANNEL channame TO rolename;
LISTEN channame;        -- OK
NOTIFY channame, 'hi';  -- OK
LISTEN ;        -- exception: no channel named
NOTIFY , 'hi';  -- exception: no channel named

Personally I think explicitly creating channels would be beneficial; I have hit issues where a typo in a channel name has caused a bug to go unnoticed for a while. But of course this would break backwards compatibility with the current model (with implicit channel names), so unless we wanted to force everyone to add CREATE CHANNEL statements during their upgrade, maybe there would need to be some kind of system to work around this. Possibly some kind of catch-all channel that enables implicit channel names?

GRANT LISTEN, NOTIFY ON CHANNEL * TO PUBLIC;  -- enabled by default for backwards compat
NOTIFY ;  -- OK via * CHANNEL
LISTEN ;  -- OK via * CHANNEL

Chris

On 18 June 2013 18:31, Josh Berkus j...@agliodbs.com wrote:

I had a quick play to see what might be involved [attached], and would like to hear people's thoughts; good idea, bad idea, not like that! etc

I question the usefulness of allowing listen/notify to be restricted to an entire class of users. The granularity of this seems too broad, though I am not sure if allowing a message to be sent to a specific user is easily achievable.

Well, if we're going to have privs on LISTEN/NOTIFY at all, they should be on specific message bands, i.e.:

REVOKE LISTEN ON 'cacheupdates' FROM PUBLIC;
GRANT LISTEN ON 'cacheupdates' TO webuser;
GRANT LISTEN ON ALL TO admin;

I can certainly see wanting this, but it'd be a great deal more sophisticated than what Chris has proposed.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
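The catch-all '*' channel idea amounts to a two-step ACL lookup: check the named channel first, then fall back to the wildcard entry. A hypothetical sketch (the ACL shape and role names are invented for illustration):

```python
def allowed(acl, role, channel, action):
    """acl maps (channel, role) -> set of actions ('LISTEN', 'NOTIFY').
    A ('*', role) entry provides the backwards-compatible default for
    implicitly created channels."""
    for chan in (channel, "*"):
        if action in acl.get((chan, role), set()):
            return True
    return False


acl = {
    ("cacheupdates", "webuser"): {"LISTEN"},
    ("*", "PUBLIC"): {"LISTEN", "NOTIFY"},  # the backwards-compat grant
}
```

With this shape, revoking the wildcard grant flips the system into the strict explicitly-created-channels model without changing the lookup code.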
[HACKERS] ASYNC Privileges proposal
Hey all, I find the current LISTEN / NOTIFY rather limited in the context of databases with multiple roles. As it stands it is not possible to restrict the use of LISTEN or NOTIFY to specific roles, and therefore notifications (and their payloads) cannot really be trusted as coming from any particular source. If the payloads of notifications could be trusted, then applications could make better use of them, without fear of leaking any sensitive information to anyone who shouldn't be able to see it. I'd like to propose a new ASYNC database privilege that would control whether a role can use LISTEN, NOTIFY and UNLISTEN statements and the associated pg_notify function. ie: GRANT ASYNC ON DATABASE TO bob; REVOKE ASYNC ON DATABASE FROM bob; SECURITY DEFINER functions could then be used anywhere that a finer grained access control was required. I had a quick play to see what might be involved [attached], and would like to hear people thoughts; good idea, bad idea, not like that! etc Chris. async_privileges_r0.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ASYNC Privileges proposal
In fairness NOTIFY has only had a payload since v9 (maybe 8.4?), and the issue of trust is mainly tied to data leaking from the payload, so I suspect I won't be the last person to request this as people re-visit NOTIFY :) ...but I totally get your point of course. My first thought was also that having control at the channel level would be ideal, but that would be a huge implementation change by the looks of things, would certainly affect performance a great deal more, and would not really give much more benefit than could be attained with database-level control + a SECURITY DEFINER function.

Chris

On 20 May 2013 03:23, Tom Lane t...@sss.pgh.pa.us wrote:

Chris Farmiloe chrisfa...@gmail.com writes:

I find the current LISTEN / NOTIFY rather limited in the context of databases with multiple roles. As it stands it is not possible to restrict the use of LISTEN or NOTIFY to specific roles, and therefore notifications (and their payloads) cannot really be trusted as coming from any particular source.

TBH, nobody has complained about this in the fifteen-plus years that LISTEN has been around. I'm dubious about adding privilege-checking overhead for everybody to satisfy a complaint from one person.

I'd like to propose a new ASYNC database privilege that would control whether a role can use LISTEN, NOTIFY and UNLISTEN statements and the associated pg_notify function.

... and if I did think that there were an issue here, I doubt I'd think that a privilege as coarse-grained as that would fix it. Surely you'd want per-channel privileges if you were feeling paranoid about this, not to mention separate read and write privileges. But the demand for that just isn't out there.

regards, tom lane
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL
What's the use case of this? It sounds like it will just create a maintenance nightmare where some stuff you expect to look up in postgresql.conf is actually hiding in the .auto file. Assuming only super users/sysadmins would have the ability to change things in the config file, wouldn't they be more likely to just do it on the server and edit the .conf (which among other things keeps it tidy and orderly)? Also, how would you propose to handle settings that require the server to be restarted, such as checkpoint_segments? It seems like by allowing these to be set via a command (which isn't really SQL) you're creating the impression that they will take immediate effect, which isn't the case. Just my $0.02. Of course, I might be missing the point.

On 30 Oct 2012, at 00:31, Amit Kapila wrote:

SYNTAX:
ALTER SYSTEM SET configuration_parameter = value COMMENT 'value';

DESIGN IDEA:
(a) have a postgresql.conf.auto
(b) add a default include for postgresql.conf.auto at the beginning of postgresql.conf
(c) SQL updates go to postgresql.conf.auto, which consists only of "setting = value # comments" lines.
(d) We document that settings which are changed manually in postgresql.conf will override postgresql.conf.auto.

IMPLEMENTATION IDEA:
The main idea is we create a lock file; it acts as a lock to avoid concurrent edits to the .auto file and also as an intermediate file where we keep all the new changes until we commit the ALTER SYSTEM command.

CREATION OF AUTO FILE
1. During initdb we create the .auto file and it will be empty.
2. The .conf file will have as its first entry:

#--
# postgresql.conf.auto inclusion
#--
# Do not edit the postgresql.conf.auto file or remove the include.
# You can edit the settings below in this file, which will override the auto-generated file.
include = 'postgresql.conf.auto'

ALGORITHM for ALTER SYSTEM:
1. Check whether the given key : value is valid. -- This is done so that the next read from the .auto file should not throw an error.
2.
Get the postgresql.conf.auto path (always the data directory). -- Since the .auto file is in the data directory, pg_basebackup will pick it up.
3. Create the postgresql.conf.auto.lock file (with the O_EXCL flag). -- This acts as protection from other backends who are trying to edit this file. -- If it already exists, we wait for some time by retrying.
4. Open the postgresql.conf.auto file in read mode.
5. Write the new (key, value, comment) into the postgresql.conf.auto.lock file using the steps below:
   a. Read the contents of postgresql.conf.auto into a memory buffer line by line.
   b. Scan for the key in the postgresql.conf.auto file. If found, get the line number in the file where we have to insert the new (key, value); else we write the new (key, value) pair on the last line.
   c. Add the new (key, value, comment) to the memory buffer at the line found in step b.
   d. Write the memory buffer into the postgresql.conf.auto.lock file. -- Here the memory buffer represents the modified state of the postgresql.conf.auto file.
   e. Commit the .lock file. -- Here we rename the lock file to the auto file. -- If the auto file is opened by another process (SIGHUP processing), then we retry the rename for some time; otherwise the ALTER SYSTEM command fails.
   f. If any error occurs in between, roll back the lock file. -- Here we delete the lock file.

CLARIFICATION
1. Tom, the below is mentioned by you in one of the discussions for this topic. I need a small clarification: "About the only change I want to make immediately is that initdb ought to shove its settings into postgresql.auto instead of mucking with postgresql.conf." So do you mean to say the settings done by initdb (like max_connections, etc.) need to be in the .auto file instead of .conf, and let these parameters be commented out in .conf?
2. Does the .auto file need to be included by default?
3. Can the path of .auto be fixed as the data directory path?

Note:
1. Only one backend can edit the conf file at a time; others wait.
2.
Suppose .auto has an invalid entry, e.g. the listening port number mentioned is taken up by another application; then if we try to restart postgres it fails. This needs manual intervention.
3. This command cannot be executed inside a transaction block. Not sure what to do for this part, whether it needs to be supported in a block?
4. currently command for reset or invalidation of (key, value)
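The lock-file-plus-rename dance in steps 3-5 is the standard atomic-update pattern: O_EXCL creation serializes writers, and the final rename publishes the new file in one step. A condensed, simplified Python sketch of that pattern (illustrative only; file names follow the proposal, and the real implementation would be C in the backend):

```python
import os


def alter_system_set(datadir, key, value):
    auto = os.path.join(datadir, "postgresql.conf.auto")
    lock = auto + ".lock"
    # O_CREAT|O_EXCL plays the role of the advisory lock: creation fails
    # if another writer currently holds the .lock file (step 3).
    fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        lines = []
        if os.path.exists(auto):
            with open(auto) as f:
                # drop any existing line for this key (step 5b)
                lines = [l for l in f if not l.startswith(key + " =")]
        lines.append(f"{key} = '{value}'\n")
        os.write(fd, "".join(lines).encode())
        os.fsync(fd)
    except BaseException:
        os.close(fd)
        os.unlink(lock)  # step 5f: "rollback" by deleting the lock file
        raise
    os.close(fd)
    # Atomic rename (step 5e): readers see either the old or the new
    # file in its entirety, never a half-written one.
    os.replace(lock, auto)
```

Calling it twice with the same key leaves exactly one line for that key, which is the de-duplication behavior step 5b describes.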
Re: [HACKERS] Should select 'nan'::float = 'nan'::float; return false as per IEEE 754
Would this introduce problems finding rows where the stored value is NaN? You'd need to add a function or operator to avoid that.

On 28 Oct 2012, at 20:43, Hannu Krosing wrote:

This is how PostgreSQL currently works:

test=# select 'NaN'::float = 'NaN'::float as must_be_false;
 must_be_false
---------------
 t
(1 row)

I think that PostgreSQL's behaviour of comparing two NaN-s as equal is wrong and we should follow the IEEE 754 spec here. As per IEEE 754 a NaN behaves similar to NULL in SQL. There is some discussion of why it is so at: http://stackoverflow.com/questions/1565164/what-is-the-rationale-for-all-comparisons-returning-false-for-ieee754-nan-values especially the first comment.

- Hannu
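For reference, the IEEE 754 semantics under discussion, as most languages expose them (Python shown): NaN compares unequal to everything, itself included, so any code that needs to find NaN values must use an explicit predicate rather than `=` — which is exactly the "finding stored NaN rows" concern raised above.

```python
import math

nan = float("nan")
# Under IEEE 754, NaN compares unequal to everything, itself included:
print(nan == nan)        # False
print(nan != nan)        # True
# so searching for NaN requires an explicit predicate instead of ==:
print(math.isnan(nan))   # True
```

PostgreSQL instead treats NaN as equal to itself (and ordered after all other float values), which keeps sorting and btree indexing well-defined at the cost of deviating from the spec.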
[HACKERS] Is PQexecParams() simply a wrapper function?
Hi All,

I'm currently using PQexecParams() as part of a bridge to allow Postgres to be used in another programming language. Not specifying the OIDs simplifies things (for the end user), but bytea params still cause headaches, since they need to be formatted differently and I receive the data as a (char *) internally, but can't really infer a type myself, at least not reliably. It looks like PQprepare(), PQdescribePrepared() and PQexecPrepared() could be used to allow me to check if any of the params represent a BYTEAOID, then escape only those params. This seems like 3 round-trips to the server, however. That said, I believe PQexecParams() is doing a similar thing, in that it internally prepares a statement, then executes it (2 round trips). Or am I needlessly concerning myself over microseconds here?

Cheers,
Chris
Re: [HACKERS] SEGFAULT on SELECT * FROM view
I ported the entire schema to my test DB server and could not reproduce the error there. Note that probably recreating the view solves this issue. Given this, how should I proceed to create a test case? Any tutorial on this? (I'm not too familiar with all this yet.)

It's possibly statistics-dependent; make sure you have the same stats targets on both DBs, and try re-analyzing a few times. Check other planner parameters are the same, too.

In addition to the schema, I now also copied parts of the data in our production system to the testing DB. (I cannot copy all data, as it's too large for the testing DB.) After a couple of VACUUM FULL ANALYZEs, the bug still doesn't reproduce. In our production environment, I'm working with the re-created view that doesn't show the SEGFAULT behavior. I can live with this workaround, as only one particular view seems to be affected, but if you want me to debug more on the error please let me know. Thanks for your help until now!

Chris
[HACKERS] SEGFAULT on SELECT * FROM view
Hi all,

When SELECTing from one particular view, I get reproducible SEGFAULTs in my pg 9.1.2 production database. To test for hardware errors, I ran memtest, which succeeded. I then asked for help in #postgres and got advice to create stacktraces, but I couldn't find the problem yet.

Scenario: VIEW 'vwa' on TABLE 'tba' and some subselects and aggregates. I've no idea what's wrong with this view, but a

SELECT * FROM vwa WHERE myid = 110 LIMIT 100 OFFSET 0;

and even EXPLAINing this statement SEGFAULTs the server.

gdb trace: http://pgsql.privatepaste.com/eacd1b6c5d
gdb ec output: http://pgsql.privatepaste.com/a61db5b564
smaps: http://pgsql.privatepaste.com/3c4f494015

Then I used the definition of vwa, and created vwb, and the very same SELECT (just with vwb) works perfectly fine. I've a pastebin link comparing the two view definitions, which I'm willing to share privately if that helps. The views actually differ, although they look identical with \d+ in the psql console, in that the newer view names more columns that were added to the referenced tables lately. So, you tell me, what's wrong with the old view?

Thanks a lot,
Chris
Re: [HACKERS] SEGFAULT on SELECT * FROM view
So, you tell me, what's wrong with the old view?

Nothing; that looks like a planner bug to me. Please submit a self-contained test case.

I ported the entire schema to my test DB server and could not reproduce the error there. Note that probably recreating the view solves this issue. Given this, how should I proceed to create a test case? Any tutorial on this? (I'm not too familiar with all this yet.)

Chris
Re: [HACKERS] 9.1.2 ?
On Wed, Nov 9, 2011 at 6:22 PM, Florian Pflug f...@phlo.org wrote:

On Nov 9, 2011, at 23:53, Daniel Farina wrote:

I think a novice user would be scared half to death: I know I was the first time. That's not a great impression for the project to leave for what is not, at its root, a vast defect, and the fact it's occurring for people when they use rsync rather than my very sensitive backup routines is an indication that it's not very corner-ey.

Just to emphasize the non-corner-ness of this problem, it should be mentioned that the HS issue was observed even with backups taken with pg_basebackup, if memory serves correctly.

Yes, I personally can reliably reproduce both the clog+subtrans problems using pg_basebackup, and can confirm that the oldestActiveXid_fixed.v2.patch does resolve both issues.
Re: [HACKERS] Hot Standby startup with overflowed snapshots
oops, reply-to-all...

-- Forwarded message --
From: Chris Redekop ch...@replicon.com
Date: Wed, Nov 2, 2011 at 8:41 AM
Subject: Re: [HACKERS] Hot Standby startup with overflowed snapshots
To: Simon Riggs si...@2ndquadrant.com

Sure, I've got quite a few logs lying around - I've attached 3 of 'em...let me know if there are any specific things you'd like me to do or look for next time it happens.

On Wed, Nov 2, 2011 at 2:59 AM, Simon Riggs si...@2ndquadrant.com wrote:

On Fri, Oct 28, 2011 at 3:42 AM, Chris Redekop ch...@replicon.com wrote:

On a side note I am sporadically seeing another error on hot standby startup. I'm not terribly concerned about it as it is pretty rare and it will work on a retry, so it's not a big deal. The error is FATAL: out-of-order XID insertion in KnownAssignedXids. If you think it might be a bug and are interested in hunting it down let me know and I'll help any way I can...but if you're not too worried about it then neither am I :)

I'd be interested to see further details of this if you see it again, or have access to previous logs.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

postgresql-2011-10-27_202007.log Description: Binary data
postgresql-2011-10-31_152925.log Description: Binary data
postgresql-2011-11-01_094501.log Description: Binary data
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
okay, sorry I'm a little confused then. Should I be able to apply both the v2 patch as well as the v3 patch? or is it expected that I'd have to manually do the merge? On Wed, Nov 2, 2011 at 1:34 AM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, Nov 2, 2011 at 2:40 AM, Chris Redekop ch...@replicon.com wrote: looks like the v3 patch re-introduces the pg_subtrans issue... No, I just separated the patches to be clearer about the individual changes. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
looks like the v3 patch re-introduces the pg_subtrans issue... On Tue, Nov 1, 2011 at 9:33 AM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, Oct 27, 2011 at 4:25 PM, Simon Riggs si...@2ndquadrant.com wrote: StartupMultiXact() didn't need changing, I thought, but I will review further. Good suggestion. On review, StartupMultiXact() could also suffer similar error to the clog failure. This was caused *because* MultiXact is not maintained by recovery, which I had thought meant it was protected from such failure. Revised patch attached. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Hot Standby startup with overflowed snapshots
Thanks for the patch Simon, but unfortunately it does not resolve the issue I am seeing. The standby still refuses to finish starting up until long after all clients have disconnected from the primary (10 minutes). I do see your new log statement on startup, but only once - it does not repeat. Is there any way for me to see what the oldest xid on the standby is via controldata or something like that? The standby does stream to keep up with the primary while the primary has load, and then it becomes idle when the primary becomes idle (when I kill all the connections), so it appears to be current...but it just doesn't finish starting up. I'm not sure if it's relevant, but after it has sat idle for a couple minutes I start seeing these statements in the log (with the same offset every time):

DEBUG: skipping restartpoint, already performed at 9/9520

On Thu, Oct 27, 2011 at 7:26 AM, Simon Riggs si...@2ndquadrant.com wrote:

Chris Redekop's recent report of slow startup for Hot Standby has made me revisit the code there. Although there isn't a bug, there is a missed opportunity for starting up faster which could be the source of Chris' annoyance. The following patch allows a faster startup in some circumstances. The patch also alters the log levels for messages and gives a single simple message for this situation. The log will now say

LOG: recovery snapshot waiting for non-overflowed snapshot or until oldest active xid on standby is at least %u (now %u)

...multiple times until snapshot non-overflowed or xid reached... whereas before the first LOG message shown was

LOG: consistent state delayed because recovery snapshot incomplete

and only later, at DEBUG2, do you see

LOG: recovery snapshot waiting for %u oldest active xid on standby is %u

...multiple times until xid reached...

Comments please.
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby startup with overflowed snapshots
hrmz, still basically the same behaviour. I think it might be a *little* better with this patch. Before, when under load, it would start up quickly maybe 2 or 3 times out of 10 attempts... with this patch it might be up to 4 or 5 times out of 10... ish... or maybe it was just fluke *shrug*. I'm still only seeing your log statement a single time (I'm running at debug2). I have discovered something, though - when the standby is in this state, if I force a checkpoint on the primary then the standby comes right up. Is there anything I can check or try for you to help figure this out? Or is it actually as designed that it could take 10-ish minutes to start up even after all clients have disconnected from the primary? On Thu, Oct 27, 2011 at 11:27 AM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, Oct 27, 2011 at 5:26 PM, Chris Redekop ch...@replicon.com wrote: Thanks for the patch Simon, but unfortunately it does not resolve the issue I am seeing. The standby still refuses to finish starting up until long after all clients have disconnected from the primary (10 minutes). I do see your new log statement on startup, but only once - it does not repeat. Is there any way for me to see what the oldest xid on the standby is via controldata or something like that? The standby does stream to keep up with the primary while the primary has load, and then it becomes idle when the primary becomes idle (when I kill all the connections), so it appears to be current... but it just doesn't finish starting up. I'm not sure if it's relevant, but after it has sat idle for a couple of minutes I start seeing these statements in the log (with the same offset every time): DEBUG: skipping restartpoint, already performed at 9/9520 OK, so it looks like there are 2 opportunities to improve, not just one. Try this. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Hot Standby startup with overflowed snapshots
Sorry... 'designed' was a poor choice of words; I meant 'not unexpected'. Doing the checkpoint right after pg_stop_backup() looks like it will work perfectly for me, so thanks for all your help! On a side note, I am sporadically seeing another error on hot standby startup. I'm not terribly concerned about it, as it is pretty rare and it will work on a retry, so it's not a big deal. The error is FATAL: out-of-order XID insertion in KnownAssignedXids. If you think it might be a bug and are interested in hunting it down, let me know and I'll help any way I can... but if you're not too worried about it then neither am I :) On Thu, Oct 27, 2011 at 4:55 PM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, Oct 27, 2011 at 10:09 PM, Chris Redekop ch...@replicon.com wrote: hrmz, still basically the same behaviour. I think it might be a *little* better with this patch. Before, when under load, it would start up quickly maybe 2 or 3 times out of 10 attempts... with this patch it might be up to 4 or 5 times out of 10... ish... or maybe it was just fluke *shrug*. I'm still only seeing your log statement a single time (I'm running at debug2). I have discovered something, though - when the standby is in this state, if I force a checkpoint on the primary then the standby comes right up. Is there anything I can check or try for you to help figure this out? Or is it actually as designed that it could take 10-ish minutes to start up even after all clients have disconnected from the primary? Thanks for testing. The improvements cover specific cases, so it's not subject to chance; it's not a performance patch. It's not designed to act the way you describe, but it does. The reason this occurs is that you have a transaction-heavy workload with occasional periods of complete quiet and a base backup time that is much less than checkpoint_timeout. If your base backup were slower, the checkpoint would have hit naturally before recovery had reached a consistent state. Which seems fairly atypical.
I guess you're doing this on a test system. It seems cheap to add in a call to LogStandbySnapshot() after each call to pg_stop_backup(). Does anyone think this case is worth adding code for? Seems like one more thing to break. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
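The workaround settled on above can be sketched as a psql session on the primary. LogStandbySnapshot() is an internal C function with no SQL interface, but an explicit CHECKPOINT after the backup has a similar effect, since checkpoints under wal_level = hot_standby also log a running-xacts snapshot. This is an illustration of the discussion, not a supported recipe:

```sql
-- On the primary (label 'base' is arbitrary; true = immediate checkpoint):
SELECT pg_start_backup('base', true);
-- ... copy the data directory here (rsync, pg_basebackup, etc.) ...
SELECT pg_stop_backup();
CHECKPOINT;  -- forces a fresh running-xacts record so the standby can
             -- reach a consistent, non-overflowed snapshot sooner
```

This matches Chris' observation that forcing a checkpoint on the primary makes the stuck standby come right up.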
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
And I think they also reported that if they didn't run hot standby, but just normal recovery into a new master, it didn't have the problem either, i.e. without hot standby, recovery ran, properly extended the clog, and then ran as a new master fine. Yes, this is correct... attempting to start as a hot standby will produce the pg_clog error repeatedly, and then, without changing anything else, just turning hot standby off, it will start up successfully. This fits the OP's observation of the problem vanishing when pg_start_backup() does an immediate checkpoint. Note that this is *not* the behaviour I'm seeing... it's possible it happens more frequently without the immediate checkpoint, but I am seeing it happen even with the immediate checkpoint. This is a different problem and has already been reported by one of your colleagues in a separate thread, and answered in detail by me there. There is no bug related to this error message. Excellent... I will continue this discussion in that thread.
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
FYI I have given this patch a good test and can no longer reproduce either the subtrans or the clog error. Thanks guys! On Wed, Oct 26, 2011 at 11:09 AM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, Oct 26, 2011 at 5:16 PM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, Oct 26, 2011 at 5:08 PM, Simon Riggs si...@2ndquadrant.com wrote: Brewing a patch now. Latest thinking... confirmations or other error reports please. This fixes both the subtrans and clog bugs in one patch. I'll be looking to commit it tomorrow afternoon as two separate patches with appropriate credits. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
Chris, can you rearrange the backup so you copy the pg_control file as the first act after the pg_start_backup? I tried this and it doesn't seem to make any difference. I also tried the patch, and I can no longer reproduce the subtrans error; however, instead it now starts up but never gets to the point where it'll accept connections. It starts up, but if I try to do anything I always get FATAL: the database system is starting up... even if the load is removed from the primary, the standby still never finishes starting up. Attached below is a log of one of these startup attempts. In my testing with the patch applied, approx 3 in 10 attempts start up successfully, and 7 in 10 attempts go into the 'db is starting up' state... the pg_clog error is still there, but seems much harder to reproduce now... I've seen it only once since applying the patch (out of probably 50 or 60 under-load startup attempts). It does seem to be moody like that though... it will be very difficult to reproduce for a while, and then it will happen damn-near every time for a while... weirdness. On a bit of a side note, I've been thinking of changing my scripts so that they perform an initial rsync prior to doing the startbackup-rsync-stopbackup, just so that the second rsync will be faster... so that the backup is in progress for a shorter period of time, as while it is running it will stop other standbys from starting up... this shouldn't cause any issues, eh?
2011-10-25 13:43:24.035 MDT [15072]: [1-1] LOG: database system was interrupted; last known up at 2011-10-25 13:43:11 MDT 2011-10-25 13:43:24.035 MDT [15072]: [2-1] LOG: creating missing WAL directory pg_xlog/archive_status 2011-10-25 13:43:24.037 MDT [15072]: [3-1] LOG: entering standby mode DEBUG: received replication command: IDENTIFY_SYSTEM DEBUG: received replication command: START_REPLICATION 2/CF00 2011-10-25 13:43:24.041 MDT [15073]: [1-1] LOG: streaming replication successfully connected to primary 2011-10-25 13:43:24.177 MDT [15092]: [1-1] FATAL: the database system is starting up 2011-10-25 13:43:24.781 MDT [15072]: [4-1] DEBUG: checkpoint record is at 2/CF81A478 2011-10-25 13:43:24.781 MDT [15072]: [5-1] DEBUG: redo record is at 2/CF20; shutdown FALSE 2011-10-25 13:43:24.781 MDT [15072]: [6-1] DEBUG: next transaction ID: 0/4634700; next OID: 1188228 2011-10-25 13:43:24.781 MDT [15072]: [7-1] DEBUG: next MultiXactId: 839; next MultiXactOffset: 1686 2011-10-25 13:43:24.781 MDT [15072]: [8-1] DEBUG: oldest unfrozen transaction ID: 1669, in database 1 2011-10-25 13:43:24.781 MDT [15072]: [9-1] DEBUG: transaction ID wrap limit is 2147485316, limited by database with OID 1 2011-10-25 13:43:24.783 MDT [15072]: [10-1] DEBUG: resetting unlogged relations: cleanup 1 init 0 2011-10-25 13:43:24.791 MDT [15072]: [11-1] DEBUG: initializing for hot standby 2011-10-25 13:43:24.791 MDT [15072]: [12-1] LOG: consistent recovery state reached at 2/CF81A4D0 2011-10-25 13:43:24.791 MDT [15072]: [13-1] LOG: redo starts at 2/CF20 2011-10-25 13:43:25.019 MDT [15072]: [14-1] LOG: consistent state delayed because recovery snapshot incomplete 2011-10-25 13:43:25.019 MDT [15072]: [15-1] CONTEXT: xlog redo running xacts: nextXid 4634700 latestCompletedXid 4634698 oldestRunningXid 4634336; 130 xacts: 4634336 4634337 4634338 4634339 4634340 4634341 4634342 4634343 4634344 4634345 4634346 4634347 4634348 4634349 4634350 4634351 4634352 4634353 4634354 4634355 4634356 4634357 4634358 
4634359 4634360 4634361 4634362 4634363 4634364 4634365 4634366 4634367 4634368 4634369 4634370 4634371 4634515 4634516 4634517 4634518 4634519 4634520 4634521 4634522 4634523 4634524 4634525 4634526 4634527 4634528 4634529 4634530 4634531 4634532 4634533 4634534 4634535 4634536 4634537 4634538 4634539 4634540 4634541 4634542 4634543 4634385 4634386 4634387 4634388 4634389 4634390 4634391 4634392 4634393 4634394 4634395 4634396 4634397 4634398 4634399 4634400 4634401 4634402 4634403 4634404 4634405 4634406 4634407 4634408 4634409 4634410 4634411 4634412 4634413 4634414 4634415 4634416 4634417 4634418 4634419 4634420 4634579 4634580 4634581 4634582 4634583 4634584 4634585 4634586 4634587 4634588 4634589 4634590 4634591 4634592 4634593 4634594 4634595 4634596 4634597 4634598 4634599 4634600 4634601 4634602 4634603 4634604 4634605 4634606 4634607; subxid ovf 2011-10-25 13:43:25.240 MDT [15130]: [1-1] FATAL: the database system is starting up DEBUG: standby sync_rep_test has now caught up with primary 2011-10-25 13:43:26.304 MDT [15167]: [1-1] FATAL: the database system is starting up 2011-10-25 13:43:27.366 MDT [15204]: [1-1] FATAL: the database system is starting up 2011-10-25 13:43:28.426 MDT [15241]: [1-1] FATAL: the database system is starting up 2011-10-25 13:43:29.461 MDT [15275]: [1-1] FATAL: the database system is starting up and so on... On Tue, Oct 25, 2011 at 6:51 AM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, Oct 25, 2011 at 12:39 PM, Florian Pflug f...@phlo.org
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
That isn't a Hot Standby problem, a recovery problem, nor is it certain it's a PostgreSQL problem. Do you have any theories on this that I could help investigate? It happens even when using pg_basebackup, and it persists until another sync is performed, so the files must be in some state that it can't recover from... without understanding the internals, just viewing from an outside perspective, I don't really see how this could not be a PostgreSQL problem.
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
I can confirm that both the pg_clog and pg_subtrans errors do occur when using pg_basebackup instead of rsync. The data itself seems to be fine, because using the exact same data I can start up a warm standby no problem; it is just the hot standby that will not start up. On Sat, Oct 15, 2011 at 7:33 PM, Chris Redekop ch...@replicon.com wrote: Linas, could you capture the output of pg_controldata *and* increase the log level to DEBUG1 on the standby? We should then see the nextXid value of the checkpoint the recovery is starting from. I'll try to do that whenever I'm in that territory again... Incidentally, recently there was a lot of unrelated-to-this-post work to polish things up for a talk being given at PGWest 2011 Today :) I also checked what rsync does when a file vanishes after rsync computed the file list, but before it is sent. rsync 3.0.7 on OSX, at least, complains loudly, and doesn't sync the file. It BTW also exits non-zero, with a special exit code for precisely that failure case. To be precise, my script has logic to accept the exit code 24, just as stated in the PG manual: Docs> For example, some versions of rsync return a separate exit code for Docs> vanished source files, and you can write a driver script to accept Docs> this exit code as a non-error case. I also am running into this issue and can reproduce it very reliably. For me, however, it happens even when doing the fast backup like so: pg_start_backup('whatever', true)... my traffic is more write-heavy than linas's tho, so that might have something to do with it. Yesterday it reliably errored out on pg_clog every time, but today it is failing sporadically on pg_subtrans (which seems to be past where the pg_clog error was)... the only thing that has changed is that I've changed the log level to debug1... I wouldn't think that could be related though. I've linked the requested pg_controldata and debug1 logs for both errors.
Both links contain the output from pg_start_backup, rsync, pg_stop_backup, pg_controldata, and then the postgres debug1 log produced from a subsequent startup attempt. pg_clog: http://pastebin.com/mTfdcjwH pg_subtrans: http://pastebin.com/qAXEHAQt Any workarounds would be very appreciated... would copying clog+subtrans before or after the rest of the data directory (or something like that) make any difference? Thanks!
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
Well, on the other hand, maybe there is something wrong with the data. Here's the test/steps I just did -
1. I do the pg_basebackup when the master is under load; a hot slave now will not start up, but a warm slave will.
2. I start a warm slave and let it catch up to current.
3. On the slave I change 'hot_standby=on' and do a 'service postgresql restart'.
4. The postgres fails to restart with the same error.
5. I turn hot_standby back off and postgres starts back up fine as a warm slave.
6. I then turn off the load; the slave is all caught up, and master and slave are both sitting idle.
7. I, again, change 'hot_standby=on' and do a service restart.
8. Again it fails, with the same error, even though there is no longer any load.
9. I repeat this warmstart/hotstart cycle a couple more times until, to my surprise, instead of failing, it successfully starts up as a hot standby (this is after maybe 5 minutes or so of sitting idle).
So... given that it continued to fail even after the load had been turned off, that makes me believe that the data which was copied over was invalid in some way, and when a checkpoint/logrotation/something else occurred when not under load, it cleared itself up... I'm shooting in the dark here. Anyone have any suggestions/ideas/things to try? On Mon, Oct 17, 2011 at 2:13 PM, Chris Redekop ch...@replicon.com wrote: I can confirm that both the pg_clog and pg_subtrans errors do occur when using pg_basebackup instead of rsync. The data itself seems to be fine, because using the exact same data I can start up a warm standby no problem; it is just the hot standby that will not start up. On Sat, Oct 15, 2011 at 7:33 PM, Chris Redekop ch...@replicon.com wrote: Linas, could you capture the output of pg_controldata *and* increase the log level to DEBUG1 on the standby? We should then see the nextXid value of the checkpoint the recovery is starting from. I'll try to do that whenever I'm in that territory again...
Incidentally, recently there was a lot of unrelated-to-this-post work to polish things up for a talk being given at PGWest 2011 Today :) I also checked what rsync does when a file vanishes after rsync computed the file list, but before it is sent. rsync 3.0.7 on OSX, at least, complains loudly, and doesn't sync the file. It BTW also exits non-zero, with a special exit code for precisely that failure case. To be precise, my script has logic to accept the exit code 24, just as stated in the PG manual: Docs> For example, some versions of rsync return a separate exit code for Docs> vanished source files, and you can write a driver script to accept Docs> this exit code as a non-error case. I also am running into this issue and can reproduce it very reliably. For me, however, it happens even when doing the fast backup like so: pg_start_backup('whatever', true)... my traffic is more write-heavy than linas's tho, so that might have something to do with it. Yesterday it reliably errored out on pg_clog every time, but today it is failing sporadically on pg_subtrans (which seems to be past where the pg_clog error was)... the only thing that has changed is that I've changed the log level to debug1... I wouldn't think that could be related though. I've linked the requested pg_controldata and debug1 logs for both errors. Both links contain the output from pg_start_backup, rsync, pg_stop_backup, pg_controldata, and then the postgres debug1 log produced from a subsequent startup attempt. pg_clog: http://pastebin.com/mTfdcjwH pg_subtrans: http://pastebin.com/qAXEHAQt Any workarounds would be very appreciated... would copying clog+subtrans before or after the rest of the data directory (or something like that) make any difference? Thanks!
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
Linas, could you capture the output of pg_controldata *and* increase the log level to DEBUG1 on the standby? We should then see the nextXid value of the checkpoint the recovery is starting from. I'll try to do that whenever I'm in that territory again... Incidentally, recently there was a lot of unrelated-to-this-post work to polish things up for a talk being given at PGWest 2011 Today :) I also checked what rsync does when a file vanishes after rsync computed the file list, but before it is sent. rsync 3.0.7 on OSX, at least, complains loudly, and doesn't sync the file. It BTW also exits non-zero, with a special exit code for precisely that failure case. To be precise, my script has logic to accept the exit code 24, just as stated in the PG manual: Docs> For example, some versions of rsync return a separate exit code for Docs> vanished source files, and you can write a driver script to accept Docs> this exit code as a non-error case. I also am running into this issue and can reproduce it very reliably. For me, however, it happens even when doing the fast backup like so: pg_start_backup('whatever', true)... my traffic is more write-heavy than linas's tho, so that might have something to do with it. Yesterday it reliably errored out on pg_clog every time, but today it is failing sporadically on pg_subtrans (which seems to be past where the pg_clog error was)... the only thing that has changed is that I've changed the log level to debug1... I wouldn't think that could be related though. I've linked the requested pg_controldata and debug1 logs for both errors. Both links contain the output from pg_start_backup, rsync, pg_stop_backup, pg_controldata, and then the postgres debug1 log produced from a subsequent startup attempt. pg_clog: http://pastebin.com/mTfdcjwH pg_subtrans: http://pastebin.com/qAXEHAQt Any workarounds would be very appreciated... would copying clog+subtrans before or after the rest of the data directory (or something like that) make any difference? Thanks!
Re: [HACKERS] pg_last_xact_insert_timestamp
Thanks for all the feedback guys. Just to throw another monkey wrench in here - I've been playing with Simon's proposed solution of returning 0 when the WAL positions match, and I've come to the realization that even if using pg_last_xact_insert_timestamp, although it would help, we still wouldn't be able to get a 100% accurate "how far behind?" counter... not that this is a big deal, but I know my ops team is going to bitch to me about it :). Take this situation: there's a lull of 30 seconds where there are no transactions committed on the server... the slave is totally caught up, WAL positions match, I'm reporting 0, everything is happy. Then a transaction is committed on the master... before the slave gets it, my query hits it and sees that we're 30 seconds behind (when in reality we're 1 sec behind). Because of this effect my graph is a little spiky... I mean, it's not a huge deal or anything - I can put some sanity checking in my number reporting (if 1 second ago you were 0 seconds behind, you can't be more than 1 second behind now, sorta thing). But if we wanted to go for the super-ideal solution, there would be a way to get the timestamp of pg_stat_replication.replay_location+1 (the first transaction that the slave does not have). On Thu, Sep 8, 2011 at 7:03 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Sep 8, 2011 at 6:14 AM, Fujii Masao masao.fu...@gmail.com wrote: OTOH, the new function enables users to monitor the delay as a timestamp. For users, a timestamp is obviously easier to handle than an LSN, and the delay as a timestamp is more intuitive. So, I think that it's worth adding something like pg_last_xact_insert_timestamp into core for improvement of user-friendliness. It seems very nice from a usability point of view, but I have to agree with Simon's concern about performance. Actually, as of today, WALInsertLock is such a gigantic bottleneck that I suspect the overhead of this additional bookkeeping would be completely unnoticeable.
But I'm still reluctant to add more centralized spinlocks that everyone has to fight over, having recently put a lot of effort into getting rid of some of the ones we've traditionally had. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
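Lacking pg_last_xact_insert_timestamp (still only a proposal in this thread), Simon's "report 0 when the WAL positions match" idea can be approximated on a 9.1 standby using functions that already exist; a sketch, run on the standby:

```sql
-- Report zero lag when everything received has been replayed;
-- otherwise report seconds since the last replayed commit.
-- Caveat from the message above still applies: during an idle lull,
-- the ELSE branch would overstate the lag for the first new commit.
SELECT CASE
         WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location()
           THEN 0
         ELSE EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())
       END AS lag_seconds;
```

Note this measures "received vs. replayed" rather than "inserted on the master vs. replayed", which is exactly the gap the proposed function is meant to close.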
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
On Mon, Jul 11, 2011 at 12:49 PM, David Johnston pol...@yahoo.com wrote: I do not see how recursive queries (really iteration of records) even enter the picture... I agree, FWIW. If the feature were that desirable, we could look at questions of implementation to make recursion either unnecessary or at least well managed. Right now I can emulate a hierarchical schema structure via a naming scheme - for example schemabase_sub1_sub2_etc. I am simply looking for a formal way to do the above AND also tell the system that I want all schemas under schemabase to be in the search path. Heck, I guess just allowing for simple pattern matching in search_path would be useful in this case, regardless of the presence of an actual schema hierarchy. Using LIKE syntax, say: SET search_path TO 'schemabase_sub1_%' or something similar. The only missing ability becomes a way for graphical tools to represent the schema hierarchy using a tree structure with multiple depths. Right. Semantically myapp_schemaname_subschemaname is no less hierarchical than myapp.schemaname.subschemaname. The larger issue is that of potential ambiguity wrt cross-database references (I don't have a lot of experience reading the SQL standards, but seeing how different db's implement cross-db references suggests that the standards contemplate semantic meaning to depth of the namespace). I can see how adding . and .. and relative paths would confuse the issue, but those are not necessary features of a multi-level schema depth. The above, combined with a different separator for intra-level namespace/schema delineation, would allow for an unambiguous way to define and use a hierarchical schema with seemingly minimal invasion into the current way of doing things. You could almost implement it just by requiring a specific character to act as the separator and then construct the actual schema using single-level literals and supporting functions that can convert them into a hierarchy.
In other words, the schema table would still only contain one field with the full parent!child, as opposed to (schema, parent) with (VALUES('parent',null),('child','parent')). In other words, if we use ! as the separator, any schema named parent!child could be stored and referenced as such, but then if you run a getChildren(parent) function it would return child along with any other schemas of the form parent!%. In this case the % sign could maybe only match everything except ! and the * symbol could be used to match ! as well. Agreed that this would be helpful. I would personally have a lot of use for this sort of feature, particularly with managing large numbers of stored procedures. Right now I am using a double underscore, which is error-prone. Best Wishes, Chris Travers
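The '!'-separator convention described above can already be emulated with plain LIKE over pg_namespace; getChildren() is a hypothetical helper (nothing like it exists in core), sketched here as a query. Note that in standard LIKE, % matches '!' too, so direct children have to be isolated by excluding deeper levels explicitly:

```sql
-- Sketch of getChildren('parent') under the parent!child naming convention
SELECT nspname
  FROM pg_namespace
 WHERE nspname LIKE 'parent!%'          -- all descendants of 'parent'
   AND nspname NOT LIKE 'parent!%!%';   -- exclude grandchildren and deeper
```

The proposal's alternative semantics (% stopping at '!', with * crossing levels) would fold the second condition into the pattern itself.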
Re: [HACKERS] Testing extension upgrade scripts
da...@kineticode.com (David E. Wheeler) writes: You should blog this. He just did, using the SMTP protocol... -- select 'cbbrowne' || '@' || 'acm.org'; http://linuxdatabases.info/info/postgresql.html Where do you want to Tell Microsoft To Go Today?
Re: [HACKERS] Re: Why our counters need to be time-based WAS: WIP: cross column correlation ...
j...@agliodbs.com (Josh Berkus) writes: I don't understand what you're talking about at all here. I think there are a lot of unsolved problems in monitoring, but the one thing I think everyone is pretty clear on is that the right way to export metrics like these is to export a counter and then have some external component periodically copy the counter into some history table and calculate the derivative, second derivative, running average of the first derivative, etc. You missed the original point of the discussion, which was to have stats we could use for auto-tuning internally. Not to export them. For example, there are optimizations we could make with the query planner if we knew which tables and indexes were hot in general. That's how we started this discussion, and it's not solved by storing the stats history on another server. There's value to both, and there's no dearth of monitoring frameworks that people keep on replacing with successors, so there's certainly room for both ;-). Recent stuff about such... https://lopsa.org/content/philosophy-monitoring https://labs.omniti.com/labs/reconnoiter I'm not quite sure what ought to be in PostgreSQL as a built-in; I suspect that what's eventually needed is to be able to correlate things across database instances, so that when Tom says, I need to know what data the planner's working on, the answer can be OK, got that... This data is surely useful to get out of the system, so I'd bias towards something sorta like what Greg suggests. And the closed-ended answer may prevent us from asking more sophisticated questions, also not a notably good thing... -- (reverse (concatenate 'string moc.liamg @ enworbbc)) If tautologies do not convey information, mathematicians would not be surprised by them. -- Mark Miller
Re: [HACKERS] CommitFest 2011-01 as of 2011-02-04
robertmh...@gmail.com (Robert Haas) writes: It does, but frankly I don't see much reason to change it, since it's been working pretty well on the whole. Andrew was on point when he mentioned that it's not obvious what committers get out of working on other people's patches. Obviously, the answer is, well, they get a better PostgreSQL, and that's ultimately good for all of us. But the trickiest part of this whole process is that, on the one hand, it's not fair for committers to ignore other people's patches, but on the other hand, it's not fair to expect committers to sacrifice getting their own projects done to get other people's projects done. I had two interesting germane comments in my RSS feed this morning, both entitled Please send a patch http://www.lucas-nussbaum.net/blog/?p=630 Where Lucas suggests that, when someone requests an enhancement, the retort Please send a patch mayn't be the best idea, because the one receiving the requests may be many times better at contributing such changes than the one making the request. http://hezmatt.org/~mpalmer/blog/general/please_send_a_patch.html On the other hand, Lucas, remember that each time you ask someone to take some time to implement your pet feature request, you take some time away from her that could be used to contribute something in an area where she gives a damn. These are *both* true statements, and, in order to grow the community that is capable of enhancing the system, there is merit to the careful application of both positions. There's stuff that Tom should do :-). And absent the general availability of cloning machines, we need to have people improving their skills so that there are more that are capable of submitting (and evaluating and committing) usable patches. -- wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','gmail.com'). http://linuxdatabases.info/info/languages.html Signs of a Klingon Programmer - 14. Our competitors are without honor! 
Re: [HACKERS] why two dashes in extension load files
t...@sss.pgh.pa.us (Tom Lane) writes: Peter Eisentraut pete...@gmx.net writes: On mån, 2011-02-14 at 10:13 -0500, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: Why do the extension load files need two dashes, like xml2--1.0.sql? Why isn't one enough? Because we'd have to forbid dashes in extension name and version strings. This was judged to be a less annoying solution. See yesterday's discussion. I'm not convinced. There was nothing in that discussion why any particular character would have to be allowed in a version number. Well, there's already a counterexample in the current contrib stuff: uuid-ossp. We could rename that to uuid_ossp of course, but it's not clear to me that there's consensus for forbidding dashes here. I suspect that _ might be troublesome. Let me observe on Debian policy... It requires that package names consist as follows: Package names (both source and binary, see Package, Section 5.6.7) must consist only of lower case letters (a-z), digits (0-9), plus (+) and minus (-) signs, and periods (.). They must be at least two characters long and must start with an alphanumeric character. http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Source I suspect that we'll need to have a policy analagous to that. Also worth observing: Debian package files are of the form: ${package}_${version}-${dversion}_${arch}.deb where package and version have fairly obvious interpretation, and... - dversion indicates a sequence handled by Debian - arch indicates CPU architecture (i386, amd64, ...) Probably the dversion/arch bits aren't of interest to us, but the remainder of the notation used by Debian seems not inapplicable for us. -- let name=cbbrowne and tld=gmail.com in String.concat @ [name;tld];; http://linuxdatabases.info/info/languages.html Signs of a Klingon Programmer - 4. You cannot really appreciate Dilbert unless you've read it in the original Klingon. 
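For comparison, the Debian name constraint quoted above translates directly into a regular expression; a quick illustrative check (PostgreSQL has no such validation today, and this is not proposed syntax):

```sql
-- Debian package-name policy: lowercase letters, digits, '+', '-', '.';
-- at least two characters; must start alphanumeric.
SELECT pkg, pkg ~ '^[a-z0-9][a-z0-9+.-]+$' AS debian_legal
  FROM (VALUES ('uuid-ossp'), ('uuid_ossp'), ('xml2')) AS t(pkg);
-- 'uuid-ossp' and 'xml2' pass; 'uuid_ossp' fails (underscore not allowed)
```

So an extension-naming policy modeled on Debian's would keep uuid-ossp legal while ruling out underscores, the opposite of the rename suggested above.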
Re: [HACKERS] postponing some large patches to 9.2
pg...@j-davis.com (Jeff Davis) writes:
> On Tue, 2011-02-08 at 15:10 -0500, Chris Browne wrote:
>> It's more than a bit sad...  The RangeType change has the massive
>> merit of enabling some substantial development changes, where we can
>> get rid of whole classes of comparison clauses, and hopefully whole
>> classes of range errors.  That was my favorite would-be feature for 9.1.
> I appreciate the support.  If you take the feature for a quick spin
> before the next commitfest, that would be a big help.  If I get it in
> the first commitfest of 9.2, that may mean some follow-up features,
> like RANGE KEYs/FKs, and maybe even RANGE JOIN, might have a chance
> for 9.2 as well.  Or maybe some other features might find it useful,
> like partitioning or audit logs.

I've found my wish item...  I wish that queries could expand ranges in much the same fashion that BETWEEN expands into two query nodes.  That way, you can use a range to pick data from a large table, and not revert to a Seq Scan+Filter, which is what I'm seeing for the following sort of query:

   select * from some_data
    where '[2010-01-01,2010-02-01)'::daterange @> whensit;
--
let name=cbbrowne and tld=gmail.com in String.concat "@" [name;tld];;
http://linuxfinances.info/info/lsf.html
Rules of the Evil Overlord #162. "If I steal something very important to the hero, I will not put it on public display."
[HACKERS] Range Types - efficiency
One of the things I'd particularly like to use range types for is to make it easier to construct range-related queries.  The classic example is that of reports that work on date ranges.

I create a table that will have transaction data:

   CREATE TABLE some_data (
      id serial,
      whensit date
      -- And it'll have other attributes, but those don't matter here...
   );
   CREATE INDEX some_when ON some_data USING btree (whensit);

I then populate it with a bunch of date-based data...

   rangetest@localhost-> select count(*), min(whensit), max(whensit) from some_data;
    count |    min     |    max
   -------+------------+------------
    37440 | 2007-01-01 | 2014-12-27
   (1 row)

Here's the traditional way of doing a range-based query on this data:

   rangetest@localhost-> explain analyze select * from some_data
                          where whensit >= '2010-01-01' and whensit < '2010-02-01';
                                                        QUERY PLAN
   ------------------------------------------------------------------------------------------------------------------------
    Bitmap Heap Scan on some_data  (cost=12.30..184.23 rows=395 width=8) (actual time=0.064..0.150 rows=390 loops=1)
      Recheck Cond: ((whensit >= '2010-01-01'::date) AND (whensit < '2010-02-01'::date))
      ->  Bitmap Index Scan on some_when  (cost=0.00..12.21 rows=395 width=0) (actual time=0.054..0.054 rows=390 loops=1)
            Index Cond: ((whensit >= '2010-01-01'::date) AND (whensit < '2010-02-01'::date))
    Total runtime: 0.197 ms
   (5 rows)

The RangeType-based equivalent is the following:

   rangetest@localhost-> explain analyze select * from some_data
                          where '[2010-01-01,2010-02-01)'::daterange @> whensit;
                                                  QUERY PLAN
   ------------------------------------------------------------------------------------------------------------
    Seq Scan on some_data  (cost=0.00..634.00 rows=1 width=8) (actual time=1.045..111.739 rows=390 loops=1)
      Filter: ('[ 2010-01-01, 2010-02-01 )'::daterange @> whensit)
    Total runtime: 111.780 ms
   (3 rows)

This, alas, reverts to a seq scan on the table, rather than restricting itself to the tuples of interest.

I realize that, after a fashion, I'm using this backwards.
But when I'm doing temporal stuff, that tends to be the pattern:
- There is a set of temporal configuration, indicating criteria that are true for particular date ranges
- There is then event data, which has but a single date, but which needs to be matched against the temporal configuration.

It sure would be nice to expand that filter into subqueries involving the two criteria, in much the same fashion that is true today for BETWEEN.  I imagine that would allow many queries with this kind of pattern to make use of indexes, making them visibly thousands of times faster.
--
"I have traveled the length and breadth of this country and talked with the best people, and can assure you that data processing is a fad that won't last out the year."  -- Business books editor, Prentice Hall 1957
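[The wished-for expansion can of course be written out by hand today.  A sketch, assuming the same some_data table as above and a build with range types installed; lower() and upper() just pull the bounds out of the range, so the planner sees two ordinary btree-indexable comparisons:]

```sql
-- The BETWEEN-like expansion done manually: extract the range bounds
-- so the existing btree index on whensit can be used.
SELECT *
  FROM some_data
 WHERE whensit >= lower('[2010-01-01,2010-02-01)'::daterange)
   AND whensit <  upper('[2010-01-01,2010-02-01)'::daterange);
```

Note that this hand expansion is only correct for a range with an inclusive lower and exclusive upper bound; handling arbitrary bound flavours (and empty/unbounded ranges) correctly is precisely what makes an automatic planner rewrite attractive.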
Re: [HACKERS] Range Types - efficiency
pg...@j-davis.com (Jeff Davis) writes:
> On Wed, 2011-02-09 at 16:20 -0500, Chris Browne wrote:
>> rangetest@localhost-> explain analyze select * from some_data
>>                        where '[2010-01-01,2010-02-01)'::daterange @> whensit;
>>  Seq Scan on some_data  (cost=0.00..634.00 rows=1 width=8) (actual time=1.045..111.739 rows=390 loops=1)
>>    Filter: ('[ 2010-01-01, 2010-02-01 )'::daterange @> whensit)
>>  Total runtime: 111.780 ms
>> (3 rows)
>>
>> This, alas, reverts to a seq scan on the table, rather than
>> restricting itself to the tuples of interest.  I realize that, after
>> a fashion, I'm using this backwards.  But when I'm doing temporal
>> stuff, that tends to be the pattern:
> Yes.  The index is a btree index on a normal column, so range types
> can't exactly help with that directly -- except maybe as a rewrite
> like you say.  One thing you might try is a functional index on
> (range(whensit)) and then do: where '...' @> range(whensit).  Does
> that work for you?

That doesn't appear to actually help:

   rangetest@localhost-> create index i2 on some_data (range(whensit));
   CREATE INDEX
   rangetest@localhost-> explain analyze select * from some_data
                          where '[2010-01-01,2010-02-01)'::daterange @> range(whensit);
                                                   QUERY PLAN
   ----------------------------------------------------------------------------------------------------------------
    Seq Scan on some_data  (cost=0.00..727.60 rows=12480 width=8) (actual time=1.030..110.542 rows=390 loops=1)
      Filter: ('[ 2010-01-01, 2010-02-01 )'::daterange @> range(whensit))
    Total runtime: 110.585 ms
   (3 rows)

In any case, I suggest that, as a couple-of-steps-down-the-road thing, it would be desirable to have that query rewrite.  Seems like a reasonable ToDo item to consider for the future, if not in the first deployment.  Maybe that's something to add in 9.2 CommitFest #3! :-)
--
"There isn't any reason why Linux can't be implemented as an enterprise computing solution.  Find out what you've been missing while you've been rebooting Windows NT." - Infoworld
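[One more avenue, offered only as an untested sketch: a plain functional index is still btree-backed, so it can't match @>.  If the patch's GiST opclass for ranges and a three-argument daterange() constructor are available (both exist in range types as eventually released), indexing a degenerate single-point range and phrasing the query as range-contains-range should be indexable:]

```sql
-- Guess at an indexable formulation, assuming daterange(date, date,
-- text) and a GiST opclass for ranges are present in the build:
CREATE INDEX some_when_range
    ON some_data
 USING gist (daterange(whensit, whensit, '[]'));

EXPLAIN ANALYZE
SELECT *
  FROM some_data
 WHERE '[2010-01-01,2010-02-01)'::daterange
       @> daterange(whensit, whensit, '[]');
```

Range-contains-range is an operation the GiST opclass is built to accelerate, so in principle this yields a bitmap index scan rather than the Seq Scan+Filter above; it doesn't remove the wish for an automatic rewrite, though.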
Re: [HACKERS] postponing some large patches to 9.2
sfr...@snowman.net (Stephen Frost) writes:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> - Range Types.  This is a large patch which was submitted for the
>> first time to the last CommitFest of the cycle, and the first version
>> that had no open TODO items was posted yesterday, three-quarters of
>> the way through that last CommitFest.  Some good review has been done.
>> While more is probably needed, I think we should feel good about
>> what's been accomplished and mark this one Returned with Feedback.
> I don't agree w/ punting Range Types.  Range Types were discussed as
> far back as the 2010 developer meeting, were discussed quite a bit
> again starting in October and throughout the fall, and Jeff has
> regularly been posting updates to it.  Given how thorough Jeff is, my
> feeling is that this patch is more than ready for beta.  My impression
> is also that it's not as invasive or destabilizing as the others, and
> while it wasn't being posted to the previous commitfests, it was
> clearly being worked on, updated, and improved.

I generally mirror those thoughts.  Range Types don't seem invasive or destabilizing, and the code base has been deployed for quite some time as an extension (not quite contrib).  It would be disappointing to drop this one when it is mighty close.

>> - synchronous replication.  Based on some looking at this today, I am
>> somewhat doubtful about the possibility of me or anyone else beating
>> this completely into shape in time for 9.1, unless we choose to
>> extend the deadline by several weeks.  Simon said that he would have
>> time to finish this in the next two weeks, but, as noted, the
>> CommitFest is scheduled to be over in ONE week, and it looks to me
>> like this is still pretty rough.  However, there's a lot of good
>> stuff in here, and I think it might be practical to get some portion
>> of it committed even if we can't agree on all of it.  I recommend we
>> give this one a little more time to shake out before giving up on it.
> It really would be nice to have this, but I agree that it's pretty
> late in the game for it to be in the state it appears to be in. :/  It
> also seems to have been stalled for the past couple of months, which
> doesn't bode well for it, in my view.

The stall troubles me, and doesn't bode terribly well for 9.1.
--
Rules of the Evil Overlord #39. "If I absolutely must ride into battle, I will certainly not ride at the forefront of my Legions of Terror, nor will I seek out my opposite number among his army."
http://www.eviloverlord.com/
Re: [HACKERS] postponing some large patches to 9.2
pg...@j-davis.com (Jeff Davis) writes:
> On Tue, 2011-02-08 at 06:57 -0500, Stephen Frost wrote:
>> * Robert Haas (robertmh...@gmail.com) wrote:
>>> - Range Types.  This is a large patch which was submitted for the
>>> first time to the last CommitFest of the cycle, and the first
>>> version that had no open TODO items was posted yesterday,
>>> three-quarters of the way through that last CommitFest.  Some good
>>> review has been done.  While more is probably needed, I think we
>>> should feel good about what's been accomplished and mark this one
>>> Returned with Feedback.
>> I don't agree w/ punting Range Types.  Range Types were discussed as
>> far back as the 2010 developer meeting, were discussed quite a bit
>> again starting in October and throughout the fall, and Jeff has
>> regularly been posting updates to it.  Given how thorough Jeff is, my
>> feeling is that this patch is more than ready for beta.
> I appreciate the sentiment, but in addition to some cleanup, any patch
> like this at least requires some discussion.  It's a language change
> we'll be supporting for a long time.  At minimum, we're a couple
> hundred emails shy of a real consensus on the naming ;)

It's more than a bit sad...  The RangeType change has the massive merit of enabling some substantial development changes, where we can get rid of whole classes of comparison clauses, and hopefully whole classes of range errors.  That was my favorite would-be feature for 9.1.

It'll take some time to get code changes into systems to use this; the sooner the feature's in a deployable version of Postgres, the earlier that kind of thing may start.
--
(format nil "~S@~S" "cbbrowne" "gmail.com")
http://www3.sympatico.ca/cbbrowne/internet.html
"Colorless green ideas sleep furiously." -- Noam Chomsky
Re: [HACKERS] OpenVMS - an effort which needs guidance and support.
peder...@ccsscorp.com (Bill Pedersen) writes:
> I look forward to hearing from people in the PostgreSQL community as
> well as from others interested in this effort.

To a number of us it's academically interesting; though, as we don't have VMS systems, it's not likely to be super-easy to assist in the matter.  It certainly would be interesting to see how easy or difficult the port would be.  I suspect that's a more interesting port than, say, Digital UNIX, these days.
--
(reverse (concatenate 'string "ofni.sesabatadxunil" "@" "enworbbc"))
http://linuxfinances.info/info/slony.html
"On the other hand, you have different fingers."
Re: [HACKERS] Sync Rep for 2011CF1
dp...@pgadmin.org (Dave Page) writes:
> On Mon, Feb 7, 2011 at 6:55 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> On Mon, Feb 7, 2011 at 12:43 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> Robert Haas <robertmh...@gmail.com> writes:
>>>> ... Well, the current CommitFest ends in one week, ...
>>> Really?  I thought the idea for the last CF of a development cycle
>>> was that it kept going till we'd dealt with everything.  Arbitrarily
>>> rejecting stuff we haven't dealt with doesn't seem fair.
>> Uh, we did that with 8.4 and it was a disaster.  The CommitFest
>> lasted *five months*.  We've been doing schedule-based CommitFests
>> ever since and it's worked much better.
> Rejecting stuff because we haven't gotten round to dealing with it in
> such a short period of time is a damn good way to limit the number of
> contributions we get.  I don't believe we've agreed at any point that
> the last commitfest should be the same length as the others (when we
> originally came up with the commitfest idea, it certainly wasn't
> expected), and deciding that without giving people advance notice is a
> really good way to piss them off and encourage them to go work on
> other things.  If we're going to put a time limit on this - and I
> think we should - we should publish a date ASAP that gives everyone a
> fair chance to finish their work - say, 4 weeks.  Then, if we want to
> make the last commitfest the same length as the others next year, we
> can make that decision and document those plans.

There *is* a problem in that there doesn't seem to be enough time to readily allow development of larger features without people getting stuck fighting with the release periods.  But that's not the problem taking place here.  It was documented, last May, that the final CommitFest for 9.1 was to complete 2011-02-15, and there did seem to be agreement on that.
It sure looks to me like there are going to be a bunch of items that, based on the recognized policies, need to get deferred to 9.2, and the prospects for Sync Rep getting into 9.1 don't look notably good to me.

Looking at things statistically, the 9.1 commitfests have had the following numbers of items:
- #1 - 2010-09 - 52 items, of which 26 were committed
- #2 - 2010-11 - 43 items, of which 23 were committed
- #3 - 2011-01 - 98 items, of which 35 have been committed and 10 are considered ready to commit

It may appear unfair to not offer everyone a fair chance to finish their work, but it's not as if the date wasn't published Plenty Long Ago and well-publicized.  And deferring the end of the CommitFest would be Not Fair to those that *did* get their proposed changes ready for the preceding fests.  We cannot evade unfairness.

It's definitely readily arguable that fairness requires that:
- Items not committable by 2011-02-15 be deferred to the 2011-Next fest.  There are around 25 items right now that are sitting with [Waiting for Author] and [Returned with Feedback] statuses.  They largely seem like pretty fair game for the next fest.
- Large items that weren't included in the 2010-11 fest be considered problematic to try to integrate into 9.1.  There sure seem to be some large items in the 2011-01 fest, which I thought wasn't supposed to be the case.

We shouldn't just impose policy for the sake of imposing policy, but I do recall Really Long CommitFests being pretty disastrous.  And there's *SO* much outstanding in this particular fest that it's getting past time for doing some substantial triage, so that reviewer attention may be directed towards the items most likely to be acceptable for 9.1.  I hate to think that 9.1 won't include Simon's SR material, but that may have to be.
--
http://www3.sympatico.ca/cbbrowne/slony.html
"It's a pretty rare beginner who isn't clueless.  If beginners weren't clueless, the infamous Unix learning cliff wouldn't be a problem."
-- david parsons