Re: [HACKERS] Removing Inner Joins

2013-07-09 Thread Antonin Houska

On 07/10/2013 07:28 AM, Atri Sharma wrote:
> I am not sure if the part you mentioned is in line with the case I am
> talking about.

Can you please post an example of such a join removal? I mean a query
before and after the removal. Thanks,


//Tony


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench patches

2013-07-09 Thread Fabien COELHO


Hello Tatsuo,


> I have looked into this:
> https://commitfest.postgresql.org/action/patch_view?id=1105
> because it's marked as "Ready for committer". However I noticed that
> you worried about other pgbench patches such as
> https://commitfest.postgresql.org/action/patch_view?id=1103 .


>> So I would like to know whether the throttling patch is committed and
>> then update the progress patch to take that into account.


> Shall I wait until your pgbench --throttle patch becomes ready for committer?


No. I'll submit another patch to the next commitfest to improve the 
progress behavior under throttling, if & when both initial patches are 
committed.
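
For context, a combined invocation might eventually look like the following (a hypothetical sketch; it assumes the throttle patch lands as --rate/-R and the progress patch as --progress/-P, and that a "bench" database has been initialized with pgbench -i):

```shell
# Run 8 clients on 2 threads for 60 s, throttled to ~100 transactions/s
# overall, printing a progress report every 5 s.
pgbench -c 8 -j 2 -T 60 -R 100 -P 5 bench
```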


--
Fabien.




Re: [HACKERS] Removing Inner Joins

2013-07-09 Thread Atri Sharma
On Wed, Jul 10, 2013 at 8:29 AM, Ashutosh Bapat
 wrote:
> AFAIK,
> There is code to remove the redundant relations (joins too are relations).
> But I don't remember exactly where it is. Start looking at query_planner().
> The removal of relations should happen before actually planning the joins.
>
>

Thanks for your reply.

I am not sure if the part you mentioned is in line with the case I am
talking about. The specific case I mentioned does not seem to have
been implemented yet, and I was researching the roadblocks to it,
hence asked on the list.
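
The kind of inner-join removal being discussed could look like this (a hypothetical sketch with invented table names; the planner does not do this today):

```sql
-- Assume orders.customer_id is NOT NULL and has a foreign key to
-- customers(id), which is unique. Every orders row then joins to
-- exactly one customers row, so if nothing is read from customers:
SELECT o.id, o.total
FROM orders o
JOIN customers c ON c.id = o.customer_id;

-- ...the join could in principle be removed, planning it as just:
SELECT o.id, o.total
FROM orders o;
```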

Thanks and Regards,

Atri

--
Regards,

Atri
l'apprenant




Re: [HACKERS] [REVIEW] row level security (v3)

2013-07-09 Thread Atri Sharma
On Wed, Jul 10, 2013 at 12:58 AM, Mike Blackwell  wrote:
> The most recent patch (v3) applies and builds cleanly and passes make check.
> Documentation on the new SQL syntax and catalog changes is included with the
> patch and looks good to me.
>
> The regression tests look pretty complete.  In addition to the included
> tests, dropping and altering the data type on a column referenced in the
> security clause work as expected, rejecting the change with a dependency
> error.  Renaming a column succeeds as expected.
>
> pg_dump and restore were also successful.
>
> I noticed that the security clause is visible to any user via psql \dt+, as
> well as in the pg_rowsecurity view.  Perhaps this should be mentioned in the
> relevant section of user-manag.sgml so users realize any sensitive
> information in the security clause isn't secure.
>
> What I've checked looks good.  I don't feel qualified to do a code review so
> that's still outstanding.  I believe Atri will be looking at that.
>

Hi All,

In my effort to support Mike's excellent work, I will be helping out
with the review of Kaigai-san's patch.

I have been thinking of some tests that could be edge cases for the
patch, and will discuss them with Mike this week and test.

Also, at first look, the patch looks good. I will give it a more
thorough review this week and send my comments to the best of my
ability.

Regards,

Atri

--
Regards,

Atri
l'apprenant




Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2013-07-09 Thread Amit Kapila

> On Wednesday, July 10, 2013 6:32 AM Mike Blackwell wrote:

> The only environment I have available at the moment is a virtual box.  That's 
> probably not going to be very helpful for performance testing.

It's okay. Anyway, thanks for doing the basic testing of the patch.

With Regards,
Amit Kapila.





Re: [HACKERS] Removing Inner Joins

2013-07-09 Thread Ashutosh Bapat
AFAIK,
There is code to remove the redundant relations (joins too are relations).
But I don't remember exactly where it is. Start looking at query_planner().
The removal of relations should happen before actually planning the joins.


On Tue, Jul 9, 2013 at 12:21 PM, Atri Sharma  wrote:

> My main point here is researching how difficult it is to add
> functionality in the planner to allow it to to detect and make
> decisions on foreign keys present in the outer relation.
>
> I think that if this is added, the rest of the work would be much easier.
> I may be completely wrong, though.
>
> Thoughts/Comments?
>
> Regards,
>
> Atri
>
> Regards,
>
> Atri
> l'apprenant
>
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


[HACKERS] pgbench patches

2013-07-09 Thread Tatsuo Ishii
Hi Fabien,

I have looked into this:
https://commitfest.postgresql.org/action/patch_view?id=1105
because it's marked as "Ready for committer". However I noticed that
you worried about other pgbench patches such as
https://commitfest.postgresql.org/action/patch_view?id=1103 .

> So I would like to know whether the throttling patch is committed and
> then update the progress patch to take that into account.

Shall I wait until your pgbench --throttle patch becomes ready for committer?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] Add /coverage/ to .gitignore

2013-07-09 Thread Peter Eisentraut
On Mon, 2013-07-01 at 20:16 -0300, Fabrízio de Royes Mello wrote:
> Hi all,
> 
> When we run...
> 
> ./configure --enable-coverage
> make coverage-html
> 
> ...the output is generated into the /coverage/ directory. The attached patch
> adds /coverage/ to .gitignore.

I have committed this.  It should be coverage/, however, because it can
also exist in subdirectories.





Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2013-07-09 Thread Mike Blackwell
The only environment I have available at the moment is a virtual box.
 That's probably not going to be very helpful for performance testing.

__
*Mike Blackwell | Technical Analyst, Distribution Services/Rollout
Management | RR Donnelley*
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
mike.blackw...@rrd.com
http://www.rrdonnelley.com





On Mon, Jul 8, 2013 at 11:09 PM, Amit Kapila  wrote:

>
> On Tuesday, July 09, 2013 2:52 AM Mike Blackwell wrote:
>
> > I can't comment on further direction for the patch, but since it was
> marked as Needs Review in the CF app I took a quick look at it.
>   Thanks for looking into it.
>
>   Last time, Heikki found test scenarios where the original patch was
> not performing well.
>   He also proposed a different approach for WAL encoding and sent a
> modified patch that has a comparatively smaller negative performance
> impact, and asked us to check whether it reduces the impact for the
> scenarios he mentioned.
>   After that I found that, with some modifications (using the new tuple
> data for encoding) in his approach, it eliminates the negative
> performance impact and achieves WAL reduction in more cases.
>
>   I think the first thing to verify is whether the results posted can be
> validated in some other environment setup by another person.
>   The testcase used is posted at below link:
>   http://www.postgresql.org/message-id/51366323.8070...@vmware.com
>
>
>
> > It patches and compiles clean against the current Git HEAD, and 'make
> check' runs successfully.
>
> > Does it need documentation for the GUC variable
> 'wal_update_compression_ratio'?
>
>   This variable was added to test the patch with different compression
> ratios during development.
>   It was not decided that this variable would remain a permanent part of
> the patch, so there is currently no documentation for it.
>   However, if the decision is that it should stay, the documentation
> will be updated accordingly.
>
> With Regards,
> Amit Kapila.
>
>
>
>
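
To make the delta idea concrete, here is a toy Python sketch (an illustration only, not the patch's actual WAL encoding scheme) of encoding a new row image against its old version so that only the changed middle bytes would need to be logged:

```python
def encode_delta(old: bytes, new: bytes):
    """Return (prefix_len, suffix_len, changed_middle) for the new image."""
    limit = min(len(old), len(new))
    # Length of the longest common prefix.
    p = 0
    while p < limit and old[p] == new[p]:
        p += 1
    # Length of the longest common suffix that does not overlap the prefix.
    s = 0
    while s < limit - p and old[-1 - s] == new[-1 - s]:
        s += 1
    return p, s, new[p:len(new) - s]


def decode_delta(old: bytes, delta) -> bytes:
    """Rebuild the new image from the old image plus the delta."""
    p, s, middle = delta
    return old[:p] + middle + (old[len(old) - s:] if s else b"")


old = b"name=alice;balance=100;city=NYC"
new = b"name=alice;balance=175;city=NYC"
delta = encode_delta(old, new)
assert decode_delta(old, delta) == new
# Only 2 changed bytes are stored instead of the full 31-byte row image.
```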


[HACKERS] docbook-xsl version for release builds

2013-07-09 Thread Peter Eisentraut
I would like to start using a newer version of docbook-xsl for the
release builds.  This is currently used for building the man pages.  The
latest release is 1.78.1 and fixes a few formatting errors.

How do we do that?

We could just take the latest Debian package and stick it on borka.  I
don't have a problem with this also targeting maintenance releases, but
maybe there are other ideas.






Re: [HACKERS] Changing recovery.conf parameters into GUCs

2013-07-09 Thread Josh Berkus
On 07/08/2013 11:43 PM, Simon Riggs wrote:
> This needs to be broken down rather than just say "I like Greg's
> proposal", or I have written a patch. Writing the patch is not the/an
> issue.
> 
> Greg's points were these (I have numbered them and named/characterised them)

Thanks for the nice summary!  Makes it easy for the rest of us to address.

> 
> 1. MOVE SETTINGS
> All settings move into the postgresql.conf.
> 
> Comment: AFAIK, all agree this.

Good to go then.

> 2. RELOCATE RECOVERY PARAMETER FILE(s)
> As of 9.2, relocating the postgresql.conf means there are no user
> writable files needed in the data directory.
> 
> Comment: AFAIK, all except Heikki wanted this. He has very strong
> objections to my commit that would have allowed relocating
> recovery.conf outside of the data directory. By which he means both
> the concepts of triggering replication and of specifying parameters.
> Changes in 9.3 specifically write files to the data directory that
> expect this.

Yeah, I didn't understand why this was shot down either.

Heikki?

> 3. SEPARATE TRIGGER FILE
> Creating a standby.enabled file in the directory that houses the
> postgresql.conf (same logic as "include" uses to locate things) puts the
> system into recovery mode.  That feature needs to save some state, and
> making those decisions based on existence of a file is already a thing
> we do.  Rather than emulating the rename to recovery.done that happens
> now, the server can just delete it, to keep from incorrectly returning
> to a state it's exited.  A UI along the lines of the promote one,
> allowing "pg_ctl standby", should fall out of here.  I think this is
> enough that people who just want to use replication features need never
> hear about this file at all, at least during the important to simplify
> first run through.
> 
> Comment: AFAIK, all except Heikki are OK with this.

One bit of complexity I'd like to see go away is that we have two
trigger files: one to put a server into replication, and one to promote
it.  The promotion trigger file is a legacy of warm standby, I believe.
 Maybe, now that we have pg_ctl promote available, we can eliminate the
promotion trigger?

Also, previously, deleting the recovery.conf file did not cause the
server to be promoted AFAIK.  Is that something we should change if
we're going to keep a trigger file to start replication?

Also, I'm not keen on the idea that the start-replication trigger file
will still be *required*.  I really want to be able to manage my setup
entirely through configuration/pg_ctl directives and not be forced to
use a trigger file.
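
As a sketch of what the proposal would mean in practice (parameter names taken from the existing recovery.conf; the standby.enabled trigger file is the proposed mechanism, not an existing one), a standby might be configured entirely in postgresql.conf:

```
# postgresql.conf -- former recovery.conf parameters as ordinary GUCs
primary_conninfo = 'host=master port=5432 user=replicator'
restore_command = 'cp /archive/%f %p'
recovery_target_timeline = 'latest'
```

with recovery mode triggered by creating standby.enabled next to postgresql.conf, or, per the proposal, via "pg_ctl standby".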

> 4. DISALLOW PREVIOUS API
> If startup finds a recovery.conf file where it used to live at,
> abort--someone is expecting the old behavior.  Hint to RTFM or include a
> short migration guide right on the spot.  That can have a nice section
> about how you might use the various postgresql.conf include* features if
> they want to continue managing those files separately.  Example: rename
> it as replication.conf and use include_if_exists if you want to be able
> to rename it to recovery.done like before.  Or drop it into a conf.d
> directory where the rename will make it then skipped.
> 
> Comment: I am against this. Tool vendors are not the problem here.
> There is no good reason to just break everybody's scripts with no
> warning of future deprecation and an alternate API, especially since
> we now allow multiple parameter files.

Well, the issue is not so much the presence of a recovery.conf file full
of config variables ... which as you point out is now effectively
supported ... but the use of that as a trigger file.   So I think the
two points you make here are flipped.

Personally, I don't care if we support the old recovery.conf-trigger
behavior as long as I'm not forced to use it.  The main objection to
supporting both was code complexity, I believe.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Differences in WHERE clause of SELECT

2013-07-09 Thread Josh Berkus
Prabakaran,


> I am a newbie to PostgreSQL and was wondering about the following
> behaviour.

pgsql-hackers is not the appropriate list for this kind of question.  In
the future, please post to pgsql-novice, pgsql-sql, or pgsql-general
with this kind of question.  Thanks.

> Can you please help me understand why 'LIKE' does not use implicit cast
> ? 

LIKE uses the operator class "text_pattern_ops", which doesn't include an
implicit cast.  For one thing, the implicit cast is from text -->
integer, not the other way around, and there is no LIKE operator for
integers.
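
A quick illustration of working around this with an explicit cast, so both operands are text:

```sql
SELECT true WHERE 1::text LIKE '1';          -- returns true
SELECT true WHERE CAST(1 AS text) LIKE '1';  -- standard-SQL spelling
```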

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




[HACKERS] Differences in WHERE clause of SELECT

2013-07-09 Thread Prabakaran, Vaishnavi
Hi,

I am a newbie to PostgreSQL and was wondering about the following
behaviour:

SELECT true WHERE 1 = '1';                 <-- Returns true
SELECT true WHERE 1 BETWEEN '0' AND '2';   <-- Returns true
SELECT true WHERE 1 IS DISTINCT FROM '2';  <-- Returns true
SELECT true WHERE 1 LIKE '1';              <-- 'operator does not exist' error
SELECT true WHERE '1' LIKE 1;              <-- 'operator does not exist' error

The first three queries work because of the implicit cast, whereas
'LIKE' does not use an implicit cast.

Can you please help me understand why 'LIKE' does not use an implicit cast?

Best Regards,

Vaishnavi

 



Re: [HACKERS] Review: extension template

2013-07-09 Thread Peter Eisentraut
On 7/8/13 4:20 AM, Dimitri Fontaine wrote:
> Let me stress that the most important value in that behavior is to be
> able to pg_restore using a newer version of the extension, the one that
> works with the target major version. When upgrading from 9.2 to 9.3 if
> you depend on keywords that now are reserved you need to install the
> newer version of the extension at pg_restore time.

I think there is an intrinsic conflict here.  You have things inside the
database and outside.  When they depend on each other, it gets tricky.
Extensions were invented to cope with that.  They do the job, more or
less.  Now you want to take the same mechanism and apply it entirely
inside the database.  But that wasn't the point of extensions!  That's
how you get definitional issues like, should extensions be dumped or not.

I don't believe the above use case.  (Even if I did, it's marginal.)
You should always be able to arrange things so that an upgrade of an
inside-the-database-package is possible before or after pg_restore.
pg_dump and pg_restore are interfaces between the database and the
outside.  They should have nothing to do with upgrading things that live
entirely inside the database.

There would be value to inside-the-database package management, but it
should be a separate concept.




Re: [HACKERS] robots.txt on git.postgresql.org

2013-07-09 Thread Dimitri Fontaine
Magnus Hagander  writes:
> Oh, and we need stable wheezy packages for them, or we'll be paying
> even more in maintenance. AFAICT, there aren't any for cgit, but maybe
> I'm searching for the wrong thing..

Seems to be a loser on that front too.
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Review: extension template

2013-07-09 Thread Markus Wanner
Salut Dimitri,

On 07/09/2013 12:40 PM, Dimitri Fontaine wrote:
> Markus Wanner  writes:
>> Or how do you think would pg_restore fail, if you followed the mental
>> model of the template?
> 
>   # create template for extension foo version 'x' as '';
>   # create extension foo;
>   # alter template for extension foo rename to bar;
> 
>   $ pg_dump | psql
> 
> And now it's impossible to CREATE EXTENSION foo, because there's no
> source to install it from available. I think we should actively prevent
> that scenario to happen in the field (current patch prevents it).

I see. You value that property a lot.

However, I don't think the plain ability to create an extension is quite
enough to ensure a consistent restore, though. You'd also have to ensure
you recreate the very *same* contents of the extension that you dumped.

Your patch doesn't do that. It seems to stop enforcing consistency at
some arbitrary point in between. For example, you can ALTER a function
that's part of the extension. Or ALTER TEMPLATE FOR EXTENSION while an
extension of that version is installed.

> Usually what you do when you want to change an extension is that you
> provide an upgrade script then run it with ALTER EXTENSION UPDATE.

As a developer, I often happen to work on one and the same version, but
test multiple modifications. This ability to change an extension
"on-the-fly" seems pretty valuable to me.

> The current file system based extensions allow you to maintain
> separately the files foo--1.0.sql and foo--1.2.sql, and you don't need
> to copy a current version of the whole extension away before hacking the
> new version.
>
> The Extension Template facility in the current patch allows you to do
> much the same

Sure. I'm aware of that ability and appreciate it.

> I see that you've spent extra time and effort to better understand any
> community consensus that might exist around this patch series, and I
> want to say thank you for that!

Thank you for your patience in pursuing this and improving user
experience with extensions. I really appreciate this work.

> Basically what I'm saying in this too long email is that I need other
> contributors to voice-in.

I fully agree. I don't want to hold this patch back any further, if it's
just me.

>> I really recommend to rename the feature (and the
>> commands), in that case, though. We may rather call the existing
>> file-system thingie an "extension template", instead, as it becomes a
>> good differentiator to what you're proposing.
> 
> Any proposals?

"Inline extension" is a bit contradictory. Maybe "managed extension"
describes best what you're aiming for.

In a similar vein, "out-of-line extension" sounds redundant to me, so
I'd rather characterize the file-system thingie as "extension templates"
wherever a clear distinction between the two is needed.

>> How about ALTER EXTENSION ADD (or DROP)? With the link on the
>> "template", you'd have to prohibit that ALTER as well, based on the
>> exact same grounds as the RENAME: The installed extension would
>> otherwise differ from the "template" it is linked to.
> 
> You're "supposed" to be using that from within the template scripts
> themselves. The main use case is the upgrade scripts from "unpackaged".

Agreed. The documentation says it's "mainly useful in extension update
scripts".

> I could see foreclosing that danger by enforcing that creating_extension
> is true in those commands.

For managed extensions only, right? It's not been prohibited for
extensions so far.

>> See how this creates an animal pretty different from the current
>> extensions? And IMO something that's needlessly restricted in many ways.
> 
> Well really I'm not convinced. If you use ALTER EXTENSION ADD against an
> extension that you did install from the file system, then you don't know
> what will happen after a dump and restore cycle, because you didn't
> alter the files to match what you did, presumably.

Yeah. The user learned to work according to the template model. Maybe
that was not the best model to start with. And I certainly understand
your desire to ensure a consistent dump & restore cycle. However...

> If you do the same thing against an extension that you did install from
> a catalog template, you just managed to open yourself to the same
> hazards… 

... I think the current patch basically just moves the potential
hazards. Maybe these moved hazards are less dramatic and justify the
move. Let's recap:

In either case (template model & link model) the patch:

 a) guarantees to restore the template scripts and settings of
all managed extensions

With the link model (and v9 of your patch, which includes the RENAME
fix, and pretending there are no other bugs):

 b) it guarantees *some* revision of an extension version (identified
by name and version) that has been created at dump time can be
restored from a template from the dump

If you'd additionally restrict ALTER EXTENSION ADD/DROP as well as ALTER
TEMPLATE FOR EXTENSION AS .

Re: [HACKERS] Millisecond-precision connect_timeout for libpq

2013-07-09 Thread Merlin Moncure
On Fri, Jul 5, 2013 at 3:01 PM, Josh Berkus  wrote:
> On 07/05/2013 12:26 PM, Tom Lane wrote:
>> ivan babrou  writes:
>>> If you can figure out that postgresql is overloaded then you may
>>> decide what to do faster. In our app we have very strict limit for
>>> connect time to mysql, redis and other services, but postgresql has
>>> minimum of 2 seconds. When processing time for request is under 100ms
>>> on average sub-second timeouts matter.
>>
>> If you are issuing a fresh connection for each sub-100ms query, you're
>> doing it wrong anyway ...
>
> It's fairly common with certain kinds of apps, including Rails and PHP.
>  This is one of the reasons why we've discussed having a kind of
> stripped-down version of pgbouncer built into Postgres as a connection
> manager.  If it weren't valuable to be able to relocate pgbouncer to
> other hosts, I'd still say that was a good idea.

for the record, I think this is a great idea.

merlin




Re: [HACKERS] Review: extension template

2013-07-09 Thread Dimitri Fontaine
Markus Wanner  writes:
> First of all, I figured that creation of a template of a newer version
> is prohibited in case an extension exists:

Ooops, that's a bug I need to fix.

> I then came to think of the upgrade scripts... what do we link against
> if an extension has been created from some full version and then one or
> more upgrade scripts got applied?
>
> Currently, creation of additional upgrade scripts are not blocked. Which
> is a good thing, IMO. I don't like the block on newer full versions.
> However, the upgrade doesn't seem to change the dependency, so you can
> still delete the update script after the update. Consider this:

We should allow both cases and move the dependency when a script is
installed, and also maintain dependencies in between an upgrade script
and its previous element in the link, either another upgrade script or
the default_full_version you can start with… actually, both.

> I.e. the TO version should probably have a dependency on the FROM
> version (that might even be useful in the template model).

Agreed.

> Another thing that surprised me is the inability to have an upgrade
> script *and* a full version (for the same extension target version).
> Even if that's intended behavior, the error message could be improved:

Will fix too. Thanks for your extended testing!

Josh, if running out of time on this CF, feel free to mark this one as
"returned with feedback": I will still try to make it under the current
commit fest, but with CHAR(13) starting soon and the current state of
the patch it's clearly reasonable to say it's not ready yet.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] robots.txt on git.postgresql.org

2013-07-09 Thread Tom Lane
Magnus Hagander  writes:
> On Tue, Jul 9, 2013 at 5:56 PM, Dimitri Fontaine  
> wrote:
>> What's blocking alternatives from being considered? I already did mention
>> cgit, which has the advantage to clearly show the latest patch on all
>> the active branches in its default view, which would match our branch
>> usage pretty well I think.

> ...
> If they do all those things, and people do like those interfaces, then
> sure, we can do that.

cgit is what Red Hat is using, and I have to say I don't like it much.
I find gitweb much more pleasant overall.  There are a few nice things
in cgit but lots of things that are worse.

regards, tom lane




[HACKERS] [REVIEW] row level security (v3)

2013-07-09 Thread Mike Blackwell
The most recent patch (v3) applies and builds cleanly and passes make
check.  Documentation on the new SQL syntax and catalog changes is included
with the patch and looks good to me.

The regression tests look pretty complete.  In addition to the included
tests, dropping and altering the data type on a column referenced in the
security clause work as expected, rejecting the change with a dependency
error.  Renaming a column succeeds as expected.

pg_dump and restore were also successful.

I noticed that the security clause is visible to any user via psql \dt+, as
well as in the pg_rowsecurity view.  Perhaps this should be mentioned in
the relevant section of user-manag.sgml so users realize any sensitive
information in the security clause isn't secure.

What I've checked looks good.  I don't feel qualified to do a code review
so that's still outstanding.  I believe Atri will be looking at that.


__

*Mike Blackwell | Technical Analyst, Distribution Services/Rollout
Management | RR Donnelley*
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
mike.blackw...@rrd.com
http://www.rrdonnelley.com


 


Re: [HACKERS] robots.txt on git.postgresql.org

2013-07-09 Thread Magnus Hagander
On Tue, Jul 9, 2013 at 5:56 PM, Dimitri Fontaine  wrote:
> Andres Freund  writes:
>> Gitweb is horribly slow. I don't think anybody with a bigger git repo
>> using gitweb can afford to let all the crawlers go through it.
>
> What's blocking alternatives from being considered? I already did mention
> cgit, which has the advantage to clearly show the latest patch on all
> the active branches in its default view, which would match our branch
> usage pretty well I think.

Time and testing.

For one thing, we need something that works with the fact that we have
multiple repositories on that same box. It may well be that these do,
but it needs to be verified. And to be able to give an overview. And to
be able to selectively hide some repositories. Etc.

Oh, and we need stable wheezy packages for them, or we'll be paying
even more in maintenance. AFAICT, there aren't any for cgit, but maybe
I'm searching for the wrong thing..

If they do all those things, and people do like those interfaces, then
sure, we can do that.


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] Review: extension template

2013-07-09 Thread Markus Wanner
Dimitri,

leaving the template vs link model aside for a moment, here are some
other issues I run into. All under the assumption that we want the link
model.

On 07/08/2013 11:49 AM, Dimitri Fontaine wrote:
> Please find attached to this mail version 9 of the Extension Templates
> patch with fixes for the review up to now.

First of all, I figured that creation of a template of a newer version
is prohibited in case an extension exists:

> db1=# CREATE TEMPLATE FOR EXTENSION foo VERSION '0.0' AS $foo$ SELECT 1; 
> $foo$;
> CREATE TEMPLATE FOR EXTENSION
> db1=# CREATE EXTENSION foo;
> CREATE EXTENSION
> db1=# CREATE TEMPLATE FOR EXTENSION foo VERSION '0.1' AS $foo$ SELECT 2; 
> $foo$;
> ERROR:  extension "foo" already exists

Why is that?

I then came to think of the upgrade scripts... what do we link against
if an extension has been created from some full version and then one or
more upgrade scripts got applied?

Currently, creation of additional upgrade scripts are not blocked. Which
is a good thing, IMO. I don't like the block on newer full versions.
However, the upgrade doesn't seem to change the dependency, so you can
still delete the update script after the update. Consider this:

> db1=# CREATE TEMPLATE FOR EXTENSION foo VERSION '0.0' AS $$ $$;
> CREATE TEMPLATE FOR EXTENSION
> db1=# CREATE EXTENSION foo;
> CREATE EXTENSION
> db1=# CREATE TEMPLATE FOR EXTENSION foo FROM '0.0' TO '0.1' AS $$ $$;
> CREATE TEMPLATE FOR EXTENSION
> db1=# ALTER EXTENSION foo UPDATE TO '0.1';
> ALTER EXTENSION
> db1=# SELECT * FROM pg_extension;
>  extname | extowner | extnamespace | extrelocatable | extversion | extconfig 
> | extcondition 
> -+--+--+++---+--
>  plpgsql |   10 |   11 | f  | 1.0|   
> | 
>  foo |   10 | 2200 | f  | 0.1|   
> | 
> (2 rows)
> 
> db1=# DROP TEMPLATE FOR EXTENSION foo FROM '0.0' TO '0.1';
> DROP TEMPLATE FOR EXTENSION
>> db1=# SELECT * FROM pg_extension;
>>  extname | extowner | extnamespace | extrelocatable | extversion | extconfig 
>> | extcondition 
>> -+--+--+++---+--
>>  plpgsql |   10 |   11 | f  | 1.0|   
>> | 
>>  foo |   10 | 2200 | f  | 0.1|   
>> | 
>> (2 rows)
>> 

In this state, extension foo as of version '0.1' is installed, but
running this through dump & restore, you'll only get back '0.0'.

Interestingly, the following "works" (in the sense that the DROP of the
upgrade script is prohibited):

> db1=# CREATE TEMPLATE FOR EXTENSION foo VERSION '0.0' AS $$ $$;
> CREATE TEMPLATE FOR EXTENSION
> db1=# CREATE TEMPLATE FOR EXTENSION foo FROM '0.0' TO '0.1' AS $$ $$;
> CREATE TEMPLATE FOR EXTENSION
> db1=# CREATE EXTENSION foo VERSION '0.1';
> CREATE EXTENSION
> db1=# DROP TEMPLATE FOR EXTENSION foo FROM '0.0' TO '0.1';
> ERROR:  cannot drop update template for extension foo because other objects 
> depend on it
> DETAIL:  extension foo depends on control template for extension foo
> HINT:  Use DROP ... CASCADE to drop the dependent objects too.

However, in that case, you are free to drop the full version:

> db1=# DROP TEMPLATE FOR EXTENSION foo VERSION '0.0';
> DROP TEMPLATE FOR EXTENSION

This certainly creates a bad state that leads to an error, when run
through dump & restore.


Maybe this one...

> db1=# CREATE TEMPLATE FOR EXTENSION foo VERSION '0.0' AS $$ $$;
> CREATE TEMPLATE FOR EXTENSION
> db1=# CREATE TEMPLATE FOR EXTENSION foo FROM '0.0' TO '0.1' AS $$ $$;
> CREATE TEMPLATE FOR EXTENSION
> db1=# DROP TEMPLATE FOR EXTENSION foo VERSION '0.0';
> DROP TEMPLATE FOR EXTENSION

... should already err out here ...

> db1=# CREATE EXTENSION foo;
> ERROR:  Extension "foo" is not available from 
> "/tmp/pginst/usr/local/pgsql/share/extension" nor as a template
> db1=# CREATE EXTENSION foo VERSION '0.1';
> ERROR:  Extension "foo" is not available from 
> "/tmp/pginst/usr/local/pgsql/share/extension" nor as a template

... and not only here.

I.e. the TO version should probably have a dependency on the FROM
version (that might even be useful in the template model).


Another thing that surprised me is the inability to have an upgrade
script *and* a full version (for the same extension target version).
Even if that's intended behavior, the error message could be improved:

> db1=# CREATE TEMPLATE FOR EXTENSION foo VERSION '0.0' AS $$ $$;
> CREATE TEMPLATE FOR EXTENSION
> db1=# CREATE TEMPLATE FOR EXTENSION foo FROM '0.0' TO '0.1' AS $$ $$;
> CREATE TEMPLATE FOR EXTENSION
> db1=# CREATE TEMPLATE FOR EXTENSION foo VERSION '0.1' AS $$ $$;
> ERROR:  duplicate key value violates unique constraint 
> "pg_extension_control_name_version_index"
> DETAIL:  Key (ctlname, ctlversion)=(foo, 0.1) already exists.


Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] robots.txt on git.postgresql.org

2013-07-09 Thread Magnus Hagander
On Tue, Jul 9, 2013 at 5:30 PM, Andres Freund  wrote:
> On 2013-07-09 16:24:42 +0100, Greg Stark wrote:
>> I note that git.postgresql.org's robots.txt refuses permission to crawl
>> the git repository:
>>
>> http://git.postgresql.org/robots.txt
>>
>> User-agent: *
>> Disallow: /
>>
>>
>> I'm curious what motivates this. It's certainly useful to be able to
>> search for commits.
>
> Gitweb is horribly slow. I don't think anybody with a bigger git repo
> using gitweb can afford to let all the crawlers go through it.

Yes, this is the reason it's been blocked. That machine basically died
every time google or bing or baidu or those hit it. Giving horrible
response times and timeouts for actual users.

We might be able to do something better about that now that we can do
better rate limiting, but it's like playing whack-a-mole. The basic
software is just fantastically slow.
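[Editor's note: if rate limiting ever makes crawling tolerable, a middle ground between "Disallow: /" and fully open could be a robots.txt that throttles and excludes only the expensive gitweb views. This is purely an illustrative sketch, not a proposal that was on the table; note that Crawl-delay is honored by Bing and Yandex but not by Google, so it is at best a partial fix.]

```
# Hypothetical robots.txt sketch: allow commit pages to be indexed,
# but slow crawlers down and keep them off the heaviest views.
User-agent: *
Crawl-delay: 10
Disallow: /gitweb/?p=postgresql.git;a=blame
Disallow: /gitweb/?p=postgresql.git;a=history
Allow: /
```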


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] Should we automatically run duplicate_oids?

2013-07-09 Thread Tom Lane
Peter Eisentraut  writes:
> On 7/9/13 10:40 AM, Andrew Dunstan wrote:
>> This is actually a pretty trivial task. Here is a simple perl version:

> But that would make Perl a hard build requirement.

My opinion is that this script should only run if you've changed some
catalog .h files.  Requiring Perl in those cases is no worse than what
we already require it for (cf the keyword crosscheck script mentioned
upthread).

regards, tom lane




Re: [HACKERS] Should we automatically run duplicate_oids?

2013-07-09 Thread Andrew Dunstan


On 07/09/2013 01:55 PM, Peter Eisentraut wrote:

On 7/9/13 10:40 AM, Andrew Dunstan wrote:

This is actually a pretty trivial task. Here is a simple perl version:

But that would make Perl a hard build requirement.



Well, it already is unless you're not building from git AND you're not 
building with MSVC AND you're not running a buildfarm animal (and maybe 
a few other conditions too).


In any case, we could make it conditional on $(PERL) not being empty, 
couldn't we?


cheers

andrew




Re: [HACKERS] Should we automatically run duplicate_oids?

2013-07-09 Thread Peter Eisentraut
On 7/9/13 10:40 AM, Andrew Dunstan wrote:
> This is actually a pretty trivial task. Here is a simple perl version:

But that would make Perl a hard build requirement.




Re: [HACKERS] [9.4 CF] Free VMs for Reviewers & Testers

2013-07-09 Thread Josh Berkus
On 07/08/2013 10:49 PM, Daniel Farina wrote:
> On Mon, Jul 8, 2013 at 7:25 PM, Craig Ringer  wrote:
>> I did some work on that a while ago, and found that I was able to get
>> _astonishingly_ stable performance results out of EC2 EBS instances
>> using provisioned IOPS volumes. Running them as "EBS Optimized" didn't
>> seem to be required for the workloads I was testing on.

> The really, really big ones are useful even for pushing limits, such
> as cr1.8xlarge, with 32 CPUs and 244GiB memory.  Current spot instance
> price (the heavily discounted "can die at any time" one) is $0.343/hr.
>  Otherwise, it's $3.500/hr.

Yes, and for that reason, they aren't realistic for offering to
reviewers for free, funded by the community.

For real performance testing, we already have servers hosted in Portland
which are real, non-shared servers.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Millisecond-precision connect_timeout for libpq

2013-07-09 Thread ivan babrou
On 9 July 2013 17:59, Markus Wanner  wrote:
> Ian,
>
> On 07/05/2013 07:28 PM, ivan babrou wrote:
>> - /*
>> -  * Rounding could cause connection to fail; need at 
>> least 2 secs
>> -  */
>
> You removed this above comment... please check why it's there. The
> relevant revision seems to be:
>
> ###
> commit 2908a838ac2cf8cdccaa115249f8399eef8a731e
> Author: Tom Lane 
> Date:   Thu Oct 24 23:35:55 2002 +

That's not correct, facb72007 is the relevant revision. It seems that
it's only applicable for small timeouts in seconds, but if you request
a connect timeout of 1 ms you should be ready to fail. I may be wrong
about this; Bruce Momjian introduced that change in 2002.

> Code review for connection timeout patch.  Avoid unportable assumption
> that tv_sec is signed; return a useful error message on timeout failure;
> honor PGCONNECT_TIMEOUT environment variable in PQsetdbLogin; make code
> obey documentation statement that timeout=0 means no timeout.
> ###
>
>> - if (timeout < 2)
>> - timeout = 2;
>> - /* calculate the finish time based on start + timeout 
>> */
>> - finish_time = time(NULL) + timeout;
>> + gettimeofday(&finish_time, NULL);
>> + finish_time.tv_usec += (int) timeout_usec;
>
> I vaguely recall tv_usec only being required to hold values up to
> 1,000,000 by some standard. A signed 32 bit value would qualify, but only
> hold up to a good half hour worth of microseconds. That doesn't quite
> seem enough to calculate finish_time the way you are proposing to do it.

Agree, this should be fixed.

>> + finish_time.tv_sec  += finish_time.tv_usec / 1000000;
>> + finish_time.tv_usec  = finish_time.tv_usec % 1000000;
>>   }
>>   }
>>
>> @@ -1073,15 +1074,15 @@ pqSocketPoll(int sock, int forRead, int forWrite, 
>> time_t end_time)
>>   input_fd.events |= POLLOUT;
>>
>>   /* Compute appropriate timeout interval */
>> - if (end_time == ((time_t) -1))
>> + if (end_time == NULL)
>>   timeout_ms = -1;
>>   else
>>   {
>> - time_t  now = time(NULL);
>> + struct timeval now;
>> + gettimeofday(&now, NULL);
>>
>> - if (end_time > now)
>> - timeout_ms = (end_time - now) * 1000;
>> - else
>> + timeout_ms = (end_time->tv_sec - now.tv_sec) * 1000 + 
>> (end_time->tv_usec - now.tv_usec) / 1000;
>
> I think that's incorrect on a platform where tv_sec and/or tv_usec is
> unsigned. (And the cited commit above indicates there are such platforms.)

I don't get it. timeout_ms is signed, and can hold the result of
unsigned minus unsigned. Is it about anything else?

> On 07/09/2013 02:25 PM, ivan babrou wrote:
>> There's no complexity here :)
>
> Not so fast, cowboy...  :-)
>
> Regards
>
> Markus Wanner

Is there anything else I should fix?

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou




Re: [HACKERS] Patch to add regression tests for SCHEMA

2013-07-09 Thread Robins Tharakan
Hi Fabien,

Appreciate you being able to check right away.
Seems something went wrong with this set of patches... Will check again
and resubmit them soon.

--
Robins Tharakan


On 9 July 2013 10:57, Fabien COELHO  wrote:

>
> Hello Robins,
>
>
>  PFA an updated patch:
>> - Renamed ROLEs as per new feedback (although the previous naming was also
>> as per an earlier feedback)
>> - Merged tests into namespace
>>
>
> I cannot apply the patch, it seems to be truncated:
>
>  sh> git apply ../regress_schema_v5.patch
>  ### warnings about trailing whitespace, then:
>  fatal: corrupt patch at line 291
>
> Another go with "patch":
>
>  sh> patch -p1 < ../regress_schema_v5.patch
>  ...
>  patch unexpectedly ends in middle of line
>  patch:  malformed patch at line 290:
>
> I have this:
>
>  sh> cksum ../regress_schema_v5.patch
>  985580529 11319 ../regress_schema_v5.patch
>
> --
> Fabien.
>


Re: [HACKERS] Millisecond-precision connect_timeout for libpq

2013-07-09 Thread ivan babrou
On 9 July 2013 12:20, Markus Wanner  wrote:
> On 07/09/2013 09:15 AM, ivan babrou wrote:
>> Database server lost network — boom, 2 seconds delay. What's the point then?
>
> Oh, I see. Good point. It could still improve connection time during
> normal operation, though.

Connection time during normal operation is 1.5ms which is fast enough for now.

> None the less, I now agree with you: we recommend a pooler, which may be
> capable of millisecond timeouts, but arguably is vastly more complex
> than the proposed patch. And it even brings its own set of gotchas (lots
> of connections). I guess I don't quite buy the complexity argument, yet.

A pooler isn't capable of millisecond timeouts. At least I don't see
how I could tell that the pooler is dead within 50 ms.

> Sure, gettimeofday() is subject to clock adjustments. But so is time().
> And if you're setting timeouts that low, you probably know what you're
> doing (or at least care about latency a lot). Or is gettimeofday() still
> considerably slower on certain architectures or in certain scenarios?
> Where's the complexity?

There's no complexity here :)

> Regards
>
> Markus Wanner

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou




Re: [HACKERS] Millisecond-precision connect_timeout for libpq

2013-07-09 Thread ivan babrou
On 9 July 2013 11:05, Markus Wanner  wrote:
> On 07/08/2013 08:31 PM, ivan babrou wrote:
>> Seriously, I don't get why running 150 poolers is easier.
>
> Did you consider running pgbouncer on the database servers?
>
> Regards
>
> Markus Wanner

Database server lost network — boom, 2 seconds delay. What's the point then?

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou




Re: [HACKERS] Should we automatically run duplicate_oids?

2013-07-09 Thread Andrew Dunstan


On 07/09/2013 10:40 AM, Andrew Dunstan wrote:


On 07/08/2013 11:03 PM, Peter Geoghegan wrote:
On Mon, Jul 8, 2013 at 7:59 PM, Peter Eisentraut  
wrote:

I don't think rewriting it in Perl is necessary or even desirable.  I
don't see anything particularly unportable in that script as it is.

I was under the impression that the final patch ought to work on
Windows too. However, I suppose that since the number of people that
use windows as an everyday development machine is probably zero, we
could reasonably forgo doing anything on that platform.





Why the heck should we? To my certain knowledge there are people using 
Windows as a development platform for PostgreSQL code, albeit not core 
code. If we ever want to get them involved in writing core code we 
need to treat them as first class citizens.


This is actually a pretty trivial task. Here is a simple perl version:



Slightly cleaner (and shorter) version:

   use strict;

   BEGIN
   {
my @files = (qw( toasting.h indexing.h), glob("pg_*.h"));

@ARGV = @files;
   }

   my %oidcounts;

   while(<>)
   {
next if /^CATALOG\(.*BKI_BOOTSTRAP/;
next unless
  /^DATA\(insert *OID *= *(\d+)/ ||
  /^CATALOG\([^,]*, *(\d+).*BKI_ROWTYPE_OID\((\d+)\)/ ||
  /^CATALOG\([^,]*, *(\d+)/ ||
  /^DECLARE_INDEX\([^,]*, *(\d+)/ ||
  /^DECLARE_UNIQUE_INDEX\([^,]*, *(\d+)/ ||
  /^DECLARE_TOAST\([^,]*, *(\d+), *(\d+)/;
$oidcounts{$1}++;
$oidcounts{$2}++ if $2;
   }

   my $found = 0;

   foreach my $oid (sort {$a <=> $b} keys %oidcounts)
   {
next unless $oidcounts{$oid} > 1;
$found = 1;
print "$oid\n";
   }

   exit $found;

cheers

andrew





Re: [HACKERS] Patch to add regression tests for SCHEMA

2013-07-09 Thread Fabien COELHO


Hello Robins,


PFA an updated patch:
- Renamed ROLEs as per new feedback (although the previous naming was also
as per an earlier feedback)
- Merged tests into namespace


I cannot apply the patch, it seems to be truncated:

 sh> git apply ../regress_schema_v5.patch
 ### warnings about trailing whitespace, then:
 fatal: corrupt patch at line 291

Another go with "patch":

 sh> patch -p1 < ../regress_schema_v5.patch
 ...
 patch unexpectedly ends in middle of line
 patch:  malformed patch at line 290:

I have this:

 sh> cksum ../regress_schema_v5.patch
 985580529 11319 ../regress_schema_v5.patch

--
Fabien.




Re: [HACKERS] robots.txt on git.postgresql.org

2013-07-09 Thread Dimitri Fontaine
Andres Freund  writes:
> Gitweb is horribly slow. I don't think anybody with a bigger git repo
> using gitweb can afford to let all the crawlers go through it.

What's blocking alternatives from being considered? I already did mention
cgit, which has the advantage to clearly show the latest patch on all
the active branches in its default view, which would match our branch
usage pretty well I think.

  http://git.zx2c4.com/cgit/
  http://git.gnus.org/cgit/gnus.git/

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] robots.txt on git.postgresql.org

2013-07-09 Thread Andrew Dunstan


On 07/09/2013 11:24 AM, Greg Stark wrote:

I note that git.postgresql.org's robots.txt refuses permission to crawl
the git repository:

http://git.postgresql.org/robots.txt

User-agent: *
Disallow: /


I'm curious what motivates this. It's certainly useful to be able to
search for commits. I frequently type git commit hashes into Google to
find the commit in other projects. I think I've even done it in
Postgres before and not had a problem. Maybe Google brought up github
or something else.

Fwiw the reason I noticed this is because I searched for "postgresql
git log" and the first hit was for "see the commit that fixed the
issue, with all the gory details" which linked to
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a6e0cd7b76c04acc8c8f868a3bcd0f9ff13e16c8

This was indexed despite the robots.txt because it was linked to from
elsewhere (Hence the interesting link title). There are ways to ask
Google not to index pages if that's really what we're after but I
don't see why we would be.




It's certainly not universal. For example, the only reason I found 
buildfarm client commit d533edea5441115d40ffcd02bd97e64c4d5814d9, for 
which the repo is housed at GitHub, is that Google has indexed the 
buildfarm commits mailing list on pgfoundry. Do we have a robots.txt on 
the postgres mailing list archives site?


cheers

andrew




Re: [HACKERS] Add some regression tests for SEQUENCE

2013-07-09 Thread Fabien COELHO



Please find updated patch:
- 'make check' successful with recent changes
- Renamed ROLEs as per feedback


Sorry, I took the wrong thread.

I do not see any difference between both "regress_sequence_v[45].patch".
I guess you sent the earlier version.

--
Fabien.




Re: [HACKERS] Patch to add regression tests for SCHEMA

2013-07-09 Thread Fabien COELHO



PFA an updated patch:
- Renamed ROLEs as per new feedback (although the previous naming was also
as per an earlier feedback)
- Merged tests into namespace


I do not see any difference with v4. I guess you sent the earlier version.


Sorry, wrong thread, this apply to SEQUENCE tests.

--
Fabien.




Re: [HACKERS] Patch to add regression tests for SCHEMA

2013-07-09 Thread Fabien COELHO



PFA an updated patch:
- Renamed ROLEs as per new feedback (although the previous naming was also
as per an earlier feedback)
- Merged tests into namespace


I do not see any difference with v4. I guess you sent the earlier version.

--
Fabien.




Re: [HACKERS] robots.txt on git.postgresql.org

2013-07-09 Thread Andres Freund
On 2013-07-09 16:24:42 +0100, Greg Stark wrote:
> I note that git.postgresql.org's robots.txt refuses permission to crawl
> the git repository:
> 
> http://git.postgresql.org/robots.txt
> 
> User-agent: *
> Disallow: /
> 
> 
> I'm curious what motivates this. It's certainly useful to be able to
> search for commits.

Gitweb is horribly slow. I don't think anybody with a bigger git repo
using gitweb can afford to let all the crawlers go through it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] robots.txt on git.postgresql.org

2013-07-09 Thread Greg Stark
I note that git.postgresql.org's robots.txt refuses permission to crawl
the git repository:

http://git.postgresql.org/robots.txt

User-agent: *
Disallow: /


I'm curious what motivates this. It's certainly useful to be able to
search for commits. I frequently type git commit hashes into Google to
find the commit in other projects. I think I've even done it in
Postgres before and not had a problem. Maybe Google brought up github
or something else.

Fwiw the reason I noticed this is because I searched for "postgresql
git log" and the first hit was for "see the commit that fixed the
issue, with all the gory details" which linked to
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a6e0cd7b76c04acc8c8f868a3bcd0f9ff13e16c8

This was indexed despite the robots.txt because it was linked to from
elsewhere (Hence the interesting link title). There are ways to ask
Google not to index pages if that's really what we're after but I
don't see why we would be.

-- 
greg




Re: [HACKERS] Millisecond-precision connect_timeout for libpq

2013-07-09 Thread Dmitriy Igrishin
2013/7/9 Merlin Moncure 

> On Fri, Jul 5, 2013 at 12:28 PM, ivan babrou  wrote:
> > Hi, guys! I made a quick patch to support floating number in
> > connect_timeout param for libpq. This will treat floating number as
> > seconds so this is backwards-compatible. I don't usually write in C,
> > so there may be mistakes. Could you review it and give me some
> > feedback?
>
> First thing that jumps into my head: why not use asynchronous
> connection (PQconnectStart, etc) and code the timeout on top of that?
>
+1.

>
> merlin
>
>



-- 
// Dmitriy.


Re: [HACKERS] Patch for fail-back without fresh backup

2013-07-09 Thread Sawada Masahiko
On Sun, Jul 7, 2013 at 4:27 PM, Sawada Masahiko  wrote:
> On Sun, Jul 7, 2013 at 4:19 PM, Sawada Masahiko  wrote:
>> On Mon, Jun 17, 2013 at 8:48 PM, Simon Riggs  wrote:
>>> On 17 June 2013 09:03, Pavan Deolasee  wrote:
>>>
 I agree. We should probably find a better name for this. Any suggestions ?
>>>
>>> err, I already made one...
>>>
> But that's not the whole story. I can see some utility in a patch that
> makes all WAL transfer synchronous, rather than just commits. Some
> name like synchronous_transfer might be appropriate. e.g.
> synchronous_transfer = all | commit (default).
>>>
 Since commits are more foreground in nature and this feature
 does not require us to wait during common foreground activities, we want a
 configuration where master can wait for synchronous transfers at other than
 commits. May we can solve that by having more granular control to the said
 parameter ?

>
> The idea of another slew of parameters that are very similar to
> synchronous replication but yet somehow different seems weird. I can't
> see a reason why we'd want a second lot of parameters. Why not just
> use the existing ones for sync rep? (I'm surprised the Parameter
> Police haven't visited you in the night...) Sure, we might want to
> expand the design for how we specify multi-node sync rep, but that is
> a different patch.


 How would we then distinguish between synchronous and the new kind of
 standby ?
>>>
>>> That's not the point. The point is "Why would we have a new kind of
>>> standby?" and therefore why do we need new parameters?
>>>
 I am told, one of the very popular setups for DR is to have one
 local sync standby and one async (may be cascaded by the local sync). Since
 this new feature is more useful for DR because taking a fresh backup on a
 slower link is even more challenging, IMHO we should support such setups.
>>>
>>> ...which still doesn't make sense to me. Lets look at that in detail.
>>>
>>> Take 3 servers, A, B, C with A and B being linked by sync rep, and C
>>> being safety standby at a distance.
>>>
>>> Either A or B is master, except in disaster. So if A is master, then B
>>> would be the failover target. If A fails, then you want to failover to
>>> B. Once B is the target, you want to failback to A as the master. C
>>> needs to follow the new master, whichever it is.
>>>
>>> If you set up sync rep between A and B and this new mode between A and
>>> C. When B becomes the master, you need to failback from B from A, but
>>> you can't because the new mode applied between A and C only, so you
>>> have to failback from C to A. So having the new mode not match with
>>> sync rep means you are forcing people to failback using the slow link
>>> in the common case.
>>>
>>> You might observe that having the two modes match causes problems if A
>>> and B fail, so you are forced to go to C as master and then eventually
>>> failback to A or B across a slow link. That case is less common and
>>> could be solved by extending sync transfer to more/multi nodes.
>>>
>>> It definitely doesn't make sense to have sync rep on anything other
>>> than a subset of sync transfer. So while it may be sensible in the
>>> future to make sync transfer a superset of sync rep nodes, it makes
>>> sense to make them the same config for now.
>> I have updated the patch.
>>
>> we support the following 2 cases.
>> 1. SYNC server and also make same failback safe standby server
>> 2. ASYNC server and also make same failback safe standby server
>>
>> 1.  changed name of parameter
>>   give up 'failback_safe_standby_names' parameter from the first patch.
>>   and changed name of parameter from 'failback_safe_mode ' to
>> 'synchronous_transfer'.
>>   this parameter accepts 'all', 'data_flush' and 'commit'.
>>
>>   -'commit'
>> 'commit' means that master waits for corresponding WAL to flushed
>> to disk of standby server on commits.
>> but master doesn't waits for replicated data pages.
>>
>>   -'data_flush'
>> 'data_flush' means that master waits for replicated data page
>> (e.g, CLOG, pg_control) before flush to disk of master server.
>> but if user set to 'data_flush' to this parameter,
>> 'synchronous_commit' values is ignored even if user set
>> 'synchronous_commit'.
>>
>>   -'all'
>> 'all' means that master waits for replicated WAL and data page.
>>
>> 2. put SyncRepWaitForLSN() function into XLogFlush() function
>>   we have put SyncRepWaitForLSN() function into XLogFlush() function,
>> and change argument of XLogFlush().
>>
>> they are setup case and need to set parameters.
>>
>> - SYNC server and also make same failback safe standby server (case 1)
>>   synchronous_transfer = all
>>   synchronous_commit = remote_write/on
>>   synchronous_standby_names = 
>>
>> - ASYNC server and also make same failback safe standby server (case 2)
>>   synchronous_transfer = data_flush
>>   (synchronous_commit values 

Re: [HACKERS] Millisecond-precision connect_timeout for libpq

2013-07-09 Thread Andres Freund
On 2013-07-05 21:28:59 +0400, ivan babrou wrote:
> Hi, guys! I made a quick patch to support floating number in
> connect_timeout param for libpq. This will treat floating number as
> seconds so this is backwards-compatible. I don't usually write in C,
> so there may be mistakes. Could you review it and give me some
> feedback?
> 
> --
> Regards, Ian Babrou
> http://bobrik.name http://twitter.com/ibobrik skype:i.babrou

> diff --git a/src/interfaces/libpq/fe-connect.c 
> b/src/interfaces/libpq/fe-connect.c
> index 18fcb0c..58c1a35 100644
> --- a/src/interfaces/libpq/fe-connect.c
> +++ b/src/interfaces/libpq/fe-connect.c
> @@ -1452,7 +1452,7 @@ static int
>  connectDBComplete(PGconn *conn)
>  {
>   PostgresPollingStatusType flag = PGRES_POLLING_WRITING;
> - time_t  finish_time = ((time_t) -1);
> + struct timeval  finish_time;
>  
>   if (conn == NULL || conn->status == CONNECTION_BAD)
>   return 0;
> @@ -1462,17 +1462,14 @@ connectDBComplete(PGconn *conn)
>*/
>   if (conn->connect_timeout != NULL)
>   {
> - int timeout = atoi(conn->connect_timeout);
> + int timeout_usec = (int) 
> (atof(conn->connect_timeout) * 1000000);

I'd rather not use a plain int for storing usecs. An overflow is rather
unlikely, but still. Also, I'd rather use something like USECS_PER_SEC
instead of a plain 1000000 in multiple places.

>  
> - if (timeout > 0)
> + if (timeout_usec > 0)
>   {
> - /*
> -  * Rounding could cause connection to fail; need at 
> least 2 secs
> -  */
> - if (timeout < 2)
> - timeout = 2;
> - /* calculate the finish time based on start + timeout */
> - finish_time = time(NULL) + timeout;
> + gettimeofday(&finish_time, NULL);
> + finish_time.tv_usec += (int) timeout_usec;

Accordingly adjust this.

Looks like a sensible thing to me.

*Independent* from this patch, you might want look into server-side
connection pooling using transaction mode. If that's applicable for
your application it might reduce latency noticeably.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Should we automatically run duplicate_oids?

2013-07-09 Thread Andrew Dunstan


On 07/08/2013 11:03 PM, Peter Geoghegan wrote:

On Mon, Jul 8, 2013 at 7:59 PM, Peter Eisentraut  wrote:

I don't think rewriting it in Perl is necessary or even desirable.  I
don't see anything particularly unportable in that script as it is.

I was under the impression that the final patch ought to work on
Windows too. However, I suppose that since the number of people that
use windows as an everyday development machine is probably zero, we
could reasonably forgo doing anything on that platform.





Why the heck should we? To my certain knowledge there are people using 
Windows as a development platform for PostgreSQL code, albeit not core 
code. If we ever want to get them involved in writing core code we need 
to treat them as first class citizens.


This is actually a pretty trivial task. Here is a simple perl version:



   use strict;

   my @files = (qw( toasting.h indexing.h), glob("pg_*.h"));

   my $handle;
   my @lines;
   foreach my $file (@files)
   {
my $handle;
open($handle,$file) || die "$!";
my @flines = <$handle>;
close($handle);
chomp @flines;
push(@lines, @flines);
   }

   my %oidcounts;

   foreach (@lines)
   {
next if /^CATALOG\(.*BKI_BOOTSTRAP/;
next unless
  /^DATA\(insert *OID *= *([0-9][0-9]*).*$/ ||
  /^CATALOG\([^,]*, *([0-9][0-9]*).*BKI_ROWTYPE_OID\(([0-9][0-9]*)\).*$/ ||
  /^CATALOG\([^,]*, *([0-9][0-9]*).*$/ ||
  /^DECLARE_INDEX\([^,]*, *([0-9][0-9]*).*$/ ||
  /^DECLARE_UNIQUE_INDEX\([^,]*, *([0-9][0-9]*).*$/ ||
  /^DECLARE_TOAST\([^,]*, *([0-9][0-9]*), *([0-9][0-9]*).*$/;
$oidcounts{$1}++;
$oidcounts{$2}++ if $2;
   }

   my $found = 0;

   foreach my $oid (sort {$a <=> $b} keys %oidcounts)
   {
next unless $oidcounts{$oid} > 1;
$found = 1;
print "$oid\n";
   }

   exit $found;


cheers

andrew




Re: [HACKERS] Millisecond-precision connect_timeout for libpq

2013-07-09 Thread Markus Wanner
Ian,

On 07/05/2013 07:28 PM, ivan babrou wrote:
> - /*
> -  * Rounding could cause connection to fail; need at 
> least 2 secs
> -  */

You removed this above comment... please check why it's there. The
relevant revision seems to be:

###
commit 2908a838ac2cf8cdccaa115249f8399eef8a731e
Author: Tom Lane 
Date:   Thu Oct 24 23:35:55 2002 +

Code review for connection timeout patch.  Avoid unportable assumption
that tv_sec is signed; return a useful error message on timeout failure;
honor PGCONNECT_TIMEOUT environment variable in PQsetdbLogin; make code
obey documentation statement that timeout=0 means no timeout.
###

> - if (timeout < 2)
> - timeout = 2;
> - /* calculate the finish time based on start + timeout */
> - finish_time = time(NULL) + timeout;
> + gettimeofday(&finish_time, NULL);
> + finish_time.tv_usec += (int) timeout_usec;

I vaguely recall tv_usec only being required to hold values up to
1000000 by some standard. A signed 32 bit value would qualify, but only
hold up to a good half hour worth of microseconds. That doesn't quite
seem enough to calculate finish_time the way you are proposing to do it.

> + finish_time.tv_sec  += finish_time.tv_usec / 1000000;
> + finish_time.tv_usec  = finish_time.tv_usec % 1000000;
>   }
>   }
>  
> @@ -1073,15 +1074,15 @@ pqSocketPoll(int sock, int forRead, int forWrite, 
> time_t end_time)
>   input_fd.events |= POLLOUT;
>  
>   /* Compute appropriate timeout interval */
> - if (end_time == ((time_t) -1))
> + if (end_time == NULL)
>   timeout_ms = -1;
>   else
>   {
> - time_t  now = time(NULL);
> + struct timeval now;
> + gettimeofday(&now, NULL);
>  
> - if (end_time > now)
> - timeout_ms = (end_time - now) * 1000;
> - else
> + timeout_ms = (end_time->tv_sec - now.tv_sec) * 1000 + 
> (end_time->tv_usec - now.tv_usec) / 1000;

I think that's incorrect on a platform where tv_sec and/or tv_usec is
unsigned. (And the cited commit above indicates there are such platforms.)


On 07/09/2013 02:25 PM, ivan babrou wrote:
> There's no complexity here :)

Not so fast, cowboy...  :-)

Regards

Markus Wanner




Re: [HACKERS] Millisecond-precision connect_timeout for libpq

2013-07-09 Thread Merlin Moncure
On Fri, Jul 5, 2013 at 12:28 PM, ivan babrou  wrote:
> Hi, guys! I made a quick patch to support floating number in
> connect_timeout param for libpq. This will treat floating number as
> seconds so this is backwards-compatible. I don't usually write in C,
> so there may be mistakes. Could you review it and give me some
> feedback?

First thing that jumps into my head: why not use asynchronous
connection (PQconnectStart, etc) and code the timeout on top of that?

merlin




Re: [HACKERS] [PATCH] Add session_preload_libraries configuration parameter

2013-07-09 Thread Dimitri Fontaine
Peter Eisentraut  writes:
> preprepare has an SQL function as entry point, so you don't need to
> preload it.

Well the whole goal of that extension is to autoload itself entirely
server side at connection time. The function API had been made to cope
with 8.3 where it wasn't possible to get a snapshot from within the init
function of a module.

> I think the idea was to use PGOPTIONS to load it, controlled by the
> client side.

Oh. I've never done that; I don't know if that's common in the field. If
it's in use, then the new session_preload_libraries GUC looks like the
only way forward.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] [PATCH] Add session_preload_libraries configuration parameter

2013-07-09 Thread Peter Eisentraut
On 7/8/13 4:37 AM, Dimitri Fontaine wrote:
>> I don't know of any actual legitimate uses of local_preload_libraries.
>> I recall that the plpgsql debugger was meant to use it, but doesn't
>> anymore.  So it's hard to judge what to do about this, without any
>> actual use cases.
> 
> Well there's my preprepare thing at 
> 
>   https://github.com/dimitri/preprepare

preprepare has an SQL function as entry point, so you don't need to
preload it.

> I don't think that the whitelisting is actually used in a way to allow
> for non superusers to load modules in the field, because the only way to
> do that with local_preload_libraries that I know of is to edit the
> postgresql.conf file and reload.
> 
>   alter role dim set local_preload_libraries = 'auto_explain';
>   ERROR:  55P02: parameter "local_preload_libraries" cannot be set after 
> connection start

I think the idea was to use PGOPTIONS to load it, controlled by the
client side.





Re: [HACKERS] Add visibility map information to pg_freespace.

2013-07-09 Thread Kyotaro HORIGUCHI
Hello, I've brought the visibility map extensions for pg_freespacemap
and pgstattuple.

At Mon, 08 Jul 2013 16:59:05 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20130708.165905.118860769.horiguchi.kyot...@lab.ntt.co.jp>
> I'll come again with the first implementation of it. And as for
> pg_freespacemap, I'll keep the current direction - adding column
> to present output records format of pg_freespace(). And
> documentation, if possible.

pg_freespace_vm_v2.patch:

  The interface has been changed from the first patch. The version of
  pg_freespace() provided with vm information is named
  pg_freespace_with_vminfo() and shows output like the following.

| postgres=# select * from pg_freespace_with_vminfo('t'::regclass) limit 10;
|  blkno | avail | is_all_visible 
| ---+---+
|  0 |64 | t
|  1 |32 | t
|  2 |96 | t
|  3 |64 | t
|  4 |96 | t
|  5 |96 | t
|  6 |   128 | t
|  7 |32 | t
|  8 |96 | t
   
pgstattuple_vm_v1.patch:

  The first version of the VM extension for pgstattuple. According to
  the previous discussion, the added column is named
  'all_visible_percent'.

| postgres=# select * from pgstattuple('t');
| -[ RECORD 1 ]---+-
| table_len   | 71770112
| tuple_count | 989859
| tuple_len   | 31675488
| tuple_percent   | 44.13
| dead_tuple_count| 99
| dead_tuple_len  | 3168
| dead_tuple_percent  | 0
| free_space  | 31886052
| free_percent| 44.43
| all_visible_percent | 99.98

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/contrib/pg_freespacemap/Makefile b/contrib/pg_freespacemap/Makefile
index b2e3ba3..d794df2 100644
--- a/contrib/pg_freespacemap/Makefile
+++ b/contrib/pg_freespacemap/Makefile
@@ -4,7 +4,7 @@ MODULE_big = pg_freespacemap
 OBJS = pg_freespacemap.o
 
 EXTENSION = pg_freespacemap
-DATA = pg_freespacemap--1.0.sql pg_freespacemap--unpackaged--1.0.sql
+DATA = pg_freespacemap--1.1.sql pg_freespacemap--1.0--1.1.sql pg_freespacemap--unpackaged--1.0.sql
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/pg_freespacemap/pg_freespacemap--1.0--1.1.sql b/contrib/pg_freespacemap/pg_freespacemap--1.0--1.1.sql
new file mode 100644
index 0000000..e7b25bd
--- /dev/null
+++ b/contrib/pg_freespacemap/pg_freespacemap--1.0--1.1.sql
@@ -0,0 +1,21 @@
+/* contrib/pg_freespacemap/pg_freespacemap--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION pg_freespacemap UPDATE TO '1.1'" to load this file. \quit
+
+CREATE FUNCTION pg_is_all_visible(regclass, bigint)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_is_all_visible'
+LANGUAGE C STRICT;
+
+CREATE FUNCTION
+  pg_freespace_with_vminfo(rel regclass, blkno OUT bigint, avail OUT int2, is_all_visible OUT boolean)
+RETURNS SETOF RECORD
+AS $$
+  SELECT blkno, pg_freespace($1, blkno) AS avail, pg_is_all_visible($1, blkno) as is_all_visible
+  FROM generate_series(0, pg_relation_size($1) / current_setting('block_size')::bigint - 1) AS blkno;
+$$
+LANGUAGE SQL;
+
+-- Don't want these to be available to public.
+REVOKE ALL ON FUNCTION pg_freespace_with_vminfo(regclass) FROM PUBLIC;
diff --git a/contrib/pg_freespacemap/pg_freespacemap--1.0.sql b/contrib/pg_freespacemap/pg_freespacemap--1.0.sql
deleted file mode 100644
index 2adb52a..0000000
--- a/contrib/pg_freespacemap/pg_freespacemap--1.0.sql
+++ /dev/null
@@ -1,25 +0,0 @@
-/* contrib/pg_freespacemap/pg_freespacemap--1.0.sql */
-
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use "CREATE EXTENSION pg_freespacemap" to load this file. \quit
-
--- Register the C function.
-CREATE FUNCTION pg_freespace(regclass, bigint)
-RETURNS int2
-AS 'MODULE_PATHNAME', 'pg_freespace'
-LANGUAGE C STRICT;
-
--- pg_freespace shows the recorded space avail at each block in a relation
-CREATE FUNCTION
-  pg_freespace(rel regclass, blkno OUT bigint, avail OUT int2)
-RETURNS SETOF RECORD
-AS $$
-  SELECT blkno, pg_freespace($1, blkno) AS avail
-  FROM generate_series(0, pg_relation_size($1) / current_setting('block_size')::bigint - 1) AS blkno;
-$$
-LANGUAGE SQL;
-
-
--- Don't want these to be available to public.
-REVOKE ALL ON FUNCTION pg_freespace(regclass, bigint) FROM PUBLIC;
-REVOKE ALL ON FUNCTION pg_freespace(regclass) FROM PUBLIC;
diff --git a/contrib/pg_freespacemap/pg_freespacemap--1.1.sql b/contrib/pg_freespacemap/pg_freespacemap--1.1.sql
new file mode 100644
index 0000000..7d2c2fe
--- /dev/null
+++ b/contrib/pg_freespacemap/pg_freespacemap--1.1.sql
@@ -0,0 +1,41 @@
+/* contrib/pg_freespacemap/pg_freespacemap--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_freespacemap" to load this file. \quit
+
+-- Register the C function.
+CREATE FUNCTION pg_freespace(regclass, bigint)
+RETURNS int2
+AS 'MODULE_PATHNAME', 'pg_freespace'
+LANGUAGE C STRICT;
+
+CREATE FUNCTION pg_is_

Re: [HACKERS] Review: extension template

2013-07-09 Thread Dimitri Fontaine
Markus Wanner  writes:
>> Then what happens at pg_restore time? the CREATE EXTENSION in the backup
>> script will suddenly install the other extension's that happen to have
>> the same name? I think erroring out is the only safe way here.
>
> Extensions are commonly identified by name (installed ones as well as
> available ones, i.e. templates).
>
> Thus I think if a user renames a template, he might have good reasons to
> do so. He likely *wants* it to be a template for a different extension.
> Likewise when (re-)creating a template with the very same name of a
> pre-existing, installed extension.

I can understand that as an incomplete step towards a "migration" of
sorts, but if we simply allow renaming a template we open ourselves to
producing non-restorable dumps (see below). I'm not at ease with that.

> Maybe the user just wanted to make a "backup" of the template prior to
> modifying it. If he then gives the new template the same name as the old
> one had, it very likely is similar, compatible or otherwise intended to
> replace the former one.
>
> The file-system templates work exactly that way (modulo DSOs). If you
> create an extension, then modify (or rename and re-create under the same
> name) the template on disk, then dump and restore, you end up with the
> new version of it. That's how it worked so far. It's simple to
> understand and use.

We have absolutely no control over the file-system templates and that's
why they work differently, I think. There's not even the notion of a
transaction over there.

> Or how do you think would pg_restore fail, if you followed the mental
> model of the template?

  # create template for extension foo version 'x' as '';
  # create extension foo;
  # alter template for extension foo rename to bar;

  $ pg_dump | psql

And now it's impossible to CREATE EXTENSION foo, because there's no
source to install it from available. I think we should actively prevent
that scenario to happen in the field (current patch prevents it).

Now, if I'm in the minority, let's just change that.

> However, this also means you restrict the user even further... How can
> he save a copy of an existing template, before (re-)creating it with
> CREATE TEMPLATE FOR EXTENSION?
>
> On the file system, a simple cp or mv is sufficient before
> (re)installing the package from your distribution, for example.

Usually what you do when you want to change an extension is that you
provide an upgrade script then run it with ALTER EXTENSION UPDATE.

Sometimes what you do is prepare a new installation script for a new
version of your extension and you don't provide an upgrade script, then
you update with the following method, in the case when you edited the
default_version property of the .control file:

  # drop extension foo;   -- drops version 1.0
  # create extension foo; -- installs version 1.2

The current file system based extensions allow you to maintain
separately the files foo--1.0.sql and foo--1.2.sql, and you don't need
to copy a current version of the whole extension away before hacking the
new version.

The Extension Template facility in the current patch allows you to do
much the same:

  # create template for extension foo version '1.0' as $foo$
  create function foo() returns int language sql as $$ select 1; $$;
$foo$;

  # create template for extension foo version '1.2' as $foo$
  create function foo() returns int language sql as $$ select 2; $$;
$foo$;

  # select ctlname, ctldefault, ctlversion
  from pg_extension_control
 where ctlname = 'foo';
   ctlname | ctldefault | ctlversion 
  -++
   foo | t  | 1.0
   foo | f  | 1.2
  (2 rows)

  # create extension foo;
  # select foo();
   foo 
  -
 1
  (1 row)

And now you can "upgrade" with:

  # drop extension foo;
  # create extension foo version '1.2';
  # select foo();
   foo 
  -
 2
  (1 row)

Or even:

  # alter template for extension foo set default version '1.2';
  # drop extension foo;
  # create extension foo;
  # select foo();
   foo 
  -
 2
  (1 row)

So I don't see that we've broken any use case here, really. I think I
understand your objection in principles, but it appears to me that we
would gain nothing here by allowing broken pg_dump scripts.

> What way? And what community consensus?

I see that you've spent extra time and effort to better understand any
community consensus that might exist around this patch series, and I
want to say thank you for that!

> Re-reading some of the past discussions, I didn't find anybody voting
> for a dependency between the template and the created extension. And at
> least Tom pretty clearly had the template model in mind, when he wrote
> [1]: "We don't want it to look like manipulating a template has anything
> to do with altering an extension of the same name (which might or might
> not even be installed)." or [2]: "But conflating this functionality
> [i.e. extension template

Re: [HACKERS] Add regression tests for ROLE (USER)

2013-07-09 Thread Fabien COELHO



I think that it is not that simple: there is real value in checking that
the syntax error message conveys useful information to the user, and that
changes to the parser rules do not degrade good-quality error messages.


It's good to check those things when a feature is implemented.  However,
once it's done, the odds of the bison parser breaking are very low.


I do not know that. When the next version of bison is out (I have 2.5 from 
2011 on my laptop; 2.7.1 was released on 2013-04-15), or if a new "super 
great acme incredible" drop-in replacement is proposed, you would want to 
see the impact, whether positive or negative, it has on error messages 
before switching.



Thus, the benefit of testing that over again thousands of times a day
is pretty tiny.


Sure, I agree that thousands of times per day is overkill for syntax 
errors. But once in a while would be good, and for that you need to have 
the tests somewhere, and the current status is "nowhere".


--
Fabien.




Re: [HACKERS] XLogInsert scaling, revisited

2013-07-09 Thread Andres Freund
On 2013-07-09 08:00:52 +0900, Michael Paquier wrote:
> On Mon, Jul 8, 2013 at 6:16 PM, Heikki Linnakangas
>  wrote:
> > Ok, I've committed this patch now. Finally, phew!
> +1. Great patch!

And one more: +1

There seems to be quite a lot of low-hanging fruit for making stuff faster
now that this bottleneck is out of the way. By far the biggest thing
visible in profiles atm is the CRC32 computation...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Millisecond-precision connect_timeout for libpq

2013-07-09 Thread Markus Wanner
On 07/09/2013 09:15 AM, ivan babrou wrote:
> Database server lost network — boom, 2 seconds delay. What's the point then?

Oh, I see. Good point. It could still improve connection time during
normal operation, though.

None the less, I now agree with you: we recommend a pooler, which may be
capable of millisecond timeouts, but arguably is vastly more complex
than the proposed patch. And it even brings its own set of gotchas (lots
of connections). I guess I don't quite buy the complexity argument, yet.

Sure, gettimeofday() is subject to clock adjustments. But so is time().
And if you're setting timeouts that low, you probably know what you're
doing (or at least care about latency a lot). Or is gettimeofday() still
considerably slower on certain architectures or in certain scenarios?
Where's the complexity?

Regards

Markus Wanner




Re: [HACKERS] [PATCH] Add an ldapoption to disable chasing LDAP referrals

2013-07-09 Thread Magnus Hagander
On Tue, Jul 9, 2013 at 3:33 AM, James Sewell wrote:

> Hey,
>
> New patch attached. I've moved from using a boolean to an enum trivalue.
>
> Let me know what you think.
>
>
Please add your patch at
https://commitfest.postgresql.org/action/commitfest_view?id=19 so it's not
missed in the next commitfest!

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [Review] Re: [HACKERS] minor patch submission: CREATE CAST ... AS EXPLICIT

2013-07-09 Thread Fabien COELHO


Hello Josh,


Generally speaking, I agree with Robert's objection.  The patch in
itself adds only one unnecessary keyword, which probably wouldn't be
noticeable, but the argument for it implies that we should be willing
to add a lot more equally-unnecessary keywords, which I'm not.  gram.o
is already about 10% of the entire postgres executable, which probably
goes far towards explaining why its inner loop always shows up high in
profiling: cache misses are routine.  And the size of those tables is
at least linear in the number of keywords --- perhaps worse than linear,
I'm not sure.  Adding a bunch of keywords *will* cost us in performance.
I'm not willing to pay that cost for something that adds neither
features nor spec compliance.


(1) Here are objective measures of the postgres stripped binary size 
compiled from scratch on my laptop this morning:


 - without "AS EXPLICIT": 5286408 bytes
  gram.o:  904936 bytes
  keywords.o:   20392 bytes

 - with "AS EXPLICIT"   : 5282312 bytes
  gram.o:  901632 bytes
  keywords.o:   20440 bytes

The executable binary is reduced by 4 KB with AS EXPLICIT, which 
seems to come from some "ld" fluke really, because the only difference 
I've found is the 48 bytes added to "keywords.o". This must be very 
specific to the version and options used with gcc & ld on my laptop.


(2) user annoyance vs cycles

I strongly disagree that user annoyance is of little value. This is one of 
the reasons why I cannot stand MySQL (oops, EXCEPT is not implemented, 
casts do not work for all types, the default engine is not ACID, the standard 
information_schema implementation does *NOT* implement the standard...).

I've made an extensive argument earlier in the thread.


Where are we with this patch?  Fabien, are you going to submit an
updated version which addresses the objections, or should I mark it
Returned With Feedback?


There is no need for an updated patch. I addressed the objections with 
words, not code:-)


I've strongly argued against the premises of Robert & Tom's objections. 
Moreover, it happens that the patch actually reduces, because of some fluke, 
the executable size, which was one of the motivations for the objections.


I have no doubt that the potential cache misses induced by the 48-byte
extension of the keyword table are of less consequence than user time and
annoyance.

As for the general issue with the parser size: I work with languages and 
compilers as a researcher. We had issues at one point with a scanner 
because of too many keywords, and we solved them by removing the keywords 
from the scanner and checking them semantically in the parser with a hash 
table, basically as suggested by Andres. The SQL syntax is quite 
redundant, so there are few choices at each point. Some tools can 
generate perfect hash functions that would help here (e.g. gperf). However, 
I'm not sure of the actual impact in size and time of following this path; 
I'm just sure that there would be fewer keywords. IMHO, this issue is 
orthogonal to and independent from this patch.


Finally, to answer your question directly, I'm really a nobody here, so 
you can do whatever pleases you with the patch.


--
Fabien.




Re: [HACKERS] [9.4 CF] Free VMs for Reviewers & Testers

2013-07-09 Thread Daniel Farina
On Tue, Jul 9, 2013 at 12:24 AM, Jesper Krogh  wrote:
>
> The really, really big ones are useful even for pushing limits, such
> as cr1.8xlarge, with 32 CPUs and 244GiB memory.  Current spot instance
> price (the heavily discounted "can die at any time" one) is $0.343/hr.
> Otherwise, it's $3.500/hr.
>
>
> Just to keep in mind, CPUs are similarly throttled:
>
> One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz
> 2007 Opteron or 2007 Xeon processor. This is also the equivalent to an
> early-2006 1.7 GHz Xeon processor referenced in our original documentation.

This is only a statement of measurement (notably, it also is a metric
that is SMP-processor-count-oblivious), for lack of a more sensible
metric (certainly not clock cycles nor bogomips) in common use.

> Who knows what that does to memory bandwidth / context switches  etc.

Virtualization adds complexity, that is true, but so does a new
version of Linux or comparing across microprocessors, motherboards, or
operating systems.  I don't see a good reason to be off-put in the
common cases, especially since I can't think of another way to produce
such large machines on a short-term obligation basis.

The advantages are probably diminished (except for testing
virtualization overhead on common platforms) for smaller machines that
can be located sans-virtualization more routinely.




Re: [HACKERS] refresh materialized view concurrently

2013-07-09 Thread Hitoshi Harada
On Sat, Jul 6, 2013 at 9:20 AM, Kevin Grittner  wrote:
> Hitoshi Harada  wrote:
>
>> Oops!
>
> Indeed.  Thanks for the careful testing.
>
>> drop materialized view if exists mv;
>> drop table if exists foo;
>> create table foo(a, b) as values(1, 10);
>> create materialized view mv as select * from foo;
>> create unique index on mv(a);
>> insert into foo select * from foo;
>> refresh materialized view mv;
>> refresh materialized view concurrently mv;
>>
>> test=# refresh materialized view mv;
>> ERROR:  could not create unique index "mv_a_idx"
>> DETAIL:  Key (a)=(1) is duplicated.
>> test=# refresh materialized view concurrently mv;
>> REFRESH MATERIALIZED VIEW
>
> Fixed by scanning the temp table for duplicates before generating
> the diff:
>
> test=# refresh materialized view concurrently mv;
> ERROR:  new data for "mv" contains duplicate rows without any NULL columns
> DETAIL:  Row: (1,10)
>

I think the point is not to check for duplicate rows; it is about unique-key
constraint violations. So, if you change the original table foo to
values(1, 10), (1, 20), the issue is still reproduced. In the
non-concurrent operation it is checked by reindex_index when swapping
the heap, but apparently we are not doing the constraint check in
concurrent mode.

>> [ matview with all columns covered by unique indexes fails ]
>
> Fixed.
>
>> Other than these, I've found index is opened with NoLock, relying
>> on ExclusiveLock of parent matview, and ALTER INDEX SET
>> TABLESPACE or something similar can run concurrently, but it is
>> presumably safe.  DROP INDEX, REINDEX would be blocked by the
>> ExclusiveLock.
>
> Since others were also worried that an index definition could be
> modified while another process is holding an ExclusiveLock on its
> table, I changed this.
>

OK, the others are resolved. One thing I need to apologize for is
make_temptable_name_n: I pointed out that the previous coding would be
a potential memory overrun, but actually the relation name is defined
by make_new_heap, so unless that function generates a stupidly long name,
there is no possibility of making a name big enough to overrun
NAMEDATALEN. So +1 for reverting make_temptable_name_n, which was also
needlessly invented.

I've found another issue, regarding matview_maintenance_depth. If an SPI
call fails (pg_cancel_backend is quite possible) and errors out in the
middle of processing, this value stays above zero, so we can subsequently
issue DML on the matview. We should make sure the value becomes zero
before jumping out of this function.


--
Hitoshi Harada




Re: [HACKERS] [9.4 CF] Free VMs for Reviewers & Testers

2013-07-09 Thread Jesper Krogh
> 
> The really, really big ones are useful even for pushing limits, such
> as cr1.8xlarge, with 32 CPUs and 244GiB memory.  Current spot instance
> price (the heavily discounted "can die at any time" one) is $0.343/hr.
> Otherwise, it's $3.500/hr.
> 

Just to keep in mind, CPUs are similarly throttled:

One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 
Opteron or 2007 Xeon processor. This is also the equivalent to an early-2006 
1.7 GHz Xeon processor referenced in our original documentation.

Who knows what that does to memory bandwidth / context switches  etc.

Jesper



Re: [HACKERS] Millisecond-precision connect_timeout for libpq

2013-07-09 Thread Markus Wanner
On 07/08/2013 08:31 PM, ivan babrou wrote:
> Seriously, I don't get why running 150 poolers is easier.

Did you consider running pgbouncer on the database servers?

Regards

Markus Wanner

