Re: [HACKERS] wal_buffers

2012-08-30 Thread Amit Kapila
On Thursday, August 30, 2012 7:14 PM Robert Haas wrote:
On Wed, Aug 29, 2012 at 10:25 PM, Peter Geoghegan 
wrote:
> On 19 February 2012 05:24, Robert Haas  wrote:
>>> I have attached tps scatterplots.  The obvious conclusion appears to
>>> be that, with only 16MB of wal_buffers, the buffer "wraps around" with
>>> some regularity: we can't insert more WAL because the buffer we need
>>> to use still contains WAL that hasn't yet been fsync'd, leading to
>>> long stalls.  More buffer space ameliorates the problem.
>
>> Incidentally, I wondered if we could further improve group commit
>> performance by implementing commit_delay with a WaitLatch call, and
>> setting the latch in the event of WAL buffers wraparound (or rather, a
>> queued wraparound request - a segment switch needs WALWriteLock, which
>> the group commit leader holds for a relatively long time during the
>> delay). I'm not really sure how significant a win this might be,
>> though. There could be other types of contention, which could be
>> considerably more significant. I'll try and take a look at it next
>> week.

> I have a feeling that one of the big bottlenecks here is that we force
> an immediate fsync when we reach the end of a segment.  I think it was
> originally done that way to keep the code simple, and it does
> accomplish that, but it's not so hot for performance.  More generally,
> I think we really need to split WALWriteLock into two locks, one to
> protect the write position and the other to protect the flush
> position.  I think we're often ending up with a write (which is
> usually fast) waiting for a flush (which is often much slower) when in
> fact those things ought to be able to happen in parallel.

  This is really good idea for splitting WALWriteLock into two locks, 
  but in that case do we need separate handling for OPEN_SYNC method where 
  write and flush happens together?

  And more about WAL, do you have any suggestions regarding the idea of
triggering
  WALWriter in case Xlog buffers are nearly full?

With Regards,
Amit Kapila.


With Regards,
Amit Kapila.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Don't allow relative path for copy from file

2012-08-30 Thread Etsuro Fujita
> From: Robert Haas [mailto:robertmh...@gmail.com]

> On Thu, Aug 16, 2012 at 2:11 AM, Etsuro Fujita
>  wrote:
> > Agreed.  I'd like to withdraw the patch sent in the earlier post, and
propose
> to
> > update the documentation in the COPY reference page.  Please find attached
> a
> > patch.
> 
> I think this is a good idea, but I didn't like the exact wording you
> chose, so I committed something a little different.  Let me know
> whether it looks OK.

It looks fine to me.  Thanks!

Best regards,
Etsuro Fujita



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER command reworks

2012-08-30 Thread Kohei KaiGai
2012/8/30 Robert Haas :
> On Mon, Aug 13, 2012 at 3:35 PM, Kohei KaiGai  wrote:
>> The attached patch is a refreshed version of ALTER command
>> reworks towards the latest tree. Here is few big changes except
>> for code integration of the code to rename event triggers.
>
> This seems to have bit-rotted a bit.  Please rebase.
>
>> BTW, I had to adjust between oid of pg_largeobject_metadata
>> and pg_largeobject on three points of this patch, like:
>>
>> if (classId == LargeObjectRelationId)
>> classId = LargeObjectMetadataRelationId;
>>
>> When we supported largeobject permission features, we put
>> special handling to track dependency of its ownership.
>> The pg_depend records oid of pg_largeobject, instead of
>> pg_largeobject_metadata. Thus, we cannot use classId of
>> ObjectAddress being returned from get_object_address()
>> as an argument of heap_open() as is, if it indicates oid of
>> pg_largeobject.
>>
>> Was it a right decision to track dependency of large object using
>> oid of pg_largeobject, instead of pg_largeobject_metadata?
>> IIRC, the reason why we used oid of pg_largeobject is backward
>> compatibility for applications that tries to reference pg_depend
>> with built-in oids.
>
> I think it was a terrible decision and I'm pretty sure I said as much
> at the time, but I lost the argument, so I'm inclined to think we're
> stuck with continuing to support that kludge.
>
OK, we will keep to implement carefully...

> Some other preliminary comments:
>
> - Surely you need to take AccessExclusiveLock on the object being
> renamed, not just ShareUpdateExclusiveLock.
> - I don't think it's acceptable to assemble the object-type
> "object-name" already exists message using getObjectDescription();
> it's not good for translators.  Use an array of messages, one per
> object-type, as we have done in other cases.
> - I would like to handle either the RENAME case or the ALTER OWNER
> case in one patch, and the other in a follow-on patch.  Can you pick
> one to do first and remove everything related to the other one?
>
OK, I'll split the patch into three (isn't it?) portions; RENAME, SET OWNER
and SET SCHEMA, with all above your suggestions.

Thanks,
-- 
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using pg_upgrade on log-shipping standby servers

2012-08-30 Thread Bruce Momjian
On Thu, Jul 26, 2012 at 10:36:59AM -0400, Bruce Momjian wrote:
> > > Pg_upgrade already creates a script to analyze the cluster, so we could
> > > create another script to upgrade a standby.  However, the problem with a
> > > script is that I have no idea what command people would use to do the
> > > copy.
> > 
> > Exactly.  Perhaps an example wouldn't hurt, but I wouldn't go too far.
> 
> Agreed.
> 
> > > I think I could create a list and pass that into a loop so only
> > > the command has to be modified, but again, how do we do that on Windows?
> > > Can we create a shell function in Windows and pass the file name as an
> > > argument?
> > 
> > I don't know, but I assume that somewhere in the known universe there is
> > a way on Windows to say, here is a list of files, copy them to that
> > host.
> 
> No idea.
> 
> > > Another problem is that the standby cluster might create _new_ files
> > > that don't exist on the master, e.g. WAL files, and those have to be
> > > removed.  I am not clear how to do that either, except by removing all
> > > files with a hard link count of 1, and again, this is difficult on
> > > Windows.
> > 
> > Well, then that would call for another list of files.
> 
> Well, not really.  If we create a list of all user table/index files,
> then any file not on the list would be removed on the standby, then all
> the files in the primary not on the list are copied to the standby.  
> One list is less error-prone.  This is easy in Unix shell and Perl, but
> hard on Windows without Perl.

There was too much concern about pg_upgrade upgrading a standby server
that I am not going to peruse the issue at this time.

I did add a TODO in case we ever want to resurrect the idea:

Consider a way to run pg_upgrade on standby servers

 http://archives.postgresql.org/pgsql-hackers/2012-07/msg00453.php 

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] has_language_privilege returns incorrect answer for non-superuser

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 07:59:02PM -0700, Joe Conway wrote:
> On 08/30/2012 07:23 PM, Bruce Momjian wrote:
> > On Thu, Jul 12, 2012 at 06:01:00PM -0700, Joe Conway wrote:
> >> I'll take a look at the latter option sometime in the next few weeks and
> >> submit for the next commitfest.
> > 
> > Any news on this?
> 
> Not yet -- OBE. I'll try to set aside some time on the long weekend.

Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-08-30 Thread Tom Lane
Robert Haas  writes:
> On Thu, Aug 30, 2012 at 4:51 PM, Tom Lane  wrote:
>> Bruce Momjian  writes:
>>> On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
 Ok, I modified the part of pg_dump where tremendous number of LOCK
 TABLE are issued. I replace them with single LOCK TABLE with multiple
 tables. With 100k tables LOCK statements took 13 minutes in total, now
 it only takes 3 seconds. Comments?

>>> Was this applied?

>> No, we fixed the server side instead.

> But only for 9.2, right?  So people running back branches are still screwed.

Yeah, but they're screwed anyway, because there are a bunch of O(N^2)
behaviors involved here, not all of which are masked by what Tatsuo-san
suggested.

Six months or a year from now, we might have enough confidence in that
batch of 9.2 fixes to back-port them en masse.  Don't want to do it
today though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] Performance Improvement by reducing WAL for Update Operation

2012-08-30 Thread Amit Kapila
On Thursday, August 30, 2012 11:23 PM Robert Haas
[mailto:robertmh...@gmail.com] wrote:
On Fri, Aug 10, 2012 at 1:25 AM, Amit Kapila  wrote:
>>> I think the property that recovery only needs to worry about each
>>> block individually is one that we want to preserve.  Supporting this
>>> optimizating only when full_page_writes=off seems ugly,
>
>> I think recovery needs to worry about multiple blocks as well in some
cases.
>> Please see below case and correct me if I am wrong.
>> I think currently also there can be problems in case of
full_page_writes=off
>> for crash recovery.
>> 1. Tuple A on page 1 is updated.  The new version, tuple B, is placed on
>> page 2.
>> 2. Page 1 is Partially written to disk.
>> 3. During recovery, it can so appear that there is no need to update XMAX
>> and other related things in Old tuple
>>as LSN is greater than WAL lsn.
>> 4. Now also there can be other problems related to tuple visibility.

> Well, you're only supposed to turn full_page_writes=off if partial
> page writes are impossible on your system.  If you turn off
> full_page_writes on a system where partial page writes are impossible,

  I think you mean to say "full_page_writes on a system where partial page
writes are possible."
  Because if partial page writes are impossible then user should keep
full_page_writes = OFF.

> then you've intentionally broken crash recovery, and you get to keep
> both pieces.

  Robert, in broad I got your and Simon's idea that we should do
optimization of WAL (Reduce) in case update happens 
  on same page. I have implemented the final Patch which does WAL
optimization only in case when updated tuple is on same 
  page. Also we have observed that with fillfactor 80 the performance
improvement is good.

With Regards,
Amit Kapila.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] has_language_privilege returns incorrect answer for non-superuser

2012-08-30 Thread Joe Conway
On 08/30/2012 07:23 PM, Bruce Momjian wrote:
> On Thu, Jul 12, 2012 at 06:01:00PM -0700, Joe Conway wrote:
>> I'll take a look at the latter option sometime in the next few weeks and
>> submit for the next commitfest.
> 
> Any news on this?

Not yet -- OBE. I'll try to set aside some time on the long weekend.

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] has_language_privilege returns incorrect answer for non-superuser

2012-08-30 Thread Bruce Momjian
On Thu, Jul 12, 2012 at 06:01:00PM -0700, Joe Conway wrote:
> On 07/12/2012 02:53 PM, Tom Lane wrote:
> > Peter Eisentraut  writes:
> >> As long as we're spending time on this, I'd propose getting rid of
> >> lanplistrusted, at least for access checking.  Instead, just don't
> >> install USAGE privileges by default for those languages.
> > 
> > There's definitely something to that idea --- certainly lanpltrusted
> > dates from before we had a robust object-permissions system, and looks
> > like a bit of a wart now that we do have one.
> > 
> > I guess we could redefine the default privileges for languages as "none"
> > and then have the TRUSTED keyword mean to install public usage
> > privilege.  Or maybe it would be safer for upgrade purposes if we kept
> > the default interpretation as-is and did an automatic REVOKE when
> > TRUSTED wasn't specified.
> 
> +1
> 
> I'll take a look at the latter option sometime in the next few weeks and
> submit for the next commitfest.

Any news on this?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] fairly useless psql compatibility warning?

2012-08-30 Thread Peter Eisentraut
psql has supported older servers for a great while now, so this sort of
things seems pretty useless now:

psql (9.2rc1, server 9.1.4)
WARNING: psql version 9.2, server version 9.1.
Some psql features might not work

I think it should be reduced to warning when connected to a newer
server.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] emacs configuration for new perltidy settings

2012-08-30 Thread Bruce Momjian
On Thu, Jul 12, 2012 at 12:35:26AM +0300, Peter Eisentraut wrote:
> This might be useful for some people.  Here is an emacs configuration
> for perl-mode that is compatible with the new perltidy settings.  Note
> that the default perl-mode settings produce indentation that will be
> completely shredded by the new perltidy settings.
> 
> (defun pgsql-perl-style ()
>   "Perl style adjusted for PostgreSQL project"
>   (interactive)
>   (setq tab-width 4)
>   (setq perl-indent-level 4)
>   (setq perl-continued-statement-offset 4)
>   (setq perl-continued-brace-offset 4)
>   (setq perl-brace-offset 0)
>   (setq perl-brace-imaginary-offset 0)
>   (setq perl-label-offset -2))
> 
> (add-hook 'perl-mode-hook
>(lambda ()
>  (if (string-match "postgresql" buffer-file-name)
>  (pgsql-perl-style

Added to src/tools/editors/emacs.samples;  applied patch attached.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/tools/editors/emacs.samples b/src/tools/editors/emacs.samples
new file mode 100644
index d9cfa2f..c8d8d07
*** a/src/tools/editors/emacs.samples
--- b/src/tools/editors/emacs.samples
***
*** 12,17 
--- 12,19 
  
  ;
  
+ ;;; Mode for C files to match src/tools/pgindent/pgindent formatting
+ 
  ;;; This set is known to work with old versions of emacs
  
  (setq auto-mode-alist
***
*** 80,85 
--- 82,107 
  
  ;
  
+ ;;; Mode for Perl files to match src/tools/pgindent/perltidyrc formatting
+ 
+ (defun pgsql-perl-style ()
+   "Perl style adjusted for PostgreSQL project"
+   (interactive)
+   (setq tab-width 4)
+   (setq perl-indent-level 4)
+   (setq perl-continued-statement-offset 4)
+   (setq perl-continued-brace-offset 4)
+   (setq perl-brace-offset 0)
+   (setq perl-brace-imaginary-offset 0)
+   (setq perl-label-offset -2))
+ 
+ (add-hook 'perl-mode-hook
+(lambda ()
+  (if (string-match "postgresql" buffer-file-name)
+  (pgsql-perl-style
+ 
+ ;
+ 
  ;;; To work on the documentation, the following (or a variant, as above)
  ;;; can be helpful.
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Peter Eisentraut
On Thu, 2012-08-30 at 17:36 -0400, Tom Lane wrote:
> Joe Abbate  writes:
> > As an aside, I installed jade (on Debian) and tried to make world but
> > got several errors, starting with the following:
> 
> > jade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .  -d
> > stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
> > jade:E: unknown warning type "fully-tagged"
> 
> FWIW, that suggests that this version of jade is too old.  I'm not sure
> that jade per se (as opposed to the successor project openjade) can be
> used to build our docs at all --- you should check whether this is
> openjade, or really the original project.

This is a bit bizarre, actually.  His problem is that the old version of
jade doesn't understand the -wfully-tagged warning option.  But the
comment in the Makefile says

# -wfully-tagged needed to throw a warning on missing tags
# for older tool chains, 2007-08-31

AFAICT, the desirable effect of all these options together is to warn
about empty start tags, but only openjade supports -wfully-tagged, and
even the most recent version needs it to produce that warning.

Of course there could have been intermediate versions that I don't have
access to right now.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Jaime Casanova
On Thu, Aug 30, 2012 at 7:47 PM, Jaime Casanova  wrote:
> On Thu, Aug 30, 2012 at 4:58 PM, Joe Abbate  wrote:
>> On 30/08/12 17:36, Tom Lane wrote:
>>> FWIW, that suggests that this version of jade is too old.  I'm not sure
>>> that jade per se (as opposed to the successor project openjade) can be
>>> used to build our docs at all --- you should check whether this is
>>> openjade, or really the original project.
>>
>> It was the old jade.  After I installed openjade, as suggested by Alvaro
>> and Jeff Janes, and re-ran ./configure the invocation line changed to
>> use openjade.
>>
>
> so, now the question is: should we accept jade at all in configure? or
> should we fail after not finding jade and report why?
>

the last one should read "openjade" ;)


-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Jaime Casanova
On Thu, Aug 30, 2012 at 4:58 PM, Joe Abbate  wrote:
> On 30/08/12 17:36, Tom Lane wrote:
>> FWIW, that suggests that this version of jade is too old.  I'm not sure
>> that jade per se (as opposed to the successor project openjade) can be
>> used to build our docs at all --- you should check whether this is
>> openjade, or really the original project.
>
> It was the old jade.  After I installed openjade, as suggested by Alvaro
> and Jeff Janes, and re-ran ./configure the invocation line changed to
> use openjade.
>

so, now the question is: should we accept jade at all in configure? or
should we fail after not finding jade and report why?

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Jeff Janes
On Thu, Aug 30, 2012 at 2:05 PM, Jeff Janes  wrote:
> On Thu, Aug 30, 2012 at 1:18 PM, Joe Abbate  wrote:
>>
>>gmake world
>>
>> Unfortunately, that failed because the doc build requires jade.  I
>> managed to build contrib separately, but wanted to point out that jade
>> is not mentioned in the requirements page
>> (http://www.postgresql.org/docs/9.2/static/install-requirements.html ).
>
> That page should probably point out that there are additional
> requirements to build the documentation, and link to the relevant
> place that describes those.

patch to do that attached.

Cheers,

Jeff


doc_require_doc_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - random sampling of transaction written into log

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 3:48 PM, Tomas Vondra  wrote:
> That sounds like a pretty trivial patch. I've been thinking about yet
> another option - histograms (regular or with exponential bins).

I thought about that, too, but I think high-outliers is a lot more
useful.  At least for the kinds of things I worry about.

> What I'm not sure about is which of these options should be allowed at the
> same time - to me, combinations like 'sampling + aggregation' don't make
> much sense. Maybe except 'latency-only-if-more-than + aggregation'.

Maybe, but perhaps not even.  I don't have a problem with saying that
the user is allowed to pick at most one method of reducing the output
volume.  I think we could say: either sample, or take high outliers,
or aggregate.  If we want to make some of those work in combination,
fine, but I don't think it's absolutely required.  What WILL be
required is a clear error message telling you what you did wrong if
you use an unsupported combination.

BTW, I think that using any of these options should automatically
enable -l, rather than requiring it to be specified separately.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 4:51 PM, Tom Lane  wrote:
> Bruce Momjian  writes:
>> On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
 Ok, I modified the part of pg_dump where tremendous number of LOCK
 TABLE are issued. I replace them with single LOCK TABLE with multiple
 tables. With 100k tables LOCK statements took 13 minutes in total, now
 it only takes 3 seconds. Comments?
>
>>> Shall I commit to master and all supported branches?
>
>> Was this applied?
>
> No, we fixed the server side instead.

But only for 9.2, right?  So people running back branches are still screwed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - aggregation of info written into log

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 3:25 PM, Tomas Vondra  wrote:
> On 30 Srpen 2012, 18:02, Robert Haas wrote:
>> On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra  wrote:
>>> This patch is a bit less polished (and more complex) than the other
>>> pgbench patch I've sent a while back, and I'm not sure how to handle the
>>> Windows branch. That needs to be fixed during the commit fest.
>>
>> What's the problem with the Windows branch?
>
> Well, there are comments about how timestamp does not work on Windows
> (filling 0), and I'm not sure how that affects the patch (e.g. determining
> the aggregation interval). I have no Windows workstation available so I
> can't actually try that.

Hmm.  That seems like it might be something we need to fix first...

>> Could you un-cuddle your brances to follow the PostgreSQL style, omit
>> braces around single-statement blocks, and remove the spurious
>> whitespace changes?
>
> OK, will do.
>
>> Instead of having both use_log_agg and naggseconds, I think you can
>> get by with just having a single variable that is zero if aggregation
>> is not in use and is otherwise the aggregation period.  On a related
>> note, you can't specify -A without an associated value, so there is no
>> point in documenting a "default".  As with your other patch, I suggest
>> using a long option name like --latency-aggregate-interval and then
>> making the name of the variable in the code match the option name,
>> with s/-/_/g, for clarity.
>
> Why --latency-aggregate-interval? It has nothing to do with latencies,
> it's merely number of transactions per interval.

Oh, I thought it was modifying the behavior of -l.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 4:48 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Aug 30, 2012 at 4:15 PM, Tom Lane  wrote:
>>> Yeah, the idea of replacing sum_grow with a boolean just occurred to me
>>> too.  As is, I think the code is making some less-than-portable
>>> assumptions about what will happen if sum_grow overflows; which can
>>> definitely happen, seeing that gistpenalty and its callees intentionally
>>> return infinity in some cases.  I'd rather it didn't attempt to add
>>> column penalties together, and I think there's a case to be made that
>>> not doing so is a back-patchable bug fix.
>
>> Keep in mind that the worst case outcome is the index quality is worse
>> than it otherwise would have been, so it's not like
>> OMG-PostgreSQ-eats-your-data.
>
> Agreed, but we've seen plenty of complaining about bloated gist indexes,
> and this might be the cause.

More likely one of several, but sure.

>>> I'll take a shot at improving this some more.
>
>> Okey dokey.
>
> Attached is a revised version of gistchoose(); I've not yet transposed
> the changes into gistRelocateBuildBuffersOnSplit().  It looks a lot
> more readable to me now.  Objections?

Looks good to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - random sampling of transaction written into log

2012-08-30 Thread Tomas Vondra
On 30 Srpen 2012, 23:44, Robert Haas wrote:
> On Thu, Aug 30, 2012 at 3:48 PM, Tomas Vondra  wrote:
>> That sounds like a pretty trivial patch. I've been thinking about yet
>> another option - histograms (regular or with exponential bins).
>
> I thought about that, too, but I think high-outliers is a lot more
> useful.  At least for the kinds of things I worry about.

OK, let's fix the current patches first. We can add more options later.

>
>> What I'm not sure about is which of these options should be allowed at
>> the
>> same time - to me, combinations like 'sampling + aggregation' don't make
>> much sense. Maybe except 'latency-only-if-more-than + aggregation'.
>
> Maybe, but perhaps not even.  I don't have a problem with saying that
> the user is allowed to pick at most one method of reducing the output
> volume.  I think we could say: either sample, or take high outliers,
> or aggregate.  If we want to make some of those work in combination,
> fine, but I don't think it's absolutely required.  What WILL be
> required is a clear error message telling you what you did wrong if
> you use an unsupported combination.

I'll allow a single option - we can enable combinations that make sense
later.

>
> BTW, I think that using any of these options should automatically
> enable -l, rather than requiring it to be specified separately.

Good idea.

Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Joe Abbate
On 30/08/12 17:36, Tom Lane wrote:
> FWIW, that suggests that this version of jade is too old.  I'm not sure
> that jade per se (as opposed to the successor project openjade) can be
> used to build our docs at all --- you should check whether this is
> openjade, or really the original project.

It was the old jade.  After I installed openjade, as suggested by Alvaro
and Jeff Janes, and re-ran ./configure the invocation line changed to
use openjade.

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq compression

2012-08-30 Thread Bruce Momjian
On Sun, Jun 17, 2012 at 11:45:54PM +0800, Magnus Hagander wrote:
> On Sun, Jun 17, 2012 at 11:42 PM, Tom Lane  wrote:
> > Magnus Hagander  writes:
> >> Is there a reason why we don't have a parameter on the client
> >> mirroring ssl_ciphers?
> >
> > Dunno, do we need one?  I am not sure what the cipher negotiation process
> > looks like or which side has the freedom to choose.
> 
> I haven't looked into the details, but it seems reasonable that
> *either* side should be able to at least define a list of ciphers it
> *doens't* want to talk with.
> 
> Do we need it - well, it makes sense for the client to be able to say
> "I won't trust 56-bit encryption" before it sends over the password,
> imo..
> 
> 
> >> That, or just have DEFAULT as being the default (which in current
> >> openssl means ALL:!aNULL:!eNULL.
> >
> > If our default isn't the same as the underlying default, I have to
> > question why not.
> 
> Yeah, that's exaclty what I'm questioning here..
> 
> >  But are you sure this "!" notation will work with
> > all openssl versions?
> 
> Uh. We have the ! notation in our default *now*. What openssl also
> supports is the text "DEFAULT", which is currently the equivalent of
> "ALL!aNULL!eNULL". The question, which is valid of course, should be
> if "DEFAULT" works with all openssl versions.
> 
> It would seem reasonable it does, but I haven't investigated.

Do we want to change our ssl_ciphers default to 'DEFAULT'?  Currently it
is 'ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH'.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - aggregation of info written into log

2012-08-30 Thread Tomas Vondra
On 30 Srpen 2012, 23:47, Robert Haas wrote:
> On Thu, Aug 30, 2012 at 3:25 PM, Tomas Vondra  wrote:
>> On 30 Srpen 2012, 18:02, Robert Haas wrote:
>>> On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra  wrote:
 This patch is a bit less polished (and more complex) than the other
 pgbench patch I've sent a while back, and I'm not sure how to handle
 the
 Windows branch. That needs to be fixed during the commit fest.
>>>
>>> What's the problem with the Windows branch?
>>
>> Well, there are comments about how timestamp does not work on Windows
>> (filling 0), and I'm not sure how that affects the patch (e.g.
>> determining
>> the aggregation interval). I have no Windows workstation available so I
>> can't actually try that.
>
> Hmm.  That seems like it might be something we need to fix first...
>
>>> Could you un-cuddle your brances to follow the PostgreSQL style, omit
>>> braces around single-statement blocks, and remove the spurious
>>> whitespace changes?
>>
>> OK, will do.
>>
>>> Instead of having both use_log_agg and naggseconds, I think you can
>>> get by with just having a single variable that is zero if aggregation
>>> is not in use and is otherwise the aggregation period.  On a related
>>> note, you can't specify -A without an associated value, so there is no
>>> point in documenting a "default".  As with your other patch, I suggest
>>> using a long option name like --latency-aggregate-interval and then
>>> making the name of the variable in the code match the option name,
>>> with s/-/_/g, for clarity.
>>
>> Why --latency-aggregate-interval? It has nothing to do with latencies,
>> it's merely number of transactions per interval.
>
> Oh, I thought it was modifying the behavior of -l.

It does, but AFAIK the "-l" means logging. I suppose
"--aggregate-interval" would be a good option name, I don't see a reason
to put there the additional word when there are other aggregated values
(e.g. num of transactions).

T.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 3:17 PM, Tomas Vondra  wrote:
> On 30 Srpen 2012, 17:53, Robert Haas wrote:
>> On Fri, Aug 24, 2012 at 6:36 PM, Tomas Vondra  wrote:
>>> attached is a patch that improves performance when dropping multiple
>>> tables within a transaction. Instead of scanning the shared buffers for
>>> each table separately, the patch removes this and evicts all the tables
>>> in a single pass through shared buffers.
>>>
>>> Our system creates a lot of "working tables" (even 100.000) and we need
>>> to perform garbage collection (dropping obsolete tables) regularly. This
>>> often took ~ 1 hour, because we're using big AWS instances with lots of
>>> RAM (which tends to be slower than RAM on bare hw). After applying this
>>> patch and dropping tables in groups of 100, the gc runs in less than 4
>>> minutes (i.e. a 15x speed-up).
>>>
>>> This is not likely to improve usual performance, but for systems like
>>> ours, this patch is a significant improvement.
>>
>> Seems pretty reasonable.  But instead of duplicating so much code,
>> couldn't we find a way to use replace DropRelFileNodeAllBuffers with
>> DropRelFileNodeAllBuffersList?  Surely anyone who was planning to call
>> the first one could instead call the second one with a count of one
>> and a pointer to the address of the data they were planning to pass.
>> I'd probably swap the order of arguments to
>> DropRelFileNodeAllBuffersList as well.  We could do something similar
>> with smgrdounlink/smgrdounlinkall so that, again, only one copy of the
>> code is needed.
>
> Yeah, I was thinking about that too, but I simply wasn't sure which is the
> best choice so I've sent the raw patch. OTOH these functions are called on
> a very limited number of places, so a refactoring like this seems fine.

If there are enough call sites then DropRelFileNodeAllBuffers could
become a one-line function that simply calls
DropRelFileNodeAllBuffersList(1, &arg).  But we should avoid
duplicating the code.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pg default's verbosity?

2012-08-30 Thread Bruce Momjian
On Sun, Jun 17, 2012 at 12:00:20AM -0400, nik9...@gmail.com wrote:
> I've always used -1-f - < file.sql. It is confusing that -1 doesn't warn you 
> when it wont work though. 

This will be fixed in 9.3 with this commit:

commit be690e291d59e8d0c9f4df59abe09f1ff6cc0da9
Author: Robert Haas 
Date:   Thu Aug 9 09:59:45 2012 -0400

Make psql -1 < file behave as expected.

Previously, the -1 option was silently ignored.

Also, emit an error if -1 is used in a context where it won't be
respected, to avoid user confusion.

Original patch by Fabien COELHO, but this version is quite different
from the original submission.

---


> 
> Sent from my iPhone
> 
> On Jun 16, 2012, at 3:42 AM, Fabien COELHO  wrote:
> 
> > 
> > Hello pgdev,
> > 
> > (Second attempt)
> > 
> > I've conducted a statistical study about PostgreSQL use in OSS. One of the 
> > result is that quite a few projects have errors in their SQL setup scripts 
> > which lead to some statements to be ignored, typically somme ADD 
> > CONSTRAINTS which do not change the database schema from a functional point 
> > of view, or syntactic errors (typically a mysql syntax...) that
> > result in missing tables, but which are not found if the application is not 
> > fully tested.
> > 
> > I think that there are two reasons why these errors are not caught by 
> > application developers:
> > 
> > (1) the default verbosity is set to "notice", which is much to high. The 
> > users just get used to seeing a lot of messages on loading an sql script, 
> > and to ignore them, so that errors are just hidden in the flow of notices. 
> > I think that a better default setting would be "warnings", that is messages 
> > that require some attention from the developer.
> > 
> > (2) the default behavior of psql on errors is to keep going. Developers of 
> > SQL script that are expected to work shoud be advised to:
> > - encourage application devs to set ON_ERROR_STOP and/or use a global
> >   transaction in their script.
> > - provide a simple/short option to do that from the command line
> >   basically that could be an enhanced "-1", NOT restricted
> >   to "-f" but that would work on standard input as well.
> > 
> >   sh> psql -1 -f setup.sql # -1 does work here
> >   sh> psql -1 < setup.sql # -1 does not apply to stdin stuff...
> > 
> > 
> > So I would suggest the following todos:
> > 
> > 1 - change the default verbosity to "warning".
> > 
> > 2 - change -1 to work on stdin as well instead of being ignored,
> >or provide another option that would do that.
> > 
> > -- 
> > Fabien Coelho - coe...@cri.ensmp.fr
> > 
> > -- 
> > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-hackers
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Joe Abbate
Hello Jeff,

On 30/08/12 17:05, Jeff Janes wrote:
> I think is probably because you don't have "DocBook DTD" or some of
> the other prerequisites listed in the URL I gave above.

Indeed.  I was able to build world after invoking the apt-get line in
J.2.3 on that page.  The only adjustment I had to make is to add a
symbolic from /usr/share/sgml/openjade1.3 to /usr/share/sgml/jade
because it was looking for the 'catalog' file in the latter.

Thanks,

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Tom Lane
Joe Abbate  writes:
> As an aside, I installed jade (on Debian) and tried to make world but
> got several errors, starting with the following:

> jade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .  -d
> stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
> jade:E: unknown warning type "fully-tagged"

FWIW, that suggests that this version of jade is too old.  I'm not sure
that jade per se (as opposed to the successor project openjade) can be
used to build our docs at all --- you should check whether this is
openjade, or really the original project.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Alvaro Herrera
Excerpts from Joe Abbate's message of jue ago 30 16:18:05 -0400 2012:
> Hello hackers,
> 
> In order to test 9.2rc1, I had to build contrib (because Pyrseas uses
> some of those modules).  The build instructions
> (http://www.postgresql.org/docs/9.2/static/install-procedure.html )
> state the way to build everything (contrib + docs, etc.) is
> 
>gmake world
> 
> Unfortunately, that failed because the doc build requires jade.  I
> managed to build contrib separately, but wanted to point out that jade
> is not mentioned in the requirements page
> (http://www.postgresql.org/docs/9.2/static/install-requirements.html ).
>  In fact, searching for 'jade' returns no results.
> 
> As an aside, I installed jade (on Debian) and tried to make world but
> got several errors, starting with the following:
> 
> jade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .  -d
> stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
> jade:E: unknown warning type "fully-tagged"

I'm not sure what's the status of jade, but our Debian instructions to
build docs suggest to use openjade instead:
http://www.postgresql.org/docs/devel/static/docguide-toolsets.html

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] We're not lax enough about maximum time zone offset from UTC

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 04:51:02PM -0400, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Wed, May 30, 2012 at 06:10:12PM -0400, Tom Lane wrote:
> >> However, as pointed out by Patric, if you dump and restore an old
> >> timestamptz value in one of these zones, it will fail to restore because
> >> of the sanity check.  I think therefore that we'd better enlarge the
> >> allowed range to 15:59:59 either way.
> 
> > Any status on this?
> 
> Shipped in last week's updates.

Thank you.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Tom Lane
Bruce Momjian  writes:
> On Thu, Aug 30, 2012 at 05:01:39PM -0400, Bruce Momjian wrote:
>> Oh, got -/_ mixed up. Fixed with attached applied patch.

> Oops, that text is talking about Python's configure, so I put the text
> back.  Seemed we had _no_ mention of our own --enable-shared.

Yeah, I just realized the same.  Apparently the configure script's
--help output was indeed the only documentation.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 04:50:22PM -0400, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Mon, May 28, 2012 at 02:27:15AM +0300, Peter Eisentraut wrote:
> >> We should just remove it now.
> 
> > --disable-shared removed, with the attached, applied patch.
> 
> No documentation changes?

I couldn't find any place we document it.  I did:

grep _shared *.sgml

and no hits were returned.  Should I search for something else?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-08-30 Thread Tom Lane
Bruce Momjian  writes:
> On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
>>> Ok, I modified the part of pg_dump where tremendous number of LOCK
>>> TABLE are issued. I replace them with single LOCK TABLE with multiple
>>> tables. With 100k tables LOCK statements took 13 minutes in total, now
>>> it only takes 3 seconds. Comments?

>> Shall I commit to master and all supported branches?

> Was this applied?

No, we fixed the server side instead.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 05:01:39PM -0400, Bruce Momjian wrote:
> On Thu, Aug 30, 2012 at 04:57:30PM -0400, Tom Lane wrote:
> > Bruce Momjian  writes:
> > > On Thu, Aug 30, 2012 at 04:50:22PM -0400, Tom Lane wrote:
> > >> No documentation changes?
> > 
> > > I couldn't find any place we document it.  I did:
> > >   grep _shared *.sgml
> > > and no hits were returned.  Should I search for something else?
> > 
> > It's --enable-shared, not --enable_shared.  I see at least one hit, in
> > installation.sgml.
> 
> Oh, got -/_ mixed up. Fixed with attached applied patch.

Oops, that text is talking about Python's configure, so I put the text
back.  Seemed we had _no_ mention of our own --enable-shared.

---


> 
> -- 
>   Bruce Momjian  http://momjian.us
>   EnterpriseDB http://enterprisedb.com
> 
>   + It's impossible for everything to be true. +

> diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
> new file mode 100644
> index c02ed87..81cd550
> *** a/doc/src/sgml/installation.sgml
> --- b/doc/src/sgml/installation.sgml
> *** su - postgres
> *** 234,241 
>
>   
>
> !   If you have problems, run Python 2.3 or later's
> !   configure using the --enable-shared flag.  On some
> operating systems you don't have to build a shared library, but
> you will have to convince the PostgreSQL build
> system of this.  Consult the Makefile in
> --- 234,240 
>
>   
>
> !   On some
> operating systems you don't have to build a shared library, but
> you will have to convince the PostgreSQL build
> system of this.  Consult the Makefile in

> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Jeff Janes
On Thu, Aug 30, 2012 at 1:18 PM, Joe Abbate  wrote:
> Hello hackers,
>
> In order to test 9.2rc1, I had to build contrib (because Pyrseas uses
> some of those modules).  The build instructions
> (http://www.postgresql.org/docs/9.2/static/install-procedure.html )
> state the way to build everything (contrib + docs, etc.) is
>
>gmake world
>
> Unfortunately, that failed because the doc build requires jade.  I
> managed to build contrib separately, but wanted to point out that jade
> is not mentioned in the requirements page
> (http://www.postgresql.org/docs/9.2/static/install-requirements.html ).

That page should probably point out that there are additional
requirements to build the documentation, and link to the relevant
place that describes those.

>  In fact, searching for 'jade' returns no results.

Searching in the 9.2 docs from the web page doesn't work at all, I'm
not sure why.  You can search for "jade" in 9.1, and then once you
find the page there is a cross link to the 9.2 version of it:

http://www.postgresql.org/docs/9.2/static/docguide-toolsets.html

>
> As an aside, I installed jade (on Debian) and tried to make world but
> got several errors, starting with the following:
>
> jade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .  -d
> stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
> jade:E: unknown warning type "fully-tagged"

I think is probably because you don't have "DocBook DTD" or some of
the other prerequisites listed in the URL I gave above.  Or maybe you
just need to re-run "make maintainer-clean" and "./configure" after
installing jade.  I've been tripped up by both issues in the past.


Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 04:51:56PM -0400, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
> >>> Ok, I modified the part of pg_dump where tremendous number of LOCK
> >>> TABLE are issued. I replace them with single LOCK TABLE with multiple
> >>> tables. With 100k tables LOCK statements took 13 minutes in total, now
> >>> it only takes 3 seconds. Comments?
> 
> >> Shall I commit to master and all supported branches?
> 
> > Was this applied?
> 
> No, we fixed the server side instead.

Again, thanks.  I knew we fixed the server, but wasn't clear that made
the client changes unnecessary, but I think I now do remember discussion
about that.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] splitting htup.h

2012-08-30 Thread Alvaro Herrera
Excerpts from Alvaro Herrera's message of mié ago 29 15:13:11 -0400 2012:
> Excerpts from Andres Freund's message of mié ago 29 12:10:17 -0400 2012:
> 
> > > OK, scratch that thought then.  So we seem to be down to choosing a new
> > > name for what we're going to take out of htup.h.  If you don't like
> > > heap.h, maybe something like heap_tuple.h?  I'm not terribly excited
> > > about it either way though.  Any other ideas out there?
> > htup_details.h? That doesn't have the sound of "fringe details" that 
> > htup_private.h has.
> 
> htup_details.h seems reasonable, thanks.

Committed, with a slight adjustment that I had left out of this patch
but was present in a previous version of it: the function prototypes,
which in the last posted patch were to remain in htup.h, are now in
htup_details.h.  So I removed access/tupdesc.h from htup.h.  I had to
add #include "access/htup_details.h" to half a dozen more .c files but
that seemed a good tradeoff.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 04:57:30PM -0400, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Thu, Aug 30, 2012 at 04:50:22PM -0400, Tom Lane wrote:
> >> No documentation changes?
> 
> > I couldn't find any place we document it.  I did:
> > grep _shared *.sgml
> > and no hits were returned.  Should I search for something else?
> 
> It's --enable-shared, not --enable_shared.  I see at least one hit, in
> installation.sgml.

Oh, got -/_ mixed up. Fixed with attached applied patch.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
new file mode 100644
index c02ed87..81cd550
*** a/doc/src/sgml/installation.sgml
--- b/doc/src/sgml/installation.sgml
*** su - postgres
*** 234,241 
   
  
   
!   If you have problems, run Python 2.3 or later's
!   configure using the --enable-shared flag.  On some
operating systems you don't have to build a shared library, but
you will have to convince the PostgreSQL build
system of this.  Consult the Makefile in
--- 234,240 
   
  
   
!   On some
operating systems you don't have to build a shared library, but
you will have to convince the PostgreSQL build
system of this.  Consult the Makefile in

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Tom Lane
Bruce Momjian  writes:
> On Thu, Aug 30, 2012 at 04:50:22PM -0400, Tom Lane wrote:
>> No documentation changes?

> I couldn't find any place we document it.  I did:
>   grep _shared *.sgml
> and no hits were returned.  Should I search for something else?

It's --enable-shared, not --enable_shared.  I see at least one hit, in
installation.sgml.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] We're not lax enough about maximum time zone offset from UTC

2012-08-30 Thread Tom Lane
Bruce Momjian  writes:
> On Wed, May 30, 2012 at 06:10:12PM -0400, Tom Lane wrote:
>> However, as pointed out by Patric, if you dump and restore an old
>> timestamptz value in one of these zones, it will fail to restore because
>> of the sanity check.  I think therefore that we'd better enlarge the
>> allowed range to 15:59:59 either way.

> Any status on this?

Shipped in last week's updates.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Tom Lane
Bruce Momjian  writes:
> On Mon, May 28, 2012 at 02:27:15AM +0300, Peter Eisentraut wrote:
>> We should just remove it now.

> --disable-shared removed, with the attached, applied patch.

No documentation changes?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Tom Lane
Robert Haas  writes:
> On Thu, Aug 30, 2012 at 4:15 PM, Tom Lane  wrote:
>> Yeah, the idea of replacing sum_grow with a boolean just occurred to me
>> too.  As is, I think the code is making some less-than-portable
>> assumptions about what will happen if sum_grow overflows; which can
>> definitely happen, seeing that gistpenalty and its callees intentionally
>> return infinity in some cases.  I'd rather it didn't attempt to add
>> column penalties together, and I think there's a case to be made that
>> not doing so is a back-patchable bug fix.

> Keep in mind that the worst case outcome is the index quality is worse
> than it otherwise would have been, so it's not like
> OMG-PostgreSQ-eats-your-data.

Agreed, but we've seen plenty of complaining about bloated gist indexes,
and this might be the cause.

>> I'll take a shot at improving this some more.

> Okey dokey.

Attached is a revised version of gistchoose(); I've not yet transposed
the changes into gistRelocateBuildBuffersOnSplit().  It looks a lot
more readable to me now.  Objections?

regards, tom lane

diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 7c6ce8495caac1e9d7c99afdb513689de93beea5..48d70d98e25363d8425020be7607f20348a0456a 100644
*** a/src/backend/access/gist/gistutil.c
--- b/src/backend/access/gist/gistutil.c
*** gistgetadjusted(Relation r, IndexTuple o
*** 363,475 
  }
  
  /*
!  * Search a page for the entry with lowest penalty.
   *
!  * The index may have multiple columns, and there's a penalty value for each column.
!  * The penalty associated with a column which appears earlier in the index definition is
!  * strictly more important than the penalty of column which appears later in the index
!  * definition.
   */
  OffsetNumber
  gistchoose(Relation r, Page p, IndexTuple it,	/* it has compressed entry */
  		   GISTSTATE *giststate)
  {
  	OffsetNumber maxoff;
  	OffsetNumber i;
! 	OffsetNumber which;
! 	float		sum_grow,
! which_grow[INDEX_MAX_KEYS];
  	GISTENTRY	entry,
  identry[INDEX_MAX_KEYS];
  	bool		isnull[INDEX_MAX_KEYS];
  
! 	maxoff = PageGetMaxOffsetNumber(p);
! 	*which_grow = -1.0;
! 	which = InvalidOffsetNumber;
! 	sum_grow = 1;
  	gistDeCompressAtt(giststate, r,
  	  it, NULL, (OffsetNumber) 0,
  	  identry, isnull);
  
! 	Assert(maxoff >= FirstOffsetNumber);
! 	Assert(!GistPageIsLeaf(p));
  
  	/*
! 	 * Loop over tuples on page.
  	 *
! 	 * We'll exit early if we find an index key that can accommodate the new key with no
! 	 * penalty on any column.  sum_grow is used to track this condition.  Normally, it is the
! 	 * sum of the penalties we've seen for this column so far, which is not a very useful
! 	 * quantity in general because the penalties for each column are only considered
! 	 * independently, but all we really care about is whether or not it's greater than zero.
! 	 * Since penalties can't be negative, the sum of the penalties will be greater than
! 	 * zero if and only if at least one penalty was greater than zero.  To make things just
! 	 * a bit more complicated, we arbitrarily set sum_grow to 1.0 whenever we want to force
! 	 * the at least one more iteration of this outer loop.  Any non-zero value would serve
! 	 * just as well.
  	 */
! 	for (i = FirstOffsetNumber; i <= maxoff && sum_grow; i = OffsetNumberNext(i))
  	{
- 		int			j;
  		IndexTuple	itup = (IndexTuple) PageGetItem(p, PageGetItemId(p, i));
  
! 		sum_grow = 0;
  
! 		/* Loop over indexed attribtues. */
  		for (j = 0; j < r->rd_att->natts; j++)
  		{
  			Datum		datum;
  			float		usize;
  			bool		IsNull;
  
  			datum = index_getattr(itup, j + 1, giststate->tupdesc, &IsNull);
  			gistdentryinit(giststate, j, &entry, datum, r, p, i,
  		   FALSE, IsNull);
  			usize = gistpenalty(giststate, j, &entry, IsNull,
  &identry[j], isnull[j]);
  
! 			if (which_grow[j] < 0 || usize < which_grow[j])
  			{
  /*
!  * We get here in two cases.  First, we may have just discovered that the
!  * current tuple is the best one we've seen so far; that is, for the first
!  * column for which the penalty is not equal to the best tuple seen so far,
!  * this one has a lower penalty than the previously-seen one.  But, when
!  * a new best tuple is found, we must record the best penalty value for
!  * all the remaining columns.  We'll end up here for each remaining index
!  * column in that case, too.
   */
! which = i;
! which_grow[j] = usize;
  if (j < r->rd_att->natts - 1)
! 	which_grow[j + 1] = -1;
! sum_grow += which_grow[j];
  			}
! 			else if (which_grow[j] == usize)
  			{
  /*
!  * The current tuple is exactly as good for this column as the best tuple
!  * seen so far.  The next iteration of this loop will compare the next
!  * column.
   */
- sum_grow += usize;
  			}
  			else
  			{
  /*
!  * The current tuple is worse for this column than the best tup

Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-08-30 Thread Bruce Momjian
On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
> >> Yeah, Jeff's experiments indicated that the remaining bottleneck is lock
> >> management in the server.  What I fixed so far on the pg_dump side
> >> should be enough to let partial dumps run at reasonable speed even if
> >> the whole database contains many tables.  But if psql is taking
> >> AccessShareLock on lots of tables, there's still a problem.
> > 
> > Ok, I modified the part of pg_dump where tremendous number of LOCK
> > TABLE are issued. I replace them with single LOCK TABLE with multiple
> > tables. With 100k tables LOCK statements took 13 minutes in total, now
> > it only takes 3 seconds. Comments?
> 
> Shall I commit to master and all supported branches?

Was this applied?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] We're not lax enough about maximum time zone offset from UTC

2012-08-30 Thread Bruce Momjian
On Wed, May 30, 2012 at 06:10:12PM -0400, Tom Lane wrote:
> Currently, our datetime input code thinks that any UTC offset of more
> than 14:59:59 either way from Greenwich must be a mistake.  However,
> after seeing Patric Bechtel's recent bug report, I went trolling in the
> Olson timezone files to see what are the largest offsets used there.
> I found three entries that are further out than that:
> 
> # ZoneNAMEGMTOFF  RULES   FORMAT  [UNTIL]
> Zone Asia/Manila  -15:56:00 - LMT 1844 Dec 31
> Zone America/Juneau15:02:19 - LMT 1867 Oct 18
> Zone America/Metlakatla15:13:42 - LMT 1867 Oct 18
> 
> These are all ancient history of course; it does not appear that any
> zones *currently* use offsets larger than +/- 14 hours, which if memory
> serves is what we considered when we set the existing sanity limit.
> 
> However, as pointed out by Patric, if you dump and restore an old
> timestamptz value in one of these zones, it will fail to restore because
> of the sanity check.  I think therefore that we'd better enlarge the
> allowed range to 15:59:59 either way.

Any status on this?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance patch for Win32

2012-08-30 Thread Bruce Momjian
On Tue, May 29, 2012 at 03:54:59PM -0700, Mark Dilger wrote:
> I was imagining that this would be a trap for linux developers
> who saw nothing wrong with their code until it made it to the
> build/test farm.  That's pretty far down the development
> process.  Of course, it is also a trap in the other direction, for
> Windows developers who use the pattern but do not include
> anything equivalent for the non-Windows execution path.
> 
> On the whole, however, your argument in favor of tighter
> patterns might be more convincing than my argument in favor
> of catching bugs sooner.
> 
> I will start implementing your suggestion for patch v2.

Any progress on this?

---


> 
> ━━━
> From: Tom Lane 
> To: Mark Dilger 
> Cc: "pgsql-hackers@postgresql.org" 
> Sent: Tuesday, May 29, 2012 3:42 PM
> Subject: Re: [HACKERS] Performance patch for Win32
> 
> Mark Dilger  writes:
> > I am hesitant to write a function like AllocateDirWithFilePattern
> > if the pattern is simply ignored on non-Windows.  In my patch,
> > the pattern underspecified the files, and the ad-hoc matching code
> > applied to all the returned files tightened that up.  But a person
> > could just as well overspecify the pattern and then they would get
> > different behavior on Windows vs. non-Windows, with fewer
> > files returned by FindNextFile() than would have matched the
> > ad-hoc pattern.
> 
> Well, if you're imagining that we wouldn't need to test carefully on
> both Windows and non-Windows, I think that's a pipe dream.  As an
> example, your proposal of AllocateDirWithFilePrefix would only work
> consistently across platforms if the prefix didn't contain anything
> that Windows thought was a pattern metacharacter.  (This might never
> come up, but I'm not too sure what the metacharacters are on Windows.)
> 
> Having said that, I have nothing particularly against the idea of
> specifying a prefix rather than an arbitrary pattern.  I'm just
> saying it'll still need testing.  Also, I wonder how many of the
> potential stat-equivalent operations we'll be unable to optimize
> away with the more restricted definition.  Using a tighter pattern
> on Windows seems basically free (modulo testing) if we accept that
> it's Windows-only.
> 
> regards, tom lane
> 
> 

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Bruce Momjian
On Mon, May 28, 2012 at 02:27:15AM +0300, Peter Eisentraut wrote:
> On lör, 2012-05-26 at 15:53 -0400, Tom Lane wrote:
> > 2. Seeing that this is the first complaint since 9.0, should we decide
> > that --disable-shared is no longer worth supporting?  Seems like we
> > should either make this case work or remove this switch.
> 
> I think the last remaining use was the QNX port, which didn't support
> shared libraries.
> 
> We should just remove it now.

--disable-shared removed, with the attached, applied patch.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/configure b/configure
new file mode 100755
index 6a89cca..8f59c93
*** a/configure
--- b/configure
*** LCOV
*** 748,754 
  GCOV
  enable_debug
  enable_rpath
- enable_shared
  default_port
  WANTED_LANGUAGES
  enable_nls
--- 748,753 
*** with_libs
*** 810,816 
  enable_integer_datetimes
  enable_nls
  with_pgport
- enable_shared
  enable_rpath
  enable_spinlocks
  enable_debug
--- 809,814 
*** Optional Features:
*** 1490,1496 
disable 64-bit integer date/time support
--enable-nls[=LANGUAGES]
enable Native Language Support
-   --disable-shareddo not build shared libraries
--disable-rpath do not embed shared library search path in
executables
--disable-spinlocks do not use spinlocks
--- 1488,1493 
*** _ACEOF
*** 2471,2506 
  
  
  
- #
- # Option to disable shared libraries
- #
- 
- 
- # Check whether --enable-shared was given.
- if test "${enable_shared+set}" = set; then
-   enableval=$enable_shared;
-   case $enableval in
- yes)
-   :
-   ;;
- no)
-   :
-   ;;
- *)
-   { { $as_echo "$as_me:$LINENO: error: no argument expected for --enable-shared option" >&5
- $as_echo "$as_me: error: no argument expected for --enable-shared option" >&2;}
-{ (exit 1); exit 1; }; }
-   ;;
-   esac
- 
- else
-   enable_shared=yes
- 
- fi
- 
- 
- 
- 
  #
  # '-rpath'-like feature can be disabled
  #
--- 2468,2473 
diff --git a/configure.in b/configure.in
new file mode 100644
index fa48a2b..3acefa1
*** a/configure.in
--- b/configure.in
*** AC_DEFINE_UNQUOTED(DEF_PGPORT_STR, "${de
*** 164,176 
  AC_SUBST(default_port)
  
  #
- # Option to disable shared libraries
- #
- PGAC_ARG_BOOL(enable, shared, yes,
-   [do not build shared libraries])
- AC_SUBST(enable_shared)
- 
- #
  # '-rpath'-like feature can be disabled
  #
  PGAC_ARG_BOOL(enable, rpath, yes,
--- 164,169 
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
new file mode 100644
index 5b43819..1e3b401
*** a/src/Makefile.global.in
--- b/src/Makefile.global.in
*** with_libxml	= @with_libxml@
*** 165,171 
  with_libxslt	= @with_libxslt@
  with_system_tzdata = @with_system_tzdata@
  with_zlib	= @with_zlib@
- enable_shared	= @enable_shared@
  enable_rpath	= @enable_rpath@
  enable_nls	= @enable_nls@
  enable_debug	= @enable_debug@
--- 165,170 
*** endif
*** 397,409 
  # isn't created with the same link flags as libpq, it can't be used.)
  libpq = -L$(libpq_builddir) -lpq
  
- # If doing static linking, shared library dependency info isn't available,
- # so add in the libraries that libpq depends on.
- ifeq ($(enable_shared), no)
- libpq += $(filter -lintl -lssl -lcrypto -lkrb5 -lcrypt, $(LIBS)) \
- 	$(LDAP_LIBS_FE) $(PTHREAD_LIBS)
- endif
- 
  # This macro is for use by client executables (not libraries) that use libpq.
  # We force clients to pull symbols from the non-shared library libpgport
  # rather than pulling some libpgport symbols from libpq just because
--- 396,401 
diff --git a/src/Makefile.shlib b/src/Makefile.shlib
new file mode 100644
index 294d10f..4da2f10
*** a/src/Makefile.shlib
--- b/src/Makefile.shlib
*** LINK.static = $(AR) $(AROPT)
*** 81,100 
  
  ifdef SO_MAJOR_VERSION
  # Default library naming convention used by the majority of platforms
- ifeq ($(enable_shared), yes)
  shlib		= lib$(NAME)$(DLSUFFIX).$(SO_MAJOR_VERSION).$(SO_MINOR_VERSION)
  shlib_major	= lib$(NAME)$(DLSUFFIX).$(SO_MAJOR_VERSION)
  shlib_bare	= lib$(NAME)$(DLSUFFIX)
- endif
  # Testing the soname variable is a reliable way to determine whether a
  # linkable library is being built.
  soname		= $(shlib_major)
  else
  # Naming convention for dynamically loadable modules
- ifeq ($(enable_shared), yes)
  shlib		= $(NAME)$(DLSUFFIX)
  endif
- endif
  stlib		= lib$(NAME).a
  
  ifndef soname
--- 81,96 
*** $(stlib): $(OBJS) | $(SHLIB_PREREQS)
*** 321,327 
  	$(RANLIB) $@
  endif #haslibarule
  
- ifeq ($(enable_shared), yes)
  
  ifeq (,$(filter cygwin win32,$(PORTNAME)))
  ifneq ($(PORTNAME), aix)
--- 317,322 
*** $(stlib): $(shlib) $(DLL_DEFFIL

Re: [HACKERS] effective_io_concurrency

2012-08-30 Thread Peter Geoghegan
On 30 August 2012 20:28, Robert Haas  wrote:
> On Sat, Jul 28, 2012 at 4:09 PM, Jeff Janes  wrote:
>> But it might be better yet to make ordinary index scans benefit from
>> effective_io_concurrency, but even if/when that gets done it would
>> probably still be worthwhile to make the planner understand the
>> benefit.
>
> That sounds good too, but separate.

Indeed. The original effective_io_concurrency commit message said:

"""
***SNIP***

(The best way to handle this for plain index scans is still under debate,
so that part is not applied yet --- tgl)
"""

...seems like a pity that this debate never reached a useful conclusion.

Just how helpful is effective_io_concurrency? Did someone produce a
benchmark at some point?

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 4:15 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I noticed all that, but didn't feel like putting in the effort to make
>> it better.  I would have been happy to have someone else pick up the
>> patch, but as it had been languishing I thought it would be better to
>> get it committed more or less as it was than to wait for someone to
>> have time to make it beautiful.  If you want to hack on it more that's
>> fine with me.  I kind of wonder if we ought to rename the variables,
>> and maybe turn sum_grow into a boolean.   But I'm not really eager to
>> go crazy if this is something we have to back-patch.
>
> Yeah, the idea of replacing sum_grow with a boolean just occurred to me
> too.  As is, I think the code is making some less-than-portable
> assumptions about what will happen if sum_grow overflows; which can
> definitely happen, seeing that gistpenalty and its callees intentionally
> return infinity in some cases.  I'd rather it didn't attempt to add
> column penalties together, and I think there's a case to be made that
> not doing so is a back-patchable bug fix.

Keep in mind that the worst case outcome is the index quality is worse
than it otherwise would have been, so it's not like
OMG-PostgreSQ-eats-your-data.

> I'll take a shot at improving this some more.

Okey dokey.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Joe Abbate
Hello hackers,

In order to test 9.2rc1, I had to build contrib (because Pyrseas uses
some of those modules).  The build instructions
(http://www.postgresql.org/docs/9.2/static/install-procedure.html )
state the way to build everything (contrib + docs, etc.) is

   gmake world

Unfortunately, that failed because the doc build requires jade.  I
managed to build contrib separately, but wanted to point out that jade
is not mentioned in the requirements page
(http://www.postgresql.org/docs/9.2/static/install-requirements.html ).
 In fact, searching for 'jade' returns no results.

As an aside, I installed jade (on Debian) and tried to make world but
got several errors, starting with the following:

jade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .  -d
stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
jade:E: unknown warning type "fully-tagged"

Best regards,

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Tom Lane
Robert Haas  writes:
> I noticed all that, but didn't feel like putting in the effort to make
> it better.  I would have been happy to have someone else pick up the
> patch, but as it had been languishing I thought it would be better to
> get it committed more or less as it was than to wait for someone to
> have time to make it beautiful.  If you want to hack on it more that's
> fine with me.  I kind of wonder if we ought to rename the variables,
> and maybe turn sum_grow into a boolean.   But I'm not really eager to
> go crazy if this is something we have to back-patch.

Yeah, the idea of replacing sum_grow with a boolean just occurred to me
too.  As is, I think the code is making some less-than-portable
assumptions about what will happen if sum_grow overflows; which can
definitely happen, seeing that gistpenalty and its callees intentionally
return infinity in some cases.  I'd rather it didn't attempt to add
column penalties together, and I think there's a case to be made that
not doing so is a back-patchable bug fix.

I'll take a shot at improving this some more.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 3:27 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Aug 30, 2012 at 1:27 PM, Tom Lane  wrote:
>>> Should we backpatch that?
>
>> Arguably, yes.  Does the patch look sane to you?
>
> I was afraid you'd ask that.
>
> [ studies code for awhile ... ]
>
> I think this fixes the bug, but the function could really do with slightly
> more invasive code adjustments for clarity.  For instance, we could get
> rid of the "sum_grow = 1" kluge if we removed the "&& sum_grow" from the
> outer for statement (where it's pretty damn ugly/surprising anyway ---
> I dislike for loops that look like they're just running a counter when
> they're doing other things too) and instead put something like this at
> the bottom of the outer loop:
>
> /*
>  * If we examined all the columns and there is zero penalty for
>  * all of them, we can stop examining tuples; this is a good one
>  * to insert the new key into.
>  */
> if (sum_grow == 0 && j == r->rd_att->natts)
> break;
>
> I'm not thrilled with the added comments either; they need copy-editing
> and they randomly fail to fit in an 80-column window.  (pgindent will
> have something to say about that later for some of them, but I think it
> doesn't reformat comments that start at the left margin.)

Sorry.  I must have had my window sized wrong.  At any rate, as far as
the GiST code goes anyway, I'm inclined to think that even mis-spelled
comments are an improvement over the status quo.

> BTW, I think the real issue with the bug is not so much that we're
> resetting anything as that we may be initializing which_grow[j] for
> the next index column for the first time.  Note that the function's
> startup code only bothers to initialize which_grow[0] (although it
> does so in a less than legible fashion).  The rest have to get filled
> as the loop proceeds.

I noticed all that, but didn't feel like putting in the effort to make
it better.  I would have been happy to have someone else pick up the
patch, but as it had been languishing I thought it would be better to
get it committed more or less as it was than to wait for someone to
have time to make it beautiful.  If you want to hack on it more that's
fine with me.  I kind of wonder if we ought to rename the variables,
and maybe turn sum_grow into a boolean.   But I'm not really eager to
go crazy if this is something we have to back-patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - random sampling of transaction written into log

2012-08-30 Thread Tomas Vondra
On 30 Srpen 2012, 17:46, Robert Haas wrote:
> On Sun, Aug 26, 2012 at 1:04 PM, Tomas Vondra  wrote:
>> Attached is an improved patch, with a call to rand() replaced with
>> getrand().
>>
>> I was thinking about the counter but I'm not really sure how to handle
>> cases like "39%" - I'm not sure a plain (counter % 100 < 37) is not a
>> good sampling, because it always keeps continuous sequences of
>> transactions. Maybe there's a clever way to use a counter, but let's
>> stick to a getrand() unless we can prove is't causing issues. Especially
>> considering that a lot of data won't be be written at all with low
>> sampling rates.
>
> I like this patch, and I think sticking with a random number is a good
> idea.  But I have two suggestions.  Number one, I think the sampling
> rate should be stored as a float, not an integer, because I can easily
> imagine wanting a sampling rate that is not an integer percentage -
> especially, one that is less than one percent, like half a percent or
> a tenth of a percent.  Also, I suggest that the command-line option
> should be a long option rather than a single character option.  That
> will be more mnemonic and avoid using up too many single letter
> options, of which there is a limited supply.  So to sample every
> hundredth result, you could do something like this:
>
> pgbench --latency-sample-rate 0.01

Right, I was thinking about that too. I'll do that in the next version of
the patch.

> Another option I personally think would be useful is an option to
> record only those latencies that are above some minimum bound, like
> this:
>
> pgbench --latency-only-if-more-than $MICROSECONDS
>
> The problem with recording all the latencies is that it tends to have
> a material impact on throughput.  Your patch should address that for
> the case where you just want to characterize the latency, but it would
> also be nice to have a way of recording the outliers.

That sounds like a pretty trivial patch. I've been thinking about yet
another option - histograms (regular or with exponential bins).

What I'm not sure about is which of these options should be allowed at the
same time - to me, combinations like 'sampling + aggregation' don't make
much sense. Maybe except 'latency-only-if-more-than + aggregation'.

Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] effective_io_concurrency

2012-08-30 Thread Robert Haas
On Sat, Jul 28, 2012 at 4:09 PM, Jeff Janes  wrote:
> From my attempted reading of the thread "posix_fadvise v22", it seems
> like modification of the planner was never discussed, rather than
> being discussed and rejected.  So, is there a reason not to make the
> planner take account of effective_io_concurrency?

Not that I can see.

> But it might be better yet to make ordinary index scans benefit from
> effective_io_concurrency, but even if/when that gets done it would
> probably still be worthwhile to make the planner understand the
> benefit.

That sounds good too, but separate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: shared session variables

2012-08-30 Thread Tom Lane
Robert Haas  writes:
> On Thu, Aug 30, 2012 at 2:18 PM, Pavel Stehule  
> wrote:
>> 2012/8/30 Robert Haas :
>>> On Tue, Aug 14, 2012 at 3:46 AM, Pavel Stehule  
>>> wrote:
 patch that implements "shared" client/server session variables

>>> I don't really see what we can do with this that we can't do without this.

>> a motivation for this patch was discussion about parametrised DO
>> statement - and simple possibility of access to host variables (psql)
>> variables from server - PL scripts.
>> 
>> It is based on Tom's and Magnus's ideas - it is secure, because only
>> variables explicitly mentioned in shared namespace are "shared".

> Sure, but you could get to the same place by issuing a SET command for
> just the particular variable you want to use with DO.  You don't
> really need a magic facility for it.

FWIW, I don't particularly care for this idea either.  It may be less
klugy than the original proposal, but it's still a kluge.  Also, it's
not very sensible to consider extensions of this sort unless we have
ambitions of turning psql into a full-fledged scripting language,
with conditionals and iteration at the very least.  I do not want
to go there.  If you need scripting capability, there are lots of
better tools out there already.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding adjacent checkpoint records

2012-08-30 Thread Robert Haas
On Mon, Aug 13, 2012 at 6:19 PM, Jeff Janes  wrote:
> On Sat, Jun 9, 2012 at 5:43 AM, Tom Lane  wrote:
>> Simon Riggs  writes:
>>> So now the standard for my patches is that I must consider what will
>>> happen if the xlog is deleted?
>>
>> When you're messing around with something that affects data integrity, yes.
>> The long and the short of it is that this patch does reduce our ability
>> to recover from easily-foreseeable disasters.  The problem it was meant
>> to solve is not dire enough to justify that, and other fixes are
>> possible that don't require any compromises in this dimension.
>> So please revert.  We can revisit the original complaint in 9.3.
>
> This reversion was done, so
> b8b69d89905e04b910bcd   Wed Jun 13, 2012
> reverted:
> 18fb9d8d21a28caddb72  Wed Nov 2, 2011.
>
> However, the corresponding doc changes 43342891861cc2d08de and
> bd2396988a1afbcb6424 were not reverted.
>
> A simple reversion is probably not the right thing, because the
> original docs seemed rather inadequate.
>
> I've attached an attempt to fix this.  I also changed "WAL shipping"
> to "WAL archiving", as the reason for setting archive_timeout applies
> to all WAL archiving not just the special case of warm standby.

Committed, thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to form a self-defined TupleTableSlot

2012-08-30 Thread Robert Haas
On Thu, Jul 26, 2012 at 11:39 PM,   wrote:
> Here is my task situation:
>
> I have a TupleTableSlot, with its own TupleDesc. Now I want to extract
> several attributes to form a new TupleTableSlot, how can I define my own
> TupleDesc and the ProjectionInfo?

You might get more helpful advice if you describe more specifically
what you're trying to do.  I mean, here's one way to create a
TupleDesc (from contrib/adminpack) but it may or may not be suitable
for your purposes:

tupdesc = CreateTemplateTupleDesc(2, false);
TupleDescInitEntry(tupdesc, (AttrNumber) 1, "starttime",
   TIMESTAMPOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 2, "filename",
   TEXTOID, -1, 0);

As ProjectInfo, maybe ExecBuildProjectionInfo?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Tom Lane
Robert Haas  writes:
> On Thu, Aug 30, 2012 at 1:27 PM, Tom Lane  wrote:
>> Should we backpatch that?

> Arguably, yes.  Does the patch look sane to you?

I was afraid you'd ask that.

[ studies code for awhile ... ]

I think this fixes the bug, but the function could really do with slightly
more invasive code adjustments for clarity.  For instance, we could get
rid of the "sum_grow = 1" kluge if we removed the "&& sum_grow" from the
outer for statement (where it's pretty damn ugly/surprising anyway ---
I dislike for loops that look like they're just running a counter when
they're doing other things too) and instead put something like this at
the bottom of the outer loop:

/*
 * If we examined all the columns and there is zero penalty for
 * all of them, we can stop examining tuples; this is a good one
 * to insert the new key into.
 */
if (sum_grow == 0 && j == r->rd_att->natts)
break;

I'm not thrilled with the added comments either; they need copy-editing
and they randomly fail to fit in an 80-column window.  (pgindent will
have something to say about that later for some of them, but I think it
doesn't reformat comments that start at the left margin.)

BTW, I think the real issue with the bug is not so much that we're
resetting anything as that we may be initializing which_grow[j] for
the next index column for the first time.  Note that the function's
startup code only bothers to initialize which_grow[0] (although it
does so in a less than legible fashion).  The rest have to get filled
as the loop proceeds.

Do you want to have another go at it, or would you like me to try to
make it better?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - aggregation of info written into log

2012-08-30 Thread Tomas Vondra
On 30 Srpen 2012, 18:02, Robert Haas wrote:
> On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra  wrote:
>> This patch is a bit less polished (and more complex) than the other
>> pgbench patch I've sent a while back, and I'm not sure how to handle the
>> Windows branch. That needs to be fixed during the commit fest.
>
> What's the problem with the Windows branch?

Well, there are comments about how timestamp does not work on Windows
(filling 0), and I'm not sure how that affects the patch (e.g. determining
the aggregation interval). I have no Windows workstation available so I
can't actually try that.

> Could you un-cuddle your brances to follow the PostgreSQL style, omit
> braces around single-statement blocks, and remove the spurious
> whitespace changes?

OK, will do.

> Instead of having both use_log_agg and naggseconds, I think you can
> get by with just having a single variable that is zero if aggregation
> is not in use and is otherwise the aggregation period.  On a related
> note, you can't specify -A without an associated value, so there is no
> point in documenting a "default".  As with your other patch, I suggest
> using a long option name like --latency-aggregate-interval and then
> making the name of the variable in the code match the option name,
> with s/-/_/g, for clarity.

Why --latency-aggregate-interval? It has nothing to do with latencies,
it's merely number of transactions per interval.

Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER command reworks

2012-08-30 Thread Robert Haas
On Mon, Aug 13, 2012 at 3:35 PM, Kohei KaiGai  wrote:
> The attached patch is a refreshed version of ALTER command
> reworks towards the latest tree. Here is few big changes except
> for code integration of the code to rename event triggers.

This seems to have bit-rotted a bit.  Please rebase.

> BTW, I had to adjust between oid of pg_largeobject_metadata
> and pg_largeobject on three points of this patch, like:
>
> if (classId == LargeObjectRelationId)
> classId = LargeObjectMetadataRelationId;
>
> When we supported largeobject permission features, we put
> special handling to track dependency of its ownership.
> The pg_depend records oid of pg_largeobject, instead of
> pg_largeobject_metadata. Thus, we cannot use classId of
> ObjectAddress being returned from get_object_address()
> as an argument of heap_open() as is, if it indicates oid of
> pg_largeobject.
>
> Was it a right decision to track dependency of large object using
> oid of pg_largeobject, instead of pg_largeobject_metadata?
> IIRC, the reason why we used oid of pg_largeobject is backward
> compatibility for applications that tries to reference pg_depend
> with built-in oids.

I think it was a terrible decision and I'm pretty sure I said as much
at the time, but I lost the argument, so I'm inclined to think we're
stuck with continuing to support that kludge.

Some other preliminary comments:

- Surely you need to take AccessExclusiveLock on the object being
renamed, not just ShareUpdateExclusiveLock.
- I don't think it's acceptable to assemble the object-type
"object-name" already exists message using getObjectDescription();
it's not good for translators.  Use an array of messages, one per
object-type, as we have done in other cases.
- I would like to handle either the RENAME case or the ALTER OWNER
case in one patch, and the other in a follow-on patch.  Can you pick
one to do first and remove everything related to the other one?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-08-30 Thread Tomas Vondra
On 30 Srpen 2012, 17:53, Robert Haas wrote:
> On Fri, Aug 24, 2012 at 6:36 PM, Tomas Vondra  wrote:
>> attached is a patch that improves performance when dropping multiple
>> tables within a transaction. Instead of scanning the shared buffers for
>> each table separately, the patch removes this and evicts all the tables
>> in a single pass through shared buffers.
>>
>> Our system creates a lot of "working tables" (even 100.000) and we need
>> to perform garbage collection (dropping obsolete tables) regularly. This
>> often took ~ 1 hour, because we're using big AWS instances with lots of
>> RAM (which tends to be slower than RAM on bare hw). After applying this
>> patch and dropping tables in groups of 100, the gc runs in less than 4
>> minutes (i.e. a 15x speed-up).
>>
>> This is not likely to improve usual performance, but for systems like
>> ours, this patch is a significant improvement.
>
> Seems pretty reasonable.  But instead of duplicating so much code,
> couldn't we find a way to use replace DropRelFileNodeAllBuffers with
> DropRelFileNodeAllBuffersList?  Surely anyone who was planning to call
> the first one could instead call the second one with a count of one
> and a pointer to the address of the data they were planning to pass.
> I'd probably swap the order of arguments to
> DropRelFileNodeAllBuffersList as well.  We could do something similar
> with smgrdounlink/smgrdounlinkall so that, again, only one copy of the
> code is needed.

Yeah, I was thinking about that too, but I simply wasn't sure which is the
best choice so I've sent the raw patch. OTOH these functions are called on
a very limited number of places, so a refactoring like this seems fine.

Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proof of concept: auto updatable views

2012-08-30 Thread Robert Haas
On Sun, Aug 12, 2012 at 5:14 PM, Dean Rasheed  wrote:
> None of this new code kicks in for non-security barrier views, so the
> kinds of plans I posted upthread remain unchanged in that case. But
> now a significant fraction of the patch is code added to handle
> security barrier views. Of course we could simply say that such views
> aren't updatable, but that seems like an annoying limitation if there
> is a feasible way round it.

Maybe it'd be a good idea to split this into two patches: the first
could implement the feature but exclude security_barrier views, and
the second could lift that restriction.

Just a thought.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: shared session variables

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 2:18 PM, Pavel Stehule  wrote:
> 2012/8/30 Robert Haas :
>> On Tue, Aug 14, 2012 at 3:46 AM, Pavel Stehule  
>> wrote:
>>> patch that implements "shared" client/server session variables
>>
>> I don't really see what we can do with this that we can't do without this.
>
> a motivation for this patch was discussion about parametrised DO
> statement - and simple possibility of access to host variables (psql)
> variables from server - PL scripts.
>
> It is based on Tom's and Magnus's ideas - it is secure, because only
> variables explicitly mentioned in shared namespace are "shared".

Sure, but you could get to the same place by issuing a SET command for
just the particular variable you want to use with DO.  You don't
really need a magic facility for it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)

2012-08-30 Thread Robert Haas
On Thu, Aug 9, 2012 at 1:11 PM, Kevin Grittner
 wrote:
> I didn't figure it was; my emphasis was because this has been raised
> before and nothing happened for want of a consensus on what
> particular wording should be used, so users were left with no
> guidance.  I don't want this to continue to be a victim of "the
> perfect is the enemy of the good" syndrome.

So, to get the ball rolling here, I spent some time on this today, and
added a paragraph to the Linux Memory Overcommit section of the
documentation.  I back-patched it back to 9.0.   There were additional
merge on conflicts in 8.4 which I did not bother to resolve.  There
may be room to further improve what I did here; suggestions are
welcome.  I think we probably still need to add something to the
max_connections documentation; I have not done that.

>> Also, I am a bit doubtful about the advice on sizing the
>> connection pool as applied to small servers:
>> surely it's not sane to recommend that a single-processor system
>> with one disk should have max_connections = 3.  At least, *I*
>> don't think that's sane.
>
> I'm not sure it's wrong when combined with this: "Remember that this
> "sweet spot" is for the number of connections that are actively
> doing work.  ...  You should always make max_connections a bit
> bigger than the number of connections you enable in your connection
> pool. That way there are always a few slots available for direct
> connections for system maintenance and monitoring."  Where would you
> expect the knee to be for connections concurrently actively doing
> work on a single-core, single-drive system ?

I don't know.  But my experience with our customers is that people are
often forced to set the size of the connection pool far larger than
what that formula would suggest.  Many people are doing
transaction-level pooling, and for those people, they've got to look
at how many multi-statement transactions they've got and think about
what the peak value for that quantity is.  It's still worth using
pooling because it reduces the number of simultaneous connections, but
it's not going to reduce it to the kind of values you're talking
about.  Also, consider that transactions aren't all the same length.
Suppose 90% of your queries execute in 50ms, and 10% execute in 60s.
Even though it's less efficient, you've got to set the connection pool
large enough that at least some of the 50 ms queries can continue to
get processed even if the maximum number of 60s queries that you ever
expect to see in parallel are already running.  This may seem like a
theoretical problem but we have customers who use connection pools to
get the number of simultaneous connections down to, say, 800.  I
guarantee you that these people do not have 200 CPUs and 400 disks,
but they're smart people and they find that smaller pool sizes don't
work.

Sure, we can say, well, the fine print tells you that 2*CPUs+disks is
not REALLY the formula you should use, but it's just so far off what I
see in the field that I have a hard time thinking it's really helping
people to give them that as a starting point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Tom Lane
Joe Abbate  writes:
> Yes, I suspected that an OID was stored.  What I'd still quibble with is
> the use of the ambiguous regproc in pg_operator (also pg_type) and the
> still-ambiguous schema-qualified proc name.  I guess it's not feasible
> (at least, short term), but it'd be preferable to store a "raw" OID and
> let the user cast to regprocedure (or change the 'regproc' to
> 'regprocedure').

Yeah, ideally those columns would be regprocedure.  There are
bootstrapping problems involved though with populating the initial
contents of the catalogs during initdb --- the regprocedure input
function doesn't work in that environment.  (It might be possible to
hack something for pg_operator, but the circularity is rather
fundamental for loading pg_type, since the input function would need to
consult pg_type to make sense of argument types.)

In the meantime I'd suggest casting the columns to regprocedure when
querying, if you want unambiguous results.  That's what pg_dump does.
Or you can cast to OID if you like numbers.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Joe Abbate
Hello Tom,

On 30/08/12 13:23, Tom Lane wrote:
> Joe Abbate  writes:
>> Hmmm ... Well, I'm just doing the same thing as pg_dump, which in 9.2rc1
>> still outputs the same as before, namely:
> 
> Well, evidently you're *not* doing the same thing as pg_dump.

I meant that the Pyrseas dbtoyaml's output is essentially the same as
pg_dump, e.g.,

schema public:
  operator +(NONE, text):
procedure: upper

Therefore, if psql doesn't have problem restoring the operator from the
pg_dump output, neither should yamltodb have problem generating the SQL
to recreate the operator.  The above YAML (with or without the schema
qualification) does generate the correct SQL and pg_operator.oprcode
ends up with the correct OID.  So at least for this test case,
dbtoyam/yamltodb is not broken (but I'll have to do something about the
unittest difference).

> What's physically in there is an OID (and so the casts above are no-ops
> at the representational level).  What we're discussing is the behavior
> of the output function for the regproc or regprocedure types.

Yes, I suspected that an OID was stored.  What I'd still quibble with is
the use of the ambiguous regproc in pg_operator (also pg_type) and the
still-ambiguous schema-qualified proc name.  I guess it's not feasible
(at least, short term), but it'd be preferable to store a "raw" OID and
let the user cast to regprocedure (or change the 'regproc' to
'regprocedure').

Best regards,

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: shared session variables

2012-08-30 Thread Pavel Stehule
2012/8/30 Robert Haas :
> On Tue, Aug 14, 2012 at 3:46 AM, Pavel Stehule  
> wrote:
>> patch that implements "shared" client/server session variables
>
> I don't really see what we can do with this that we can't do without this.

a motivation for this patch was discussion about parametrised DO
statement - and simple possibility of access to host variables (psql)
variables from server - PL scripts.

It is based on Tom's and Magnus's ideas - it is secure, because only
variables explicitly mentioned in shared namespace are "shared".

http://archives.postgresql.org/pgsql-hackers/2012-06/msg01506.php

Regards

Pavel

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: shared session variables

2012-08-30 Thread Robert Haas
On Tue, Aug 14, 2012 at 3:46 AM, Pavel Stehule  wrote:
> patch that implements "shared" client/server session variables

I don't really see what we can do with this that we can't do without this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] Performance Improvement by reducing WAL for Update Operation

2012-08-30 Thread Robert Haas
On Fri, Aug 10, 2012 at 1:25 AM, Amit Kapila  wrote:
>> I think the property that recovery only needs to worry about each
>> block individually is one that we want to preserve.  Supporting this
>> optimizating only when full_page_writes=off seems ugly,
>
> I think recovery needs to worry about multiple blocks as well in some cases.
> Please see below case and correct me if I am wrong.
> I think currently also there can be problems in case of full_page_writes=off
> for crash recovery.
> 1. Tuple A on page 1 is updated.  The new version, tuple B, is placed on
> page 2.
> 2. Page 1 is Partially written to disk.
> 3. During recovery, it can so appear that there is no need to update XMAX
> and other related things in Old tuple
>as LSN is greater than WAL lsn.
> 4. Now also there can be other problems related to tuple visibility.

Well, you're only supposed to turn full_page_writes=off if partial
page writes are impossible on your system.  If you turn off
full_page_writes on a system where partial page writes are impossible,
then you've intentionally broken crash recovery, and you get to keep
both pieces.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Alexander Korotkov
On Thu, Aug 30, 2012 at 9:11 PM, Robert Haas  wrote:

> On Mon, Aug 20, 2012 at 12:57 PM, Alexander Korotkov
>  wrote:
> > I found that code of gistchoose doesn't follow it's logic. Idea of
> > gistchoose is that first column penalty is more important than penalty of
> > second column. If we meet same penalty values of first column then we
> choose
> > minimal penalty value of second column among them.
>
> Nice catch.  Thanks for the patch, which I have now committed.  But
> since these functions were very short on useful comments, I added
> some.  Hopefully they're even right, but let me know if you think
> otherwise, or have any ideas for further improvement.


Thanks! Comments looks good for me.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] splitting *_desc routines

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 12:14 PM, Andres Freund  wrote:
>> An alternative thing that might be worth considering before you go all
>> in on this is whether the xlogdump functionality shouldn't just be part
>> of the regular server executable, ie you'd call it with
>>
>>   postgres --xlogdump 
>>
>> and the only extra code needed is two lines for another redirect in
>> main.c.  We've already done that for things like --describe-config.
>> This'd likely be a lot less work than getting the _desc routines to
>> be operable standalone ...
> It definitely would be simpler. It doesn't seem nice to pile more and more
> utilities into the main postgres binary though.

Agreed.  Another advantage of making this standalone is that other
people might then be able to write code that does interesting things
with it.  If we just add the functionality into core then nobody's any
better off than before.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HEAD crashes on windows when doing VACUUM ANALYZE

2012-08-30 Thread Tom Lane
Andres Freund  writes:
> On Thursday, August 30, 2012 06:50:13 PM Matthias wrote:
>> According to the debugger num_hist = 1, so it divides by zero.

> Its curious though that the SIGFPE isn't properly cought though. That would 
> only lead to a different error, but ...

Not all platforms think an integer divide-by-zero maps to SIGFPE.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SIGFPE handler is naive

2012-08-30 Thread Robert Haas
On Tue, Aug 14, 2012 at 5:46 PM, Tom Lane  wrote:
> Noah Misch  writes:
>> On Tue, Aug 14, 2012 at 08:40:06AM -0400, Robert Haas wrote:
>>> On Tue, Aug 14, 2012 at 6:50 AM, Greg Stark  wrote:
 It is possible to check if the signal was synchronous or was sent from
 an external process. You can check siginfo->si_pid to see who sent you
 the signal. I'm not sure checking that and handling it at
 check_for_interrupts if it's asynchronous is the best solution or not
 though.
>
>>> If that's portable it might be an option, but I doubt that it is.
>
>> I suspect it is portable.
>
> Single Unix Spec V2 says (on the sigaction man page)
>
> The si_code member contains a code identifying the cause of the
> signal. If the value of si_code is less than or equal to 0, then the
> signal was generated by a process and si_pid and si_uid respectively
> indicate the process ID and the real user ID of the sender.
>
> I'm not sure I would trust checking si_pid alone; it would definitely
> fail on my old HPUX box, where I see that field is union'ed with si_addr
> and so will read as garbage for a locally-sourced SIGFPE.  But it might
> be that checking si_code alone would work reliably.
>
> I think that rejecting an externally sourced SIGFPE might be worth doing
> if we can distinguish that reliably.

Currently, our signal handlers on all platforms are declared to take
just an int.  We'd need to change that in order to try to do this.
Any thoughts on how we could go about that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 1:27 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Aug 20, 2012 at 12:57 PM, Alexander Korotkov
>>  wrote:
>>> I found that code of gistchoose doesn't follow it's logic. Idea of
>>> gistchoose is that first column penalty is more important than penalty of
>>> second column. If we meet same penalty values of first column then we choose
>>> minimal penalty value of second column among them.
>
>> Nice catch.  Thanks for the patch, which I have now committed.
>
> Should we backpatch that?

Arguably, yes.  Does the patch look sane to you?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] canceling autovacuum task woes

2012-08-30 Thread Robert Haas
On Mon, Aug 13, 2012 at 11:59 PM, Peter Eisentraut  wrote:
> On Tue, 2012-07-24 at 16:14 -0400, Robert Haas wrote:
>> On Tue, Jul 24, 2012 at 4:09 PM, Tom Lane  wrote:
>> > Robert Haas  writes:
>> >> Yeah, you're right.  So you do get the table name.  But you don't get
>> >> the cause, which is what you really need to understand why it's
>> >> happening.  Attached is a patch that adds some more detail.
>> >
>> > Uh, what's the added dependency on pgstat.h for?  Looks sane to the
>> > eyeball otherwise.
>>
>> Woops, that was leftovers from some earlier silliness that I (mostly)
>> removed before posting.
>>
>> New version attached.
>>
>
> The detail message should have a period at the end.

Oops.  Fixed, sorry it took me so long to get to this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HEAD crashes on windows when doing VACUUM ANALYZE

2012-08-30 Thread Heikki Linnakangas

On 30.08.2012 19:50, Matthias wrote:

It crashes in rangetypes_typeanalyze.c at line 186:

 delta = (non_empty_cnt - 1) / (num_hist - 1);

According to the debugger num_hist = 1, so it divides by zero. I guess
this is due to the new statistics collection for range types?


Yep. Fixed, thanks for the report!

I added just a check that the histogram is not created if there are less 
than 2 values in the sample. The corresponding code for scalars also 
checks that there are more than 1 distinct value, so that the histogram 
doesn't consist of all duplicates. We don't currently count the number 
of distinct values for ranges, so that would require a bit more work, 
but I don't think it makes much difference in practice.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Tom Lane
Robert Haas  writes:
> On Mon, Aug 20, 2012 at 12:57 PM, Alexander Korotkov
>  wrote:
>> I found that code of gistchoose doesn't follow it's logic. Idea of
>> gistchoose is that first column penalty is more important than penalty of
>> second column. If we meet same penalty values of first column then we choose
>> minimal penalty value of second column among them.

> Nice catch.  Thanks for the patch, which I have now committed.

Should we backpatch that?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Tom Lane
Joe Abbate  writes:
> On 30/08/12 12:27, Tom Lane wrote:
>> The reason for the difference is that in 9.2 there's more than one
>> pg_catalog.upper():

> Hmmm ... Well, I'm just doing the same thing as pg_dump, which in 9.2rc1
> still outputs the same as before, namely:

Well, evidently you're *not* doing the same thing as pg_dump.

A look at pg_dump says that what it does is to cast the column to
regprocedure, and then strip the argument types from that printout.

Perhaps some experimentation would be illuminating:

regression=# select 'upper'::regproc;
ERROR:  more than one function named "upper"
LINE 1: select 'upper'::regproc;
   ^
regression=# select 'upper(text)'::regprocedure;
 regprocedure 
--
 upper(text)
(1 row)

regression=# select 'upper(text)'::regprocedure::oid;
 oid 
-
 871
(1 row)

regression=# select 871::regprocedure;
 regprocedure 
--
 upper(text)
(1 row)

regression=# select 871::regproc; 
 regproc  
--
 pg_catalog.upper
(1 row)

> What's somewhat confusing is that the documentation (and \d pg_operator)
> states oprcode (as well as oprrest and oprjoin) are of type 'regproc'
> and that it references a pg_proc.oid.  Does the catalog actually store
> an OID, i.e., the OID of pg_catalog.upper(text), or something else?

What's physically in there is an OID (and so the casts above are no-ops
at the representational level).  What we're discussing is the behavior
of the output function for the regproc or regprocedure types.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Don't allow relative path for copy from file

2012-08-30 Thread Robert Haas
On Thu, Aug 16, 2012 at 2:11 AM, Etsuro Fujita
 wrote:
> Agreed.  I'd like to withdraw the patch sent in the earlier post, and propose 
> to
> update the documentation in the COPY reference page.  Please find attached a
> patch.

I think this is a good idea, but I didn't like the exact wording you
chose, so I committed something a little different.  Let me know
whether it looks OK.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Robert Haas
On Mon, Aug 20, 2012 at 12:57 PM, Alexander Korotkov
 wrote:
> I found that code of gistchoose doesn't follow it's logic. Idea of
> gistchoose is that first column penalty is more important than penalty of
> second column. If we meet same penalty values of first column then we choose
> minimal penalty value of second column among them.

Nice catch.  Thanks for the patch, which I have now committed.  But
since these functions were very short on useful comments, I added
some.  Hopefully they're even right, but let me know if you think
otherwise, or have any ideas for further improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Joe Abbate
Hello Tom,

On 30/08/12 12:27, Tom Lane wrote:
> The reason for the difference is that in 9.2 there's more than one
> pg_catalog.upper():
> 
> regression=# \df upper
>   List of functions
>Schema   | Name  | Result data type | Argument data types |  Type  
> +---+--+-+
>  pg_catalog | upper | anyelement   | anyrange| normal
>  pg_catalog | upper | text | text| normal
> (2 rows)
> 
> whereas prior versions had only upper(text).  The regproc output
> function isn't allowed to print argument types, which is what would be
> needed to disambiguate altogether, but it does what it can to warn you
> that "upper" alone is not a unique name by schema-qualifying the name.
> 
> This is not new behavior in 9.2, it just happens to occur for upper()
> when it did not before.  If you're expecting regproc to always return
> unique unqualified names then your code is broken anyway, since such
> a requirement cannot be met when the function is overloaded.

Hmmm ... Well, I'm just doing the same thing as pg_dump, which in 9.2rc1
still outputs the same as before, namely:

SET search_path = public, pg_catalog;

--
-- Name: +; Type: OPERATOR; Schema: public; Owner: -
--

CREATE OPERATOR + (
PROCEDURE = upper,
RIGHTARG = text
);

What's somewhat confusing is that the documentation (and \d pg_operator)
states oprcode (as well as oprrest and oprjoin) are of type 'regproc'
and that it references a pg_proc.oid.  Does the catalog actually store
an OID, i.e., the OID of pg_catalog.upper(text), or something else?

Best regards,

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HEAD crashes on windows when doing VACUUM ANALYZE

2012-08-30 Thread Matthias
2012/8/30 Albe Laurenz :
> Matthias wrote:
>> when running VACUUM ANALYZE on my database built on win32-x86 from
>> yesterday's git checkout I always get this at some point during VACUUM
>> ANALYZE:
>>
>> LOG:  server process (PID 5880) was terminated by exception 0xC094
>> DETAIL:  Failed process was running: VACUUM VERBOSE ANALYZE
>> HINT:  See C include file "ntstatus.h" for a description of the
>> hexadecimal value.
>> LOG:  terminating any other active server processes
>>
>> I am not sure if it's useful to report it here, but I thought I'd do
>> it anyway :)
>
> That seems to be STATUS_INTEGER_DIVIDE_BY_ZERO.
>
> Does it only happen with a certain table?
> Are you sure there is no data corruption?
> A stack trace would help:
> http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_Postg
> reSQL_backend_on_Windows

Attached the debugger.

It crashes in rangetypes_typeanalyze.c at line 186:

delta = (non_empty_cnt - 1) / (num_hist - 1);

According to the debugger num_hist = 1, so it divides by zero. I guess
this is due to the new statistics collection for range types?

-Matthias


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HEAD crashes on windows when doing VACUUM ANALYZE

2012-08-30 Thread Andres Freund
On Thursday, August 30, 2012 06:50:13 PM Matthias wrote:
> 2012/8/30 Albe Laurenz :
> > Matthias wrote:
> >> when running VACUUM ANALYZE on my database built on win32-x86 from
> >> yesterday's git checkout I always get this at some point during VACUUM
> >> ANALYZE:
> >> 
> >> LOG:  server process (PID 5880) was terminated by exception 0xC094
> >> DETAIL:  Failed process was running: VACUUM VERBOSE ANALYZE
> >> HINT:  See C include file "ntstatus.h" for a description of the
> >> hexadecimal value.
> >> LOG:  terminating any other active server processes
> >> 
> >> I am not sure if it's useful to report it here, but I thought I'd do
> >> it anyway :)
> > 
> > That seems to be STATUS_INTEGER_DIVIDE_BY_ZERO.
> > 
> > Does it only happen with a certain table?
> > Are you sure there is no data corruption?
> > A stack trace would help:
> > http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_Postg
> > reSQL_backend_on_Windows
> 
> Attached the debugger.
> 
> It crashes in rangetypes_typeanalyze.c at line 186:
> 
> delta = (non_empty_cnt - 1) / (num_hist - 1);
> 
> According to the debugger num_hist = 1, so it divides by zero.
Its curious though that the SIGFPE isn't properly cought though. That would 
only lead to a different error, but ...
Postgres does install a handler for it. What happens if you run "SELECT 
-2147483648/-1;"?

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Tom Lane
Joe Abbate  writes:
> Hello hackers,
> I've been testing Pyrseas against 9.2rc1.  A test that does a CREATE
> OPERATOR is giving a small difference.  Specifically, the test issues
> the statement:

> CREATE OPERATOR + (PROCEDURE = upper, RIGHTARG = text);

> Pyrseas then queries the pg_operator catalog to map the procedure for
> output.  Selecting oprcode in 8.4, 9.0, and 9.1, always returns 'upper',
> but in 9.2rc1, the value is 'pg_catalog.upper'.  It does not matter
> whether pg_catalog is on the search_path or not.

The reason for the difference is that in 9.2 there's more than one
pg_catalog.upper():

regression=# \df upper
  List of functions
   Schema   | Name  | Result data type | Argument data types |  Type  
+---+--+-+
 pg_catalog | upper | anyelement   | anyrange| normal
 pg_catalog | upper | text | text| normal
(2 rows)

whereas prior versions had only upper(text).  The regproc output
function isn't allowed to print argument types, which is what would be
needed to disambiguate altogether, but it does what it can to warn you
that "upper" alone is not a unique name by schema-qualifying the name.

This is not new behavior in 9.2, it just happens to occur for upper()
when it did not before.  If you're expecting regproc to always return
unique unqualified names then your code is broken anyway, since such
a requirement cannot be met when the function is overloaded.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows modified in current transaction

2012-08-30 Thread Andres Freund
On Thursday, August 30, 2012 06:09:43 PM Andres Freund wrote:
> On Thursday, August 30, 2012 06:06:59 PM Robert Haas wrote:
> > On Thu, Aug 30, 2012 at 10:36 AM, Miroslav Šimulčík
> > 
> >  wrote:
> > > is there any way to check if row have already been modified by the
> > > current transaction? I tried condition txid_current() = xmin, but there
> > > is problem with the savepoints. After every savepoint rows are getting
> > > higher xmin values, but txid_current() remains the same.
> > 
> > It sounds like you're looking for a function that will give an array
> > of all XIDs for the current transcation, rather than just the XID of
> > the current sub-transaction.  I don't think we currently expose that.
> 
> txid_current_snapshot(), txis_visible_in_snapshot() may work.
No, it obviously cannot. No idea what I thought... txid_visible_in_snapshot is 
fine, but txid_current_snapshot() obviously will return a snapshot containing 
that sees everything the current transaction does.

You possibly could calculate a difference between a txid_current_snapshot() 
taken at the beginning of a repeatable read transaction and the current one, 
but thats too ugly.

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows modified in current transaction

2012-08-30 Thread Tom Lane
Robert Haas  writes:
> On Thu, Aug 30, 2012 at 10:36 AM, Miroslav Šimulčík
>  wrote:
>> is there any way to check if row have already been modified by the current
>> transaction? I tried condition txid_current() = xmin, but there is problem
>> with the savepoints. After every savepoint rows are getting higher xmin
>> values, but txid_current() remains the same.

> It sounds like you're looking for a function that will give an array
> of all XIDs for the current transcation, rather than just the XID of
> the current sub-transaction.  I don't think we currently expose that.

IIRC,  txid_current() actually reflects the current *top* transaction,
which is why rows inserted by subtransactions aren't matching it.
But yeah, there's no exported way to identify all the XIDs belonging
to the current transaction.

A larger problem with the above is that txid isn't an XID anyway,
so the comparisons would fail altogether once the XID epoch becomes
more than zero.  So if we did want to support this, it would be a
lot more useful to provide something along the lines of
xid_belongs_to_current_transaction(xid) returns bool
than to expose the XIDs as such.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] splitting *_desc routines

2012-08-30 Thread Andres Freund
On Wednesday, August 29, 2012 10:06:16 PM Tom Lane wrote:
> Alvaro Herrera  writes:
> > I looked at Andres' patch and the general idea is rather horrible: it
> > links all backend files into the output executable.  This is so that the
> > *_desc functions can be used from their respective object files, which
> > obviously have a lot of requirements for backend infrastructure.
> 
> Check.
I said it was a preliminary hack though ;). Especially the way I assembled the 
object files... 
The xlogdump utility itself is equally crappy atm, it was just a demonstration 
which suited me enough for debugging... But it really doesn't need that much 
more.

> An alternative thing that might be worth considering before you go all
> in on this is whether the xlogdump functionality shouldn't just be part
> of the regular server executable, ie you'd call it with
> 
>   postgres --xlogdump 
> 
> and the only extra code needed is two lines for another redirect in
> main.c.  We've already done that for things like --describe-config.
> This'd likely be a lot less work than getting the _desc routines to
> be operable standalone ...
It definitely would be simpler. It doesn't seem nice to pile more and more 
utilities into the main postgres binary though.

Note the ugliness some the testing tools in src/backend go through just to 
link to a few files... Yuck.

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows modified in current transaction

2012-08-30 Thread Andres Freund
On Thursday, August 30, 2012 06:06:59 PM Robert Haas wrote:
> On Thu, Aug 30, 2012 at 10:36 AM, Miroslav Šimulčík
> 
>  wrote:
> > is there any way to check if row have already been modified by the
> > current transaction? I tried condition txid_current() = xmin, but there
> > is problem with the savepoints. After every savepoint rows are getting
> > higher xmin values, but txid_current() remains the same.
> 
> It sounds like you're looking for a function that will give an array
> of all XIDs for the current transcation, rather than just the XID of
> the current sub-transaction.  I don't think we currently expose that.
txid_current_snapshot(), txis_visible_in_snapshot() may work.

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows modified in current transaction

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 10:36 AM, Miroslav Šimulčík
 wrote:
> is there any way to check if row have already been modified by the current
> transaction? I tried condition txid_current() = xmin, but there is problem
> with the savepoints. After every savepoint rows are getting higher xmin
> values, but txid_current() remains the same.

It sounds like you're looking for a function that will give an array
of all XIDs for the current transcation, rather than just the XID of
the current sub-transaction.  I don't think we currently expose that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - aggregation of info written into log

2012-08-30 Thread Robert Haas
On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra  wrote:
> This patch is a bit less polished (and more complex) than the other
> pgbench patch I've sent a while back, and I'm not sure how to handle the
> Windows branch. That needs to be fixed during the commit fest.

What's the problem with the Windows branch?

Could you un-cuddle your brances to follow the PostgreSQL style, omit
braces around single-statement blocks, and remove the spurious
whitespace changes?

Instead of having both use_log_agg and naggseconds, I think you can
get by with just having a single variable that is zero if aggregation
is not in use and is otherwise the aggregation period.  On a related
note, you can't specify -A without an associated value, so there is no
point in documenting a "default".  As with your other patch, I suggest
using a long option name like --latency-aggregate-interval and then
making the name of the variable in the code match the option name,
with s/-/_/g, for clarity.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-08-30 Thread Robert Haas
On Fri, Aug 24, 2012 at 6:36 PM, Tomas Vondra  wrote:
> attached is a patch that improves performance when dropping multiple
> tables within a transaction. Instead of scanning the shared buffers for
> each table separately, the patch removes this and evicts all the tables
> in a single pass through shared buffers.
>
> Our system creates a lot of "working tables" (even 100.000) and we need
> to perform garbage collection (dropping obsolete tables) regularly. This
> often took ~ 1 hour, because we're using big AWS instances with lots of
> RAM (which tends to be slower than RAM on bare hw). After applying this
> patch and dropping tables in groups of 100, the gc runs in less than 4
> minutes (i.e. a 15x speed-up).
>
> This is not likely to improve usual performance, but for systems like
> ours, this patch is a significant improvement.

Seems pretty reasonable.  But instead of duplicating so much code,
couldn't we find a way to use replace DropRelFileNodeAllBuffers with
DropRelFileNodeAllBuffersList?  Surely anyone who was planning to call
the first one could instead call the second one with a count of one
and a pointer to the address of the data they were planning to pass.
I'd probably swap the order of arguments to
DropRelFileNodeAllBuffersList as well.  We could do something similar
with smgrdounlink/smgrdounlinkall so that, again, only one copy of the
code is needed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - random sampling of transaction written into log

2012-08-30 Thread Robert Haas
On Sun, Aug 26, 2012 at 1:04 PM, Tomas Vondra  wrote:
> Attached is an improved patch, with a call to rand() replaced with
> getrand().
>
> I was thinking about the counter but I'm not really sure how to handle
> cases like "39%" - I'm not sure a plain (counter % 100 < 37) is not a
> good sampling, because it always keeps continuous sequences of
> transactions. Maybe there's a clever way to use a counter, but let's
> stick to a getrand() unless we can prove is't causing issues. Especially
> considering that a lot of data won't be be written at all with low
> sampling rates.

I like this patch, and I think sticking with a random number is a good
idea.  But I have two suggestions.  Number one, I think the sampling
rate should be stored as a float, not an integer, because I can easily
imagine wanting a sampling rate that is not an integer percentage -
especially, one that is less than one percent, like half a percent or
a tenth of a percent.  Also, I suggest that the command-line option
should be a long option rather than a single character option.  That
will be more mnemonic and avoid using up too many single letter
options, of which there is a limited supply.  So to sample every
hundredth result, you could do something like this:

pgbench --latency-sample-rate 0.01

Another option I personally think would be useful is an option to
record only those latencies that are above some minimum bound, like
this:

pgbench --latency-only-if-more-than $MICROSECONDS

The problem with recording all the latencies is that it tends to have
a material impact on throughput.  Your patch should address that for
the case where you just want to characterize the latency, but it would
also be nice to have a way of recording the outliers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hunspell and tsearch2 ?

2012-08-30 Thread Robert Haas
On Mon, Aug 27, 2012 at 8:31 AM, Dirk Lutzebäck
 wrote:
> we have issues with compound words in tsearch2 using the german (ispell)
> dictionary. This has been discussed before but there is no real solution
> using the recommended german dictionary at
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2 (convert old
> openoffice dict file to ispell suitable for tsearch):
>
> # select ts_lexize('german_ispell', 'vollklimatisiert');
>  ts_lexize
> 
>  {vollklimatisiert}
> (1 row)
>
> This should return atleast
>
>  {vollklimatisiert, voll, klimatisiert}
>
>
> The issue with compound words in ispell has been addressed in hunspell. But
> this has not been integrated fully to tsearch2 (according to the
> documentation).

Just out of curiosity, which part of the documentation are you looking
at?  The only mention of hunspell I see in the documentation is a
mention that we apparently support their dictionary-file format.

> Are there any plans to fully integrate hunspell into tsearch2? What is
> needed to do this? What is the functional delta which is missing? Maybe we
> can help...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Event Triggers reduced, v1

2012-08-30 Thread Robert Haas
On Mon, Aug 27, 2012 at 7:52 AM, Dimitri Fontaine
 wrote:
> I've been reviewing your changes and here's a very small patch with some
> details I would have spelled out differently. See what you think, I
> mostly needed to edit some code to get back in shape :)

I guess I don't particularly like either of these changes.  The first
one is mostly harmless, but I don't really see why it's any better,
and it does have the downside of traversing the string twice (once for
strlen and a second time in str_toupper) instead of just once.  It
also makes a line wider than 80 columns, which is a bit ugly.  In the
second hunk, the point is that we never have to do CreateCommandTag()
here at all unless either casserts are enabled or EventCacheLookup
returns a non-empty list.  That means that in a non-assert-enabled
build, we get to skip that work altogether in the presumably-common
case where there are no relevant event triggers.  Your proposed change
would avoid doing it twice when asserts are disabled, but the cost
would be that we'd have to do it once when asserts were disabled even
if no event triggers exist.  I don't think that's a good trade-off.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Joe Abbate
Hello hackers,

I've been testing Pyrseas against 9.2rc1.  A test that does a CREATE
OPERATOR is giving a small difference.  Specifically, the test issues
the statement:

CREATE OPERATOR + (PROCEDURE = upper, RIGHTARG = text);

Pyrseas then queries the pg_operator catalog to map the procedure for
output.  Selecting oprcode in 8.4, 9.0, and 9.1, always returns 'upper',
but in 9.2rc1, the value is 'pg_catalog.upper'.  It does not matter
whether pg_catalog is on the search_path or not.

I looked over the release notes but I couldn't find any reference to
this possible change in behavior.  I'd like to confirm whether the
schema-qualified procedure name is a permanent change or an unintended
side effect of something else.

Thanks.

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor comment fixups

2012-08-30 Thread Robert Haas
On Mon, Aug 27, 2012 at 11:21 PM, Jeff Davis  wrote:
> I noticed a couple comments that look wrong to me. Patch attached.

Thanks, committed.  But I updated the parenthesized comment in the
first fix instead of removing it.  Let me know if you see an issue
with that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] rows modified in current transaction

2012-08-30 Thread Miroslav Šimulčík
Hi,

is there any way to check if row have already been modified by the current
transaction? I tried condition txid_current() = xmin, but there is problem
with the savepoints. After every savepoint rows are getting higher xmin
values, but txid_current() remains the same.

Regards,
Miroslav Simulcik


Re: [HACKERS] Chronic performance issue with Replication Failover and FSM.

2012-08-30 Thread Tom Lane
Daniel Farina  writes:
> but just today we promoted another system via streaming replication to
> pick up the planner fix in 9.1.5 (did you know: that planner bug seems
> to make GIN FTS indexes un-used in non-exotic cases, and one goes to
> seqscan?), and then a 40MB GIN index bloated to two gigs on a 1.5GB
> table over the course of maybe six hours.

I think this is probably unrelated to what Josh was griping about:
even granting that the system forgot any free space that had been
available within the original 40MB, that couldn't in itself lead
to eating more than another 40MB, no?  My guess is something is
broken about the oldest-xmin-horizon mechanism, such that VACUUM
is failing to recover space.  Can you put together a self-contained
test case that exhibits similar growth?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query plan optimization for CHECK NO INHERIT and single table?

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 2:00 AM, Matthias  wrote:
> Hey,
>
> I tried out the new CHECK NO INHERIT feature for inherited tables.
> There seems to be an opportunity to generate slightly better query
> plans sometimes. E.g. when I do
>
> SELECT * FROM base WHERE partition_id = 3
>
> and there exists only one child table for which partition_id = 3 is
> true I guess the query plan could just do a seq/index/whatever scan on
> that table. Right now the query plan has an intermediate "Append"
> node. This seems only useful if the results of multiple child tables
> would need to be included.

I think it's needed to translate between the child's set of attribute
numbers and their parent's set of attribute numbers - that is, the
child could have the columns in a different order, or could have extra
ones.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wal_buffers

2012-08-30 Thread Robert Haas
On Wed, Aug 29, 2012 at 10:25 PM, Peter Geoghegan  wrote:
> On 19 February 2012 05:24, Robert Haas  wrote:
>> I have attached tps scatterplots.  The obvious conclusion appears to
>> be that, with only 16MB of wal_buffers, the buffer "wraps around" with
>> some regularity: we can't insert more WAL because the buffer we need
>> to use still contains WAL that hasn't yet been fsync'd, leading to
>> long stalls.  More buffer space ameliorates the problem.
>
> Incidentally, I wondered if we could further improve group commit
> performance by implementing commit_delay with a WaitLatch call, and
> setting the latch in the event of WAL buffers wraparound (or rather, a
> queued wraparound request - a segment switch needs WALWriteLock, which
> the group commit leader holds for a relatively long time during the
> delay). I'm not really sure how significant a win this might be,
> though. There could be other types of contention, which could be
> considerably more significant. I'll try and take a look at it next
> week.

I have a feeling that one of the big bottlenecks here is that we force
an immediate fsync when we reach the end of a segment.  I think it was
originally done that way to keep the code simple, and it does
accomplish that, but it's not so hot for performance.  More generally,
I think we really need to split WALWriteLock into two locks, one to
protect the write position and the other to protect the flush
position.  I think we're often ending up with a write (which is
usually fast) waiting for a flush (which is often much slower) when in
fact those things ought to be able to happen in parallel.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier

2012-08-30 Thread Bruce Momjian
On Tue, May 22, 2012 at 08:29:48AM -0400, Robert Haas wrote:
> On Tue, May 22, 2012 at 1:50 AM, Tom Lane  wrote:
> > Currently, the planner keeps paths that appear to win on the grounds of
> > either cheapest startup cost or cheapest total cost.  It suddenly struck
> > me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor
> > fast-start preference, etc) we could know a-priori that cheapest startup
> > cost is not going to be interesting, and hence immediately discard any
> > path that doesn't win on total cost.
> >
> > This would require some additional logic to detect whether the case
> > applies, as well as extra complexity in add_path.  So it's possible
> > that it wouldn't be worthwhile overall.  Still, it seems like it might
> > be a useful idea to investigate.
> >
> > Thoughts?
> 
> Yeah, I think we should investigate that.  Presumably you could easily
> have a situation where one part of the tree is under a LIMIT or EXISTS
> and therefore needs to preserve fast-start plans but the rest of the
> (potentially large) tree isn't, so we need something fairly
> fine-grained, I think.  Maybe we could add a flag to each RelOptInfo
> indicating whether fast-start plans should be kept, or something like
> that.

Is this a TODO?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-08-30 Thread Robert Haas
On Wed, Aug 29, 2012 at 6:39 PM, Tom Lane  wrote:
> Well, I see your point about LPAD(), but the problem is how to tell
> the difference between a harmless cast omission and an actual mistake
> that the user will be very grateful if we point out.  If we allow
> implicit casts to text in the general case in function/operator calls,
> we are definitely going to re-introduce a lot of room for mistakes.

I concede that point.  :-)

> Upthread you were complaining about how we'd reject calls even when
> there was only one possible interpretation.  I wonder whether there'd be
> any value in taking that literally: that is, allow use of assignment
> rules when there is, in fact, exactly one function with the right number
> of parameters visible in the search path.  This would solve the LPAD()
> problem (at least as stated), and probably many other practical cases
> too, since I admit your point that an awful lot of users do not use
> function overloading.  The max() example I mentioned earlier would not
> get broken since there's more than one max(), and in general it seems
> likely that cases where there's a real risk would involve overloaded
> names.

That's an interesting idea.  I like it.

> The main downside I can see is that code that used to work is likely
> to stop working as soon as someone creates a potential overloading
> situation.  Worse, the error message could be pretty confusing, since
> if you had been successfully calling f(smallint) with f(42), you'd get
> "f(integer) does not exist", not something like "f() is ambiguous",
> after adding f(float8) to the mix.  This seems related to the confusing
> changes in regression test cases that I got in my experiments yesterday.

One thought I had when looking at those messages was that, in some
ways, the new messages were actually less confusing than the old
messages. I mean, if you try to call f(42) and you get f(integer) does
not exist, ok, you'll probably figure out that the issue is with the
argument type, since you most likely know that an f of some type does
in fact exist.  But it would be even more clear if the error message
said, ok, so there is an f, but I'm not going to call it because the
argument types don't match closely enough.  The distinction would be
even more useful if the function happens to be called snuffleupagus
rather than f, because then when you call snufleupagus(42.0), it'll
tell you "i know nothing about a function by that name" whereas when
you call snuffleupagus(42) it'll tell you "i know about a function by
that name, but not with those argument types".  I've certainly
encountered this confusion before whilst debugging my own and other
people's databases: is it giving me that error because the function
doesn't exist, or because of an argument type mismatch?

> This may be sufficient reason to reject the idea, since the very last
> thing we need in this area is any degradation in the relevance of the
> error messages.
>
>> ... as long as I work for a company that helps
>> people migrate from other database systems, I'm not going to be able
>> to stop caring about this issue even in cases where I don't personally
>> think implicit casting is a good idea, because other people who are
>> not me have tens of thousands of lines of procedural code written for
>> those other systems and if you tell them they've got to go through and
>> add hundreds or thousands of casts before they can migrate, it tends
>> to turn them off.  Maybe there's no perfect solution to that problem,
>> but the status quo is definitely not perfect either.
>
> Meh.  I tend to think that a better solution to those folks' problem is
> a package of add-on casts that they could install for use with their
> legacy code; not dumbing down the system's error detection capability
> for everyone.  Peter's original try at re-adding implicit text casts
> in that way didn't work very well IIRC, but maybe we could try harder.

Well, the big problem that you run into is that when you add casts,
you tend to create situations that the type system thinks are
ambiguous.  A particular example of this is textanycat, anytextcat,
and plain old textcat.  If you start adding casts, the system can get
confused about which one it's supposed to call in which situation.
The frustrating thing is that we don't really care.  The only reason
why there are three different operators in the first place is because
we want to make sure that everything someone does will match one of
them.  But then if something matches two of them, we error out
unnecessarily.

It would be nice to have a way to say "among this group of functions,
we don't care" or perhaps "among this group of functions, here is a
preference ordering; in case of doubt, pick the one with the highest
preference".  But in some sense I feel that that isn't really solving
the problem, because the only reason those extra functions exist in
the first place is to work around the fact that sometimes the system
doesn't perform typecasts in si

  1   2   >