Re: [HACKERS] xlog file naming

2012-08-22 Thread Peter Eisentraut
On Tue, 2012-08-21 at 12:01 -0400, Robert Haas wrote:
> On Wed, Aug 15, 2012 at 8:43 PM, Bruce Momjian  wrote:
> > Are there any TODO items here?
> 
> It's possible there's something we want to change here, but it's far
> from obvious what that thing is.  Our WAL file handling is
> ridiculously hard to understand, but the problem with changing it is
> that there will then be two things people have to understand, and a
> lot of tools that have to be revamped.  It isn't clear that it's worth
> going through that kind of pain for a minor improvement in clarity.

The idea was that since we already broke some tools, possibly silently
(...FF files that they previously skipped), a more radical renaming
might break those tools more obviously, and make some other things
simpler/easier down the road.






[HACKERS] Why does analyze_new_cluster.sh use sleep?

2012-08-22 Thread Peter Eisentraut
The script analyze_new_cluster.sh output by pg_upgrade contains several
"sleep" calls (see contrib/pg_upgrade/check.c).  What is the point of
this?  If the purpose of this script is to get the database operational
again as soon as possible, waiting a few seconds doing nothing surely
isn't helping.

I could maybe see the point of waiting a bit between the different
vacuumdb calls, to catch some breath, but the one before the first call
to vacuumdb is highly dubious to me.






Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-22 Thread Amit Kapila
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Thursday, August 23, 2012 10:10 AM
I wrote:
>> ... I really can't take responsibility for any of this since
>> I don't have a Windows development environment.  One of the Windows-
>> hacking committers needs to pick this issue up.  Anyone?

> [ crickets ]

> Accordingly, I don't think this is a release-blocker, so I'm going to
> move it to the non-blocker section of the open-items page.

> Anybody who wants to fix it is surely welcome to, but I'm not going
> to consider this item as a reason to postpone RC1.

Let me know if anything is expected from my side.

With Regards,
Amit Kapila.





[HACKERS] [v9.3] writable foreign tables

2012-08-22 Thread Kohei KaiGai
Hello,

The attached patch is just a proof-of-concept of the writable foreign table
feature; thus, I don't expect it to get merged at the upcoming commit
fest. The purpose of this patch is to find out the best way to support
"write stuff" in FDW.

The basic idea of this patch is to utilize the "ctid" field to carry an
identifier of the particular foreign row to be updated or deleted. It is
carried from the scan stage to the modify stage, just as for a regular
table. The newly added methods are then invoked at ExecUpdate or
ExecDelete with this "pseudo ctid", so the FDW is able to modify the
target foreign row.
It is the responsibility of the FDW extension (and the DBA) to ensure
that each foreign row has a unique identifier that fits into at most a
48-bit-wide integer. In the case of pgsql_fdw, the "ctid" of the remote
table can serve as the "pseudo ctid" as-is. For other RDBMSs, the DBA
will need to indicate which column should serve that role.
INSERT is simple enough; all we need to do is carry every field of the
new tuple over to the remote side.

This patch adds five new methods to the FdwRoutine structure.
BeginForeignModify and EndForeignModify give the FDW a chance
to initialize and destroy private state that can be attached to the
ResultRelInfo. As their names suggest, ExecForeignInsert, ExecForeignDelete
and ExecForeignUpdate are invoked for each affected tuple, in place of
the heap_*() calls used for regular tables. If NULL is set for any of
them, it means the FDW does not support that operation.

My intention is that the FDW sets up, in BeginForeignModify(), a prepared
statement that modifies the particular remote row identified by the pseudo
ctid. Each ExecForeign*() call then executes that prepared statement with
the given pseudo ctid.
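For illustration, an FDW might wire up the new callbacks roughly as
follows. The callback names are the ones the patch adds; the signatures
shown here are simplified and are not the final API:

/* illustrative sketch only: simplified signatures, not the final API */
#include "postgres.h"
#include "fmgr.h"
#include "foreign/fdwapi.h"
#include "nodes/execnodes.h"
#include "executor/tuptable.h"
#include "storage/itemptr.h"

static void
my_BeginForeignModify(ResultRelInfo *rinfo)
{
	/* PREPARE an UPDATE/DELETE statement keyed on the pseudo ctid and
	 * keep its handle in private state hung off the ResultRelInfo */
}

static void
my_ExecForeignUpdate(ResultRelInfo *rinfo, ItemPointer pseudo_ctid,
					 TupleTableSlot *newtuple)
{
	/* run the prepared statement for this pseudo ctid and new tuple */
}

static void
my_EndForeignModify(ResultRelInfo *rinfo)
{
	/* DEALLOCATE the prepared statement and free the private state */
}

Datum
my_fdw_handler(PG_FUNCTION_ARGS)
{
	FdwRoutine *routine = makeNode(FdwRoutine);

	/* ... scan-side callbacks as today ... */
	routine->BeginForeignModify = my_BeginForeignModify;
	routine->ExecForeignInsert = NULL;	/* NULL means "not supported" */
	routine->ExecForeignUpdate = my_ExecForeignUpdate;
	routine->ExecForeignDelete = NULL;
	routine->EndForeignModify = my_EndForeignModify;

	PG_RETURN_POINTER(routine);
}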

The patched portion in contrib/file_fdw does not actually do anything
useful; it just prints a message for each invocation. It is only a
proof-of-concept to show that an implementation on top of a real RDBMS
would be possible.

Here is what it looks like when file_fdw serves as the wrapper behind "ftbl":

postgres=# SELECT ctid, * FROM ftbl;
  ctid  |  a  |  b
--------+-----+-----
 (0,1)  | 101 | aaa
 (0,2)  | 102 | bbb
 (0,3)  | 103 | ccc
 (0,4)  | 104 | ddd
 (0,5)  | 105 | eee
 (0,6)  | 106 | fff
 (0,7)  | 107 | ggg
 (0,8)  | 108 | hhh
 (0,9)  | 109 | iii
 (0,10) | 110 | jjj
(10 rows)
 ==> ctid is used to carry the row identifier; in this example, the line number.

postgres=# UPDATE ftbl SET a = a + 1 WHERE a > 107;
INFO:  ftbl is the target relation of UPDATE
INFO:  fdw_file: BeginForeignModify method
INFO:  fdw_file: UPDATE (lineno = 8)
INFO:  fdw_file: UPDATE (lineno = 9)
INFO:  fdw_file: UPDATE (lineno = 10)
INFO:  fdw_file: EndForeignModify method
UPDATE 3
postgres=# DELETE FROM ftbl WHERE a BETWEEN 103 AND 106;
INFO:  ftbl is the target relation of DELETE
INFO:  fdw_file: BeginForeignModify method
INFO:  fdw_file: DELETE (lineno = 3)
INFO:  fdw_file: DELETE (lineno = 4)
INFO:  fdw_file: DELETE (lineno = 5)
INFO:  fdw_file: DELETE (lineno = 6)
INFO:  fdw_file: EndForeignModify method
DELETE 4


This patch does not deal with transaction control at all.
Per the discussion at the developer meeting in Ottawa,
I didn't include such complex stuff in the first step.
(We can probably implement it using XactCallback...)
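Just to sketch the XactCallback idea (the remote commit/abort handling
below is only a placeholder):

/* minimal sketch: hooking an FDW into transaction end via XactCallback */
#include "postgres.h"
#include "fmgr.h"
#include "access/xact.h"

PG_MODULE_MAGIC;

static void
my_fdw_xact_callback(XactEvent event, void *arg)
{
	switch (event)
	{
		case XACT_EVENT_COMMIT:
			/* e.g. send COMMIT to the remote server */
			break;
		case XACT_EVENT_ABORT:
			/* e.g. send ROLLBACK to the remote server */
			break;
		default:
			break;
	}
}

void
_PG_init(void)
{
	RegisterXactCallback(my_fdw_xact_callback, NULL);
}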

Thanks,
-- 
KaiGai Kohei 


pgsql-v9.3-writable-fdw-poc.v1.patch
Description: Binary data



Re: [HACKERS] Clean up of postgresql_fdw.c

2012-08-22 Thread Michael Paquier
Sorry, this was not intended for this mailing list.
My apologies.
-- 
Michael Paquier
http://michael.otacoo.com


[HACKERS] Clean up of postgresql_fdw.c

2012-08-22 Thread Michael Paquier
Hi all,

I am looking at postgresql_fdw.c and I am cleaning up the functions inside
it.
Please find attached a patch that removes is_immutable_func as it does
exactly the same thing as func_volatile in lsyscache.c.
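For reference, the equivalent check via lsyscache is a one-liner:

/* same answer as the removed is_immutable_func(funcid) */
#include "postgres.h"
#include "catalog/pg_proc.h"
#include "utils/lsyscache.h"

static bool
function_is_immutable(Oid funcid)
{
	return func_volatile(funcid) == PROVOLATILE_IMMUTABLE;
}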
There is still one function remaining in postgresql_fdw.c,
pgxc_is_expr_shippable, which is related to the FQS planner.
Ashutosh, any thoughts about where to put it?
-- 
Michael Paquier
http://michael.otacoo.com


20120823_pgfdw_cleanup.patch
Description: Binary data



Re: [HACKERS] temporal support patch

2012-08-22 Thread Jeff Davis
On Tue, 2012-08-21 at 17:07 -0500, Kevin Grittner wrote:
> The fact that it has an unknown sequence number or timestamp for
> purposes of ordering visibility of transactions doesn't mean you
> can't show that it completed in an audit log.  In other words, I
> think the needs for a temporal database are significantly different
> from the needs of an auditing system.

...
 
> I would assume an audit log would have very different needs from
> tracking changes for a temporal database view.  It even seems
> possible that you might want to see what people *looked* at, versus
> just changes.  You might want to see transactions which were rolled
> back, which are irrelevant for a temporal view.  If we're talking
> about an auditing system, we're talking about an almost completely
> different animal from a temporal view of the database.

OK, I think I see what you're saying now. Basically, an audit log means
different things to different people, so I think it confused the issue.
But "temporal" is fairly vague, too. It also seems like there might be a
lot of overlap, depending on how we define those terms.

I am most interested in the topic you brought up about serializability
and system time (transaction time), because it would be a fundamental
piece upon which we can build a lot of these other things (including
what could be called an audit log).

Regards,
Jeff Davis





Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-22 Thread Craig Ringer

On 08/23/2012 12:40 PM, Tom Lane wrote:

I wrote:

... I really can't take responsibility for any of this since
I don't have a Windows development environment.  One of the Windows-
hacking committers needs to pick this issue up.  Anyone?


[ crickets ]

I guess everybody who might take an interest in this is out sailing...


If it's critical I can do some test builds, but I burned my Windows dev 
env down during a recent upgrade and (thankfully) haven't had a reason 
to re-create it yet so it'd take me a little while.



Accordingly, I don't think this is a release-blocker, so I'm going to
move it to the non-blocker section of the open-items page.


Sounds sensible to me. It won't hurt anyone or damage data, so there's 
little reason not to fix it in a point release.


--
Craig Ringer





Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-22 Thread Tom Lane
I wrote:
> ... I really can't take responsibility for any of this since
> I don't have a Windows development environment.  One of the Windows-
> hacking committers needs to pick this issue up.  Anyone?

[ crickets ]

I guess everybody who might take an interest in this is out sailing...

After further reflection I've realized that, while this is a new bug in
9.2, it is not really a regression from 9.1.  The failure only occurs if
pg_ctl is pointed at a configuration-only directory (one that contains
postgresql.conf but is not the real data directory).  But that is a case
that did not work at all in any previous release, so no users will be
relying on it.

Accordingly, I don't think this is a release-blocker, so I'm going to
move it to the non-blocker section of the open-items page.

Anybody who wants to fix it is surely welcome to, but I'm not going
to consider this item as a reason to postpone RC1.

regards, tom lane




Re: [HACKERS] Audit Logs WAS: temporal support patch

2012-08-22 Thread Jeff Davis
On Tue, 2012-08-21 at 17:56 -0500, Kevin Grittner wrote:
> I don't think the concerns I raised about apparent order of
> execution for serializable transactions apply to audit logs.  If
> we've moved entirely off the topic of the original subject, it is a
> complete non-issue.

Now I'm confused. The serializability issues you were talking about only
seem to matter with respect to system time (a.k.a. transaction time),
right? If the user is supplying the time, then it's a non-issue.

And audit logs are based on system time, so I thought that audit logs
were the case you were talking about.

Regards,
Jeff Davis





Re: [HACKERS] A caveat of partitioning tables in the document

2012-08-22 Thread Tom Lane
Fujii Masao  writes:
> On Wed, Aug 22, 2012 at 12:59 AM, Kasahara Tatsuhito
>  wrote:
>> The latest document (doc/src/sgml/ddl.sgml) says
>> ===
>> 2974
>> 2975 
>> 2976  
>> 2977   Constraint exclusion only works when the query's WHERE
>> 2978   clause contains constants.  A parameterized query will not be
>> 2979   optimized, since the planner cannot know which partitions the
>> 2980   parameter value might select at run time.  For the same reason,
>> 2981   stable functions such as 
>> CURRENT_DATE
>> 2982   must be avoided.
>> 2983  
>> 2984 > ===
>> but in my understanding, this problem will be solved on 9.2 (with
>> parameterized plans).
>> 
>> Or some issues still remain ?

> At least this limitation "A parameterized query will not be optimized,
> since the planner cannot know which partitions the parameter value
> might select at run time." has been solved unless I'm missing something.
> So we should just get rid of that sentence from the document.

Yes, I think we can take that out now.  The issue with stable functions
still remains though.

regards, tom lane




Re: [HACKERS] Unexpected plperl difference between 8.4 and 9.1

2012-08-22 Thread Tom Lane
Andrew Dunstan  writes:
> On 08/22/2012 11:08 PM, Tom Lane wrote:
>> Curiously, I do *not* see the bug on my Fedora 16 machine, running
>> perl-5.14.2-198.fc16.x86_64

> Possibly we need to look at the output of perl -V to see if there's a 
> difference.

Mine sez

Summary of my perl5 (revision 5 version 14 subversion 2) configuration:
   
  Platform:
osname=linux, osvers=2.6.32-220.4.1.el6.x86_64, 
archname=x86_64-linux-thread-multi
uname='linux x86-17.phx2.fedoraproject.org 2.6.32-220.4.1.el6.x86_64 #1 smp 
thu jan 19 14:50:54 est 2012 x86_64 x86_64 x86_64 gnulinux '
config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector --param=ssp-buffer-size=4  -m64 -mtune=generic 
-Dccdlflags=-Wl,--enable-new-dtags -DDEBUGGING=-g -Dversion=5.14.2 
-Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, 
Inc. -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr/local 
-Dsitelib=/usr/local/share/perl5 -Dsitearch=/usr/local/lib64/perl5 
-Dprivlib=/usr/share/perl5 -Dvendorlib=/usr/share/perl5/vendor_perl 
-Darchlib=/usr/lib64/perl5 -Dvendorarch=/usr/lib64/perl5/vendor_perl 
-Darchname=x86_64-linux-thread-multi -Dlibpth=/usr/local/lib64 /lib64 
/usr/lib64 -Duseshrplib -Dusethreads -Duseithreads -Dusedtrace=/usr/bin/dtrace 
-Duselargefiles -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog 
-Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly 
-Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto 
-Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto 
-Dscriptdir=/usr/bin'
hint=recommended, useposix=true, d_sigaction=define
useithreads=define, usemultiplicity=define
useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
use64bitint=define, use64bitall=define, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
  Compiler:
cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe 
-fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE 
-D_FILE_OFFSET_BITS=64',
optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
-fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic',
cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe 
-fstack-protector -I/usr/local/include'
ccversion='', gccversion='4.6.3 20120306 (Red Hat 4.6.3-2)', gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', 
lseeksize=8
alignbytes=8, prototype=define
  Linker and Libraries:
ld='gcc', ldflags =' -fstack-protector'
libpth=/usr/local/lib64 /lib64 /usr/lib64
libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc 
-lgdbm_compat
perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
libc=, so=so, useshrplib=true, libperl=libperl.so
gnulibc_version='2.14.90'
  Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, 
ccdlflags='-Wl,--enable-new-dtags -Wl,-rpath,/usr/lib64/perl5/CORE'
cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall 
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
--param=ssp-buffer-size=4 -m64 -mtune=generic'


Characteristics of this binary (from libperl): 
  Compile-time options: MULTIPLICITY PERL_DONT_CREATE_GVSV
PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP
PERL_PRESERVE_IVUV USE_64_BIT_ALL USE_64_BIT_INT
USE_ITHREADS USE_LARGE_FILES USE_PERLIO USE_PERL_ATOF
USE_REENTRANT_API
  Built under linux
  Compiled at May 30 2012 15:41:26
  @INC:
/usr/local/lib64/perl5
/usr/local/share/perl5
/usr/lib64/perl5/vendor_perl
/usr/share/perl5/vendor_perl
/usr/lib64/perl5
/usr/share/perl5
.

regards, tom lane




Re: [HACKERS] Unexpected plperl difference between 8.4 and 9.1

2012-08-22 Thread Andrew Dunstan


On 08/22/2012 11:08 PM, Tom Lane wrote:

Alex Hunsaker  writes:

I can reproduce the failure with 5.14.2

Me too, however it works for me with 5.14.1, looking more like a strange
perl bug.

Curiously, I do *not* see the bug on my Fedora 16 machine, running
perl-5.14.2-198.fc16.x86_64

I wondered if Fedora is carrying a patch that fixes it, but none of
their dozen or so patches look related:
http://pkgs.fedoraproject.org/cgit/perl.git/tree/?h=f16



Possibly we need to look at the output of perl -V to see if there's a 
difference.


cheers

andrew




Re: [HACKERS] Unexpected plperl difference between 8.4 and 9.1

2012-08-22 Thread Tom Lane
Alex Hunsaker  writes:
>> I can reproduce the failure with 5.14.2

> Me too, however it works for me with 5.14.1, looking more like a strange
> perl bug.

Curiously, I do *not* see the bug on my Fedora 16 machine, running
perl-5.14.2-198.fc16.x86_64

I wondered if Fedora is carrying a patch that fixes it, but none of
their dozen or so patches look related:
http://pkgs.fedoraproject.org/cgit/perl.git/tree/?h=f16

Baffled...

regards, tom lane




Re: [HACKERS] Unexpected plperl difference between 8.4 and 9.1

2012-08-22 Thread Alex Hunsaker
On Mon, Aug 20, 2012 at 10:14 AM, Alvaro Herrera
wrote:

> Excerpts from Alex Hunsaker's message of lun ago 20 12:03:11 -0400 2012:
> > On Sun, Aug 19, 2012 at 2:26 PM, Joel Jacobson  wrote:
> >
> > > After upgrading from 8.4 to 9.1, one of my plperl functions stopped
> > > working properly.
>
> I can reproduce the failure with 5.14.2
>
>
Me too, however it works for me with 5.14.1, looking more like a strange
perl bug.

I've tried reproducing this in straight perl but I don't think I can
without using some C code; specifically, the call to SvPVutf8 in sv2cstr()
seems to be the culprit. If I change that to SvPV() it seems to work. I'm
wondering if there is some strange caching of utf8 strings going on that =~
m// is not clearing.

I'll keep digging and hopefully be able to narrow it down to a commit
between 5.14.1 and 5.14.2 so we can understand more about what's going on here.
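
For anyone following along, the conversion in question looks roughly like
this (heavily simplified from plperl's sv2cstr(); encoding verification and
error handling omitted):

/* heavily simplified sketch of the scalar-to-C-string conversion */
#include "EXTERN.h"
#include "perl.h"
#include <string.h>

static char *
sv_to_cstring(SV *sv)
{
	STRLEN		len;
	char	   *val;

	/* plperl forces a UTF-8 encoded copy of the scalar here ... */
	val = SvPVutf8(sv, len);

	/*
	 * ... whereas plain SvPV() (which appears to dodge the bug) hands back
	 * the string in whatever encoding the scalar currently carries:
	 *
	 *		val = SvPV(sv, len);
	 */

	return strndup(val, len);
}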


Re: [HACKERS] Is this an appropriate item?

2012-08-22 Thread Tatsuo Ishii
> On Wed, Aug 22, 2012 at 9:32 PM, Tatsuo Ishii  wrote:
>> Hi,
>>
>> I found following item in the Developer FAQ.
>> I don't see why this is related to developers.
>> 
>>  Why aren't there more compression options when dumping tables?
>>
> 
> it looks more like a TODO, or we think we are not interested on this?
> if the latter, then it probably is part of the "things we don't want"
> of the dev faq
> 
> btw, the previous item is this:
> "Why don't you use threads, raw devices, async-I/O,  favorite wizz-bang feature here>?"
> 
> what's the difference about the async-I/O mentioned here and
> "synchronous_commit=off", if there is none maybe we should remove that
> part

BTW, this:

>>  Why aren't there more compression options when dumping tables?

is a subsection of this:
> "Why don't you use threads, raw devices, async-I/O,  favorite 
> wizz-bang feature here>?"

I don't see any relationship between the former and the latter.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] Is this an appropriate item?

2012-08-22 Thread Jaime Casanova
On Wed, Aug 22, 2012 at 9:32 PM, Tatsuo Ishii  wrote:
> Hi,
>
> I found following item in the Developer FAQ.
> I don't see why this is related to developers.
> 
>  Why aren't there more compression options when dumping tables?
>

It looks more like a TODO item, or do we think we are not interested in
this? If the latter, then it probably belongs in the "things we don't want"
part of the dev FAQ.

btw, the previous item is this:
"Why don't you use threads, raw devices, async-I/O, <insert your favorite
wizz-bang feature here>?"

What's the difference between the async-I/O mentioned here and
"synchronous_commit=off"? If there is none, maybe we should remove that
part.

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación




[HACKERS] Is this an appropriate item?

2012-08-22 Thread Tatsuo Ishii
Hi,

I found the following item in the Developer FAQ.
I don't see why this is related to developers.

 Why aren't there more compression options when dumping tables?

pg_dump's built-in compression method is gzip. The primary alternative, bzip2, 
is normally far too slow to be useful when dumping large tables.

The two main alternatives regularly proposed for better built-in compression at 
good speeds are LZO and LZMA/LZMA2/XZ. LZO is released under the GPL, 
incompatible with PostgreSQL. The LZMA2 code has been released into the public 
domain, but the C port is a secondary one (C++ is the main development focus) 
whose code quality hasn't seemed appropriate for this project. And this whole 
area has traditionally been filled with patent issues that go beyond just the 
restrictions of the software license.

Another limitation on changing this is that pg_dump output is intended to be 
archivable, so we had better be prepared to support compression methods for a 
very long time. The "latest and greatest" compression method is exactly what we 
*don't* want.

See the archives for an idea what characteristics an alternate compression tool 
would need to have in order to be considered for use in core PostgreSQL. 

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] larger shared buffers slows down cluster

2012-08-22 Thread Andrew Dunstan


On 08/22/2012 05:19 PM, Jeff Janes wrote:




 Shared Buffers   Time
           48Gb   2058ms
            8Gb    372ms
            1Gb     67ms


Is this expected behaviour?

Yeah.  Clustering the table means that all the indexes and the old
version of the table all get dropped, and each time something is
dropped the entire buffer pool is scoured to remove the old buffers.

In my hands, this is about 10 times better in 9.2 than 9.1.4, at 8GB.
Because now the scouring is done once per object, not once per fork.
Also, the check is done without an initial spinlock.

It perhaps could be improved further by only scouring the pool once,
at the end of the transaction, with a hash of all objects to be
dropped.





FYI, I have rerun the tests on amazon with 9.2 BETA - the improvement I 
saw ranged from a factor of roughly 2 (with 1Gb of shared memory) to 6 
(with 48Gb).


cheers

andrew




Re: [HACKERS] [PATCH] Docs: Make notes on sequences and rollback more obvious

2012-08-22 Thread Craig Ringer

On 08/21/2012 11:18 PM, Robert Haas wrote:

On Mon, Aug 20, 2012 at 4:45 AM, Craig Ringer  wrote:

Trying again with the attachments; the archiver only seemed to see the first
patch despite all three being attached. Including patches inline; if you
want 'em prettier, see:

   https://github.com/ringerc/postgres/tree/sequence_documentation_fixes


Subject: [PATCH 1/3] Make sure you can't read through mvcc.sgml without
  realising that not everything is MVCC.



The first of these three patches looks good to me, so I committed it.
I am not convinced that the others are ready to go in.  AFAICS, there
hasn't been any discussion of whether a list of non-transactional
features would be a useful thing to have, or if so where it should be
located in the docs and what should go into it.  I'm not necessarily
opposed to adding something, but I think it needs some actual
discussion before we commit anything.


Fine by me; I just thought a concrete proposed change might get people 
talking about it better than my doing some more broad hand-waving on the 
topic.


Anyone?

Should we add a section that lists exceptions to normal transactional 
behaviour in one place, so instead of having to say "SEQUENCEs and some 
other features" or "various types, functions and features" there's 
something *concrete* to point to when discussing transactional oddities?



+
+   
+Exceptions to normal transactional rules
+
+
+ Some PostgreSQL features, functions and data types differ from the
+ usual transactional behaviour described in this chapter. Differences
+ are generally mentioned in the documentation sections for the
+ features they affect. Such exceptions are collected here for
+ easy reference.
+
+
+
+ The following actions and features don't follow the typical
+ transactional rules:
+
+
+
+ 
+  
+   Serial pseudo-types 
+  
+ 
+ 
+  
+   SEQUENCEs - 
+  
+ 
+ 
+  
+   Advisory locks - 
+  
+ 
+ 
+  
+   Disk writes to files outside the database, as performed by
+   COPY ... TO, adminpack functions, and other 
add-ons.

+   See , .
+  
+ 
+ 
+  
+   Any network I/O or inter-process communication not explicitly
+   described as transactional in its documentation. For example,
+   sending an email from PL/PerlU would not be transactional;
+   the email would be sent before the transaction commits and
+   could not be un-sent if the transaction were to roll back.
+ 
+
+
+
+ 
+  When working with external non-transactional resources like files
+  on disk or network sockets the two-phase commit feature can be
+  useful. See: 
+ 
+ 
+  LISTEN/NOTIFY provides a lighter weight but still 
transaction-friendly method of
+  triggering changes outside the database in response to changes 
inside the

+  database. A LISTENing helper program running outside the database can
+  perform actions when it gets a NOTIFY after a transaction 
commits.  See:

+  .
+ 
+
+
+   
+
   





Re: [HACKERS] 64-bit API for large object

2012-08-22 Thread Tatsuo Ishii
> On Wed, 2012-08-22 at 01:14 -0400, Tom Lane wrote:
>> Peter Eisentraut  writes:
>> > On Wed, 2012-08-22 at 07:27 +0900, Tatsuo Ishii wrote:
>> >> I found this in the TODO list:
>> >> Add API for 64-bit large object access 
>> >> If this is a still valid TODO item and nobody is working on this, I
>> >> would like to work in this.
>> 
>> > Large objects are limited to 2 GB in size, so a 64-bit API doesn't sound
>> > very useful to me at the moment.
>> 
>> Not entirely.  pg_largeobject.pageno is int32, but that's still 2G pages
>> not bytes, so there's three or so orders of magnitude that could be
>> gotten by expanding the client-side API before we'd have to change the
>> server's on-disk representation.
> 
> Well then a 64-bit API would be very useful.  Go for it.  :-)

Ok, I will do it.
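
For reference, the kind of client-side additions being discussed might look
something like the following; the names and types are purely tentative at
this point:

/*
 * Tentative sketch of 64-bit libpq large object entry points.  The server
 * can already address 2^31 pages of LOBLKSIZE (BLCKSZ/4 = 2kB by default),
 * i.e. about 4TB per large object, so widening the client-side offsets
 * from 32 to 64 bits is what buys those extra orders of magnitude.
 */
#include "libpq-fe.h"

typedef long long pg_lo_off_t;	/* placeholder for a 64-bit offset type */

extern pg_lo_off_t lo_lseek64(PGconn *conn, int fd, pg_lo_off_t offset, int whence);
extern pg_lo_off_t lo_tell64(PGconn *conn, int fd);
extern int	lo_truncate64(PGconn *conn, int fd, pg_lo_off_t len);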
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] larger shared buffers slows down cluster

2012-08-22 Thread Andrew Dunstan


On 08/22/2012 05:19 PM, Jeff Janes wrote:

On Wed, Aug 22, 2012 at 1:48 PM, Andrew Dunstan  wrote:

This problem has been reported by a client.

Consider the following very small table test case:

create table bar as select a,b,c,d,e from generate_series(1,2) a,
generate_series(3,4) b, generate_series( 5,6) c,
generate_series(7,8) d, generate_series(9,10) e;
create index bar_a on bar(a);
create index bar_b on bar(b);
create index bar_c on bar(c);
create index bar_d on bar(d);
create index bar_e on bar(e);
create unique index bar_abcde on bar(a,b,c,d,e);


Now running:

cluster bar using bar_abcde;


appears to be very sensitive to the shared buffers setting. In an amazon
very large memory instance (64GB) and PostgreSQL 9.1.4, I observed the
following timings:


 Shared Buffers   Time
           48Gb   2058ms
            8Gb    372ms
            1Gb     67ms


Is this expected behaviour?

Yeah.  Clustering the table means that all the indexes and the old
version of the table all get dropped, and each time something is
dropped the entire buffer pool is scoured to remove the old buffers.

In my hands, this is about 10 times better in 9.2 than 9.1.4, at 8GB.
Because now the scouring is done once per object, not once per fork.
Also, the check is done without an initial spinlock.

It perhaps could be improved further by only scouring the pool once,
at the end of the transaction, with a hash of all objects to be
dropped.


If so, is there a good explanation? I'm not sure
what other operations might be affected this way.

drop, truncate, reindex, vacuum full.  What else causes a table to be
re-written?



OK, thanks for the info.

cheers

andrew






Re: [HACKERS] larger shared buffers slows down cluster

2012-08-22 Thread Jeff Janes
On Wed, Aug 22, 2012 at 1:48 PM, Andrew Dunstan  wrote:
>
> This problem has been reported by a client.
>
> Consider the following very small table test case:
>
>create table bar as select a,b,c,d,e from generate_series(1,2) a,
>generate_series(3,4) b, generate_series( 5,6) c,
>generate_series(7,8) d, generate_series(9,10) e;
>create index bar_a on bar(a);
>create index bar_b on bar(b);
>create index bar_c on bar(c);
>create index bar_d on bar(d);
>create index bar_e on bar(e);
>create unique index bar_abcde on bar(a,b,c,d,e);
>
>
> Now running:
>
>cluster bar using bar_abcde;
>
>
> appears to be very sensitive to the shared buffers setting. In an amazon
> very large memory instance (64GB) and PostgreSQL 9.1.4, I observed the
> following timings:
>
>
> Shared Buffers   Time
>           48Gb   2058ms
>            8Gb    372ms
>            1Gb     67ms
>
>
> Is this expected behaviour?

Yeah.  Clustering the table means that all the indexes and the old
version of the table all get dropped, and each time something is
dropped the entire buffer pool is scoured to remove the old buffers.

In my hands, this is about 10 times better in 9.2 than 9.1.4, at 8GB.
Because now the scouring is done once per object, not once per fork.
Also, the check is done without an initial spinlock.

It perhaps could be improved further by only scouring the pool once,
at the end of the transaction, with a hash of all objects to be
dropped.
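
Very roughly, something like this (hand-wavy sketch; the hash table and
where it gets filled in are made up for illustration):

/* hypothetical "scour the pool once per transaction" loop */
#include "postgres.h"
#include "miscadmin.h"
#include "storage/buf_internals.h"
#include "utils/hsearch.h"

static HTAB *pending_rel_drops;		/* would hold the RelFileNodes of all
									 * relations dropped in this transaction */

static void
DropAllPendingRelBuffers(void)
{
	int			i;

	for (i = 0; i < NBuffers; i++)
	{
		BufferDesc *bufHdr = &BufferDescriptors[i];
		bool		found;

		/* unlocked precheck against the hash, as in the 9.2 per-object code */
		(void) hash_search(pending_rel_drops, &bufHdr->tag.rnode,
						   HASH_FIND, &found);
		if (!found)
			continue;

		/* recheck the tag under the buffer header spinlock, then invalidate */
		LockBufHdr(bufHdr);
		/* ... re-test and InvalidateBuffer(bufHdr) here ... */
		UnlockBufHdr(bufHdr);
	}
}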

> If so, is there a good explanation? I'm not sure
> what other operations might be affected this way.

drop, truncate, reindex, vacuum full.  What else causes a table to be
re-written?

Cheers,

Jeff




Re: [HACKERS] larger shared buffers slows down cluster

2012-08-22 Thread Tom Lane
Andrew Dunstan  writes:
> Now running:
> cluster bar using bar_abcde;
> appears to be very sensitive to the shared buffers setting. In an amazon 
> very large memory instance (64GB) and PostgreSQL 9.1.4, I observed the 
> following timings:

>  Shared Buffers   Time
>            48Gb   2058ms
>             8Gb    372ms
>             1Gb     67ms

DropRelFileNodeBuffers, perhaps?  See recent commits to reduce the cost
of that for large shared_buffers, notably
e8d029a30b5a5fb74b848a8697b1dfa3f66d9697 and
ece01aae479227d9836294b287d872c5a6146a11

regards, tom lane




[HACKERS] larger shared buffers slows down cluster

2012-08-22 Thread Andrew Dunstan


This problem has been reported by a client.

Consider the following very small table test case:

   create table bar as select a,b,c,d,e from generate_series(1,2) a,
   generate_series(3,4) b, generate_series( 5,6) c,
   generate_series(7,8) d, generate_series(9,10) e;
   create index bar_a on bar(a);
   create index bar_b on bar(b);
   create index bar_c on bar(c);
   create index bar_d on bar(d);
   create index bar_e on bar(e);
   create unique index bar_abcde on bar(a,b,c,d,e);


Now running:

   cluster bar using bar_abcde;


appears to be very sensitive to the shared buffers setting. In an amazon 
very large memory instance (64GB) and PostgreSQL 9.1.4, I observed the 
following timings:



Shared Buffers   Time
          48Gb   2058ms
           8Gb    372ms
           1Gb     67ms


Is this expected behaviour? If so, is there a good explanation? I'm not 
sure what other operations might be affected this way.


cheers

andrew




Re: [HACKERS] Audit Logs WAS: temporal support patch

2012-08-22 Thread Josh Berkus

> I don't think the concerns I raised about apparent order of
> execution for serializable transactions apply to audit logs.  If
> we've moved entirely off the topic of the original subject, it is a
> complete non-issue.

That's true; your discussion is about Miroslav's original patch.  But a
lot of the other discussion on this thread is not.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Isn't remote_write a really dumb name for that setting?

2012-08-22 Thread Tom Lane
Bruce Momjian  writes:
> On Wed, Aug 22, 2012 at 01:01:04PM -0400, Tom Lane wrote:
>> AFAICT, the remote_write setting for synchronous_commit is named exactly
>> backwards, because the point of the setting is that it *doesn't* wait
>> for the remote to write anything.
>> 
>> As an alternative I suggest "remote_receive".  Perhaps somebody else
>> has a better idea?

> Yes, I didn't like remote_write either;  see this thread:
>   http://archives.postgresql.org/pgsql-hackers/2012-05/msg00375.php

Oh.  After re-reading that thread (and looking at the code to be sure),
I think the mode name is all right; it's the documentation that is 100%
broken.  The actual meaning of the setting is that we wait for the
remote to write() the data, but not fsync() it.  The description in the
SGML docs has nothing to do with reality.

Will fix the docs.

regards, tom lane




Re: [HACKERS] Isn't remote_write a really dumb name for that setting?

2012-08-22 Thread Bruce Momjian
On Wed, Aug 22, 2012 at 01:01:04PM -0400, Tom Lane wrote:
> AFAICT, the remote_write setting for synchronous_commit is named exactly
> backwards, because the point of the setting is that it *doesn't* wait
> for the remote to write anything.
> 
> As an alternative I suggest "remote_receive".  Perhaps somebody else
> has a better idea?

Yes, I didn't like remote_write either;  see this thread:

http://archives.postgresql.org/pgsql-hackers/2012-05/msg00375.php

Yes, please, I would like it changed.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] NOT NULL constraints in foreign tables

2012-08-22 Thread Jeff Davis
On Tue, 2012-08-21 at 10:41 -0400, Robert Haas wrote: 
> The thing to keep in mind here is that EVERY property of a foreign
> table is subject to change at any arbitrary point in time, without our
> knowledge.  ... Why should CHECK constraints be any different than,
> say, column types?

So, let's say someone changes column types from int to bigint on the
remote side, and you still have int on the local side. It continues to
work and everything is fine until all of a sudden you get 2^33 back, and
that generates an error.

That sounds closer to the semantics of constraint enforcement mechanism
#2 than #3 to me. That is, everything is fine until you get something
that you know is wrong, and you throw an error.

> Why should that be any worse with foreign tables than anything else?
> I mean, lots of people, as things stand today, manage to set up
> partitioned tables using CHECK constraints.  There are undoubtedly
> people who don't understand the planner benefit of having an
> appropriate CHECK constraint on each partition, but it's not exactly a
> common cause of confusion.

But there are no consequences there other than performance. With
unenforced constraints, they may get correct results during development
and testing, and wrong results occasionally when in production. That's
hard to explain to a user.

> It seems to me that the term runtime enforcement is a misnomer; you
> can't really "enforce" anything about a foreign table.

Maybe I chose the wrong terms, but there are at least 3 semantically
different concepts. Feel free to suggest a better term.

> If we
> were to propose changing the semantics from the former to the latter,
> we'd be laughed out of town, and rightly so.

I'm not proposing changing the semantics; I'm saying that there are more
than just two semantic options available, and they offer different kinds
of guarantees. Users may be interested in all three for different use cases.

>   I mean, let's
> suppose that we were to allow unique constraints on foreign tables.

I'm sure there are cases where people will not want what I am
suggesting, but I think there are cases where it is plausibly useful.

> Now, if the query is something like "SELECT
> * FROM foreign_table WHERE id = 1", you could fairly cheaply validate
> that there is only one row with id = 1, but that's not the same thing
> as validating that the assumption (namely, that foreign_table (id) is
> unique) is still true.

And if you don't issue a query at all, the constraint might not still be
true; but I don't think that implies that checking it when you do run a
query is useless.

> I think if we go down this road of trying to validate
> remote-side CHECK constraints, we're going to end up with a mishmash
> of cases where constraints are checked and other cases where
> constraints are not checked, and then that really is going to be
> confusing.

If we use keywords to differentiate constraints that are different
semantically, then we can just say that some types of constraints are
allowed on foreign tables and some are not.

I guess what I'd like to avoid is saying that a check constraint on a
regular table means one thing, and the same check constraint on a
foreign table means something else. If we differentiate them by
requiring special keywords like "NOT ENFORCED", then it would be more
user-visible what's going on, and it would allow room for new semantics
later if we want. Normal constraints would be disallowed on foreign
tables, but NOT ENFORCED ones would be allowed.

That brings up another point: what if someone really, really, doesn't
want to pay the overhead of enforcing their constraint on a local table,
but wants the planner benefit? Would they have to make it a remote table
to bypass the constraint check?

Regards,
Jeff Davis








[HACKERS] Isn't remote_write a really dumb name for that setting?

2012-08-22 Thread Tom Lane
AFAICT, the remote_write setting for synchronous_commit is named exactly
backwards, because the point of the setting is that it *doesn't* wait
for the remote to write anything.

As an alternative I suggest "remote_receive".  Perhaps somebody else
has a better idea?

regards, tom lane




Re: [HACKERS] restartpoints stop generating on streaming replication slave

2012-08-22 Thread Mathieu Fenniak
Hi Fujii,

Thanks for the quick reply.

We tried setting log_min_messages to debug2 using set_config(), but
this doesn't seem to take effect on the bgwriter process; if we changed
this in postgresql.conf, we'd have to run with the verbose logging for days
or weeks before the restartpoints stop.  The pg_controldata output definitely
shows that the master is checkpointing normally, but the slave's last
checkpoint corresponds to the last restartpoint logged.  (The pg_controldata
output is in this e-mail from my colleague:
http://archives.postgresql.org/pgsql-general/2012-08/msg00335.php)

We have a theory as to the cause of this problem, today.  We've been
looking into data corruption issues on our slave at the same time as we've
been looking into this problem.  It turns out that our automated slave
startup has been deleting the "backup_label" file while syncing from the
master, causing the slave to miss replaying WAL data, and causing
random-appearing corruption errors on queries from the slave.  At this
point we're theorizing that any issues we have on the slave are probably
issues caused by the corrupt backup restoration; we're going to fix the
corruption issue and then see if this problem ever re-occurs.

Mathieu


On Wed, Aug 22, 2012 at 8:10 AM, Fujii Masao  wrote:

> On Wed, Aug 22, 2012 at 5:52 AM, Mathieu Fenniak
>  wrote:
> > Hi all,
> >
> > I've been investigating an issue with our PostgreSQL 9.1.1 (Linux x86-64
> > CentOS 5.8) database where restartpoints suddenly stop being generated on
> > the slave after working correctly for a week or two.  The symptom of the
> > problem is that the pg_xlog directory on the slave doesn't get cleaned
> up,
> > and the log_checkpoints output (eg. restartpoint starting: time) stops
> > appearing.
> >
> > I was able to extract a core dump of the bgwriter process while it was in
> > BgWriterNap.  I inspected ckpt_start_time and last_checkpoint_time;
> > ckpt_start_time was 1345578533 (... 19:48:53 GMT) and
> last_checkpoint_time
> > was 1345578248 (... 19:44:08 GMT).  Based upon these values, I concluded
> > that it's performing checkpoints but missing the "if (ckpt_performed)"
> > condition (ie. CreateRestartPoint returns false); it's then setting
> > last_checkpoint_time to now - 5 minutes + 15 seconds.
> >
> > There seems to be two causes of a false retval in CreateRestartPoint; the
> > first is if !RecoveryInProgress(), and the second is if "the last
> checkpoint
> > record we've replayed is already our last restartpoint".  The first
> > condition doesn't seem likely; does anyone know how we might be hitting
> the
> > second condition?  We have continuous traffic on the master server in the
> > range of 1000 txn/s, and the slave seems to be completely up-to-date, so
> I
> > don't understand how we could be hitting this condition.
>
> To check whether you really hit either of the above two conditions, could
> you
> set log_min_messages to DEBUG2 on the standby? If you hit either, you'll
> get the log message like "skipping restartpoint..".
>
> Could you execute pg_controldata on both master and standby, and check
> whether their "Latest checkpoint location" are the same?
>
> Regards,
>
> --
> Fujii Masao
>


Re: [HACKERS] Avoiding shutdown checkpoint at failover

2012-08-22 Thread Fujii Masao
On Fri, Aug 17, 2012 at 8:38 AM, Bruce Momjian  wrote:
> On Thu, Mar  8, 2012 at 08:20:02AM -0500, Robert Haas wrote:
>> On Sat, Jan 28, 2012 at 8:57 AM, Simon Riggs  wrote:
>> > On Thu, Jan 26, 2012 at 5:27 AM, Fujii Masao  wrote:
>> >
>> >> One thing I would like to ask is that why you think walreceiver is more
>> >> appropriate for writing XLOG_END_OF_RECOVERY record than startup
>> >> process. I was thinking the opposite, because if we do so, we might be
>> >> able to skip the end-of-recovery checkpoint even in file-based 
>> >> log-shipping
>> >> case.
>> >
>> > Right now, WALReceiver has one code path/use case.
>> >
>> > Startup has so many, its much harder to know whether we'll screw up one of 
>> > them.
>> >
>> > If we can add it in either place then I choose the simplest, most
>> > relevant place. If the code is the same, we can move it around later.
>> >
>> > Let me write the code and then we can think some more.
>>
>> Are we still considering trying to do this for 9.2?  Seems it's been
>> over a month without a new patch, and it's not entirely clear that we
>> know what the design should be.
>
> Did this get completed?

No, not yet.

Regards,

-- 
Fujii Masao




Re: [HACKERS] A caveat of partitioning tables in the document

2012-08-22 Thread Fujii Masao
On Wed, Aug 22, 2012 at 12:59 AM, Kasahara Tatsuhito
 wrote:
> Hi.
>
> The latest document (doc/src/sgml/ddl.sgml) says
> ===
> 2974
> 2975 
> 2976  
> 2977   Constraint exclusion only works when the query's WHERE
> 2978   clause contains constants.  A parameterized query will not be
> 2979   optimized, since the planner cannot know which partitions the
> 2980   parameter value might select at run time.  For the same reason,
> 2981   stable functions such as 
> CURRENT_DATE
> 2982   must be avoided.
> 2983  
> 2984  ===
> but in my understanding, this problem will be solved on 9.2 (with
> parameterized plans).
>
> Or some issues still remain ?

At least this limitation "A parameterized query will not be optimized,
since the planner cannot know which partitions the parameter value
might select at run time." has been solved unless I'm missing something.
So we should just get rid of that sentence from the document.

Regards,

-- 
Fujii Masao




Re: [HACKERS] PostgreSQL 9.2beta4 (& git HEAD) server crash on creating extension plpython3u

2012-08-22 Thread Jan Urbański

On 22/08/12 13:28, Sachin Srivastava wrote:

Yes, It worked for me also..
So will this be a workaround? Or do we intend to use something like
Py_SetPythonHome() before calling Py_Initialize()/


I think the best we can do is to document that for some installations 
you might need to set PYTHONHOME. I don't think we can figure it out at 
runtime (if we could, we could then use setenv to fix it up). This is 
similar to having to set PYTHONPATH if you want to import code installed 
in virtualenvs or other nonstandard locations.


I'm leaning towards just documenting, mucking around with environment 
variables from inside of Postgres' shared libraries seems like a recipe 
for disaster.
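
For completeness, the Py_SetPythonHome() route that was asked about amounts
to roughly this before interpreter startup (Python 3 C API; the path is
purely an example):

/* rough sketch only; the install prefix below is just an example */
#include <Python.h>

static wchar_t python_home[] = L"/opt/Python-3.2";

static void
start_python_interpreter(void)
{
	Py_SetPythonHome(python_home);
	Py_Initialize();
}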


Cheers,
Jan




Re: [HACKERS] [WIP] Performance Improvement by reducing WAL for Update Operation

2012-08-22 Thread Amit Kapila
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Amit Kapila
Sent: Wednesday, August 22, 2012 8:34 AM
From: Jesper Krogh [mailto:jes...@krogh.cc] 
Sent: Wednesday, August 22, 2012 1:13 AM
On 21/08/12 16:57, Amit kapila wrote: 

>>Test results: 

>>>1. The pgbench test run for 10min. 
>>> 2. The test result is for modified pgbench (such that total row size is
1800 and updated columns are of length 300) tpc-b testcase. 
>>> The result and modified pgbench code is attached with mail. 

>>> 3. The performance improvement shown in the m/c I have tested is quite
good (more than 100% for sync commit = off).


>> I cannot comment on completeness or correctness of the code, but I do
think a relevant test would be 
>> to turn synchronous_commit on as default. 


>> Even though you aim at an improved performance, it would be nice to see
the reduction in WAL-size 
>>as an effect of this patch. 

> Yes, I shall take care of doing both the above tests and send the report.

The data for WAL reduction is as below:

The number of transactions processed is 16000, doing updates of only 250
bytes each on records of size 1800.

I made sure no full-page writes happen by making the checkpoint interval
and checkpoint_segments large enough.

Original code - 1.8G    Modified code - 1.1G    Diff - 63% reduction, in case
of fill factor 100.
Original code - 1.6G    Modified code - 1.1G    Diff - 45% reduction, in case
of fill factor 80.

I am still in the process of collecting the data for synchronous commit
mode on.

Please let me know what other kinds of data would be helpful to indicate the
benefits of this implementation.



With Regards,

Amit Kapila.

 



Re: [HACKERS] multi-master pgbench?

2012-08-22 Thread David Fetter
On Wed, Aug 22, 2012 at 10:13:43AM +0900, Tatsuo Ishii wrote:
> >> What does "propagation of the writes" mean?
> > 
> > I apologize for not being clear.  In a multi-master system, people
> > frequently wish to know how quickly a write operation has been
> > duplicated to the other nodes.  In some sense, those write
> > operations are incomplete until they have happened on all nodes,
> > even in the asynchronous case.
> 
> IMO, that kind of functionnality is beyond the scope of benchmark
> tools.

I was trying to come up with something that would distinguish pgbench
for multi-master from pgbench run on independent nodes.  Is there some
other distinction to draw?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] restartpoints stop generating on streaming replication slave

2012-08-22 Thread Fujii Masao
On Wed, Aug 22, 2012 at 5:52 AM, Mathieu Fenniak
 wrote:
> Hi all,
>
> I've been investigating an issue with our PostgreSQL 9.1.1 (Linux x86-64
> CentOS 5.8) database where restartpoints suddenly stop being generated on
> the slave after working correctly for a week or two.  The symptom of the
> problem is that the pg_xlog directory on the slave doesn't get cleaned up,
> and the log_checkpoints output (eg. restartpoint starting: time) stops
> appearing.
>
> I was able to extract a core dump of the bgwriter process while it was in
> BgWriterNap.  I inspected ckpt_start_time and last_checkpoint_time;
> ckpt_start_time was 1345578533 (... 19:48:53 GMT) and last_checkpoint_time
> was 1345578248 (... 19:44:08 GMT).  Based upon these values, I concluded
> that it's performing checkpoints but missing the "if (ckpt_performed)"
> condition (ie. CreateRestartPoint returns false); it's then setting
> last_checkpoint_time to now - 5 minutes + 15 seconds.
>
> There seems to be two causes of a false retval in CreateRestartPoint; the
> first is if !RecoveryInProgress(), and the second is if "the last checkpoint
> record we've replayed is already our last restartpoint".  The first
> condition doesn't seem likely; does anyone know how we might be hitting the
> second condition?  We have continuous traffic on the master server in the
> range of 1000 txn/s, and the slave seems to be completely up-to-date, so I
> don't understand how we could be hitting this condition.

To check whether you really hit either of the above two conditions, could you
set log_min_messages to DEBUG2 on the standby? If you hit either, you'll
get the log message like "skipping restartpoint..".

Could you execute pg_controldata on both master and standby, and check
whether their "Latest checkpoint location" are the same?

Regards,

-- 
Fujii Masao




Re: [HACKERS] Expressions without type

2012-08-22 Thread Tom Lane
Ashutosh Bapat  writes:
> I need to check type of expressions appearing in a Query tree and I am
> using exprType() for that. But for certain expressions their type is not
> defined like List, FromExpr, JoinExpr. Such expressions are acceptable in
> the code, but expressions which have a type need to obey certain criteria
> like they can not be composite type etc. exprType() throws error when it's
> passed expressions which do not have type and hence I can not use it
> directly. Is there a function, which would tell me whether a given
> expression can have a type or not?

It seems to me that this is the result of fuzzy thinking.  Neither
FromExpr nor JoinExpr can possibly occur in random places in a query
tree: they only appear in the join tree, all of which can be thought of
as returning SETOF RECORD if you want, but it doesn't seem very helpful
to apply exprType to the jointree.  Lists likewise don't occur in any
context where it seems helpful to associate a type with the node tree.
So I think you need to think a bit harder about what you need to
accomplish and which parts of a query you need to accomplish it on.
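
For instance, if the goal is just to complain about composite-typed
expressions wherever genuine expressions can appear, the usual pattern is a
walker along these lines (sketch only, not tested):

/* sketch: skip structural nodes, check the type of everything else */
#include "postgres.h"
#include "nodes/nodeFuncs.h"
#include "utils/lsyscache.h"

static bool
check_type_walker(Node *node, void *context)
{
	if (node == NULL)
		return false;

	/*
	 * Lists, jointree items, target entries and the like are structural;
	 * they carry no type of their own, so just recurse into them.
	 */
	if (IsA(node, List) || IsA(node, FromExpr) || IsA(node, JoinExpr) ||
		IsA(node, TargetEntry) || IsA(node, RangeTblRef))
		return expression_tree_walker(node, check_type_walker, context);

	/* Everything else is a genuine expression node and has a type. */
	if (type_is_rowtype(exprType(node)))
		elog(ERROR, "composite-typed expressions are not supported here");

	return expression_tree_walker(node, check_type_walker, context);
}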

regards, tom lane




Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-22 Thread Tom Lane
Amit kapila  writes:
>> Can't we test the same condition that postgres.exe itself would test?

> To implement it the postgres.exe way, we have the following options:

> 1. Duplicate pgwin32_is_admin and the related function in pg_ctl,
>    as currently it is not exposed.
> 2. Make it visible to pg_ctl, but for that pg_ctl needs to link with the
>    postgres lib.
> 3. Move the functions to some common place, maybe src/port.
> 4. Any other better way?

> Currently I have implemented the patch with Approach 1, but I believe
> Approach 3 would have been better.

After poking around a bit I realized that you'd copied the
environment-variable hack from initdb.c, which has got basically the
same problem of needing to drop admin privileges.  I think it is just
as ugly and dangerous there as here.  So I would be in favor of approach
#3 and merging initdb's copy of the code too.  In fact, given that
GetCommandLine() appears to be OS-provided, it seems to me that *all*
of the functionality needed could be wrapped up in a utility subroutine
with the semantics of "re-exec myself in a restricted process if
needed".

On the other hand, that's kind of a big chunk of work to take on at the
last minute for what is admittedly a rather hypothetical risk.  Maybe
it'd be best to just duplicate initdb's code into pg_ctl for the moment
and plan on cleaning it up later when there's more time.

However, I really can't take responsibility for any of this since
I don't have a Windows development environment.  One of the Windows-
hacking committers needs to pick this issue up.  Anyone?

regards, tom lane




Re: [HACKERS] 9.2RC1 wraps this Thursday ...

2012-08-22 Thread Amit kapila
From: Tom Lane [t...@sss.pgh.pa.us]
Sent: Tuesday, August 21, 2012 10:31 PM
Amit Kapila  writes:
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
>> * pg_ctl crashes on Win32 when neither PGDATA nor -D specified



>>> isn't there a way to actually test if we're in a restricted process?

>> Do you mean to say that it should check whether pg_ctl runs as an
>> administrative user and then do the re-fork in restricted mode?

> Something like that.  The proposed patch depends on there not being a
> conflicting environment variable, which seems rather fragile to me.
> Can't we test the same condition that postgres.exe itself would test?

   To implement it the way postgres.exe does, we have the following options:
   1. Duplicate pgwin32_is_admin and the related function into pg_ctl,
      as currently they are not exposed.
   2. Make them visible to pg_ctl, but for that it needs to link with the
      postgres lib.
   3. Move the functions to some common place, maybe src/port.
   4. Any other better way?

Currently I have implemented the patch with Approach-1, but I believe
Approach-3 would have been better. However, I was not sure which is the
best place to move the functions to, so I have implemented it with
Approach-1.

Please let me know if the attached patch is acceptable. I shall wait until
tonight for your confirmation and shall let you know before I leave my
workplace; otherwise I shall complete it tomorrow morning, but I am not sure
whether that much delay is acceptable.

With Regards,
Amit Kapila.
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index af8d8b2..ff5d373 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -258,6 +258,121 @@ xstrdup(const char *s)
 }
 
 /*
+ * Call GetTokenInformation() on a token and return a dynamically sized
+ * buffer with the information in it. This buffer must be free():d by
+ * the calling function!
+ */
+static BOOL
+pgwin32_get_dynamic_tokeninfo(HANDLE token, TOKEN_INFORMATION_CLASS class,
+                              char **InfoBuffer, char *errbuf, int errsize)
+{
+    DWORD       InfoBufferSize;
+
+    if (GetTokenInformation(token, class, NULL, 0, &InfoBufferSize))
+    {
+        snprintf(errbuf, errsize, "could not get token information: got zero size\n");
+        return FALSE;
+    }
+
+    if (GetLastError() != ERROR_INSUFFICIENT_BUFFER)
+    {
+        snprintf(errbuf, errsize, "could not get token information: error code %lu\n",
+                 GetLastError());
+        return FALSE;
+    }
+
+    *InfoBuffer = malloc(InfoBufferSize);
+    if (*InfoBuffer == NULL)
+    {
+        snprintf(errbuf, errsize, "could not allocate %d bytes for token information\n",
+                 (int) InfoBufferSize);
+        return FALSE;
+    }
+
+    if (!GetTokenInformation(token, class, *InfoBuffer,
+                             InfoBufferSize, &InfoBufferSize))
+    {
+        snprintf(errbuf, errsize, "could not get token information: error code %lu\n",
+                 GetLastError());
+        return FALSE;
+    }
+
+    return TRUE;
+}
+
+
+/*
+ * Returns nonzero if the current user has administrative privileges,
+ * or zero if not.
+ */
+static int
+pgwin32_is_admin(void)
+{
+    HANDLE      AccessToken;
+    char       *InfoBuffer = NULL;
+    char        errbuf[256];
+    PTOKEN_GROUPS Groups;
+    PSID        AdministratorsSid;
+    PSID        PowerUsersSid;
+    SID_IDENTIFIER_AUTHORITY NtAuthority = {SECURITY_NT_AUTHORITY};
+    UINT        x;
+    BOOL        success;
+
+    if (!OpenProcessToken(GetCurrentProcess(), TOKEN_READ, &AccessToken))
+    {
+        write_stderr("could not open process token: error code %lu\n",
+                     GetLastError());
+        exit(1);
+    }
+
+    if (!pgwin32_get_dynamic_tokeninfo(AccessToken, TokenGroups,
+                                       &InfoBuffer, errbuf, sizeof(errbuf)))
+    {
+        write_stderr("%s", errbuf);
+        exit(1);
+    }
+
+    Groups = (PTOKEN_GROUPS) InfoBuffer;
+
+    CloseHandle(AccessToken);
+
+    if (!AllocateAndInitializeSid(&NtAuthority, 2,
+                                  SECURITY_BUILTIN_DOMAIN_RID, DOMAIN_ALIAS_RID_ADMINS, 0, 0, 0, 0, 0,
+                                  0, &AdministratorsSid))
+    {
+        write_stderr("could not get SID for Administrators group: error code %lu\n",
+                     GetLastError());
+        exit(1);
+    }
+
+    if (!AllocateAndInitializeSid(&NtAuthority, 2,
+                                  SECURITY_BUILTIN_DOMAIN_RID, DOMAIN_ALIAS_RID_POWER_USERS, 0, 0, 0, 0, 0,
+                                  

[HACKERS] Expressions without type

2012-08-22 Thread Ashutosh Bapat
Hi All,
I need to check the type of expressions appearing in a Query tree, and I
am using exprType() for that. But for certain expressions, such as List,
FromExpr, and JoinExpr, no type is defined. Such expressions are
acceptable in the code, but expressions which do have a type need to obey
certain criteria, e.g. they cannot be of composite type. exprType() throws
an error when it is passed an expression that has no type, so I cannot use
it directly. Is there a function which would tell me whether a given
expression can have a type or not?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company


Re: [HACKERS] 64-bit API for large object

2012-08-22 Thread Peter Eisentraut
On Wed, 2012-08-22 at 01:14 -0400, Tom Lane wrote:
> Peter Eisentraut  writes:
> > On Wed, 2012-08-22 at 07:27 +0900, Tatsuo Ishii wrote:
> >> I found this in the TODO list:
> >> Add API for 64-bit large object access 
> >> If this is a still valid TODO item and nobody is working on this, I
> >> would like to work in this.
> 
> > Large objects are limited to 2 GB in size, so a 64-bit API doesn't sound
> > very useful to me at the moment.
> 
> Not entirely.  pg_largeobject.pageno is int32, but that's still 2G pages
> not bytes, so there's three or so orders of magnitude that could be
> gotten by expanding the client-side API before we'd have to change the
> server's on-disk representation.

Well then a 64-bit API would be very useful.  Go for it.  :-)
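
For the record, a rough back-of-the-envelope, assuming the default
LOBLKSIZE of BLCKSZ/4 = 2048 bytes (both are compile-time settings):

#include <stdio.h>

int
main(void)
{
    long long   loblksize = 8192 / 4;       /* bytes stored per pg_largeobject page */
    long long   max_pages = 2147483647LL;   /* pageno is a signed int32 */

    /* prints 4398046509056, i.e. ~4 TB -- roughly 2000x the 2 GB client-side limit */
    printf("max large object size = %lld bytes\n", loblksize * max_pages);
    return 0;
}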




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.2beta4 (& git HEAD) server crash on creating extension plpython3u

2012-08-22 Thread Sachin Srivastava
Yes, it worked for me too.
So will this be the workaround, or do we intend to use something like
Py_SetPythonHome() before calling Py_Initialize()?
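
Something along these lines is what I had in mind -- a minimal sketch only,
assuming ActivePython lives under /opt/ActivePython-3.2 (the function name
here is made up):

#include <Python.h>

/*
 * Python 3 wants a wchar_t* here, and the string must stay valid for the
 * life of the interpreter, hence the static buffer.
 */
static wchar_t python_home[] = L"/opt/ActivePython-3.2";

static void
init_embedded_python(void)
{
    Py_SetPythonHome(python_home);  /* must happen before Py_Initialize() */
    Py_Initialize();
}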

On Wed, Aug 22, 2012 at 3:32 PM, Jan Urbański  wrote:

> On 21/08/12 20:13, Josh Berkus wrote:
>
>>
>>  No. I get the same backtrace when I try against the 9.1.5 (REL9_1_STABLE)
>>> branch.
>>>
>>
> I have reproduced this on Linux; it seems the fix is to run the
> postmaster with this env variable exported:
>
> PYTHONHOME=/opt/ActivePython-3.2/
>
> (or wherever you installed ActivePython).
>
> To give credit, I found the decisive clue here:
> http://manojadinesh.blogspot.com/2012/06/fatal-python-error-pyinitialize-unable.html
>
> Cheers,
> Jan
>



-- 
Regards,
Sachin Srivastava
EnterpriseDB, India


Re: [HACKERS] PostgreSQL 9.2beta4 (& git HEAD) server crash on creating extension plpython3u

2012-08-22 Thread Jan Urbański

On 21/08/12 20:13, Josh Berkus wrote:



No. I get the same backtrace when I try against the 9.1.5 (REL9_1_STABLE)
branch.


I have reproduced this on Linux; it seems the fix is to run the
postmaster with this env variable exported:


PYTHONHOME=/opt/ActivePython-3.2/

(or wherever you installed ActivePython).

To give credit, I found the decisive clue here: 
http://manojadinesh.blogspot.com/2012/06/fatal-python-error-pyinitialize-unable.html


Cheers,
Jan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: psql boolean display

2012-08-22 Thread Pavel Stehule
2012/8/21 Tom Lane :
> "Kevin Grittner"  writes:
>> The type itself does output true/false; it's just psql that uses
>> t/f.
>
> No, 't'/'f' is what boolout() returns.  The 'true'/'false' results from
> casting bool to text are intentionally different --- IIRC, Peter E.
> argued successfully that this cast behavior is required by SQL spec.
> But we'd already been returning 't'/'f' to applications for far too many
> years to change it.  (And that argument has not gotten any weaker since
> then.  Keep in mind that Postgres was returning 't'/'f' for bool years
> before the SQL spec even had a boolean type.)
>
> If we're going to do something like this at all, I agree that psql is
> the place to do it, not the server.  But my beef with this patch is that
> it's thinking too small --- why would bool be the only type that
> somebody would want to editorialize on the display of?  I'd rather see
> some general "substitute this for that in display of columns of type X"
> feature.
>

Can you explain your idea, please? I can't imagine any general solution
for types other than "enum"-like types (without significantly enhancing
psql's scripting possibilities).

Regards

Pavel

> regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Audit Logs WAS: temporal support patch

2012-08-22 Thread Pavel Stehule
2012/8/22 Vlad Arkhipov :
> On 08/22/2012 08:34 AM, Gavin Flower wrote:
>
> About 10 years ago, I implemented some temporal features in a database to
> cope with insurance quotes that had to be valid for a specified number of
> days in the future that was invariant with respect to future changes in
> premiums with effective dates within the period of validity of the quote. If
> anyone is interested, I'll see if I can find my notes and write it up (but
> in a different thread!).
>
> Cheers,
> Gavin
>
> What you mean is not an audit log, it's business time. At the beginning of
> this thread Pavel Stehule gave a link to a description of the SQL2011 design
> of this feature. Audit logs are more related to system time. For example,
> IBM DB2 uses the following syntax for system time (which is mostly
> SQL2011-conformant).
>
> CREATE TABLE policy (
>   id INT primary key not null,
>   vin VARCHAR(10),
>   annual_mileage INT,
>   rental_car CHAR(1),
>   coverage_amt INT,
>
>   sys_start TIMESTAMP(12) GENERATED ALWAYS AS ROW BEGIN NOT NULL,
>   sys_end TIMESTAMP(12) GENERATED ALWAYS AS ROW END NOT NULL,
>   trans_start TIMESTAMP(12) GENERATED ALWAYS AS TRANSACTION START ID
> IMPLICITLY HIDDEN,
>
>   PERIOD SYSTEM_TIME (sys_start, sys_end)
> );
>
> CREATE TABLE policy_history LIKE policy;
>
> ALTER TABLE policy ADD VERSIONING USE HISTORY TABLE policy_history;
>
> And the following syntax for querying for historical data.
>
> SELECT coverage_amt
> FROM policy FOR SYSTEM_TIME AS OF '2010-12-01'
> WHERE id = ;
>
> SELECT count(*)
> FROM policy FOR SYSTEM_TIME FROM '2011-11-30' TO '-12-30'
> WHERE vin = 'A';

I like this design - it is simple without other objects

Regards

Pavel


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Audit Logs WAS: temporal support patch

2012-08-22 Thread Vlad Arkhipov

On 08/22/2012 08:34 AM, Gavin Flower wrote:
About 10 years ago, I implemented some temporal features in a database 
to cope with insurance quotes that had to be valid for a specified 
number of days in the future that was invariant with respect to future 
changes in premiums with effective dates within the period of validity 
of the quote. If anyone is interested, I'll see if I can find my notes 
and write it up (but in a different thread!).


Cheers,
Gavin

What you mean is not an audit log, it's business time. At the beginning of
this thread Pavel Stehule gave a link to a description of the SQL2011 design
of this feature. Audit logs are more related to system time. For example,
IBM DB2 uses the following syntax for system time (which is mostly
SQL2011-conformant).


CREATE TABLE policy (
  id INT primary key not null,
  vin VARCHAR(10),
  annual_mileage INT,
  rental_car CHAR(1),
  coverage_amt INT,

  sys_start TIMESTAMP(12) GENERATED ALWAYS AS ROW BEGIN NOT NULL,
  sys_end TIMESTAMP(12) GENERATED ALWAYS AS ROW END NOT NULL,
  trans_start TIMESTAMP(12) GENERATED ALWAYS AS TRANSACTION START ID 
IMPLICITLY HIDDEN,


  PERIOD SYSTEM_TIME (sys_start, sys_end)
);

CREATE TABLE policy_history LIKE policy;

ALTER TABLE policy ADD VERSIONING USE HISTORY TABLE policy_history;

And the following syntax for querying for historical data.

SELECT coverage_amt
FROM policy FOR SYSTEM_TIME AS OF '2010-12-01'
WHERE id = ;

SELECT count(*)
FROM policy FOR SYSTEM_TIME FROM '2011-11-30' TO '-12-30'
WHERE vin = 'A';