Re: [HACKERS] Reporting WAL file containing checkpoint's REDO record in pg_controldata's result

2012-03-25 Thread Magnus Hagander
On Fri, Mar 23, 2012 at 18:05, Fujii Masao  wrote:
> On Sat, Mar 24, 2012 at 1:49 AM, Robert Haas  wrote:
>> On Fri, Mar 23, 2012 at 12:42 PM, Fujii Masao  wrote:
>>> On Fri, Mar 23, 2012 at 9:41 PM, Robert Haas  wrote:
 On Fri, Mar 23, 2012 at 6:13 AM, Fujii Masao  wrote:
> On Fri, Mar 23, 2012 at 5:56 PM, Magnus Hagander  
> wrote:
>> Might it be a good idea to put it on its own row instead of changing
>> the format of an existing row, in order not to break scripts and
>> programs that are parsing the previous output?
>
> Good idea! What row name should we use for the WAL file containing
> REDO record? "Latest checkpoint's REDO file"?

 Sounds good to me.  I like the idea, too.  The status quo is an
 unnecessary nuisance, so this will be a nice usability improvement.
>>>
>>> Attached patch adds a new row "Latest checkpoint's REDO WAL segment:" to
>>> the result of pg_controldata. I used the term "WAL segment" for the row name
>>> instead of "file" because "WAL segment" is already used in another row,
>>> "Bytes per WAL segment:". But is there a better name?
>>
>> s/segment/file/g?
>
> Yep, "file" might be more intuitive for a user than "segment". Attached is the
> "file" version of the patch.

We're already using "file" to mean something different *internally*,
aren't we? And since pg_controldata shows fairly internal information,
I'm not sure this is the best idea.

Maybe compromise and call it "segment file" - that is both easier to
understand than "segment", and avoids using a term that already means
something else...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-25 Thread Andres Freund
On Friday, March 16, 2012 10:40:46 AM Dimitri Fontaine wrote:
> > This will have the effect of calling triggers outside of alphabetic
> > order. I don't think that's a good idea even if one part is ANY and the
> > other a specific command.
> > I don't think there is any reason anymore to separate the two? The only
> > callsite seems to look like:
> The idea is to have a predictable ordering of command triggers. The code
> changed in the patch v16 (you pasted code from git in between v15 and
> v16, I cleaned it up) and is now easier to read:
> 
> case CMD_TRIGGER_FIRED_BEFORE:
> whenstr = "BEFORE";
> procs[0] = cmd->before_any;
> procs[1] = cmd->before;
> break;
> 
> case CMD_TRIGGER_FIRED_AFTER:
> whenstr = "AFTER";
> procs[0] = cmd->after;
> procs[1] = cmd->after_any;
> break;
> 
> So it's BEFORE ANY, then BEFORE command, then AFTER command, then AFTER
> ANY. That's an arbitrary choice I made and we can easily reconsider.
> Triggers are called in alphabetical order within each “slot” here.
> 
> In my mind it makes sense to have ANY triggers around the specific
> triggers, but it's hard to explain why that feels better.
I still think this would be a mistake. I don't have a hard time imagining 
use cases where a specific trigger should be called before or after an ANY 
trigger, because e.g. it wants to return a more specific error, or doesn't 
want to repeat precondition checks already done by the ANY trigger... All 
that would be precluded by enforcing a strict ordering between ANY and 
specific triggers. I don't see a use case that would benefit from the 
current behaviour...

Andres
-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] initdb and fsync

2012-03-25 Thread Jeff Davis
On Sat, 2012-03-17 at 17:48 +0100, Cédric Villemain wrote:
> I agree with Andres.
> 
> 
> I believe we should use sync_file_range (_before?) with linux.
> 
> And we can use posix_fadvise_dontneed on other kernels.
> 
OK, updated patch attached. sync_file_range() is preferred,
posix_fadvise() is a fallback.
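
For readers skimming the thread, the shape of the approach is roughly this
(a hedged sketch, not the attached patch; the preprocessor tests below are
assumptions, not the patch's configure machinery):

/*
 * Hint the kernel to start writeback on each file as initdb walks the
 * data directory, so the final fsync() pass finds little left to do.
 */
#define _GNU_SOURCE				/* sync_file_range() is Linux-specific */
#include <sys/types.h>
#include <fcntl.h>

static void
pre_sync_file(int fd, off_t nbytes)
{
#if defined(__linux__)
	/* start asynchronous writeback of the whole file, without waiting */
	sync_file_range(fd, 0, nbytes, SYNC_FILE_RANGE_WRITE);
#elif defined(POSIX_FADV_DONTNEED)
	/* portable fallback: evicting the pages also pushes them to disk */
	posix_fadvise(fd, 0, nbytes, POSIX_FADV_DONTNEED);
#endif
	/* a later pass fsync()s every file to make the data durable */
}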

Regards,
Jeff Davis


initdb-fsync-20120325.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for parallel pg_dump

2012-03-25 Thread Joachim Wieland
On Fri, Mar 23, 2012 at 11:11 AM, Alvaro Herrera
 wrote:
> Are you going to provide a rebased version?

Rebased version attached, this patch also includes Robert's earlier suggestions.


parallel_pg_dump_5.diff.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY / extend ExclusiveLock

2012-03-25 Thread Tom Lane
Jim Nasby  writes:
> On 3/22/12 2:13 PM, Stephen Frost wrote:
>> Has anyone been working on or considering how to improve the logic
>> around doing extends on relations to perhaps make larger extensions
>> for larger tables?  Or make larger extensions when tables are growing
>> very quickly?

> I know that there's been discussion around this. Way back in the day we 
> extended relations one page at a time. I don't remember if that was changed 
> or not.

No, it's still on the to-do list.  One issue is that you probably don't
want individual insertion queries taking big response-time hits to do
large extensions themselves --- so somehow the bulk of the work needs to
be pushed to some background process.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY / extend ExclusiveLock

2012-03-25 Thread Jim Nasby

On 3/22/12 2:13 PM, Stephen Frost wrote:

Greetings,

   I've recently become a bit annoyed and frustrated looking at this in
   top:

23296 postgres  20   0 3341m 304m 299m S   12  0.9   1:50.02 postgres: sfrost gis [local] COPY waiting
24362 postgres  20   0 3353m 298m 285m D   12  0.9   1:24.99 postgres: sfrost gis [local] COPY
24429 postgres  20   0 3340m 251m 247m S   11  0.8   1:13.79 postgres: sfrost gis [local] COPY waiting
24138 postgres  20   0 3341m 249m 244m S   10  0.8   1:28.09 postgres: sfrost gis [local] COPY waiting
24153 postgres  20   0 3340m 246m 241m S   10  0.8   1:24.44 postgres: sfrost gis [local] COPY waiting
24166 postgres  20   0 3341m 318m 313m S   10  1.0   1:40.52 postgres: sfrost gis [local] COPY waiting
24271 postgres  20   0 3340m 288m 283m S   10  0.9   1:34.12 postgres: sfrost gis [local] COPY waiting
24528 postgres  20   0 3341m 290m 285m S   10  0.9   1:21.23 postgres: sfrost gis [local] COPY waiting
24540 postgres  20   0 3340m 241m 236m S   10  0.7   1:15.91 postgres: sfrost gis [local] COPY waiting

   Has anyone been working on or considering how to improve the logic
   around doing extends on relations to perhaps make larger extensions
   for larger tables?  Or make larger extensions when tables are growing
   very quickly?

   I haven't looked at the code, but I'm guessing we extend relations
   when they're full (that part makes sense..), but we extend them an
   itty-bitty bit at a time, which very quickly ends up being not fast
   enough for the processes that want to get data into the table.

   My gut feeling is that we could very easily and quickly improve this
   situation by having a way to make larger extensions, and then using
   that method when we detect that a table is growing very quickly.


I know that there's been discussion around this. Way back in the day we 
extended relations one page at a time. I don't remember if that was changed or 
not.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] checkpoint patches

2012-03-25 Thread Jim Nasby

On 3/23/12 7:38 AM, Robert Haas wrote:

And here are the latency results for 95th-100th percentile with
checkpoint_timeout=16min.

ckpt.master.13: 1703, 1830, 2166, 17953, 192434, 43946669
ckpt.master.14: 1728, 1858, 2169, 15596, 187943, 9619191
ckpt.master.15: 1700, 1835, 2189, 22181, 206445, 8212125

The picture looks similar here.  Increasing checkpoint_timeout isn't
*quite*  as good as spreading out the fsyncs, but it's pretty darn
close.  For example, looking at the median of the three 98th
percentile numbers for each configuration, the patch bought us a 28%
improvement in 98th percentile latency.  But increasing
checkpoint_timeout by a minute bought us a 15% improvement in 98th
percentile latency.  So it's still not clear to me that the patch is
doing anything on this test that you couldn't get just by increasing
checkpoint_timeout by a few more minutes.  Granted, it lets you keep
your inter-checkpoint interval slightly smaller, but that's not that
exciting.  That having been said, I don't have a whole lot of trouble
believing that there are other cases where this is more worthwhile.


I wouldn't be too quick to dismiss increasing checkpoint frequency (i.e., 
decreasing checkpoint_timeout).

On a high-value production system you're going to care quite a bit about 
recovery time. I certainly wouldn't want to run our systems with 
checkpoint_timeout='15 min' if I could avoid it.

Another $0.02: I don't recall the community using pgbench much at all to 
measure latency... I believe it's something fairly new. I point this out 
because I believe there are differences in the analysis you need to do for TPS 
vs. latency. I think Robert's graphs support my argument; the numeric 
X-percentile data might not look terribly good, but reducing peak latency from 
100ms to 60ms could be a really big deal on a lot of systems. My intuition is 
that one or both of these patches would actually be valuable in the real world; 
it would be a shame to throw them out because we're not sure how to performance 
test them...
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] who's familiar with the GSOC application process

2012-03-25 Thread Josh Berkus
On 3/23/12 1:58 PM, Dave Cramer wrote:
> Someone has approached the JDBC list to do some work on the driver as
> a GSOC project.
> 
> I need to know what is involved in mentoring and how to get the
> project approved

I've approved you as a mentor.  I suggest also joining the
pgsql-students mailing list.

You should get the student to submit an application on Monday.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] how can i see the log..?

2012-03-25 Thread Kevin Grittner
lekon  wrote:
 
> I want to see the log of INSERT, SELECT, DELETE... statements within
> PostgreSQL v9.1.3
 
The pgsql-hackers list is for discussions about development of the
PostgreSQL database system; discussions about how to use it should
go to another list.  This particular question would be appropriate for
pgsql-general or pgsql-admin.  See this page for descriptions of the
various lists, to help decide where to post future questions:
 
http://www.postgresql.org/community/lists/
 
Regarding the question itself, see the log_statement configuration
setting:
 
http://www.postgresql.org/docs/9.1/interactive/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT
 
-Kevin 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-25 Thread Andres Freund
On Friday, March 23, 2012 04:32:02 PM Dimitri Fontaine wrote:
> I would like to get back on code level review now if at all possible,
> and I would integrate your suggestions here into the next patch revision
> if another one is needed.
Ok, I will give it another go.

Btw I just wanted to alert you to being careful when checking in the expected 
files ;)

 NOTICE:  snitch: BEFORE any DROP TRIGGER
-ERROR:  unexpected name list length (3)
+NOTICE:  snitch: BEFORE DROP TRIGGER  foo_trigger
+NOTICE:  snitch: AFTER any DROP TRIGGER
 create conversion test for 'utf8' to 'sjis' from utf8_to_sjis;

you had an apparently unnoticed error in there ;)


1.
if (!HeapTupleIsValid(tup))
	ereport(ERROR,
			(errcode(ERRCODE_UNDEFINED_OBJECT),
			 errmsg("command trigger \"%s\" does not exist, skipping",
					trigname)));
The "skipping" part looks like a copy/pasto...

2.
In PLy_exec_command_trigger you're doing a PG_TRY() which looks pointless in 
the current incarnation. Did you intend to add something in the catch?
I think without doing a decref of pltdata in both the success and the failure 
paths you're leaking memory.

3.
In plpython: Why do you pass objectId/pltobjectname/... as "NULL" instead of 
None? Using a string for it seems like a bad form of in-band signalling to me.

4. 
Not sure whether InitCommandContext is the best place to suppress command 
trigger usage for some commands. That seems rather unintuitive to me. But 
perhaps the ease of implementation is a big enough win...

That's everything new I found... Not bad, I think. After this, somebody else 
should take a look at it, I think (committer or not).

> The only point yet to address from last round from Andres is about the
> API around CommandFiresTrigger() and the Memory Context we use here.
> We're missing an explicit Reset call, and to be able to have one we need
> to have a more complex API, because of the way RemoveObjects() and
> RemoveRelations() work.
> 
> We would need to add no-reset APIs and an entry point to manually reset
> the memory context, which currently gets disposed at the same time as
> its parent context, the current one that's been setup before entering
> standard_ProcessUtility().
Not sure if you're expecting further input from me about that?

Greetings,

Andres
-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] occasional startup failures

2012-03-25 Thread Magnus Hagander
On Sun, Mar 25, 2012 at 18:59, Tom Lane  wrote:
> Andrew Dunstan  writes:
>> Every so often buildfarm animals (nightjar and raven recently, for
>> example) report failures on starting up the postmaster. It appears that
>> these failures are due to the postmaster not creating the pid file
>> within 5 seconds, and so the logic in commit
>> 0bae3bc9be4a025df089f0a0c2f547fa538a97bc kicks in. Unfortunately, when
>> this happens the postmaster has in fact sometimes started up, and the
>> end result is that subsequent buildfarm runs will fail when they detect
>> that there is already a postmaster listening on the port, and without
>> manual intervention to kill the "rogue" postmaster this continues endlessly.
>
>> I can probably add some logic to the buildfarm script to try to detect
>> this condition and kill an errant postmaster so subsequent runs don't
>> get affected, but that seems to be avoiding a problem rather than fixing
>> it. I'm not sure what we can do to improve it otherwise, though.
>
> Yeah, this has been discussed before.  IMO the only real fix is to
> arrange things so that the postmaster process is an immediate child of
> pg_ctl, allowing pg_ctl to know its PID directly and not have to rely
> on the pidfile appearing before it can detect whether the postmaster
> is still alive.  Then there is no need for a guesstimated timeout.
> That means not using system() anymore, but rather fork/exec, which
> mainly implies having to write our own code for stdio redirection.
> So that's certainly doable if a bit tedious.  I have no idea about
> the Windows side of it though.

We already do something like this on Win32 - at least one reason being
dealing with restricted tokens. Right now we just close the handles to
the child, but we could easily keep those around for doing this type
of detection.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] occasional startup failures

2012-03-25 Thread Tom Lane
Andrew Dunstan  writes:
> Every so often buildfarm animals (nightjar and raven recently, for 
> example) report failures on starting up the postmaster. It appears that 
> these failures are due to the postmaster not creating the pid file 
> within 5 seconds, and so the logic in commit 
> 0bae3bc9be4a025df089f0a0c2f547fa538a97bc kicks in. Unfortunately, when 
> this happens the postmaster has in fact sometimes started up, and the 
> end result is that subsequent buildfarm runs will fail when they detect 
> that there is already a postmaster listening on the port, and without 
> manual intervention to kill the "rogue" postmaster this continues endlessly.

> I can probably add some logic to the buildfarm script to try to detect 
> this condition and kill an errant postmaster so subsequent runs don't 
> get affected, but that seems to be avoiding a problem rather than fixing 
> it. I'm not sure what we can do to improve it otherwise, though.

Yeah, this has been discussed before.  IMO the only real fix is to
arrange things so that the postmaster process is an immediate child of
pg_ctl, allowing pg_ctl to know its PID directly and not have to rely
on the pidfile appearing before it can detect whether the postmaster
is still alive.  Then there is no need for a guesstimated timeout.
That means not using system() anymore, but rather fork/exec, which
mainly implies having to write our own code for stdio redirection.
So that's certainly doable if a bit tedious.  I have no idea about
the Windows side of it though.
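
In code, the idea looks roughly like this (a sketch under assumptions; this
is not pg_ctl's actual code, and error handling is omitted):

/*
 * Start the postmaster as a direct child so its PID is known immediately,
 * doing by hand the stdio redirection that system() got from the shell.
 */
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

static pid_t
start_postmaster_direct(const char *path, char *const argv[],
						const char *logfile)
{
	pid_t		pid = fork();

	if (pid == 0)				/* child: becomes the postmaster */
	{
		int			fd = open(logfile, O_WRONLY | O_CREAT | O_APPEND, 0600);

		if (fd >= 0)
		{
			dup2(fd, STDOUT_FILENO);	/* hand-rolled replacement for */
			dup2(fd, STDERR_FILENO);	/* the shell's ">> logfile 2>&1" */
			close(fd);
		}
		execv(path, argv);
		_exit(1);				/* exec failed */
	}
	/* parent: pid is the postmaster itself; kill(pid, 0) tests liveness */
	return pid;
}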

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] query cache

2012-03-25 Thread Billy Earney
Thanks.  I'll keep those issues in mind.

On Sat, Mar 24, 2012 at 6:18 PM, Tatsuo Ishii  wrote:

> >> Well, you'd have to start by demonstrating the benefit of it.  The
> >> advantage of query caches in proxies and clients is well-known, because
> >> you can offload some of the work of the database onto other servers,
> >> thus increasing capacity.  Adding a query cache to the database server
> >> would require the "query identity recognition" of the cache to be far
> >> cheaper (as in 10X cheaper) than planning and running the query, which
> >> seems unlikely at best.
> >
> > I figured I'd create the md5 digest of the sourceText of a query, and
> > then look that up in a hash.  I don't think that will be very expensive.
> > I'll have another hash to keep track of which queries are dependent on
> > which relations, so that when a relation is changed somehow (and
> > committed), the query is then invalidated and removed from the query hash.
>
> From the experience of implementing query cache in pgpool-II there are
> some suggestions:
>
> - A query result cache should not be created if the transaction
>  including the SELECT is not committed.
>
> - Since a transaction could have many SELECTs, you need to keep those
>  query results somewhere in a temporary storage. You could either
>  discard or register them to the query cache storage depending on the
>  transaction's fate, either aborted or committed.
>
> - If a SELECT has non-immutable functions, then the query result
>  should not be cached.
>
> - If a SELECT uses temporary tables, then the query result should not
>  be cached.
>
> - If a SELECT uses unlogged tables, then the query result should not
>  be cached, because their data could vanish after crash recovery. Of
>  course this only applies if you plan to use cache storage which
>  does not survive a crash.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>
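
For reference, the md5-keyed lookup Billy describes would look roughly like
this (a sketch only; the cache_lookup() API and CachedResult type are
hypothetical, not PostgreSQL ones):

/*
 * Key the result cache on an MD5 digest of the query's source text.
 * Any MD5 implementation would do; OpenSSL's is used here for brevity.
 */
#include <string.h>
#include <openssl/md5.h>

typedef struct CachedResult CachedResult;	/* stored rows, not shown */
extern CachedResult *cache_lookup(const unsigned char key[MD5_DIGEST_LENGTH]);

static CachedResult *
lookup_query_result(const char *source_text)
{
	unsigned char key[MD5_DIGEST_LENGTH];

	/* digest the query text to get a fixed-size cache key */
	MD5((const unsigned char *) source_text, strlen(source_text), key);

	/* NULL means not cached; Tatsuo's rules above decide cacheability */
	return cache_lookup(key);
}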


[HACKERS] occasional startup failures

2012-03-25 Thread Andrew Dunstan


Every so often buildfarm animals (nightjar and raven recently, for 
example) report failures on starting up the postmaster. It appears that 
these failures are due to the postmaster not creating the pid file 
within 5 seconds, and so the logic in commit 
0bae3bc9be4a025df089f0a0c2f547fa538a97bc kicks in. Unfortunately, when 
this happens the postmaster has in fact sometimes started up, and the 
end result is that subsequent buildfarm runs will fail when they detect 
that there is already a postmaster listening on the port, and without 
manual intervention to kill the "rogue" postmaster this continues endlessly.


I can probably add some logic to the buildfarm script to try to detect 
this condition and kill an errant postmaster so subsequent runs don't 
get affected, but that seems to be avoiding a problem rather than fixing 
it. I'm not sure what we can do to improve it otherwise, though.


Thoughts?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Never convert n_distinct < 2 values to a ratio when computing stats

2012-03-25 Thread Tom Lane
Robert Haas  writes:
> The bit about maybe not getting both t and f as MCVs on a Boolean does
> seem a little worrying, but I'm not sure whether it actually affects
> query planning in a materially negative way.  Can you demonstrate a
> case where it matters?

If we were trying to force that to happen it would be wrong anyway.
Consider a column that contains *only* "t", or at least has so few
"f"'s that "f" appears never or only once in the selected sample.
(IIRC there is a clamp that prevents selecting anything as an MCV
unless it appears at least twice in the sample.)

Like Robert, I'm not convinced whether or not this is a reasonable
change, but arguing for it on the basis of boolean columns doesn't
seem very sound.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Gsoc2012 Idea --- Social Network database schema

2012-03-25 Thread Robert Haas
On Sun, Mar 25, 2012 at 6:11 AM, Marc Mamin  wrote:
> Hello,
>
> Here is something we'd like to have:
>
> http://archives.postgresql.org/pgsql-hackers/2012-01/msg00650.php
>
> As we are quite busy and this issue hasn't a high priority, we haven't
> followed it until now :-(
>
> I'm only a Postgres user, not a hacker, so I don't have the knowledge to
> help on this nor to evaluate whether this might be a good GSoC project.
>
> Just an idea for the case you are looking for another topic.

Good idea.  If anyone wants to pursue it, I'd strongly suggest building
it as a contrib module rather than dedicated syntax, because I'm not
sure there'd be any consensus on adding syntax for it to core.

Actually, though, I wonder how much faster it would be than CREATE
TABLE AS?  Block-level copy should be faster than tuple-level copy,
but I'm not sure whether it would be a lot faster or only slightly
faster.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Never convert n_distinct < 2 values to a ratio when computing stats

2012-03-25 Thread Robert Haas
On Sat, Mar 24, 2012 at 12:17 AM, Dan McGee  wrote:
> This is a bit of a corner case in all honesty, but if you have a short
> table (under 20 rows), the 10% heuristic used to decide whether
> distinct values scale with the row count will result in rather odd
> values for stadistinct in pg_statistic, such as '-0.2' or '-0.67',
> rather than the expected '2'. Additionally, this can cause only one of
> {t, f} to appear in the most common values array.
>
> Does this actually affect query planning in any way? Probably not, but
> it is extremely odd to look at pg_stats for these columns, and the
> solution seems easy.

But the stats aren't there to be looked at, but rather to guide query
planning.  If at execution time there are 100 rows in the table,
should we still assume that there are only 2 distinct values in the
table, or that it's gone up to about 50 distinct values?  It's hard to
say, but there's no apparent reason to think that the number of
distinct values will scale up for a large table but not a small table.

The bit about maybe not getting both t and f as MCVs on a Boolean does
seem a little worrying, but I'm not sure whether it actually affects
query planning in a materially negative way.  Can you demonstrate a
case where it matters?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL optimisations on Linux machines with more than 24 cores

2012-03-25 Thread Robert Haas
On Sun, Mar 25, 2012 at 7:14 AM, Constantin Teodorescu
 wrote:
> "This paper analyzes the scalability of seven system applications (Exim,
> memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce) running on
> Linux on a 48-core computer. Except for gmake, all applications
> trigger scalability bottlenecks inside a recent Linux kernel. Using mostly
> standard parallel programming techniques— this paper introduces one new
> technique, sloppy counters—these bottlenecks can be removed from the
> kernel or avoided by changing the applications slightly. Modifying the
> kernel required in total 3002 lines of code changes. A speculative
> conclusion from this analysis is that there is no scalability reason to give
> up on traditional operating system organizations just yet."
>
> There are a couple of recommendations there for PostgreSQL, some small changes
> in the PostgreSQL source (ex. increasing the number of mutexes from 16 to 1024,
> row and table-level locking), others in the Linux kernel (a modified lseek),
> that seem to be effective for scaling beyond 24 cores.

Linux 3.2 includes lseek modifications to improve scalability.
PostgreSQL 9.2 will include significant lock manager changes; we did
not adopt their proposal of just bumping up the number of lock manager
locks, as that doesn't help in common cases, such as when lots of
concurrent backends are accessing the same table, and it has other
downsides, such as slowing down operations that must touch multiple
lock manager partitions.  Rather, we introduced a "fast path" for
common cases that bypasses the main lock manager altogether; thus,
traffic on the lock manager locks will be dramatically reduced in
PostgreSQL 9.2.
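
The fast path is easiest to see as a conceptual sketch (with made-up helper
names; lock_is_strong() and rel_has_strong_lock() are stand-ins, not the
actual lock.c code):

#include <stdbool.h>

typedef unsigned int Oid;
#define InvalidOid	((Oid) 0)
#define FP_LOCK_SLOTS 16		/* small per-backend slot array */

typedef struct FastPathSlot
{
	Oid			relid;			/* InvalidOid means the slot is free */
	int			lockmode;
} FastPathSlot;

static FastPathSlot fp_slots[FP_LOCK_SLOTS];

extern bool lock_is_strong(int lockmode);	/* hypothetical helper */
extern bool rel_has_strong_lock(Oid relid);	/* hypothetical helper */

static bool
try_fast_path_lock(Oid relid, int lockmode)
{
	if (lock_is_strong(lockmode) || rel_has_strong_lock(relid))
		return false;			/* must go through the main lock manager */

	for (int i = 0; i < FP_LOCK_SLOTS; i++)
	{
		if (fp_slots[i].relid == InvalidOid)
		{
			fp_slots[i].relid = relid;	/* record the lock locally only */
			fp_slots[i].lockmode = lockmode;
			return true;		/* no shared lock-manager traffic */
		}
	}
	return false;				/* slots exhausted: fall back */
}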

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PostgreSQL optimisations on Linux machines with more than 24 cores

2012-03-25 Thread Constantin Teodorescu
Hello all ... again !

SHORT INTRODUCTION
===
Remember me? A couple of years ago (maybe 10) I created the PgAccess
package for managing PostgreSQL!
Since then I have done many things, all of them related to PostgreSQL, of
course!
Our company manages the National Animal Tracking and Registering
Database in Romania, a pretty big database on a 24-core Intel Xeon machine
with 32 GB RAM.


THE PROBLEM

After many months of optimisation and fine tuning, getting the most out of
our machine, I just found an article regarding Linux scalability on many
cores, including an analysis of several different programs, among them
PostgreSQL: http://pdos.csail.mit.edu/papers/linux:osdi10.pdf

"This paper analyzes the scalability of seven system applications (Exim,
memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce) running on
Linux on a 48-core computer. Except for gmake, all applications
trigger scalability bottlenecks inside a recent Linux kernel. Using mostly
standard parallel programming techniques— this paper introduces one new
technique, sloppy counters—these bottlenecks can be removed from the
kernel or avoided by changing the applications slightly. Modifying the
kernel required in total 3002 lines of code changes. A speculative
conclusion from this analysis is that there is no scalability reason to
give up on traditional operating system organizations just yet."

There are a couple of recommendations there for PostgreSQL, some small changes
in the PostgreSQL source (ex. increasing the number of mutexes from 16 to 1024,
row and table-level locking), others in the Linux kernel (a modified lseek),
that seem to be effective for scaling beyond 24 cores.


THE QUESTION
=
The article says that the tests were done using PostgreSQL 8.3.9.
Does the new PostgreSQL 9.1.3 use some of the patches recommended in that
article that would enhance multi-core scalability?

Thank you very much for your attention,
Constantin Teodorescu


[HACKERS] [PATCH] Never convert n_distinct < 2 values to a ratio when computing stats

2012-03-25 Thread Dan McGee
This is a bit of a corner case in all honesty, but if you have a short
table (under 20 rows), the 10% heuristic used to decide whether
distinct values scale with the row count will result in rather odd
values for stadistinct in pg_statistic, such as '-0.2' or '-0.67',
rather than the expected '2'. Additionally, this can cause only one of
{t, f} to appear in the most common values array.

Does this actually affect query planning in any way? Probably not, but
it is extremely odd to look at pg_stats for these columns, and the
solution seems easy.
---
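
For readers unfamiliar with the heuristic being changed, it behaves roughly
like this (a paraphrase for illustration, not the analyze.c code verbatim):

/*
 * If the distinct count exceeds 10% of the row count, assume ndistinct
 * scales with table size and store it as a negative ratio.  With 3 rows
 * and 2 distinct values this yields -(2/3) = -0.67; with 10 rows, -0.2:
 * exactly the odd-looking values in the stats output below.
 */
static double
stadistinct_from_counts(double ndistinct, double totalrows)
{
	if (ndistinct > 0.1 * totalrows)
		return -(ndistinct / totalrows);	/* scales with table size */

	return ndistinct;			/* fixed number of distinct values */
}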

The only other minor changes included here were to make it clear when we were
comparing float values, so use 0.0 instead of 0.

Example stats output from the database I noticed this on:

archweb=# SELECT c.relname, a.attname, pg_stat_get_live_tuples(c.oid) AS n_live_tup,
    stadistinct, stanullfrac, stawidth, stavalues1, stanumbers1
  FROM pg_statistic s
  JOIN pg_class c ON c.oid = s.starelid
  JOIN pg_attribute a ON c.oid = a.attrelid AND a.attnum = s.staattnum
  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
  JOIN pg_type t ON t.oid = a.atttypid
  WHERE NOT a.attisdropped AND nspname = 'public' AND t.typname = 'bool'
  ORDER BY stadistinct, n_live_tup;
            relname            |    attname    | n_live_tup | stadistinct | stanullfrac | stawidth | stavalues1 |      stanumbers1
-------------------------------+---------------+------------+-------------+-------------+----------+------------+-----------------------
 mirrors_mirrorprotocol        | is_download   |          3 |       -0.67 |           0 |        1 | {t}        | {0.67}
 arches                        | agnostic      |          3 |       -0.67 |           0 |        1 | {f}        | {0.67}
 repos                         | staging       |         10 |        -0.2 |           0 |        1 | {f,t}      | {0.7,0.3}
 repos                         | testing       |         10 |        -0.2 |           0 |        1 | {f,t}      | {0.7,0.3}
 devel_pgpsignature            | valid         |        264 |           1 |           0 |        1 | {t}        | {1}
 packages_flagrequest          | is_spam       |        415 |           1 |           0 |        1 | {f}        | {1}
 donors                        | visible       |        716 |           1 |           0 |        1 | {t}        | {1}
 auth_user                     | is_superuser  |         95 |           2 |           0 |        1 | {f,t}      | {0.957895,0.0421053}
 user_profiles                 | notify        |         95 |           2 |           0 |        1 | {t,f}      | {0.957895,0.0421053}
 auth_user                     | is_active     |         95 |           2 |           0 |        1 | {t,f}      | {0.621053,0.378947}
 auth_user                     | is_staff      |         95 |           2 |           0 |        1 | {f,t}      | {0.873684,0.126316}
 releng_iso                    | active        |        158 |           2 |           0 |        1 | {f,t}      | {0.89,0.106667}
 mirrors_mirror                | isos          |        180 |           2 |           0 |        1 | {t,f}      | {0.972678,0.0273224}
 mirrors_mirror                | active        |        180 |           2 |           0 |        1 | {t,f}      | {0.672131,0.327869}
 mirrors_mirror                | public        |        180 |           2 |           0 |        1 | {t,f}      | {0.978142,0.0218579}
 mirrors_mirrorurl             | has_ipv6      |        379 |           2 |           0 |        1 | {f,t}      | {0.709763,0.290237}
 mirrors_mirrorurl             | has_ipv4      |        379 |           2 |           0 |        1 | {t}        | {0.997361}
 packages_flagrequest          | is_legitimate |        415 |           2 |           0 |        1 | {t,f}      | {0.992754,0.00724638}
 packages_signoffspecification | enabled       |       1130 |           2 |           0 |        1 | {t,f}      | {0.977578,0.0224215}
 packages_signoffspecification | known_bad     |       1130 |           2 |           0 |        1 | {f,t}      | {0.993722,0.00627803}
 mirrors_mirrorlog             | is_success    |      12715 |           2 |           0 |        1 | {t,f}      | {0.953345,0.0466552}
 package_depends               | optional      |      28592 |           2 |           0 |        1 | {f,t}      | {0.880322,0.119678}
 package_files                 | is_directory  |     225084 |           2 |           0 |        1 | {f,t}      | {0.829933,0.170067}
(23 rows)


 src/backend/commands/analyze.c |   18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 9cd6e67..995ed9d 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -2110,7 +2110,7 @@ compute_minimal_stats(VacAttrStatsP stats,
 * least 2 instances in the sample.
 */
if (track_cnt < track_max && to

[HACKERS] who's familiar with the GSOC application process

2012-03-25 Thread Dave Cramer
Someone has approached the JDBC list to do some work on the driver as
a GSOC project.

I need to know what is involved in mentoring and how to get the
project approved


Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_upgrade incorrectly equates pg_default and database tablespace

2012-03-25 Thread Ants Aasma
Hi,

while working on a support case I stumbled upon a bug in pg_upgrade.
Upgrade fails with "No such file or directory" when a database is
moved to a non-default tablespace and contains a table that is moved
to pg_default. The cause seems to be that the following test
incorrectly equates an empty spclocation with the database tablespace:

tblspace = PQgetvalue(res, relnum, i_spclocation);
/* if no table tablespace, use the database tablespace */
if (strlen(tblspace) == 0)
tblspace = dbinfo->db_tblspace;

Patch to fix this is attached.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
diff --git a/contrib/pg_upgrade/info.c b/contrib/pg_upgrade/info.c
index 36683fa..3914403 100644
--- a/contrib/pg_upgrade/info.c
+++ b/contrib/pg_upgrade/info.c
@@ -253,6 +253,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
 	char	   *nspname = NULL;
 	char	   *relname = NULL;
 	int			i_spclocation,
+				i_spcoid,
 				i_nspname,
 				i_relname,
 				i_oid,
@@ -269,7 +270,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
 
 	snprintf(query, sizeof(query),
 			 "SELECT c.oid, n.nspname, c.relname, "
-			 "	c.relfilenode, %s "
+			 "	c.relfilenode, t.oid AS spcoid, %s "
 			 "FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n "
 			 "	   ON c.relnamespace = n.oid "
 			 "  LEFT OUTER JOIN pg_catalog.pg_tablespace t "
@@ -306,6 +307,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
 	i_nspname = PQfnumber(res, "nspname");
 	i_relname = PQfnumber(res, "relname");
 	i_relfilenode = PQfnumber(res, "relfilenode");
+	i_spcoid = PQfnumber(res, "spcoid");
 	i_spclocation = PQfnumber(res, "spclocation");
 
 	for (relnum = 0; relnum < ntups; relnum++)
@@ -325,7 +327,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
 
 		tblspace = PQgetvalue(res, relnum, i_spclocation);
 		/* if no table tablespace, use the database tablespace */
-		if (strlen(tblspace) == 0)
+		if (atooid(PQgetvalue(res, relnum, i_spcoid)) == InvalidOid)
 			tblspace = dbinfo->db_tblspace;
 		strlcpy(curr->tablespace, tblspace, sizeof(curr->tablespace));
 	}
diff --git a/contrib/pg_upgrade/pg_upgrade.h b/contrib/pg_upgrade/pg_upgrade.h
index c1925cf..234ca99 100644
--- a/contrib/pg_upgrade/pg_upgrade.h
+++ b/contrib/pg_upgrade/pg_upgrade.h
@@ -109,7 +109,8 @@ typedef struct
 	char		relname[NAMEDATALEN];	/* relation name */
 	Oid			reloid;			/* relation oid */
 	Oid			relfilenode;	/* relation relfile node */
-	char		tablespace[MAXPGPATH];	/* relations tablespace path */
+	/* relations tablespace path, empty for pg_default */
+	char		tablespace[MAXPGPATH];
 } RelInfo;
 
 typedef struct

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Weak-memory specific problem in ResetLatch/WaitLatch (follow-up analysis)

2012-03-25 Thread Michael Tautschnig
Hi,

[...]
> > Placing a sync (i.e., the strongest Power barrier) accordingly would,
> > however, still be insufficient for the second problem, as it would only
> > fix the reordering of read-read pairs by Worker 1 and the store atomicity
> > issue from Worker 0. But the writes on Worker 0 could still be reordered
> > (problem number 2). One possible fix consists of placing a sync between
> > the two writes on Worker 0, and an address dependency between the two
> > reads on Worker 1. Clearly, however, these are changes that cannot any
> > longer be hidden behind the ResetLatch/WaitLatch interface, but rather
> > must go in the code using these.
> 
[...]
> However, your example is enough unlike the actual code that the
> conclusion you state following the word "clearly" isn't actually clear
> to me.  According to latch.h, the correct method of using a latch is
> like this:
> 
>  * for (;;)
>  * {
>  * ResetLatch();
>  * if (work to do)
>  * Do Stuff();
>  * WaitLatch();
>  * }
> 
> Meanwhile, anyone who is creating additional work to do should add the
> work to the queue and then set the latch.
> 

When writing the above statement, including the "clearly", we were probably
thinking too much of the above usage hint, which uses just ResetLatch and
WaitLatch. As you say, ...

> So it seems to me that we could potentially fix this by inserting
> barriers at the end of ResetLatch and at the beginning of SetLatch and
> WaitLatch.  Then the latch has to get reset before we check whether
> there's work to do; and we've got to finish checking for work before
> we again try to wait for the latch.  Similarly, any work that was in
> progress before SetLatch was called will be forced to be committed to
> memory before SetLatch does anything else.  Adding that many barriers
> might not be very good for performance but it seems OK from a
> correctness point of view, unless I am missing something, which is
> definitely possible.  I'd appreciate any thoughts you have on this, as
> this is clearly subtle and tricky to get exactly right.
> 

... placing another barrier in "SetLatch" could do the trick. We will apply
our tools to actually prove this and come back with a conclusive answer.
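
In other words, the placement under discussion is roughly the following (a
sketch under assumptions, not latch.c as committed; pg_memory_barrier()
stands in for whatever full fence the platform provides):

#include <stdbool.h>

typedef struct Latch
{
	volatile bool is_set;
} Latch;

extern void pg_memory_barrier(void);	/* assumed full fence */

void
ResetLatch(Latch *latch)
{
	latch->is_set = false;
	/* the reset must be visible before the work queue is re-checked */
	pg_memory_barrier();
}

void
SetLatch(Latch *latch)
{
	/* queued work must be visible before the flag is raised */
	pg_memory_barrier();
	latch->is_set = true;
	/* ... then wake any sleeping waiter (e.g. a self-pipe write) ... */
}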

Best,
Michael



pgpYu5lX5GiTq.pgp
Description: PGP signature


[HACKERS] how can i see the log..?

2012-03-25 Thread lekon
I want to see the log of INSERT, SELECT, DELETE... statements within
PostgreSQL v9.1.3.

I'm sorry about my English level...

How can I see that?

Is it the transaction log... or something else?



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/how-can-i-see-the-log-tp5583083p5583083.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Regarding column reordering project for GSoc 2012

2012-03-25 Thread Claes Jakobsson

On 20 mar 2012, at 13.08, Heikki Linnakangas wrote:
> On 20.03.2012 11:10, Claes Jakobsson wrote:
>> 
>> Personally I'd love a type 2 JDBC driver for PostgreSQL.
> 
> Why?

listen/notify over SSL, for example, unless that's been fixed in the JDBC 
driver recently. And I'm sure there are other things in libpq that would be 
nice to have.

As mainly a Perl dude who uses libpq via DBD::Pg, I find it odd that the Java 
people don't do the same instead of reimplementing everything.

Cheers,
/Claes


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Error trying to compile a simple C trigger

2012-03-25 Thread Marco Nenciarini
On Tue, 20/03/2012 at 11.16 +, Peter Geoghegan wrote:
> On 20 March 2012 10:53, Marco Nenciarini
>  wrote:
> > alert.c: In function ‘dbms_alert_defered_signal’:
> > alert.c:839:33: error: dereferencing pointer to incomplete type
> > make: *** [alert.o] Error 1
> >
> > I've also tried the example at
> >
> > http://www.postgresql.org/docs/devel/static/trigger-example.html
> >
> > and the result is exactly the same.
> >
> > trigtest.c: In function ‘trigf’:
> > trigtest.c:44:36: error: dereferencing pointer to incomplete type
> > make: *** [trigtest.o] Error 1
> 
> I'd say this is an unintended consequence of a pgrminclude run. Try adding 
> this:
> 
> #include "access/tupdesc.h"

It doesn't work. The error is still the same.

Regards,
Marco

-- 
Marco Nenciarini - System manager @ Devise.IT
marco.nenciar...@devise.it | http://www.devise.it



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Regarding column reordering project for GSoc 2012

2012-03-25 Thread Claes Jakobsson
On 19 mar 2012, at 22.45, Merlin Moncure wrote:
> one pl/java based project that IMO is just screaming to be done is a
> pl/java based FDW (wrapping JDBC) that would approximately reproduce
> dblink...maybe with some extra features geared towards ETL type jobs
> like a row based callback for transformations in java.

Personally I'd love a type 2 JDBC driver for PostgreSQL. It should be feasible 
as a summer project. It's somewhere deep down on my TODO list so I'd be happy 
to help.

Cheers,
Claes 
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Gsoc2012 Idea --- Social Network database schema

2012-03-25 Thread Marc Mamin
Hello,

Here is something we'd like to have:

http://archives.postgresql.org/pgsql-hackers/2012-01/msg00650.php

As we are quite busy and this issue hasn't a high priority, we haven't followed 
it until now :-(

I'm only a Postgres user, not a hacker, so I don't have the knowledge to help 
on this nor to evaluate whether this might be a good GSoC project.

Just an idea for the case you are looking for another topic.

best regards,

Marc Mamin

From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Qi Huang
Sent: Saturday, 24 March 2012 05:20
To: cbbro...@gmail.com; kevin.gritt...@wicourts.gov
Cc: pgsql-hackers@postgresql.org; and...@anarazel.de; 
alvhe...@commandprompt.com; neil.con...@gmail.com; dan...@heroku.com; 
j...@agliodbs.com
Subject: Re: [HACKERS] Gsoc2012 Idea --- Social Network database schema

 

> Date: Thu, 22 Mar 2012 13:17:01 -0400
> Subject: Re: [HACKERS] Gsoc2012 Idea --- Social Network database schema
> From: cbbro...@gmail.com
> To: kevin.gritt...@wicourts.gov
> CC: pgsql-hackers@postgresql.org
> 
> On Thu, Mar 22, 2012 at 12:38 PM, Kevin Grittner
>  wrote:
> > Tom Lane  wrote:
> >> Robert Haas  writes:
> >>> Well, the standard syntax apparently aims to reduce the number of
> >>> returned rows, which ORDER BY does not.  Maybe you could do it
> >>> with ORDER BY .. LIMIT, but the idea here I think is that we'd
> >>> like to sample the table without reading all of it first, so that
> >>> seems to miss the point.
> >>
> >> I think actually the traditional locution is more like
> >>   WHERE random() < constant
> >> where the constant is the fraction of the table you want.  And
> >> yeah, the presumption is that you'd like it to not actually read
> >> every row.  (Though unless the sampling density is quite a bit
> >> less than 1 row per page, it's not clear how much you're really
> >> going to win.)
> >
> > It's all going to depend on the use cases, which I don't think I've
> > heard described very well yet.
> >
> > I've had to pick random rows from, for example, a table of
> > disbursements to support a financial audit.  In those cases it has
> > been the sample size that mattered, and order didn't.  One
> > interesting twist there is that for some of these financial audits
> > they wanted the probability of a row being selected to be
> > proportional to the dollar amount of the disbursement.  I don't
> > think you can do this without a first pass across the whole data
> > set.
> 
> This one was commonly called "Dollar Unit Sampling," though the
> terminology has gradually gotten internationalized.
> http://www.dummies.com/how-to/content/how-does-monetary-unit-sampling-work.html
> 
> What the article doesn't mention is that some particularly large items
> might wind up covering multiple samples. In the example, they're
> looking for a sample every $3125 down the list. If there was a single
> transaction valued at $3, that (roughly) covers 10 of the desired
> samples.
> 
> It isn't possible to do this without scanning across the entire table.
> 
> If you want repeatability, you probably want to instantiate a copy of
> enough information to indicate the ordering chosen. That's probably
> something that needs to be captured as part of the work of the audit,
> so not only does it need to involve a pass across the data, it
> probably requires capturing a fair bit of data for posterity.
> -- 
> When confronted by a difficult problem, solve it by reducing it to the
> question, "How would the Lone Ranger handle this?"

 

 

The discussion till now has gone far beyond my understanding.

Could anyone explain briefly what the idea is at this point?

The design details are still unfamiliar to me. I can only take the time to 
understand them after being selected, by reading the relevant material.

For now, I'm still curious why Neil's implementation is no longer working. 
Postgres has been patched a lot, but the general idea behind Neil's 
implementation should still work, shouldn't it?

Besides, whether this feature is wanted at all is still not decided. That 
seems to be another hard-to-decide point. Is this topic perhaps just not 
ready for GSoC yet? If so, I think I still have time to switch to other 
topics. Any suggestions?

 

Thanks.

Best Regards and Thanks

Huang Qi Victor

Computer Science of National University of Singapore



Re: [HACKERS] foreign key locks, 2nd attempt

2012-03-25 Thread Simon Riggs
On Sat, Mar 17, 2012 at 10:45 PM, Alvaro Herrera
 wrote:

> Here is v11.  This version is mainly updated to add pg_upgrade support,
> as discussed.  It also contains the README file that was posted earlier
> (plus wording fixes per Bruce), a couple of bug fixes, and some comment
> updates.

The main thing we're waiting on is the performance testing to confirm
the lack of regression.

You are working on that, right?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers