Re: [HACKERS] small bug in recoveryStopsHere()

2011-04-14 Thread Jaime Casanova
On Thu, Apr 14, 2011 at 1:30 PM, Robert Haas  wrote:
> I discovered while fooling around the other night that the named
> restore point patch introduced a small bug into recoveryStopsHere():
> the test at the top of the function now lets through two
> resource-manager IDs rather than one, but the remainder of the
> function tests only the record_info flag and not the
> resource-manager-id.  So the test for record_info == XLOG_XACT_COMMIT,
> for example, will also return true for an XLOG_CHECKPOINT_SHUTDOWN
> record, but the decoded commit time will be some random garbage rather
> than a commit time, because the format of the record is totally
> different.
>

i guess that's why i originally used a more complicated approach (now
i can breathe again, i couldn't fully remember why i used it)
"""
!   couldStop = true;
if (record->xl_rmid != RM_XACT_ID)
!   couldStop = false;
!   /*
!* Or when we found a named restore point
!*/
record_info = record->xl_info & ~XLR_INFO_MASK;
+   if ((record->xl_rmid == RM_XLOG_ID) && (record_info == XLOG_RESTORE_POINT))
+   couldStop = true;
+
+   if (!couldStop)
+   return false;
"""

but i agree that your solution is more readable, i don't see any
problems from here

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign table permissions and cloning

2011-04-14 Thread Tom Lane
Robert Haas  writes:
> On Fri, Apr 1, 2011 at 5:13 AM, Thom Brown  wrote:
>> On 1 April 2011 12:57, Shigeru HANADA  wrote:
>>> NOT NULL constraint on foreign table is just declaration and can't
>>> force data integrity.  And I noticed that CREATE FOREIGN TABLE
>>> document doesn't mention that serial and bigserial can't be used in
>>> foreign table.  Please see foreign_table_doc.patch for this fix.

>> I'd be inclined to generalise it to say that default values can't be
>> used on a foreign table, and then say that as a result, serial and
>> bigserial can't be used.

> +1.

Why is this a documentation issue and not a code issue?  IMO we should
flat out reject both NOT NULL and DEFAULT declarations on foreign
tables, until such time as we're prepared to do something useful with
them.  Reasons:

1. Accepting non-functional constraint declarations is something we've
been heard to ridicule mysql for.  With good reason.

2. It probably won't be too long before the planner makes optimization
decisions that assume NOT NULL declarations to be truthful.  When that
day comes, I don't want to be seeing an exception for foreign tables in
that logic.

3. When we do get around to making it actually work, we will have a
backwards-compatibility problem if prior versions accepted the
declaration but treated it as a no-op.

regards, tom lane



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Tom Lane
David Fetter  writes:
> On Thu, Apr 14, 2011 at 10:43:16AM -0400, Tom Lane wrote:
>> ... I think a lot of this ultimately
>> traces to the extensible, data-type-agnostic design philosophy.  The
>> fact that we don't know what an integer is until we look in pg_type,
>> and don't know what an "=" operator does until we look up its
>> properties, is great from a flexibility point of view; but this sort
>> of query is where the costs become obvious.

> Is it time to revisit that decision?

Umm ... what are you proposing?  Toss user-defined data types and
user-defined operators overboard?  No interest in that here.

regards, tom lane



Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name

2011-04-14 Thread Pavel Stehule
2011/4/15 Jim Nasby :
> On Apr 14, 2011, at 4:20 PM, Kevin Grittner wrote:
>> Tom Lane  wrote:
>>> Robert Haas  writes:
>>
 So far the most promising proposal I've seen seems to be to let
 id mean the parameter called id only when it can't refer to
 anything in the query.
>>
>>> Yeah, I've come round to that position too.  I think allowing
>>> parameter names to be checked only after query names is probably
>>> the best answer.
>>
>> +1
>>
>> That seems the most useful and least surprising approach to me.
>
> As part of this, can we also allow specifying an alias for the function name? 
> That would make it far less onerous to disambiguate parameters. Unfortunately 
> we obviously couldn't use AS as the keyword for this alias; maybe we could 
> use ALIAS instead? IE:
>
> CREATE FUNCTION function_with_really_really_descriptive_name (
>  some_parameter int
> ) RETURNS int LANGUAGE SQL ALIAS fwrrdn AS $$
>        SELECT fwrrdn.some_parameter
> $$;
> --

I see this can be a problem for other languages - mainly for PL/pgSQL.
Aliases should be supported there too. And this small feature can be
terrible when somebody tries to port your code to other platforms.
Personally, I don't think it is necessary.

-1

Regards

Pavel Stehule

> Jim C. Nasby, Database Architect                   j...@nasby.net
> 512.569.9461 (cell)                         http://jim.nasby.net
>
>
>



Re: [HACKERS] using a lot of maintenance_work_mem

2011-04-14 Thread Jim Nasby
On Apr 9, 2011, at 9:23 PM, Stephen Frost wrote:
> Actually, Tom has a point in that work_mem can be set above 1GB (which
> is where I had it set previously..).  I didn't think it'd actually do
> anything given the MaxAlloc limit, but surprisingly, it does (at least,
> under 8.4).  I'm currently trying to see if we've got anything that's
> going to *break* with work_mem set up that high; right now I have a
> hashagg plan running across this data set which has 2.4G allocated to
> it so far.
> 
> I'll update this thread with whatever I find out.  I'm trying to
> remember the other issues that I ran into with this limit (beyond the
> whole sort limit, which I do think would be helped by allowing a larger
> value, but it's not as big a deal).

FWIW, I regularly set maintenance_work_mem to 8G for index builds. Presumably 
that's equivalent to running a sort in a regular query with work_mem set that 
high...
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Re: [HACKERS] Proposal for GSoC : ADJ dashboard (Administration related software)

2011-04-14 Thread Jim Nasby
On Apr 14, 2011, at 1:45 PM, Robert Haas wrote:
> On Wed, Apr 6, 2011 at 11:06 AM, Erdinc Akkaya  wrote:
>> Project Title: ADJ Dashboard
>> Name : Erdinc AKKAYA
>> Email: erdinc.akk...@gmail.com
>> 
>> Synopsis
>> AnyDBJSP is a database monitoring and reporting solution with a
>> browser-based interface. The ADJ dashboard will mainly be written for
>> database admins (DBAs). This tool will have pre-defined SQL queries. In
>> addition to these, a DBA can easily write his/her own SQL queries and
>> report structures. Similar projects are EnterpriseDB's dashboard and
>> Cybertec's pgwatch (www.cybertec.at).
>> 
>> Benefits to the PostgreSQL Community
>> All the SQL queries are stored in XML files. These XML files can be
>> updated easily; with contributions from the community this XML file can
>> grow easily, and in the end the community will have a strong database
>> monitoring tool. Query XML files will be published on the web (like wiki
>> pages) and everyone can easily design his/her own monitoring tool.
>> 
>> Another advantage is that, after a time, the community will have a strong
>> set of SQL queries that monitor databases, so if someone wants to study
>> PostgreSQL database administration, ADJ dashboard will be a good tool to
>> start with.
> 
> I think it would be a lot more useful to pick one of the existing
> administration tools for PostgreSQL and work on improving it, rather
> than trying to create a whole new one.

It doesn't sound like he's creating a new tool, just adding support for an 
existing one. And any shops that run more than one RDBMS would certainly be 
interested in a tool that can monitor more than one brand of RDBMS.

I think a better question is: just how large is the ADJ community, or how many 
folks would actually use PG support in ADJ?
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name

2011-04-14 Thread Jim Nasby
On Apr 14, 2011, at 4:20 PM, Kevin Grittner wrote:
> Tom Lane  wrote:
>> Robert Haas  writes:
> 
>>> So far the most promising proposal I've seen seems to be to let
>>> id mean the parameter called id only when it can't refer to
>>> anything in the query.
> 
>> Yeah, I've come round to that position too.  I think allowing
>> parameter names to be checked only after query names is probably
>> the best answer.
> 
> +1
> 
> That seems the most useful and least surprising approach to me.

As part of this, can we also allow specifying an alias for the function name? 
That would make it far less onerous to disambiguate parameters. Unfortunately 
we obviously couldn't use AS as the keyword for this alias; maybe we could use 
ALIAS instead? IE:

CREATE FUNCTION function_with_really_really_descriptive_name (
 some_parameter int
) RETURNS int LANGUAGE SQL ALIAS fwrrdn AS $$
SELECT fwrrdn.some_parameter
$$;
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Re: [HACKERS] Typed table DDL loose ends

2011-04-14 Thread Noah Misch
On Thu, Apr 14, 2011 at 11:23:49AM -0700, Robert Haas wrote:
> On Thu, Apr 14, 2011 at 5:18 AM, Noah Misch  wrote:
> >> I guess my gut feeling is that it would make more sense to forbid it
> >> outright for 9.1, and we can look at relaxing that restriction later
> >> if we're so inclined.
> >>
> >> Much as with the problem Tom fixed in commit
> >> eb51af71f241e8cb199790dee9ad246bb36b3287, I'm concerned that there may
> >> be other cases that we're not thinking of right now, and while we
> >> could find them all and fix them, the amount of functionality gained
> >> is fairly marginal, and I don't really want to hold up the release
> >> while we bug-swat.
> >
> > Symmetry was the best cause I could find to continue allowing it, and
> > your case in favor of reducing the bug surface is more compelling.
> > Let's forbid it.
> 
> OK.  Care to propose a patch?

Sure; attached.  It requires that the type relation be RELKIND_COMPOSITE_TYPE.
We hadn't explicitly discussed the use of foreign table, view, toast table, or
sequence row types.  The first two might have some value, someday; I'm sure
nobody cares for the latter two.

nm
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index eba890b..31b1fb0 100644
*** a/src/backend/parser/parse_utilcmd.c
--- b/src/backend/parser/parse_utilcmd.c
***
*** 825,830  transformOfType(CreateStmtContext *cxt, TypeName *ofTypename)
--- 825,831 
TupleDesc   tupdesc;
int i;
Oid ofTypeId;
+   bool        typeOk = false;
  
AssertArg(ofTypename);
  
***
*** 833,842  transformOfType(CreateStmtContext *cxt, TypeName *ofTypename)
ofTypeId = HeapTupleGetOid(tuple);
ofTypename->typeOid = ofTypeId; /* cached for later */
  
!   if (typ->typtype != TYPTYPE_COMPOSITE)
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
!errmsg("type %s is not a composite type",
format_type_be(ofTypeId))));
  
tupdesc = lookup_rowtype_tupdesc(ofTypeId, -1);
--- 834,852 
ofTypeId = HeapTupleGetOid(tuple);
ofTypename->typeOid = ofTypeId; /* cached for later */
  
!   if (typ->typtype == TYPTYPE_COMPOSITE)
!   {
!   Relation    typeRelation;
! 
!   Assert(OidIsValid(typ->typrelid));
!   typeRelation = relation_open(typ->typrelid, AccessShareLock);
!   typeOk = (typeRelation->rd_rel->relkind == RELKIND_COMPOSITE_TYPE);
!   relation_close(typeRelation, NoLock);
!   }
!   if (!typeOk)
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
!errmsg("type %s is not a stand-alone composite type",
format_type_be(ofTypeId))));
  
tupdesc = lookup_rowtype_tupdesc(ofTypeId, -1);
diff --git a/src/test/regress/expected/typed_table.out b/src/test/regress/expected/typed_table.out
index 0874a64..69ad58e 100644
*** a/src/test/regress/expected/typed_table.out
--- b/src/test/regress/expected/typed_table.out
***
*** 91,96  DETAIL:  drop cascades to table persons
--- 91,98 
  drop cascades to function get_all_persons()
  drop cascades to table persons2
  drop cascades to table persons3
+ CREATE TABLE persons5 OF stuff; -- only CREATE TYPE AS types may be used
+ ERROR:  type stuff is not a stand-alone composite type
  DROP TABLE stuff;
  -- implicit casting
  CREATE TYPE person_type AS (id int, name text);
diff --git a/src/test/regress/sql/typed_table.sql b/src/test/regress/sql/typed_table.sql
index b0d452c..25aaccb 100644
*** a/src/test/regress/sql/typed_table.sql
--- b/src/test/regress/sql/typed_table.sql
***
*** 46,51  CREATE TABLE persons4 OF person_type (
--- 46,53 
  DROP TYPE person_type RESTRICT;
  DROP TYPE person_type CASCADE;
  
+ CREATE TABLE persons5 OF stuff; -- only CREATE TYPE AS types may be used
+ 
  DROP TABLE stuff;
  
  



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Andres Freund
On Thursday 14 April 2011 23:10:41 Tom Lane wrote:
> Noah Misch  writes:
> > On Thu, Apr 14, 2011 at 11:15:00AM -0700, Robert Haas wrote:
> >> It shouldn't be
> >> terribly difficult to come up with some kind of hash function based
> >> on, say, the first two characters of the keyword that would be a lot
> >> faster than what we're doing now.
> > 
> > I'd look at `gperf', which generates code for this from your keyword
> > list.
> 
> FWIW, mysql used to use gperf for this purpose, but they've abandoned it
> in favor of some homegrown hashing scheme.  I don't know exactly why,
> but I wonder if it was for licensing reasons.  gperf itself is GPL, and
> I don't see any disclaimer in the docs saying that its output isn't.
http://lists.gnu.org/archive/html/bug-gnu-utils/2008-08/msg5.html :

> Thanks for the suggestion; it indeed becomes sort of an FAQ. I've added
> 
> this text to the documentation:
>gperf is under GPL, but that does not cause the output produced
>by gperf to be under GPL.  The reason is that the output contains
>only small pieces of text that come directly from gperf's source
>code -- only about 7 lines long, too small for being significant --,
>and therefore the output is not a "derivative work" of gperf (in the
>sense of U.S.@: copyright law).

Andres



Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name

2011-04-14 Thread Kevin Grittner
Tom Lane  wrote:
> Robert Haas  writes:
 
>> So far the most promising proposal I've seen seems to be to let
>> id mean the parameter called id only when it can't refer to
>> anything in the query.
 
> Yeah, I've come round to that position too.  I think allowing
> parameter names to be checked only after query names is probably
> the best answer.
 
+1
 
That seems the most useful and least surprising approach to me.
 
-Kevin



Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name

2011-04-14 Thread Tom Lane
Robert Haas  writes:
> On Thu, Apr 14, 2011 at 1:55 PM, Tom Lane  wrote:
>> Hmm, what I read Dimitri to be proposing is that we *require* parameter
>> names to be qualified with the function name.  I don't recall hearing
>> that before.  It would solve the problem perhaps, but I think the moans
>> and groans will be numerous.

> So far the most promising proposal I've seen seems to be to let id
> mean the parameter called id only when it can't refer to anything in
> the query.

> Tabula rasa, I'd prefer your proposal to make any ambiguity an error,
> but it's not worth the breakage.

Yeah, I've come round to that position too.  I think allowing parameter
names to be checked only after query names is probably the best answer.

> I'd be fine with having a way to
> explicitly request that behavior though, a la Perl's "use strict".

This is possible but it's not clear it's worth the work.

regards, tom lane



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Merlin Moncure
On Thu, Apr 14, 2011 at 4:10 PM, Tom Lane  wrote:
> Noah Misch  writes:
>> On Thu, Apr 14, 2011 at 11:15:00AM -0700, Robert Haas wrote:
>>> It shouldn't be
>>> terribly difficult to come up with some kind of hash function based
>>> on, say, the first two characters of the keyword that would be a lot
>>> faster than what we're doing now.
>
>> I'd look at `gperf', which generates code for this from your keyword list.
>
> FWIW, mysql used to use gperf for this purpose, but they've abandoned it
> in favor of some homegrown hashing scheme.  I don't know exactly why,
> but I wonder if it was for licensing reasons.  gperf itself is GPL, and
> I don't see any disclaimer in the docs saying that its output isn't.

I don't think it matters -- see the entry in the bison FAQ:
"Is there some way that I can GPL the output people get from use of my
program? For example, if my program is used to develop hardware
designs, can I require that these designs must be free?

In general this is legally impossible; copyright law does not give
you any say in the use of the output people make from their data using
your program. If the user uses your program to enter or convert his
own data, the copyright on the output belongs to him, not you. More
generally, when a program translates its input into some other form,
the copyright status of the output inherits that of the input it was
generated from."

merlin



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Tom Lane
Noah Misch  writes:
> On Thu, Apr 14, 2011 at 11:15:00AM -0700, Robert Haas wrote:
>> It shouldn't be
>> terribly difficult to come up with some kind of hash function based
>> on, say, the first two characters of the keyword that would be a lot
>> faster than what we're doing now.

> I'd look at `gperf', which generates code for this from your keyword list.

FWIW, mysql used to use gperf for this purpose, but they've abandoned it
in favor of some homegrown hashing scheme.  I don't know exactly why,
but I wonder if it was for licensing reasons.  gperf itself is GPL, and
I don't see any disclaimer in the docs saying that its output isn't.

regards, tom lane



Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name

2011-04-14 Thread Robert Haas
On Thu, Apr 14, 2011 at 1:55 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Apr 14, 2011 at 11:37 AM, Dimitri Fontaine
>>  wrote:
>>> Unless we make it so that no such version ever exists.  Meaning that the
>>> code works fine as is or using WHERE id = developer_lookup.id.  AS id
>>> can't ever be the parameter in this case, you're just fine.
>>>
>>> Bearing in mind that $1 etc shortcuts still are available, I don't
>>> really see this qualification of parameter names with function names so
>>> big a problem that we should find a way to avoid it and risk breaking
>>> compatibility.
>>>
>>> Don't forget that any ambiguity here will mean *huge* migration costs.
>
>> If I'm reading your email correctly, we're in agreement.
>
> Hmm, what I read Dimitri to be proposing is that we *require* parameter
> names to be qualified with the function name.  I don't recall hearing
> that before.  It would solve the problem perhaps, but I think the moans
> and groans will be numerous.

So far the most promising proposal I've seen seems to be to let id
mean the parameter called id only when it can't refer to anything in
the query.

Tabula rasa, I'd prefer your proposal to make any ambiguity an error,
but it's not worth the breakage.  I'd be fine with having a way to
explicitly request that behavior though, a la Perl's "use strict".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Robert Haas
On Thu, Apr 14, 2011 at 12:45 PM, Noah Misch  wrote:
>> It shouldn't be
>> terribly difficult to come up with some kind of hash function based
>> on, say, the first two characters of the keyword that would be a lot
>> faster than what we're doing now.
>
> I'd look at `gperf', which generates code for this from your keyword list.

I thought of that, but wasn't sure we wanted to introduce a dependency
on that tool.  That might be a pain, especially on Windows.  But maybe
we could steal the basic approach.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name

2011-04-14 Thread Tom Lane
Robert Haas  writes:
> On Thu, Apr 14, 2011 at 11:37 AM, Dimitri Fontaine
>  wrote:
>> Unless we make it so that no such version ever exists.  Meaning that the
>> code works fine as is or using WHERE id = developer_lookup.id.  AS id
>> can't ever be the parameter in this case, you're just fine.
>> 
>> Bearing in mind that $1 etc shortcuts still are available, I don't
>> really see this qualification of parameter names with function names so
>> big a problem that we should find a way to avoid it and risk breaking
>> compatibility.
>> 
>> Don't forget that any ambiguity here will mean *huge* migration costs.

> If I'm reading your email correctly, we're in agreement.

Hmm, what I read Dimitri to be proposing is that we *require* parameter
names to be qualified with the function name.  I don't recall hearing
that before.  It would solve the problem perhaps, but I think the moans
and groans will be numerous.

regards, tom lane



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Tom Lane
Andres Freund  writes:
> Where I am with you is that it's quite possible that it will not make sense
> (performance-wise) for all contexts. Which is quite annoying.

The mcxt stuff was designed from day one to support multiple types of
contexts, so it wouldn't be very hard at all to have different contexts
using different allocation policies.  The issue is to figure out what
to use where ...

regards, tom lane



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Andres Freund
On Thursday 14 April 2011 22:21:26 Alvaro Herrera wrote:
> Excerpts from Andres Freund's message of jue abr 14 17:08:34 -0300 2011:
> > Hi,
> > 
> > On Thursday 14 April 2011 16:43:16 Tom Lane wrote:
> > > I doubt that it's possible to make AllocSetAlloc radically cheaper.
> > 
> > I actually doubt your doubt. I think you could add some SLAB like
> > interface for common allocation sizes making it significantly faster for
> > some uses (because allocation/deallocation is a single linked list
> > operation). Maybe even treat everything < some_size as a slab object in
> > the next bigger slab.
> I think the problem with a slab allocator is that it would make
> hierarchical context management slower and/or more complicated (e.g.
> reset context on transaction abort).
I am not so sure that it would be slower. I think that if you're careful you
can mostly reuse what is currently done for chunks to implement slabs.
For context resets you can just throw away all chunks/blocks.

Where I am with you is that it's quite possible that it will not make sense
(performance-wise) for all contexts. Which is quite annoying.

Andres



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Merlin Moncure
On Thu, Apr 14, 2011 at 2:08 AM, Greg Smith  wrote:
> This week several list regulars here waded into the MySQL Convention.  I
> decided to revisit PostgreSQL vs. MySQL performance using the sysbench
> program as part of that.  It's not important to what I'm going to describe
> to understand exactly what statements sysbench runs here or how to use it,
> but if anyone is curious I've got some more details about how I ran the
> tests in my talk slides at http://projects.2ndquadrant.com/talks  The
> program has recently gone through some fixes that make it run a bit better
> both in general and against PostgreSQL.  The write tests are still broken
> against PostgreSQL, but it now seems to do a reasonable job simulating a
> simple SELECT-only workload.  A fix from Jignesh recently made its way into
> the database generation side of the code that makes it less tedious to test
> with it too.
> The interesting part was how per-client scaling compared between the two
> databases; graph attached.  On my 8 core server, PostgreSQL scales nicely up
> to a steady 50K TPS.  I see the same curve, almost identical numbers, with
> PostgreSQL and pgbench--no reason to suspect sysbench is doing anything
> shady.  The version of MySQL I used hits around 67K TPS with innodb when
> busy with lots of clients.  That part doesn't bother me; nobody expects
> PostgreSQL to be faster on trivial SELECT statements and the gap isn't that
> big.
>
> The shocking part was the single client results.  I'm used to seeing
> Postgres get around 7K TPS per core on those, which was the case here, and I
> never considered that an interesting limitation to think about before.
>  MySQL turns out to hit 38K TPS doing the same work.  Now that's a gap
> interesting enough to make me wonder what's going on.
>
> Easy enough to exercise the same sort of single client test case with
> pgbench and put it under a profiler:
>
> sudo opcontrol --init
> sudo opcontrol --setup --no-vmlinux
> createdb pgbench
> pgbench -i -s 10 pgbench
> psql -d pgbench -c "vacuum"
> sudo opcontrol --start
> sudo opcontrol --reset
> pgbench -S -n -c 1 -T 60 pgbench
> sudo opcontrol --dump ; sudo opcontrol --shutdown
> opreport -l image:$HOME/pgwork/inst/test/bin/postgres
>
> Here's the top calls, from my laptop rather than the server that I generated
> the graph against.  It does around 5.5K TPS with 1 client and 10K with 2
> clients, so same basic scaling:
>
> samples  %        image name               symbol name
> 53548     6.7609  postgres                 AllocSetAlloc
> 32787     4.1396  postgres                 MemoryContextAllocZeroAligned
> 26330     3.3244  postgres                 base_yyparse
> 21723 2.7427  postgres                 hash_search_with_hash_value
> 20831 2.6301  postgres                 SearchCatCache
> 19094     2.4108  postgres                 hash_seq_search
> 18402     2.3234  postgres                 hash_any
> 15975     2.0170  postgres                 AllocSetFreeIndex
> 14205     1.7935  postgres                 _bt_compare
> 13370     1.6881  postgres                 core_yylex
> 10455     1.3200  postgres                 MemoryContextAlloc
> 10330     1.3042  postgres                 LockAcquireExtended
> 10197     1.2875  postgres                 ScanKeywordLookup
> 9312      1.1757  postgres                 MemoryContextAllocZero
>
> I don't know nearly enough about the memory allocator to comment on whether
> it's possible to optimize it better for this case to relieve any bottleneck.
>  Might just get a small gain then push the limiter to the parser or hash
> functions.  I was surprised to find that's where so much of the time was
> going though.
>
> P.S. When showing this graph in my talk, I pointed out that anyone who is
> making decisions about which database to use based on trivial SELECTs on
> small databases isn't going to be choosing between PostgreSQL and MySQL
> anyway--they'll be deploying something like MongoDB instead if that's the
> important metric.

on my workstation VM, I get:
6.7k selects single client,
12k selects piped through single user backend,
13.5k piped through single user backend, one transaction
23k in plpgsql 'execute' in loop (which is really two queries, one to
build the query and one to execute),
100k in non dynamic query plpgsql in loop.

in addition to parsing and planning, the network and the protocol
really factor in.  this is why i'm so keen on being able to inject
queries directly into the backend via stored procedure.  I'm also
really curious how a plpgsql-ish language performs when managing
transactions itself.

libpqtypes was written so you could stage data on the client and hand
it off to the backend and act on it in plpgsql world.  it works really
well -- you can amortize the turnaround losses through the network
over a lot of data.

merlin


Re: [HACKERS] WAL, xl_heap_insert and tuple oid mystery

2011-04-14 Thread jagan
Hi,

> 
> > Where is the OID of tuple stored in a WAL record of a
> tuple? If not with xl_heap_header, where is it stored? Is it
> stored at all?
> 
> It's stored in the "tuple data" portion. 

I see it now. I was having alignment issues, which I resolved. Thanks for your 
help. I am still not sure I understand how attributes are packed into a 
buffer. In particular, according to the inline comments in code, this is how a 
xl_heap_insert stores data:

Oid + [padding bits] + [data]

Now, to unpack the data, we would have to know what type of data is stored, 
which is obtained from pg_type. Different types require either a "no (char) 
align", "short align", "int aligned" or "double align". I went over the 
varattrib code and I am also able to unpack varchar, text and other variable 
length types. 

The code I wrote works fine but runs into alignment issues with some of the 
records. So, I am still missing something even though I have spent weeks 
reading postgres code.

I notice the following:

1. Oid starts at MAXALIGNed, which is fine.

2. The subsequent attribute starts right away. There never seems to be padding 
even if the subsequent tuple needs to be double aligned. Hence, I skip aligning 
the first attribute by default, which is bizarre but works.

3. When unpacking goes out of alignment, I print the buffer to see if this is 
because they are externally stored. That is not the case as I can see text 
fields are actually stored in the buffer.

4. The alignment goes out of whack even for entries with no varattrib for 
certain tuples. 

In general, what is the best way to unpack buffers containing tuples? I came 
across "heap_form_tuple" but I am not sure if that is the best way to go about it. 
Is there some documentation on how tuples are internally stored? 

Thanks in advance for any pointers.
Jagan




Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Greg Smith

Heikki Linnakangas wrote:
In this case you could just use prepared statements and get rid of all 
the parser related overhead, which includes much of the allocations.


Trying that gives me around 9200 TPS instead of 5500 on my laptop, so a 
pretty big gain here.  Will have to include that in my next round of 
graphs across multiple client loads once I'm home again and can run 
easily on my server.


To provide a matching profile from the same system as the one I already 
submitted from, for the sake of the archives, here's what the profile I get looks 
like with "-M prepared":


samples  %image name   symbol name
33093 4.8518  postgres AllocSetAlloc
30012 4.4001  postgres hash_seq_search
27149 3.9803  postgres MemoryContextAllocZeroAligned
26987 3.9566  postgres hash_search_with_hash_value
25665 3.7628  postgres hash_any
16820 2.4660  postgres _bt_compare
14778 2.1666  postgres LockAcquireExtended
12263 1.7979  postgres AllocSetFreeIndex
11727 1.7193  postgres tas
11602 1.7010  postgres SearchCatCache
11022 1.6159  postgres pg_encoding_mbcliplen
10963 1.6073  postgres MemoryContextAllocZero
9296  1.3629  postgres MemoryContextCreate
8368  1.2268  postgres fmgr_isbuiltin
7973  1.1689  postgres LockReleaseAll
7423  1.0883  postgres ExecInitExpr
7309  1.0716  postgres pfree

--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD





Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Alvaro Herrera
Excerpts from Andres Freund's message of Thu Apr 14 17:08:34 -0300 2011:
> Hi,
> 
> On Thursday 14 April 2011 16:43:16 Tom Lane wrote:
> > I doubt that it's possible to make AllocSetAlloc radically cheaper.
> I actually doubt your doubt. I think you could add some SLAB like interface 
> for common allocation sizes making it significantly faster for some uses 
> (because allocation/deallocation is a single linked list operation).
> Maybe even treat everything < some_size as a slab object in the next bigger 
> slab.

I think the problem with a slab allocator is that it would make
hierarchical context management slower and/or more complicated (e.g.
reset context on transaction abort).

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Andres Freund
Hi,

On Thursday 14 April 2011 16:43:16 Tom Lane wrote:
> I doubt that it's possible to make AllocSetAlloc radically cheaper.
I actually doubt your doubt. I think you could add some SLAB like interface 
for common allocation sizes making it significantly faster for some uses 
(because allocation/deallocation is a single linked list operation).
Maybe even treat everything < some_size as a slab object in the next bigger 
slab.

Note that I think you're otherwise right...

Andres



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Tom Lane
Heikki Linnakangas  writes:
> There's one very low-hanging fruit here, though. I profiled the pgbench 
> case, with -M prepared, and found that like in Greg Smith's profile, 
> hash_seq_search pops up quite high in the list. Those calls are coming 
> from LockReleaseAll(), where we scan the local lock hash to find all 
> locks held. We specify the initial size of the local lock hash table as 
> 128, which is unnecessarily large for small queries like this. Reducing 
> it to 8 slashed the time spent in hash_seq_search().

> I think we should make that hash table smaller. It won't buy much, 
> somewhere between 1-5 % in this test case, but it's very easy to do and 
> I don't see much downside, it's a local hash table so it will grow as 
> needed.

8 sounds awfully small.  Can you even get as far as preparing the
statements you intend to use without causing that to grow?

I agree that 128 may be larger than necessary, but I don't think we
should pessimize normal usage to gain a small fraction on trivial
queries.  I'd be happier with something like 16 or 32.

regards, tom lane



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Noah Misch
On Thu, Apr 14, 2011 at 11:15:00AM -0700, Robert Haas wrote:
> It's fairly far down in the noise on this particular profile, but in
> the low-hanging fruit department, I think we should fix
> ScanKeywordLookup to use a smarter algorithm that is more like O(1)
> rather than O(lg n) in the number of keywords.

+1

> It shouldn't be
> terribly difficult to come up with some kind of hash function based
> on, say, the first two characters of the keyword that would be a lot
> faster than what we're doing now.

I'd look at `gperf', which generates code for this from your keyword list.



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Heikki Linnakangas

On 14.04.2011 17:43, Tom Lane wrote:

Greg Smith  writes:

samples  %image name   symbol name
53548 6.7609  postgres AllocSetAlloc
32787 4.1396  postgres MemoryContextAllocZeroAligned
26330 3.3244  postgres base_yyparse
21723 2.7427  postgres hash_search_with_hash_value
20831 2.6301  postgres SearchCatCache
19094 2.4108  postgres hash_seq_search
18402 2.3234  postgres hash_any
15975 2.0170  postgres AllocSetFreeIndex
14205 1.7935  postgres _bt_compare
13370 1.6881  postgres core_yylex
10455 1.3200  postgres MemoryContextAlloc
10330 1.3042  postgres LockAcquireExtended
10197 1.2875  postgres ScanKeywordLookup
9312  1.1757  postgres MemoryContextAllocZero


Yeah, this is pretty typical ...


In this case you could just use prepared statements and get rid of all 
the parser related overhead, which includes much of the allocations.



I don't know nearly enough about the memory allocator to comment on
whether it's possible to optimize it better for this case to relieve any
bottleneck.


I doubt that it's possible to make AllocSetAlloc radically cheaper.
I think the more likely route to improvement there is going to be to
find a way to do fewer pallocs.  For instance, if we had more rigorous
rules about which data structures are read-only to which code, we could
probably get rid of a lot of just-in-case tree copying that happens in
the parser and planner.

But at the same time, even if we could drive all palloc costs to zero,
it would only make a 10% difference in this example.  And this sort of
fairly flat profile is what I see in most cases these days --- we've
been playing performance whack-a-mole for long enough now that there
isn't much low-hanging fruit left.  For better or worse, the system
design we've chosen just isn't amenable to minimal overhead for simple
queries.  I think a lot of this ultimately traces to the extensible,
data-type-agnostic design philosophy.  The fact that we don't know what
an integer is until we look in pg_type, and don't know what an "="
operator does until we look up its properties, is great from a flexibility
point of view; but this sort of query is where the costs become obvious.


I think the general strategy to make these kinds of queries faster will be 
to add various fastpaths to cache and skip even more work. For example,


There's one very low-hanging fruit here, though. I profiled the pgbench 
case, with -M prepared, and found that like in Greg Smith's profile, 
hash_seq_search pops up quite high in the list. Those calls are coming 
from LockReleaseAll(), where we scan the local lock hash to find all 
locks held. We specify the initial size of the local lock hash table as 
128, which is unnecessarily large for small queries like this. Reducing 
it to 8 slashed the time spent in hash_seq_search().


I think we should make that hash table smaller. It won't buy much, 
somewhere between 1-5 % in this test case, but it's very easy to do and 
I don't see much downside, it's a local hash table so it will grow as 
needed.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name

2011-04-14 Thread Robert Haas
On Thu, Apr 14, 2011 at 11:37 AM, Dimitri Fontaine
 wrote:
> Robert Haas  writes:
>> rhaas=# CREATE OR REPLACE FUNCTION developer_lookup(id integer)
>> RETURNS text AS $$SELECT name FROM developer WHERE id = $1$$ LANGUAGE
>> sql STABLE;
>>
>> Now, when this person attempts to recreate this function on a
>> hypothetical version of PostgreSQL that thinks "id" is ambiguous, it
>> doesn't work.
>
> Unless we make it so that no such version ever exists.  Meaning that the
> code works fine as is or using WHERE id = developer_lookup.id.  AS id
> can't ever be the parameter in this case, you're just fine.
>
> Bearing in mind that $1 etc shortcuts still are available, I don't
> really see this qualification of parameter names with function names so
> big a problem that we should find a way to avoid it and risk breaking
> compatibility.
>
> Don't forget that any ambiguity here will mean *huge* migration costs.

If I'm reading your email correctly, we're in agreement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] proposal: a validator for configuration files

2011-04-14 Thread Robert Haas
On Mon, Apr 4, 2011 at 2:03 PM, Alexey Klyukin  wrote:
> Here's the update of Selena's patch, which also shows all errors in
> configuration parameters (as well as parser errors) during reload.

You should add this here:

https://commitfest.postgresql.org/action/commitfest_view/open

On a quick glance, this patch appears to contain some superfluous
hunks where you changed whitespace or variable names.  You might want
to remove those and repost before adding to the CF app.  Also, some
submission notes would be very helpful - when you send in the revised
version, detail in the email the exact purpose of the changes so that
someone can review the patch without having to read this thread and
all preceding threads in their entirety.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal for GSoC : ADJ dashboard (Administration related software)

2011-04-14 Thread Robert Haas
On Wed, Apr 6, 2011 at 11:06 AM, Erdinc Akkaya  wrote:
> Project Title: ADJ Dashboard
> Name : Erdinc AKKAYA
> Email: erdinc.akk...@gmail.com
>
> Synopsis
> AnyDBJSP is a database monitoring and reporting solution with a browser
> based
> interface. ADJ dashboard mainly will be written for database admins(DBA).
> This tool will have pre-defined SQL queries. In addition to these, a DBA can
> write his/her own SQL queries and report structures easily. Similar projects are
> EnterpriseDB's dashboard and Cybertec's pgwatch (www.cybertec.at).
>
> Benefits to the PostgreSQL Community
> All the SQL queries are stored in XML files. These XML files can be updated
> easily; with contributions from the community they can grow quickly, and in
> the end the community will have a strong database monitoring tool. Query XML
> files will be published on the web (like wiki pages), and everyone can easily
> design his/her own monitoring tool.
>
> Another advantage is that, over time, the community will have a strong set of
> SQL queries that monitor databases, so if someone wants to study PostgreSQL
> database administration, ADJ dashboard will be a good tool to start with.

I think it would be a lot more useful to pick one of the existing
adminstration tools for PostgreSQL and work on improving it, rather
than trying to create a whole new one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Foreign table permissions and cloning

2011-04-14 Thread Robert Haas
On Fri, Apr 1, 2011 at 1:29 AM, Shigeru HANADA
 wrote:
> In addition to the 2nd GRANT above, "GRANT SELECT (colour) ON stuff TO
> user_a" (omitting TABLE) will succeed too because parser assumes that
> the target object is a regular table if object type was TABLE or
> omitted. This inconsistent behavior would be an oversight and need to
> be fixed.

+1.

> How about to drop "GRANT xxx ON FOREIGN TABLE foo" syntax support and
> use "GRANT xxx ON [TABLE] foo" for foreign tables?  ISTM that "ON
> FOREIGN TABLE" specification is useless because possible privilege
> type would be same as TABLE.

-1.  We should be consistent about treating foreign tables as their
own object type - and the possible privilege types are NOT the same -
only SELECT is supported.

> Probably we should mention in the GRANT documentation that ALL TABLES
> IN SCHEMA is considered to include foreign tables.

Or else change the behavior so that it doesn't, which would probably be my vote.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Foreign table permissions and cloning

2011-04-14 Thread Robert Haas
On Fri, Apr 1, 2011 at 5:13 AM, Thom Brown  wrote:
> On 1 April 2011 12:57, Shigeru HANADA  wrote:
>> NOT NULL constraint on foreign table is just declaration and can't
>> force data integrity.  And I noticed that CREATE FOREIGN TABLE
>> document doesn't mention that serial and bigserial can't be used in
>> foreign table.  Please see foreign_table_doc.patch for this fix.
>
> I'd be inclined to generalise it to say that default values can't be
> used on a foreign table, and then say that as a result, serial and
> bigserial can't be used.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name

2011-04-14 Thread Dimitri Fontaine
Robert Haas  writes:
> rhaas=# CREATE OR REPLACE FUNCTION developer_lookup(id integer)
> RETURNS text AS $$SELECT name FROM developer WHERE id = $1$$ LANGUAGE
> sql STABLE;
>
> Now, when this person attempts to recreate this function on a
> hypothetical version of PostgreSQL that thinks "id" is ambiguous, it
> doesn't work.

Unless we make it so that no such version ever exists.  Meaning that the
code works fine as is or using WHERE id = developer_lookup.id.  As id
can't ever be the parameter in this case, you're just fine.

Bearing in mind that $1 etc shortcuts still are available, I don't
really see this qualification of parameter names with function names so
big a problem that we should find a way to avoid it and risk breaking
compatibility.

Don't forget that any ambiguity here will mean *huge* migration costs.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [DOCS] [HACKERS] Uppercase SGML entity declarations

2011-04-14 Thread Robert Haas
On Tue, Apr 5, 2011 at 6:55 PM, Josh Kupershmidt  wrote:
> On Mon, Apr 4, 2011 at 3:02 PM, Robert Haas  wrote:
>> In theory, we have
>> documentation that explains this:
>>
>> http://www.postgresql.org/docs/current/static/docguide-toolsets.html
>
> While we're on the subject..
>
> Attached is a patch against that page suggesting using openjade 1.3,
> not 1.4devel as part of the doc build toolset. Source of this
> recommendation:
> 
>
> I just double checked, and with the latest openjade package (1.4devel)
> on Ubuntu 10.10, I still see the same segfault; downgrading to package
> "openjade1.3" allows me to make postgres-A4.pdf successfully.

OK, committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] small bug in recoveryStopsHere()

2011-04-14 Thread Robert Haas
I discovered while fooling around the other night that the named
restore point patch introduced a small bug into recoveryStopsHere():
the test at the top of the function now lets through two
resource-manager IDs rather than one, but the remainder of the
function tests only the record_info flag and not the
resource-manager-id.  So the test for record_info == XLOG_XACT_COMMIT,
for example, will also return true for an XLOG_CHECKPOINT_SHUTDOWN
record, but the decoded commit time will be some random garbage rather
than a commit time, because the format of the record is totally
different.

Absent objections, I'll push the attached fix.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


recovery-stops-here.patch
Description: Binary data



Re: [HACKERS] Typed table DDL loose ends

2011-04-14 Thread Robert Haas
On Thu, Apr 14, 2011 at 5:18 AM, Noah Misch  wrote:
>> I guess my gut feeling is that it would make more sense to forbid it
>> outright for 9.1, and we can look at relaxing that restriction later
>> if we're so inclined.
>>
>> Much as with the problem Tom fixed in commit
>> eb51af71f241e8cb199790dee9ad246bb36b3287, I'm concerned that there may
>> be other cases that we're not thinking of right now, and while we
>> could find them all and fix them, the amount of functionality gained
>> is fairly marginal, and I don't really want to hold up the release
>> while we bug-swat.
>
> Symmetry was the best cause I could find to continue allowing it, and your 
> case
> in favor of reducing the bug surface is more compelling.  Let's forbid it.

OK.  Care to propose a patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] POSIX shared memory redux

2011-04-14 Thread Robert Haas
On Thu, Apr 14, 2011 at 7:26 AM, A.M.  wrote:
> From what I understood, the primary purpose of the SysV check was to protect 
> the shared memory from multiple stompers. The interlock was a neat 
> side-effect.

Not really - the purpose of the interlock is to protect the underlying
data files.  The nattch interlock allows us to be very confident that
there isn't another postmaster running on the same data directory on
the same machine, and that is extremely important.

You've just about convinced me that it might not be a bad idea to add
the fcntl() interlock in addition because, as you say, that has a
chance of working even over NFS.  But the interlock we have now is
*extremely* reliable, and I think we'd need to get some other
amazingly compelling benefit to consider changing it (even if we were
convinced that the alternate method was also reliable).  I don't see
that there is one.  Anyone running an existing version of PostgreSQL
in an environment where they care *at all* about performance has
already adjusted their SysV shm settings way up.  The benefit of using
POSIX shm is that in, say, PostgreSQL 9.2, it might be possible for
shared buffers to have a somewhat higher default setting out of the
box, and be further increased from there without kernel parameter
changes.  And there might be more benefits besides that, but certainly
those by themselves seem pretty worthwhile.  SysV shm is extremely
portable, so I don't think that we're losing anything by continuing to
allocate a small amount of it (a few kilobytes, perhaps) and just push
everything else out into POSIX shm.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Robert Haas
On Thu, Apr 14, 2011 at 7:43 AM, Tom Lane  wrote:
> I doubt that it's possible to make AllocSetAlloc radically cheaper.
> I think the more likely route to improvement there is going to be to
> find a way to do fewer pallocs.  For instance, if we had more rigorous
> rules about which data structures are read-only to which code, we could
> probably get rid of a lot of just-in-case tree copying that happens in
> the parser and planner.
>
> But at the same time, even if we could drive all palloc costs to zero,
> it would only make a 10% difference in this example.  And this sort of
> fairly flat profile is what I see in most cases these days --- we've
> been playing performance whack-a-mole for long enough now that there
> isn't much low-hanging fruit left.

There are other architectural approaches that we could take to
reducing the parsing overhead.  Random ideas:

1. Separate the parser into a parser for DML statements only, and
another parser for everything else, to avoid cluttering the main
parser with lots of keywords and non-terminals that aren't going to be
used for the kinds of queries people care about parsing speed.

2. Hand-code a parser, or use some tool other than bison to generate one.

3. Some kind of parse-tree cache, so that executing the same exact
statement many times doesn't require reparsing it every time.

It's fairly far down in the noise on this particular profile, but in
the low-hanging fruit department, I think we should fix
ScanKeywordLookup to use a smarter algorithm that is more like O(1)
rather than O(lg n) in the number of keywords.  It shouldn't be
terribly difficult to come up with some kind of hash function based
on, say, the first two characters of the keyword that would be a lot
faster than what we're doing now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread David Fetter
On Thu, Apr 14, 2011 at 10:43:16AM -0400, Tom Lane wrote:
> 
> I doubt that it's possible to make AllocSetAlloc radically cheaper.
> I think the more likely route to improvement there is going to be to
> find a way to do fewer pallocs.  For instance, if we had more rigorous
> rules about which data structures are read-only to which code, we could
> probably get rid of a lot of just-in-case tree copying that happens in
> the parser and planner.

How much work would that be, and how would it be enforced in new
development?

> But at the same time, even if we could drive all palloc costs to
> zero, it would only make a 10% difference in this example.  And this
> sort of fairly flat profile is what I see in most cases these days
> --- we've been playing performance whack-a-mole for long enough now
> that there isn't much low-hanging fruit left.  For better or worse,
> the system design we've chosen just isn't amenable to minimal
> overhead for simple queries.  I think a lot of this ultimately
> traces to the extensible, data-type-agnostic design philosophy.  The
> fact that we don't know what an integer is until we look in pg_type,
> and don't know what an "=" operator does until we look up its
> properties, is great from a flexibility point of view; but this sort
> of query is where the costs become obvious.

Is it time to revisit that decision?  Should we wait until, say, we no
longer claim to support 32-bit machines on the server side?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] POSIX shared memory redux

2011-04-14 Thread Martijn van Oosterhout
On Thu, Apr 14, 2011 at 10:26:33AM -0400, A.M. wrote:
> 1) the SysV nattch method's primary purpose is to protect the shmem
> region. This is no longer necessary in my patch because the shared
> memory is unlinked immediately after creation, so only the initial
> postmaster and its children have access.

Umm, you don't unlink SysV shared memory. All the flag does is make
sure it goes away when the last user goes away. In the meantime, people
can still connect to it.

> The lock file contents are currently important to get the pid of a
> potential, conflicting postmaster. With the fcntl API, we can return
> a live conflicting PID (whether a postmaster or a stuck child), so
> that's an improvement. This could be used, for example, for STONITH,
> to reliably kill a dying replication clone- just loop on the pids
> returned from the lock.

SysV shared memory also gives you a PID, that's the point.

> Even if the fcntl check passes, the pid in the lock file is checked, so the 
> lock file behavior remains the same.

The interlock is to make sure there are no living postmaster children.
The lockfile won't tell you that. So the issue is that while fcntl can
work, sysv can do better.

Also, I think you underestimate the value of the current interlock.
Before this people did manage to trash their databases regularly this
way. Lockfiles can be deleted and yes, people do it all the time.

Actually, it occurs to me that you can solve the NFS problem by putting the
lockfile in the socket dir. That can't possibly be on NFS.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first. 
>   - Charles de Gaulle




Re: [HACKERS] POSIX shared memory redux

2011-04-14 Thread A.M.

On Apr 13, 2011, at 11:37 PM, Tom Lane wrote:

> "A.M."  writes:
>> 1. As long one keeps SysV shared memory around, the postgresql project
>> has to maintain the annoying platform-specific document on how to
>> configure the poorly named kernel parameters.
> 
> No, if it's just a small area, I don't see that that's an issue.
> You're going to max out on other things (like I/O bandwidth) long before
> you run into the limit on how many postmasters you can have from this.
> The reason that those parameters are problematic now is that people tend
> to want *large* shmem segments and the typical defaults aren't friendly
> to that.

That's assuming that no other processes on the system are using up the 
available segments (such as older postgresql instances).


>> 2. My patch proves that SysV is wholly unnecessary. Are you attached to it? 
>> (Pun intended.)
> 
> You were losing this argument already, but ad hominem attacks are pretty
> much guaranteed to get people to tune you out.  

I apologized to Robert Haas in another post- no offense was intended.

> There are real,
> substantive, unfixable reasons to not trust fcntl locking completely.

...on NFS, which the postgresql community doesn't recommend anyway. But even in 
that case, the existing lock file (even without the fcntl lock) can catch that 
case via the PID in the file contents. That is what I meant when I claimed that 
the behavior remains the same.

> 
>> I would encourage you to take a look at the patch.
> 
> Just to be perfectly clear: I have not read your patch, and am not
> likely to before the next commitfest starts, because I have
> approximately forty times too many things to do already.  I'm just going
> off your own abbreviated description of the patch.  But from what I know
> about fcntl locking, it's not a sufficient substitute for the SysV shmem
> interlock, because it has failure modes that the SysV interlock doesn't,
> and those failure modes occur in real-world cases.  Yeah, it'd be nice
> to also be able to lock against other postmasters on other NFS clients,
> but we hardly ever hear of somebody getting burnt by the lack of that
> (and fcntl wouldn't be a bulletproof defense anyway).  On the other
> hand, accidentally trying to start a duplicate postmaster on the same
> machine is an everyday occurrence.

I really do appreciate the time you have put into feedback. I posed this 
question also to Robert Haas: is there a different API which you would find 
acceptable? I chose fcntl because it seemed well-suited for this task, but the 
feedback has been regarding NFS v<4 concerns.

Cheers,
M


Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Tom Lane
Greg Smith  writes:
> samples  %image name   symbol name
> 53548 6.7609  postgres AllocSetAlloc
> 32787 4.1396  postgres MemoryContextAllocZeroAligned
> 26330 3.3244  postgres base_yyparse
> 21723 2.7427  postgres hash_search_with_hash_value
> 20831 2.6301  postgres SearchCatCache
> 19094 2.4108  postgres hash_seq_search
> 18402 2.3234  postgres hash_any
> 15975 2.0170  postgres AllocSetFreeIndex
> 14205 1.7935  postgres _bt_compare
> 13370 1.6881  postgres core_yylex
> 10455 1.3200  postgres MemoryContextAlloc
> 10330 1.3042  postgres LockAcquireExtended
> 10197 1.2875  postgres ScanKeywordLookup
> 9312  1.1757  postgres MemoryContextAllocZero

Yeah, this is pretty typical ...

> I don't know nearly enough about the memory allocator to comment on 
> whether it's possible to optimize it better for this case to relieve any 
> bottleneck.

I doubt that it's possible to make AllocSetAlloc radically cheaper.
I think the more likely route to improvement there is going to be to
find a way to do fewer pallocs.  For instance, if we had more rigorous
rules about which data structures are read-only to which code, we could
probably get rid of a lot of just-in-case tree copying that happens in
the parser and planner.

But at the same time, even if we could drive all palloc costs to zero,
it would only make a 10% difference in this example.  And this sort of
fairly flat profile is what I see in most cases these days --- we've
been playing performance whack-a-mole for long enough now that there
isn't much low-hanging fruit left.  For better or worse, the system
design we've chosen just isn't amenable to minimal overhead for simple
queries.  I think a lot of this ultimately traces to the extensible,
data-type-agnostic design philosophy.  The fact that we don't know what
an integer is until we look in pg_type, and don't know what an "="
operator does until we look up its properties, is great from a flexibility
point of view; but this sort of query is where the costs become obvious.

regards, tom lane



Re: [HACKERS] POSIX shared memory redux

2011-04-14 Thread A.M.

On Apr 14, 2011, at 8:22 AM, Florian Weimer wrote:

> * Tom Lane:
> 
>> Well, the fundamental point is that "ignoring NFS" is not the real
>> world.  We can't tell people not to put data directories on NFS,
>> and even if we did tell them not to, they'd still do it.  And NFS
>> locking is not trustworthy, because the remote lock daemon can crash
>> and restart (forgetting everything it ever knew) while your own machine
>> and the postmaster remain blissfully awake.
> 
> Is this still the case with NFSv4?  Does the local daemon still keep
> the lock state?

The lock handling has been fixed in NFSv4.

http://nfs.sourceforge.net/
"NFS Version 4 introduces support for byte-range locking and share reservation. 
Locking in NFS Version 4 is lease-based, so an NFS Version 4 client must 
maintain contact with an NFS Version 4 server to continue extending its open 
and lock leases."

http://linux.die.net/man/2/flock
"flock(2) does not lock files over NFS. Use fcntl(2) instead: that does work 
over NFS, given a sufficiently recent version of Linux and a server which 
supports locking."

I would need some more time to dig up what "recent version of Linux" means 
exactly, but NFSv4 is likely required.

> 
>> None of this is to say that an fcntl lock might not be a useful addition
>> to what we do already.  It is to say that fcntl can't just replace what
>> we do already, because there are real-world failure cases that the
>> current solution handles and fcntl alone wouldn't.
> 
> If it requires NFS misbehavior (possibly in an older version), and you
> have to start postmasters on separate nodes (which you normally
> wouldn't do), doesn't this make it increasingly unlikely that it's
> going to be triggered in the wild?

With the patch I offer, it would be possible to use shared storage and failover 
postgresql nodes on different machines over NFS. (The second postmaster blocks 
and waits for the lock to be released.) Obviously, such a setup isn't as 
strong as using replication, but given a sufficiently fail-safe shared storage 
setup, it could be made reliable.

Cheers,
M





Re: [HACKERS] POSIX shared memory redux

2011-04-14 Thread A.M.

On Apr 13, 2011, at 9:30 PM, Robert Haas wrote:

> On Wed, Apr 13, 2011 at 6:11 PM, A.M.  wrote:
>>> I don't see why we need to get rid of SysV shared memory; needing less
>>> of it seems just as good.
>> 
>> 1. As long one keeps SysV shared memory around, the postgresql project has 
>> to maintain the annoying platform-specific document on how to configure the 
>> poorly named kernel parameters. If the SysV region is very small, that means 
>> I can run more postgresql instances within the same kernel limits, but one 
>> can still hit the limits. My patch allows the postgresql project to delete 
>> that page and the hassles with it.
>> 
>> 2. My patch proves that SysV is wholly unnecessary. Are you attached to it? 
>> (Pun intended.)
> 
> With all due respect, I think this is an unproductive conversation.
> Your patch proves that SysV is wholly unnecessary only if we also
> agree that fcntl() locking is just as reliable as the nattch
> interlock, and Tom and I are trying to explain why we don't believe
> that's the case.  Saying that we're just wrong without responding to
> our points substantively doesn't move the conversation forward.

Sorry, it wasn't meant to be an attack, just a dumb pun. I am trying to argue 
that, even if fcntl is unreliable, the startup procedure is just as reliable as 
it is now. The reasons being:

1) the SysV nattch method's primary purpose is to protect the shmem region. 
This is no longer necessary in my patch because the shared memory is unlinked 
immediately after creation, so only the initial postmaster and its children 
have access.

2) the standard postgresql lock file remains the same

Furthermore, there is a case that the SysV nattch check cannot handle but 
fcntl locking can catch: if two separate machines have the same postgresql data 
directory mounted over NFS, postgresql will currently allow both machines to 
start a postmaster in that directory, because the SysV nattch check fails and 
the pid in the lock file is a pid on the first machine, so postgresql will say 
"starting anyway". With fcntl locking, this can be fixed. SysV only has 
presence on one kernel.


> 
> In case it's not clear, here again is what we're concerned about: A
> System V shm *cannot* be removed until nobody is attached to it.  A
> lock file can be removed, or the lock can be accidentally released by
> the apparently innocuous operation of closing a file descriptor.
> 
>> Both you and Tom have somehow assumed that the patch alters current 
>> postgresql behavior. In fact, the opposite is true. I haven't changed any of 
>> the existing behavior. The "robust" behavior remains. I merely added fcntl 
>> interlocking on top of the lock file to replace the SysV shmem check.
> 
> This seems contradictory.  If you replaced the SysV shmem check, then
> it's not there, which means you altered the behavior.

From what I understood, the primary purpose of the SysV check was to protect 
the shared memory from multiple stompers. The interlock was a neat side-effect.

The lock file contents are currently important for getting the pid of a 
potential conflicting postmaster. With the fcntl API, we can return a live 
conflicting PID (whether a postmaster or a stuck child), so that's an 
improvement. This could be used, for example, for STONITH, to reliably kill a 
dying replication clone: just loop on the pids returned from the lock.

Even if the fcntl check passes, the pid in the lock file is checked, so the 
lock file behavior remains the same.

If you were to implement a daemon with a shared data directory but no shared 
memory, how would you implement the interlock? Would you still insist on SysV 
shmem? Unix daemons generally rely on lock files alone. Perhaps there is a 
different API on which we can agree.

Cheers,
M


Re: [HACKERS] pg_dump --binary-upgrade vs. ALTER TYPE ... DROP ATTRIBUTE

2011-04-14 Thread Noah Misch
On Wed, Apr 13, 2011 at 11:46:45PM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > If we adopt the elsewhere-proposed approach of forbidding the use of
> > rowtypes to create typed tables, the circularity-checking logic here
> > can become simpler.  I think it's not actually water-tight right now:
> 
> > rhaas=# create table a (x int);
> > CREATE TABLE
> > rhaas=# create table b of a;
> > CREATE TABLE
> > rhaas=# create table c () inherits (b);
> > CREATE TABLE
> > rhaas=# create table d of c;
> > CREATE TABLE
> > rhaas=# alter table a of d;
> > ALTER TABLE
> 
> "alter table a of d"?  What the heck does that mean, and why would it be
> a good idea?

CREATE TABLE a ...; ...; ALTER TABLE a OF d;  =  CREATE TABLE a OF d;

It's a good idea as a heavy lifter for `pg_dump --binary-upgrade'.  See the rest
of this thread for the full background.



Re: [HACKERS] POSIX shared memory redux

2011-04-14 Thread Florian Weimer
* Tom Lane:

> Well, the fundamental point is that "ignoring NFS" is not the real
> world.  We can't tell people not to put data directories on NFS,
> and even if we did tell them not to, they'd still do it.  And NFS
> locking is not trustworthy, because the remote lock daemon can crash
> and restart (forgetting everything it ever knew) while your own machine
> and the postmaster remain blissfully awake.

Is this still the case with NFSv4?  Does the local daemon still keep
the lock state?

> None of this is to say that an fcntl lock might not be a useful addition
> to what we do already.  It is to say that fcntl can't just replace what
> we do already, because there are real-world failure cases that the
> current solution handles and fcntl alone wouldn't.

If it requires NFS misbehavior (possibly in an older version), and you
have to start postmasters on separate nodes (which you normally
wouldn't do), doesn't this make it increasingly unlikely that it's
going to be triggered in the wild?

-- 
Florian Weimer
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99



Re: [HACKERS] Typed table DDL loose ends

2011-04-14 Thread Noah Misch
On Wed, Apr 13, 2011 at 07:57:29PM -0700, Robert Haas wrote:
> On Sat, Apr 9, 2011 at 6:57 PM, Noah Misch  wrote:
> > While looking at the typed table/pg_upgrade problem, I ran into a few smaller
> > problems in the area.  I'm not envisioning a need for much code shift to fix
> > them, but there are a few points of policy.
> >
> > * Table row types used in typed tables vs. ALTER TABLE
> > As previously noted:
> >   CREATE TABLE t ();
> >   CREATE TABLE is_a OF t;
> >   ALTER TABLE t ADD c int;
> >   \d is_a
> >   -- No columns
> >
> > At first I thought we should just forbid the use of table row types in CREATE
> > TABLE OF.  However, we've been quite systematic about not distinguishing between
> > table row types and CREATE TYPE AS types; I've only found a distinction in ALTER
> > TABLE/ALTER TYPE, where we direct you to the other command.  It would be nice to
> > preserve this heritage.  That doesn't look particularly difficult; it may
> > actually yield a net code reduction.
> 
> I guess my gut feeling is that it would make more sense to forbid it
> outright for 9.1, and we can look at relaxing that restriction later
> if we're so inclined.
> 
> Much as with the problem Tom fixed in commit
> eb51af71f241e8cb199790dee9ad246bb36b3287, I'm concerned that there may
> be other cases that we're not thinking of right now, and while we
> could find them all and fix them, the amount of functionality gained
> is fairly marginal, and I don't really want to hold up the release
> while we bug-swat.

Symmetry was the best cause I could find to continue allowing it, and your case
in favor of reducing the bug surface is more compelling.  Let's forbid it.

Thanks,
nm



Re: [HACKERS] Itanium HP-UX build failure, register stack

2011-04-14 Thread Andrew Dunstan



On 04/14/2011 02:04 AM, Josh Berkus wrote:

On 4/13/11 5:46 AM, Heikki Linnakangas wrote:

On 13.04.2011 14:22, Andrew Dunstan wrote:

I wish we could get some buildfarm coverage for HPUX. I've whined about
this in the past, but nobody's ever made an offer to provide suitable
platform(s) that I know of.

I only have temporary access to this HPUX box, but I'm trying to arrange
that.

HP is willing to give us access to boxes.  I just talked to their
Itanium partnership person today; I thought we had boxes but I can get
some for her if we don't.



Well please get me details and access and I'll set up a buildfarm member 
or two.


cheers

andrew



Re: [HACKERS] PQputCopyData dont signal error

2011-04-14 Thread Heikki Linnakangas

On 14.04.2011 10:15, Pavel Stehule wrote:

Hello

I have a problem with the PQputCopyData function. It doesn't signal some errors.

while ((row = mysql_fetch_row(res)) != NULL)
{
    snprintf(buffer, sizeof(buffer), "%s%s\n", row[0], row[1]);
    copy_result = PQputCopyData(pconn, buffer, strlen(buffer));
    printf(">>%s<<\n", PQerrorMessage(pconn));
    printf("%d\n", copy_result);
    if (copy_result != 1)
    {
        fprintf(stderr, "Copy to target table failed: %s",
                PQerrorMessage(pconn));
        EXIT;
    }
}

it returns 1 for broken values too :(

Is some special check necessary?


The way COPY works is that PQputCopyData just sends the data to the 
server, and the server buffers it in its internal buffer and processes 
it when it feels like it. The PQputCopyData() calls don't even need to 
match line boundaries.


I think you'll need to send all the data and finish the COPY before you 
see an error. If you have a lot of data to send, you might want to slice 
it into multiple COPY statements of say 50MB each, so that you can catch 
errors in between.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] PQputCopyData dont signal error

2011-04-14 Thread Pavel Stehule
Hello


>
> The way COPY works is that PQputCopyData just sends the data to the server,
> and the server will buffer it in its internal buffer and processes it when
> it feels like it. The PQputCopyData() calls don't even need to match line
> boundaries.
>

Yes, that is the current behavior, so the documentation is obsolete

> I think you'll need to send all the data and finish the COPY until you get
> an error. If you have a lot of data to send, you might want to slice it into
> multiple COPY statements of say 50MB each, so that you can catch errors in
> between.

:( I would like to import the table in one statement

Regards

Pavel Stehule



>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>



[HACKERS] PQputCopyData dont signal error

2011-04-14 Thread Pavel Stehule
Hello

I have a problem with the PQputCopyData function. It doesn't signal some errors.

while ((row = mysql_fetch_row(res)) != NULL)
{
    snprintf(buffer, sizeof(buffer), "%s%s\n", row[0], row[1]);
    copy_result = PQputCopyData(pconn, buffer, strlen(buffer));
    printf(">>%s<<\n", PQerrorMessage(pconn));
    printf("%d\n", copy_result);
    if (copy_result != 1)
    {
        fprintf(stderr, "Copy to target table failed: %s",
                PQerrorMessage(pconn));
        EXIT;
    }
}

it returns 1 for broken values too :(

Is some special check necessary?

Regards

Pavel Stehule



[HACKERS] Single client performance on trivial SELECTs

2011-04-14 Thread Greg Smith
This week several list regulars here waded into the MySQL Convention.  I 
decided to revisit PostgreSQL vs. MySQL performance using the sysbench 
program as part of that.  It's not important to what I'm going to 
describe to understand exactly what statements sysbench runs here or how 
to use it, but if anyone is curious I've got some more details about how 
I ran the tests in my talk slides at 
http://projects.2ndquadrant.com/talks  The program has recently gone 
through some fixes that make it run a bit better both in general and 
against PostgreSQL.  The write tests are still broken against 
PostgreSQL, but it now seems to do a reasonable job simulating a simple 
SELECT-only workload.  A fix from Jignesh recently made its way into the 
database generation side of the code that makes it less tedious to test 
with it too. 

The interesting part was how per-client scaling compared between the two 
databases; graph attached.  On my 8 core server, PostgreSQL scales 
nicely up to a steady 50K TPS.  I see the same curve, almost identical 
numbers, with PostgreSQL and pgbench--no reason to suspect sysbench is 
doing anything shady.  The version of MySQL I used hits around 67K TPS 
with innodb when busy with lots of clients.  That part doesn't bother 
me; nobody expects PostgreSQL to be faster on trivial SELECT statements 
and the gap isn't that big.


The shocking part was the single client results.  I'm used to seeing 
Postgres get around 7K TPS per core on those, which was the case here, 
and I never considered that an interesting limitation to think about 
before.  MySQL turns out to hit 38K TPS doing the same work.  Now that's 
a gap interesting enough to make me wonder what's going on.


Easy enough to exercise the same sort of single client test case with 
pgbench and put it under a profiler:


sudo opcontrol --init
sudo opcontrol --setup --no-vmlinux
createdb pgbench
pgbench -i -s 10 pgbench
psql -d pgbench -c "vacuum"
sudo opcontrol --start
sudo opcontrol --reset
pgbench -S -n -c 1 -T 60 pgbench
sudo opcontrol --dump ; sudo opcontrol --shutdown
opreport -l image:$HOME/pgwork/inst/test/bin/postgres

Here are the top calls, from my laptop rather than the server that I 
generated the graph against.  It does around 5.5K TPS with 1 client and 
10K with 2 clients, so same basic scaling:


samples  %image name   symbol name
53548 6.7609  postgres AllocSetAlloc
32787 4.1396  postgres MemoryContextAllocZeroAligned
26330 3.3244  postgres base_yyparse
21723 2.7427  postgres hash_search_with_hash_value
20831 2.6301  postgres SearchCatCache
19094 2.4108  postgres hash_seq_search
18402 2.3234  postgres hash_any
15975 2.0170  postgres AllocSetFreeIndex
14205 1.7935  postgres _bt_compare
13370 1.6881  postgres core_yylex
10455 1.3200  postgres MemoryContextAlloc
10330 1.3042  postgres LockAcquireExtended
10197 1.2875  postgres ScanKeywordLookup
9312  1.1757  postgres MemoryContextAllocZero

I don't know nearly enough about the memory allocator to comment on 
whether it's possible to optimize it better for this case to relieve any 
bottleneck.  We might just get a small gain and then push the limiter to 
the parser or hash functions.  I was surprised to find that's where so 
much of the time was going though.


P.S. When showing this graph in my talk, I pointed out that anyone who 
is making decisions about which database to use based on trivial SELECTs 
on small databases isn't going to be choosing between PostgreSQL and 
MySQL anyway--they'll be deploying something like MongoDB instead if 
that's the important metric.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us

