[HACKERS] SSI error messages
Some of these new error messages from the SSI code are a mouthful:

    not enough elements in RWConflictPool to record a rw-conflict
    not enough elements in RWConflictPool to record a potential rw-conflict

These are basically "out of shared memory" conditions that could be moved to a DETAIL message.

    Canceled on identification as a pivot, during conflict out checking.
    Canceled on conflict out to old pivot %u.
    Canceled on identification as a pivot, with conflict out to old committed transaction %u.
    Canceled on conflict out to old pivot.
    Canceled on identification as a pivot, during conflict in checking.
    Canceled on identification as a pivot, during write.
    Canceled on conflict out to pivot %u, during read.
    Canceled on identification as a pivot, during commit attempt.
    Canceled on commit attempt with conflict in from prepared pivot.

These are DETAIL messages, with the rest of the report saying:

    ERROR: could not serialize access due to read/write dependencies among transactions
    HINT: The transaction might succeed if retried.

AFAICT, the documentation doesn't mention anything about this "pivot" that keeps coming up. Is it useful to have the user face this information? Is there anything a user can derive from seeing one of these messages versus another, in addition to the error and hint, that would help them address the situation?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] ON COMMIT action not catalogued?
(Going through some loose ends in the information schema ...)

Is my understanding right that the ON COMMIT action of temporary tables is not recorded in the system catalogs anywhere? It would make sense and wouldn't be a big problem; I just want to make sure I didn't miss anything.
Re: [HACKERS] Is there a committer in the house?
On tor, 2011-07-14 at 12:01 -0700, Josh Berkus wrote:
> Currently we have 8 patches "Ready for Committer" in the current CF.
> Some of them have been in that status for some time.
>
> From traffic on this list, I'm getting the impression that nobody other
> than Robert, Heikki and Tom are committing other people's patches. I
> know we have more committers than that. Bruce? Simon? Itagaki? Bueller?

I'm still working on preparing 9.1. I'm not following 9.2 development yet.
Re: [HACKERS] Re: patch review : Add ability to constrain backend temporary file space
>>> I have looked into the v6 patches. One thing I would like to suggest
>>> is enhancing the error message when temp_file_limit will be exceeded:
>>>
>>> ERROR: aborting due to exceeding temp file limit
>>>
>>> Is it possible to add the current temp file limit to the message? For
>>> example:
>>>
>>> ERROR: aborting due to exceeding temp file limit 1kB
>>>
>>> I know the current setting of temp_file_limit can be viewed in other
>>> ways, but I think this will make an admin's or application developer's
>>> life a little bit easier.
>
> Hi Tatsuo,
>
> Yeah, good suggestion - I agree that it would be useful to supply
> extra detail. I'll amend and resubmit a new patch (along with whatever
> review modifications we come up with in the meantime)!
>
> Cheers
>
> Mark

Maybe we could add more info regarding the current usage and the requested amount, in addition to the temp file limit value. I mean something like:

ERROR: aborting due to exceeding temp file limit. Current usage 9000kB, requested size 1024kB, thus it will exceed temp file limit 1kB.

What do you think?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
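The check and message being proposed could be sketched roughly as follows (an illustrative Python sketch, not PostgreSQL's actual implementation, which is C code in the backend; the function name and units are hypothetical):

```python
# Illustrative sketch of the proposed richer error text. All names here
# are hypothetical; PostgreSQL's real check is C code in the backend.
def check_temp_file_limit(current_usage_kb, requested_kb, limit_kb):
    """Raise an error mentioning usage, request, and limit, as suggested."""
    if current_usage_kb + requested_kb > limit_kb:
        raise RuntimeError(
            "aborting due to exceeding temp file limit. "
            f"Current usage {current_usage_kb}kB, "
            f"requested size {requested_kb}kB, "
            f"thus it will exceed temp file limit {limit_kb}kB."
        )

# A request that stays under the limit passes silently:
check_temp_file_limit(100, 200, 1024)
```

The point of including all three numbers is that the admin can see at a glance how far over the limit the query would have gone, without consulting the server configuration separately.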
Re: [HACKERS] per-column generic option
(2011/07/15 4:17), Josh Berkus wrote:
> All,
>
> Is the spec for this feature still under discussion? I don't seem to
> see a consensus on this thread.

Yeah, a remaining concern is whether generic (FDW) options should be separated from the existing attoptions or not. Indeed, the reloptions/attoptions mechanism seems applicable to generic options as well, since both need to store multiple key-value pairs, but IMHO generic options should be kept separate from reloptions/attoptions for several reasons:

1) FDW options are handled only by the FDW, but reloptions/attoptions are handled by PG core modules such as the planner, AMs and autovacuum. If they could be separated completely, they would be able to share one attribute, but I worry that some reloptions/attoptions make sense for some FDWs. For instance, n_distinct might be useful to control the costs of a foreign table scan. (Though attoptions can't be set via CREATE/ALTER FOREIGN TABLE yet.)

2) In the future, a newly added option might conflict with somebody's FDW option. Robert Haas pointed out this issue some days ago. An FDW validator would reject unknown options, so every FDW would have to know all of the reloptions/attoptions to avoid this issue.

3) It would be difficult to unify the syntax used to set options, from the perspective of backward compatibility and syntax consistency. Other SQL/MED objects use syntax such as "OPTIONS (key 'value', ...)", but reloptions/attoptions use syntax such as "SET (key = value, ...)". Without syntax unification, some tools would have to check relkind before handling attoptions. For instance, pg_dump would have to choose which syntax to use to dump attoptions. That seems like undesired complexity.

Any comments/objections/questions are welcome.

Regards,
--
Shigeru Hanada
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 6:43 PM, Robert Haas wrote:
>> Is that all I need to do, or is there something else I should be aware of
>> with regard to unlogged tables?
>
> Probably not, in this case. Just a thought: maybe you could rewrite the query
> to check whether the namespace name starts with pg_temp. Maybe that would be
> version-independent...

Ah, good idea, I forgot about pg_temp.

David
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 5:13 PM, "David E. Wheeler" wrote:
> On Jul 14, 2011, at 3:05 PM, Tom Lane wrote:
>
>> Josh Berkus writes:
>>>> There are a ton of things that change with each release, and all we do
>>>> by putting in hacks for backwards compatibility is add bloat that needs
>>>> to be maintained, and encourage vendors to be lazy.
>>
>>> I don't agree that having comprehensive system views with multi-version
>>> stability would be a "hack".
>>
>> If we had that, it wouldn't be a hack.
>
> Is that an endorsement for adding such a feature?
>
>> Putting in a hack to cover the
>> specific case of relistemp, on the other hand, is just a hack.
>
> Sure.
>
>> The real question here, IMO, is "how many applications are there that
>> really need to know about temporary relations, but have no interest in
>> the related feature of unlogged relations?". Because only such apps
>> would be served by a compatibility hack for this. An app that thinks it
>> knows the semantics of relistemp, and isn't updated to grok unlogged
>> tables, may be worse than broken --- it may be silently incorrect.
>
> So pgTAP creates temporary tables to store result sets so that it can then
> compare the results of two queries. The function in question was getting a
> list of columns in such a temporary table in order to make sure that the
> types were the same between two such tables before comparing results. It
> checked relistemp to make sure it was looking at the temp table rather than
> some other table that might happen to have the same name.
>
> So now the query looks like this:
>
>    SELECT pg_catalog.format_type(a.atttypid, a.atttypmod)
>      FROM pg_catalog.pg_attribute a
>      JOIN pg_catalog.pg_class c ON a.attrelid = c.oid
>     WHERE c.relname = $1
>       -- AND c.relistemp            -- 8.3-9.0
>       AND c.relpersistence = 't'    -- 9.1
>       AND attnum > 0
>       AND NOT attisdropped
>     ORDER BY attnum
>
> Is that all I need to do, or is there something else I should be aware of
> with regard to unlogged tables?

Probably not, in this case. Just a thought: maybe you could rewrite the query to check whether the namespace name starts with pg_temp. Maybe that would be version-independent...

...Robert
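The check Robert is suggesting — test the schema name's prefix instead of any version-specific pg_class column — could be sketched like this (illustrative Python for the prefix test itself; in practice it would be a join against pg_namespace in the SQL query, and the pg_toast_temp case is an assumption on my part about temp toast schemas):

```python
# Version-independent temp-table test suggested above: temporary tables
# live in schemas named pg_temp_N (their toast tables in pg_toast_temp_N),
# so checking the namespace name prefix avoids relying on relistemp
# (8.3-9.0) or relpersistence (9.1+).
def is_temp_schema(nspname):
    return nspname.startswith("pg_temp_") or nspname.startswith("pg_toast_temp_")

print(is_temp_schema("pg_temp_3"))  # True
print(is_temp_schema("public"))     # False
```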
Re: [HACKERS] Single pass vacuum - take 1
On Thu, Jul 14, 2011 at 6:22 PM, Simon Riggs wrote:
> On Thu, Jul 14, 2011 at 4:57 PM, Pavan Deolasee wrote:
>
>> Thanks Simon for looking at the patch.
>
> Sorry, I didn't notice there was a patch attached. Not reviewed it. I
> thought we were still just talking.

No problem. Please review it when you have time.

>> I am not sure if the use case is really narrow.
>
> This is a very rare issue, because of all the work yourself and Heikki
> have put in.

I don't think it's a rare case, since a vacuum on any table, small or large, would take two passes today. Avoiding one of the two would help the system in general. HOT and other things help to a great extent, but unfortunately they don't make vacuum completely go away. So we still want to do vacuum in the most efficient way.

> It's only important when we have a (1) big table (hence long scan
> time), (2) a use case that avoids HOT *and* (3) we have dirtied a
> large enough section of table that the vacuum map is ineffective and
> we need to scan high % of table. That combination is pretty rare, so
> penalising everybody else with 8 bytes per block seems too much to me.

The additional space is allocated only for those pages which have dead-vacuum line pointers, and it stays only until the next HOT-prune operation on the page. So not everybody pays the penalty, even if we assume it's too much and that's what worries you most.

> I have great faith in your talents, just not sure about this
> particular use of them.

Not sure if that's a compliment or a criticism :-)

> I'm sorry to voice them now you've written the
> patch.

Yeah, I would have liked it if you had said this earlier, but I appreciate comments any time.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com
Re: [HACKERS] Review of patch Bugfix for XPATH() if expression returns a scalar value
Hi Radosław,

Do you plan to comment further on this patch (the one that adds support for XPATH expressions returning scalar values, not the one that escapes text and attribute nodes which are selected), or should it be marked "Ready for Committer"? I'm asking because I just noticed that you marked the other patch as "Ready for Committer", but not this one.

best regards,
Florian Pflug
Re: [HACKERS] Single pass vacuum - take 1
On Thu, Jul 14, 2011 at 4:57 PM, Pavan Deolasee wrote:
> Thanks Simon for looking at the patch.

Sorry, I didn't notice there was a patch attached. Not reviewed it. I thought we were still just talking.

> I am not sure if the use case is really narrow.

This is a very rare issue, because of all the work yourself and Heikki have put in.

It's only important when we have (1) a big table (hence long scan time), (2) a use case that avoids HOT *and* (3) a dirtied section of the table large enough that the vacuum map is ineffective and we need to scan a high % of the table. That combination is pretty rare, so penalising everybody else with 8 bytes per block seems too much to me.

Big VACUUMs are a problem, but my observation would be that those are typically transaction-wraparound VACUUMs and the extra writes are not caused by row removal. So we do sometimes do Phase 2 and Phase 3 even when there is a very low number of row removals - since not all VACUUMs are triggered by changes.

> Today, we dirty the pages in both the passes and also emit WAL records.

This is exactly the thing I'm suggesting we avoid.

> Just the heap scan can take a very long time for large tables, blocking the
> autovacuum worker threads from doing useful work on other tables. If I am
> not wrong, we use ring buffers for vacuum which would most likely force
> those buffers to be written/read twice to the disk.

I think the problem comes from dirtying too many blocks. Scanning the tables using the ring buffer is actually fairly cheap. The second scan only touches the blocks that need secondary cleaning, so the cost of it is usually much less. I'm suggesting we write each block at most once, rather than twice as we do now. Yes, we have to do both scans. My idea does exactly the same number of writes as yours. On read-only I/O, your idea is clearly more efficient, but overall that's not by enough to justify the 8-byte-per-block overhead, IMHO.

> Which part of the patch do you think is very complex? We can try to simplify
> that. Or are you seeing any obvious bugs that I missed? IMHO taking out a
> phase completely from vacuum (as this patch does) can simplify things.

I have great faith in your talents, just not sure about this particular use of them. I'm sorry to voice them now you've written the patch.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Patch Review: Collect frequency statistics and selectivity estimation for arrays
Review of patch: https://commitfest.postgresql.org/action/patch_view?id=539

I glanced through the code and played around with the feature and, in general, it looks pretty good. But I have some comments/questions.

Design:

First, it makes me uncomfortable that you are using the MCV and histogram slot kinds in a way that is very different from other data types. I realize that tsvector uses MCV in the same way that you do, but:

1) I don't like that very much either.
2) tsvector is different in that equality (in the btree sense) doesn't make sense, whereas it does for arrays.

Using the histogram slot for the array lengths is also very surprising to me. Why not just use a new STA_KIND? It's not like we are running out of room, and this will be the second 'container' type that splits the container and stores stats about the elements.

Second, I'm fairly unclear about what the actual underlying statistical method is and what assumptions it makes. Before I can really review the method, I will probably need to see some documentation, as I say below. Do you have any tests/simulations that explore the estimation procedure that I can look at? When will it fail? In what cases does it improve estimation?

Documentation:

Seems a bit lacking - especially if you keep the non-standard usage of mcv/histograms. Also, it would be really nice to see some update to the row-estimation-examples page (in chapter 55).

The code comments are good in general, but there are not enough high-level comments. It would be really nice to have a comment that gives an overview of what each of the following functions is supposed to do:

    mcelem_array_selec
    mcelem_array_contain_overlap_selec
    and especially mcelem_array_contained_selec

Also, in compute_array_stats you reference compute_tsvector_stats for the algorithm, but I think that it would be smarter to either copy the relevant portions of the comment over or to reference a published document. If compute_tsvector_stats changes, I doubt that anyone will remember to update the comment.

Finally, it would be really nice to see, explicitly and in mathematical notation:

1) The data that is being collected and summarized (the statistic).
2) The estimate being derived from the statistic (i.e. the selectivity).
   i) Any assumptions being made (i.e. independence of elements within an array).

For instance, the only note I could find that actually addressed the estimator and the assumptions underneath it was

    + * mult * dist[i] / mcelem_dist[i] gives us probability
    + * of qual matching from assumption of independent element occurrence
    + * with condition that length = i.

and that's pretty cryptic. That should be expanded upon and placed more prominently.

A couple of nitpicks:

1) In calc_distr you go to some lengths to avoid round-off errors. Since it is certainly just the order of the estimate that matters, why not just perform the calculation in log space?
2) compute_array_stats is really long. Could you break it up? (I know that the other analyze functions are the same way, but why continue in that vein?)

Best,
Nathan
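Nitpick (1) above — doing the probability calculation in log space — can be illustrated with a short sketch (plain Python rather than the patch's C, and the per-element probabilities are made-up values):

```python
import math

# Multiplying many small per-element probabilities underflows a double
# quickly; summing their logarithms does not. This is the idea behind
# nitpick (1): accumulate the estimate in log space.
probs = [1e-5] * 80  # hypothetical per-element match probabilities

naive = 1.0
for p in probs:
    naive *= p  # 1e-400 is below double range, so this underflows to 0.0

log_est = sum(math.log(p) for p in probs)  # stays finite: 80 * log(1e-5)

print(naive)    # 0.0 due to underflow
print(log_est)  # about -921.03
```

Since only the order of magnitude of the estimate matters for planning, working with the log and exponentiating once at the end (or comparing logs directly) avoids the careful round-off bookkeeping in calc_distr.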
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 3:05 PM, Tom Lane wrote:
> Josh Berkus writes:
>>> There are a ton of
>>> things that change with each release, and all we do by putting in
>>> hacks for backwards compatibility is add bloat that needs to be
>>> maintained, and encourage vendors to be lazy.
>
>> I don't agree that having comprehensive system views with multi-version
>> stability would be a "hack".
>
> If we had that, it wouldn't be a hack.

Is that an endorsement for adding such a feature?

> Putting in a hack to cover the
> specific case of relistemp, on the other hand, is just a hack.

Sure.

> The real question here, IMO, is "how many applications are there that
> really need to know about temporary relations, but have no interest in
> the related feature of unlogged relations?". Because only such apps
> would be served by a compatibility hack for this. An app that thinks it
> knows the semantics of relistemp, and isn't updated to grok unlogged
> tables, may be worse than broken --- it may be silently incorrect.

So pgTAP creates temporary tables to store result sets so that it can then compare the results of two queries. The function in question was getting a list of columns in such a temporary table in order to make sure that the types were the same between two such tables before comparing results. It checked relistemp to make sure it was looking at the temp table rather than some other table that might happen to have the same name.

So now the query looks like this:

    SELECT pg_catalog.format_type(a.atttypid, a.atttypmod)
      FROM pg_catalog.pg_attribute a
      JOIN pg_catalog.pg_class c ON a.attrelid = c.oid
     WHERE c.relname = $1
       -- AND c.relistemp            -- 8.3-9.0
       AND c.relpersistence = 't'    -- 9.1
       AND attnum > 0
       AND NOT attisdropped
     ORDER BY attnum

Is that all I need to do, or is there something else I should be aware of with regard to unlogged tables?

Thanks,
David
Re: [HACKERS] pg_class.relistemp
>> I don't agree that having comprehensive system views with multi-version
>> stability would be a "hack".
>
> If we had that, it wouldn't be a hack. Putting in a hack to cover the
> specific case of relistemp, on the other hand, is just a hack.

I agree.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] pg_class.relistemp
Josh Berkus writes:
>> There are a ton of
>> things that change with each release, and all we do by putting in
>> hacks for backwards compatibility is add bloat that needs to be
>> maintained, and encourage vendors to be lazy.

> I don't agree that having comprehensive system views with multi-version
> stability would be a "hack".

If we had that, it wouldn't be a hack. Putting in a hack to cover the specific case of relistemp, on the other hand, is just a hack.

The real question here, IMO, is "how many applications are there that really need to know about temporary relations, but have no interest in the related feature of unlogged relations?". Because only such apps would be served by a compatibility hack for this. An app that thinks it knows the semantics of relistemp, and isn't updated to grok unlogged tables, may be worse than broken --- it may be silently incorrect.

regards, tom lane
Re: [HACKERS] pg_class.relistemp
> As one of said vendors, I completely disagree.

I don't agree that you qualify as a vendor. You're on the friggin' core team. I'm talking about vendors like DbVisualizer or TOra, for which PostgreSQL is just one of the databases they support. If stuff breaks gratuitously, the reaction of some of them is always to either drop support or delay it for a year or more. This doesn't benefit our community.

> There are a ton of
> things that change with each release, and all we do by putting in
> hacks for backwards compatibility is add bloat that needs to be
> maintained, and encourage vendors to be lazy.

I don't agree that having comprehensive system views with multi-version stability would be a "hack".

> Breaking compatibility is actually something that is important to us - it
> forces us to fix obvious issues, and makes it much harder to
> inadvertently miss important changes.

What I'm hearing from you is: "Breaking backwards compatibility is something we should do more of because it lets us know which vendors are paying attention and weeds out the unfit." Is that what you meant to say?

That seems like a way to ensure that PostgreSQL support continues to be considered optional, or based on outdated versions, for multi-database tools.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Is there a committer in the house?
Josh Berkus wrote:
>> ISTM that you are right that there are other committers that have not
>> done anything. How strange that you name myself, yet not others.
>
> Touchy today, eh?
>
> And I do name others, read the email again.
>
> Anyway, my question is: the list of committers I know of who have
> general knowledge of the codebase and can commit a wide variety of other
> people's patches are:
>
> Tom
> Robert
> Heikki
> Bruce
> Simon

I have found my reading of the email lists is too delayed to effectively commit others' patches.

--
Bruce Momjian
http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 2:10 PM, Dave Page wrote:
> Breaking compatibility is actually something that is important to us - it
> forces us to fix obvious issues, and makes it much harder to
> inadvertently miss important changes.

Agreed, but a deprecation cycle would be much appreciated.

David
Re: [HACKERS] Is there a committer in the house?
> ISTM that you are right that there are other committers that have not
> done anything. How strange that you name myself, yet not others.

Touchy today, eh?

And I do name others, read the email again.

Anyway, my question is: the list of committers I know of who have general knowledge of the codebase and can commit a wide variety of other people's patches are:

Tom
Robert
Heikki
Bruce
Simon

Who else?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Small patch for GiST: move childoffnum to child
I think there are two bugs in the existing gistFindPath code:

    if (top->parent && XLByteLT(top->parent->lsn,
                                GistPageGetOpaque(page)->nsn) &&
        GistPageGetOpaque(page)->rightlink != InvalidBlockNumber /* sanity check */ )
    {
        /* page splited while we thinking of... */
        ptr = (GISTInsertStack *) palloc0(sizeof(GISTInsertStack));
        ptr->blkno = GistPageGetOpaque(page)->rightlink;
        ptr->childoffnum = InvalidOffsetNumber;
        ptr->parent = top;
        ptr->next = NULL;
        tail->next = ptr;
        tail = ptr;
    }

First, notice that we're setting "ptr->parent = top". 'top' is the current node we're processing, and ptr represents the node to the right of the current node. The current node is *not* the parent of the node to the right. I believe that line should be "ptr->parent = top->parent".

Second, we're adding the entry for the right sibling to the end of the list of nodes to visit. But when we process entries from the list, we exit immediately when we see a leaf page. That means that the right sibling can get queued up behind leaf pages, and thus never visited.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] WIP: Fast GiST index build
On 14.07.2011 11:33, Alexander Korotkov wrote:
> On Wed, Jul 13, 2011 at 5:59 PM, Heikki Linnakangas <
> heikki.linnakan...@enterprisedb.com> wrote:
>
>> One thing that caught my eye is that when you empty a buffer, you load the
>> entire subtree below that buffer, down to the next buffered or leaf level,
>> into memory. Every page in that subtree is kept pinned. That is a problem;
>> in the general case, the buffer manager can only hold a modest number of
>> pages pinned at a time. Consider that the minimum value for shared_buffers
>> is just 16. That's unrealistically low for any real system, but the default
>> is only 32MB, which equals just 4096 buffers. A subtree could easily be
>> larger than that.
>
> With level step = 1 we need only 2 levels in a subtree. With the minimum
> index tuple size (12 bytes) each page can hold at most 675 tuples. Thus I
> think the default shared_buffers is enough for level step = 1.

Hundreds of buffer pins is still a lot. And with level_step = 2, the number of pins required explodes to 675^2 = 455625.

Pinning a buffer that's already in the shared buffer cache is cheap; I doubt you're gaining much by keeping the private hash table in front of the buffer cache. Also, it's possible that not all of the subtree is actually required during the emptying, so in the worst case pre-loading the pages is counter-productive.

> I believe it's enough to add a check that we have sufficient
> shared_buffers, isn't it?

Well, what do you do if you deem that shared_buffers is too small? Fall back to the old method? Also, shared_buffers is shared by all backends, so you can't assume that you get to use all of it for the index build. You'd need a wide safety margin.

>> I don't think you're benefiting at all from the buffering that BufFile
>> does for you, since you're reading/writing a full block at a time anyway.
>> You might as well use the file API in fd.c directly, ie.
>> OpenTemporaryFile/FileRead/FileWrite.
>
> BufFile distributes temporary data across several files. AFAICS postgres
> avoids working with files larger than 1GB. The size of the tree buffers can
> easily be greater. Without BufFile I would need to maintain the set of
> files manually.

Ah, I see. Makes sense.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
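The pin-count arithmetic in the exchange above is easy to check with a back-of-the-envelope sketch (assuming the thread's figure of ~675 index tuples per page for 12-byte tuples; the function name is just illustrative):

```python
# Back-of-the-envelope check of the pin counts discussed above. A buffer
# emptying that loads the whole subtree down level_step levels touches up
# to 675^level_step pages, each of which is kept pinned.
MAX_TUPLES_PER_PAGE = 675  # figure quoted in the thread for 12-byte tuples

def max_pinned_pages(level_step):
    """Worst-case pages (hence pins) in a subtree spanning level_step levels."""
    return MAX_TUPLES_PER_PAGE ** level_step

print(max_pinned_pages(1))  # 675
print(max_pinned_pages(2))  # 455625 -- far beyond the default 4096 buffers
```

This is why level_step = 1 is marginally workable with default shared_buffers (32MB = 4096 8kB buffers) while level_step = 2 is not.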
Re: [HACKERS] pg_class.relistemp
On Thursday, July 14, 2011, Josh Berkus wrote:
>> There has never, ever, been a guarantee that the system catalogs don't
>> change across versions. Anybody issuing such queries should expect to
>> have to retest them and possibly change them in each new major release.
>
> I know that's always been our policy. It is, however,
> vendor-unfriendly, because we don't supply any interface for many things
> (such as temp tables) other than the system catalogs.
>
> So if we're going to break compatibility, then we could stand to make a
> little noise about it.

As one of said vendors, I completely disagree. There are a ton of things that change with each release, and all we do by putting in hacks for backwards compatibility is add bloat that needs to be maintained, and encourage vendors to be lazy.

Breaking compatibility is actually something that is important to us - it forces us to fix obvious issues, and makes it much harder to inadvertently miss important changes.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Understanding GIN posting trees
On 14.07.2011 17:28, Teodor Sigaev wrote:
>> Why is the posting tree a tree? AFAICS, we never search it using the TID,
>> it's always scanned in whole. It would be simpler to store the TIDs in a
>> posting list in no particular order. This could potentially make
>> insertions cheaper, as you could just append to the last posting list page
>> for the key, instead of traversing the posting tree to a particular
>> location. You could also pack the tids denser, as you wouldn't need to
>> reserve free space for additions in the middle.
>
> For a consistentFn call we need to collect all data for the current tid. We
> do that by scanning over logically ordered arrays of tids, and trees allow
> us to do that by scanning the leaf pages.

Oh, I see. You essentially do a merge join of all the posting trees of the query keys.

Hmm, but we do need to scan all the posting trees of all the matched keys in whole anyway. We could collect all TIDs in the posting lists of all the keys into separate TIDBitmaps, and then combine the bitmaps, calling consistentFn for each TID that was present in at least one bitmap. I guess the performance characteristics of that would be somewhat different from what we have now, and you'd need to keep a lot of in-memory bitmaps if the query contains a lot of keys.

While we're at it, it just occurred to me that if the number of query keys is limited, say <= 16, you could build a lookup table for each combination of keys either occurring or not. You could then use that instead of calling consistentFn for each possible match. You could even use the table to detect common cases like "all/any keys must match", perhaps opening some optimization opportunities elsewhere.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
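The lookup-table idea in the last paragraph could be sketched roughly like this (an illustrative Python sketch, not GIN's actual C implementation; the consistent function here is a stand-in):

```python
# Illustrative sketch of the lookup-table idea above (not GIN's C code).
# For <= 16 query keys, precompute the consistent function over every
# bitmask of "key present / key absent" (at most 2^16 = 65536 entries),
# then answer per-TID checks by a single table lookup.
def build_consistent_table(nkeys, consistent):
    """consistent(present) takes a tuple of bools, one per query key."""
    table = []
    for mask in range(1 << nkeys):
        present = tuple(bool(mask & (1 << k)) for k in range(nkeys))
        table.append(consistent(present))
    return table

# Stand-in consistent function: "all keys must match" (AND semantics).
nkeys = 3
table = build_consistent_table(nkeys, all)

# AND semantics is detectable from the table itself: only the all-ones
# mask is true, which is the kind of common case Heikki mentions.
print(table[(1 << nkeys) - 1])  # True  (all three keys present)
print(table[0b101])             # False (key 1 missing)
```

Scanning the table also makes the "all/any keys must match" detection trivial: AND leaves exactly one true entry, OR leaves exactly one false entry (the empty mask).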
Re: [HACKERS] Is there a committer in the house?
On Thu, Jul 14, 2011 at 8:01 PM, Josh Berkus wrote:
> Currently we have 8 patches "Ready for Committer" in the current CF.
> Some of them have been that status for some time.
>
> From traffic on this list, I'm getting the impression that nobody other
> than Robert, Heikki and Tom are committing other people's patches. I
> know we have more committers than that. Bruce? Simon? Itagaki? Bueller?

Someone called Simon Riggs has committed one patch by another author and reviewed 3 others, as well as spending many days working on bugs for beta. It would be sensible if people based their comments on actual events rather than negative impressions.

ISTM that you are right that there are other committers that have not done anything. How strange that you name myself, yet not others.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pg_class.relistemp
On Thu, Jul 14, 2011 at 21:59, Josh Berkus wrote: > >> There has never, ever, been a guarantee that the system catalogs don't >> change across versions. Anybody issuing such queries should expect to >> have to retest them and possibly change them in each new major release. > > I know that's always been our policy. It is, however, > vendor-unfriendly because we don't supply any interface for many things > (such as temp tables) other than the system catalogs. > > So if we're going to break compatibility, then we could stand to make a > little noise about it. We've broken the admin apps in pretty much every single release. And they generally don't complain. If someone developing an admin app hasn't been doing extensive testing starting *at the latest* with beta1 (and recommended for each alpha), they shouldn't expect to release until quite long after the release. That said, a stable set of system views certainly wouldn't hurt - but making extra noise about a simple change like this one would likely not make a difference. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reduced power consumption in WAL Writer process
On Wed, Jul 13, 2011 at 10:56 PM, Peter Geoghegan wrote: > Attached is patch for the WAL writer that removes its tight polling > loop (which probably doesn't get hit often in practice, as we just > sleep if wal_writer_delay is under a second), and, at least > potentially, reduces power consumption when idle by using a latch. > > I will break all remaining power consumption work down into > per-auxiliary process patches. I think that this is appropriate - if > we hit a snag on one of the processes, there is no need to have that > hold up everything. > > I've commented that we handle all expected signals, and therefore we > shouldn't worry about having timeout invalidated by signals, just as > with the archiver. Previously, we didn't even worry about Postmaster > death within the tight polling loop, presumably because > wal_writer_delay is typically small enough to avoid that being a > problem. I thought that WL_POSTMASTER_DEATH might be superfluous here, > but then again there is a codepath specifically for the case where > wal_writer_delay exceeds one second, so it is included in this initial > version. > > Comments? ISTM that this in itself isn't enough to reduce power consumption. Currently the only users of the WALWriter are asynchronous commits, so we should include within RecordTransactionCommit() a SetLatch() command for the WALWriter. That way we can have WALWriter sleep until it's needed. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_class.relistemp
> There has never, ever, been a guarantee that the system catalogs don't > change across versions. Anybody issuing such queries should expect to > have to retest them and possibly change them in each new major release. I know that's always been our policy. It is, however, vendor-unfriendly because we don't supply any interface for many things (such as temp tables) other than the system catalogs. So if we're going to break compatibility, then we could stand to make a little noise about it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need help understanding pg_locks
Looks good to me. --- Florian Pflug wrote: > On Jul14, 2011, at 22:18 , Bruce Momjian wrote: > > !OID of the database in which the lock target exists, or > > !zero if the lock is a shared object, or > > !null if the lock is on a transaction ID > > For consistency, I think it should say "target" in the second part > of the sentence also now, instead of "lock ... on". > > Updated patch attached. I tried to make the descriptions a > bit more consistent, replaced "object" by "target", and > added "targeted by" after the phrase which describes the > locked (or waited-for) object. > > best regards, > Florian Pflug > > diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml > index d4a1d36..33be5d0 100644 > *** a/doc/src/sgml/catalogs.sgml > --- b/doc/src/sgml/catalogs.sgml > *** > *** 6928,6936 > oid > linkend="catalog-pg-database">pg_database.oid > > !OID of the database in which the object exists, or > !zero if the object is a shared object, or > !null if the object is a transaction ID > > > > --- 6928,6936 > oid > linkend="catalog-pg-database">pg_database.oid > > !OID of the database in which the lock target exists, or > !zero if the target is a shared object, or > !null if the target is a transaction ID > > > > *** > *** 6938,6944 > oid > linkend="catalog-pg-class">pg_class.oid > > !OID of the relation, or null if the object is not > a relation or part of a relation > > > --- 6938,6944 > oid > linkend="catalog-pg-class">pg_class.oid > > !OID of the relation targeted by the lock, or null if the target is > not > a relation or part of a relation > > > *** > *** 6947,6954 > integer > > > !Page number within the relation, or null if the object > !is not a tuple or relation page > > > > --- 6947,6954 > integer > > > !Page number targeted by the lock within the relation, > !or null if the target is not a relation page or tuple > > > > *** > *** 6956,6962 > smallint > > > !Tuple number within the page, or null if the object is not a tuple > > > > --- 6956,6963 
> smallint > > > !Tuple number targeted by the lock within the page, > !or null if the target is not a tuple > > > > *** > *** 6964,6971 > text > > > !Virtual ID of a transaction, or null if the object is not a > !virtual transaction ID > > > > --- 6965,6972 > text > > > !Virtual ID of the transaction targeted by the lock, > !or null if the target is not a virtual transaction ID > > > > *** > *** 6973,6979 > xid > > > !ID of a transaction, or null if the object is not a transaction ID > > > > --- 6974,6981 > xid > > > !ID of the transaction targeted by the lock, > !or null if the target is not a transaction ID > > > > *** > *** 6981,6988 > oid > linkend="catalog-pg-class">pg_class.oid > > !OID of the system catalog containing the object, or null if the > !object is not a general database object > > > > --- 6983,6990 > oid > linkend="catalog-pg-class">pg_class.oid > > !OID of the system catalog containing the lock target, or null if the > !target is not a general database object > > > > *** > *** 6990,6997 > oid > any OID column > > !OID of the object within its system catalog, or null if the > !object is not a general database object. > For advisory locks it is used to distinguish the two key > spaces (1 for an int8 key, 2 for two int4 keys). > > --- 6992,6999 > oid > any OID column > > !OID of the lock target within its system catalog, or null if the > !target is not a general database object. > For advisory locks it is used to distinguish the two key > spaces (1 for an int8 key, 2 for two int4 keys). > > *** > *** 7001,7010 > smallint > > > !For a table column, this is the column
Re: [HACKERS] WIP: Fast GiST index build
On Wed, Jul 13, 2011 at 5:59 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > One thing that caught my eye is that when you empty a buffer, you load the > entire subtree below that buffer, down to the next buffered or leaf level, > into memory. Every page in that subtree is kept pinned. That is a problem; > in the general case, the buffer manager can only hold a modest number of > pages pinned at a time. Consider that the minimum value for shared_buffers > is just 16. That's unrealistically low for any real system, but the default > is only 32MB, which equals to just 4096 buffers. A subtree could easily be > larger than that. > With level step = 1 we need only 2 levels in the subtree. With the minimum index tuple size (12 bytes), each page can hold at most 675 tuples. Thus I think the default shared_buffers is enough for level step = 1. I believe it's enough to add a check that we have sufficient shared_buffers, isn't it? > I don't think you're benefiting at all from the buffering that BufFile does > for you, since you're reading/writing a full block at a time anyway. You > might as well use the file API in fd.c directly, ie. > OpenTemporaryFile/FileRead/**FileWrite. BufFile distributes temporary data across several files. AFAICS postgres avoids working with files larger than 1GB. The size of the tree buffers can easily be greater. Without BufFile I would need to maintain a set of files manually. -- With best regards, Alexander Korotkov.
Re: [HACKERS] WIP: Fast GiST index build
On 14.07.2011 23:41, Alexander Korotkov wrote: Do you think using "rightlink" as a pointer to the parent page is possible during index build? It would allow us to simplify the code significantly, because there would be no more need to maintain in-memory structures to remember parents. I guess, but where do you store the rightlink, then? Would you need a final pass through the index to fix all the rightlinks? I think you could use the NSN field. It's used to detect concurrent page splits, but those can't happen during index build, so you don't need that field during index build. You just have to make it look like an otherwise illegal NSN, so that it won't be mistaken for a real NSN after the index is built. Maybe add a new flag to mean that the NSN is actually invalid. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need help understanding pg_locks
On Jul14, 2011, at 22:18 , Bruce Momjian wrote: > !OID of the database in which the lock target exists, or > !zero if the lock is a shared object, or > !null if the lock is on a transaction ID For consistency, I think it should say "target" in the second part of the sentence also now, instead of "lock ... on". Updated patch attached. I tried to make the descriptions a bit more consistent, replaced "object" by "target", and added "targeted by" after the phrase which describes the locked (or waited-for) object. best regards, Florian Pflug diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index d4a1d36..33be5d0 100644 *** a/doc/src/sgml/catalogs.sgml --- b/doc/src/sgml/catalogs.sgml *** *** 6928,6936 oid pg_database.oid !OID of the database in which the object exists, or !zero if the object is a shared object, or !null if the object is a transaction ID --- 6928,6936 oid pg_database.oid !OID of the database in which the lock target exists, or !zero if the target is a shared object, or !null if the target is a transaction ID *** *** 6938,6944 oid pg_class.oid !OID of the relation, or null if the object is not a relation or part of a relation --- 6938,6944 oid pg_class.oid !OID of the relation targeted by the lock, or null if the target is not a relation or part of a relation *** *** 6947,6954 integer !Page number within the relation, or null if the object !is not a tuple or relation page --- 6947,6954 integer !Page number targeted by the lock within the relation, !or null if the target is not a relation page or tuple *** *** 6956,6962 smallint !Tuple number within the page, or null if the object is not a tuple --- 6956,6963 smallint !Tuple number targeted by the lock within the page, !or null if the target is not a tuple *** *** 6964,6971 text !Virtual ID of a transaction, or null if the object is not a !virtual transaction ID --- 6965,6972 text !Virtual ID of the transaction targeted by the lock, !or null if the target is not a virtual transaction ID 
*** *** 6973,6979 xid !ID of a transaction, or null if the object is not a transaction ID --- 6974,6981 xid !ID of the transaction targeted by the lock, !or null if the target is not a transaction ID *** *** 6981,6988 oid pg_class.oid !OID of the system catalog containing the object, or null if the !object is not a general database object --- 6983,6990 oid pg_class.oid !OID of the system catalog containing the lock target, or null if the !target is not a general database object *** *** 6990,6997 oid any OID column !OID of the object within its system catalog, or null if the !object is not a general database object. For advisory locks it is used to distinguish the two key spaces (1 for an int8 key, 2 for two int4 keys). --- 6992,6999 oid any OID column !OID of the lock target within its system catalog, or null if the !target is not a general database object. For advisory locks it is used to distinguish the two key spaces (1 for an int8 key, 2 for two int4 keys). *** *** 7001,7010 smallint !For a table column, this is the column number (the classid and objid refer to the !table itself). For all other object types, this column is !zero. Null if the object is not a general database object --- 7003,7013 smallint !Column number targeted by the lock (the classid and objid refer to the !table itself), !or zero if the target is some other general database object, !or null if the target is not a general database object -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscriptio
Re: [HACKERS] patch: enhanced get diagnostics statement 2
2011/7/14 Alvaro Herrera : > Excerpts from Pavel Stehule's message of jue jul 14 16:25:56 -0400 2011: >> 2011/7/14 Alvaro Herrera : >> > A couple items for this patch: > >> it is good idea > > Thanks ... I expect you're going to resubmit the patch based on David's > last version and my comments? > yes, but tomorrow, time to go sleep Regards Pavel > -- > Álvaro Herrera > The PostgreSQL Company - Command Prompt, Inc. > PostgreSQL Replication, Consulting, Custom Development, 24x7 support > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Fast GiST index build
Do you think using "rightlink" as a pointer to the parent page is possible during index build? It would allow us to simplify the code significantly, because there would be no more need to maintain in-memory structures to remember parents. -- With best regards, Alexander Korotkov.
Re: [HACKERS] patch: enhanced get diagnostics statement 2
Excerpts from Pavel Stehule's message of jue jul 14 16:25:56 -0400 2011: > 2011/7/14 Alvaro Herrera : > > A couple items for this patch: > it is good idea Thanks ... I expect you're going to resubmit the patch based on David's last version and my comments? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: enhanced get diagnostics statement 2
2011/7/14 Alvaro Herrera : > A couple items for this patch: > > The docs state that the variable to receive each diagnostic value needs > to be "of the right data type" but fails to specify what it is. I think > it'd be good to turn that into a table with three > columns: name, type, description. > > This seems poor style: > > + case PLPGSQL_GETDIAG_ERROR_CONTEXT: > + case PLPGSQL_GETDIAG_ERROR_DETAIL: > + case PLPGSQL_GETDIAG_ERROR_HINT: > + case PLPGSQL_GETDIAG_RETURNED_SQLSTATE: > + case PLPGSQL_GETDIAG_MESSAGE_TEXT: > + if (!$2) > + ereport(ERROR, > + (errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("EXCEPTION_CONTEXT or > EXCEPTION_DETAIL or EXCEPTION_HINT or RETURNED_SQLSTATE or MESSAGE_TEXT are > not allowed in current diagnostics statement"), > + parser_errposition(@1))); > + > > > I think we could replace this with something like > > + if (!$2) > + ereport(ERROR, > + (errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("diagnostic value %s is > not allowed in GET CURRENT DIAGNOSTICS statement", stringify(ditem->kind)), > > > Other than that, and a few grammar fixes in code comments, this seems > good to me. > it is good idea Regards Pavel > -- > Álvaro Herrera > The PostgreSQL Company - Command Prompt, Inc. > PostgreSQL Replication, Consulting, Custom Development, 24x7 support > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_class.relistemp
"David E. Wheeler" wrote: > A deprecation cycle at least might be useful. How about a "relistemp" extension on pgxn.org for the "generated column" function to provide the backward compatibility? Is the new extension mechanism a sane way to help those who need a phase-out period? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need help understanding pg_locks
Florian Pflug wrote: > I still believe the chance of confusion to be extremely small, but since > you feel otherwise, what about "Targeted" instead of "Locked". As in > > OID of the relation targeted by the lock, or null if the lock does not > target a relation or part of a relation. > > Page number within the relation targeted by the lock, or null if the > lock does not target a tuple or a relation page. > > Virtual ID of the transaction targeted by the lock, or null if the lock > does not target a virtual transaction ID. > > "Protected"/"protects" instead of "Targeted"/"targets" would also work. > > Both avoid the imprecision of saying "Locked", and the ambiguity "on" - > which might either mean the physical location of the lock, or the object > its protecting/targeting. > > > I reworded that line to: > > > > + OID of the relation of the lock target, or null if the lock is not > > I'm not a huge fan of that. IMHO " .. of .. of .. " chains are hard to > read. Plus, there isn't such a thing as the "relation of a lock target" - > the relation *is* the lock target, not a part thereof. Agreed. I like "targeted by". New patch attached. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. 
+ diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml new file mode 100644 index c5851af..6fa6fa9 *** a/doc/src/sgml/catalogs.sgml --- b/doc/src/sgml/catalogs.sgml *** *** 6928,6936 oid pg_database.oid !OID of the database in which the object exists, or !zero if the object is a shared object, or !null if the lock object is on a transaction ID --- 6928,6936 oid pg_database.oid !OID of the database in which the lock target exists, or !zero if the lock is a shared object, or !null if the lock is on a transaction ID *** *** 6938,6944 oid pg_class.oid !OID of the relation, or null if the lock object is not on a relation or part of a relation --- 6938,6944 oid pg_class.oid !OID of the relation targeted by the lock, or null if the lock is not on a relation or part of a relation *** *** 6947,6953 integer !Page number within the relation, or null if the lock object is not on a tuple or relation page --- 6947,6953 integer !Page number within the relation targeted by the lock, or null if the lock is not on a tuple or relation page *** *** 6956,6963 smallint !Tuple number within the page, or null if the lock object is not !on a tuple --- 6956,6963 smallint !Tuple number within the page targeted by the lock, or null if !the lock is not on a tuple *** *** 6965,6971 text !Virtual ID of a transaction lock, or null if the lock object is not on a virtual transaction ID --- 6965,6971 text !Virtual ID of a transaction targeted by the lock, or null if the lock is not on a virtual transaction ID *** *** 6974,6980 xid !ID of a transaction lock, or null if the lock object is not on a transaction ID --- 6974,6981 xid !ID of a transaction targeted by the lock, or null if the lock !is not on a transaction ID *** *** 6982,6989 oid pg_class.oid !OID of the system catalog containing the object, or null if the !lock object is not on a general database object. 
--- 6983,6990 oid pg_class.oid !OID of the system catalog targeted by the lock, or null if the !lock is not on a general database object. *** *** 6991,6998 oid any OID column !OID of the object within its system catalog, or null if the !lock object is not on a general database object. For advisory locks it is used to distinguish the two key spaces (1 for an int8 key, 2 for two int4 keys). --- 6992,6999 oid any OID column !OID of the object within its system catalog targeted by the !lock, or null if the lock is not on a general database object. For advisory locks it is used to dis
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 12:19 PM, Tom Lane wrote: >> Are they not testing our 9.1 betas? > > There has never, ever, been a guarantee that the system catalogs don't > change across versions. Anybody issuing such queries should expect to > have to retest them and possibly change them in each new major release. > I see nothing to sweat about here. A deprecation cycle at least might be useful. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: enhanced get diagnostics statement 2
A couple items for this patch: The docs state that the variable to receive each diagnostic value needs to be "of the right data type" but fails to specify what it is. I think it'd be good to turn that into a table with three columns: name, type, description. This seems poor style: + case PLPGSQL_GETDIAG_ERROR_CONTEXT: + case PLPGSQL_GETDIAG_ERROR_DETAIL: + case PLPGSQL_GETDIAG_ERROR_HINT: + case PLPGSQL_GETDIAG_RETURNED_SQLSTATE: + case PLPGSQL_GETDIAG_MESSAGE_TEXT: + if (!$2) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), +errmsg("EXCEPTION_CONTEXT or EXCEPTION_DETAIL or EXCEPTION_HINT or RETURNED_SQLSTATE or MESSAGE_TEXT are not allowed in current diagnostics statement"), +parser_errposition(@1))); + I think we could replace this with something like + if (!$2) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), +errmsg("diagnostic value %s is not allowed in GET CURRENT DIAGNOSTICS statement", stringify(ditem->kind)), Other than that, and a few grammar fixes in code comments, this seems good to me. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Understanding GIN posting trees
On 14.07.2011 22:10, Tom Lane wrote: Heikki Linnakangas writes: Why is the posting tree a tree? AFAICS, we never search it using the TID, it's always scanned in whole. It would be simpler to store the TIDs in a posting list in no particular order. This could potentially make insertions cheaper, as you could just append to the last posting list page for the key, instead of traversing the posting tree to a particular location. You could also pack the tids denser, as you wouldn't need to reserve free space for additions in the middle. Surely VACUUM would like to search it by TID for deletion purposes? It doesn't; it scans all the TID lists in whole. I guess it could search by TID; that could be a win if there are only a few deleted tuples in a small range of pages. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_class.relistemp
Bruce Momjian writes: > Josh Berkus wrote: >> BTW, if we're dumping relistemp, we're going to need to notify every >> maker of a PostgreSQL admin interface before we release 9.1. > Are they not testing our 9.1 betas? There has never, ever, been a guarantee that the system catalogs don't change across versions. Anybody issuing such queries should expect to have to retest them and possibly change them in each new major release. I see nothing to sweat about here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] per-column generic option
All, Is the spec for this feature still under discussion? I don't seem to see a consensus on this thread. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Understanding GIN posting trees
Heikki Linnakangas writes: > Why is the posting tree a tree? AFAICS, we never search it using the > TID, it's always scanned in whole. It would be simpler to store the TIDs > in a posting list in no particular order. This could potentially make > insertions cheaper, as you could just append to the last posting list > page for the key, instead of traversing the posting tree to a particular > location. You could also pack the tids denser, as you wouldn't need to > reserve free space for additions in the middle. Surely VACUUM would like to search it by TID for deletion purposes? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension ownership and pg_dump
Heikki Linnakangas writes: > pg_dump seems to not dump the owner of extensions. Is that intentional? Yeah. Partly it was a matter of not wanting to implement ALTER EXTENSION OWNER, but I think there were some definitional issues too. See the archives from back around February or so. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Is there a committer in the house?
All, Currently we have 8 patches "Ready for Committer" in the current CF. Some of them have been that status for some time. >From traffic on this list, I'm getting the impression that nobody other than Robert, Heikki and Tom are committing other people's patches. I know we have more committers than that. Bruce? Simon? Itagaki? Bueller? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Extension ownership and pg_dump
pg_dump seems to not dump the owner of extensions. Is that intentional? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow pg_archivecleanup to ignore extensions
On 7/12/11 11:17 AM, Josh Berkus wrote: > On 7/12/11 7:38 AM, Simon Riggs wrote: >> On Sun, Jul 10, 2011 at 7:13 PM, Josh Berkus wrote: >> >>> This patch[1] is for some reason marked "waiting on Author". But I >>> can't find that there's been any review of it searching the list. >>> What's going on with it? Has it been reviewed? >> >> Yes, I reviewed it on list. Some minor changes were discussed. I'm >> with Greg now, so we'll discuss and handle it. > > I couldn't find the review searching the archives. Can you please link > it in the Commitfest application? Thanks. Given the total lack of activity on this patch, I'm bumping it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Single pass vacuum - take 1
Excerpts from Pavan Deolasee's message of jue jul 14 13:54:36 -0400 2011: > On Thu, Jul 14, 2011 at 12:43 PM, Heikki Linnakangas < > heikki.linnakan...@enterprisedb.com> wrote: > > Seems odd to store relindxvacxlogid/off as two int32 columns. Store it in > > one uint64 column, or invent a new datatype for LSNs, or store it as text in > > %X/%X format. > > > > > Yeah. I don't remember what issues I faced with uint64 column, may be did > not get around the case where uint64 is not natively supported on the > platform. But %X/%X looks very reasonable. Will change to that. Where is this to be stored? AFAIR we no longer support platforms that do not have working 64 bit integer types. As a column name, relindxvacxlogid seems a bit unfortunate, BTW ... -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need help understanding pg_locks
On Jul14, 2011, at 19:06 , Bruce Momjian wrote: > Florian Pflug wrote: >> On Jul13, 2011, at 21:08 , Bruce Momjian wrote: >>> + OID of the relation lock target, or null if the lock is not >>> on a relation or part of a relation >> >> That, however, not so much. "relation lock target" might easily >> be interpreted as the "relation's lock target" or the >> "relation lock's target" - at least by non-native speakers such >> as myself. The same is true for "transaction lock target" and >> friends. >> >> Can't we simply go with "Locked relation", "Locked transaction id" >> and so on (as in my versions B,C and D up-thread)? I can't really >> get excited about the slight imprecision caused by the fact that some >> rows describe aspiring lock holders instead of current lock holders. >> The existence of the "granted" column makes the situation pretty clear. >> >> Plus, it's technically not even wrong - a process is waiting because >> somebody else *is* actually holding a lock on the object. So >> the tuple/transaction/... is, in fact, a "Locked tuple/transaction/..." > > I think it will be very confusing to have "locked" refer to the person > holding the lock while the row is based on who is waiting for it. I still believe the chance of confusion to be extremely small, but since you feel otherwise, what about "Targeted" instead of "Locked". As in OID of the relation targeted by the lock, or null if the lock does not target a relation or part of a relation. Page number within the relation targeted by the lock, or null if the lock does not target a tuple or a relation page. Virtual ID of the transaction targeted by the lock, or null if the lock does not target a virtual transaction ID. "Protected"/"protects" instead of "Targeted"/"targets" would also work. Both avoid the imprecision of saying "Locked", and the ambiguity "on" - which might either mean the physical location of the lock, or the object it's protecting/targeting. 
> I reworded that line to: > > + OID of the relation of the lock target, or null if the lock is not I'm not a huge fan of that. IMHO " .. of .. of .. " chains are hard to read. Plus, there isn't such a thing as the "relation of a lock target" - the relation *is* the lock target, not a part thereof. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Single pass vacuum - take 1
On Thu, Jul 14, 2011 at 12:43 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > On 14.07.2011 18:57, Pavan Deolasee wrote: > >> On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs >> wrote: >> >>> I'd say that seems way too complex for such a small use case and we've >>> >>> only just fixed the bugs from 8.4 vacuum map complexity. The code's >>> looking very robust now and I'm uneasy that such changes are really >>> worth it. >>> >>> Thanks Simon for looking at the patch. >> >> I am not sure if the use case is really narrow. Today, we dirty the pages >> in >> both the passes and also emit WAL records. Just the heap scan can take a >> very long time for large tables, blocking the autovacuum worker threads >> from >> doing useful work on other tables. If I am not wrong, we use ring buffers >> for vacuum which would most-likely force those buffers to be written/read >> twice to the disk. >> > > Seems worthwhile to me. What bothers me a bit is the need for the new > 64-bit LSN value on each heap page. Also, note that temporary tables are not > WAL-logged, so there's no LSNs. > > Yeah, the good thing is we store it only when its needed. Temp and unlogged tables don't need any special handling because we don't rely on the WAL logging for the table itself. As you have noted down the thread, any counter which is guaranteed to not wrap-around would have worked. LSN is just very handy for the purpose (let me think more if we can just do with a flag). > How does this interact with the visibility map? If you set the visibility > map bit after vacuuming indexes, a subsequent vacuum will not visit the > page. The second vacuum will update relindxvacxlogid/off, but it will not > clean up the dead line pointers left behind by the first vacuum. Now the LSN > on the page differs from the one stored in pg_class, so subsequent pruning > will not remove the dead line pointers either. 
I think you can sidestep that > if you check that the page's vacuum LSN <= vacuum LSN in pg_class, instead > of equality. > > Hmm. We need to carefully see that. As the patch stands, we don't set the visibility map bit when there are any dead line pointers on the page. I'm not sure if it's clear, but the ItemIdIsDead() test accounts for dead as well as dead-vacuum line pointers, so the test in lazy_heap_scan() won't let the VM bit be set early. The next vacuum cycle would still look at the page and set the bit if the page appears all-visible at that time. Note that in the next vacuum cycle, we would first clear the dead-vacuum line pointers if it's not already done by some intermediate HOT-prune operation. > Ignoring the issue stated in the previous paragraph, I think you wouldn't > actually need a 64-bit LSN. A smaller counter is enough, as wrap-around > doesn't matter. In fact, a single bit would be enough. After a successful > vacuum, the counter on each heap page (with dead line pointers) is N, and > the value in pg_class is N. There are no other values on the heap, because > vacuum will have cleaned them up. When you begin the next vacuum, it will > stamp pages with N+1. So at any stage, there is only one of two values on > any page, so a single bit is enough. (But as I said, that doesn't hold if > vacuum skips some pages thanks to the visibility map) > > I am not sure if that can take care of the aborted vacuum case with a single bit or a counter that can wrap around. Let me think more about it. > Is there something in place to make sure that pruning uses an up-to-date > relindxvacxlogid/off value? I guess it doesn't matter if it's out-of-date, > you'll just miss the opportunity to remove some dead tuples. > > Yeah, exactly. We just rely on the value that was read when the pg_class tuple was last read. If we don't have the up-to-date value, we might miss some opportunity to clean up those dead-vacuum line pointers. 
> Seems odd to store relindxvacxlogid/off as two int32 columns. Store it in > one uint64 column, or invent a new datatype for LSNs, or store it as text in > %X/%X format. > > Yeah. I don't remember what issues I faced with a uint64 column; maybe I did not get around the case where uint64 is not natively supported on the platform. But %X/%X looks very reasonable. Will change to that. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Understanding GIN posting trees
I have a couple of questions on GIN: The code seems to assume that it's possible for the same TID to appear twice for a single key (see addItemPointersToTuple()). I understand that it's possible for a single heap tuple to contain the same key twice. For example if you index an array of integers like [1,2,1]. But once you've inserted all the keys for a single heap item, you never try to insert the same TID again, so no duplicates should occur. Looking at the history, it looks like pre-8.4 we assumed that no such duplicates are possible. Duplicates of a single key for one column are eliminated in extractEntriesSU(), but apparently when the multi-column support was added, we didn't make the de-duplication run across the keys extracted from all columns. Now that the posting tree/list insertion code has to deal with duplicates anyway, the de-duplication performed in extractEntriesSU() seems pointless. But I wonder if it would be better to make extractEntriesSU() remove duplicates across all columns, so that the insertion code wouldn't need to deal with duplicates. During vacuuming of the pending list we could get a power loss, leaving some data in both the tree and the pending list; after restart, pgsql will try to insert the same data into the tree. Dealing with the duplicates in the insertion code isn't particularly difficult. And in fact, now that we only support the getbitmap method, we wouldn't really need to eliminate duplicates anyway. But I have an ulterior motive: Why is the posting tree a tree? AFAICS, we never search it using the TID, it's always scanned in full. It would be simpler to store the TIDs in a posting list in no particular order. This could potentially make insertions cheaper, as you could just append to the last posting list page for the key, instead of traversing the posting tree to a particular location. You could also pack the TIDs more densely, as you wouldn't need to reserve free space for additions in the middle. 
For the consistentFn call we need to collect all data for the current TID. We do that by scanning over logically ordered arrays of TIDs, and trees allow us to do that by scanning the leaf pages. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] pg_class.relistemp
Josh Berkus wrote: > All, > > BTW, if we're dumping relistemp, we're going to need to notify every > maker of a PostgreSQL admin interface before we release 9.1. > > This is why we should have had a complete set of sysviews ... Are they not testing our 9.1 betas? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] pg_class.relistemp
All, BTW, if we're dumping relistemp, we're going to need to notify every maker of a PostgreSQL admin interface before we release 9.1. This is why we should have had a complete set of sysviews ... -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] [BUG] SSPI authentication fails on Windows when server parameter is localhost or domain name
On Wed, Jun 15, 2011 at 10:53, Ahmed Shinwari wrote: > Hi All, > > I faced a bug on Windows while connecting via SSPI authentication. I was > able to find the bug and have attached the patch. Details listed below; > > Postgres Installer: Version 9.0.4 > OS: Windows Server 2008 R2/Windows 7 Thanks - great analysis! However, I think there is a better fix for this - simply moving a } one line. In particular, I'm concerned about passing the same pointer both as input and output to the function - I couldn't find anything in the documentation saying this was safe (nor did I find anything saying it's unsafe, but.) Especially since this code clearly behaves differently on different versions - I've been completely unable to reproduce this on any of my test machines, but they are all Windows Server 2003. So - attached is a new version of the patch, how does this look to you? FYI, I've had Thom test this new version and it does appear to work fine in his scenario. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c index 7799111..936cfea 100644 --- a/src/backend/libpq/auth.c +++ b/src/backend/libpq/auth.c @@ -1349,16 +1349,22 @@ pg_SSPI_recvauth(Port *port) _("could not accept SSPI security context"), r); } + /* + * Overwrite the current context with the one we just received. + * If sspictx is NULL it was the first loop and we need to allocate + * a buffer for it. On subsequent runs, we can just overwrite the + * buffer contents since the size does not change. 
+ */ if (sspictx == NULL) { sspictx = malloc(sizeof(CtxtHandle)); if (sspictx == NULL) ereport(ERROR, (errmsg("out of memory"))); - - memcpy(sspictx, &newctx, sizeof(CtxtHandle)); } + memcpy(sspictx, &newctx, sizeof(CtxtHandle)); + if (r == SEC_I_CONTINUE_NEEDED) elog(DEBUG4, "SSPI continue needed");
Re: [HACKERS] help with sending email
On Jul 13, 2011, at 12:29 PM, "Fernando Acosta Torrelly" wrote: > Hi everybody: > > I am using pgmail to send email in an application, but I would like to use > html format > > Does anybody have an example of how to do this, or what do you recommend I > use for doing this? > > Thanks in advance for your attention. > This is the mailing list for PostgreSQL, not pgmail. And it is also a development mailing list; there are others for user questions. ...Robert
Re: [HACKERS] Re: patch review : Add ability to constrain backend temporary file space
On 14/07/11 18:48, Tatsuo Ishii wrote: Hi I am the new reviewer:-) I have looked into the v6 patches. One thing I would like to suggest is, enhancing the error message when temp_file_limit is exceeded. ERROR: aborting due to exceeding temp file limit Is it possible to add the current temp file limit to the message? For example, ERROR: aborting due to exceeding temp file limit 1kB I know the current setting of temp_file_limit can be viewed in other ways, but I think this will make admin's or application developer's life a little bit easier. Hi Tatsuo, Yeah, good suggestion - I agree that it would be useful to supply extra detail, I'll amend and resubmit a new patch (along with whatever review modifications we come up with in the meantime)! Cheers Mark
Re: [HACKERS] Need help understanding pg_locks
Florian Pflug wrote: > On Jul 13, 2011, at 21:08 , Bruce Momjian wrote: > > - OID of the database in which the object exists, or > > - zero if the object is a shared object, or > > - null if the lock object is on a transaction ID > > + OID of the database in which the lock target exists, or > > + zero if the lock is a shared object, or > > + null if the lock is on a transaction ID > > This sounds good. > > > + OID of the relation lock target, or null if the lock is not > >on a relation or part of a relation > > That, however, not so much. "relation lock target" might easily > be interpreted as the "relation's lock target" or the > "relation lock's target" - at least by non-native speakers such > as myself. The same is true for "transaction lock target" and > friends. > > Can't we simply go with "Locked relation", "Locked transaction id" > and so on (as in my versions B,C and D up-thread)? I can't really > get excited about the slight imprecision caused by the fact that some > rows describe aspiring lock holders instead of current lock holders. > The existence of the "granted" column makes the situation pretty clear. > > Plus, it's technically not even wrong - a process is waiting because > somebody else *is* actually holding a lock on the object. So > the tuple/transaction/... is, in fact, a "Locked tuple/transaction/..." I think it will be very confusing to have "locked" refer to the person holding the lock while the row is based on who is waiting for it. I reworded that line to: + OID of the relation of the lock target, or null if the lock is not Updated patch attached. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. 
+ diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index c5851af..84c2257 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -6928,9 +6928,9 @@ oid pg_database.oid - OID of the database in which the object exists, or - zero if the object is a shared object, or - null if the lock object is on a transaction ID + OID of the database in which the lock target exists, or + zero if the lock is a shared object, or + null if the lock is on a transaction ID @@ -6938,7 +6938,7 @@ oid pg_class.oid - OID of the relation, or null if the lock object is not + OID of the relation of the lock target, or null if the lock is not on a relation or part of a relation @@ -6947,7 +6947,7 @@ integer - Page number within the relation, or null if the lock object + Page number within the relation, or null if the lock is not on a tuple or relation page @@ -6956,7 +6956,7 @@ smallint - Tuple number within the page, or null if the lock object is not + Tuple number within the page, or null if the lock is not on a tuple @@ -6965,7 +6965,7 @@ text - Virtual ID of a transaction lock, or null if the lock object is not + Virtual ID of a transaction lock target, or null if the lock is not on a virtual transaction ID @@ -6974,7 +6974,7 @@ xid - ID of a transaction lock, or null if the lock object is not on a transaction ID + ID of a transaction lock target, or null if the lock is not on a transaction ID @@ -6982,8 +6982,8 @@ oid pg_class.oid - OID of the system catalog containing the object, or null if the - lock object is not on a general database object. + OID of the system catalog containing the lock target, or null if the + lock is not on a general database object. @@ -6992,7 +6992,7 @@ any OID column OID of the object within its system catalog, or null if the - lock object is not on a general database object. + lock is not on a general database object. 
For advisory locks it is used to distinguish the two key spaces (1 for an int8 key, 2 for two int4 keys). @@ -7005,7 +7005,7 @@ For a table column, this is the column number (the classid and objid refer to the table itself). For all other object types, this column is - zero. Null if the lock object is not on a general database object. + zero. Null if the lock is not on a general database object. 
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 9:55 AM, Heikki Linnakangas wrote: > Far back. But you only need it in >= 9.1. Older versions have the > pg_class.relistemp column anyway. Yeah. > Not sure how this helps, though. If you modify pgTAP to install that > automatically in pgTAP when dealing with a new server version, you might as > well modify its queries to use relispersistence = 't' directly when dealing > with a new server version. It works as a manual work-around if you can't > upgrade pgTAP, I guess. Yeah, that's what I'd rather avoid. I'll probably have to modify the function that makes the call to look at the version number. Annoying, but do-able. https://github.com/theory/pgtap/blob/master/sql/pgtap.sql.in#L5894 Best, David
Re: [HACKERS] pg_class.relistemp
"David E. Wheeler" wrote: > On Jul 13, 2011, at 12:57 PM, Kevin Grittner wrote: > >> create or replace function relistemp(rel pg_class) >> returns boolean language sql immutable strict as >> $$select $1.relpersistence = 't';$$; >> >> Just don't forget to use the table name or alias in front of >> it... :-) > > Oh, nice hack. How far back does that work (pgTAP runs on 8.0 and > higher)? As far as I know, the technique of creating a function with a record type as its only parameter to use as a "generated column" goes way back. This particular function won't work prior to 9.1, because you won't have the relpersistence column, but then, prior to 9.1 you wouldn't need to run this because you already have a relistemp column. -Kevin
Re: [HACKERS] pg_class.relistemp
On 14.07.2011 19:51, David E. Wheeler wrote: On Jul 13, 2011, at 12:57 PM, Kevin Grittner wrote: create or replace function relistemp(rel pg_class) returns boolean language sql immutable strict as $$select $1.relpersistence = 't';$$; Just don't forget to use the table name or alias in front of it... :-) Oh, nice hack. How far back does that work (pgTAP runs on 8.0 and higher)? Far back. But you only need it in >= 9.1. Older versions have the pg_class.relistemp column anyway. Not sure how this helps, though. If you modify pgTAP to install that automatically in pgTAP when dealing with a new server version, you might as well modify its queries to use relispersistence = 't' directly when dealing with a new server version. It works as a manual work-around if you can't upgrade pgTAP, I guess. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Full GUID support
On Jul 14, 2011, at 12:05 AM, Hiroshi Saito wrote: > Hi Thomas-san, Ralf-san. > > I appreciate your great work. > Thanks! > > CC to Postgres-ML. > > Regards, > Hiroshi Saito > > (2011/07/14 3:49), Thomas Lotterer wrote: > > Thanks for the hint. > > Our ftp daemon is dumping core. > > We are debugging ... Ah, good news, thanks. Where should I report stuff like this in the future? I sent a message about this to r...@engelschall.com on May 15th and also a Twitter DM but didn't hear back… Thanks, David
Re: [HACKERS] pg_class.relistemp
On Jul 13, 2011, at 12:57 PM, Kevin Grittner wrote: > create or replace function relistemp(rel pg_class) > returns boolean language sql immutable strict as > $$select $1.relpersistence = 't';$$; > > Just don't forget to use the table name or alias in front of it... :-) Oh, nice hack. How far back does that work (pgTAP runs on 8.0 and higher)? Thanks, David
Re: [HACKERS] Single pass vacuum - take 1
On 14.07.2011 18:57, Pavan Deolasee wrote: On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs wrote: I'd say that seems way too complex for such a small use case and we've only just fixed the bugs from 8.4 vacuum map complexity. The code's looking very robust now and I'm uneasy that such changes are really worth it. Thanks Simon for looking at the patch. I am not sure if the use case is really narrow. Today, we dirty the pages in both the passes and also emit WAL records. Just the heap scan can take a very long time for large tables, blocking the autovacuum worker threads from doing useful work on other tables. If I am not wrong, we use ring buffers for vacuum which would most likely force those buffers to be written/read twice to the disk. Seems worthwhile to me. What bothers me a bit is the need for the new 64-bit LSN value on each heap page. Also, note that temporary tables are not WAL-logged, so there's no LSNs. How does this interact with the visibility map? If you set the visibility map bit after vacuuming indexes, a subsequent vacuum will not visit the page. The second vacuum will update relindxvacxlogid/off, but it will not clean up the dead line pointers left behind by the first vacuum. Now the LSN on the page differs from the one stored in pg_class, so subsequent pruning will not remove the dead line pointers either. I think you can sidestep that if you check that the page's vacuum LSN <= vacuum LSN in pg_class, instead of equality. Ignoring the issue stated in the previous paragraph, I think you wouldn't actually need a 64-bit LSN. A smaller counter is enough, as wrap-around doesn't matter. In fact, a single bit would be enough. After a successful vacuum, the counter on each heap page (with dead line pointers) is N, and the value in pg_class is N. There are no other values on the heap, because vacuum will have cleaned them up. When you begin the next vacuum, it will stamp pages with N+1. 
So at any stage, there is only one of two values on any page, so a single bit is enough. (But as I said, that doesn't hold if vacuum skips some pages thanks to the visibility map) Is there something in place to make sure that pruning uses an up-to-date relindxvacxlogid/off value? I guess it doesn't matter if it's out-of-date, you'll just miss the opportunity to remove some dead tuples. Seems odd to store relindxvacxlogid/off as two int32 columns. Store it in one uint64 column, or invent a new datatype for LSNs, or store it as text in %X/%X format. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Reduced power consumption in WAL Writer process
On Thu, Jul 14, 2011 at 10:53 AM, Heikki Linnakangas wrote: > On 14.07.2011 12:42, Simon Riggs wrote: >> >> On Thu, Jul 14, 2011 at 9:57 AM, Fujii Masao >> wrote: >> >>> Currently walwriter might write out the WAL before a transaction commits. >>> IOW, walwriter tries to write out the WAL in wal_buffers in every >>> wakeup. >>> This might be useful for a long transaction which generates lots of WAL >>> records before commit. So we should call SetLatch() in XLogInsert() >>> instead >>> of RecordTransactionCommit()? Though I'm not sure how much walwriter >>> improves the performance of synchronous commit case.. >> >> Yeh, we did previously have a heuristic to write out the WAL when it >> was more than half full. Not sure I want to put exactly that code back >> into such a busy code path. >> >> I suggest that we set latch every time the wal buffers wrap. >> >> So at the bottom of AdvanceXLInsertBuffer(), if nextidx == 0 then >> SetLatch on the WALWriter. >> >> That's a simple test and we only check it if we're switching WAL buffer pages. > > That was my first thought too - but I wonder if that's too aggressive? A > backend that does for example a large bulk load will cycle through the > buffers real quick. It seems like a bad idea to wake up walwriter between > each buffer in that case. Then again, setting a latch that's already set is > cheap, so maybe it works fine in practice. > > Maybe it would be better to do it less frequently, say, every time you > switch to a new WAL segment. Or every 10 buffers or something like that. Yes, that's roughly what I'm saying. When nextidx == 0 is just after we wrapped wal_buffers, i.e. we only wake up the WALWriter every wal_buffers pages. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Reduced power consumption in WAL Writer process
On 14.07.2011 12:42, Simon Riggs wrote: On Thu, Jul 14, 2011 at 9:57 AM, Fujii Masao wrote: Currently walwriter might write out the WAL before a transaction commits. IOW, walwriter tries to write out the WAL in wal_buffers in every wakeup. This might be useful for a long transaction which generates lots of WAL records before commit. So we should call SetLatch() in XLogInsert() instead of RecordTransactionCommit()? Though I'm not sure how much walwriter improves the performance of synchronous commit case.. Yeh, we did previously have a heuristic to write out the WAL when it was more than half full. Not sure I want to put exactly that code back into such a busy code path. I suggest that we set latch every time the wal buffers wrap. So at the bottom of AdvanceXLInsertBuffer(), if nextidx == 0 then SetLatch on the WALWriter. That's a simple test and we only check it if we're switching WAL buffer pages. That was my first thought too - but I wonder if that's too aggressive? A backend that does for example a large bulk load will cycle through the buffers real quick. It seems like a bad idea to wake up walwriter between each buffer in that case. Then again, setting a latch that's already set is cheap, so maybe it works fine in practice. Maybe it would be better to do it less frequently, say, every time you switch to a new WAL segment. Or every 10 buffers or something like that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Patch Review: Bugfix for XPATH() if text or attribute nodes are selected
On sön, 2011-07-10 at 11:40 -0700, Josh Berkus wrote: > Hackers, > > >> B. 6. Current behaviour _is intended_ (there is "if" to check node type) > >> and _"natural"_. In this particular case user ask for text content of some > >> node, and this content is actually "<". > > > > I don't buy that. The check for the node type is there because > > two different libxml functions are used to convert nodes to > > strings. The if has absolutely *zero* to do with escaping, except > > for that missing escape_xml() call in the "else" case. > > > > Secondly, there is little point in having a type XML if we > > don't actually ensure that values of that type can only contain > > well-formed XML. > > Can anyone else weigh in on this? Peter? Looks like a good change to me.
Re: [HACKERS] Patch Review: Bugfix for XPATH() if text or attribute nodes are selected
On ons, 2011-07-13 at 11:58 +0200, Nicolas Barbier wrote: > 2011/6/29, Florian Pflug : > > > Secondly, there is little point in having a type XML if we > > don't actually ensure that values of that type can only contain > > well-formed XML. > > +1. The fact that XPATH() must return a type that cannot depend on the > given expression (even if it is a constant string) may be unfortunate, > but returning XML-that-is-not-quite-XML sounds way worse to me. The example given was XPATH('/*/text()', '<') This XPath expression returns a node set, and XML is a serialization format of a node, so returning xml[] in this particular case seems entirely reasonable to me.
Re: [HACKERS] Single pass vacuum - take 1
On Tue, Jul 12, 2011 at 9:47 PM, Pavan Deolasee wrote: > http://archives.postgresql.org/pgsql-hackers/2011-05/msg01119.php > PFA a patch which implements the idea with some variation. > At the start of the first pass, we remember the current LSN. Every page that > needs some work is HOT-pruned so that dead tuples are truncated to dead line > pointers. We collect those dead line pointers and mark them as > dead-vacuumed. Since we don't have any LP flag bits available, we instead > just use the LP_DEAD flag along with offset value 1 to mark the line pointer > as dead-vacuumed. The page is defragmented and we store the LSN remembered > at the start of the pass in the page special area as vacuum LSN. We also > update the free space at that point because we are not going to do a second > pass on the page anymore. > > Once we collect all dead line pointers and mark them as dead-vacuumed, we > clean up the indexes and remove all index pointers pointing to those > dead-vacuumed line pointers. If the index vacuum finishes successfully, we > store the LSN in the pg_class row of the table (needs catalog changes). At > that point, we are certain that there are no index pointers pointing to > dead-vacuumed line pointers and they can be reclaimed at the next > opportunity. > > During normal operations or subsequent vacuum, if the page is chosen for > HOT-pruning, we check if it has any dead-vacuumed line pointers and if the > vacuum LSN stored on the page special area is equal to the one stored in the > pg_class row, and reclaim those dead-vacuum line pointers (the index > pointers to these line pointers are already taken care of). If the pg_class > LSN is not the same, the last vacuum probably did not finish completely and > we collect the dead-vacuum line pointers just like other dead line pointers > and try to clean up the index pointers as usual. > I ran a few pgbench tests with the patch. 
I don't see much difference in the > overall tps, but the vacuum time for the accounts table reduces by nearly > 50%. Nor do I see much difference in the overall bloat, but then pgbench > uses HOT very nicely and the accounts table got only a couple of vacuum cycles > in my 7-8 hour run. > There are a couple of things that probably need more attention. I am not sure > if we need to teach ANALYZE to treat dead line pointers differently. Since > they take up much less space than a dead tuple, they should definitely have > a lower weight, but at the same time, we need to take into account the > number of indexes on the table. The start of first pass LSN that we are > remembering is in fact the start of the WAL page and I think there could be > some issues with that, especially for very tiny tables. For example, first > vacuum may run completely. If another vacuum is started on the same table > and say it gets the same LSN (because we did not write more than 1 page > worth WAL in between) and if the second vacuum aborts after it cleaned up > few pages, we might get into some trouble. The likelihood of such things > happening is very small, but maybe it's worth taking care of it. Maybe we > can get the exact current LSN and not store it in the pg_class if we don't > do anything during the cycle. > Comments? Hi Pavan, I'd say that seems way too complex for such a small use case and we've only just fixed the bugs from 8.4 vacuum map complexity. The code's looking very robust now and I'm uneasy that such changes are really worth it. You're trying to avoid Phase 3, the second pass on the heap. Why not avoid the write in Phase 1 if it's clear that we'll need to come back again in Phase 3? So we either do a write in Phase 1 or in Phase 3, but never both? That minimises the writes, which are what hurt the most. We can reduce the overall cost simply by not doing Phase 2 and Phase 3 if the number of rows to remove is too few, say < 1%. 
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Single pass vacuum - take 1
On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs wrote: > > > Hi Pavan, > > I'd say that seems way too complex for such a small use case and we've > only just fixed the bugs from 8.4 vacuum map complexity. The code's > looking very robust now and I'm uneasy that such changes are really > worth it. > > Thanks Simon for looking at the patch. I am not sure if the use case is really narrow. Today, we dirty the pages in both the passes and also emit WAL records. Just the heap scan can take a very long time for large tables, blocking the autovacuum worker threads from doing useful work on other tables. If I am not wrong, we use ring buffers for vacuum which would most likely force those buffers to be written/read twice to the disk. Which part of the patch do you think is very complex? We can try to simplify that. Or are you seeing any obvious bugs that I missed? IMHO taking out a phase completely from vacuum (as this patch does) can simplify things. > You're trying to avoid Phase 3, the second pass on the heap. Why not > avoid the write in Phase 1 if it's clear that we'll need to come back > again in Phase 3? So we either do a write in Phase 1 or in Phase 3, > but never both? That minimises the writes, which are what hurt the > most. > > You can possibly do the work in Phase 3, but that doesn't avoid the second scan. > We can reduce the overall cost simply by not doing Phase 2 and Phase 3 > if the number of rows to remove is too few, say < 1%. > If you have set the vacuum parameters such that it kicks in when there are say 5% updates/deletes, you would most likely have that much work to do anyway. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] SAVEPOINTs and COMMIT performance
On Mon, Jun 6, 2011 at 10:33 AM, Heikki Linnakangas wrote: > On 06.02.2011 23:09, Simon Riggs wrote: >> >> On Sun, 2011-02-06 at 12:11 -0500, Bruce Momjian wrote: >>> >>> Did this ever get addressed? >> >> Patch attached. >> >> Seems like the easiest fix I can come up with. > >> @@ -2518,7 +2518,7 @@ CommitTransactionCommand(void) >> case TBLOCK_SUBEND: >> do >> { >> - CommitSubTransaction(); >> + CommitSubTransaction(true); >> s = CurrentTransactionState; /* changed >> by pop */ >> } while (s->blockState == TBLOCK_SUBEND); >> /* If we had a COMMIT command, finish off the main >> xact too */ > > We also get into this codepath at RELEASE SAVEPOINT, in which case it is > wrong to not reassign the locks to the parent subtransaction. Attached patch splits TBLOCK_SUBEND state into two new states: TBLOCK_SUBCOMMIT and TBLOCK_SUBRELEASE, so that we can do the right thing. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services savepoint_commit_performance.v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
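The reason one state was not enough can be sketched with a toy model (illustrative only — this is not the actual xact.c code): on top-level COMMIT, a subtransaction's locks can be transferred straight to the top transaction, while on RELEASE SAVEPOINT they must be reassigned to the parent subtransaction, so the cleanup loop needs to know which case it is in:

```c
/*
 * Toy model of why TBLOCK_SUBEND was split: the two ways a
 * subtransaction can end require different lock handling.
 * Names mirror the patch's new states, but the helper function
 * is invented for illustration.
 */
typedef enum
{
    TBLOCK_SUBCOMMIT,           /* ending because of top-level COMMIT */
    TBLOCK_SUBRELEASE           /* ending because of RELEASE SAVEPOINT */
} SubEndState;

/* Returns 1 when locks may go straight to the top-level transaction. */
static int
locks_go_to_top_xact(SubEndState state)
{
    return state == TBLOCK_SUBCOMMIT;
}
```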
Re: [HACKERS] patch for distinguishing PG instances in event log
2011/5/26 MauMau : > Hello, > > I wrote and attached a patch for the TODO item below (which I proposed). > > Allow multiple Postgres clusters running on the same machine to distinguish > themselves in the event log > http://archives.postgresql.org/pgsql-hackers/2011-03/msg01297.php > http://archives.postgresql.org/pgsql-hackers/2011-05/msg00574.php > > I changed two things from the original proposal. > > 1. regsvr32.exe needs /n when you specify event source > I described the reason in src/bin/pgevent/pgevent.c. > > 2. I moved the article for event log registration to a more suitable place > The traditional place and what I originally proposed were not the best, because > those who don't build from source won't read those places. > > I successfully tested event log registration/unregistration, event logging > with/without event_source parameter, and SHOWing event_source parameter with > psql on Windows Vista (32-bit). I would appreciate it if someone who has a > 64-bit environment could test it on 64-bit Windows. > > I'll add this patch to the first CommitFest of 9.2. Thank you in advance for > reviewing it. + + On Windows, you need to register an event source + and its library with the operating system in order + to make use of the eventlog option for + log_destination. + See for details. + * This part is not strictly correct - you don't *need* to do that, it just makes things look nicer, no? * Also, what is the use for set_eventlog_parameters()? It's just a string variable, it should work without it. * These days we avoid #ifdef'ing GUCs just because they are not available on a platform - so the list stays consistent. The GUC should be available on non-Windows platforms as well. * The GUC also needs to go in postgresql.conf.sample * We never build in unicode mode, so all those checks are unnecessary. * Are we really allowed to call MessageBox in DllRegisterServer? Won't that break badly in cases like silent installs? Attached is an updated patch, which doesn't work yet. 
I believe the changes to the backend are correct, but probably some of the cleanups and changes in the dll are incorrect, because I seem to be unable to register either the default or a custom handler so far. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 842558d..583a5c9 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2975,6 +2975,13 @@ local0.*/var/log/postgresql to the syslog daemon's configuration file to make it work. + + On Windows, you need to register an event source + and its library with the operating system in order + to make use of the eventlog option for + log_destination. + See for details. + @@ -3221,6 +3228,24 @@ local0.*/var/log/postgresql + + event_source (string) + + event_source configuration parameter + + + + When logging to event log is enabled, this parameter + determines the program name used to identify + PostgreSQL messages in + event log. The default is + PostgreSQL. + This parameter can only be set in the postgresql.conf + file or on the server command line. + + + + diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml index 0410cff..41b9009 100644 --- a/doc/src/sgml/installation.sgml +++ b/doc/src/sgml/installation.sgml @@ -1552,19 +1552,6 @@ PostgreSQL, contrib and HTML documentation successfully made. Ready to install. - Registering eventlog on Windows: - -To register a Windows eventlog -library with the operating system, issue this command after installation: - -regsvr32 pgsql_library_directory/pgevent.dll - -This creates registry entries used by the event viewer. 
- - - - Uninstallation: To undo the installation use the command gmake diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index ef83206..bfbb641 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -2294,4 +2294,54 @@ ssh -L 6:db.foo.com:5432 j...@shell.foo.com + + Registering Event Log on Windows + + + event log + event log + + + + To register a Windows event log + library with the operating system, issue this command: + +regsvr32 pgsql_library_directory/pgevent.dll + + This creates registry entries used by the event viewer, under the default event + source named "PostgreSQL". + + + + You can specify the event source name with /i option. In this case, -n option + is necessary, too: + +regsvr32 /n /i:event_source_name + pgsql_library_directory/pgevent.dll + + You also need to set in + postgresql.conf. Note that the event source specific
Re: [HACKERS] proposal: a validator for configuration files
On Jul 14, 2011, at 4:38 AM, Alvaro Herrera wrote: > Excerpts from Florian Pflug's message of mié jul 13 20:12:28 -0400 2011: >> On Jul14, 2011, at 01:38 , Alvaro Herrera wrote: >>> One strange thing here is that you could get two such messages; say if a >>> file has 100 parse errors and there are also valid lines that contain >>> bogus settings (foo = bar). I don't find this to be too problematic, >>> and I think fixing it would be excessively annoying. >>> >>> For example, a bogus run would end like this: >>> >>> 95 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 4, near end of line >>> 96 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 41, near end of line >>> 97 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 104, near end of line >>> 98 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 156, near end of line >>> 99 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 208, near end of line >>> 100 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 260, near end of line >>> 101 LOG: too many errors found, stopped processing file >>> "/pgsql/install/HEAD/data/postgresql.conf" >>> 102 LOG: unrecognized configuration parameter "plperl.err" >>> 103 LOG: unrecognized configuration parameter "this1" >>> 104 LOG: too many errors found, stopped processing file >>> "/pgsql/install/HEAD/data/postgresql.conf" >>> 105 FATAL: errors detected while parsing configuration files >> >> How about changing ParseConfigFile to say "too many *syntax* error found" >> instead? It'd be more precise, and we wouldn't emit exactly the >> same message twice. > > Yeah, I thought about doing it that way but refrained because it'd be > one more string to translate. That's a poor reason, I admit :-) I'll > change it. 
This happens because the check on the total number of errors so far is made only after coming across at least one unrecognized configuration option. What about adding one more check right after ParseConfigFile, so we can bail out early when overwhelmed with syntax errors? This would save a line in translation :). > >> Do you want me to take a closer look at your modified version of the >> patch before you commit, or did you post it more as a "FYI, this is >> how it's going to look like"? > > I know I'd feel more comfortable if you (and Alexey, and Selena) gave it > another look :-) I have checked it here and don't see any more problems with it. -- Command Prompt, Inc. http://www.CommandPrompt.com PostgreSQL Replication, Consulting, Custom Development, 24x7 support
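The early-bail-out idea above can be sketched roughly as follows. This is illustrative code only, not the real guc-file.c logic; the function name, parameters, and the cap value are invented for the example:

```c
#include <stdbool.h>

#define MAX_CONFIG_ERRORS 100   /* illustrative cap, as in the log above */

/*
 * Sketch: take the syntax-error count from the parse pass first and bail
 * out immediately if the cap is already hit, then count semantic errors
 * (unrecognized settings), stopping as soon as the combined total reaches
 * the cap.  Returns the number of errors counted.
 */
static int
check_config(int syntax_errors, int n_settings, const bool *setting_ok,
             bool *stopped_early)
{
    int errors = syntax_errors;

    *stopped_early = false;
    if (errors >= MAX_CONFIG_ERRORS)    /* check right after ParseConfigFile */
    {
        *stopped_early = true;
        return errors;
    }
    for (int i = 0; i < n_settings; i++)
    {
        if (!setting_ok[i] && ++errors >= MAX_CONFIG_ERRORS)
        {
            *stopped_early = true;
            break;
        }
    }
    return errors;
}
```

With this shape, a file that already produced 100 syntax errors never reaches the unrecognized-parameter pass, so the "too many errors" message is emitted only once.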
Re: [HACKERS] Patch Review: Bugfix for XPATH() if text or attribute nodes are selected
On Thu, 14 Jul 2011 15:15:33 +0300, Peter Eisentraut wrote: On sön, 2011-07-10 at 11:40 -0700, Josh Berkus wrote: Hackers, >> B. 6. Current behaviour _is intended_ (there is "if" to check node type) and _"natural"_. In this particular case the user asks for the text content of some node, and this content is actually "<". > > I don't buy that. The check for the node type is there because > two different libxml functions are used to convert nodes to > strings. The if has absolutely *zero* to do with escaping, except > for that missing escape_xml() call in the "else" case. > > Secondly, there is little point in having a type XML if we > don't actually ensure that values of that type can only contain > well-formed XML. Can anyone else weigh in on this? Peter? Looks like a good change to me. I'll bump it in a few hours, as I can't recall the password for my keyring. Now my hands are clean and it's no longer my business to worry about this. Best regards. Radek.
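The escaping Peter refers to can be illustrated with a small stand-alone version. This is not the backend's actual escape_xml() — it is a minimal sketch showing why a text node whose content is "<" must be entity-escaped to keep the value well-formed XML (the caller must free the result):

```c
#include <stdlib.h>
#include <string.h>

/*
 * Minimal illustration of XML text escaping: replace the characters
 * that would break well-formedness with their entity references.
 * Not the real PostgreSQL escape_xml(); invented for this example.
 */
static char *
escape_xml_text(const char *in)
{
    size_t  outlen = 0;
    char   *out, *p;
    const char *s;

    for (s = in; *s; s++)
        outlen += (*s == '<' || *s == '>') ? 4 : (*s == '&') ? 5 : 1;
    out = p = malloc(outlen + 1);
    for (s = in; *s; s++)
    {
        switch (*s)
        {
            case '<': memcpy(p, "&lt;", 4); p += 4; break;
            case '>': memcpy(p, "&gt;", 4); p += 4; break;
            case '&': memcpy(p, "&amp;", 5); p += 5; break;
            default: *p++ = *s;
        }
    }
    *p = '\0';
    return out;
}
```

So the text node "<" from the thread comes out as "&lt;", which is what the missing call in the "else" branch should produce.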
[HACKERS] Understanding GIN posting trees
I have a couple of questions on GIN: The code seems to assume that it's possible for the same TID to appear twice for a single key (see addItemPointersToTuple()). I understand that it's possible for a single heap tuple to contain the same key twice. For example if you index an array of integers like [1,2,1]. But once you've inserted all the keys for a single heap item, you never try to insert the same TID again, so no duplicates should occur. Looking at the history, it looks like pre-8.4 we assumed that no such duplicates are possible. Duplicates of a single key for one column are eliminated in extractEntriesSU(), but apparently when the multi-column support was added, we didn't make the de-duplication to run across the keys extracted from all columns. Now that the posting tree/list insertion code has to deal with duplicates anyway, the de-duplication performed in extractEntriesSU() seems pointless. But I wonder if it would be better to make extractEntriesSU() remove duplicates across all columns, so that the insertion code wouldn't need to deal with duplicates. Dealing with the duplicates in the insertion code isn't particularly difficult. And in fact, now that we only support the getbitmap method, we wouldn't really need to eliminate duplicates anyway. But I have an ulterior motive: Why is the posting tree a tree? AFAICS, we never search it using the TID, it's always scanned in whole. It would be simpler to store the TIDs in a posting list in no particular order. This could potentially make insertions cheaper, as you could just append to the last posting list page for the key, instead of traversing the posting tree to a particular location. You could also pack the tids denser, as you wouldn't need to reserve free space for additions in the middle. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
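The cross-column de-duplication Heikki suggests for extractEntriesSU() amounts to sorting the extracted entries and dropping duplicates before insertion. The sketch below is heavily simplified — real GIN entries are (attribute, key) pairs with datum comparators, while this uses plain integers — but it shows the shape of the idea:

```c
#include <stdlib.h>

/* Comparator for qsort; avoids overflow of the naive a - b form. */
static int
cmp_int(const void *a, const void *b)
{
    int ia = *(const int *) a, ib = *(const int *) b;

    return (ia > ib) - (ia < ib);
}

/*
 * Illustrative sketch of de-duplicating extracted keys across all
 * columns of one heap item: sort in place, then compact out repeats.
 * Returns the de-duplicated length.
 */
static int
dedup_keys(int *keys, int n)
{
    int j = 0;

    if (n == 0)
        return 0;
    qsort(keys, n, sizeof(int), cmp_int);
    for (int i = 1; i < n; i++)
        if (keys[i] != keys[j])
            keys[++j] = keys[i];
    return j + 1;
}
```

For the [1,2,1] array example from the message, the three extracted keys collapse to two, so the insertion code would never see the same TID twice for one key.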
Re: [HACKERS] Small patch for GiST: move childoffnum to child
On Thu, Jul 14, 2011 at 12:56 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > First, notice that we're setting "ptr->parent = top". 'top' is the current > node we're processing, and ptr represents the node to the right of the > current node. The current node is *not* the parent of the node to the right. > I believe that line should be "ptr->parent = top->parent".

I think the same.

> Second, we're adding the entry for the right sibling to the end of the list > of nodes to visit. But when we process entries from the list, we exit > immediately when we see a leaf page. That means that the right sibling can > get queued up behind leaf pages, and thus never visited.

I think a possible solution is to queue the right sibling immediately after the current page. Thus, this code fragment should look like this:

    if (top->parent && XLByteLT(top->parent->lsn, GistPageGetOpaque(page)->nsn) &&
        GistPageGetOpaque(page)->rightlink != InvalidBlockNumber /* sanity check */ )
    {
        /* page split while we were thinking of... */
        ptr = (GISTInsertStack *) palloc0(sizeof(GISTInsertStack));
        ptr->blkno = GistPageGetOpaque(page)->rightlink;
        ptr->childoffnum = InvalidOffsetNumber;
        ptr->parent = top->parent;
        ptr->next = top->next;
        top->next = ptr;
        if (tail == top)
            tail = ptr;
    }

-- With best regards, Alexander Korotkov.
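The linked-list manipulation in that fragment — splicing the right sibling in directly after the current node and keeping the tail pointer consistent — can be isolated into a tiny stand-alone model. GISTInsertStack is reduced to just the link fields here; this is an illustration of the list mechanics, not GiST code:

```c
#include <stddef.h>

/* Minimal stand-in for GISTInsertStack: just a block number and link. */
typedef struct Node
{
    int          blkno;
    struct Node *next;
} Node;

/*
 * Insert 'ptr' immediately after 'top' in the to-visit list, so the
 * right sibling is processed before any leaf pages queued later.
 * Returns the (possibly new) tail of the list.
 */
static Node *
insert_after(Node *top, Node *ptr, Node *tail)
{
    ptr->next = top->next;
    top->next = ptr;
    return (tail == top) ? ptr : tail;
}
```

If the current node was the tail, the inserted sibling becomes the new tail; otherwise the tail is unchanged — exactly the `if (tail == top) tail = ptr;` step in the corrected fragment.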
Re: [HACKERS] Reduced power consumption in WAL Writer process
On Thu, Jul 14, 2011 at 9:57 AM, Fujii Masao wrote: > Currently walwriter might write out the WAL before a transaction commits. > IOW, walwriter tries to write out the WAL in wal_buffers on every wakeup. > This might be useful for a long transaction which generates lots of WAL > records before commit. So we should call SetLatch() in XLogInsert() instead > of RecordTransactionCommit()? Though I'm not sure how much walwriter > improves the performance of the synchronous commit case.. Yeah, we did previously have a heuristic to write out the WAL when it was more than half full. I'm not sure I want to put exactly that code back into such a busy code path. I suggest that we set the latch every time the WAL buffers wrap. So at the bottom of AdvanceXLInsertBuffer(), if nextidx == 0 then SetLatch on the WALWriter. That's a simple test, and we only check it when we're switching WAL buffer pages. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
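Simon's "wake the WAL writer on buffer wrap" heuristic can be modeled in a few lines. This is an illustrative simulation, not the real AdvanceXLInsertBuffer() — the buffer count, function name, and the wakeup counter standing in for SetLatch() are all invented for the example:

```c
#define N_WAL_BUFFERS 8         /* illustrative wal_buffers setting */

static int wakeups = 0;         /* stand-in for SetLatch(walwriter latch) */

/*
 * Advance to the next WAL insertion buffer.  The latch is set only
 * when the index wraps to 0, i.e. once per full cycle through the
 * ring of wal_buffers -- a cheap test done only on page switches.
 */
static int
advance_buffer(int curidx)
{
    int nextidx = (curidx + 1) % N_WAL_BUFFERS;

    if (nextidx == 0)
        wakeups++;
    return nextidx;
}
```

Cycling through 16 page switches with 8 buffers wakes the writer exactly twice, so the wakeup rate scales with WAL volume rather than with a polling interval.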
Re: [HACKERS] WIP: Fast GiST index build
On Thu, Jul 14, 2011 at 12:42 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > Pinning a buffer that's already in the shared buffer cache is cheap, I > doubt you're gaining much by keeping the private hash table in front of the > buffer cache.

Yes, I see. Pinning a lot of buffers doesn't give significant benefits but produces a lot of problems. Also, removing the hash table can simplify the code.

> Also, it's possible that not all of the subtree is actually required during > the emptying, so in the worst case pre-loading them is counter-productive.

What do you think about pre-fetching all of the subtree? It requires actually loading level_step - 1 levels of it. In some cases it can still be counter-productive, but it is probably productive on average?

> Well, what do you do if you deem that shared_buffers is too small? Fall > back to the old method? Also, shared_buffers is shared by all backends, so > you can't assume that you get to use all of it for the index build. You'd > need a wide safety margin.

I assumed we would check whether there is enough shared_buffers before switching to the buffering method. But concurrent backends make this method unsafe. There are other difficulties with concurrent backends: it would be nice to estimate the usage of the effective cache by other backends before switching to the buffering method. If we don't take care of that, we might fail to switch to the buffering method in cases where it could give a significant benefit. -- With best regards, Alexander Korotkov.
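The cost of pre-fetching a subtree can be estimated with back-of-envelope arithmetic: loading the top `levels` levels of a subtree with fanout f touches roughly 1 + f + f^2 + ... pages, which is what has to fit in (the builder's share of) shared_buffers. The function below is purely illustrative arithmetic, not anything from the patch:

```c
/*
 * Rough page count for pre-fetching the top 'levels' levels of a
 * subtree with the given fanout: a geometric sum 1 + f + f^2 + ...
 * Illustrative only; real GiST fanout varies per page.
 */
static long
subtree_pages(long fanout, int levels)
{
    long pages = 0, level_width = 1;

    for (int i = 0; i < levels; i++)
    {
        pages += level_width;
        level_width *= fanout;
    }
    return pages;
}
```

For example, with a fanout around 100, pre-fetching level_step - 1 = 2 levels touches about 101 pages per subtree, which gives a feel for how quickly the requirement grows with level_step.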
Re: [HACKERS] Reduced power consumption in WAL Writer process
On Thu, Jul 14, 2011 at 5:39 PM, Simon Riggs wrote: > On Wed, Jul 13, 2011 at 10:56 PM, Peter Geoghegan > wrote: >> Attached is a patch for the WAL writer that removes its tight polling >> loop (which probably doesn't get hit often in practice, as we just >> sleep if wal_writer_delay is under a second), and, at least >> potentially, reduces power consumption when idle by using a latch. >> >> I will break all remaining power consumption work down into >> per-auxiliary process patches. I think that this is appropriate - if >> we hit a snag on one of the processes, there is no need to have that >> hold up everything. >> >> I've commented that we handle all expected signals, and therefore we >> shouldn't worry about having timeout invalidated by signals, just as >> with the archiver. Previously, we didn't even worry about Postmaster >> death within the tight polling loop, presumably because >> wal_writer_delay is typically small enough to avoid that being a >> problem. I thought that WL_POSTMASTER_DEATH might be superfluous here, >> but then again there is a codepath specifically for the case where >> wal_writer_delay exceeds one second, so it is included in this initial >> version. >> >> Comments? > > ISTM that this in itself isn't enough to reduce power consumption. > > Currently the only people that use WALWriter are asynchronous commits, > so we should include within RecordTransactionCommit() a SetLatch() > command for the WALWriter. > > That way we can have WALWriter sleep until its needed. +1 Currently walwriter might write out the WAL before a transaction commits. IOW, walwriter tries to write out the WAL in wal_buffers on every wakeup. This might be useful for a long transaction which generates lots of WAL records before commit. So should we call SetLatch() in XLogInsert() instead of RecordTransactionCommit()? Though I'm not sure how much walwriter improves the performance of the synchronous commit case.. 
Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
[HACKERS] help with sending email
Hi everybody: I am using pgmail to send email from an application, but I would like to use HTML format. Does anybody have an example of how to do this, or what do you recommend I use for this? Thanks in advance for your attention. Best Regards, Fernando Acosta Lima - Perú
Re: [HACKERS] Full GUID support
Hi Thomas-san, Ralf-san. I appreciate your great work. Thanks! CC to Postgres-ML. Regards, Hiroshi Saito (2011/07/14 3:49), Thomas Lotterer wrote: > Thanks for the hint. > Our ftp daemon is dumping core. > We are debugging ...