Re: [HACKERS] lazy vxid locks, v2

2011-07-14 Thread Jeff Davis
On Tue, 2011-07-05 at 13:15 -0400, Robert Haas wrote:
 On Tue, Jul 5, 2011 at 1:13 PM, Robert Haas robertmh...@gmail.com wrote:
  Here is an updated version of the lazy vxid locks patch [1], which
  applies over the latest "reduce the overhead of frequent table
  locks" [2] patch.
 
  [1] https://commitfest.postgresql.org/action/patch_view?id=585
  [2] https://commitfest.postgresql.org/action/patch_view?id=572
 
 And then I forgot the attachment.

The patch looks good, and I like the concept.

My only real comment is one that you already made: the
BackendIdGetProc() mechanism is not awesome. However, that seems like
material for a separate patch, if at all.

Big disclaimer: I did not do any performance review, despite the fact
that this is a performance patch.

I see that there are some active performance concerns around this patch,
specifically that it may cause an increase in spinlock contention:

http://archives.postgresql.org/message-id/banlktikp4egbfw9xdx9bq_vk8dqa11w...@mail.gmail.com

Fortunately, there's a subsequent discussion that shows a lot of
promise:

http://archives.postgresql.org/pgsql-hackers/2011-07/msg00293.php

I'll mark this "waiting on author" pending the results of that
discussion.

I like the approach you're taking with this series of patches, so
perhaps we shouldn't set the bar so high that you have to remove all of
the bottlenecks before making any progress. Then again, maybe there's
not a huge cost to leaving these patches on the shelf until we're sure
that they lead somewhere.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: patch review : Add ability to constrain backend temporary file space

2011-07-14 Thread Tatsuo Ishii
 Hackers,
 
 This patch needs a new reviewer, per Cedric.  Please help!

Hi, I am the new reviewer :-)

I have looked into the v6 patches. One thing I would like to suggest
is enhancing the error message emitted when temp_file_limit is exceeded.

ERROR:  aborting due to exceeding temp file limit

Is it possible to add the current temp file limit to the message? For
example,

ERROR:  aborting due to exceeding temp file limit 1kB

I know the current setting of temp_file_limit can be viewed in other
ways, but I think this will make an admin's or application developer's
life a little bit easier.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [HACKERS] Full GUID support

2011-07-14 Thread Hiroshi Saito

Hi Thomas-san, Ralf-san.

I appreciate your great work.
Thanks!

CC to Postgres-ML.

Regards,
Hiroshi Saito

(2011/07/14 3:49), Thomas Lotterer wrote:
 Thanks for the hint.
 Our ftp daemon is dumping core.
 We are debugging ...




[HACKERS] help with sending email

2011-07-14 Thread Fernando Acosta Torrelly
Hi everybody: 

 

I am using pgmail to send email in an application, but I would like to use
HTML format.

Does anybody have an example of how to do this, or what do you recommend I
use for doing this?

 

Thanks in advance for your attention. 

 

Best Regards,

 

Fernando Acosta 

Lima - Perú



Re: [HACKERS] Reduced power consumption in WAL Writer process

2011-07-14 Thread Fujii Masao
On Thu, Jul 14, 2011 at 5:39 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Wed, Jul 13, 2011 at 10:56 PM, Peter Geoghegan pe...@2ndquadrant.com 
 wrote:
 Attached is a patch for the WAL writer that removes its tight polling
 loop (which probably doesn't get hit often in practice, as we just
 sleep if wal_writer_delay is under a second) and, at least
 potentially, reduces power consumption when idle by using a latch.

 I will break all remaining power consumption work down into
 per-auxiliary process patches. I think that this is appropriate - if
 we hit a snag on one of the processes, there is no need to have that
 hold up everything.

 I've commented that we handle all expected signals, and therefore we
 shouldn't worry about having timeout invalidated by signals, just as
 with the archiver. Previously, we didn't even worry about Postmaster
 death within the tight polling loop, presumably because
 wal_writer_delay is typically small enough to avoid that being a
 problem. I thought that WL_POSTMASTER_DEATH might be superfluous here,
 but then again there is a codepath specifically for the case where
 wal_writer_delay exceeds one second, so it is included in this initial
 version.

 Comments?

 ISTM that this in itself isn't enough to reduce power consumption.

 Currently the only people that use WALWriter are asynchronous commits,
 so we should include within RecordTransactionCommit() a SetLatch()
 command for the WALWriter.

 That way we can have WALWriter sleep until it's needed.

+1

Currently walwriter might write out the WAL before a transaction commits.
IOW, walwriter tries to write out the WAL in wal_buffers on every wakeup.
This might be useful for a long transaction which generates lots of WAL
records before commit. So should we call SetLatch() in XLogInsert() instead
of RecordTransactionCommit()? Though I'm not sure how much walwriter
improves the performance of the synchronous commit case...

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] WIP: Fast GiST index build

2011-07-14 Thread Alexander Korotkov
On Thu, Jul 14, 2011 at 12:42 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 Pinning a buffer that's already in the shared buffer cache is cheap, I
 doubt you're gaining much by keeping the private hash table in front of the
 buffer cache.

Yes, I see. Pinning a lot of buffers doesn't give significant benefits but
produces a lot of problems. Also, removing the hash table can simplify the code.

Also, it's possible that not all of the subtree is actually required during
 the emptying, so in the worst case pre-loading them is counter-productive.

What do you think about pre-fetching all of the subtree? It requires actually
loading level_step - 1 levels of it. In some cases it can still be
counter-productive, but it is probably productive on average?


 Well, what do you do if you deem that shared_buffers is too small? Fall
 back to the old method? Also, shared_buffers is shared by all backends, so
 you can't assume that you get to use all of it for the index build. You'd
 need a wide safety margin.

I assumed we would check whether there is enough of shared_buffers before
switching to the buffering method. But concurrent backends make this method
unsafe.

There are other difficulties with concurrent backends: it would be nice to
estimate the effective cache usage of other backends before switching to the
buffering method. If we don't take care of that, we might fail to switch to
the buffering method in cases where it could give a significant benefit.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Reduced power consumption in WAL Writer process

2011-07-14 Thread Simon Riggs
On Thu, Jul 14, 2011 at 9:57 AM, Fujii Masao masao.fu...@gmail.com wrote:

 Currently walwriter might write out the WAL before a transaction commits.
 IOW, walwriter tries to write out the WAL in wal_buffers in every wakeups.
 This might be useful for long transaction which generates lots of WAL
 records before commit. So we should call SetLatch() in XLogInsert() instead
 of RecordTransactionCommit()? Though I'm not sure how much walwriter
 improves the performance of synchronous commit case..

Yeh, we did previously have a heuristic to write out the WAL when it
was more than half full. Not sure I want to put exactly that code back
into such a busy code path.

I suggest that we set latch every time the wal buffers wrap.

So at the bottom of AdvanceXLInsertBuffer(), if nextidx == 0 then
SetLatch on the WALWriter.

That's a simple test, and we only check it when we switch WAL buffer page.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-07-14 Thread Alexander Korotkov
On Thu, Jul 14, 2011 at 12:56 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 First, notice that we're setting ptr->parent = top. 'top' is the current
 node we're processing, and ptr represents the node to the right of the
 current node. The current node is *not* the parent of the node to the right.
 I believe that line should be ptr->parent = top->parent.

I think the same.


 Second, we're adding the entry for the right sibling to the end of the list
 of nodes to visit. But when we process entries from the list, we exit
 immediately when we see a leaf page. That means that the right sibling can
 get queued up behind leaf pages, and thus never visited.

I think a possible solution is to visit the right sibling immediately after
the current page. Thus, this code fragment should look like this:


if (top->parent && XLByteLT(top->parent->lsn, GistPageGetOpaque(page)->nsn) &&
    GistPageGetOpaque(page)->rightlink != InvalidBlockNumber /* sanity check */ )
{
    /* page split while we were thinking... */
    ptr = (GISTInsertStack *) palloc0(sizeof(GISTInsertStack));
    ptr->blkno = GistPageGetOpaque(page)->rightlink;
    ptr->childoffnum = InvalidOffsetNumber;
    ptr->parent = top->parent;
    ptr->next = top->next;
    top->next = ptr;
    if (tail == top)
        tail = ptr;
}


--
With best regards,
Alexander Korotkov.


[HACKERS] Understanding GIN posting trees

2011-07-14 Thread Heikki Linnakangas

I have a couple of questions on GIN:

The code seems to assume that it's possible for the same TID to appear 
twice for a single key (see addItemPointersToTuple()). I understand that 
it's possible for a single heap tuple to contain the same key twice. For 
example if you index an array of integers like [1,2,1]. But once you've 
inserted all the keys for a single heap item, you never try to insert 
the same TID again, so no duplicates should occur.


Looking at the history, it looks like pre-8.4 we assumed that no such 
duplicates are possible. Duplicates of a single key for one column are 
eliminated in extractEntriesSU(), but apparently when the multi-column 
support was added, we didn't make the de-duplication run across the 
keys extracted from all columns. Now that the posting tree/list 
insertion code has to deal with duplicates anyway, the de-duplication 
performed in extractEntriesSU() seems pointless. But I wonder if it 
would be better to make extractEntriesSU() remove duplicates across all 
columns, so that the insertion code wouldn't need to deal with duplicates.


Dealing with the duplicates in the insertion code isn't particularly 
difficult. And in fact, now that we only support the getbitmap method, 
we wouldn't really need to eliminate duplicates anyway. But I have an 
ulterior motive:


Why is the posting tree a tree? AFAICS, we never search it using the 
TID, it's always scanned in whole. It would be simpler to store the TIDs 
in a posting list in no particular order. This could potentially make 
insertions cheaper, as you could just append to the last posting list 
page for the key, instead of traversing the posting tree to a particular 
location. You could also pack the tids denser, as you wouldn't need to 
reserve free space for additions in the middle.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Patch Review: Bugfix for XPATH() if text or attribute nodes are selected

2011-07-14 Thread Radosław Smogura

On Thu, 14 Jul 2011 15:15:33 +0300, Peter Eisentraut wrote:

On sön, 2011-07-10 at 11:40 -0700, Josh Berkus wrote:

Hackers,

 B. 6. Current behaviour _is intended_ (there is an if to check
node type) and _natural_. In this particular case the user asks for text
content of some node, and this content is actually .


 I don't buy that. The check for the node type is there because
 two different libxml functions are used to convert nodes to
 strings. The if has absolutely *zero* to do with escaping, except
 for that missing escape_xml() call in the else case.

 Secondly, there is little point in having a type XML if we
 don't actually ensure that values of that type can only contain
 well-formed XML.

Can anyone else weigh in on this? Peter?


Looks like a good change to me.
I'll bump it in a few hours, as I can't recall the password for my keyring.
Now my hands are clean and it's not my business to care about this.


Best regards.
Radek.




Re: [HACKERS] proposal: a validator for configuration files

2011-07-14 Thread Alexey Klyukin

On Jul 14, 2011, at 4:38 AM, Alvaro Herrera wrote:

 Excerpts from Florian Pflug's message of Wed Jul 13 20:12:28 -0400 2011:
 On Jul14, 2011, at 01:38 , Alvaro Herrera wrote:
 One strange thing here is that you could get two such messages; say if a
 file has 100 parse errors and there are also valid lines that contain
 bogus settings (foo = bar).  I don't find this to be too problematic,
 and I think fixing it would be excessively annoying.
 
 For example, a bogus run would end like this:
 
 95 LOG:  syntax error in file /pgsql/install/HEAD/data/postgresql.conf 
 line 4, near end of line
 96 LOG:  syntax error in file /pgsql/install/HEAD/data/postgresql.conf 
 line 41, near end of line
 97 LOG:  syntax error in file /pgsql/install/HEAD/data/postgresql.conf 
 line 104, near end of line
 98 LOG:  syntax error in file /pgsql/install/HEAD/data/postgresql.conf 
 line 156, near end of line
 99 LOG:  syntax error in file /pgsql/install/HEAD/data/postgresql.conf 
 line 208, near end of line
 100 LOG:  syntax error in file /pgsql/install/HEAD/data/postgresql.conf 
 line 260, near end of line
 101 LOG:  too many errors found, stopped processing file 
 /pgsql/install/HEAD/data/postgresql.conf
 102 LOG:  unrecognized configuration parameter plperl.err
 103 LOG:  unrecognized configuration parameter this1
 104 LOG:  too many errors found, stopped processing file 
 /pgsql/install/HEAD/data/postgresql.conf
 105 FATAL:  errors detected while parsing configuration files
 
 How about changing ParseConfigFile to say too many *syntax* errors found
 instead? It'd be more precise, and we wouldn't emit exactly the
 same message twice.
 
 Yeah, I thought about doing it that way but refrained because it'd be
 one more string to translate.  That's a poor reason, I admit :-)  I'll
 change it.

This is happening because the check on the total number of errors so far
happens only after coming across at least one unrecognized configuration
option.  What about adding one more check right after ParseConfigFile, so we
can bail out early when overwhelmed with syntax errors? This would save a line
in translation :).

 
 Do you want me to take a closer look at your modified version of the
 patch before you commit, or did you post it more as a FYI, this is
 how it's going to look like?
 
 I know I'd feel more comfortable if you (and Alexey, and Selena) gave it
 another look :-)

I have checked it here and don't see any more problems with it.

--
Command Prompt, Inc.  http://www.CommandPrompt.com
PostgreSQL Replication, Consulting, Custom Development, 24x7 support






Re: [HACKERS] patch for distinguishing PG instances in event log

2011-07-14 Thread Magnus Hagander
2011/5/26 MauMau maumau...@gmail.com:
 Hello,

 I wrote and attached a patch for the TODO item below (which I proposed).

 Allow multiple Postgres clusters running on the same machine to distinguish
 themselves in the event log
 http://archives.postgresql.org/pgsql-hackers/2011-03/msg01297.php
 http://archives.postgresql.org/pgsql-hackers/2011-05/msg00574.php

 I changed two things from the original proposal.

 1. regsvr32.exe needs /n when you specify event source
 I described the reason in src/bin/pgevent/pgevent.c.

 2. I moved the article for event log registration to more suitable place
 The traditional place and what I originally proposed were not best, because
 those who don't build from source won't read those places.

 I successfully tested event log registration/unregistration, event logging
 with/without event_source parameter, and SHOWing event_source parameter with
 psql on Windows Vista (32-bit). I would appreciate it if someone who has a
 64-bit environment could test it on 64-bit Windows.

 I'll add this patch to the first CommitFest of 9.2. Thank you in advance for
 reviewing it.

+<para>
+ On Windows, you need to register an event source
+ and its library with the operating system in order
+ to make use of the <systemitem>eventlog</systemitem> option for
+ <varname>log_destination</>.
+ See <xref linkend="event-log-registration"> for details.
+</para>

* This part is not strictly correct - you don't *need* to do that, it
just makes things look nicer, no?

* Also, what is the use for set_eventlog_parameters()? It's just a
string variable; it should work without it.

* These days we avoid #ifdef'ing GUCs just because they are not available
on a given platform - so the list stays consistent. The GUC should be
available on non-Windows platforms as well.

* The guc also needs to go in postgresql.conf.sample

* We never build in unicode mode, so all those checks are unnecessary.

* Are we really allowed to call MessageBox in DllRegisterServer?
Won't that break badly in cases like silent installs?

Attached is an updated patch, which doesn't work yet. I believe the
changes to the backend are correct, but probably some of the cleanups
and changes in the dll are incorrect, because I seem to be unable to
register either the default or a custom handler so far.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 842558d..583a5c9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2975,6 +2975,13 @@ local0.*    /var/log/postgresql
  to the <application>syslog</application> daemon's configuration file
  to make it work.
 </para>
+<para>
+ On Windows, you need to register an event source
+ and its library with the operating system in order
+ to make use of the <systemitem>eventlog</systemitem> option for
+ <varname>log_destination</>.
+ See <xref linkend="event-log-registration"> for details.
+</para>
 </note>
    </listitem>
   </varlistentry>
@@ -3221,6 +3228,24 @@ local0.*    /var/log/postgresql
 </listitem>
   </varlistentry>
 
+ <varlistentry id="guc-event-source" xreflabel="event_source">
+  <term><varname>event_source</varname> (<type>string</type>)</term>
+  <indexterm>
+   <primary><varname>event_source</> configuration parameter</primary>
+  </indexterm>
+   <listitem>
+    <para>
+     When logging to <application>event log</> is enabled, this parameter
+     determines the program name used to identify
+     <productname>PostgreSQL</productname> messages in
+     <application>event log</application>. The default is
+     <literal>PostgreSQL</literal>.
+     This parameter can only be set in the <filename>postgresql.conf</>
+     file or on the server command line.
+    </para>
+   </listitem>
+  </varlistentry>
+
   </variablelist>
 </sect2>
  <sect2 id="runtime-config-logging-when">
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 0410cff..41b9009 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -1552,19 +1552,6 @@ PostgreSQL, contrib and HTML documentation successfully made. Ready to install.
   </procedure>
 
   <formalpara>
-   <title>Registering <application>eventlog</> on <systemitem
-   class="osname">Windows</>:</title>
-   <para>
-    To register a <systemitem class="osname">Windows</> <application>eventlog</>
-    library with the operating system, issue this command after installation:
-<screen>
-<userinput>regsvr32 <replaceable>pgsql_library_directory</>/pgevent.dll</userinput>
-</screen>
-    This creates registry entries used by the event viewer.
-   </para>
-  </formalpara>
-
-  <formalpara>
   <title>Uninstallation:</title>
   <para>
    To undo the installation use the command <command>gmake
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index ef83206..bfbb641 100644
--- a/doc/src/sgml/runtime.sgml
+++ 
+++ 

Re: [HACKERS] SAVEPOINTs and COMMIT performance

2011-07-14 Thread Simon Riggs
On Mon, Jun 6, 2011 at 10:33 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 06.02.2011 23:09, Simon Riggs wrote:

 On Sun, 2011-02-06 at 12:11 -0500, Bruce Momjian wrote:

 Did this ever get addressed?

 Patch attached.

 Seems like the easiest fix I can come up with.

 @@ -2518,7 +2518,7 @@ CommitTransactionCommand(void)
                case TBLOCK_SUBEND:
                        do
                        {
 -                               CommitSubTransaction();
 +                               CommitSubTransaction(true);
                                 s = CurrentTransactionState;    /* changed by pop */
                        } while (s->blockState == TBLOCK_SUBEND);
                        /* If we had a COMMIT command, finish off the main xact too */

 We also get into this codepath at RELEASE SAVEPOINT, in which case it is
 wrong to not reassign the locks to the parent subtransaction.

Attached patch splits TBLOCK_SUBEND state into two new states:
TBLOCK_SUBCOMMIT and TBLOCK_SUBRELEASE, so that we can do the right
thing.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


savepoint_commit_performance.v2.patch
Description: Binary data



Re: [HACKERS] Single pass vacuum - take 1

2011-07-14 Thread Pavan Deolasee
On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs si...@2ndquadrant.com wrote:



 Hi Pavan,

 I'd say that seems way too complex for such a small use case and we've
 only just fixed the bugs from 8.4 vacuum map complexity. The code's
 looking very robust now and I'm uneasy that such changes are really
 worth it.


Thanks Simon for looking at the patch.

I am not sure if the use case is really narrow. Today, we dirty the pages in
both passes and also emit WAL records. Just the heap scan can take a
very long time for large tables, blocking the autovacuum worker threads from
doing useful work on other tables. If I am not wrong, we use ring buffers
for vacuum, which would most likely force those buffers to be written/read
twice to/from the disk.

Which part of the patch do you think is very complex? We can try to simplify
that. Or are you seeing any obvious bugs that I missed? IMHO, taking out a
phase completely from vacuum (as this patch does) can simplify things.



 You're trying to avoid Phase 3, the second pass on the heap. Why not
 avoid the write in Phase 1 if its clear that we'll need to come back
 again in Phase 3? So we either do a write in Phase 1 or in Phase 3,
 but never both? That minimises the writes, which are what hurt the
 most.


You can possibly do the work in Phase 3, but that doesn't avoid the
second scan.


 We can reduce the overall cost simply by not doing Phase 2 and Phase 3
 if the number of rows to remove is too few, say  1%.


If you have set the vacuum parameters such that it kicks in when there are,
say, 5% updates/deletes, you would most likely have that much work to do
anyway.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] Single pass vacuum - take 1

2011-07-14 Thread Simon Riggs
On Tue, Jul 12, 2011 at 9:47 PM, Pavan Deolasee
pavan.deola...@gmail.com wrote:

 http://archives.postgresql.org/pgsql-hackers/2011-05/msg01119.php
 PFA a patch which implements the idea with some variation.
 At the start of the first pass, we remember the current LSN. Every page that
 needs some work is HOT-pruned so that dead tuples are truncated to dead line
 pointers. We collect those dead line pointers and mark them as
 dead-vacuumed. Since we don't have any LP flag bits available, we instead
 just use the LP_DEAD flag along with offset value 1 to mark the line pointer
 as dead-vacuumed. The page is defragmented and we store the LSN remembered
 at the start of the pass in the page special area as vacuum LSN. We also
 update the free space at that point because we are not going to do a second
 pass on the page anymore.

 Once we collect all dead line pointers and mark them as dead-vacuumed, we
 clean-up the indexes and remove all index pointers pointing to those
 dead-vacuumed line pointers. If the index vacuum finishes successfully, we
 store the LSN in the pg_class row of the table (needs catalog changes). At
 that point, we are certain that there are no index pointers pointing to
 dead-vacuumed line pointers and they can be reclaimed at the next
 opportunity.

 During normal operations or subsequent vacuum, if the page is chosen for
 HOT-pruning, we check if it has any dead-vacuumed line pointers and if the
 vacuum LSN stored in the page special area is equal to the one stored in the
 pg_class row, and reclaim those dead-vacuumed line pointers (the index
 pointers to these line pointers are already taken care of). If the pg_class
 LSN is not the same, the last vacuum probably did not finish completely and
 we collect the dead-vacuumed line pointers just like other dead line pointers
 and try to clean up the index pointers as usual.
 I ran a few pgbench tests with the patch. I don't see much difference in the
 overall tps, but the vacuum time for the accounts table reduces by nearly
 50%. Nor do I see much difference in the overall bloat, but then pgbench
 uses HOT very nicely and the accounts table got only a couple of vacuum cycles
 in my 7-8 hour run.
 There are a couple of things that probably need more attention. I am not sure
 if we need to teach ANALYZE to treat dead line pointers differently. Since
 they take up much less space than a dead tuple, they should definitely have
 a lower weight, but at the same time, we need to take into account the
 number of indexes on the table. The start-of-first-pass LSN that we are
 remembering is in fact the start of the WAL page, and I think there could be
 some issues with that, especially for very tiny tables. For example, a first
 vacuum may run completely. If another vacuum is started on the same table
 and say it gets the same LSN (because we did not write more than 1 page
 worth of WAL in between) and if the second vacuum aborts after it cleaned up
 a few pages, we might get into some trouble. The likelihood of such things
 happening is very small, but maybe it's worth taking care of. Maybe we
 can get the exact current LSN and not store it in pg_class if we don't
 do anything during the cycle.
 Comments ?

Hi Pavan,

I'd say that seems way too complex for such a small use case and we've
only just fixed the bugs from 8.4 vacuum map complexity. The code's
looking very robust now and I'm uneasy that such changes are really
worth it.

You're trying to avoid Phase 3, the second pass on the heap. Why not
avoid the write in Phase 1 if its clear that we'll need to come back
again in Phase 3? So we either do a write in Phase 1 or in Phase 3,
but never both? That minimises the writes, which are what hurt the
most.

We can reduce the overall cost simply by not doing Phase 2 and Phase 3
if the number of rows to remove is too few, say  1%.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Patch Review: Bugfix for XPATH() if text or attribute nodes are selected

2011-07-14 Thread Peter Eisentraut
On ons, 2011-07-13 at 11:58 +0200, Nicolas Barbier wrote:
 2011/6/29, Florian Pflug f...@phlo.org:
 
  Secondly, there is little point in having a type XML if we
  don't actually ensure that values of that type can only contain
  well-formed XML.
 
 +1. The fact that XPATH() must return a type that cannot depend on the
 given expression (even if it is a constant string) may be unfortunate,
 but returning XML-that-is-not-quite-XML sounds way worse to me.

The example given was

XPATH('/*/text()', '<root>&lt;</root>')

This XPath expression returns a node set, and XML is a serialization
format of a node, so returning xml[] in this particular case seems
entirely reasonable to me.




Re: [HACKERS] Patch Review: Bugfix for XPATH() if text or attribute nodes are selected

2011-07-14 Thread Peter Eisentraut
On sön, 2011-07-10 at 11:40 -0700, Josh Berkus wrote:
 Hackers,
 
  B. 6. Current behaviour _is intended_ (there is an if to check node type) 
  and _natural_. In this particular case the user asks for text content of some 
  node, and this content is actually .
  
  I don't buy that. The check for the node type is there because
  two different libxml functions are used to convert nodes to
  strings. The if has absolutely *zero* to do with escaping, except
  for that missing escape_xml() call in the else case.
  
  Secondly, there is little point in having a type XML if we
  don't actually ensure that values of that type can only contain
  well-formed XML.
 
 Can anyone else weigh in on this? Peter?

Looks like a good change to me.




Re: [HACKERS] Reduced power consumption in WAL Writer process

2011-07-14 Thread Heikki Linnakangas

On 14.07.2011 12:42, Simon Riggs wrote:

On Thu, Jul 14, 2011 at 9:57 AM, Fujii Masaomasao.fu...@gmail.com  wrote:


Currently walwriter might write out the WAL before a transaction commits.
IOW, walwriter tries to write out the WAL in wal_buffers in every wakeups.
This might be useful for long transaction which generates lots of WAL
records before commit. So we should call SetLatch() in XLogInsert() instead
of RecordTransactionCommit()? Though I'm not sure how much walwriter
improves the performance of synchronous commit case..


Yeh, we did previously have a heuristic to write out the WAL when it
was more than half full. Not sure I want to put exactly that code back
into such a busy code path.

I suggest that we set latch every time the wal buffers wrap.

So at the bottom of AdvanceXLInsertBuffer(), if nextidx == 0 then
SetLatch on the WALWriter.

That's a simple test and we only check it if we're switch WAL buffer page.


That was my first thought too - but I wonder if that's too aggressive? A 
backend that does for example a large bulk load will cycle through the 
buffers real quick. It seems like a bad idea to wake up walwriter 
between each buffer in that case. Then again, setting a latch that's 
already set is cheap, so maybe it works fine in practice.


Maybe it would be better to do it less frequently, say, every time you 
switch to new WAL segment. Or every 10 buffers or something like that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Reduced power consumption in WAL Writer process

2011-07-14 Thread Simon Riggs
On Thu, Jul 14, 2011 at 10:53 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 14.07.2011 12:42, Simon Riggs wrote:

 On Thu, Jul 14, 2011 at 9:57 AM, Fujii Masao masao.fu...@gmail.com
  wrote:

 Currently walwriter might write out the WAL before a transaction commits.
 IOW, walwriter tries to write out the WAL in wal_buffers on every
 wakeup.
 This might be useful for a long transaction which generates lots of WAL
 records before commit. So we should call SetLatch() in XLogInsert()
 instead
 of RecordTransactionCommit()? Though I'm not sure how much walwriter
 improves the performance of the synchronous commit case...

 Yeh, we did previously have a heuristic to write out the WAL when it
 was more than half full. Not sure I want to put exactly that code back
 into such a busy code path.

 I suggest that we set latch every time the wal buffers wrap.

 So at the bottom of AdvanceXLInsertBuffer(), if nextidx == 0 then
 SetLatch on the WALWriter.

 That's a simple test, and we only check it when we're switching WAL buffer pages.

 That was my first thought too - but I wonder if that's too aggressive? A
 backend that does for example a large bulk load will cycle through the
 buffers real quick. It seems like a bad idea to wake up walwriter between
 each buffer in that case. Then again, setting a latch that's already set is
 cheap, so maybe it works fine in practice.

 Maybe it would be better to do it less frequently, say, every time you
 switch to new WAL segment. Or every 10 buffers or something like that.

Yes, that's roughly what I'm saying. nextidx == 0 occurs just after we've
wrapped wal_buffers, i.e. we only wake up the WALWriter every
wal_buffers pages.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Single pass vacuum - take 1

2011-07-14 Thread Heikki Linnakangas

On 14.07.2011 18:57, Pavan Deolasee wrote:

On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs si...@2ndquadrant.com wrote:

I'd say that seems way too complex for such a small use case and we've
only just fixed the bugs from 8.4 vacuum map complexity. The code's
looking very robust now and I'm uneasy that such changes are really
worth it.


Thanks Simon for looking at the patch.

I am not sure if the use case is really narrow. Today, we dirty the pages in
both passes and also emit WAL records. Just the heap scan can take a
very long time for large tables, blocking the autovacuum workers from
doing useful work on other tables. If I am not wrong, we use ring buffers
for vacuum, which would most likely force those buffers to be written/read
twice to disk.


Seems worthwhile to me. What bothers me a bit is the need for the new 
64-bit LSN value on each heap page. Also, note that temporary tables are 
not WAL-logged, so there are no LSNs.


How does this interact with the visibility map? If you set the 
visibility map bit after vacuuming indexes, a subsequent vacuum will not 
visit the page. The second vacuum will update relindxvacxlogid/off, but 
it will not clean up the dead line pointers left behind by the first 
vacuum. Now the LSN on the page differs from the one stored in pg_class, 
so subsequent pruning will not remove the dead line pointers either. I 
think you can sidestep that if you check that the page's vacuum LSN <= 
the vacuum LSN in pg_class, instead of testing for equality.


Ignoring the issue stated in the previous paragraph, I think you wouldn't 
actually need a 64-bit LSN. A smaller counter is enough, as wrap-around 
doesn't matter. In fact, a single bit would be enough. After a 
successful vacuum, the counter on each heap page (with dead line 
pointers) is N, and the value in pg_class is N. There are no other 
values on the heap, because vacuum will have cleaned them up. When you 
begin the next vacuum, it will stamp pages with N+1. So at any stage, 
there is only one of two values on any page, so a single bit is enough. 
(But as I said, that doesn't hold if vacuum skips some pages thanks to 
the visibility map)


Is there something in place to make sure that pruning uses an up-to-date 
relindxvacxlogid/off value? I guess it doesn't matter if it's 
out-of-date, you'll just miss the opportunity to remove some dead tuples.


Seems odd to store relindxvacxlogid/off as two int32 columns. Store it 
in one uint64 column, or invent a new datatype for LSNs, or store it as 
text in %X/%X format.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread David E. Wheeler
On Jul 13, 2011, at 12:57 PM, Kevin Grittner wrote:

 create or replace function relistemp(rel pg_class)
  returns boolean language sql immutable strict as
  $$select $1.relpersistence = 't';$$;
 
 Just don't forget to use the table name or alias in front of it... :-)

Oh, nice hack. How far back does that work (pgTAP runs on 8.0 and higher)?

Thanks,

David




Re: [HACKERS] Full GUID support

2011-07-14 Thread David E. Wheeler
On Jul 14, 2011, at 12:05 AM, Hiroshi Saito wrote:

 Hi Thomas-san, Ralf-san.
 
 I appreciate your great work.
 Thanks!
 
 CC to Postgres-ML.
 
 Regards,
 Hiroshi Saito
 
 (2011/07/14 3:49), Thomas Lotterer wrote:
  Thanks for the hint.
  Our ftp daemon is dumping core.
  We are debugging ...

Ah, good news, thanks.

Where should I report stuff like this in the future? I sent a message about 
this to r...@engelschall.com on May 15th and also a Twitter DM but didn't hear 
back…

Thanks,

David




Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Heikki Linnakangas

On 14.07.2011 19:51, David E. Wheeler wrote:

On Jul 13, 2011, at 12:57 PM, Kevin Grittner wrote:


create or replace function relistemp(rel pg_class)
  returns boolean language sql immutable strict as
  $$select $1.relpersistence = 't';$$;

Just don't forget to use the table name or alias in front of it... :-)


Oh, nice hack. How far back does that work (pgTAP runs on 8.0 and higher)?


Far back. But you only need it in >= 9.1. Older versions have the 
pg_class.relistemp column anyway.


Not sure how this helps, though. If you modify pgTAP to install that 
function automatically when dealing with a new server version, you might 
as well modify its queries to use relpersistence = 't' directly when 
dealing with a new server version. It works as a manual work-around if 
you can't upgrade pgTAP, I guess.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Kevin Grittner
David E. Wheeler da...@kineticode.com wrote:
 On Jul 13, 2011, at 12:57 PM, Kevin Grittner wrote:
 
 create or replace function relistemp(rel pg_class)
  returns boolean language sql immutable strict as
  $$select $1.relpersistence = 't';$$;
 
 Just don't forget to use the table name or alias in front of
 it... :-)
 
 Oh, nice hack. How far back does that work (pgTAP runs on 8.0 and
 higher)?
 
As far as I know, the technique of creating a function with a record
type as its only parameter to use as a generated column goes way
back.  This particular function won't work prior to 9.1, because you
won't have the relpersistence column, but then, prior to 9.1 you
wouldn't need to run this because you already have a relistemp
column.
 
-Kevin



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread David E. Wheeler
On Jul 14, 2011, at 9:55 AM, Heikki Linnakangas wrote:

 Far back. But you only need it in >= 9.1. Older versions have the 
 pg_class.relistemp column anyway.

Yeah.

 Not sure how this helps, though. If you modify pgTAP to install that 
 automatically in pgTAP when dealing with a new server version, you might as 
 well modify its queries to use relpersistence = 't' directly when dealing 
 with a new server version. It works as a manual work-around if you can't 
 upgrade pgTAP, I guess.

Yeah, that's what I'd rather avoid. I'll probably have to modify the function 
that makes the call to look at the version number. Annoying, but do-able.

  https://github.com/theory/pgtap/blob/master/sql/pgtap.sql.in#L5894

Best,

David




Re: [HACKERS] Need help understanding pg_locks

2011-07-14 Thread Bruce Momjian
Florian Pflug wrote:
 On Jul13, 2011, at 21:08 , Bruce Momjian wrote:
  -   OID of the database in which the object exists, or
  -   zero if the object is a shared object, or
  -   null if the lock object is on a transaction ID
  +   OID of the database in which the lock target exists, or
  +   zero if the lock is a shared object, or
  +   null if the lock is on a transaction ID
 
 This sounds good.
 
  +   OID of the relation lock target, or null if the lock is not
 on a relation or part of a relation
 
 That, however, not so much. relation lock target might easily
 be interpreted as the relation's lock target or the
 relation lock's target - at least by non-native speakers such
 as myself. The same is true for transaction lock target and
 friends.
 
 Can't we simply go with Locked relation, Locked transaction id
 and so on (as in my versions B,C and D up-thread)? I can't really
 get excited about the slight imprecision caused by the fact that some
 rows describe aspiring lock holders instead of current lock holders.
 The existence of the granted column makes the situation pretty clear.
 
 Plus, it's technically not even wrong - a process is waiting because
 somebody else *is* actually holding a lock on the object. So
 the tuple/transaction/... is, in fact, a Locked tuple/transaction/...

I think it will be very confusing to have locked refer to the person
holding the lock while the row is based on who is waiting for it.

I reworded that line to:

+   OID of the relation of the lock target, or null if the lock is not

Update patch attached.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c5851af..84c2257 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6928,9 +6928,9 @@
       <entry><type>oid</type></entry>
       <entry><literal><link linkend=catalog-pg-database><structname>pg_database</structname></link>.oid</literal></entry>
       <entry>
-       OID of the database in which the object exists, or
-       zero if the object is a shared object, or
-       null if the lock object is on a transaction ID
+       OID of the database in which the lock target exists, or
+       zero if the lock is a shared object, or
+       null if the lock is on a transaction ID
       </entry>
      </row>
      <row>
@@ -6938,7 +6938,7 @@
       <entry><type>oid</type></entry>
       <entry><literal><link linkend=catalog-pg-class><structname>pg_class</structname></link>.oid</literal></entry>
       <entry>
-       OID of the relation, or null if the lock object is not
+       OID of the relation of the lock target, or null if the lock is not
        on a relation or part of a relation
       </entry>
      </row>
@@ -6947,7 +6947,7 @@
       <entry><type>integer</type></entry>
       <entry></entry>
       <entry>
-       Page number within the relation, or null if the lock object
+       Page number within the relation, or null if the lock
        is not on a tuple or relation page
       </entry>
      </row>
@@ -6956,7 +6956,7 @@
       <entry><type>smallint</type></entry>
       <entry></entry>
       <entry>
-       Tuple number within the page, or null if the lock object is not
+       Tuple number within the page, or null if the lock is not
        on a tuple
       </entry>
      </row>
@@ -6965,7 +6965,7 @@
       <entry><type>text</type></entry>
       <entry></entry>
       <entry>
-       Virtual ID of a transaction lock, or null if the lock object is not
+       Virtual ID of a transaction lock target, or null if the lock is not
        on a virtual transaction ID
       </entry>
      </row>
@@ -6974,7 +6974,7 @@
       <entry><type>xid</type></entry>
       <entry></entry>
       <entry>
-       ID of a transaction lock, or null if the lock object is not on a transaction ID
+       ID of a transaction lock target, or null if the lock is not on a transaction ID
       </entry>
      </row>
      <row>
@@ -6982,8 +6982,8 @@
       <entry><type>oid</type></entry>
       <entry><literal><link linkend=catalog-pg-class><structname>pg_class</structname></link>.oid</literal></entry>
       <entry>
-       OID of the system catalog containing the object, or null if the
-       lock object is not on a general database object.
+       OID of the system catalog containing the lock target, or null if the
+       lock is not on a general database object.
       </entry>
      </row>
      <row>
@@ -6992,7 +6992,7 @@
       <entry>any OID column</entry>
       <entry>
        OID of the object within its system catalog, or null if the
-       lock object is not on a general database object.
+       lock is not on a general database object.
        For advisory locks it is used to distinguish the two key
        spaces (1 for an int8 key, 2 for two int4 keys).
       </entry>
@@ -7005,7 +7005,7 @@
        For a table column, this is the column number (the
        <structfield>classid</> and <structfield>objid</> refer to the
  

Re: [HACKERS] Re: patch review : Add ability to constrain backend temporary file space

2011-07-14 Thread Mark Kirkwood

On 14/07/11 18:48, Tatsuo Ishii wrote:


Hi I am the new reviewer:-)

I have looked into the v6 patches. One thing I would like to suggest
is enhancing the error message when temp_file_limit is exceeded.

ERROR:  aborting due to exceeding temp file limit

Is it possible to add the current temp file limit to the message? For
example,

ERROR:  aborting due to exceeding temp file limit 1kB

I know the current setting of temp_file_limit can be viewed in other
ways, but I think this will make an admin's or application developer's
life a little bit easier.


Hi Tatsuo,

Yeah, good suggestion - I agree that it would be useful to supply extra 
detail; I'll amend and resubmit a new patch (along with whatever review 
modifications we come up with in the meantime)!


Cheers

Mark



Re: [HACKERS] help with sending email

2011-07-14 Thread Robert Haas
On Jul 13, 2011, at 12:29 PM, Fernando Acosta Torrelly fgaco...@gmail.com 
wrote:
 Hi everybody:
 
 I am using pgmail to send email in an application, but I would like to use 
 html format
 
 Does anybody have an example of how to do this? Or what do you recommend I 
 use for doing this?
 
 Thanks in advance for your attention.
 
This is the mailing list for PostgreSQL, not pgmail.  And it is also a 
development mailing list; there are others for user questions.

...Robert

Re: [HACKERS] [BUG] SSPI authentication fails on Windows when server parameter is localhost or domain name

2011-07-14 Thread Magnus Hagander
On Wed, Jun 15, 2011 at 10:53, Ahmed Shinwari ahmed.shinw...@gmail.com wrote:
 Hi All,

 I faced a bug on Windows while connecting via SSPI authentication. I was
 able to find the bug and have attached the patch. Details listed below;

 Postgres Installer: Version 9.0.4
 OS: Windows Server 2008 R2/Windows 7

big snip

Thanks - great analysis!

However, I think there is a better fix for this - simply moving a }
one line. In particular, I'm concerned about passing the same pointer
both as input and output to the function - I couldn't find anything in
the documentation saying this was safe (nor did I find anything saying
it's unsafe, but...). Especially since this code clearly behaves
differently on different versions - I've been completely unable to
reproduce this on any of my test machines, but they are all Windows
Server 2003.

So - attached is a new version of the patch, how does this look to
you? FYI, I've had Thom test this new version and it does appear to
work fine in his scenario.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 7799111..936cfea 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -1349,16 +1349,22 @@ pg_SSPI_recvauth(Port *port)
 					  _("could not accept SSPI security context"), r);
 		}
 
+		/*
+		 * Overwrite the current context with the one we just received.
+		 * If sspictx is NULL it was the first loop and we need to allocate
+		 * a buffer for it. On subsequent runs, we can just overwrite the
+		 * buffer contents since the size does not change.
+		 */
 		if (sspictx == NULL)
 		{
 			sspictx = malloc(sizeof(CtxtHandle));
 			if (sspictx == NULL)
 				ereport(ERROR,
 						(errmsg("out of memory")));
-
-			memcpy(sspictx, &newctx, sizeof(CtxtHandle));
 		}
 
+		memcpy(sspictx, &newctx, sizeof(CtxtHandle));
+
 		if (r == SEC_I_CONTINUE_NEEDED)
 			elog(DEBUG4, "SSPI continue needed");
 



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Josh Berkus
All,

BTW, if we're dumping relistemp, we're going to need to notify every
maker of a PostgreSQL admin interface before we release 9.1.

This is why we should have had a complete set of sysviews ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Bruce Momjian
Josh Berkus wrote:
 All,
 
 BTW, if we're dumping relistemp, we're going to need to notify every
 maker of a PostgreSQL admin interface before we release 9.1.
 
 This is why we should have had a complete set of sysviews ...

Are they not testing our 9.1 betas?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Understanding GIN posting trees

2011-07-14 Thread Teodor Sigaev

I have a couple of questions on GIN:

The code seems to assume that it's possible for the same TID to appear
twice for a single key (see addItemPointersToTuple()). I understand that
it's possible for a single heap tuple to contain the same key twice. For
example if you index an array of integers like [1,2,1]. But once you've
inserted all the keys for a single heap item, you never try to insert
the same TID again, so no duplicates should occur.

Looking at the history, it looks like pre-8.4 we assumed that no such
duplicates are possible. Duplicates of a single key for one column are
eliminated in extractEntriesSU(), but apparently when the multi-column
support was added, we didn't make the de-duplication to run across the
keys extracted from all columns. Now that the posting tree/list
insertion code has to deal with duplicates anyway, the de-duplication
performed in extractEntriesSU() seems pointless. But I wonder if it
would be better to make extractEntriesSU() remove duplicates across all
columns, so that the insertion code wouldn't need to deal with duplicates.


During vacuuming of the pending list we could get a power loss, and then some 
data will be in both the tree and the pending list; after restart pgsql will 
try to insert the same data into the tree.



Dealing with the duplicates in the insertion code isn't particularly
difficult. And in fact, now that we only support the getbitmap method,
we wouldn't really need to eliminate duplicates anyway. But I have an
ulterior motive:



Why is the posting tree a tree? AFAICS, we never search it using the
TID, it's always scanned in whole. It would be simpler to store the TIDs
in a posting list in no particular order. This could potentially make
insertions cheaper, as you could just append to the last posting list
page for the key, instead of traversing the posting tree to a particular
location. You could also pack the tids denser, as you wouldn't need to
reserve free space for additions in the middle.
For the consistentFn call we need to collect all data for the current TID. We 
do that by scanning over logically ordered arrays of TIDs, and a tree allows 
us to do that by scanning its leaf pages.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Single pass vacuum - take 1

2011-07-14 Thread Pavan Deolasee
On Thu, Jul 14, 2011 at 12:43 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 On 14.07.2011 18:57, Pavan Deolasee wrote:

 On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs si...@2ndquadrant.com
  wrote:

 I'd say that seems way too complex for such a small use case and we've

 only just fixed the bugs from 8.4 vacuum map complexity. The code's
 looking very robust now and I'm uneasy that such changes are really
 worth it.

  Thanks Simon for looking at the patch.

 I am not sure if the use case is really narrow. Today, we dirty the pages
 in
 both the passes and also emit WAL records. Just the heap scan can take a
 very long time for large tables, blocking the autovacuum worker threads
 from
 doing useful work on other tables. If I am not wrong, we use ring buffers
 for vacuum which would most-likely force those buffers to be written/read
 twice to the disk.


 Seems worthwhile to me. What bothers me a bit is the need for the new
 64-bit LSN value on each heap page. Also, note that temporary tables are not
 WAL-logged, so there's no LSNs.


Yeah, the good thing is we store it only when it's needed. Temp and unlogged
tables don't need any special handling because we don't rely on the WAL
logging for the table itself. As you have noted down-thread, any counter
which is guaranteed not to wrap around would have worked. LSN is just very
handy for the purpose (let me think more about whether we can do with just a flag).


 How does this interact with the visibility map? If you set the visibility
 map bit after vacuuming indexes, a subsequent vacuum will not visit the
 page. The second vacuum will update relindxvacxlogid/off, but it will not
 clean up the dead line pointers left behind by the first vacuum. Now the LSN
 on the page differs from the one stored in pg_class, so subsequent pruning
 will not remove the dead line pointers either. I think you can sidestep that
 if you check that the page's vacuum LSN = vacuum LSN in pg_class, instead
 of equality.


Hmm. We need to look at that carefully. As the patch stands, we don't set the
visibility map bit when there are any dead line pointers on the page. I'm
not sure if it's clear, but the ItemIdIsDead() test accounts for dead as well
as dead-vacuum line pointers, so the test in lazy_heap_scan() won't let the VM
bit be set early. The next vacuum cycle would still look at the page and
set the bit if the page appears all-visible at that time. Note that in the
next vacuum cycle, we would first clear the dead-vacuum line pointers if it's
not already done by some intermediate HOT-prune operation.



 Ignoring the issue stated in previous paragraph, I think you wouldn't
 actually need an 64-bit LSN. A smaller counter is enough, as wrap-around
 doesn't matter. In fact, a single bit would be enough. After a successful
 vacuum, the counter on each heap page (with dead line pointers) is N, and
 the value in pg_class is N. There are no other values on the heap, because
 vacuum will have cleaned them up. When you begin the next vacuum, it will
 stamp pages with N+1. So at any stage, there is only one of two values on
 any page, so a single bit is enough. (But as I said, that doesn't hold if
 vacuum skips some pages thanks to the visibility map)


I am not sure if that can take care of the aborted vacuum case with a single
bit or a counter that can wrap around. Let me think more about it.



 Is there something in place to make sure that pruning uses an up-to-date
 relindxvacxlogid/off value? I guess it doesn't matter if it's out-of-date,
 you'll just miss the opportunity to remove some dead tuples.


Yeah, exactly. We just rely on the value that was read when the pg_class
tuple was last read. If we don't have the up-to-date value, we might miss
some opportunity to clean up those dead-vacuum line pointers.


 Seems odd to store relindxvacxlogid/off as two int32 columns. Store it in
 one uint64 column, or invent a new datatype for LSNs, or store it as text in
 %X/%X format.


Yeah. I don't remember what issues I faced with a uint64 column; maybe I did
not get around the case where uint64 is not natively supported on the
platform. But %X/%X looks very reasonable. Will change to that.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] Need help understanding pg_locks

2011-07-14 Thread Florian Pflug
On Jul14, 2011, at 19:06 , Bruce Momjian wrote:
 Florian Pflug wrote:
 On Jul13, 2011, at 21:08 , Bruce Momjian wrote:
 +   OID of the relation lock target, or null if the lock is not
   on a relation or part of a relation
 
 That, however, not so much. relation lock target might easily
 be interpreted as the relation's lock target or the
 relation lock's target - at least by non-native speakers such
 as myself. The same is true for transaction lock target and
 friends.
 
 Can't we simply go with Locked relation, Locked transaction id
 and so on (as in my versions B,C and D up-thread)? I can't really
 get excited about the slight imprecision caused by the fact that some
 rows describe aspiring lock holders instead of current lock holders.
 The existence of the granted column makes the situation pretty clear.
 
 Plus, it's technically not even wrong - a process is waiting because
 somebody else *is* actually holding a lock on the object. So
 the tuple/transaction/... is, in fact, a Locked tuple/transaction/...
 
 I think it will be very confusing to have locked refer to the person
 holding the lock while the row is based on who is waiting for it.

I still believe the chance of confusion to be extremely small, but since
you feel otherwise, what about Targeted instead of Locked. As in

  OID of the relation targeted by the lock, or null if the lock does not
  target a relation or part of a relation.

  Page number within the relation targeted by the lock, or null if the
  lock does not target a tuple or a relation page.

  Virtual ID of the transaction targeted by the lock, or null if the lock
  does not target a virtual transaction ID.

Protected/protects instead of Targeted/targets would also work.

Both avoid the imprecision of saying Locked, and the ambiguity of on -
which might either mean the physical location of the lock, or the object
it's protecting/targeting. 

 I reworded that line to:
 
 +   OID of the relation of the lock target, or null if the lock is not

I'm not a huge fan of that. IMHO  .. of .. of ..  chains are hard to
read. Plus, there isn't such a thing as the relation of a lock target -
the relation *is* the lock target, not a part thereof.

best regards,
Florian Pflug




Re: [HACKERS] Single pass vacuum - take 1

2011-07-14 Thread Alvaro Herrera
Excerpts from Pavan Deolasee's message of Thu Jul 14 13:54:36 -0400 2011:
 On Thu, Jul 14, 2011 at 12:43 PM, Heikki Linnakangas 
 heikki.linnakan...@enterprisedb.com wrote:

  Seems odd to store relindxvacxlogid/off as two int32 columns. Store it in
  one uint64 column, or invent a new datatype for LSNs, or store it as text in
  %X/%X format.
 
 
 Yeah. I don't remember what issues I faced with uint64 column, may be did
 not get around the case where uint64 is not natively supported on the
 platform. But %X/%X looks very reasonable. Will change to that.

Where is this to be stored?

AFAIR we no longer support platforms that do not have working 64 bit
integer types.

As a column name, relindxvacxlogid seems a bit unfortunate, BTW ...

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Allow pg_archivecleanup to ignore extensions

2011-07-14 Thread Josh Berkus
On 7/12/11 11:17 AM, Josh Berkus wrote:
 On 7/12/11 7:38 AM, Simon Riggs wrote:
 On Sun, Jul 10, 2011 at 7:13 PM, Josh Berkus j...@agliodbs.com wrote:

 This patch[1] is for some reason marked waiting on Author.  But I
 can't find that there's been any review of it searching the list.
 What's going on with it?  Has it been reviewed?

 Yes, I reviewed it on list. Some minor changes were discussed. I'm
 with Greg now, so we'll discuss and handle it.
 
 I couldn't find the review searching the archives.  Can you please link
 it in the Commitfest application?  Thanks.

Given the total lack of activity on this patch, I'm bumping it.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



[HACKERS] Extension ownership and pg_dump

2011-07-14 Thread Heikki Linnakangas

pg_dump seems to not dump the owner of extensions. Is that intentional?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] Is there a committer in the house?

2011-07-14 Thread Josh Berkus
All,

Currently we have 8 patches Ready for Committer in the current CF.
Some of them have been that status for some time.

From traffic on this list, I'm getting the impression that nobody other
than Robert, Heikki and Tom is committing other people's patches.  I
know we have more committers than that.  Bruce?  Simon?  Itagaki?  Bueller?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Extension ownership and pg_dump

2011-07-14 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 pg_dump seems to not dump the owner of extensions. Is that intentional?

Yeah.  Partly it was a matter of not wanting to implement ALTER
EXTENSION OWNER, but I think there were some definitional issues too.
See the archives from back around February or so.

regards, tom lane



Re: [HACKERS] Understanding GIN posting trees

2011-07-14 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Why is the posting tree a tree? AFAICS, we never search it using the 
 TID, it's always scanned in whole. It would be simpler to store the TIDs 
 in a posting list in no particular order. This could potentially make 
 insertions cheaper, as you could just append to the last posting list 
 page for the key, instead of traversing the posting tree to a particular 
 location. You could also pack the tids denser, as you wouldn't need to 
 reserve free space for additions in the middle.

Surely VACUUM would like to search it by TID for deletion purposes?

regards, tom lane



Re: [HACKERS] per-column generic option

2011-07-14 Thread Josh Berkus
All,

Is the spec for this feature still under discussion?  I don't seem to
see a consensus on this thread.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Josh Berkus wrote:
 BTW, if we're dumping relistemp, we're going to need to notify every
 maker of a PostgreSQL admin interface before we release 9.1.

 Are they not testing our 9.1 betas?

There has never, ever, been a guarantee that the system catalogs don't
change across versions.  Anybody issuing such queries should expect to
have to retest them and possibly change them in each new major release.
I see nothing to sweat about here.

regards, tom lane



Re: [HACKERS] Understanding GIN posting trees

2011-07-14 Thread Heikki Linnakangas

On 14.07.2011 22:10, Tom Lane wrote:

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:

Why is the posting tree a tree? AFAICS, we never search it using the
TID, it's always scanned in whole. It would be simpler to store the TIDs
in a posting list in no particular order. This could potentially make
insertions cheaper, as you could just append to the last posting list
page for the key, instead of traversing the posting tree to a particular
location. You could also pack the tids denser, as you wouldn't need to
reserve free space for additions in the middle.


Surely VACUUM would like to search it by TID for deletion purposes?


It doesn't; it scans all the tid lists in whole. I guess it could search 
by TID; that could be a win if there's only a few deleted tuples in a 
small range of pages.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] patch: enhanced get diagnostics statement 2

2011-07-14 Thread Alvaro Herrera
A couple items for this patch:

The docs state that the variable to receive each diagnostic value needs
to be of the right data type, but fail to specify what it is.  I think
it'd be good to turn that itemizedlist into a table with three
columns: name, type, description.

This seems poor style:

+   case PLPGSQL_GETDIAG_ERROR_CONTEXT:
+   case PLPGSQL_GETDIAG_ERROR_DETAIL:
+   case PLPGSQL_GETDIAG_ERROR_HINT:
+   case PLPGSQL_GETDIAG_RETURNED_SQLSTATE:
+   case PLPGSQL_GETDIAG_MESSAGE_TEXT:
+                                   if (!$2)
+                                       ereport(ERROR,
+                                           (errcode(ERRCODE_SYNTAX_ERROR),
+                                            errmsg("EXCEPTION_CONTEXT or EXCEPTION_DETAIL or EXCEPTION_HINT or RETURNED_SQLSTATE or MESSAGE_TEXT are not allowed in current diagnostics statement"),
+                                            parser_errposition(@1)));
+


I think we could replace this with something like

+                                   if (!$2)
+                                       ereport(ERROR,
+                                           (errcode(ERRCODE_SYNTAX_ERROR),
+                                            errmsg("diagnostic value %s is not allowed in GET CURRENT DIAGNOSTICS statement", stringify(ditem->kind)),


Other than that, and a few grammar fixes in code comments, this seems
good to me.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread David E. Wheeler
On Jul 14, 2011, at 12:19 PM, Tom Lane wrote:

 Are they not testing our 9.1 betas?
 
 There has never, ever, been a guarantee that the system catalogs don't
 change across versions.  Anybody issuing such queries should expect to
 have to retest them and possibly change them in each new major release.
 I see nothing to sweat about here.

A deprecation cycle at least might be useful.

Best,

David




Re: [HACKERS] Need help understanding pg_locks

2011-07-14 Thread Bruce Momjian
Florian Pflug wrote:
 I still believe the chance of confusion to be extremely small, but since
 you feel otherwise, what about "Targeted" instead of "Locked"? As in
 
   OID of the relation targeted by the lock, or null if the lock does not
   target a relation or part of a relation.
 
   Page number within the relation targeted by the lock, or null if the
   lock does not target a tuple or a relation page.
 
   Virtual ID of the transaction targeted by the lock, or null if the lock
   does not target a virtual transaction ID.
 
 "Protected"/"protects" instead of "Targeted"/"targets" would also work.
 
 Both avoid the imprecision of saying "Locked", and the ambiguity of "on" -
 which might either mean the physical location of the lock, or the object
 it's protecting/targeting. 
 
  I reworded that line to:
  
  +   OID of the relation of the lock target, or null if the lock is not
 
 I'm not a huge fan of that. IMHO  .. of .. of ..  chains are hard to
 read. Plus, there isn't such a thing as the relation of a lock target -
 the relation *is* the lock target, not a part thereof.

Agreed.  I like "targeted by".  New patch attached.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
new file mode 100644
index c5851af..6fa6fa9
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
***************
*** 6928,6936 ****
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-database"><structname>pg_database</structname></link>.oid</literal></entry>
        <entry>
!        OID of the database in which the object exists, or
!        zero if the object is a shared object, or
!        null if the lock object is on a transaction ID
        </entry>
       </row>
       <row>
--- 6928,6936 ----
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-database"><structname>pg_database</structname></link>.oid</literal></entry>
        <entry>
!        OID of the database in which the lock target exists, or
!        zero if the lock is a shared object, or
!        null if the lock is on a transaction ID
        </entry>
       </row>
       <row>
***************
*** 6938,6944 ****
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
        <entry>
!        OID of the relation, or null if the lock object is not
         on a relation or part of a relation
        </entry>
       </row>
--- 6938,6944 ----
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
        <entry>
!        OID of the relation targeted by the lock, or null if the lock is not
         on a relation or part of a relation
        </entry>
       </row>
***************
*** 6947,6953 ****
        <entry><type>integer</type></entry>
        <entry></entry>
        <entry>
!        Page number within the relation, or null if the lock object
         is not on a tuple or relation page
        </entry>
       </row>
--- 6947,6953 ----
        <entry><type>integer</type></entry>
        <entry></entry>
        <entry>
!        Page number within the relation targeted by the lock, or null if the lock
         is not on a tuple or relation page
        </entry>
       </row>
***************
*** 6956,6963 ****
        <entry><type>smallint</type></entry>
        <entry></entry>
        <entry>
!        Tuple number within the page, or null if the lock object is not
!        on a tuple
        </entry>
       </row>
       <row>
--- 6956,6963 ----
        <entry><type>smallint</type></entry>
        <entry></entry>
        <entry>
!        Tuple number within the page targeted by the lock, or null if
!        the lock is not on a tuple
        </entry>
       </row>
       <row>
***************
*** 6965,6971 ****
        <entry><type>text</type></entry>
        <entry></entry>
        <entry>
!        Virtual ID of a transaction lock, or null if the lock object is not
         on a virtual transaction ID
        </entry>
       </row>
--- 6965,6971 ----
        <entry><type>text</type></entry>
        <entry></entry>
        <entry>
!        Virtual ID of a transaction targeted by the lock, or null if the lock is not
         on a virtual transaction ID
        </entry>
       </row>
***************
*** 6974,6980 ****
        <entry><type>xid</type></entry>
        <entry></entry>
        <entry>
!        ID of a transaction lock, or null if the lock object is not on a transaction ID
        </entry>
       </row>
       <row>
--- 6974,6981 ----
        <entry><type>xid</type></entry>
        <entry></entry>
        <entry>
!        ID of a transaction targeted by the lock, or null if the lock
!        is not on a transaction ID
        </entry>
       </row>
       <row>
***************
*** 6982,6989 ****
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
        <entry>
!        OID of the 

Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Kevin Grittner
David E. Wheeler da...@kineticode.com wrote:
 
 A deprecation cycle at least might be useful.
 
How about a relistemp extension on pgxn.org for the generated
column function to provide the backward compatibility?  Is the new
extension mechanism a sane way to help those who need a phase-out
period?
 
-Kevin



Re: [HACKERS] patch: enhanced get diagnostics statement 2

2011-07-14 Thread Pavel Stehule
2011/7/14 Alvaro Herrera alvhe...@commandprompt.com:
 A couple items for this patch:

  The docs state that the variable to receive each diagnostic value needs
  to be of the right data type, but fail to specify what it is.  I think
  it'd be good to turn that itemizedlist into a table with three
  columns: name, type, description.

 This seems poor style:

 +                               case PLPGSQL_GETDIAG_ERROR_CONTEXT:
 +                               case PLPGSQL_GETDIAG_ERROR_DETAIL:
 +                               case PLPGSQL_GETDIAG_ERROR_HINT:
 +                               case PLPGSQL_GETDIAG_RETURNED_SQLSTATE:
 +                               case PLPGSQL_GETDIAG_MESSAGE_TEXT:
 +                                   if (!$2)
 +                                       ereport(ERROR,
 +                                           (errcode(ERRCODE_SYNTAX_ERROR),
 +                                            errmsg("EXCEPTION_CONTEXT or EXCEPTION_DETAIL or EXCEPTION_HINT or RETURNED_SQLSTATE or MESSAGE_TEXT are not allowed in current diagnostics statement"),
 +                                                    parser_errposition(@1)));
 +


 I think we could replace this with something like

 +                                   if (!$2)
 +                                       ereport(ERROR,
 +                                           (errcode(ERRCODE_SYNTAX_ERROR),
 +                                            errmsg("diagnostic value %s is not allowed in GET CURRENT DIAGNOSTICS statement", stringify(ditem->kind)),


 Other than that, and a few grammar fixes in code comments, this seems
 good to me.


it is a good idea

Regards

Pavel

 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support




Re: [HACKERS] patch: enhanced get diagnostics statement 2

2011-07-14 Thread Alvaro Herrera
Excerpts from Pavel Stehule's message of Thu Jul 14 16:25:56 -0400 2011:
 2011/7/14 Alvaro Herrera alvhe...@commandprompt.com:
  A couple items for this patch:

 it is good idea

Thanks ... I expect you're going to resubmit the patch based on David's
last version and my comments?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] WIP: Fast GiST index build

2011-07-14 Thread Alexander Korotkov
Do you think using the rightlink as a pointer to the parent page is possible during
index build? It would allow the code to be simplified significantly, because there
would be no more need to maintain in-memory structures for memorizing parents.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] patch: enhanced get diagnostics statement 2

2011-07-14 Thread Pavel Stehule
2011/7/14 Alvaro Herrera alvhe...@commandprompt.com:
 Excerpts from Pavel Stehule's message of Thu Jul 14 16:25:56 -0400 2011:
 2011/7/14 Alvaro Herrera alvhe...@commandprompt.com:
  A couple items for this patch:

 it is a good idea

 Thanks ... I expect you're going to resubmit the patch based on David's
 last version and my comments?


yes, but tomorrow; time to go to sleep

Regards

Pavel

 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support




Re: [HACKERS] Need help understanding pg_locks

2011-07-14 Thread Florian Pflug
On Jul14, 2011, at 22:18 , Bruce Momjian wrote:
 !OID of the database in which the lock target exists, or
 !zero if the lock is a shared object, or
 !null if the lock is on a transaction ID

For consistency, I think it should now say "target" in the second part
of the sentence as well, instead of "lock ... on".

Updated patch attached. I tried to make the descriptions a
bit more consistent, replaced "object" by "target", and
added "targeted by" after the phrase which describes the
locked (or waited-for) object.

best regards,
Florian Pflug

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index d4a1d36..33be5d0 100644
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
***************
*** 6928,6936 ****
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-database"><structname>pg_database</structname></link>.oid</literal></entry>
        <entry>
!        OID of the database in which the object exists, or
!        zero if the object is a shared object, or
!        null if the object is a transaction ID
        </entry>
       </row>
       <row>
--- 6928,6936 ----
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-database"><structname>pg_database</structname></link>.oid</literal></entry>
        <entry>
!        OID of the database in which the lock target exists, or
!        zero if the target is a shared object, or
!        null if the target is a transaction ID
        </entry>
       </row>
       <row>
***************
*** 6938,6944 ****
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
        <entry>
!        OID of the relation, or null if the object is not
         a relation or part of a relation
        </entry>
       </row>
--- 6938,6944 ----
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
        <entry>
!        OID of the relation targeted by the lock, or null if the target is not
         a relation or part of a relation
        </entry>
       </row>
***************
*** 6947,6954 ****
        <entry><type>integer</type></entry>
        <entry></entry>
        <entry>
!        Page number within the relation, or null if the object
!        is not a tuple or relation page
        </entry>
       </row>
       <row>
--- 6947,6954 ----
        <entry><type>integer</type></entry>
        <entry></entry>
        <entry>
!        Page number targeted by the lock within the relation,
!        or null if the target is not a relation page or tuple
        </entry>
       </row>
       <row>
***************
*** 6956,6962 ****
        <entry><type>smallint</type></entry>
        <entry></entry>
        <entry>
!        Tuple number within the page, or null if the object is not a tuple
        </entry>
       </row>
       <row>
--- 6956,6963 ----
        <entry><type>smallint</type></entry>
        <entry></entry>
        <entry>
!        Tuple number targeted by the lock within the page,
!        or null if the target is not a tuple
        </entry>
       </row>
       <row>
***************
*** 6964,6971 ****
        <entry><type>text</type></entry>
        <entry></entry>
        <entry>
!        Virtual ID of a transaction, or null if the object is not a
!        virtual transaction ID
        </entry>
       </row>
       <row>
--- 6965,6972 ----
        <entry><type>text</type></entry>
        <entry></entry>
        <entry>
!        Virtual ID of the transaction targeted by the lock,
!        or null if the target is not a virtual transaction ID
        </entry>
       </row>
       <row>
***************
*** 6973,6979 ****
        <entry><type>xid</type></entry>
        <entry></entry>
        <entry>
!        ID of a transaction, or null if the object is not a transaction ID
        </entry>
       </row>
       <row>
--- 6974,6981 ----
        <entry><type>xid</type></entry>
        <entry></entry>
        <entry>
!        ID of the transaction targeted by the lock,
!        or null if the target is not a transaction ID
        </entry>
       </row>
       <row>
***************
*** 6981,6988 ****
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
        <entry>
!        OID of the system catalog containing the object, or null if the
!        object is not a general database object
        </entry>
       </row>
       <row>
--- 6983,6990 ----
        <entry><type>oid</type></entry>
        <entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
        <entry>
!        OID of the system catalog containing the lock target, or null if the
!        target is not a general database object
        </entry>
       </row>
       <row>
***************
*** 6990,6997 ****
        <entry><type>oid</type></entry>
        <entry>any OID column</entry>
        <entry>
!        OID of the object within its system catalog, or null if the
!        object is not a general database object.
         For advisory locks it is used to distinguish the two key
         spaces 

Re: [HACKERS] WIP: Fast GiST index build

2011-07-14 Thread Heikki Linnakangas

On 14.07.2011 23:41, Alexander Korotkov wrote:

Do you think using the rightlink as a pointer to the parent page is possible during
index build? It would allow the code to be simplified significantly, because there
would be no more need to maintain in-memory structures for memorizing parents.


I guess, but where do you store the rightlink, then? Would you need a 
final pass through the index to fix all the rightlinks?


I think you could use the NSN field. It's used to detect concurrent page 
splits, but those can't happen during index build, so you don't need 
that field during index build. You just have to make it look like an 
otherwise illegal NSN, so that it won't be mistaken for a real NSN after 
the index is built. Maybe add a new flag to mean that the NSN is 
actually invalid.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] WIP: Fast GiST index build

2011-07-14 Thread Alexander Korotkov
On Wed, Jul 13, 2011 at 5:59 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 One thing that caught my eye is that when you empty a buffer, you load the
 entire subtree below that buffer, down to the next buffered or leaf level,
 into memory. Every page in that subtree is kept pinned. That is a problem;
 in the general case, the buffer manager can only hold a modest number of
 pages pinned at a time. Consider that the minimum value for shared_buffers
 is just 16. That's unrealistically low for any real system, but the default
 is only 32MB, which equals to just 4096 buffers. A subtree could easily be
 larger than that.

With level step = 1 we need only 2 levels in the subtree. With the minimum index
tuple size (12 bytes) each page can hold at most 675 tuples. Thus I think the
default shared_buffers is enough for level step = 1. I believe it's enough
to add a check that we have sufficient shared_buffers, isn't it?


 I don't think you're benefiting at all from the buffering that BufFile does
 for you, since you're reading/writing a full block at a time anyway. You
 might as well use the file API in fd.c directly, ie.
  OpenTemporaryFile/FileRead/FileWrite.

BufFile distributes temporary data across several files. AFAICS
postgres avoids working with files larger than 1GB. The size of tree buffers can
easily be greater. Without BufFile I would need to maintain the set of files manually.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Need help understanding pg_locks

2011-07-14 Thread Bruce Momjian

Looks good to me.

---

Florian Pflug wrote:
 On Jul14, 2011, at 22:18 , Bruce Momjian wrote:
  !OID of the database in which the lock target exists, or
  !zero if the lock is a shared object, or
  !null if the lock is on a transaction ID
 
 For consistency, I think it should say target in the second part
 of the sentence also now, instead of lock ... on.
 
 Updated patch attached. I tried to make the descriptions a
 bit more consistent, replaced object by target, and
 added targeted by after the phrase which describes the
 locked (or waited-for) object.
 
 best regards,
 Florian Pflug
 

Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Josh Berkus

 There has never, ever, been a guarantee that the system catalogs don't
 change across versions.  Anybody issuing such queries should expect to
 have to retest them and possibly change them in each new major release.

I know that's always been our policy.  It is, however,
vendor-unfriendly because we don't supply any interface for many things
(such as temp tables) other than the system catalogs.

So if we're going to break compatibility, then we could stand to make a
little noise about it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Reduced power consumption in WAL Writer process

2011-07-14 Thread Simon Riggs
On Wed, Jul 13, 2011 at 10:56 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 Attached is patch for the WAL writer that removes its tight polling
 loop (which probably doesn't get hit often in practice, as we just
 sleep if wal_writer_delay is under a second), and, at least
 potentially, reduces power consumption when idle by using a latch.

 I will break all remaining power consumption work down into
 per-auxiliary process patches. I think that this is appropriate - if
 we hit a snag on one of the processes, there is no need to have that
 hold up everything.

 I've commented that we handle all expected signals, and therefore we
 shouldn't worry about having timeout invalidated by signals, just as
 with the archiver. Previously, we didn't even worry about Postmaster
 death within the tight polling loop, presumably because
 wal_writer_delay is typically small enough to avoid that being a
 problem. I thought that WL_POSTMASTER_DEATH might be superfluous here,
 but then again there is a codepath specifically for the case where
 wal_writer_delay exceeds one second, so it is included in this initial
 version.

 Comments?

ISTM that this in itself isn't enough to reduce power consumption.

Currently the only people that use WALWriter are asynchronous commits,
so we should include within RecordTransactionCommit() a SetLatch()
command for the WALWriter.

That way we can have WALWriter sleep until it's needed.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Magnus Hagander
On Thu, Jul 14, 2011 at 21:59, Josh Berkus j...@agliodbs.com wrote:

 There has never, ever, been a guarantee that the system catalogs don't
 change across versions.  Anybody issuing such queries should expect to
 have to retest them and possibly change them in each new major release.

 I know that's always been our policy.  It is, however,
 vendor-unfriendly because we don't supply any interface for many things
 (such as temp tables) other than the system catalogs.

 So if we're going to break compatibility, then we could stand to make a
 little noise about it.

We've broken the admin apps in pretty much every single release. And
they generally don't complain. If someone developing an admin app
hasn't been doing extensive testing starting *at the latest* with
beta1 (and recommended per each alpha), they shouldn't expect to
release until quite long after the release.

That said, a stable set of system views certainly wouldn't hurt - but
making extra noise about a simple change like this one would likely
not make a difference.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Is there a committer in the house?

2011-07-14 Thread Simon Riggs
On Thu, Jul 14, 2011 at 8:01 PM, Josh Berkus j...@agliodbs.com wrote:

 Currently we have 8 patches Ready for Committer in the current CF.
 Some of them have been that status for some time.

 From traffic on this list, I'm getting the impression that nobody other
 than Robert, Heikki and Tom are committing other people's patches.  I
 know we have more committers than that.  Bruce?  Simon?  Itagaki?  Bueller?

Someone called Simon Riggs has committed one patch by another author
and reviewed 3 others, as well as spending many days working on bugs
for beta.

It would be sensible if people based their comments on actual events
rather than negative impressions.

ISTM that you are right that there are other committers that have not
done anything. How strange that you name myself, yet not others.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Understanding GIN posting trees

2011-07-14 Thread Heikki Linnakangas

On 14.07.2011 17:28, Teodor Sigaev wrote:

Why is the posting tree a tree? AFAICS, we never search it using the
TID, it's always scanned in whole. It would be simpler to store the TIDs
in a posting list in no particular order. This could potentially make
insertions cheaper, as you could just append to the last posting list
page for the key, instead of traversing the posting tree to a particular
location. You could also pack the tids denser, as you wouldn't need to
reserve free space for additions in the middle.

For the consistentFn call we need to collect all data for the current tid. We do
that by scanning over logically ordered arrays of tids, and trees allow us to
do that by scanning the leaf pages.


Oh, I see. You essentially do a merge join of all the posting trees of 
query keys.


Hmm, but we do need to scan all the posting trees of all the matched 
keys in whole anyway. We could collect all TIDs in the posting lists of 
all the keys into separate TIDBitmaps, and then combine the bitmaps, 
calling consistentFn for each TID that was present in at least one 
bitmap. I guess the performance characteristics of that would be 
somewhat different from what we have now, and you'd need to keep a lot 
of in-memory bitmaps if the query contains a lot of keys.



While we're at it, it just occurred to me that if the number of query 
keys is limited, say <= 16, you could build a lookup table for each 
combination of keys either occurring or not. You could then use that 
instead of calling consistentFn for each possible match. You could even 
use the table to detect common cases like "all/any keys must match", 
perhaps opening some optimization opportunities elsewhere.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Dave Page
On Thursday, July 14, 2011, Josh Berkus j...@agliodbs.com wrote:

 There has never, ever, been a guarantee that the system catalogs don't
 change across versions.  Anybody issuing such queries should expect to
 have to retest them and possibly change them in each new major release.

 I know that's always been our policy.  It his, however,
 vendor-unfriendly because we don't supply any interface for many things
 (such as temp tables) other than the system catalogs.

 So if we're going to break compatibility, then we could stand to make a
 little noise about it.

As one of said vendors, I completely disagree. There are a ton of
things that change with each release, and all we do by putting in
hacks for backwards compatibility is add bloat that needs to be
maintained, and encourage vendors to be lazy.

Breaking compatibility is actually something that is important to us - it
forces us to fix obvious issues, and makes it much harder to
inadvertently miss important changes.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Fast GiST index build

2011-07-14 Thread Heikki Linnakangas

On 14.07.2011 11:33, Alexander Korotkov wrote:

On Wed, Jul 13, 2011 at 5:59 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:


One thing that caught my eye is that when you empty a buffer, you load the
entire subtree below that buffer, down to the next buffered or leaf level,
into memory. Every page in that subtree is kept pinned. That is a problem;
in the general case, the buffer manager can only hold a modest number of
pages pinned at a time. Consider that the minimum value for shared_buffers
is just 16. That's unrealistically low for any real system, but the default
is only 32MB, which equals to just 4096 buffers. A subtree could easily be
larger than that.


With level step = 1 we need only 2 levels in the subtree. With the minimum
index tuple size (12 bytes) each page can hold at most 675 index tuples.
Thus I think the default shared_buffers is enough for level step = 1.


Hundreds of buffer pins is still a lot. And with level step = 2, the 
number of pins required explodes to 675^2 = 455625.


Pinning a buffer that's already in the shared buffer cache is cheap, I 
doubt you're gaining much by keeping the private hash table in front of 
the buffer cache. Also, it's possible that not all of the subtree is 
actually required during the emptying, so in the worst case pre-loading 
them is counter-productive.



I believe it's enough
to add a check that we have sufficient shared_buffers, isn't it?


Well, what do you do if you deem that shared_buffers is too small? Fall 
back to the old method? Also, shared_buffers is shared by all backends, 
so you can't assume that you get to use all of it for the index build. 
You'd need a wide safety margin.



I don't think you're benefiting at all from the buffering that BufFile does
for you, since you're reading/writing a full block at a time anyway. You
might as well use the file API in fd.c directly, ie.
OpenTemporaryFile/FileRead/FileWrite.


BufFile distributes temporary data across several files. AFAICS
postgres avoids working with files larger than 1GB. The size of tree buffers
can easily be greater. Without BufFile I would need to maintain the set of
files manually.


Ah, I see. Makes sense.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Small patch for GiST: move childoffnum to child

2011-07-14 Thread Heikki Linnakangas

I think there's two bugs in the existing gistFindPath code:


	if (top->parent && XLByteLT(top->parent->lsn,
		GistPageGetOpaque(page)->nsn) &&
		GistPageGetOpaque(page)->rightlink != InvalidBlockNumber /* sanity check */ )
	{
		/* page splited while we thinking of... */
		ptr = (GISTInsertStack *) palloc0(sizeof(GISTInsertStack));
		ptr->blkno = GistPageGetOpaque(page)->rightlink;
		ptr->childoffnum = InvalidOffsetNumber;
		ptr->parent = top;
		ptr->next = NULL;
		tail->next = ptr;
		tail = ptr;
	}


First, notice that we're setting ptr->parent = top. 'top' is the 
current node we're processing, and ptr represents the node to the right 
of the current node. The current node is *not* the parent of the node to 
the right. I believe that line should be ptr->parent = top->parent.


Second, we're adding the entry for the right sibling to the end of the 
list of nodes to visit. But when we process entries from the list, we 
exit immediately when we see a leaf page. That means that the right 
sibling can get queued up behind leaf pages, and thus never visited.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Is there a committer in the house?

2011-07-14 Thread Josh Berkus

 ISTM that you are right that there are other committers that have not
 done anything. How strange that you name myself, yet not others.

Touchy today, eh?

And I do name others, read the email again.

Anyway, my question is: the list of committers I know of who have
general knowledge of the codebase and can commit a wide variety of other
people's patches are:

Tom
Robert
Heikki
Bruce
Simon

Who else?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread David E. Wheeler
On Jul 14, 2011, at 2:10 PM, Dave Page wrote:

 Breaking compatibility is actually something that is important to us - it
 forces us to fix obvious issues, and makes it much harder to
 inadvertently miss important changes.

Agreed, but a deprecation cycle would be much appreciated.

David




Re: [HACKERS] Is there a committer in the house?

2011-07-14 Thread Bruce Momjian
Josh Berkus wrote:
 
  ISTM that you are right that there are other committers that have not
  done anything. How strange that you name myself, yet not others.
 
 Touchy today, eh?
 
 And I do name others, read the email again.
 
 Anyway, my question is: the list of committers I know of who have
 general knowledge of the codebase and can commit a wide variety of other
 people's patches are:
 
 Tom
 Robert
 Heikki
 Bruce
 Simon

I have found that my reading of the email lists is too delayed to effectively
commit others' patches.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Josh Berkus

 As one of said vendors, I completely disagree. 

I don't agree that you qualify as a vendor.  You're on the friggin' core
team.

I'm talking about vendors like DBVizualizer or TORA, for which
PostgreSQL is just one of the databases they support.  If stuff breaks
gratuitously, the reaction of some of them is always to either drop
support or delay it for a year or more.  This doesn't benefit our community.

 There are a ton of
 things that change with each release, and all we do by putting in
 hacks for backwards compatibility is add bloat that needs to be
 maintained, and encourage vendors to be lazy.

I don't agree that having comprehensive system views with multi-version
stability would be a hack.

 Breaking compatibility is actually something that is important to us - it
 forces us to fix obvious issues, and makes it much harder to
 inadvertently miss important changes.

What I'm hearing from you is: Breaking backwards compatibility is
something we should do more of because it lets us know which vendors are
paying attention and weeds out the unfit.   Is that what you meant to say?

That seems like a way to ensure that PostgreSQL support continue to be
considered optional, or based on outdated versions, for multi-database
tools.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 There are a ton of
 things that change with each release, and all we do by putting in
 hacks for backwards compatibility is add bloat that needs to be
 maintained, and encourage vendors to be lazy.

 I don't agree that having comprehensive system views with multi-version
 stability would be a hack.

If we had that, it wouldn't be a hack.  Putting in a hack to cover the
specific case of relistemp, on the other hand, is just a hack.

The real question here, IMO, is "how many applications are there that
really need to know about temporary relations, but have no interest in
the related feature of unlogged relations?"  Because only such apps
would be served by a compatibility hack for this.  An app that thinks it
knows the semantics of relistemp, and isn't updated to grok unlogged
tables, may be worse than broken --- it may be silently incorrect.

regards, tom lane



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Josh Berkus

 I don't agree that having comprehensive system views with multi-version
 stability would be a hack.
 
 If we had that, it wouldn't be a hack.  Putting in a hack to cover the
 specific case of relistemp, on the other hand, is just a hack.

I agree.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread David E. Wheeler
On Jul 14, 2011, at 3:05 PM, Tom Lane wrote:

 Josh Berkus j...@agliodbs.com writes:
 There are a ton of
 things that change with each release, and all we do by putting in
 hacks for backwards compatibility is add bloat that needs to be
 maintained, and encourage vendors to be lazy.
 
 I don't agree that having comprehensive system views with multi-version
 stability would be a hack.
 
 If we had that, it wouldn't be a hack.

Is that an endorsement for adding such a feature?

 Putting in a hack to cover the
 specific case of relistemp, on the other hand, is just a hack.

Sure.

 The real question here, IMO, is how many applications are there that
 really need to know about temporary relations, but have no interest in
 the related feature of unlogged relations?.  Because only such apps
 would be served by a compatibility hack for this.  An app that thinks it
 knows the semantics of relistemp, and isn't updated to grok unlogged
 tables, may be worse than broken --- it may be silently incorrect.

So pgTAP creates temporary tables to store result sets so that it can then 
compare the results of two queries. The function in question was getting a list 
of columns in such a temporary table in order to make sure that the types were 
the same between two such tables before comparing results. It checked relistemp 
to make sure it was looking at the temp table rather than some other table that 
might happen to have the same name.

So now the query looks like this:

SELECT pg_catalog.format_type(a.atttypid, a.atttypmod)
  FROM pg_catalog.pg_attribute a
  JOIN pg_catalog.pg_class c ON a.attrelid = c.oid
 WHERE c.relname = $1
-- AND c.relistemp -- 8.3-9.0
   AND c.relpersistence = 't'  -- 9.1
   AND attnum > 0
   AND NOT attisdropped
 ORDER BY attnum

Is that all I need to do, or is there something else I should be aware of with 
regard to unlogged tables?

Thanks,

David




[HACKERS] Patch Review: Collect frequency statistics and selectivity estimation for arrays

2011-07-14 Thread Nathan Boley
Review of patch: https://commitfest.postgresql.org/action/patch_view?id=539

I glanced through the code and played around with the feature and,
in general, it looks pretty good. But I have a some comments/questions.

Design:

First, it makes me uncomfortable that you are using the MCV and histogram slot
kinds in a way that is very different from other data types.

I realize that tsvector uses MCV in the same way that you do but:

1) I don't like that very much either.
2) tsvector is different in that equality (in the btree sense)
doesn't make sense, whereas it does for arrays.

Using the histogram slot for the array lengths is also very surprising to me.

Why not just use a new STA_KIND? It's not like we are running out of
room, and this will be the second 'container' type that splits the container
and stores stats about the elements.


Second, I'm fairly unclear about what the actual underlying statistical
method is and what assumptions it makes. Before I can really review
the method, I will probably need to see some documentation, as I say below.
Do you have any tests/simulations that explore the estimation procedure
that I can look at? When will it fail? In what cases does it improve
estimation?

Documentation:

Seems a bit lacking - especially if you keep the non standard usage of
mcv/histograms. Also, it would be really nice to see some update to
the row-estimation-examples page ( in chapter 55 ).

The code comments are good in general, but there are not enough high-level
comments. It would be really nice to have a comment that gives an overview
of what each of the following functions are supposed to do:

mcelem_array_selec
mcelem_array_contain_overlap_selec

and especially

mcelem_array_contained_selec

Also, in the compute_array_stats you reference compute_tsvector_stats for
the algorithm, but I think that it would be smarter to either copy the
relevant portions of the comment over or to reference a published document.
If compute_tsvector_stats changes, I doubt that anyone will remember to
update the comment.

Finally, it would be really nice to see, explicitly, and in
mathematical notation

1) The data that is being collected and summarized. ( the statistic )
2) The estimate being derived from the statistic ( ie the selectivity )
i) Any assumptions being made ( ie independence of elements within
an array )

For instance, the only note I could find that actually addressed the
estimator and the assumptions underneath it was

+* mult * dist[i] / mcelem_dist[i] gives us probability
+* of qual matching from assumption of independent
+* element occurence with condition that length = i.

and that's pretty cryptic. That should be expanded upon and placed
more prominently.

A couple nit picks:

1) In calc_distr you go to some lengths to avoid round off errors. Since it is
   certainly just the order of the estimate that matters, why not just
   perform the calculation in log space?

2) compute_array_stats is really long. Could you break it up? ( I know that
   the other analyze functions are the same way, but why continue in
that vein? )

Best,
Nathan



Re: [HACKERS] Single pass vacuum - take 1

2011-07-14 Thread Simon Riggs
On Thu, Jul 14, 2011 at 4:57 PM, Pavan Deolasee
pavan.deola...@gmail.com wrote:

 Thanks Simon for looking at the patch.

Sorry, I didn't notice there was a patch attached; I haven't reviewed it. I
thought we were still just talking.


 I am not sure if the use case is really narrow.

This is a very rare issue, because of all the work you and Heikki
have put in.

It's only important when we have a (1) big table (hence long scan
time), (2) a use case that avoids HOT *and* (3) we have dirtied a
large enough section of table that the vacuum map is ineffective and
we need to scan high % of table. That combination is pretty rare, so
penalising everybody else with 8 bytes per block seems too much to me.

Big VACUUMs are a problem, but my observation would be that those are
typically transaction wraparound VACUUMs and the extra writes are not
caused by row removal. So we do sometimes do Phase 2 and Phase 3 even
when there is a very low number of row removals - since not all
VACUUMs are triggered by changes.


 Today, we dirty the pages in
 both the passes and also emit WAL records.

This is exactly the thing I'm suggesting we avoid.

 Just the heap scan can take a
 very long time for large tables, blocking the autovacuum worker threads from
 doing useful work on other tables. If I am not wrong, we use ring buffers
 for vacuum which would most-likely force those buffers to be written/read
 twice to the disk.

I think the problem comes from dirtying too many blocks. Scanning the
tables using the ring buffer is actually fairly cheap. The second scan
only touches the blocks that need secondary cleaning, so the cost of
it is usually much less.

I'm suggesting we write each block at most once, rather than twice as
we do now. Yes, we have to do both scans.

My idea does exactly the same number of writes as yours. On read-only I/O,
your idea is clearly more efficient, but overall not by enough
to justify the 8-byte-per-block overhead, IMHO.


 Which part of the patch you think is very complex ? We can try to simplify
 that. Or are you seeing any obvious bugs that I missed ? IMHO taking out a
 phase completely from vacuum (as this patch does) can simplify things.

I have great faith in your talents, just not sure about this
particular use of them. I'm sorry to voice them now you've written the
patch.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Review of patch Bugfix for XPATH() if expression returns a scalar value

2011-07-14 Thread Florian Pflug
Hi Radosław,

Do you plan to comment on this patch (the one that adds support
for XPATH expressions returning scalar values, not the one that
escapes text and attributes nodes which are selected) further,
or should it be marked Ready for Committer?

I'm asking because I just noticed that you marked the other patch
as Ready for Committer, but not this one.

best regards,
Florian Pflug




Re: [HACKERS] Single pass vacuum - take 1

2011-07-14 Thread Pavan Deolasee
On Thu, Jul 14, 2011 at 6:22 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On Thu, Jul 14, 2011 at 4:57 PM, Pavan Deolasee
 pavan.deola...@gmail.com wrote:

  Thanks Simon for looking at the patch.

  Sorry, I didn't notice there was a patch attached; I haven't reviewed it. I
  thought we were still just talking.


No problem. Please review it when you have time.



  I am not sure if the use case is really narrow.

 This is a very rare issue, because of all the work yourself and Heikki
 have put in.


I don't think it's a rare case, since vacuum on any table, small or large, would
take two passes today. Avoiding one of the two would help the system in
general. HOT and other things help to a great extent, but unfortunately they
don't make vacuum completely go away. So we still want to do vacuum in the
most efficient way.


 It's only important when we have a (1) big table (hence long scan
 time), (2) a use case that avoids HOT *and* (3) we have dirtied a
 large enough section of table that the vacuum map is ineffective and
 we need to scan high % of table. That combination is pretty rare, so
 penalising everybody else with 8 bytes per block seems too much to me.


The additional space is allocated only for those pages which have dead-vacuum
line pointers, and it stays only until the next HOT-prune operation on the
page. So not everybody pays the penalty, even if we assume it's too much
and if that's what worries you most.



 I have great faith in your talents, just not sure about this
 particular use of them.


Not sure if that's a compliment or a criticism :-)


 I'm sorry to voice them now you've written the
 patch.


Yeah, I would have liked it if you had said this earlier, but I
appreciate comments any time.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread Robert Haas
On Jul 14, 2011, at 5:13 PM, David E. Wheeler da...@kineticode.com wrote:
 On Jul 14, 2011, at 3:05 PM, Tom Lane wrote:
 
 Josh Berkus j...@agliodbs.com writes:
 There are a ton of
 things that change with each release, and all we do by putting in
 hacks for backwards compatibility is add bloat that needs to be
 maintained, and encourage vendors to be lazy.
 
 I don't agree that having comprehensive system views with multi-version
 stability would be a hack.
 
 If we had that, it wouldn't be a hack.
 
 Is that an endorsement for adding such a feature?
 
 Putting in a hack to cover the
 specific case of relistemp, on the other hand, is just a hack.
 
 Sure.
 
 The real question here, IMO, is how many applications are there that
 really need to know about temporary relations, but have no interest in
 the related feature of unlogged relations?.  Because only such apps
 would be served by a compatibility hack for this.  An app that thinks it
 knows the semantics of relistemp, and isn't updated to grok unlogged
 tables, may be worse than broken --- it may be silently incorrect.
 
 So pgTAP creates temporary tables to store result sets so that it can then 
 compare the results of two queries. The function in question was getting a 
 list of columns in such a temporary table in order to make sure that the 
 types were the same between two such tables before comparing results. It 
 checked relistemp to make sure it was looking at the temp table rather than 
 some other table that might happen to have the same name.
 
 So now the query looks like this:
 
SELECT pg_catalog.format_type(a.atttypid, a.atttypmod)
  FROM pg_catalog.pg_attribute a
  JOIN pg_catalog.pg_class c ON a.attrelid = c.oid
 WHERE c.relname = $1
 -- AND c.relistemp -- 8.3-9.0
   AND c.relpersistence = 't'  -- 9.1
   AND attnum > 0
   AND NOT attisdropped
 ORDER BY attnum
 
 Is that all I need to do, or is there something else I should be aware of 
 with regard to unlogged tables?

Probably not, in this case. Just a thought: maybe you could rewrite the query 
to check whether the namespace name starts with "pg_temp". Maybe that would be 
version-independent...

...Robert


Re: [HACKERS] pg_class.relistemp

2011-07-14 Thread David E. Wheeler
On Jul 14, 2011, at 6:43 PM, Robert Haas wrote:

 Is that all I need to do, or is there something else I should be aware of 
 with regard to unlogged tables?
 
 Probably not, in this case. Just a thought: maybe you could rewrite the query 
 to check whether the namespace name starts with pg_temp. Maybe that would be 
 version-independent...

Ah, good idea, I forgot about pg_temp.

David




Re: [HACKERS] per-column generic option

2011-07-14 Thread Shigeru Hanada
(2011/07/15 4:17), Josh Berkus wrote:
 All,
 
 Is the spec for this feature still under discussion?  I don't seem to
 see a consensus on this thread.

Yeah, a remaining concern is whether generic (FDW) options should be
separated from existing attoptions or not.

Indeed, reloptions/attoptions mechanism seems to be also applicable to
generic options, since both need to store multiple key-value pairs, but
IMHO generic options should be separated from reloptions/attoptions for
several reasons:

1) FDW options are handled only by the FDW, but reloptions/attoptions are
handled by PG core modules such as the planner, AMs and autovacuum.  If we
could separate them completely they would be able to share one attribute,
but I worry that some reloptions/attoptions make sense for some FDWs.
For instance, n_distinct might be useful to control the cost of a foreign
table scan.  (Though attoptions can't be set via CREATE/ALTER FOREIGN
TABLE yet.)

2) In the future, a newly added option might conflict with somebody's FDW
option.  Robert Haas pointed out this issue some days ago.  An FDW validator
would reject unknown options, so every FDW would have to know all of the
reloptions/attoptions to avoid this issue.

3) It would be difficult to unify the syntax used to set options, from the
perspective of backward compatibility and syntax consistency.  Other SQL/MED
objects use the syntax OPTIONS (key 'value', ...), but reloptions/attoptions
use the syntax SET (key = value, ...).  Without syntax unification, tools
would have to check relkind before handling attoptions.  For instance,
pg_dump would have to choose the syntax used to dump attoptions.  That
seems like undesired complexity.

Any comments/objections/questions are welcome.

Regards,
-- 
Shigeru Hanada



Re: [HACKERS] Re: patch review : Add ability to constrain backend temporary file space

2011-07-14 Thread Tatsuo Ishii
 I have looked into the v6 patches. One thing I would like to suggest
 is, enhancing the error message when temp_file_limit will be exceeded.

 ERROR:  aborting due to exceeding temp file limit

 Is it possible to add the current temp file limit to the message? For
 example,

 ERROR:  aborting due to exceeding temp file limit 1kB

 I know the current setting of temp_file_limit can be viewed in other
 ways, but I think this will make an admin's or application developer's
 life a little bit easier.
 
 Hi Tatsuo,
 
 Yeah, good suggestion - I agree that it would be useful to supply
 extra detail, I'll amend and resubmit a new patch (along with whatever
 review modifications we come up with in the meantime)!
 
 Cheers
 
 Mark

Maybe we could add more info regarding current usage and requested
amount in addition to the temp file limit value. I mean something
like:

ERROR:  aborting due to exceeding temp file limit. Current usage 9000kB, 
requested size 1024kB, thus it would exceed the temp file limit of 1kB.

What do you think?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [HACKERS] Is there a committer in the house?

2011-07-14 Thread Peter Eisentraut
On tor, 2011-07-14 at 12:01 -0700, Josh Berkus wrote:
 Currently we have 8 patches Ready for Committer in the current CF.
 Some of them have been that status for some time.
 
 From traffic on this list, I'm getting the impression that nobody other
 than Robert, Heikki and Tom are committing other people's patches.  I
 know we have more committers than that.  Bruce?  Simon?  Itagaki?  Bueller?

I'm still working on preparing 9.1.  I'm not following 9.2 development
yet.





[HACKERS] ON COMMIT action not catalogued?

2011-07-14 Thread Peter Eisentraut
(Going through some loose ends in the information schema ...)

Is my understanding right that the ON COMMIT action of temporary tables
is not recorded in the system catalogs anywhere?  Would make sense,
wouldn't be a big problem, just want to make sure I didn't miss
anything.





[HACKERS] SSI error messages

2011-07-14 Thread Peter Eisentraut
Some of these new error messages from the SSI code are a mouthful:

not enough elements in RWConflictPool to record a rw-conflict
not enough elements in RWConflictPool to record a potential rw-conflict

These are basically "out of shared memory" conditions; the specifics could be
moved to a DETAIL message.

Canceled on identification as a pivot, during conflict out checking.
Canceled on conflict out to old pivot %u.
Canceled on identification as a pivot, with conflict out to old committed 
transaction %u.
Canceled on conflict out to old pivot.
Canceled on identification as a pivot, during conflict in checking.
Canceled on identification as a pivot, during write.
Canceled on conflict out to pivot %u, during read.
Canceled on identification as a pivot, during commit attempt.
Canceled on commit attempt with conflict in from prepared pivot.

These are DETAIL messages, with the rest of the report saying

ERROR:  could not serialize access due to read/write dependencies among 
transactions
HINT:  The transaction might succeed if retried.

AFAICT, the documentation doesn't mention anything about this "pivot"
that keeps coming up.  Is it useful to have the user face this
information?  Is there anything a user can derive from seeing one of
these messages versus another, in addition to the error and hint,
that would help them address the situation?


