Re: [HACKERS] reducing statistics write overhead

2008-09-06 Thread Asko Oja
On Sat, Sep 6, 2008 at 2:29 AM, Euler Taveira de Oliveira <[EMAIL PROTECTED]
> wrote:

> Martin Pihlak wrote:
> > I suspected that, but somehow managed to overlook it :( I guess it was
> > too tempting to use it. I'll start looking for alternatives.
> >
> If you can't afford a 500 msec pgstat time, then you need to make it
> tunable.

Additional parameter in config file. Not good.


> Other ideas are (i) to turn pgstat on/off per table or database
> and (ii) to make the pgstat time tunable per table or database. You can use
> the reloptions column to store this info. These workarounds are much
> simpler than what you proposed and they're almost free.
>
That doesn't seem simple to me. Why would DBAs want extra management work? We
want all the stats to be there, since we don't know in advance when we'll need
to look at them.

>
>
> --
>  Euler Taveira de Oliveira
>  http://www.timbira.com/
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-06 Thread Markus Wanner

Hi,

Fujii Masao wrote:

Pavan re-designed the sync replication based on the prototype
and I posted that design doc on wiki. Please check it if you
are interested in it.
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects


I've read that wiki page and allow myself to comment from a Postgres-R 
developer's perspective ;-)


R1: "without ... any negative performance overhead"? For fully 
synchronous replication, that's clearly not possible. I guess that 
claim applies only to async WAL shipping.


NR3: who is supposed to do failure detection and manage automatic 
failover? How does integration with such an additional tool work?


I got distracted by the SBY and ACT abbreviations. Why abbreviate 
standby or active at all? It's not like we don't already have enough 
three letter acronyms, but those stand for rather more complex terms 
than single words.


Standby Bootstrap: "stopping the archiving at the ACT" doesn't prevent 
overwriting WAL files in pg_xlog. It just stops archiving a WAL file 
before it gets overwritten - which clearly doesn't solve the problem here.


How is communication done? "Serialization of WAL shipping" had better 
not mean serialization on the network: the WAL Sender Process should be 
able to await acknowledgment of multiple WAL packets in parallel, 
otherwise the interconnect latency might turn into a bottleneck. And 
what happens if the link between the active and standby goes down, or 
is temporarily unavailable for some time?
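To make the pipelining point concrete, here is a minimal C sketch of a sliding window of in-flight WAL packets awaiting acknowledgment. It only illustrates the idea, not proposed code, and every name in it is invented:

```c
#include <assert.h>

/* Sketch of the "don't serialize on the network" point above: the WAL
 * sender keeps several WAL packets in flight and matches acknowledgments
 * as they arrive, instead of blocking on each packet's round-trip.
 * A simple sliding window over packet sequence numbers. */

#define WINDOW_SIZE 4

static long next_to_send = 1;   /* next WAL packet sequence number */
static long oldest_unacked = 1; /* lowest sequence still awaiting an ack */

/* May we ship another packet, or is the window full? */
int can_send(void) {
    return next_to_send - oldest_unacked < WINDOW_SIZE;
}

/* Ship one packet; returns the sequence number shipped, or -1 if the
 * window is full and we must wait for acknowledgments first. */
long send_packet(void) {
    return can_send() ? next_to_send++ : -1;
}

/* Standby acknowledged everything up to and including 'seq'. */
void ack_through(long seq) {
    if (seq >= oldest_unacked)
        oldest_unacked = seq + 1;
}
```

With a window like this, a single slow acknowledgment delays the sender only once the window fills, rather than on every packet.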


The IPC mechanism reminds me a lot of what I did for Postgres-R, which 
also has a central "replication manager" process that receives 
changesets from multiple backends. I've implemented an internal 
messaging mechanism based on shared memory and signals, using only 
Postgres methods. It allows arbitrary processes to send messages to each 
other by process id.
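For illustration, a heavily simplified C sketch of such a pid-addressed mailbox scheme. Real code would live in shared memory, protect each slot with a lock, and wake the receiver with a signal (e.g. SIGUSR1); the names here are invented, not taken from Postgres-R:

```c
#include <assert.h>
#include <string.h>

#define MAX_PROCS 8
#define MSG_SIZE  64

typedef struct Mailbox {
    int  pid;                /* owner's process id, 0 = slot free */
    int  has_message;
    char payload[MSG_SIZE];
} Mailbox;

static Mailbox mailboxes[MAX_PROCS];   /* would live in shared memory */

/* Register a process: claim a free slot for its pid. */
int mbox_register(int pid) {
    for (int i = 0; i < MAX_PROCS; i++)
        if (mailboxes[i].pid == 0) { mailboxes[i].pid = pid; return 0; }
    return -1;  /* no free slot */
}

/* Send: find the receiver's slot by pid and copy the payload in.
 * A real implementation would then send a wakeup signal to 'pid'. */
int mbox_send(int pid, const char *msg) {
    for (int i = 0; i < MAX_PROCS; i++)
        if (mailboxes[i].pid == pid && !mailboxes[i].has_message) {
            strncpy(mailboxes[i].payload, msg, MSG_SIZE - 1);
            mailboxes[i].has_message = 1;
            return 0;
        }
    return -1;  /* unknown pid, or mailbox already full */
}

/* Receive: called by the owner after being woken up. */
int mbox_receive(int pid, char *out) {
    for (int i = 0; i < MAX_PROCS; i++)
        if (mailboxes[i].pid == pid && mailboxes[i].has_message) {
            strcpy(out, mailboxes[i].payload);
            mailboxes[i].has_message = 0;
            return 0;
        }
    return -1;  /* nothing pending */
}
```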


Moving the WAL Sender and WAL Receiver processes under the control of 
the postmaster certainly sounds like a good thing. After all, they are 
fiddling with Postgres internals.



This design is too huge. In order to enhance the extensibility
of postgres, I'd like to divide the sync replication into
minimum hooks and some plugins and to develop them respectively.
Plugins for sync replication are planned to be available at the
time of the 8.4 release.


Hooks again? I bet you all know by now that my enthusiasm for hooks has 
always been pretty limited. ;-)



In my design, WAL sending is achieved as follows by the WALSender,
a new process which I introduce.

  1) On COMMIT, the backend requests the WALSender to send WAL.
  2) The WALSender reads WAL from the WAL buffers and sends it to the slave.
  3) The WALSender waits for the response from the slave and replies
     to the backend.

I propose two hooks for WAL sending.

WAL-writing hook
----------------
This hook lets the backend communicate with the WALSender. The
WAL-writing hook intercepts the write system call in XLogWrite;
that is, the backend requests WAL sending whenever write is called.

The WAL-writing hook is also available for other uses, e.g.
software RAID (writing WAL into two files for durability).
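To illustrate the hook shape being described, here is a hypothetical C sketch of a function-pointer write hook, with the software-RAID-style duplication as the example plugin. All names here are invented for illustration, not the actual proposal's:

```c
#include <assert.h>
#include <string.h>

/* XLogWrite would call a function pointer instead of write() directly,
 * so a plugin can piggyback on every WAL write.  The "software RAID"
 * use duplicates the bytes into a second location. */

typedef int (*wal_write_hook_type)(const char *buf, int len);

static wal_write_hook_type wal_write_hook = NULL;

static char wal_primary[1024];    /* stand-in for the real WAL file */
static int  wal_primary_len = 0;

static char wal_mirror[1024];     /* the plugin's duplicate copy */
static int  wal_mirror_len = 0;

/* Core's write path: consult the hook, then do the ordinary write. */
int xlog_write(const char *buf, int len) {
    if (wal_write_hook != NULL && wal_write_hook(buf, len) != 0)
        return -1;                /* plugin refused the write */
    memcpy(wal_primary + wal_primary_len, buf, len);
    wal_primary_len += len;
    return 0;
}

/* Example plugin: mirror every WAL write (the software-RAID idea). */
static int mirror_hook(const char *buf, int len) {
    memcpy(wal_mirror + wal_mirror_len, buf, len);
    wal_mirror_len += len;
    return 0;
}

void install_mirror_hook(void) { wal_write_hook = mirror_hook; }
```

A WAL-sending plugin would follow the same pattern, handing the bytes to a sender process instead of mirroring them locally.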

Hook for WALSender
------------------
This hook is for introducing the WALSender. There are the following
three ideas for how to introduce the WALSender; the required hook
differs depending on which idea is adopted.

a) Use WALWriter as WALSender

   This idea needs a WALWriter hook which intercepts the WALWriter
   literally. The WALWriter stops the local WAL write and focuses on
   WAL sending. This idea is very simple, but I can't think of any
   use for a WALWriter hook other than WAL sending.

b) Use new background process as WALSender

   This idea needs a background-process hook which enables users
   to define new background processes. I think the design of this
   hook resembles that of the rmgr hook proposed by Simon. I'd define
   a table like RmgrTable for registering functions (e.g. a main
   function, an exit function, ...) for operating a background
   process. The postmaster calls the functions from the table as
   appropriate, and manages the start and end of each background
   process. ISTM there are many uses for this hook, e.g. a
   performance monitoring process like statspack.
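A hypothetical C sketch of what such an RmgrTable-style registration table could look like. The fork/exec machinery is elided (the "postmaster" here just walks the table), and all names are invented:

```c
#include <assert.h>

/* A table of user-defined background processes, analogous to RmgrTable,
 * from which the postmaster would start and manage each worker. */

typedef struct BgProcEntry {
    const char *name;
    void (*main_fn)(void);    /* worker body */
    void (*exit_fn)(void);    /* cleanup on shutdown, may be null */
} BgProcEntry;

#define MAX_BGPROCS 4
static BgProcEntry bgproc_table[MAX_BGPROCS];
static int bgproc_count = 0;

int register_bgproc(const char *name,
                    void (*main_fn)(void), void (*exit_fn)(void)) {
    if (bgproc_count >= MAX_BGPROCS)
        return -1;
    bgproc_table[bgproc_count++] = (BgProcEntry){ name, main_fn, exit_fn };
    return 0;
}

/* Postmaster side: start every registered worker, later shut them down.
 * A real postmaster would fork a child per entry instead. */
void postmaster_run_bgprocs(void) {
    for (int i = 0; i < bgproc_count; i++)
        bgproc_table[i].main_fn();
}
void postmaster_stop_bgprocs(void) {
    for (int i = 0; i < bgproc_count; i++)
        if (bgproc_table[i].exit_fn)
            bgproc_table[i].exit_fn();
}

/* Example registrations: a WAL sender and a monitoring process. */
static int walsender_ran = 0, monitor_ran = 0, walsender_exited = 0;
static void walsender_main(void) { walsender_ran = 1; }
static void walsender_exit(void) { walsender_exited = 1; }
static void monitor_main(void)   { monitor_ran = 1; }
```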

c) Use one backend as WALSender

   In this idea, the slave calls a user-defined function which
   takes charge of WAL sending via SQL, e.g. "SELECT pg_walsender()".
   Compared with the other ideas, the WALSender is easy to implement
   because the postmaster handles connection establishment and
   authentication. But this SQL causes a long-running transaction
   which prevents vacuum, so this idea also needs an idle-state hook
   which executes the plugin before a transaction starts. I can't
   think of a use for this hook other than WAL sending either.


The above cited wiki page sounds like you've already decided on b).

I'm unclear on what you want hooks for. If additional processes get 
integrated into Postgres, those certainly need to get integrated very 
much like we integrated other auxiliary processes. I wouldn't call 
that 'hooks'.

Re: [HACKERS] Need more reviewers!

2008-09-06 Thread Greg Smith

On Thu, 4 Sep 2008, Simon Riggs wrote:


I think this should be organised with different kinds of reviewer...


Great post.  Rewrote the intro a bit and turned it into a first bit of 
reviewer training material at 
http://wiki.postgresql.org/wiki/Reviewing_a_Patch


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Need more reviewers!

2008-09-06 Thread Greg Smith

On Fri, 5 Sep 2008, Marko Kreen wrote:


I think we'd have better results and a more relaxed atmosphere if we
used the following task description for reviewers:


I assimilated this and some of your later comments into 
http://wiki.postgresql.org/wiki/Reviewing_a_Patch as well.  I disagree 
with your feeling that Simon's text was too much; there's value to both a 
gentle intro and a detailed list of review tasks.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] NDirectFileRead and Write

2008-09-06 Thread Hitoshi Harada
2008/8/5 ITAGAKI Takahiro <[EMAIL PROTECTED]>:
> Here is a patch to use NDirectFileRead/Write counters to get I/O counts
> in BufFile module. We can see the counters when log_statement_stats is on.
>
> The information is different from trace_sort; trace_sort shows used blocks
> in external sort, and log_statement_stats shows how many I/Os are submitted
> during sorts.
>
> I wrote:
>> I'd like to use NDirectFileRead and NDirectFileWrite statistics counters
>> for counting reads and writes in BufFile. They are defined, but not used
>> now. BufFile is used for tuple sorting or materializing, so we could use
>> NDirectFileRead/Write to retrieve how many I/Os are done in temp tablespace.
>
> =# SET client_min_messages = log;
> =# SET trace_sort = on;
> =# SET log_statement_stats = on;
> =# EXPLAIN ANALYZE SELECT * FROM generate_series(1, 100) AS i ORDER BY i;
> LOG:  begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
> LOG:  switching to external sort with 7 tapes: CPU 0.09s/0.26u sec elapsed 
> 0.35 sec
> LOG:  performsort starting: CPU 0.48s/1.68u sec elapsed 2.20 sec
> LOG:  finished writing final run 1 to tape 0: CPU 0.48s/1.70u sec elapsed 
> 2.21 sec
> LOG:  performsort done: CPU 0.48s/1.70u sec elapsed 2.21 sec
> LOG:  external sort ended, 2444 disk blocks used: CPU 0.79s/2.23u sec elapsed 
> 3.06 sec
> LOG:  QUERY STATISTICS
> DETAIL:  ! system usage stats:
> !   3.078000 elapsed 2.234375 user 0.812500 system sec
> !   [3.328125 user 1.281250 sys total]
> ! buffer usage stats:
> !   Shared blocks:  0 read,  0 written, buffer hit rate = 
> 0.00%
> !   Local  blocks:  0 read,  0 written, buffer hit rate = 
> 0.00%
> !   Direct blocks:   5375 read,   5374 written
> QUERY PLAN
> ----------------------------------------------------------------------------
>  Sort  (cost=62.33..64.83 rows=1000 width=4) (actual time=2221.485..2743.831 
> rows=100 loops=1)
>   Sort Key: i
>   Sort Method:  external sort  Disk: 19552kB
>   ->  Function Scan on generate_series i  (cost=0.00..12.50 rows=1000 
> width=4) (actual time=349.065..892.907 rows=100 loops=1)
>  Total runtime: 3087.305 ms
> (5 rows)
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
>
>
>

Though I'm not too familiar with the buffer code, I reviewed this patch
and found the declaration location of the two variables strange. In
ResetBufferUsage() of bufmgr.c it says:

void
ResetBufferUsage(void)
{
BufferHitCount = 0;
ReadBufferCount = 0;
BufferFlushCount = 0;
LocalBufferHitCount = 0;
ReadLocalBufferCount = 0;
LocalBufferFlushCount = 0;
NDirectFileRead = 0;
NDirectFileWrite = 0;
}

so I guess all of these variables should be defined together, but
actually you put the two in buffile.h while the others are in
buf_internals.h. Is there a clear reason for that?

Regards,


-- 
Hitoshi Harada



Re: [HACKERS] Need more reviewers!

2008-09-06 Thread Simon Riggs

On Sat, 2008-09-06 at 04:03 -0400, Greg Smith wrote:
> On Thu, 4 Sep 2008, Simon Riggs wrote:
> 
> > I think this should be organised with different kinds of reviewer...
> 
> Great post.  Rewrote the intro a bit and turned it into a first bit of 
> reviewer training material at 
> http://wiki.postgresql.org/wiki/Reviewing_a_Patch

Well written, thanks.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Window functions patch v04 for the September commit fest

2008-09-06 Thread Hitoshi Harada
2008/9/5 Heikki Linnakangas <[EMAIL PROTECTED]>:
> Heikki Linnakangas wrote:
>>
>> I'll review the parser/planner changes from the current patch.
>
> Looks pretty sane to me. Few issues:
>
> Is it always OK to share a window between two separate window function
> invocations, if they both happen to have identical OVER clause? It seems OK
> for stable functions, but I'm not sure that's correct for expressions
> involving volatile functions. I wonder if the SQL spec has anything to say
> about that.

It may be here:

---quote---
In general, two <window function>s are computed independently, each
one performing its own sort of its data, even if they use the same
data and the same <window specification>. Since sorts may specify
partial orderings, the computation of <window function>s is inevitably
non-deterministic to the extent that the ordering is not total.
Nevertheless, the user may desire that two <window function>s be
computed using the same ordering, so that, for example, two moving
aggregates move through the rows of a partition in precisely the same
order.

Two <window function>s are computed using the same (possibly
non-deterministic) window ordering of the rows if any of the following
are true:

― The <window function>s identify the same window structure descriptor.

― The <window function>s' window structure descriptors have window
partitioning clauses that enumerate the same number of column
references, and those column references are pairwise equivalent in
their order of occurrence; and their window structure descriptors have
window ordering clauses with the same number of <sort key>s, and those
<sort key>s are all column references, and those column references are
pairwise equivalent in their order of occurrence, and the <sort
specification>s pairwise specify or imply <collation name>s that
specify equivalent <collation>s, the same <ordering specification>
(ASC or DESC), and the same <null ordering> (NULLS FIRST or NULLS
LAST).

― The window structure descriptor of one <window function> is the
ordering window of the other <window function>, or both window
structure descriptors identify the same ordering window.
/---quote---

But it doesn't say anything about volatile functions. Do you have an
example that misbehaves with the current design?

The other issues are OK; I missed those cases and will fix them.

Regards,


-- 
Hitoshi Harada



Re: [HACKERS] Prototype: In-place upgrade v02

2008-09-06 Thread Greg Smith

On Fri, 5 Sep 2008, Heikki Linnakangas wrote:


All in all, though, I find it a bit hard to see the big picture.


I've been working on trying to see that myself lately, and have been 
dumping links to all the interesting material at 
http://wiki.postgresql.org/wiki/In-place_upgrade in case there's any of 
that you haven't seen before.


To keep this concrete, let's focus on PG 8.2 -> PG 8.3 (or are you 
focusing on PG 8.3 -> 8.4? That's fine with me as well, but let's pick 
one)


From a complexity perspective, the changes needed to go from 8.2->8.3 seem 
much larger than what's needed for 8.3->8.4.  There's also a huge PR win 
if 8.4 goes out the door saying that in-place upgrades are available from 
the previous version starting at the 8.4 release.  Given the limited time 
left, I would think a focus on nailing the 8.3->8.4 conversion down first 
and then slipping in support for earlier revs later would be one way to 
get this into more manageable chunks.  Obviously if you can fit 
infrastructure that makes the 8.2 conversion easier that's worth doing, 
but I'd hate to see this get bogged down worrying too much about things 
that haven't actually changed since 8.3.


The specific areas I am getting up to speed to help out with here are 
catalog updates and working on integration/testing.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



[HACKERS] Review Report: propose to include 3 new functions into intarray and intagg

2008-09-06 Thread Markus Wanner

Hi,

this is my first "official" review. I've tried to follow the "Review a
patch" guidelines from the wiki - thanks Simon, that was pretty helpful.

This review covers only the intagg additions.

Dmitry Koterov wrote:

Here are these functions with detailed documentation:
http://en.dklab.ru/lib/dklab_postgresql_patch/


Submission review: we generally prefer having patches archived on our
mailing lists, so please just send future patches or revisions of this
patch to our lists (I prefer -hackers, but probably -patches is still
the official one). The documentation should go into a README file of the
contrib module or some such. Tests are missing, but that's the case for
the intagg contrib module anyway. The patch applies and is in context
diff format, good. It adds an unrelated newline, which should generally 
be avoided.


Usability review: the functional additions look good and usable, certainly
within a contrib module. I don't think it complies with any spec, but
that's not required for contrib, IMO.

Functional test: works as advertised on the accompanying website. I've
tested the int_array_append_aggregate somewhat, see [1].

Performance testing: I've tested with 10 million rows of single-element
arrays and compared against core's int_array_aggregate function, see [1]
again. In this simple test it took roughly 50% longer, which seems okay. 
Memory consumption looks sane as well.


Coding review: style seems fine for contrib, though lines longer than 80 
cols should be broken up. Comments in the code are sparse; some are even 
counter-productive (marking something as "additional things" certainly 
doesn't help).


Code and architecture review: the new int_array_append_aggregate() 
function itself seems fine to me.


Summary: my general feeling is that this patch should be applied after 
minor code style corrections. As a longer-term goal I think intagg 
should be integrated into core, since it's very basic functionality. 
TODO entries for things like an array_accum() aggregate already exist. 
Adding this patch to contrib now might be a step in the right direction.


Dmitry, can you please apply these small corrections and re-submit the 
patch?


Regards

Markus Wanner

P.S.: I dislike intagg's use of PGARRAY, but that has nothing to do 
with this patch. Shouldn't it rather use a real composite type as the 
aggregate's state type? I'd propose to clean up the intagg contrib 
module and prepare it for inclusion into core.



[1]: functional and performance testing session:

On a database with (patched) intagg and intarr contrib modules:

markus=# CREATE TABLE test (id INT NOT NULL, arr INT[] NOT NULL);
CREATE TABLE
markus=# INSERT INTO test VALUES (1, ARRAY[1,2,3]), (2, ARRAY[4,5]), (3,
ARRAY[3,2,1]);
INSERT 0 3
markus=# SELECT * FROM test;
 id |   arr
----+---------
  1 | {1,2,3}
  2 | {4,5}
  3 | {3,2,1}
(3 rows)

markus=# SELECT int_array_append_aggregate(arr) FROM test;
 int_array_append_aggregate
----------------------------
 {1,2,3,4,5,3,2,1}
(1 row)

markus=# SELECT * FROM test;
 id |   arr
----+---------
  1 | {1,2,3}
  2 | {4,5}
  3 | {3,2,1}
(3 rows)

markus=# SELECT int_array_aggregate(id) AS ids,
int_array_append_aggregate(arr) FROM test GROUP BY (id / 2);
  ids  | int_array_append_aggregate
-------+----------------------------
 {1}   | {1,2,3}
 {2,3} | {4,5,3,2,1}
(2 rows)

markus=# SELECT int_array_aggregate(id) AS ids,
int_array_append_aggregate(arr) FROM test GROUP BY (id % 2);
  ids  | int_array_append_aggregate
-------+----------------------------
 {2}   | {4,5}
 {1,3} | {1,2,3,3,2,1}
(2 rows)

markus=# INSERT INTO test VALUES (4, NULL);
INSERT 0 1

markus=# SELECT int_array_append_aggregate(arr) FROM test;
 int_array_append_aggregate
----------------------------
 {1,2,3,4,5,3,2,1}
(1 row)

markus=# SELECT id, int_array_append_aggregate(arr) FROM test GROUP BY id;
 id | int_array_append_aggregate
----+----------------------------
  4 | {}
  2 | {4,5}
  3 | {3,2,1}
  1 | {1,2,3}
(4 rows)

-- switching to performance testing

markus=# \timing
Timing is on.

markus=# DELETE FROM test;
DELETE 4
Time: 9.037 ms

markus=# INSERT INTO test SELECT generate_series(1, 10000000),
array[round(random() * 100)]::int[];
INSERT 0 10000000
Time: 53321.186 ms

markus=# SELECT icount(int_array_aggregate(id)) AS count FROM test;
  count
----------
 10000000
(1 row)

Time: 2493.184 ms

markus=# SELECT icount(int_array_append_aggregate(arr)) AS count FROM test;
  count
----------
 10000000
(1 row)

Time: 4152.478 ms





[HACKERS] pg_dump/pg_restore items

2008-09-06 Thread Andrew Dunstan


What is the state of the following items? I'm a little confused about 
whether or not work is being done on them.


. splitting pg_dump/pg_restore schema dumps into pre-data and post-data 
sections

. parallel pg_restore

cheers

andrew



Re: Fwd: [HACKERS] [Patch Review] TRUNCATE Permission

2008-09-06 Thread Ryan Bradetich
Hello Robert,

On Fri, Sep 5, 2008 at 8:13 PM, Robert Haas <[EMAIL PROTECTED]> wrote:
> Updated patch attached, based on comments from Ryan Bradetich and Tom
> Lane, and sync'd to latest CVS version.

Thanks for the update.   I am out of town until tomorrow evening.
I will re-review this patch when I get back if it has not been
committed by then.

- Ryan



Re: [HACKERS] reducing statistics write overhead

2008-09-06 Thread Simon Riggs

On Fri, 2008-09-05 at 15:23 -0400, Tom Lane wrote:

> How necessary is this given the recent fixes to allow the stats file to
> be kept on a ramdisk?

I would prefer this approach and back-out the other change.

On-demand is cheaper and easier to use.

> > Attached is a WIP patch, which basically implements this:
> 
> This patch breaks deadlock checking and statement_timeout, because
> backends already use SIGALRM.  You can't just take over that signal.
> It's possible that you could get things to work by treating this as an
> additional reason for SIGALRM, but that code is unreasonably complex
> already.  I'd suggest finding some other way.

There are other ways already in use in the backend, so just use those.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-06 Thread Simon Riggs

On Fri, 2008-09-05 at 23:21 +0900, Fujii Masao wrote:

> Pavan re-designed the sync replication based on the prototype
> and I posted that design doc on wiki. Please check it if you
> are interested in it.
> http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects

It's good to see the detailed design, many thanks.

I will begin looking at technical details next week.

> This design is too huge. In order to enhance the extensibility
> of postgres, I'd like to divide the sync replication into
> minimum hooks and some plugins and to develop them respectively.
> Plugins for sync replication are planned to be available at the
> time of the 8.4 release.

What is Core's commentary on this plan?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] reducing statistics write overhead

2008-09-06 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Fri, 2008-09-05 at 15:23 -0400, Tom Lane wrote:
>> How necessary is this given the recent fixes to allow the stats file to
>> be kept on a ramdisk?

> I would prefer this approach and back-out the other change.

Even if we get on-demand done, I wouldn't see it as a reason to back out
the statfile relocation work.  In an environment where the stats are
demanded frequently, you could still need that for performance.

(In fact, maybe this patch ought to include some sort of maximum update
rate tunable?  The worst case behavior could actually be WORSE than now.)

regards, tom lane



Re: [HACKERS] reducing statistics write overhead

2008-09-06 Thread Alvaro Herrera
Tom Lane wrote:

> (In fact, maybe this patch ought to include some sort of maximum update
> rate tunable?  The worst case behavior could actually be WORSE than now.)

Some sort of "if stats were requested in the last 500 ms, just tell the
requester to read the existing file".

Things that come to mind:

- autovacuum could use a more frequent stats update in certain cases

- Maybe we oughta have separate files, one for each database?  That way
we'd reduce unnecessary I/O traffic for both the reader and the writer.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] reducing statistics write overhead

2008-09-06 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Some sort of "if stats were requested in the last 500 ms, just tell the
> requester to read the existing file".

Hmm, I was thinking of delaying both the write and the reply signal
until 500ms had elapsed.  But the above behavior would certainly be
easier to implement, and would probably be good enough (TM).
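The behavior discussed here could be sketched in C as follows, with the current time passed in explicitly so the logic is easy to test, and with a per-request staleness bound so callers like autovacuum can ask for fresher data. The function and variable names are invented:

```c
#include <assert.h>

/* The stats collector rewrites the file only when the existing copy is
 * older than the staleness bound the requestor will accept (e.g. 500 ms
 * normally, less for autovacuum). */

static long stats_last_write_ms = -1000000;  /* when the file was last written */
static int  stats_writes = 0;

/* Handle one stats request arriving at 'now_ms' with the requestor's
 * acceptable staleness.  Returns 1 if the file was rewritten, 0 if the
 * requestor was told to just read the existing file. */
int handle_stats_request(long now_ms, long max_staleness_ms) {
    if (now_ms - stats_last_write_ms <= max_staleness_ms)
        return 0;                 /* existing file is fresh enough */
    stats_last_write_ms = now_ms; /* rewrite the file, then signal requestor */
    stats_writes++;
    return 1;
}
```

Under a burst of requests this caps the write rate at one rewrite per staleness window, while an idle system pays nothing at all.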

> - Maybe we oughta have separate files, one for each database?  That way
> we'd reduce unnecessary I/O traffic for both the reader and the writer.

The signaling would become way too complex, I think.  Also what do you
do about shared tables?

regards, tom lane



Re: [HACKERS] [COMMITTERS] pgsql: Fix an oversight in the 8.2 patch that improved mergejoin

2008-09-06 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Sat, 2008-09-06 at 13:06 +0100, Gregory Stark wrote:
>> Is that right? The materialize is just doing the same writing that the final
>> pass of the sort would have been doing. Did we discount the costs for sort 
>> for
>> that skipping writing that final pass when that was done?

> IIRC the cost of the sort didn't include the final merge, so when we
> avoided the final merge the cost model for the sort became accurate.
> Perhaps we should add something when we don't do that.

> It seems reasonable than an extra node should cost something anyhow, and
> the per tuple cost is the current standard way of indicating that extra
> cost.

Well, the problem with the previous coding was that the costs assigned
to the various nodes simply failed to add up, because the materialize
node injected by create_mergejoin_plan had not been accounted for at all
in the Path costs.  It's fair to question whether cost_sort couldn't be
improved to create a more accurate estimate of what's likely to happen
at runtime given recent changes in the sort code --- but that would be a
distinct issue.  I'm not entirely sure how to give cost_sort enough
context info, anyway.

regards, tom lane



Re: [HACKERS] [PATCH] "\ef " in psql

2008-09-06 Thread Tom Lane
I wrote:
> ... define
> \ef with no argument as being the command that presents an empty CREATE
> FUNCTION command template to fill in.

No complaints?  I'll go make that happen.

What about the general issue that neither \e nor \ef leave you with a
presentation of what's in the query buffer?  I haven't studied readline
enough to know if that's fixable.

regards, tom lane



Re: [HACKERS] [PATCH] "\ef " in psql

2008-09-06 Thread Abhijit Menon-Sen
At 2008-09-06 14:58:25 -0400, [EMAIL PROTECTED] wrote:
>
> I wrote:
> > ... define
> > \ef with no argument as being the command that presents an empty
> > CREATE FUNCTION command template to fill in.
> 
> No complaints?  I'll go make that happen.

No complaints, it sounds fine to me.

> What about the general issue that neither \e nor \ef leave you with a
> presentation of what's in the query buffer?

I don't know how that can be fixed; but I agree with Brendan that it's
behaviour that people are used to, and that it can be left alone for
now.

-- ams



Re: [HACKERS] reducing statistics write overhead

2008-09-06 Thread Asko Oja
Protection against too-frequent reads is already handled in the patch, but
these comments might lead it in new directions. The current implementation
already had this same limit: the file was written no more than once per
500 ms.

On Sat, Sep 6, 2008 at 9:12 PM, Tom Lane <[EMAIL PROTECTED]> wrote:

> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Some sort of "if stats were requested in the last 500 ms, just tell the
> > requester to read the existing file".
>
> > Things that come to mind:
>
> > - autovacuum could use a more frequent stats update in certain cases
>
> BTW, we could implement that by, instead of having a global tunable,
> including a field in the request message saying how stale an existing
> file is acceptable for this requestor.  500ms might be the standard
> value but autovac could use a smaller number.
>
>regards, tom lane
>


Re: [HACKERS] Need more reviewers!

2008-09-06 Thread Bruce Momjian
Martijn van Oosterhout wrote:
> On Thu, Sep 04, 2008 at 09:54:02PM +0100, Simon Riggs wrote:
> > * coding review - does it follow standard code guidelines? Are there
> > portability issues? Will it work on Windows/BSD etc? Are there
> > sufficient comments?
> > 
> > * code review - does it do what it says, correctly?
> 
> Just one thing though, I picked a random patch and started reading.
> However, the commitfest page doesn't link to anywhere that actually
> describes *what* the patch is trying to do. Many patches do have the
> design and the patch in one page, but some don't.
> 
> I suppose what happens is the original patch comes with the design and
> later a newer version is posted with just the changes. The commitfest page
> points to the latter, losing the former in the archives somewhere.

Yep, that is a problem; the previous emails about the patch and comments
are very valuable for reviewers.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Prototype: In-place upgrade v02

2008-09-06 Thread Bruce Momjian
Heikki Linnakangas wrote:
> > 4) This patch contains more topics for decision. The first is whether
> > this approach is acceptable in general.
> 
> I don't like the invasiveness of this approach. It's pretty invasive 
> already, and ISTM you'll need similar switch-case handling of all data 
> types that have changed the internal representation as well.
> 
> We've talked about this before, so you'll remember that I favor the 
> approach of converting the page format, page at a time, when the pages 
> are read in. I grant you that there are non-trivial issues with that as 
> well, like if the converted data takes more space and doesn't fit in the 
> page anymore.

I 100% agree with Heikki here; having the conversion spill out into the
main backend is very expensive and adds lots of complexity. The only
argument for Zdenek's approach is that it allows conversion to happen at
a more natural time than when the page is read in, but frankly I think
the conversion needs are going to be pretty limited and are better done
in a localized way at page read-in time.
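The read-in conversion idea can be sketched in C as follows. The header fields and version numbers are simplified stand-ins for the real page layouts, not the actual 8.2/8.3 formats:

```c
#include <assert.h>
#include <string.h>

/* Sketch of converting the page format, page at a time, on read-in:
 * the buffer manager checks a layout version stamped in the page header
 * and rewrites old-format pages into the current format before anyone
 * else sees them. */

#define PAGE_LAYOUT_OLD 3   /* stand-in for the previous release's format */
#define PAGE_LAYOUT_NEW 4   /* stand-in for the current format */

typedef struct PageHeader {
    int layout_version;
    int flags;              /* field that exists only in the new layout */
    char data[64];
} PageHeader;

static int pages_converted = 0;

/* Called when a page is read in: upgrade it if it carries an old version.
 * Returns 1 if converted, 0 if already current, -1 on unknown layout. */
int convert_page_on_read(PageHeader *page) {
    if (page->layout_version == PAGE_LAYOUT_NEW)
        return 0;                       /* already current, nothing to do */
    if (page->layout_version != PAGE_LAYOUT_OLD)
        return -1;                      /* unknown layout, refuse */
    page->flags = 0;                    /* initialize the new-layout field */
    page->layout_version = PAGE_LAYOUT_NEW;
    pages_converted++;
    return 1;                           /* converted in place */
}
```

The hard case raised above, a converted page that no longer fits, is exactly what this sketch glosses over; handling it needs either the fallback tricks Heikki mentions or a whole-table rewrite command.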

As for the page not fitting after conversion, what about a user command
that converts an entire table to the new format if page expansion fails?
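A rough sketch of that read-in conversion with a fallback. All struct layouts and names here are invented for illustration (they are not the real PageHeaderData definitions from any release); the point is only the shape of the "converted page doesn't fit" escape hatch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BLCKSZ 8192

/* Invented "old" page header: just the free-space pointers. */
typedef struct {
    unsigned short pd_lower;    /* end of line-pointer array */
    unsigned short pd_upper;    /* start of tuple data */
} OldPageHeader;

/* Invented "new" header that grew pd_flags and pd_prune_xid. */
typedef struct {
    unsigned short pd_lower;
    unsigned short pd_upper;
    unsigned short pd_flags;
    unsigned int   pd_prune_xid;
} NewPageHeader;

/*
 * Convert one page in place as it is read in.  Returns false when the
 * larger header would overlap tuple data ("page expansion fails"), in
 * which case a table-level rewrite command would have to take over.
 */
static bool
convert_page_on_read(char *page)
{
    OldPageHeader old;
    NewPageHeader new_hdr;
    size_t grow = sizeof(NewPageHeader) - sizeof(OldPageHeader);

    memcpy(&old, page, sizeof(old));
    if ((size_t) old.pd_lower + grow > old.pd_upper)
        return false;           /* no room to shift the line pointers */

    new_hdr.pd_lower = (unsigned short) (old.pd_lower + grow);
    new_hdr.pd_upper = old.pd_upper;
    new_hdr.pd_flags = 0;
    new_hdr.pd_prune_xid = 0;   /* unknown until the first prune */

    /* shift the line-pointer array down to make room for the header */
    memmove(page + sizeof(NewPageHeader),
            page + sizeof(OldPageHeader),
            old.pd_lower - sizeof(OldPageHeader));
    memcpy(page, &new_hdr, sizeof(new_hdr));
    return true;
}
```

The false return is where the per-table rewrite command would kick in.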

> I wonder if we could go with some sort of a hybrid approach? Convert the 
>   whole page when it's read in, but if it doesn't fit, fall back to 
> tricks like loosening the alignment requirements on platforms that can 
> handle non-aligned data, or support a special truncated page header, 
> without pd_tli and pd_prune_xid fields. Just a thought, not sure how 
> feasible those particular tricks are, but something along those lines..
> 
> All in all, though. I find it a bit hard to see the big picture. For 
> upgrade-in-place, what are all the pieces that we need? To keep this 
> concrete, let's focus on PG 8.2 -> PG 8.3 (or are you focusing on PG 8.3 
> -> 8.4? That's fine with me as well, but let's pick one) and forget 
> about hypothetical changes that might occur in a future version. I can see:
> 1. Handling page layout changes (pd_prune_xid, pd_flags)
> 2. Handling tuple header changes (infomask2, HOT bits, combocid)
> 3. Handling changes in data type representation (packed varlens)
> 4. Toast chunk size
> 5. Catalogs
> 
> After putting all those together, how large a patch are we talking 
> about, and what's the performance penalty then? How much of all that 
> needs to be in core, and how much can live in a pgfoundry project or an 
> extra binary in src/bin or contrib? I realize that none of us have a 
> crystal ball, and one has to start somewhere, but I feel uneasy 
> committing to an approach until we have a full plan.

Yes, another very good point.

I am ready to focus on these issues for 8.4;  all this needs to be
fleshed out, perhaps on a wiki.  As a starting point, what would be
really nice is to start a wiki that lists all data format changes for
every major release.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



[HACKERS] About that CommitFest redirect page ...

2008-09-06 Thread Tom Lane
So according to
http://wiki.postgresql.org/index.php?title=CommitFest&action=history
there's been rather a lot of confusion about where the CommitFest
redirect page should point when.

I think the problem is that we need two redirect pages: one for "the
place where you should submit a new patch" and one for "the commitfest in
progress".  The problem is to choose names that will make it reasonably
obvious which is which.

Another possibility is to convert CommitFest to a plain page with
suitably-explained links leading to the two relevant pages.  However,
a redirect would be nicer because you could bookmark it.

Ideas?

regards, tom lane



Re: [HACKERS] reducing statistics write overhead

2008-09-06 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Some sort of "if stats were requested in the last 500 ms, just tell the
> requester to read the existing file".

> Things that come to mind:

> - autovacuum could use a more frequent stats update in certain cases

BTW, instead of having a global tunable, we could implement that by
including a field in the request message saying how stale an existing
file is acceptable to this requestor.  500ms might be the standard
value, but autovac could use a smaller number.
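As a rough sketch (the message and field names here are invented, not the actual pgstat protocol), the per-requestor staleness bound could look like:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Invented inquiry message: the requestor says how stale a stats file
 * it is willing to accept, instead of relying on one global tunable. */
typedef struct {
    int32_t m_databaseid;    /* whose stats are wanted */
    int32_t m_max_stale_ms;  /* oldest acceptable file age, in ms */
} StatsInquiryMsg;

/*
 * Collector side: rewrite the stats file only when the existing one is
 * older than this particular requestor tolerates.  A regular backend
 * might send 500 ms; autovacuum would send a smaller number.
 */
static bool
need_new_statsfile(int64_t last_write_ms, int64_t now_ms,
                   const StatsInquiryMsg *msg)
{
    return (now_ms - last_write_ms) > (int64_t) msg->m_max_stale_ms;
}
```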

regards, tom lane



Re: [HACKERS] [PATCH] "\ef " in psql

2008-09-06 Thread Bruce Momjian
Tom Lane wrote:
> "Brendan Jurd" <[EMAIL PROTECTED]> writes:
> > On Sat, Sep 6, 2008 at 10:10 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> >> ... I changed
> >> the exit code to PSQL_CMD_NEWEDIT instead of PSQL_CMD_SEND, which causes
> >> the command to wait in the query buffer.
> 
> > The principle of least astonishment suggests that \ef should behave in
> > the same way as \e.
> 
> Quite.

So, are they consistent now or do we need another patch?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] pg_dump/pg_restore items

2008-09-06 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> What is the state of the following items? I'm a little confused about 
> whether or not work is being done on them.

> . splitting pg_dump/pg_restore schema dumps into pre-data and post-data 
> sections

A patch for that was proposed and rejected in the last fest; I'm not
sure if anyone is trying to bring it up to speed, but it's in the
archives if anyone wants to pick it up.

> . parallel pg_restore

I've not heard that anyone has done any work on this.

regards, tom lane



Re: [HACKERS] reducing statistics write overhead

2008-09-06 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:

> > - Maybe we oughta have separate files, one for each database?  That way
> > we'd reduce unnecessary I/O traffic for both the reader and the writer.
> 
> The signaling would become way too complex, I think.  Also what do you
> do about shared tables?

They are already stored in a "separate database" (denoted with an
InvalidOid dbid), and autovacuum grabs it separately.  I admit I don't
know what regular backends do about it.

As for signalling, maybe we could implement something like we do for the
postmaster signal stuff: the requestor stores a dbid in shared memory
and sends a SIGUSR2 to pgstat or some such.  We'd have enough shmem
space for a reasonable number of requests, and pgstat consumes them from
there into local memory (similar to what Andrew proposes for
LISTEN/NOTIFY); it stores the dbid and PID of the requestor.  As soon as
the request has been fulfilled, pgstat responds by signalling that
particular backend.
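A minimal model of that bookkeeping (the slot count and names are invented, and the real shared-memory allocation, locking, and SIGUSR2 plumbing are elided) might look like:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STATS_REQUEST_SLOTS 64  /* "enough shmem space for a
                                 * reasonable number of requests" */

typedef struct {
    uint32_t dbid;   /* database whose stats are wanted */
    int32_t  pid;    /* requesting backend; 0 marks a free slot */
} StatsRequestSlot;

/* stand-in for the shared-memory array */
static StatsRequestSlot request_shmem[STATS_REQUEST_SLOTS];

/*
 * Backend side: claim a free slot.  In a real implementation this
 * would run under a lock and be followed by a SIGUSR2 to pgstat.
 */
static bool
stats_request_put(uint32_t dbid, int32_t pid)
{
    for (int i = 0; i < STATS_REQUEST_SLOTS; i++)
        if (request_shmem[i].pid == 0) {
            request_shmem[i].dbid = dbid;
            request_shmem[i].pid = pid;
            return true;
        }
    return false;    /* table full: requestor retries later */
}

/*
 * Collector side: drain pending requests into local memory, freeing
 * the shared slots, and keep (dbid, pid) so each backend can be
 * answered once its request has been fulfilled.
 */
static int
stats_request_consume(StatsRequestSlot *local, int maxn)
{
    int n = 0;
    for (int i = 0; i < STATS_REQUEST_SLOTS && n < maxn; i++)
        if (request_shmem[i].pid != 0) {
            local[n++] = request_shmem[i];
            request_shmem[i].pid = 0;
        }
    return n;
}
```

Note that the free-slot marker is pid rather than dbid, since dbid 0 (InvalidOid) is a legitimate request for the shared catalogs.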

-- 
Alvaro Herrera       http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] [PATCH] "\ef " in psql

2008-09-06 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> "Brendan Jurd" <[EMAIL PROTECTED]> writes:
>>> The principle of least astonishment suggests that \ef should behave in
>>> the same way as \e.
>> 
>> Quite.

> So, are they consistent now or do we need another patch?

They are consistent, I just don't like either of them ;-)

regards, tom lane



Re: [HACKERS] About that CommitFest redirect page ...

2008-09-06 Thread Alvaro Herrera
Tom Lane wrote:
> So according to
> http://wiki.postgresql.org/index.php?title=CommitFest&action=history
> there's been rather a lot of confusion about where the CommitFest
> redirect page should point when.
> 
> I think the problem is that we need two redirect pages: one for "the
> place where you should submit a new patch" and one for "the commitfest in
> progress".  The problem is to choose names that will make it reasonably
> obvious which is which.

I suggest two redirects, CommitFestInProgress and CommitFestOpen, and
turning CommitFest into a plain page with suitable text pointing to both
redirects.

We'd also need a page saying "there is no commitfest currently in
progress; maybe you wanted to go to CommitFestOpen instead?" for
CommitFestInProgress to point at in the period between commitfests.

-- 
Alvaro Herrera       http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] reducing statistics write overhead

2008-09-06 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> As for signalling, maybe we could implement something like we do for the
> postmaster signal stuff: the requestor stores a dbid in shared memory
> and sends a SIGUSR2 to pgstat or some such.

No, no, no.  Martin already had a perfectly sane design for that
direction of signalling: send a special stats message to the collector.
That can carry whatever baggage it needs to.  It's the reverse direction
of "the data you requested is available now, sir" that is tricky.
And I fear that having to keep track of multiple stats-collector output
files would make it very significantly trickier --- both for the stats
collector's own bookkeeping and for the signaling mechanism itself.
I don't believe it's gonna be worth that.

regards, tom lane



Re: [HACKERS] About that CommitFest redirect page ...

2008-09-06 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> I suggest two redirects CommitFestInProgress and CommitFestOpen, and
> turning CommitFest into a plain page with suitable text pointing to both
> redirects.

> We'd also need a page saying "there is no commitfest currently in
> progress; maybe you wanted to go to CommitFestOpen instead?" to point
> the CommitFestInProgress to in the period between commitfests.

Works for me ...

regards, tom lane



Re: [HACKERS] [PATCH] "\ef " in psql

2008-09-06 Thread Alvaro Herrera
Abhijit Menon-Sen wrote:
> At 2008-09-06 14:58:25 -0400, [EMAIL PROTECTED] wrote:

> > What about the general issue that neither \e nor \ef leave you with a
> > presentation of what's in the query buffer?
> 
> I don't know how that can be fixed; but I agree with Brendan that it's
> behaviour that people are used to, and that it can be left alone for
> now.

As long as it does not execute the query when the user exits without
saving the buffer, it should be OK.

-- 
Alvaro Herrera       http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] reducing statistics write overhead

2008-09-06 Thread Tom Lane
I wrote:
> No, no, no.  Martin already had a perfectly sane design for that
> direction of signalling: send a special stats message to the collector.

Actually ... given that the stats message mechanism is designed to be
lossy under high load, maybe that isn't so sane.  At the very least
there would have to be timeout-and-resend logic on the backend side.

I dislike the alternative of communicating through shared memory,
though.  Right now the stats collector isn't even connected to shared
memory.

regards, tom lane



Re: [HACKERS] [PATCH] allow has_table_privilege(..., 'usage') on sequences

2008-09-06 Thread Tom Lane
"Jaime Casanova" <[EMAIL PROTECTED]> writes:
> On Thu, Aug 7, 2008 at 3:08 AM, Abhijit Menon-Sen <[EMAIL PROTECTED]> wrote:
>> I just noticed, to my dismay, that has_table_privilege() does not allow
>> me to check for usage privileges on sequences.

> Maybe we want a new function has_sequence_privilege() instead?

Yeah, that seems like the $64 question for this patch.  The presented
patch is certainly simple (it lacks only documentation to be considered
committable).  The question is do we want to fuzz things up to the
extent of pretending that USAGE is a table privilege.  The GRANT code
certainly doesn't think so:

regression=# grant usage on table t1 to joe;
ERROR:  invalid privilege type USAGE for table

and in fact aclchk.c devotes quite a few lines of code to making sure
that sequence and table privileges are kept appropriately distinct.

As of right now, the proposed patch looks like a nice easy solution to a
minor problem.  But I'm concerned that we might be backing ourselves
into a corner by inserting this inconsistency --- some day it might
cause a real problem.  It also seems that it would be throwing away
a lot of hard work that was already put into aclchk.c to maintain the
distinction.

So I'm thinking it would be better to invent a has_sequence_privilege
family of functions.

On the other hand, that would require a couple hundred lines of new code
and docs.  Even though it'd be a pretty easy copy-and-paste task,
perhaps that's overkill for what I have to admit is a mostly
hypothetical worry about future inconsistency.

Thoughts?

regards, tom lane



Re: [HACKERS] [PATCH] "\ef " in psql

2008-09-06 Thread Bruce Momjian
Alvaro Herrera wrote:
> Abhijit Menon-Sen wrote:
> > At 2008-09-06 14:58:25 -0400, [EMAIL PROTECTED] wrote:
> 
> > > What about the general issue that neither \e nor \ef leave you with a
> > > presentation of what's in the query buffer?
> > 
> > I don't know how that can be fixed; but I agree with Brendan that it's
> > behaviour that people are used to, and that it can be left alone for
> > now.
> 
> As long as it does not execute the query when the user exits without
> saving the buffer, it should be OK.

Well, it works like \e now, which is good.  The only complexity is that
\e works differently depending on whether you use ';' or \g, meaning:

SELECT 1;
\e

will execute the buffer on exit (saved or not), while

SELECT 1
\g
\e

will not execute the buffer on editor exit.  Our current \ef code does
not add a trailing semicolon to the CREATE FUNCTION buffer contents so
it works like the second case, which is probably the best we are going
to do.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Review Report: propose to include 3 new functions into intarray and intagg

2008-09-06 Thread Tom Lane
Markus Wanner <[EMAIL PROTECTED]> writes:
> Submission review: we generally prefer having patches archived on our
> mailing lists, so please just send future patches or revisions of this
> patch to our lists (I prefer -hackers, but probably -patches is still
> the official one).

Please.  But -patches is dead, please use -hackers.

> The documentation should go into a README file of the
> contrib module or some such.

No, it should go into the SGML docs, specifically doc/src/sgml/intarray.sgml
and intagg.sgml.  We got rid of flat-text README documentation for
contrib in 8.3, and are not about to permit any backsliding.

> Summary: My general feeling is that this patch should be applied after 
> minor code style corrections. As a longer-term goal I think intagg 
> should be integrated into core, since it's very basic functionality.

Well, what should be integrated is generic (non-datatype-specific)
versions of this functionality, ie functions working on
anyarray/anyelement not just int4[]/int4.  There was some discussion
about that a few days ago.  I'm not really sure how painful it would be
to do; probably the generalization to non-fixed-width types would be the
trickiest part.

regards, tom lane



Re: [PATCHES] [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker)

2008-09-06 Thread David Rowley

Tom Lane wrote:
> I wrote:
> > I looked this over a bit and was immediately confused by one thing:
> > the introductory comment says that the skip table size ought to be based
> > on the length of the haystack, which makes sense to me, but the code is
> > actually initializing it on the basis of len2, ie, the length of the
> > needle.  Isn't that a bug?  Was the same bug present in the tests you
> > made to determine the best table sizes?
>
> BTW, to the extent that you feel like testing a different idea,
> I would suggest:
>
> * don't bother initializing the skiptable when len1 < len2
>
> * otherwise, choose its size based on len1 - len2, not just len1 or
> len2.  This is (one less than) the maximum number of search loop
> consultations of the skip table that can happen, so it seems like a
> plausible number, and better than either length alone.


I've made the discussed changes. Also updated the benchmark results.
http://www.unixbeast.com/~fat/8.3_test_v1.3.xls


On re-benchmarking to determine the best skip table size, the changes to the
logic didn't seem to affect the results enough to warrant changing the
thresholds. But I'm sure it will perform a little better in cases where
both the needle and haystack are very long and close to the same length.
Having a big skip table in that case is a bit of a waste, as the search
won't get to make much use of it.
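For illustration, sizing the skip table from len1 - len2 (one less than the maximum number of skip-table consultations, as Tom suggested) could look like the following sketch; the breakpoints here are made up, not the thresholds actually benchmarked for the patch:

```c
#include <assert.h>

/*
 * Choose a skip-table size from the distance len1 - len2, i.e. one less
 * than the maximum number of times the search loop can consult the
 * table.  The breakpoints are illustrative, not the benchmarked ones.
 */
static int
choose_skiptable_size(int len1 /* haystack */, int len2 /* needle */)
{
    int distance = len1 - len2;

    if (distance < 1)
        return 0;       /* needle can't advance: skip the table entirely */
    if (distance < 16)
        return 16;      /* short searches: cheap initialization wins */
    if (distance < 64)
        return 64;
    return 256;         /* cap at one entry per possible byte value */
}
```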


boyer-moore-strpos_v1.3.patch
Description: Binary data



Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-06 Thread Bruce Momjian
Markus Wanner wrote:
> > Hook for WALSender
> > --
> > This hook is for introducing WALSender. There are the following
> > three ideas of how to introduce WALSender. A required hook
> > differs by which idea is adopted.
> > 
> > a) Use WALWriter as WALSender
> > 
> >    This idea needs a WALWriter hook which literally intercepts
> >    WALWriter: WALWriter stops the local WAL write and focuses on
> >    WAL sending. This idea is very simple, but I can't think of a
> >    use for the WALWriter hook other than WAL sending.

The problem with this approach is that you are not writing WAL to disk
_while_ you are, in parallel, sending WAL to the slave;  I think doing
both concurrently is useful for performance reasons in synchronous
replication.

> > b) Use new background process as WALSender
> > 
> >    This idea needs a background-process hook which enables users
> >    to define new background processes. I think the design of this
> >    hook resembles that of the rmgr hook proposed by Simon. I'd define
> >    a table like RmgrTable for registering the functions (e.g. a main
> >    function, an exit function...) that operate a background process.
> >    Postmaster calls these functions from the table as appropriate,
> >    and manages the start and end of each background process. ISTM
> >    that this hook has many uses, e.g. a performance monitoring
> >    process like statspack.

I think starting/stopping a process for each WAL send is too much
overhead.
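Fujii-san's "table like RmgrTable" for option b) might look roughly like this sketch; all names and fields are invented, and the actual fork and signal plumbing is elided:

```c
#include <assert.h>
#include <stddef.h>

/* Invented registration entry: the callbacks postmaster would need to
 * run a plugin-defined background process such as a WALSender. */
typedef struct {
    const char *name;           /* e.g. "walsender" */
    void (*main_fn)(void);      /* process main loop */
    void (*exit_fn)(int code);  /* cleanup at shutdown */
} BgProcessEntry;

#define MAX_BG_PROCESSES 8
static BgProcessEntry bg_process_table[MAX_BG_PROCESSES];
static int bg_process_count = 0;

/* dummy main loop for demonstration */
static void demo_main(void) { /* would loop sending WAL */ }

/*
 * Called by a plugin at load time.  Postmaster would later walk the
 * table, fork one child per entry, and call main_fn inside it.
 * Returns the slot index, or -1 if the table is full or the entry
 * is invalid.
 */
static int
register_bg_process(const char *name,
                    void (*main_fn)(void), void (*exit_fn)(int))
{
    if (bg_process_count >= MAX_BG_PROCESSES || main_fn == NULL)
        return -1;
    bg_process_table[bg_process_count].name = name;
    bg_process_table[bg_process_count].main_fn = main_fn;
    bg_process_table[bg_process_count].exit_fn = exit_fn;
    return bg_process_count++;
}
```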

> > c) Use one backend as WALSender
> > 
> >In this idea, slave calls the user-defined function which
> >takes charge of WAL sending via SQL e.g. "SELECT pg_walsender()".
> >    Compared with other ideas, it's easy to implement WALSender
> >    because postmaster handles the establishment and authentication
> >    of the connection. But this SQL causes a long transaction which
> >    prevents vacuum. So this idea needs an idle-state hook which
> >    executes the plugin before a transaction starts. I can't think
> >    of a use for this hook other than WAL sending either.
> 
> The above cited wiki page sounds like you've already decided on b).

I assumed that there would be a background process like bgwriter that
would be notified during a commit and send the appropriate WAL files to
the slave.

> I'm unclear on what you want hooks for. If additional processes get 
> integrated into Postgres, those certainly need to get integrated very 
> much like we integrated other auxiliary processes. I wouldn't call that 
> 'hooking', but YMMV.

Yea, I am unclear how this is going to work using simple hooks.

It sounds like Fujii-san is basically saying they can only get the hooks
done for 8.4, not the actual solution.  But, as I said above, I am
unclear how a hook solution would even work long-term;  I am afraid it
would be thrown away once an integrated solution was developed.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Need more reviewers!

2008-09-06 Thread Abbas Butt
On Thu, 2008-09-04 at 10:45 -0700, Josh Berkus wrote:
> We currently have 38 patches pending, and only nine people reviewing them.
> At this rate, the September commitfest will take three months.
>
> If you are a postgresql hacker at all, or even want to be one, we need your
> help reviewing patches!  There are several "easy" patches in the list, so
> I can assign them to beginners.
>
> Please volunteer now!

Hi Josh,

I volunteer as a reviewer; please assign a patch to me.

Regards
Abbas
www.enterprisedb.com


Re: [PATCHES] [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker)

2008-09-06 Thread Tom Lane
"David Rowley" <[EMAIL PROTECTED]> writes:
> I've made the discussed changes. Also updated the benchmark results.
> http://www.unixbeast.com/~fat/8.3_test_v1.3.xls

Applied with revisions; mostly cosmetic except for one point.  I
realized after studying the code a bit more that B-M cannot possibly win
for a single-character pattern (needle), since the skip distance must
always be 1 in that case.  The fact that it seemed to keep up at that
length has to be because the original coding included a strncmp call
inside the innermost loop, which likely prevents the compiler from
optimizing that loop really tightly.  But the strncmp wasn't doing
anything anyway for the case of pattern length = 1.  So what I committed
special-cases pattern length 1 to be a naive search with a *very* tight
inner loop.  I think it's worth troubling over this case because a
common usage is split_to_array and suchlike with single-character
delimiters.
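For illustration, that length-1 special case amounts to a bare scan with the tightest possible inner loop, along these lines (a sketch, not the committed code):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Naive scan for a one-byte needle.  Boyer-Moore can never skip more
 * than one position in this case, so a plain byte-at-a-time loop wins;
 * this is essentially what memchr() does.
 */
static const char *
search_one_byte(const char *hay, size_t len, char needle)
{
    const char *p = hay;
    const char *end = hay + len;

    while (p < end) {
        if (*p == needle)
            return p;    /* first match */
        p++;
    }
    return NULL;         /* not found */
}
```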

regards, tom lane



[HACKERS] Noisy CVS updates

2008-09-06 Thread D'Arcy J.M. Cain
When I build from CVS I wind up with this in my CVS update email in the
morning:

? GNUmakefile
? config.log
? config.status
? src/Makefile.global
? src/backend/postgres
? src/backend/bootstrap/bootstrap_tokens.h
? src/backend/catalog/postgres.bki
? src/backend/catalog/postgres.description
? src/backend/catalog/postgres.shdescription
? src/backend/parser/parse.h
? src/backend/snowball/snowball_create.sql

...etc.  Would it be OK if I went in and added .cvsignore files to keep
the noise level down?  I guess I could have my daily script filter them
out but there may be times when there really is an unexpected file and
I want to follow up on it.

-- 
D'Arcy J.M. Cain <[EMAIL PROTECTED]> |  Democracy is three wolves
http://www.druid.net/darcy/ |  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.
