Re: [HACKERS] Maximum reasonable bgwriter_delay

2007-06-20 Thread Greg Smith

On Tue, 19 Jun 2007, Heikki Linnakangas wrote:


Simon Riggs wrote:

Laptop mode? Linux has it...


Granted, though you're still going to wake up every second, so I'm not sure 
how much it helps with battery life.


In this context, Linux's laptop mode is all about keeping the disks from 
spinning up any more than they have to; the fact that the CPU does a 
little something occasionally isn't so important.  I don't think that's an 
argument for keeping the current range for this parameter though.  The 
goal for a proper laptop mode tuning is for the hard drive to go minutes 
at a time between waking, and whether bgwriter_delay is 1s or 10s really 
isn't that big of a difference relative to that scale.  Unless you dirty a 
lot of memory, the laptop mode tuning is going to cache all the writes 
anyway until it hits the interval where it wakes the disk to catch up.


I can't think of any good reason why the bgwriter_delay can't be reduced 
to 1s if that simplifies things.  You'd need a pretty old system for a 
longer delay than that to be appropriate.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Maximum reasonable bgwriter_delay

2007-06-20 Thread Tom Lane
Greg Smith [EMAIL PROTECTED] writes:
 I can't think of any good reason why the bgwriter_delay can't be reduced 
 to 1s if that simplifies things.

The simplification Heikki suggests would save a grand total of 9 lines
of C code, two of which are braces.  Is it really worth it to make such
stringent assumptions about what the parameter is good for?

(Having a GUC parameter at all costs more lines than that, not even
counting its documentation.)

regards, tom lane



[HACKERS] Suggestion for Enum Support Functions

2007-06-20 Thread toronto programmer
Dear Postgres developers, 

I have been working with Oracle for a few years now in my work, and I tried some 
free databases for a project that I'm developing for my own use. I have tried 
H2, Firebird, and Postgres, and found the last to be the most stable and 
feature-rich, so thanks for all the good work.

I have read the 8.3 documentation, and with reference to the Enum Support Functions 
found on http://developer.postgresql.org/pgdocs/postgres/functions-enum.html, I 
think it would be useful to add two functions, enum_after(anyenum) and 
enum_before(anyenum), so that having:

CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 
'purple');
enum_after('orange'::rainbow) will return 'yellow'
enum_after('purple'::rainbow) will return an error
enum_before('purple'::rainbow) will return 'blue'

A good-to-have function would be enum_size(anyenum), which would return 6 in the 
previous example; that would be helpful in dealing with enums.

Best regards

Hashim Kubba 
 







Re: [HACKERS] PG-MQ?

2007-06-20 Thread Chris Browne
[EMAIL PROTECTED] (Steve Atkins) writes:
 Is there any existing work out there on this?  Or should I maybe be
 looking at prototyping something?

The skype tools have some sort of decent-looking publish/subscribe
thing, PgQ, which they then layer their replication on top of. It's
multi-consumer and multi-producer, with delivered-at-least-once semantics.

 Looks nice.

I had not really noticed that - I need to take a look at their
connection pooler too, so I guess that puts more skype items on my
ToDo list ;-).  Thanks for pointing it out...
-- 
let name="cbbrowne" and tld="linuxdatabases.info" in String.concat "@" 
[name;tld];;
http://cbbrowne.com/info/advocacy.html
Signs of a Klingon Programmer #1: Our users will  know fear and cower
before our software. Ship it! Ship it and let  them flee like the dogs
they are!



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Jeroen T. Vermeulen
On Wed, June 20, 2007 04:45, Chris Browne wrote:
 I'm seeing some applications where it appears that there would be
 value in introducing asynchronous messaging, ala message queueing.
 http://en.wikipedia.org/wiki/Message_queue

 The granddaddy of message queuing systems is IBM's MQ-Series, and I
 don't see particular value in replicating its functionality.

I'm quite interested in this.  Maybe I'm thinking of something too
complex, but I do think there are some "oh, it'll need to do that too"
pitfalls that are best considered up front.

The big thing about MQ is that it participates as a resource manager in
two-phase commits (and optionally a transaction manager as well).  That
means that you get atomic processing steps: application takes message off
a queue, processes it, commits its changes to the database, replies to
message.  The queue manager then does a second-phase commit for all of
those steps, and that's when the reply really goes out.  If the
application fails, none of this will have happened so you get ACID over
the complete cycle.  That's something we should have free software for.
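
For the PostgreSQL side of such a cycle, the building block exists already: a transaction can be prepared in one step and committed in a second, with an external transaction manager driving the second phase. A minimal sketch of the resource-manager role (the transaction identifier is illustrative, and max_prepared_transactions must be set above zero):

```sql
-- Phase one: do the work, then park the transaction in prepared state.
BEGIN;
-- ... take the message off a queue table, apply the application's changes ...
PREPARE TRANSACTION 'msg-12345';

-- Phase two: the external transaction manager later resolves it,
-- once all other participants have prepared successfully:
COMMIT PREPARED 'msg-12345';
-- or, if any participant failed to prepare:
-- ROLLBACK PREPARED 'msg-12345';
```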

Perhaps the time is right for something new.  A lot of the complexity
inside MQ comes from data representation issues like encodings and
fixed-length strings, as I recall, and things have changed since MQ was
designed.  I agree it could be useful (and probably not hard either) to
have a transactional messaging system inside the database.  It saves you
from having to do two-phase commits.

But it does tie everything to postgres to some extent, and you lose the
interesting features—atomicity and assured, single delivery—as soon as
anything in the chain does anything persistent that does not participate
in the postgres transaction.  Perhaps what we really need is more mature
components, with a unified control layer on top.  That's how a lot of
successful free software grows.  See below.


 On the other side, the big names these days are:

 a) The Java Messaging Service, which seems to implement *way* more
options than I'm even vaguely interested in having (notably, lots
that involve data stores or lack thereof that I do not care to use);

Far as I know, JMS is an API, not a product.  You'd still slot some
messaging middleware underneath, such as MQ.  That is why MQSeries was
renamed: it fits into the WebSphere suite as the implementing engine
underneath the JMS API.  From what I understand MQ is one of the
best-of-breed products that JMS was designed around.  (Sun's term, bit
hypey for my taste).

In one way, Java is easy: the last thing you want to get into is yet
another marshaling standard.  There are plenty of standards to choose
from already, each married to one particular communications mechanism:
RPC, EDI, CORBA, D-Bus, XMLRPC, what have you.  Even postgres has its own.
 I'd say the most successful mechanism is TCP itself, because it isolates
itself from content representation so effectively.

It's hard not to get into marshaling: someone has to do it, and it's often
a drag to do it in the application, but the way things stand now *any*
choice limits the usefulness of what you're building.  That's something
I'd like to see change.

Personally I'd love to see marshaling or low-level data representation
isolated into a mature component that speaks multiple programming
languages on the one hand and multiple data representation formats on the
other.  Something the implementers of some of these messaging standards
would want to use to compose their messages, isolating their format
definitions into plugins.  Something that would make application writers
stop composing messages in finicky ad-hoc code that fails with unexpected
locales or has trouble with different line breaks.

If we had a component like that, combining it with existing transactional
variants of TCP and [S]HTTP might even be enough to build a usable
messaging system.  I haven't looked at them enough to know.  Of course
we'd need implementations of those protocols; see
http://ttcplinux.sourceforge.net/ and http://www.csn.ul.ie/~heathclf/fyp/
for example.

Another box of important tools, and I have no idea where we stand with
this one, is transaction management.  We have 2-phase commit in postgres
now.  But do we have interoperability with existing transaction managers? 
Is there a decent free, portable, everything-agnostic transaction manager?
 With those, the sphere of reliability of a database-driven messaging
package could extend much further.

A free XA-capable filesystem would be great too, but I guess I'm daydreaming.


 There tend to be varying semantics out there:

 - Some queues may represent subscriptions where a whole bunch of
   listeners want to get all the messages;

The two simplest models that offer something more than TCP/UDP are 1:n
reliable publish-subscribe without persistence, and 1:1 request-reply with
persistent storage.  D-Bus does them both; IIRC MQ does 1:1 and has
add-ons on top for publish-subscribe.

I could imagine 

Re: [HACKERS] PG-MQ?

2007-06-20 Thread Markus Schiltknecht

Hi Chris,

Chris Browne wrote:

I'm seeing some applications where it appears that there would be
value in introducing asynchronous messaging, ala message queueing.
http://en.wikipedia.org/wiki/Message_queue


ISTM that 'message queue' is way too general a term. There are hundreds 
of different queues at different levels on a standard server, so I'm 
somewhat unsure about what problem you want to solve.



c) There are lesser names, like isectd http://isectd.sf.net and the
(infamous?) Spread Toolkit which both implement memory-based messaging
systems.


If a GCS is about what you're looking for, then you also might want to 
consider these: ensemble, appia or jGroups. There's a Java layer called 
jGCS, which supports even more, similar systems.


Another commonly used term is 'reliable multicast', which guarantees 
that messages are delivered to a group of recipients. These algorithms 
often are the basis for a GCS.



My bias would be to have something that can basically run as a thin
set of stored procedures atop PostgreSQL :-).  It would be trivial to
extend that to support SOAP/XML-RPC, if desired.


Hm.. in Postgres-R I currently have (partial) support for ensemble and 
spread. Exporting that interface via stored procedures could be done, 
but you would probably need a manager process, as you certainly want 
your connections to persist across transactions (or not?).


Together with that process, we already have half of what Postgres-R is: 
an additional process which connects to the GCS. Thus I'm questioning 
whether there's value in exporting the interface. Can you think of use 
cases other than database replication? Why would you want to do that via 
the database, then, and not directly with the GCS?



It would be nice to achieve 'higher availability' by having queues
where you might replicate the contents (probably using the MQ system
itself ;-)) to other servers.


Uhm.. sorry, but I fail to see the big news here. Which replication 
solution does *not* work that way?


Regards

Markus




Re: [HACKERS] PG-MQ?

2007-06-20 Thread Marko Kreen

On 6/20/07, Jeroen T. Vermeulen [EMAIL PROTECTED] wrote:

On Wed, June 20, 2007 04:45, Chris Browne wrote:
 - Sometimes you have the semantics where:
   - messages need to be delivered at least once
   - messages need to be delivered no more than once
   - messages need to be delivered exactly once

IMHO, if you're not doing exactly once, or something very close to it,
you might as well stay with ad-hoc code.  You can ensure single delivery
by having the sender re-send when in doubt, and keeping track of
duplications in the recipient.
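
On the receiving side, that duplicate-tracking can be as simple as a unique message id. A sketch with illustrative names (not any particular product's schema):

```sql
-- The sender assigns each message an id once, and re-sends whenever
-- delivery is in doubt.
CREATE TABLE inbox (
    msg_id   bigint PRIMARY KEY,  -- chosen once by the sender
    payload  text NOT NULL
);

-- A redelivered message violates the primary key, so the recipient can
-- detect the duplicate and discard it instead of processing it twice.
INSERT INTO inbox VALUES (42, 'hello');
```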


In the case of PgQ, the at-least-once semantics comes from the batch-based
processing it does: in case of failure, the full batch is delivered again,
so if the consumer had already managed to process some of the items, it
gets them twice.

As it is responsible only for delivering events from the database,
it has no way of guaranteeing exactly-once behaviour; that needs
to be built on top of PgQ.

The simplest case is when the events are processed in the same database
where the queue resides.  Then you can just fetch, process, and close the
batch in one transaction, and you immediately get exactly-once behaviour.

To achieve exactly-once behaviour with different databases, look
at the pgq_ext module for a sample.  Basically it just requires
storing the batch_id/event_id on the remote db and committing there first.
Later it can be checked whether the batch/event is already processed.
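
The pattern described above can be sketched as follows; the table and the :batch_id placeholder are illustrative, not the actual pgq_ext API:

```sql
-- On the remote (consumer-side) database: remember finished batches.
CREATE TABLE completed_batches (batch_id bigint PRIMARY KEY);

-- Per batch, inside one remote transaction:
BEGIN;
INSERT INTO completed_batches VALUES (:batch_id);  -- errors on a redelivered batch
-- ... apply the batch's events ...
COMMIT;
-- A crash anywhere before this commit leads to redelivery of the whole
-- batch, and the primary-key violation then reveals it was already done.
```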

It's tricky only if you want to achieve full transactionality for
event processing.  As I understand, JMS does not have a concept
of transactions, probably also other solutions mentioned before,
so to use PgQ as backend for them should be much simpler...

To Chris: you should like PgQ, its just stored procs in database,
plus it's basically just generalized Slony-I, with some optimizations,
so should be familiar territory ;)

--
marko



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Heikki Linnakangas

Marko Kreen wrote:

As I understand, JMS does not have a concept
of transactions, probably also other solutions mentioned before,
so to use PgQ as backend for them should be much simpler...


JMS certainly does have the concept of transactions. Both distributed 
ones through XA and two-phase commit, and local involving just one JMS 
provider. I don't know about others, but would be surprised if they didn't.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Marko Kreen

On 6/20/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:

Marko Kreen wrote:
 As I understand, JMS does not have a concept
 of transactions, probably also other solutions mentioned before,
 so to use PgQ as backend for them should be much simpler...

JMS certainly does have the concept of transactions. Both distributed
ones through XA and two-phase commit, and local involving just one JMS
provider. I don't know about others, but would be surprised if they didn't.


Ah, sorry, my mistake then.  Shouldn't trust hearsay :)

--
marko



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Jeroen T. Vermeulen
On Wed, June 20, 2007 18:18, Heikki Linnakangas wrote:
 Marko Kreen wrote:
 As I understand, JMS does not have a concept
 of transactions, probably also other solutions mentioned before,
 so to use PgQ as backend for them should be much simpler...

 JMS certainly does have the concept of transactions. Both distributed
 ones through XA and two-phase commit, and local involving just one JMS
 provider. I don't know about others, but would be surprised if they
 didn't.

Wait...  I thought XA did two-phase commit, and then there was XA+ for
*distributed* two-phase commit, which is much harder?


Jeroen





Re: [HACKERS] PG-MQ?

2007-06-20 Thread Heikki Linnakangas

Jeroen T. Vermeulen wrote:

On Wed, June 20, 2007 18:18, Heikki Linnakangas wrote:

Marko Kreen wrote:

As I understand, JMS does not have a concept
of transactions, probably also other solutions mentioned before,
so to use PgQ as backend for them should be much simpler...

JMS certainly does have the concept of transactions. Both distributed
ones through XA and two-phase commit, and local involving just one JMS
provider. I don't know about others, but would be surprised if they
didn't.


Wait...  I thought XA did two-phase commit, and then there was XA+ for
*distributed* two-phase commit, which is much harder?


Well, I meant distributed as in one transaction manager, multiple 
resource managers, all participating in a single atomic transaction. I 
don't know what XA+ adds on top of that.


To be precise, being a Java-thing, JMS actually supports two-phase 
commit through JTA (Java Transaction API), not XA. It's the same design 
and interface, just defined as Java interfaces instead of at native 
library level.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Rob Butler
Do you guys need something PG specific or built into PG?

ActiveMQ is very nice, speaks multiple languages, protocols and supports a ton 
of features.  Could you simply use that?

http://activemq.apache.org/

Rob



   




Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Alvaro Herrera
Bruce Momjian wrote:

 Agreed.  I don't see the point in following a standard few people know
 about.

Few people in the US and UK you mean, right?  Everybody else stopped
measuring in king's feet and thumbs a long time ago.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Jeroen T. Vermeulen
On Wed, June 20, 2007 19:42, Rob Butler wrote:
 Do you guys need something PG specific or built into PG?

 ActiveMQ is very nice, speaks multiple languages, protocols and supports a
 ton of features.  Could you simply use that?

 http://activemq.apache.org/

Looks very nice indeed!


Jeroen





Re: [HACKERS] Suggestion for Enum Support Functions

2007-06-20 Thread Andrew Dunstan



toronto programmer wrote:

Dear Postgres developers,

I have been working with Oracle for a few years now in my work, and I 
tried some free databases for a project that I'm developing for my own 
use. I have tried H2, Firebird, and Postgres, and found the last to be 
the most stable and feature-rich, so thanks for all the good work.


I have read the 8.3 documentation, and with reference to the Enum Support 
Functions found on 
http://developer.postgresql.org/pgdocs/postgres/functions-enum.html, I 
think it would be useful to add two functions, enum_after(anyenum) and 
enum_before(anyenum), so that having:


CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 
'purple');
enum_after('orange'::rainbow) will return 'yellow'
enum_after('purple'::rainbow) will return an error
enum_before('purple'::rainbow) will return 'blue'

A good-to-have function would be enum_size(anyenum), which would return 6 
in the previous example; that would be helpful in dealing with enums.



You could easily create these for yourself, of course. For example:


 create or replace function enum_size(anyenum)
   returns int as
   $$ select array_upper(enum_range($1),1) $$
   language sql;


Successor and predecessor functions would be a bit more work, but not 
hard. I don't think they should error out at the range extremes, though. 
Perhaps returning NULL would  be better.
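
Assuming the two-argument form of enum_range() described on the linked page is available, NULL-returning versions could be sketched like this (untested; function names as proposed by the original poster):

```sql
-- Successor: enum_range($1, NULL) is everything from $1 to the end,
-- so its second element is the next value, or NULL at the upper end.
CREATE OR REPLACE FUNCTION enum_after(anyenum) RETURNS anyenum AS
  $$ SELECT (enum_range($1, NULL))[2] $$
  LANGUAGE sql;

-- Predecessor: enum_range(NULL, $1) runs from the first value up to $1,
-- so its next-to-last element is the previous value, or NULL at the start.
CREATE OR REPLACE FUNCTION enum_before(anyenum) RETURNS anyenum AS
  $$ SELECT r[array_upper(r, 1) - 1]
       FROM (SELECT enum_range(NULL, $1) AS r) s $$
  LANGUAGE sql;
```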


We could look at adding these as builtins for 8.4, but it's too late now 
to add them for 8.3. Besides, I think we need to see how enums are used 
in the field before deciding if any extensions are needed.




cheers

andrew



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Marko Kreen

On 6/20/07, Rob Butler [EMAIL PROTECTED] wrote:

Do you guys need something PG specific or built into PG?


Yes, we need it usable from inside the DB, thus the PgQ.

That means the events are also transactional with other
things happening in the DB.


ActiveMQ is very nice, speaks multiple languages, protocols and supports a ton 
of features.  Could you simply use that?


I guess that if you need a standalone message broker, ActiveMQ
may be a good choice.  At least, any solution that avoids the
database when passing messages should outperform solutions that
pipe everything through a (general-purpose) database.

OTOH, if you _do_ need to transport the events via the database,
it should be very hard to outperform PgQ. :)  It uses the
user-level xid/snapshot trick introduced by rserv/erserver/slony,
which is not possible with databases other than PostgreSQL.
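
The trick in question can be sketched with the txid_* functions shipped in 8.3; the schema here is illustrative, not PgQ's actual one, and :last_snap/:cur_snap stand for snapshot values the consumer has saved:

```sql
-- Each event is tagged with the id of the transaction that inserted it.
CREATE TABLE queue_events (
    ev_txid  bigint NOT NULL DEFAULT txid_current(),
    payload  text
);

-- A consumer keeps its previous snapshot, takes a fresh one with
-- txid_current_snapshot(), and fetches exactly the events that became
-- visible in between, regardless of commit order:
SELECT payload
  FROM queue_events
 WHERE txid_visible_in_snapshot(ev_txid, :cur_snap)
   AND NOT txid_visible_in_snapshot(ev_txid, :last_snap);
```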

--
marko



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Chris Browne
[EMAIL PROTECTED] (Marko Kreen) writes:
 To Chris: you should like PgQ, its just stored procs in database,
 plus it's basically just generalized Slony-I, with some optimizations,
 so should be familiar territory ;)

Looks interesting...

Random ideas

- insert_event in C (way to get rid of plpython)

Yeah, I'm with that...  Ever tried building [foo] on AIX, where foo in
('perl', 'python', ...)???  :-(

It seems rather excessive to add in a whole stored procedure language
simply for one function...
-- 
(format nil [EMAIL PROTECTED] cbbrowne linuxdatabases.info)
http://www3.sympatico.ca/cbbrowne/sgml.html
I always try to do things in chronological order. 



Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-20 Thread Robert Treat
On Tuesday 19 June 2007 10:15, Tom Lane wrote:
 Zdenek Kotala [EMAIL PROTECTED] writes:
  I'm a little bit confused about when we introduce a new page layout version. I
  expect that a new version comes with changes to the page header, tuple
  header, or data encoding (varlen/TOAST ...). But when there was only a new
  internal implementation of a data type, there was no reason to update the
  version (see inet/cidr between 8.1 - 8.2). Can somebody clarify this for me?

 Well, we've changed it when there was a benefit to an existing tool to
 do so.  So far that's meant page header and tuple header changes.  If
 we ever had a working in-place upgrade solution, I think we'd be willing
 to make the page version account for datatype format changes too.


FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for 
8.1-8.2.  Unfortunately, until the PGDG decides that in-place upgrade is a 
constraint they're willing to place on development, I see them a good 
chicken/egg away from making it a continually useful tool. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-20 Thread Andrew Sullivan
On Wed, Jun 20, 2007 at 12:34:21PM -0400, Robert Treat wrote:
 FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for 
 8.1-8.2.  Unfortunately, until the PGDG decides that in-place upgrade is a 
 constraint they're willing to place on development, I see them a good 
 chicken/egg away from making it a continually useful tool. 

Or maybe cart/horse.  It seems to me that the rule more likely needs
to be that the migrator follow the development of the database than
that the database engine be strongly constrained by the needs of an
upgrade tool.  I agree that some commitment is needed, though.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The whole tendency of modern prose is away from concreteness.
--George Orwell



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Marko Kreen

On 6/20/07, Chris Browne [EMAIL PROTECTED] wrote:

[EMAIL PROTECTED] (Marko Kreen) writes:
 To Chris: you should like PgQ, its just stored procs in database,
 plus it's basically just generalized Slony-I, with some optimizations,
 so should be familiar territory ;)

Looks interesting...


Thanks :)


Random ideas

- insert_event in C (way to get rid of plpython)

Yeah, I'm with that...  Ever tried building [foo] on AIX, where foo in
('perl', 'python', ...)???  :-(

It seems rather excessive to add in a whole stored procedure language
simply for one function...


Well, it's standard in our installations as we use it for
other stuff too.  It's much easier to prototype in PL/Python
than in C...

As it has not been a performance problem, I have not bothered
to rewrite it.  But now that the interface has been stable for
some time, it could be done.

--
marko



Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Heikki Linnakangas
I've uploaded the latest test results to the results page at 
http://community.enterprisedb.com/ldc/


The test results on the index page are not in a completely logical 
order, sorry about that.


I ran a series of tests with 115 warehouses, and no surprises there. LDC 
smooths the checkpoints nicely.


Another series with 150 warehouses is more interesting. At that # of 
warehouses, the data disks are 100% busy according to iostat. The 90th 
percentile response times are somewhat higher with LDC, though the 
variability in both the baseline and LDC test runs seems to be pretty 
high. Looking at the response time graphs, even with LDC there are clear 
checkpoint spikes, but they're much less severe than without.


Another series was with 90 warehouses, but without think times, driving 
the system to full load. LDC seems to smooth the checkpoints very nicely 
in these tests.


Heikki Linnakangas wrote:

Gregory Stark wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:

Now that the checkpoints are spread out more, the response times are very 
smooth.

So obviously the reason the results are so dramatic is that the checkpoints 
used to push the i/o bandwidth demand up over 100%. By spreading it out you 
can see in the io charts that even during the checkpoint the i/o busy rate 
stays just under 100% except for a few data points.

If I understand it right, Greg Smith's concern is that in a busier system, 
where even *with* the load distributed checkpoint the i/o bandwidth demand 
during the checkpoint was *still* being pushed over 100%, spreading out the 
load would only exacerbate the problem by extending the outage.

To that end it seems like what would be useful is a pair of tests with and 
without the patch with an about 10% larger warehouse size (~115), which 
would push the i/o bandwidth demand up to about that level.

I still don't see how spreading the writes could make things worse, but 
running more tests is easy. I'll schedule tests with more warehouses 
over the weekend.

It might even make sense to run a test with an outright overloaded system 
to see if the patch doesn't exacerbate the condition. Something with a 
warehouse size of maybe 150. I would expect it to fail the TPCC constraints 
either way, but what would be interesting to know is whether it fails by a 
larger margin with the LDC behaviour or a smaller margin.

I'll do that as well, though experiences with tests like that in the 
past have been that it's hard to get repeatable results that way.




--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Greg Smith

On Wed, 20 Jun 2007, Heikki Linnakangas wrote:

Another series with 150 warehouses is more interesting. At that # of 
warehouses, the data disks are 100% busy according to iostat. The 90th 
percentile response times are somewhat higher with LDC, though the 
variability in both the baseline and LDC test runs seems to be pretty high.


Great, this is exactly the behavior I had observed and wanted someone 
else to independently run into.  When you're in 100% disk busy land, LDC 
can shift the distribution of bad transactions around in a way that some 
people may not be happy with, and that might represent a step backward 
from the current code for them.  I hope you can understand now why I've 
been so vocal that it must be possible to pull this new behavior out so 
the current form of checkpointing is still available.


While it shows up in the 90% figure, what happens is most obvious in the 
response time distribution graphs.  Someone who is currently getting a run 
like #295 right now: http://community.enterprisedb.com/ldc/295/rt.html


Might be really unhappy if they turn on LDC expecting to smooth out 
checkpoints and get the shift of #296 instead: 
http://community.enterprisedb.com/ldc/296/rt.html


That is of course cherry-picking the most extreme examples.  But it 
illustrates my concern about the possibility for LDC making things worse 
on a really overloaded system, which is kind of counter-intuitive because 
you might expect that would be the best case for its improvements.


When I summarize the percentile behavior from your results with 150 
warehouses in a table like this:


Test  LDC %  90%
295   None   3.703
297   None   4.432
292   10     3.432
298   20     5.925
296   30     5.992
294   40     4.132

I think it does a better job of showing how LDC can shift the top 
percentile around under heavy load, even though there are runs where it's 
a clear improvement.  Since there is so much variability in results when 
you get into this territory, you really need to run a lot of these tests 
to get a feel for the spread of behavior.  I spent about a week of 
continuously running tests stalking this bugger before I felt I'd mapped 
out the boundaries with my app.  You've got your own priorities, but I'd 
suggest you try to find enough time for a more exhaustive look at this 
area before nailing down the final form for the patch.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Peter Eisentraut
Am Mittwoch, 20. Juni 2007 05:54 schrieb Bruce Momjian:
 Agreed.  I don't see the point in following a standard few people know
 about.

Yes, let's drop SQL as well.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] Updated tsearch documentation

2007-06-20 Thread Bruce Momjian
Oleg Bartunov wrote:
 On Sun, 17 Jun 2007, Bruce Momjian wrote:
 
  I have completed my first pass over the tsearch documentation:
 
  http://momjian.us/expire/fulltext/HTML/sql.html
 
  They are from section 14 and following.
 
  I have come up with a number of questions that I placed in SGML comments
  in these files:
 
  http://momjian.us/expire/fulltext/SGML/
 
  Teodor/Oleg, let me know when you want to go over my questions.
 
 Below are my answers (marked as )

OK.
 
 Comments to editorial work of Bruce Momjian.
 
 fulltext-intro.sgml:
 
 it is useful to have a predefined list of lexemes.
 
   Bruce, here there should be a list of the types of lexemes!

Agreed.  Is the list of lexemes parser-specific?

 </para></listitem>
 
 <!--
 SEEMS UNNECESSARY
 It is useless to attempt to normalize an <type>email address</type> using a
 morphological dictionary of the Russian language, but it looks reasonable to
 pick out the <type>domain name</type> and be able to search for a <type>domain
 name</type>.
 -->
 
 I don't understand where you got this para :)

Uh, it was in the SGML.  I have removed it.

 fulltext-opfunc.sgml:
 
 All of the following functions that accept a configuration argument can
 use either an integer <!-- why an integer --> or a textual configuration
 name to select a configuration.
 
 originally it was an integer id; probably better to use <type>oid</type>

Uh, my question is why are you allowing specification as an integer/oid
when the name works just fine.  I don't see the value in allowing
numbers here.

 This returns the query used for searching an index. It can be used to test
 for an empty query. The <command>SELECT</> below returns <literal>'T'</>,
 <!-- lowercase? --> which corresponds to an empty query since GIN indexes
 do not support negated queries (a full index scan is inefficient):
 
  capital case. This looks cumbersome, probably querytree() should
  just return NULL.

Agreed.

 The integer option controls several behaviors; this is done using bit-wise
 fields and <literal>|</literal> (for example, <literal>2|4</literal>):
 <!-- why so complex? -->
 
  to avoid 2 arguments

But I don't see why you would want to set two of those values --- they
seem mutually exclusive, e.g.

1 divides the rank by 1 + the logarithm of the document length
2 divides the rank by the length itself

I assume you do either one, not both.

 its replaceableid/replaceable or replaceablets_name/replaceable; !-- 
 n
 if none is specified then the current configuration is used.
 
  I don't understand this question

Same issue as above --- why allow a number here when the name works just
fine.  We don't allow tables to be specified by number, so why
configurations?

 <para>
 <!-- why?  -->
 Note that cascade dropping of the <function>headline</function> function
 causes dropping of the <literal>parser</literal> used in the fulltext
 configuration <replaceable>tsname</replaceable>.
 </para>
 
  hmm, probably it should be reversed - cascade dropping of the parser cause
  dropping of the headline function.

Agreed.

 
 In the example below, <literal>fulltext_idx</literal> is
 a GIN index: <!-- why isn't this automatic -->
 
 It's explained above. The problem is that the current index API doesn't allow
 saying whether a search was lossy or exact, so to preserve the performance of
 a GIN index we had to introduce the @@@ operator, which is the same as @@, but
 lossy.

Well, then we have to fix the API.  Telling users to use a different
operator based on what index is defined is just bad style.

 only the <token>lword</token> lexeme, then a <acronym>TZ</acronym>
 definition like ' one 1:11' will not work since lexeme type
 <token>digit</token> is not assigned to the <acronym>TZ</acronym>.
 <!-- what do these numbers mean? -->
 /para

OK, I changed it to be clearer.

  nothing special, just numbers for example.
 
 <function>ts_debug</> displays information about every token of
 <replaceable class="PARAMETER">document</replaceable> as produced by the
 parser and processed by the configured dictionaries using the configuration
 specified by <replaceable class="PARAMETER">cfgname</replaceable> or
 <replaceable class="PARAMETER">oid</replaceable>. <!-- no need for oid -->
 
  don't understand this comment. ts_debug accepts cfgname or its oid

Again, no need for oid.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Bruce Momjian
Peter Eisentraut wrote:
 On Wednesday, 20 June 2007 at 05:54, Bruce Momjian wrote:
  Agreed.  I don't see the point in following a standard few people know
  about.
 
 Yes, let's drop SQL as well.

If SQL was not a popular standard, we would drop it.  You and Alvaro are
saying that 'm' for meter and 'min' for minute is commonly recognized
outside the USA/UK, so that is good enough for me to say that the
existing setup is fine.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Bruce Momjian
Greg Smith wrote:
 I think it does a better job of showing how LDC can shift the top 
 percentile around under heavy load, even though there are runs where it's 
 a clear improvement.  Since there is so much variability in results when 
 you get into this territory, you really need to run a lot of these tests 
 to get a feel for the spread of behavior.  I spent about a week of 
 continuously running tests stalking this bugger before I felt I'd mapped 
 out the boundaries with my app.  You've got your own priorities, but I'd 
 suggest you try to find enough time for a more exhaustive look at this 
 area before nailing down the final form for the patch.

OK, I have hit my limit on people asking for more testing.  I am not
against testing, but I don't want to get into a situation where we just
keep asking for more tests and not move forward.  I am going to rely on
the patch submitters to suggest when enough testing has been done and
move on.

I don't expect this patch to be perfect when it is applied.  I do expect
it to be a best effort, and it will get continual real-world testing during
beta and we can continue to improve this.  Right now, we know we have a
serious issue with checkpoint I/O, and this patch is going to improve
that in most cases.  I don't want to see us reject it or greatly delay
beta as we try to make it perfect.

My main point is that we should keep trying to make the patch better, but
the patch doesn't have to be perfect to get applied.  I don't want us to
get into a death-by-testing spiral.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Updated tsearch documentation

2007-06-20 Thread Oleg Bartunov

On Wed, 20 Jun 2007, Bruce Momjian wrote:


Oleg Bartunov wrote:

On Sun, 17 Jun 2007, Bruce Momjian wrote:


I have completed my first pass over the tsearch documentation:

http://momjian.us/expire/fulltext/HTML/sql.html

They are from section 14 and following.

I have come up with a number of questions that I placed in SGML comments
in these files:

http://momjian.us/expire/fulltext/SGML/

Teodor/Oleg, let me know when you want to go over my questions.


Below are my answers (marked as )


OK.


Comments to editorial work of Bruce Momjian.

fulltext-intro.sgml:

it is useful to have a predefined list of lexemes.

Bruce, this should be a list of lexeme types!


Agreed.  Is the list of lexemes parser-specific?



yes, it is the parser which defines the types of lexemes.


fulltext-opfunc.sgml:

All of the following functions that accept a configuration argument can
use either an integer !-- why an integer -- or a textual configuration
name to select a configuration.

originally it was integer id, probably better use typeoid/type


Uh, my question is why are you allowing specification as an integer/oid
when the name works just fine.  I don't see the value in allowing
numbers here.


for compatibility reasons. Hmm, indeed, I don't recall where oids could be 
important.





This returns the query used for searching an index. It can be used to test
for an empty query. The commandSELECT/ below returns literal'T'/,
!-- lowercase? -- which corresponds to an empty query since GIN indexes
do not support negate queries (a full index scan is inefficient):


capital case. This looks cumbersome, probably querytree() should
just return NULL.


Agreed.


The integer option controls several behaviors which is done using bit-wise
fields and literal|/literal (for example, literal2|4/literal):
!-- why so complex? --


to avoid 2 arguments


But I don't see why you would want to set two of those values --- they
seem mutually exclusive, e.g.

1 divides the rank by 1 + the logarithm of the document length
2 divides the rank by the length itself

I assume you do either one, not both.


but what about the other variants?

What I missed is the definition of extent.


From http://www.sai.msu.su/~megera/wiki/NewExtentsBasedRanking

An extent is the shortest non-nested sequence of words which satisfies a query.





its replaceableid/replaceable or replaceablets_name/replaceable; !-- n
if none is specified that the current configuration is used.


I don't understand this question


Same issue as above --- why allow a number here when the name works just
fine.  We don't allow tables to be specified by number, so why
configurations?


para
!-- why?  --
Note that the cascade dropping of the functionheadline/function function
cause dropping of the literalparser/literal used in fulltext configuration
replaceabletsname/replaceable.
/para


hmm, probably it should be reversed - cascade dropping of the parser cause
dropping of the headline function.


Agreed.



In example below, literalfulltext_idx/literal is
a GIN index:!-- why isn't this automatic --


It's explained above. The problem is that current index api doesn't allow
to say if search was lossy or exact, so to preserve performance of
GIN index we had to introduce @@@ operator, which is the same as @@, but
lossy.


Well, then we have to fix the API.  Telling users to use a different
operator based on what index is defined is just bad style.


This was raised by Heikki and we discussed it a bit in Ottawa, but it's
unclear if it's doable for 8.3.  The @@@ operator is rarely used, so we could
say it will be improved in a future version.




only the tokenlword/token lexeme, then a acronymTZ/acronym
definition like ' one 1:11' will not work since lexeme type
tokendigit/token is not assigned to the acronymTZ/acronym.
!-- what do these numbers mean? --
/para


OK, I changed it to be clearer.


nothing special, just numbers for example.


functionts_debug/ displays information about every token of
replaceable class=PARAMETERdocument/replaceable as produced by the
parser and processed by the configured dictionaries using the configuration
specified by replaceable class=PARAMETERcfgname/replaceable or
replaceable class=PARAMETERoid/replaceable. !-- no need for oid


don't understand this comment. ts_debug accepts cfgname or its oid


Again, no need for oid.


We need to decide if we need oids as a user-visible argument. I don't see
any value in it, but Teodor may think otherwise.


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Heikki Linnakangas

Greg Smith wrote:
While it shows up in the 90% figure, what happens is most obvious in the 
response time distribution graphs.  Someone who is currently getting a 
run like #295 right now: http://community.enterprisedb.com/ldc/295/rt.html


Might be really unhappy if they turn on LDC expecting to smooth out 
checkpoints and get the shift of #296 instead: 
http://community.enterprisedb.com/ldc/296/rt.html


You mean the shift and flattening of the graph to the right in the 
delivery response time distribution graph? Looking at the other runs, 
that graph looks sufficiently different between the two baseline runs 
and the patched runs that I really wouldn't draw any conclusion from that.


In any case you *can* disable LDC if you want to.

That is of course cherry-picking the most extreme examples.  But it 
illustrates my concern about the possibility for LDC making things worse 
on a really overloaded system, which is kind of counter-intuitive 
because you might expect that would be the best case for its improvements.


Well, it is indeed cherry-picking, so I still don't see how LDC could 
make things worse on a really overloaded system. I grant you there might 
indeed be one, but I'd like to understand the underlying mechanism, or 
at least see one.


Since there is so much variability in results 
when you get into this territory, you really need to run a lot of these 
tests to get a feel for the spread of behavior.


I think that's the real lesson from this. In any case, at least LDC 
doesn't seem to hurt much in any of the test configurations tested so 
far, and smooths the checkpoints a lot in most configurations.


 I spent about a week of 
continuously running tests stalking this bugger before I felt I'd mapped 
out the boundaries with my app.  You've got your own priorities, but I'd 
suggest you try to find enough time for a more exhaustive look at this 
area before nailing down the final form for the patch.


I don't have any good simple ideas on how to make it better in 8.3 
timeframe, so I don't think there's much to learn from repeating these 
tests.


That said, running tests is easy and doesn't take much effort. If you 
have suggestions for configurations or workloads to test, I'll be happy 
to do that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Joshua D. Drake

Bruce Momjian wrote:

Greg Smith wrote:



I don't expect this patch to be perfect when it is applied.  I do expect
it to be a best effort, and it will get continual real-world testing during
beta and we can continue to improve this.  Right now, we know we have a
serious issue with checkpoint I/O, and this patch is going to improve
that in most cases.  I don't want to see us reject it or greatly delay
beta as we try to make it perfect.

My main point is that we should keep trying to make the patch better, but
the patch doesn't have to be perfect to get applied.  I don't want us to
get into a death-by-testing spiral.


Death by testing? The only comment I have is that it could be useful to 
be able to turn this feature off via GUC. Other than that, I think it is 
great.


Joshua D. Drake

--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Heikki Linnakangas

Joshua D. Drake wrote:
The only comment I have is that it could be useful to 
be able to turn this feature off via GUC. Other than that, I think it is 
great.


Yeah, you can do that.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 If SQL was not a popular standard, we would drop it.  You and Alvaro are
 saying that 'm' for meter and 'min' for minute is commonly recognized
 outside the USA/UK, so that is good enough for me to say that the
 existing setup is fine.

If we're not going to make the units-parsing any more friendly, for
gosh sakes let's at least make it give a HINT about what it will accept.
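[To make the HINT idea concrete, here is a hedged sketch of units parsing that surfaces the accepted spellings in its error message. This is illustrative Python, not the actual guc.c code; the function name and error wording are invented, though the unit set (ms, s, min, h, d) matches what postgresql.conf accepts for time settings.]

```python
# Illustrative sketch only -- not PostgreSQL's actual guc.c logic.
# The unit spellings below are the ones postgresql.conf accepts for
# time settings (ms, s, min, h, d); the function name and error text
# are invented for this example.

TIME_UNITS_MS = {"ms": 1, "s": 1000, "min": 60 * 1000,
                 "h": 60 * 60 * 1000, "d": 24 * 60 * 60 * 1000}

def parse_time_setting(value):
    """Parse a setting like '30s' or '5min' into milliseconds."""
    digits = value.rstrip("abcdefghijklmnopqrstuvwxyz")
    unit = value[len(digits):]
    if unit not in TIME_UNITS_MS:
        # The HINT: spell out exactly which unit spellings are accepted.
        raise ValueError("invalid unit %r; valid units for this parameter "
                         "are: %s" % (unit, ", ".join(sorted(TIME_UNITS_MS))))
    return int(digits) * TIME_UNITS_MS[unit]
```

[With input '5m' the error would then list ms, s, min, h, d, which is exactly the kind of HINT being asked for.]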

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Stefan Kaltenbrunner
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
 If SQL was not a popular standard, we would drop it.  You and Alvaro are
 saying that 'm' for meter and 'min' for minute is commonly recognized
 outside the USA/UK, so that is good enough for me to say that the
 existing setup is fine.
 
 If we're not going to make the units-parsing any more friendly, for
 gosh sakes let's at least make it give a HINT about what it will accept.

yeah, a proper HINT seems like a very reasonable compromise ...


Stefan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Greg Smith

On Wed, 20 Jun 2007, Heikki Linnakangas wrote:

You mean the shift and flattening of the graph to the right in the delivery 
response time distribution graph?


Right, that's what ends up happening during the problematic cases.  To 
pick numbers out of the air, instead of 1% of the transactions getting 
nailed really hard, by spreading things out you might have 5% of them get 
slowed considerably but not awfully.  For some applications, that might be 
considered a step backwards.



I'd like to understand the underlying mechanism


I had to capture regular snapshots of the buffer cache internals via 
pg_buffercache to figure out where the breakdown was in my case.


I don't have any good simple ideas on how to make it better in 8.3 timeframe, 
so I don't think there's much to learn from repeating these tests.


Right now, it's not clear which of the runs represent normal behavior and 
which might be anomalies.  That's the thing you might learn if you had 10 
at each configuration instead of just 1.  The goal for the 8.3 timeframe 
in my mind would be to perhaps have enough data to give better guidelines 
for defaults and a range of useful settings in the documentation.


The only other configuration I'd be curious to see is pushing the number 
of warehouses even more to see if the 90% numbers spread further from 
current behavior.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Greg Smith

On Wed, 20 Jun 2007, Bruce Momjian wrote:


I don't expect this patch to be perfect when it is applied.  I do expect
it to be a best effort, and it will get continual real-world testing during
beta and we can continue to improve this.


This is completely fair.  Consider my suggestions something that people 
might want to look out for during beta rather than a task Heikki should worry 
about before applying the patch.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes:

 If SQL was not a popular standard, we would drop it.  You and Alvaro are
 saying that 'm' for meter and 'min' for minute is commonly recognized
 outside the USA/UK, so that is good enough for me to say that the
 existing setup is fine.

Could you expand on your logic here? And why you disagree with my argument
that which abbreviations are correct is irrelevant in deciding whether we
should accept other abbreviations.

Afaict nobody has expressed a single downside to accepting other
abbreviations.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Updated tsearch documentation

2007-06-20 Thread Bruce Momjian
Oleg Bartunov wrote:
 On Wed, 20 Jun 2007, Bruce Momjian wrote:
  Comments to editorial work of Bruce Momjian.
 
  fulltext-intro.sgml:
 
  it is useful to have a predefined list of lexemes.
 
  Bruce, this should be a list of lexeme types!
 
  Agreed.  Is the list of lexemes parser-specific?
 
 
 yes, it is the parser which defines the types of lexemes.

OK, how will users get a list of supported lexemes?  Do we need a list
per supported parser?

  fulltext-opfunc.sgml:
 
  All of the following functions that accept a configuration argument can
  use either an integer !-- why an integer -- or a textual configuration
  name to select a configuration.
 
  originally it was integer id, probably better use typeoid/type
 
  Uh, my question is why are you allowing specification as an integer/oid
  when the name works just fine.  I don't see the value in allowing
  numbers here.
 
 for compatibility reasons. Hmm, indeed, I don't recall where oids could be 
 important.

Well, if neither of us sees a reason for it, let's remove it.  We don't
need to support a feature that has no use.

  This returns the query used for searching an index. It can be used to test
  for an empty query. The commandSELECT/ below returns literal'T'/,
  !-- lowercase? -- which corresponds to an empty query since GIN indexes
  do not support negate queries (a full index scan is inefficient):
 
  capital case. This looks cumbersome, probably querytree() should
  just return NULL.
 
  Agreed.
 
  The integer option controls several behaviors which is done using bit-wise
  fields and literal|/literal (for example, literal2|4/literal):
  !-- why so complex? --
 
  to avoid 2 arguments
 
  But I don't see why you would want to set two of those values --- they
  seem mutually exclusive, e.g.
 
  1 divides the rank by 1 + the logarithm of the document length
  2 divides the rank by the length itself
 
  I assume you do either one, not both.
 
 but what about the other variants?

OK, here is the full list:

0 (the default) ignores document length
1 divides the rank by 1 + the logarithm of the document length
2 divides the rank by the length itself
4 divides the rank by the mean harmonic distance between extents
8 divides the rank by the number of unique words in the document
16 divides the rank by 1 + the logarithm of the number of unique words in
   the document

so which ones would be both enabled?
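[To illustrate how the bit-wise combination plays out, here is a small sketch applying each flagged divisor in turn; the function name rank_normalize() is invented, and the real computation lives inside ts_rank()'s C implementation. For example, 2|16 divides by both the document length and 1 + log of the unique-word count, whereas 1|2 would penalize length twice.]

```python
import math

# Illustrative sketch of the normalization bit flags listed above; the
# function name rank_normalize() is invented -- the real computation is
# inside ts_rank()'s C implementation.

def rank_normalize(rank, flags, doc_len, n_unique, mean_harm_dist):
    if flags & 1:   # 1 + logarithm of document length
        rank /= 1 + math.log(doc_len)
    if flags & 2:   # document length itself
        rank /= doc_len
    if flags & 4:   # mean harmonic distance between extents
        rank /= mean_harm_dist
    if flags & 8:   # number of unique words
        rank /= n_unique
    if flags & 16:  # 1 + logarithm of number of unique words
        rank /= 1 + math.log(n_unique)
    return rank

# 2|16 is a plausible combination: penalize long documents and documents
# with many distinct words; 1|2 would double-penalize document length.
r = rank_normalize(0.5, 2 | 16, doc_len=100, n_unique=40, mean_harm_dist=3.0)
```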

 
 What I missed is the definition of extent.
 
 From http://www.sai.msu.su/~megera/wiki/NewExtentsBasedRanking
 An extent is the shortest non-nested sequence of words which satisfies a query.

I don't understand how that relates to this.

 
  its replaceableid/replaceable or replaceablets_name/replaceable; 
  !-- n
  if none is specified that the current configuration is used.
 
  I don't understand this question
 
  Same issue as above --- why allow a number here when the name works just
  fine.  We don't allow tables to be specified by number, so why
  configurations?
 
  para
  !-- why?  --
  Note that the cascade dropping of the functionheadline/function 
  function
  cause dropping of the literalparser/literal used in fulltext 
  configuration
  replaceabletsname/replaceable.
  /para
 
  hmm, probably it should be reversed - cascade dropping of the parser cause
  dropping of the headline function.
 
  Agreed.
 
 
  In example below, literalfulltext_idx/literal is
  a GIN index:!-- why isn't this automatic --
 
  It's explained above. The problem is that current index api doesn't allow
  to say if search was lossy or exact, so to preserve performance of
  GIN index we had to introduce @@@ operator, which is the same as @@, but
  lossy.
 
  Well, then we have to fix the API.  Telling users to use a different
  operator based on what index is defined is just bad style.
 
 This was raised by Heikki and we discussed it a bit in Ottawa, but it's
 unclear if it's doable for 8.3.  @@@ operator is in rare use, so we could
 say it will be improved in future versions.

Uh, I am wondering if we just have to force heap access in all cases
until it is fixed.

  only the tokenlword/token lexeme, then a acronymTZ/acronym
  definition like ' one 1:11' will not work since lexeme type
  tokendigit/token is not assigned to the acronymTZ/acronym.
  !-- what do these numbers mean? --
  /para
 
  OK, I changed it to be clearer.
 
  nothing special, just numbers for example.
 
  functionts_debug/ displays information about every token of
  replaceable class=PARAMETERdocument/replaceable as produced by the
  parser and processed by the configured dictionaries using the configuration
  specified by replaceable class=PARAMETERcfgname/replaceable or
  replaceable class=PARAMETERoid/replaceable. !-- no need for oid
 
  don't understand this comment. ts_debug accepts cfgname or its oid
 
  Again, no need for oid.
 
 We need to decide if we need oids as a user-visible argument. I don't see
 any value in it, but Teodor may think otherwise.

Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Bruce Momjian
Gregory Stark wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
 
  If SQL was not a popular standard, we would drop it.  You and Alvaro are
  saying that 'm' for meter and 'min' for minute is commonly recognized
  outside the USA/UK, so that is good enough for me to say that the
  existing setup is fine.
 
 Could you expand on your logic here? And why you disagree with my argument
 that which abbreviations are correct is irrelevant in deciding whether we
 should accept other abbreviations.

I suppose the idea is that we don't want to be sloppy about accepting
just anything in postgresql.conf.  I think people are worried that an
'm' in one column might mean something different than an 'm' in another
column, and perhaps that is confusing.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes:

 I suppose the idea is that we don't want to be sloppy about accepting
 just anything in postgresql.conf.  

because?

 I think people are worried that an 'm' in one column might mean something
 different than an 'm' in another column, and perhaps that is confusing.

To whom? The person writing it?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Kevin Grittner
 On Wed, Jun 20, 2007 at  5:21 PM, in message
[EMAIL PROTECTED], Bruce Momjian [EMAIL PROTECTED] wrote:

 Gregory Stark wrote:
 
 Could you expand on your logic here? And why you disagree with my argument
 that which abbreviations are correct is irrelevant in deciding whether we
 should accept other abbreviations.
 
 I suppose the idea is that we don't want to be sloppy about accepting
 just anything in postgresql.conf.  I think people are worried that an
 'm' in one column might mean something different than an 'm' in another
 column, and perhaps that is confusing.
 
If we want precision and standards, I would personally find ISO 8601 4.4.3.2 
less confusing than the current implementation.  (You could say 'PT2M30S' or 
'PT2,5M' or 'PT2.5M' to specify a 2 minute and 30 second interval.)  That said, 
I'd be OK with a HINT that listed valid syntax.  I've wasted enough time 
looking up the supported abbreviations to last me a while.
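[For reference, the ISO 8601 duration forms cited above can be parsed with a few lines. This is a minimal sketch covering only the PT...H/M/S time part, with ',' or '.' accepted as the decimal mark; it is not a complete ISO 8601 implementation.]

```python
import re

# Minimal sketch of an ISO 8601 time-duration parser for forms like
# PT2M30S, PT2.5M, and PT2,5M (comma or dot as decimal mark).  Only the
# time part with H/M/S designators is handled; this is not a complete
# ISO 8601 implementation.

def iso8601_duration_seconds(text):
    m = re.fullmatch(r"PT(?:(\d+(?:[.,]\d+)?)H)?"
                     r"(?:(\d+(?:[.,]\d+)?)M)?"
                     r"(?:(\d+(?:[.,]\d+)?)S)?", text)
    if not m or not any(m.groups()):
        raise ValueError("unsupported duration: %r" % text)
    h, mi, s = (float(g.replace(",", ".")) if g else 0.0 for g in m.groups())
    return h * 3600 + mi * 60 + s
```

[So 'PT2M30S', 'PT2.5M', and 'PT2,5M' all come out as 150 seconds, matching the example above.]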
 
-Kevin
 



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] DROP TABLE and autovacuum

2007-06-20 Thread ITAGAKI Takahiro

Alvaro Herrera [EMAIL PROTECTED] wrote:

 Something worth considering, though unrelated to the topic at hand: what
 happens with the table stats after CLUSTER?  Should we cause an ANALYZE
 afterwards?  We could end up running with outdated statistics.

We don't invalidate the value statistics in pg_stats by ANALYZE presently.

Also, the runtime statistics are not invalidated -- it could be a bug.
pgstat_drop_relation() is expecting relid (pg_class.oid) as the argument,
but we pass it relfilenode.

[storage/smgr/smgr.c]
static void
smgr_internal_unlink(RelFileNode rnode, int which, bool isTemp, bool isRedo)
{
...
/*
 * Tell the stats collector to forget it immediately, too.  Skip this in
 * recovery mode, since the stats collector likely isn't running (and if
 * it is, pgstat.c will get confused because we aren't a real backend
 * process).
 */
if (!InRecovery)
pgstat_drop_relation(rnode.relNode);

...
}

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] autovacuum launcher continues to run after reloading autovacuum=off

2007-06-20 Thread ITAGAKI Takahiro
I found that the autovacuum launcher continues to run and spawn workers
after reloading the configuration file with autovacuum = off in CVS HEAD.

What should we do when autovacuum is disabled at runtime? I think the
launcher should not spawn any new workers. It can be fixed easily,
but there are some other issues to be discussed:

- Can the launcher exit immediately,
  or does it need to wait for all workers to exit?
- Should the workers skip their remaining jobs?
  One difficulty is that workers currently ignore SIGHUP signals.
- Should the workers skip the table currently being vacuumed?


Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org