[HACKERS] autovacuum launcher continues to run after reloading autovacuum=off

2007-06-20 Thread ITAGAKI Takahiro
I found that the autovacuum launcher continues to run and spawn workers
after reloading the configuration file with autovacuum = off in CVS HEAD.
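
(A minimal way to reproduce, assuming a superuser session; the process
name is how it shows up in ps on my machine:

    -- set "autovacuum = off" in postgresql.conf, then:
    SELECT pg_reload_conf();
    -- and from a shell:  ps ax | grep "autovacuum launcher"

The launcher process is still there and keeps launching workers.)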

What should we do after autovacuum is disabled at runtime? I think the
launcher should not spawn any new workers. That can be fixed easily,
but there are some other issues to be discussed:

- Can the launcher exit immediately,
  or does it need to wait for all the workers to exit?
- Should the workers skip their remaining jobs?
  One difficulty is that workers currently ignore SIGHUP signals.
- Should the workers then also skip the table currently being vacuumed?


Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




Re: [HACKERS] DROP TABLE and autovacuum

2007-06-20 Thread ITAGAKI Takahiro

Alvaro Herrera <[EMAIL PROTECTED]> wrote:

> Something worth considering, though unrelated to the topic at hand: what
> happens with the table stats after CLUSTER?  Should we cause an ANALYZE
> afterwards?  We could end up running with outdated statistics.

Presently we don't invalidate the value statistics in pg_stats gathered
by ANALYZE.

Also, the runtime statistics are not invalidated -- this could be a bug.
pgstat_drop_relation() expects a relid (pg_class.oid) as its argument,
but we pass it the relfilenode.

[storage/smgr/smgr.c]
static void
smgr_internal_unlink(RelFileNode rnode, int which, bool isTemp, bool isRedo)
{
...
/*
 * Tell the stats collector to forget it immediately, too.  Skip this in
 * recovery mode, since the stats collector likely isn't running (and if
 * it is, pgstat.c will get confused because we aren't a real backend
 * process).
 */
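/*
 * XXX: rnode.relNode here is the relfilenode, not the pg_class OID
 * that pgstat_drop_relation() expects -- the two can differ (e.g.
 * after CLUSTER), so the wrong stats entry may be dropped.
 */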
if (!InRecovery)
pgstat_drop_relation(rnode.relNode);

...
}

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





[HACKERS] month abbreviation

2007-06-20 Thread Jaime Casanova

Hi,

I got this answer in my version:

                               version
----------------------------------------------------------------
 PostgreSQL 8.3devel on i686-pc-linux-gnu, compiled by GCC gcc (GCC)
 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
(1 row)

Note the month abbreviation ("mons") -- is this intentional?

sgerp=# select age(current_date, '1979-08-15'::date);
           age
-------------------------
 27 years 10 mons 5 days
(1 row)
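
(For what it's worth, the spelling seems to come from the generic
interval output rather than from age() itself -- e.g.:

    select '10 months'::interval;   -- also prints "10 mons"

so any fix would presumably be in the interval-output code.)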


--
Atentamente,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
  Richard Cook



Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Kevin Grittner
>>> On Wed, Jun 20, 2007 at  5:21 PM, in message
<[EMAIL PROTECTED]>, Bruce Momjian <[EMAIL PROTECTED]> wrote:

> Gregory Stark wrote:
>> 
>> Could you expand on your logic here? And why you disagree with my argument
>> that which abbreviations are correct is irrelevant in deciding whether we
>> should accept other abbreviations.
> 
> I suppose the idea is that we don't want to be sloppy about accepting
> just anything in postgresql.conf.  I think people are worried that an
> 'm' in one column might mean something different than an 'm' in another
> column, and perhaps that is confusing.
 
If we want precision and standards, I would personally find ISO 8601 4.4.3.2
less confusing than the current implementation.  (You could say 'PT2M30S',
'PT2,5M', or 'PT2.5M' to specify a 2-minute-30-second interval.)  That said,
I'd be OK with a HINT that listed the valid syntax.  I've wasted enough time
looking up the supported abbreviations to last me a while.
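
(For reference, a sketch of what the current parsing accepts -- from
memory, so double-check against guc.c:

    SET statement_timeout = '750ms';  -- accepted
    SET statement_timeout = '30s';    -- accepted
    SET statement_timeout = '5min';   -- accepted; bare 'm' is rejected
    SET statement_timeout = '2h';     -- accepted

with ms, s, min, h, and d being the recognized time units.)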
 
-Kevin
 





Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Gregory Stark
"Bruce Momjian" <[EMAIL PROTECTED]> writes:

> I suppose the idea is that we don't want to be sloppy about accepting
> just anything in postgresql.conf.  

Because?

> I think people are worried that an 'm' in one column might mean something
> different than an 'm' in another column, and perhaps that is confusing.

To whom? The person writing it?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Bruce Momjian
Gregory Stark wrote:
> "Bruce Momjian" <[EMAIL PROTECTED]> writes:
> 
> > If SQL was not a popular standard, we would drop it.  You and Alvaro are
> > saying that 'm' for meter and 'min' for minute is commonly recognized
> > outside the USA/UK, so that is good enough for me to say that the
> > existing setup is fine.
> 
> Could you expand on your logic here? And why you disagree with my argument
> that which abbreviations are correct is irrelevant in deciding whether we
> should accept other abbreviations.

I suppose the idea is that we don't want to be sloppy about accepting
just anything in postgresql.conf.  I think people are worried that an
'm' in one column might mean something different than an 'm' in another
column, and perhaps that is confusing.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Updated tsearch documentation

2007-06-20 Thread Bruce Momjian
Oleg Bartunov wrote:
> On Wed, 20 Jun 2007, Bruce Momjian wrote:
> >> Comments to editorial work of Bruce Momjian.
> >>
> >> fulltext-intro.sgml:
> >>
> >> it is useful to have a predefined list of lexemes.
> >>
> >> Bruce, here should be a list of the types of lexemes!
> >
> > Agreed.  Is the list of lexemes parser-specific?
> >
> 
> Yes, it is the parser which defines the types of lexemes.

OK, how will users get a list of supported lexemes?  Do we need a list
per supported parser?

> >> fulltext-opfunc.sgml:
> >>
> >> All of the following functions that accept a configuration argument can
> >> use either an integer  or a textual configuration
> >> name to select a configuration.
> >>
> >> originally it was integer id, probably better use oid
> >
> > Uh, my question is why are you allowing specification as an integer/oid
> > when the name works just fine.  I don't see the value in allowing
> > numbers here.
> 
> For compatibility reasons.  Hmm, indeed, I don't recall where oids could
> be important.

Well, if neither of us sees a reason for it, let's remove it.  We don't
need to support a feature that has no use.

> >> This returns the query used for searching an index. It can be used to test
> >> for an empty query. The SELECT below returns 'T',
> >>  which corresponds to an empty query since GIN indexes
> >> do not support negate queries (a full index scan is inefficient):
> >>
> >>> capital case. This looks cumbersome, probably querytree() should
> >>> just return NULL.
> >
> > Agreed.
> >
> >> The integer option controls several behaviors which is done using bit-wise
> >> fields and | (for example, 2|4):
> >> 
> >>
> >>> to avoid 2 arguments
> >
> > But I don't see why you would want to set two of those values --- they
> > seem mutually exclusive, e.g.
> >
> > 1 divides the rank by the 1 + logarithm of the document length
> > 2 divides the rank by the length itself
> >
> > I assume you do either one, not both.
> 
> But what about the other variants?

OK, here is the full list:

0 (the default) ignores document length
1 divides the rank by 1 + the logarithm of the document length
2 divides the rank by the length itself
4 divides the rank by the mean harmonic distance between extents
8 divides the rank by the number of unique words in the document
16 divides the rank by 1 + the logarithm of the number of unique words
   in the document

so which ones would be both enabled?

> 
> What I missed is the definition of extent.
> 
> >From http://www.sai.msu.su/~megera/wiki/NewExtentsBasedRanking
> An extent is the shortest non-nested sequence of words which satisfies a query.

I don't understand how that relates to this.

> >
> >> its id or ts_name; 
> >> 
> >> Note that the cascade dropping of the headline 
> >> function
> >> cause dropping of the parser used in fulltext 
> >> configuration
> >> tsname.
> >> 
> >>
> >>> hmm, probably it should be reversed - cascade dropping of the parser cause
> >>> dropping of the headline function.
> >
> > Agreed.
> >
> >>
> >> In example below, fulltext_idx is
> >> a GIN index:
> >>
> >>> It's explained above. The problem is that current index api doesn't allow
> >>> to say if search was lossy or exact, so to preserve performance of
> >>> GIN index we had to introduce @@@ operator, which is the same as @@, but
> >>> lossy.
> >
> > Well, then we have to fix the API.  Telling users to use a different
> > operator based on what index is defined is just bad style.
> 
> This was raised by Heikki and we discussed it a bit in Ottawa, but it's
> unclear if it's doable for 8.3.  The @@@ operator is rarely used, so we
> could say it will be improved in future versions.

Uh, I am wondering if we just have to force heap access in all cases
until it is fixed.

> >> only the lword lexeme, then a TZ
> >> definition like ' one 1:11' will not work since lexeme type
> >> digit is not assigned to the TZ.
> >> 
> >> 
> >
> > OK, I changed it to be clearer.
> >
> >>> nothing special, just numbers for example.
> >>
> >> ts_debug displays information about every token of
> >> document as produced by the
> >> parser and processed by the configured dictionaries using the configuration
> >> specified by cfgname or
> >> oid. 

Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Gregory Stark
"Bruce Momjian" <[EMAIL PROTECTED]> writes:

> If SQL was not a popular standard, we would drop it.  You and Alvaro are
> saying that 'm' for meter and 'min' for minute is commonly recognized
> outside the USA/UK, so that is good enough for me to say that the
> existing setup is fine.

Could you expand on your logic here? And why you disagree with my argument
that which abbreviations are correct is irrelevant in deciding whether we
should accept other abbreviations.

Afaict nobody has expressed a single downside to accepting other
abbreviations.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Greg Smith

On Wed, 20 Jun 2007, Bruce Momjian wrote:

> I don't expect this patch to be perfect when it is applied.  I do expect
> it to be a best effort, and it will get continual real-world testing
> during beta, and we can continue to improve this.

This is completely fair.  Consider my suggestions something that people
might want to look out for during beta rather than a task Heikki should
worry about before applying the patch.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Greg Smith

On Wed, 20 Jun 2007, Heikki Linnakangas wrote:

> You mean the shift and "flattening" of the graph to the right in the
> delivery response time distribution graph?

Right, that's what ends up happening during the problematic cases.  To
pick numbers out of the air, instead of 1% of the transactions getting
nailed really hard, by spreading things out you might have 5% of them get
slowed considerably but not awfully.  For some applications, that might be
considered a step backwards.

> I'd like to understand the underlying mechanism

I had to capture regular snapshots of the buffer cache internals via
pg_buffercache to figure out where the breakdown was in my case.

> I don't have any good simple ideas on how to make it better in the 8.3
> timeframe, so I don't think there's much to learn from repeating these
> tests.

Right now, it's not clear which of the runs represent normal behavior and
which might be anomalies.  That's the thing you might learn if you had 10
at each configuration instead of just 1.  The goal for the 8.3 timeframe
in my mind would be to perhaps have enough data to give better guidelines
for defaults and a range of useful settings in the documentation.

The only other configuration I'd be curious to see is pushing the number
of warehouses even more to see if the 90% numbers spread further from
current behavior.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Stefan Kaltenbrunner
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
>> If SQL was not a popular standard, we would drop it.  You and Alvaro are
>> saying that 'm' for meter and 'min' for minute is commonly recognized
>> outside the USA/UK, so that is good enough for me to say that the
>> existing setup is fine.
> 
> If we're not going to make the units-parsing any more friendly, for
> gosh sakes let's at least make it give a HINT about what it will accept.

Yeah, a proper HINT seems like a very reasonable compromise ...


Stefan



Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> If SQL was not a popular standard, we would drop it.  You and Alvaro are
> saying that 'm' for meter and 'min' for minute is commonly recognized
> outside the USA/UK, so that is good enough for me to say that the
> existing setup is fine.

If we're not going to make the units-parsing any more friendly, for
gosh sakes let's at least make it give a HINT about what it will accept.

regards, tom lane



Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Heikki Linnakangas

Joshua D. Drake wrote:
> The only comment I have is that it could be useful to be able to turn
> this feature off via GUC.  Other than that, I think it is great.

Yeah, you can do that.
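
(A sketch, with the caveat that the knob names were still in flux while
the patch was being revised, so treat this as the patch-as-posted rather
than gospel:

    # postgresql.conf -- 0 means no spreading, i.e. old-style checkpoints
    checkpoint_write_percent = 0

Check the latest version of the patch for the authoritative setting.)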

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Joshua D. Drake

Bruce Momjian wrote:
> Greg Smith wrote:
>
> I don't expect this patch to be perfect when it is applied.  I do expect
> it to be a best effort, and it will get continual real-world testing
> during beta, and we can continue to improve this.  Right now, we know we
> have a serious issue with checkpoint I/O, and this patch is going to
> improve that in most cases.  I don't want to see us reject it or greatly
> delay beta as we try to make it perfect.
>
> My main point is that we should keep trying to make the patch better,
> but the patch doesn't have to be perfect to get applied.  I don't want
> us to get into a death-by-testing spiral.

Death by testing?  The only comment I have is that it could be useful to
be able to turn this feature off via GUC.  Other than that, I think it is
great.


Joshua D. Drake







--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Heikki Linnakangas

Greg Smith wrote:
> While it shows up in the 90% figure, what happens is most obvious in the
> response time distribution graphs.  Someone who is currently getting a
> run like #295 right now: http://community.enterprisedb.com/ldc/295/rt.html
>
> Might be really unhappy if they turn on LDC expecting to smooth out
> checkpoints and get the shift of #296 instead:
> http://community.enterprisedb.com/ldc/296/rt.html

You mean the shift and "flattening" of the graph to the right in the
delivery response time distribution graph?  Looking at the other runs,
that graph looks sufficiently different between the two baseline runs
and the patched runs that I really wouldn't draw any conclusion from that.

In any case you *can* disable LDC if you want to.

> That is of course cherry-picking the most extreme examples.  But it
> illustrates my concern about the possibility for LDC making things worse
> on a really overloaded system, which is kind of counter-intuitive
> because you might expect that would be the best case for its improvements.

Well, it is indeed cherry-picking, so I still don't see how LDC could
make things worse on a really overloaded system.  I grant you there might
indeed be one, but I'd like to understand the underlying mechanism, or
at least see one.

> Since there is so much variability in results when you get into this
> territory, you really need to run a lot of these tests to get a feel
> for the spread of behavior.

I think that's the real lesson from this.  In any case, at least LDC
doesn't seem to hurt much in any of the test configurations tested this
far, and smooths the checkpoints a lot in most configurations.

> I spent about a week of continuously running tests stalking this bugger
> before I felt I'd mapped out the boundaries with my app.  You've got
> your own priorities, but I'd suggest you try to find enough time for a
> more exhaustive look at this area before nailing down the final form
> for the patch.

I don't have any good simple ideas on how to make it better in the 8.3
timeframe, so I don't think there's much to learn from repeating these
tests.

That said, running tests is easy and doesn't take much effort.  If you
have suggestions for configurations or workloads to test, I'll be happy
to do that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Updated tsearch documentation

2007-06-20 Thread Oleg Bartunov

On Wed, 20 Jun 2007, Bruce Momjian wrote:

> Oleg Bartunov wrote:
>> On Sun, 17 Jun 2007, Bruce Momjian wrote:
>>
>>> I have completed my first pass over the tsearch documentation:
>>>
>>> http://momjian.us/expire/fulltext/HTML/sql.html
>>>
>>> They are from section 14 and following.
>>>
>>> I have come up with a number of questions that I placed in SGML comments
>>> in these files:
>>>
>>> http://momjian.us/expire/fulltext/SGML/
>>>
>>> Teodor/Oleg, let me know when you want to go over my questions.
>>
>> Below are my answers (marked as )
>
> OK.
>
>> Comments to editorial work of Bruce Momjian.
>>
>> fulltext-intro.sgml:
>>
>> it is useful to have a predefined list of lexemes.
>>
>> Bruce, here should be a list of the types of lexemes!
>
> Agreed.  Is the list of lexemes parser-specific?

Yes, it is the parser which defines the types of lexemes.

>> fulltext-opfunc.sgml:
>>
>> All of the following functions that accept a configuration argument can
>> use either an integer  or a textual configuration
>> name to select a configuration.
>>
>> originally it was integer id, probably better use oid
>
> Uh, my question is why are you allowing specification as an integer/oid
> when the name works just fine.  I don't see the value in allowing
> numbers here.

For compatibility reasons.  Hmm, indeed, I don't recall where oids could
be important.

>> This returns the query used for searching an index. It can be used to test
>> for an empty query. The SELECT below returns 'T',
>>  which corresponds to an empty query since GIN indexes
>> do not support negate queries (a full index scan is inefficient):
>>
>> capital case. This looks cumbersome, probably querytree() should
>> just return NULL.
>
> Agreed.
>
>> The integer option controls several behaviors which is done using bit-wise
>> fields and | (for example, 2|4):
>>
>> to avoid 2 arguments
>
> But I don't see why you would want to set two of those values --- they
> seem mutually exclusive, e.g.
>
> 1 divides the rank by 1 + the logarithm of the document length
> 2 divides the rank by the length itself
>
> I assume you do either one, not both.

But what about the other variants?

What I missed is the definition of extent.

From http://www.sai.msu.su/~megera/wiki/NewExtentsBasedRanking

An extent is the shortest non-nested sequence of words which satisfies a query.

>> its id or ts_name;
>> Note that the cascade dropping of the headline function
>> cause dropping of the parser used in fulltext configuration
>> tsname.
>>
>> hmm, probably it should be reversed - cascade dropping of the parser cause
>> dropping of the headline function.
>
> Agreed.
>
>> In example below, fulltext_idx is
>> a GIN index:
>>
>> It's explained above. The problem is that current index api doesn't allow
>> to say if search was lossy or exact, so to preserve performance of
>> GIN index we had to introduce @@@ operator, which is the same as @@, but
>> lossy.
>
> Well, then we have to fix the API.  Telling users to use a different
> operator based on what index is defined is just bad style.

This was raised by Heikki and we discussed it a bit in Ottawa, but it's
unclear if it's doable for 8.3.  The @@@ operator is rarely used, so we
could say it will be improved in future versions.

>> only the lword lexeme, then a TZ
>> definition like ' one 1:11' will not work since lexeme type
>> digit is not assigned to the TZ.
>
> OK, I changed it to be clearer.
>
>> nothing special, just numbers for example.
>>
>> ts_debug displays information about every token of
>> document as produced by the
>> parser and processed by the configured dictionaries using the configuration
>> specified by cfgname or
>> oid.

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Bruce Momjian
Greg Smith wrote:
> I think it does a better job of showing how LDC can shift the top 
> percentile around under heavy load, even though there are runs where it's 
> a clear improvement.  Since there is so much variability in results when 
> you get into this territory, you really need to run a lot of these tests 
> to get a feel for the spread of behavior.  I spent about a week of 
> continuously running tests stalking this bugger before I felt I'd mapped 
> out the boundaries with my app.  You've got your own priorities, but I'd 
> suggest you try to find enough time for a more exhaustive look at this 
> area before nailing down the final form for the patch.

OK, I have hit my limit on people asking for more testing.  I am not
against testing, but I don't want to get into a situation where we just
keep asking for more tests and not move forward.  I am going to rely on
the patch submitters to suggest when enough testing has been done and
move on.

I don't expect this patch to be perfect when it is applied.  I do expect
it to be a best effort, and it will get continual real-world testing
during beta, and we can continue to improve this.  Right now, we know we
have a serious issue with checkpoint I/O, and this patch is going to
improve that in most cases.  I don't want to see us reject it or greatly
delay beta as we try to make it perfect.

My main point is that we should keep trying to make the patch better,
but the patch doesn't have to be perfect to get applied.  I don't want
us to get into a death-by-testing spiral.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Bruce Momjian
Peter Eisentraut wrote:
> Am Mittwoch, 20. Juni 2007 05:54 schrieb Bruce Momjian:
> > Agreed. ?I don't see the point in following a standard few people know
> > about.
> 
> Yes, let's drop SQL as well.

If SQL was not a popular standard, we would drop it.  You and Alvaro are
saying that 'm' for meter and 'min' for minute is commonly recognized
outside the USA/UK, so that is good enough for me to say that the
existing setup is fine.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Updated tsearch documentation

2007-06-20 Thread Bruce Momjian
Oleg Bartunov wrote:
> On Sun, 17 Jun 2007, Bruce Momjian wrote:
> 
> > I have completed my first pass over the tsearch documentation:
> >
> > http://momjian.us/expire/fulltext/HTML/sql.html
> >
> > They are from section 14 and following.
> >
> > I have come up with a number of questions that I placed in SGML comments
> > in these files:
> >
> > http://momjian.us/expire/fulltext/SGML/
> >
> > Teodor/Oleg, let me know when you want to go over my questions.
> 
> Below are my answers (marked as )

OK.
> 
> Comments to editorial work of Bruce Momjian.
> 
> fulltext-intro.sgml:
> 
> it is useful to have a predefined list of lexemes.
> 
> Bruce, here should be a list of the types of lexemes!

Agreed.  Is the list of lexemes parser-specific?

> 
> 
> 
> 
> I don't understand where you got this paragraph :)

Uh, it was in the SGML.  I have removed it.

> fulltext-opfunc.sgml:
> 
> All of the following functions that accept a configuration argument can
> use either an integer  or a textual configuration
> name to select a configuration.
> 
> originally it was integer id, probably better use oid

Uh, my question is why are you allowing specification as an integer/oid
when the name works just fine.  I don't see the value in allowing
numbers here.

> This returns the query used for searching an index. It can be used to test
> for an empty query. The SELECT below returns 'T',
>  which corresponds to an empty query since GIN indexes
> do not support negate queries (a full index scan is inefficient):
> 
> > capital case. This looks cumbersome, probably querytree() should
> > just return NULL.

Agreed.

> The integer option controls several behaviors which is done using bit-wise
> fields and | (for example, 2|4):
> 
> 
> > to avoid 2 arguments

But I don't see why you would want to set two of those values --- they
seem mutually exclusive, e.g.

1 divides the rank by 1 + the logarithm of the document length
2 divides the rank by the length itself

I assume you do either one, not both.

> its id or ts_name; 
> Note that the cascade dropping of the headline function
> cause dropping of the parser used in fulltext configuration
> tsname.
> 
> 
> > hmm, probably it should be reversed - cascade dropping of the parser cause
> > dropping of the headline function.

Agreed.

> 
> In example below, fulltext_idx is
> a GIN index:
> 
> > It's explained above. The problem is that current index api doesn't allow
> > to say if search was lossy or exact, so to preserve performance of
> > GIN index we had to introduce @@@ operator, which is the same as @@, but
> > lossy.

Well, then we have to fix the API.  Telling users to use a different
operator based on what index is defined is just bad style.

> only the lword lexeme, then a TZ
> definition like ' one 1:11' will not work since lexeme type
> digit is not assigned to the TZ.
> 
> 

OK, I changed it to be clearer.

> > nothing special, just numbers for example.
> 
> ts_debug displays information about every token of
> document as produced by the
> parser and processed by the configured dictionaries using the configuration
> specified by cfgname or
> oid. 

Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Peter Eisentraut
On Wednesday, 20 June 2007 05:54, Bruce Momjian wrote:
> Agreed.  I don't see the point in following a standard few people know
> about.

Yes, let's drop SQL as well.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Greg Smith

On Wed, 20 Jun 2007, Heikki Linnakangas wrote:

> Another series with 150 warehouses is more interesting. At that # of
> warehouses, the data disks are 100% busy according to iostat. The 90th
> percentile response times are somewhat higher with LDC, though the
> variability in both the baseline and LDC test runs seems to be pretty
> high.

Great, this is exactly the behavior I had observed and wanted someone
else to independently run into.  When you're in 100% disk busy land, LDC
can shift the distribution of bad transactions around in a way that some
people may not be happy with, and that might represent a step backward
from the current code for them.  I hope you can understand now why I've
been so vocal that it must be possible to pull this new behavior out so
the current form of checkpointing is still available.


While it shows up in the 90% figure, what happens is most obvious in the 
response time distribution graphs.  Someone who is currently getting a run 
like #295 right now: http://community.enterprisedb.com/ldc/295/rt.html


Might be really unhappy if they turn on LDC expecting to smooth out 
checkpoints and get the shift of #296 instead: 
http://community.enterprisedb.com/ldc/296/rt.html


That is of course cherry-picking the most extreme examples.  But it 
illustrates my concern about the possibility for LDC making things worse 
on a really overloaded system, which is kind of counter-intuitive because 
you might expect that would be the best case for its improvements.


When I summarize the percentile behavior from your results with 150 
warehouses in a table like this:


Test   LDC %   90%
295    None    3.703
297    None    4.432
292    10      3.432
298    20      5.925
296    30      5.992
294    40      4.132

I think it does a better job of showing how LDC can shift the top 
percentile around under heavy load, even though there are runs where it's 
a clear improvement.  Since there is so much variability in results when 
you get into this territory, you really need to run a lot of these tests 
to get a feel for the spread of behavior.  I spent about a week of 
continuously running tests stalking this bugger before I felt I'd mapped 
out the boundaries with my app.  You've got your own priorities, but I'd 
suggest you try to find enough time for a more exhaustive look at this 
area before nailing down the final form for the patch.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Heikki Linnakangas
I've uploaded the latest test results to the results page at 
http://community.enterprisedb.com/ldc/


The test results on the index page are not in a completely logical 
order, sorry about that.


I ran a series of tests with 115 warehouses, and no surprises there. LDC 
smooths the checkpoints nicely.


Another series with 150 warehouses is more interesting. At that # of
warehouses, the data disks are 100% busy according to iostat. The 90th
percentile response times are somewhat higher with LDC, though the
variability in both the baseline and LDC test runs seems to be pretty
high. Looking at the response time graphs, even with LDC there are
clear checkpoint spikes there, but they're much less severe than without.

Another series was with 90 warehouses, but without think times, driving
the system to full load. LDC seems to smooth the checkpoints very nicely
in these tests.


Heikki Linnakangas wrote:
> Gregory Stark wrote:
>> "Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
>>> Now that the checkpoints are spread out more, the response times are
>>> very smooth.
>>
>> So obviously the reason the results are so dramatic is that the
>> checkpoints used to push the i/o bandwidth demand up over 100%. By
>> spreading it out you can see in the io charts that even during the
>> checkpoint the i/o busy rate stays just under 100% except for a few
>> data points.
>>
>> If I understand it right Greg Smith's concern is that in a busier
>> system where even *with* the load distributed checkpoint the i/o
>> bandwidth demand during the checkpoint was *still* being pushed over
>> 100%, then spreading out the load would only exacerbate the problem by
>> extending the outage.
>>
>> To that end it seems like what would be useful is a pair of tests with
>> and without the patch with about 10% larger warehouse size (~ 115)
>> which would push the i/o bandwidth demand up to about that level.
>
> I still don't see how spreading the writes could make things worse, but
> running more tests is easy. I'll schedule tests with more warehouses
> over the weekend.
>
>> It might even make sense to run a test with an outright overloaded
>> system to see if the patch doesn't exacerbate the condition. Something
>> with a warehouse size of maybe 150. I would expect it to fail the TPCC
>> constraints either way but what would be interesting to know is whether
>> it fails by a larger margin with the LDC behaviour or a smaller margin.
>
> I'll do that as well, though experiences with tests like that in the
> past have been that it's hard to get repeatable results that way.




--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Marko Kreen

On 6/20/07, Chris Browne <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] ("Marko Kreen") writes:
>> To Chris: you should like PgQ, its just stored procs in database,
>> plus it's basically just generalized Slony-I, with some optimizations,
>> so should be familiar territory ;)
>
> Looks interesting...

Thanks :)

>> Random ideas
>>
>> - insert_event in C (way to get rid of plpython)
>
> Yeah, I'm with that...  Ever tried building [foo] on AIX, where foo in
> ('perl', 'python', ...)???  :-(
>
> It seems rather excessive to add in a whole stored procedure language
> simply for one function...

Well, it's standard in our installations as we use it for
other stuff too.  It's much easier to prototype in PL/Python
than in C...

As it has not been a performance problem I have not bothered
to rewrite it.  But now that the interface has been stable for some
time, it could be done.

--
marko



Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-20 Thread Andrew Sullivan
On Wed, Jun 20, 2007 at 12:34:21PM -0400, Robert Treat wrote:
> FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for
> 8.1->8.2.  Unfortunately until the PGDG decides that in-place upgrade is
> a constraint they're willing to place on development, I see them a good
> chicken/egg away from making it a continually useful tool.

Or maybe cart/horse.  It seems to me that the rule more likely needs
to be that the migrator follow the development of the database than
that the database engine be strongly constrained by the needs of an
upgrade tool.  I agree that some commitment is needed, though.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The whole tendency of modern prose is away from concreteness.
--George Orwell



Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-20 Thread Robert Treat
On Tuesday 19 June 2007 10:15, Tom Lane wrote:
> Zdenek Kotala <[EMAIL PROTECTED]> writes:
> > I'm a little bit confused about when we introduce a new page layout
> > version. I expect that a new version comes with changes to the page
> > header, tuple header, or data encoding (varlen/TOAST ...). But when
> > there was a new internal implementation of a data type, there was no
> > reason to update the version (see inet/cidr between 8.1 -> 8.2). Can
> > somebody clarify this for me?
>
> Well, we've changed it when there was a benefit to an existing tool to
> do so.  So far that's meant page header and tuple header changes.  If
> we ever had a working in-place upgrade solution, I think we'd be willing
> to make the page version account for datatype format changes too.
>

FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for
8.1->8.2.  Unfortunately until the PGDG decides that in-place upgrade is a
constraint they're willing to place on development, I see them a good
chicken/egg away from making it a continually useful tool.

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Chris Browne
[EMAIL PROTECTED] ("Marko Kreen") writes:
> To Chris: you should like PgQ, its just stored procs in database,
> plus it's basically just generalized Slony-I, with some optimizations,
> so should be familiar territory ;)

Looks interesting...

Random ideas

- insert_event in C (way to get rid of plpython)

Yeah, I'm with that...  Ever tried building [foo] on AIX, where foo in
('perl', 'python', ...)???  :-(

It seems rather excessive to add in a whole stored procedure language
simply for one function...
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "linuxdatabases.info")
http://www3.sympatico.ca/cbbrowne/sgml.html
I always try to do things in chronological order. 



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Marko Kreen

On 6/20/07, Rob Butler <[EMAIL PROTECTED]> wrote:
> Do you guys need something PG specific or built into PG?

Yes, we need it usable from inside the DB, thus the PgQ.

That means the events are also transactional with other
things happening in the DB.

> ActiveMQ is very nice, speaks multiple languages, protocols and supports
> a ton of features.  Could you simply use that?

I guess that if you need a standalone message broker,
ActiveMQ may be a good choice.  At least, any solution that
avoids the database when passing messages should outperform
solutions that pipe stuff through a (general-purpose) database.

OTOH, if you _do_ need to transport the events via the database,
it should be very hard to outperform PgQ. :)  It uses the
user-level xid/snapshot trick introduced by rserv/erserver/slony,
which is not possible with databases other than PostgreSQL.
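
(A rough sketch of the trick, using the txid functions that ship with
Skytools -- names from memory, so verify against the module:

    SELECT txid_current();           -- tag each queued event with the
                                     -- writing transaction's id
    SELECT txid_current_snapshot();  -- consumers keep two snapshots and
                                     -- fetch events whose xid became
                                     -- visible between them

This is how batches can be formed without ever missing or double-reading
an event that was in flight.)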

--
marko



Re: [HACKERS] Suggestion for Enum Support Functions

2007-06-20 Thread Andrew Dunstan



toronto programmer wrote:
> Dear Postgres developers,
>
> I have been working with Oracle for a few years now in my work, and I
> tried some free databases for a project that I'm developing for my own
> use. I have tried H2, Firebird, and Postgres, and found the last to be
> the most stable and feature-rich, so thanks for all the good work.
>
> I have read the 8.3 documentation, and with reference to the Enum Support
> Functions found on
> http://developer.postgresql.org/pgdocs/postgres/functions-enum.html, I
> think it would be useful to add 2 functions, enum_after(anyenum) and
> enum_before(anyenum), so having:
>
> CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue',
> 'purple');
> enum_after('orange'::rainbow) will return 'yellow'
> enum_after('purple'::rainbow) will return an error
> enum_before('purple'::rainbow) will return 'blue'
>
> A good-to-have function would be enum_size(anyenum), which would return
> 6 in the previous example; that would be helpful in dealing with enums.



You could easily create these for yourself, of course. For example:


 create or replace function enum_size(anyenum)
   returns int as
   $$ select array_upper(enum_range($1),1) $$
   language sql;


Successor and predecessor functions would be a bit more work, but not 
hard. I don't think they should error out at the range extremes, though. 
Perhaps returning NULL would  be better.
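
(For instance, a minimal sketch along the same lines as enum_size --
untested, and returning NULL past the top of the range:

 create or replace function enum_after(anyenum)
   returns anyenum as
   $$ select (enum_range($1))[g.i + 1]
      from generate_series(1, array_upper(enum_range($1), 1)) as g(i)
      where (enum_range($1))[g.i] = $1 $$
   language sql;

enum_before would be the same with "- 1".)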


We could look at adding these as builtins for 8.4, but it's too late now 
to add them for 8.3. Besides, I think we need to see how enums are used 
in the field before deciding if any extensions are needed.




cheers

andrew



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Jeroen T. Vermeulen
On Wed, June 20, 2007 19:42, Rob Butler wrote:
> Do you guys need something PG specific or built into PG?
>
> ActiveMQ is very nice, speaks multiple languages, protocols and supports a
> ton of features.  Could you simply use that?
>
> http://activemq.apache.org/

Looks very nice indeed!


Jeroen





Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-20 Thread Alvaro Herrera
Bruce Momjian wrote:

> Agreed.  I don't see the point in following a standard few people know
> about.

Few people in the US and UK you mean, right?  Everybody else stopped
measuring in king's feet and thumbs a long time ago.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Rob Butler
Do you guys need something PG specific or built into PG?

ActiveMQ is very nice, speaks multiple languages, protocols and supports a ton 
of features.  Could you simply use that?

http://activemq.apache.org/

Rob



   




Re: [HACKERS] PG-MQ?

2007-06-20 Thread Heikki Linnakangas

Jeroen T. Vermeulen wrote:
> On Wed, June 20, 2007 18:18, Heikki Linnakangas wrote:
>> Marko Kreen wrote:
>>> As I understand, JMS does not have a concept
>>> of transactions, probably also other solutions mentioned before,
>>> so to use PgQ as backend for them should be much simpler...
>>
>> JMS certainly does have the concept of transactions. Both distributed
>> ones through XA and two-phase commit, and local involving just one JMS
>> provider. I don't know about others, but would be surprised if they
>> didn't.
>
> Wait...  I thought XA did two-phase commit, and then there was XA+ for
> *distributed* two-phase commit, which is much harder?

Well, I meant distributed as in one transaction manager, multiple
resource managers, all participating in a single atomic transaction. I
don't know what XA+ adds on top of that.

To be precise, being a Java thing, JMS actually supports two-phase
commit through JTA (the Java Transaction API), not XA. It's the same
design and interface, just defined as Java interfaces instead of at the
native library level.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Jeroen T. Vermeulen
On Wed, June 20, 2007 18:18, Heikki Linnakangas wrote:
> Marko Kreen wrote:
>> As I understand, JMS does not have a concept
>> of transactions, probably also other solutions mentioned before,
>> so to use PgQ as backend for them should be much simpler...
>
> JMS certainly does have the concept of transactions. Both distributed
> ones through XA and two-phase commit, and local involving just one JMS
> provider. I don't know about others, but would be surprised if they
> didn't.

Wait...  I thought XA did two-phase commit, and then there was XA+ for
*distributed* two-phase commit, which is much harder?


Jeroen





Re: [HACKERS] PG-MQ?

2007-06-20 Thread Marko Kreen

On 6/20/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
> Marko Kreen wrote:
>> As I understand, JMS does not have a concept
>> of transactions, probably also other solutions mentioned before,
>> so to use PgQ as backend for them should be much simpler...
>
> JMS certainly does have the concept of transactions. Both distributed
> ones through XA and two-phase commit, and local involving just one JMS
> provider. I don't know about others, but would be surprised if they
> didn't.

Ah, sorry, my mistake then.  Shouldn't trust hearsay :)

--
marko



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Heikki Linnakangas

Marko Kreen wrote:
> As I understand, JMS does not have a concept
> of transactions, probably also other solutions mentioned before,
> so to use PgQ as backend for them should be much simpler...

JMS certainly does have the concept of transactions. Both distributed
ones through XA and two-phase commit, and local involving just one JMS
provider. I don't know about others, but would be surprised if they didn't.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Marko Kreen

On 6/20/07, Jeroen T. Vermeulen <[EMAIL PROTECTED]> wrote:
> On Wed, June 20, 2007 04:45, Chris Browne wrote:
>> - Sometimes you have the semantics where:
>>   - messages need to be delivered at least once
>>   - messages need to be delivered no more than once
>>   - messages need to be delivered exactly once
>
> IMHO, if you're not doing "exactly once," or something very close to it,
> you might as well stay with ad-hoc code.  You can ensure single delivery
> by having the sender re-send when in doubt, and keeping track of
> duplications in the recipient.

In the case of PgQ, the "at least once" semantics is related to the
batch-based processing it does - in case of failure, the full batch is
delivered again, so if the consumer had managed to process some of the
items already, it gets them twice.

As it is responsible only for delivering events from the database,
it has no way of guaranteeing "exactly once" behaviour; that needs
to be built on top of PgQ.

The simplest case is when the events are processed in the same database
the queue resides in.  Then you can just fetch, process, and close the
batch in one transaction, and immediately you get "exactly once"
behaviour.
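
(A hedged sketch of that in-database consumer loop -- PgQ function names
as I recall them from Skytools, so verify against its docs:

    BEGIN;
    SELECT pgq.next_batch('myqueue', 'myconsumer');  -- batch id, or NULL
    SELECT ev_id, ev_type, ev_data
      FROM pgq.get_batch_events(:batch_id);          -- events to process
    -- ... apply the events to local tables here ...
    SELECT pgq.finish_batch(:batch_id);
    COMMIT;  -- events and their effects commit or roll back together

with :batch_id standing in for the id returned by next_batch.)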

To achieve "exactly once" behaviour with different databases, look
at the "pgq_ext" module for sample.  Basically it just requires
storing batch_id/event_id on remote db and committing there first.
Later it can be checked if the batch/event is already processed.

It's tricky only if you want to achieve full transactionality for
event processing.  As I understand, JMS does not have a concept
of transactions, probably also other solutions mentioned before,
so to use PgQ as backend for them should be much simpler...

To Chris: you should like PgQ, its just stored procs in database,
plus it's basically just generalized Slony-I, with some optimizations,
so should be familiar territory ;)

--
marko



Re: [HACKERS] PG-MQ?

2007-06-20 Thread Markus Schiltknecht

Hi Chris,

Chris Browne wrote:
> I'm seeing some applications where it appears that there would be
> value in introducing asynchronous messaging, ala "message queueing."

ISTM that 'message queue' is a way too general term. There are hundreds
of different queues at different levels on a standard server. So I'm
somewhat unsure about what problem you want to solve.

> c) There are lesser names, like isectd and the
> (infamous?) Spread Toolkit which both implement memory-based messaging
> systems.

If a GCS is about what you're looking for, then you also might want to
consider these: Ensemble, Appia, or JGroups. There's a Java layer called
jGCS, which supports even more, similar systems.

Another commonly used term is 'reliable multicast', which guarantees
that messages are delivered to a group of recipients. These algorithms
are often the basis for a GCS.

> My bias would be to have something that can basically run as a thin
> set of stored procedures atop PostgreSQL :-).  It would be trivial to
> extend that to support SOAP/XML-RPC, if desired.

Hm.. in Postgres-R I currently have (partial) support for Ensemble and
Spread. Exporting that interface via stored procedures could be done,
but you would probably need a manager process, as you certainly want
your connections to persist across transactions (or not?).

Together with that process, we already have half of what Postgres-R is:
an additional process which connects to the GCS. Thus I'm questioning
if there's value in exporting the interface. Can you think of use cases
other than database replication? Why do you want to do that via the
database, then, and not directly with the GCS?

> It would be nice to achieve 'higher availability' by having queues
> where you might replicate the contents (probably using the MQ system
> itself ;-)) to other servers.

Uhm.. sorry, but I fail to see the big news here. Which replication
solution does *not* work that way?

Regards

Markus




Re: [HACKERS] PG-MQ?

2007-06-20 Thread Jeroen T. Vermeulen
On Wed, June 20, 2007 04:45, Chris Browne wrote:
> I'm seeing some applications where it appears that there would be
> value in introducing asynchronous messaging, ala "message queueing."
>
> The "granddaddy" of message queuing systems is IBM's MQ-Series, and I
> don't see particular value in replicating its functionality.

I'm quite interested in this.  Maybe I'm thinking of something too
complex, but I do think there are some "oh it'll need to do that too"
pitfalls that are best considered up front.

The big thing about MQ is that it participates as a resource manager in
two-phase commits (and optionally a transaction manager as well).  That
means that you get atomic processing steps: application takes message off
a queue, processes it, commits its changes to the database, replies to
message.  The queue manager then does a second-phase commit for all of
those steps, and that's when the reply really goes out.  If the
application fails, none of this will have happened so you get ACID over
the complete cycle.  That's something we should have free software for.

Perhaps the time is right for something new.  A lot of the complexity
inside MQ comes from data representation issues like encodings and
fixed-length strings, as I recall, and things have changed since MQ was
designed.  I agree it could be useful (and probably not hard either) to
have a transactional messaging system inside the database.  It saves you
from having to do two-phase commits.

But it does tie everything to postgres to some extent, and you lose the
interesting features (atomicity and assured, single delivery) as soon as
anything in the chain does anything persistent that does not participate
in the postgres transaction.  Perhaps what we really need is more mature
components, with a unified control layer on top.  That's how a lot of
successful free software grows.  See below.


> On the other side, the "big names" these days are:
>
> a) The Java Messaging Service, which seems to implement *way* more
>options than I'm even vaguely interested in having (notably, lots
>that involve data stores or lack thereof that I do not care to use);

Far as I know, JMS is an API, not a product.  You'd still slot some
messaging middleware underneath, such as MQ.  That is why MQSeries was
renamed: it fits into the WebSphere suite as the implementing engine
underneath the JMS API.  From what I understand MQ is one of the
"best-of-breed" products that JMS was designed around.  (Sun's term, bit
hypey for my taste).

In one way, Java is easy: the last thing you want to get into is yet
another marshaling standard.  There are plenty of "standards" to choose
from already, each married to one particular communications mechanism:
RPC, EDI, CORBA, D-Bus, XMLRPC, what have you.  Even postgres has its own.
 I'd say the most successful mechanism is TCP itself, because it isolates
itself from content representation so effectively.

It's hard not to get into marshaling: someone has to do it, and it's often
a drag to do it in the application, but the way things stand now *any*
choice limits the usefulness of what you're building.  That's something
I'd like to see change.

Personally I'd love to see marshaling or low-level data representation
isolated into a mature component that speaks multiple programming
languages on the one hand and multiple data representation formats on the
other.  Something the implementers of some of these messaging standards
would want to use to compose their messages, isolating their format
definitions into plugins.  Something that would make application writers
stop composing messages in finicky ad-hoc code that fails with unexpected
locales or has trouble with different line breaks.

If we had a component like that, combining it with existing transactional
variants of TCP and [S]HTTP might even be enough to build a usable
messaging system.  I haven't looked at them enough to know.  Of course
we'd need implementations of those protocols; see
http://ttcplinux.sourceforge.net/ and http://www.csn.ul.ie/~heathclf/fyp/
for example.

Another box of important tools, and I have no idea where we stand with
this one, is transaction management.  We have 2-phase commit in postgres
now.  But do we have interoperability with existing transaction managers? 
Is there a decent free, portable, everything-agnostic transaction manager?
 With those, the sphere of reliability of a database-driven messaging
package could extend much further.

A free XA-capable filesystem would be great too, but I guess I'm daydreaming.


> There tend to be varying semantics out there:
>
> - Some queues may represent "subscriptions" where a whole bunch of
>   listeners want to get all the messages;

The two simplest models that offer something more than TCP/UDP are 1:n
reliable publish-subscribe without persistence, and 1:1 request-reply with
persistent storage.  D-Bus does them both; IIRC MQ does 1:1 and has
add-ons on top for publish

Re: [HACKERS] PG-MQ?

2007-06-20 Thread Chris Browne
[EMAIL PROTECTED] (Steve Atkins) writes:
>> Is there any existing work out there on this?  Or should I maybe be
>> looking at prototyping something?
>
> The skype tools have some sort of decent-looking publish/subscribe
> thing, PgQ, then they layer their replication on top of. It's multi
> consumer and producer, with "delivered at least once" semantics.
>
> Looks nice.

I had not really noticed that - I need to take a look at their
connection pooler too, so I guess that puts more "skype" items on my
ToDo list ;-).  Thanks for pointing it out...
-- 
let name="cbbrowne" and tld="linuxdatabases.info" in String.concat "@" 
[name;tld];;
http://cbbrowne.com/info/advocacy.html
Signs of a Klingon Programmer #1: "Our users will  know fear and cower
before our software. Ship it! Ship it and let  them flee like the dogs
they are!"



[HACKERS] Suggestion for Enum Support Functions

2007-06-20 Thread toronto programmer
Dear Postgres developers, 

I have been working with Oracle for a few years now in my work, and I tried
some free databases for a project that I'm developing for my own use. I have
tried H2, Firebird, and Postgres, and found the last to be the most stable
and feature-rich, so thanks for all the good work.

I have read the 8.3 documentation, and with reference to the Enum Support
Functions found on
http://developer.postgresql.org/pgdocs/postgres/functions-enum.html, I think
it would be useful to add 2 functions, enum_after(anyenum) and
enum_before(anyenum), so having:

CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue',
'purple');
enum_after('orange'::rainbow) will return 'yellow'
enum_after('purple'::rainbow) will return an error
enum_before('purple'::rainbow) will return 'blue'

A good-to-have function would be enum_size(anyenum), which would return 6 in
the previous example; that would be helpful in dealing with enums.

Best regards

Hashim Kubba 
 





