[HACKERS] CREATE INDEX CONCURRENTLY?

2014-10-31 Thread Mark Woodward
I have not kept up with PostgreSQL changes and have just been using it. A
co-worker recently told me that you need the word CONCURRENTLY in CREATE
INDEX to avoid table locking. I called BS on this because to my knowledge
PostgreSQL does not lock tables. I referenced this page in the
documentation:

http://www.postgresql.org/docs/9.3/static/locking-indexes.html

However, I do see this sentence in the indexing page that was not in the
docs prior to 8.0:

Creating an index can interfere with regular operation of a database.
Normally PostgreSQL locks the table to be indexed against writes and
performs the entire index build with a single scan of the table.

Is this true? When and why did this change?

When we use CONCURRENTLY, it seems to hang. I am looking into it.


[HACKERS] SSL and USER_CERT_FILE

2008-05-15 Thread Mark Woodward
I am using PostgreSQL's SSL support, and the conventions for the key and
certificate files don't make sense from the client perspective, especially
under Windows.

I am proposing a few simple changes:

Adding two API functions:

void PQsetSSLUserCertFileName(char *filename)
{
    user_crt_filename = strdup(filename);
}

void PQsetSSLUserKeyFileName(char *filename)
{
    user_key_filename = strdup(filename);
}

Adding two static vars in fe-secure.c

static char *user_key_filename = NULL;
static char *user_crt_filename = NULL;

In client_cert_cb(...)

Add:

if (user_crt_filename)
    strncpy(fnbuf, user_crt_filename, sizeof(fnbuf));
else
    snprintf(fnbuf, sizeof(fnbuf), "%s/%s", homedir, USER_CERT_FILE);

and:

if (user_key_filename)
    strncpy(fnbuf, user_key_filename, sizeof(fnbuf));
else
    snprintf(fnbuf, sizeof(fnbuf), "%s/%s", homedir, USER_KEY_FILE);


The purpose of these changes is to make it easier to configure SSL in an
application which uses libpq.
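For illustration, here is a rough usage sketch (the file paths and connection
string are made up; PQsetSSLUserCertFileName and PQsetSSLUserKeyFileName are
the functions proposed above, not part of the existing libpq API):

#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn;

    /* Proposed calls from above: point libpq at an application-chosen
       certificate and key instead of the files under ~/.postgresql */
    PQsetSSLUserCertFileName("/etc/myapp/client.crt");
    PQsetSSLUserKeyFileName("/etc/myapp/client.key");

    conn = PQconnectdb("host=db.example.com dbname=app sslmode=require");
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    /* ... use the connection ... */
    PQfinish(conn);
    return 0;
}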

Any comments?



Re: [HACKERS] Permanent settings

2008-02-21 Thread Mark Woodward
I have been looking at this thread for a bit and want to interject an idea.

A couple of years ago, I offered a patch to the GUC system that added a
number of abilities; two that were left out were:

(1) Specify a configuration file on the command line.
(2) Allow the inclusion of a configuration file from within the
configuration file.


If the include functionality were reinstated, then a default file,
something like $DATADIR/defaults.conf, could be included by default at
startup. This file could be written by the server itself.

It offers the benefit of being readable and editable, requiring very
little extra work to implement, and working in a consistent way with
existing functionality.

P.S. I'd like the ability to load a config file with a command line option
as well. :-)




Re: [HACKERS] Permanent settings

2008-02-21 Thread Mark Woodward


 Mark Woodward wrote:
 I have been looking at this thread for a bit and want to interject an
 idea.

 A couple years ago, I offered a patch to the GUC system that added a
 number of abilities, two left out were:

 (1) Specify a configuration file on the command line.
 (2) Allow the inclusion of a configuration file from within the
 configuration file.


 eh? We have both of these capabilities.

Really? Maybe I'm just missing it in the --help message and the web
docs (I didn't see it in the code either). How do you specify a config
file on postmaster startup?

Also I didn't see any way to include a file from within postgresql.conf.








[HACKERS] Syntax bug? Group by?

2006-10-17 Thread Mark Woodward

Shouldn't this work?

select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

ERROR:  column y.ycis_id must appear in the GROUP BY clause or be used
in an aggregate function

If I am asking for a specific column value, should I, technically
speaking, need to group by that column?



Re: [HACKERS] Syntax bug? Group by?

2006-10-17 Thread Mark Woodward
 Stephen Frost wrote:

 select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

 But back to the query the issue comes in that the ycis_id value is
 included with the return values requested (a single row value with
 aggregate values that isn't grouped) - if ycis_id is not unique you will
 get x number of returned tuples with ycis_id=15 and the same min() and
 avg() values for each row.
 Removing the ycis_id after the select will return the aggregate values
 you want without the group by.

I still assert that there will always be only one row from this query. This
is an aggregate query, so all the rows with ycis_id = 15 will be
aggregated. Since ycis_id is the identifying part of the query, it should
not need to be grouped.

My question is: is it a syntactic technicality that PostgreSQL asks for a
GROUP BY, or a bug in the parser?



Re: [HACKERS] Syntax bug? Group by?

2006-10-17 Thread Mark Woodward
 Hi, Mark,

 Mark Woodward wrote:
 Stephen Frost wrote:

 select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
 But back to the query the issue comes in that the ycis_id value is
 included with the return values requested (a single row value with
 aggregate values that isn't grouped) - if ycis_id is not unique you
 will
 get x number of returned tuples with ycis_id=15 and the same min() and
 avg() values for each row.
 Removing the ycis_id after the select will return the aggregate values
 you want without the group by.

 I still assert that there will always only be one row to this query.
 This
 is an aggregate query, so all the rows with ycis_id = 15, will be
 aggregated. Since ycis_id is the identifying part of the query, it
 should
 not need to be grouped.

 My question, is it a syntactic technicality that PostgreSQL asks for a
 group by, or a bug in the parser?

 I think that it's a lack of special-casing the = operator. Imagine
 where ycis_id <> 15 or where ycis_id @| $RECTANGLE or other (probably
 user defined) operators on (probably user defined) datatypes.

 The parser has no real knowledge what the operators do, it simply
 requests one that returns a bool.

 One could make the parser to special case the = operator, and maybe some
 others, however I doubt it's worth the effort.

I understand the SQL, and this isn't an SQL question (else it would be on
a different list); it is a PostgreSQL internals question and, IMHO, a
potential bug.

The original query:
select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

Should NOT require a group by to get ycis_id in the results.







Re: [HACKERS] Syntax bug? Group by?

2006-10-17 Thread Mark Woodward
 Hi, Mark,

 Mark Woodward wrote:
 Shouldn't this work?

 select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

 ERROR:  column y.ycis_id must appear in the GROUP BY clause or be used
 in an aggregate function

 If I am asking for a specific column value, should I, technically
 speaking, need to group by that column?

 Try:

 SELECT 15 as ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;


This isn't a SQL question!!! This is a question of whether or not
PostgreSQL is correct in requiring a GROUP BY in the query. I assert
that since it is unambiguous as to what ycis_id should be, PostgreSQL
should not require a grouping.



Re: [HACKERS] Syntax bug? Group by?

2006-10-17 Thread Mark Woodward
 Mark Woodward wrote:
 Stephen Frost wrote:


 select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

 But back to the query the issue comes in that the ycis_id value is
 included with the return values requested (a single row value with
 aggregate values that isn't grouped) - if ycis_id is not unique you
 will
 get x number of returned tuples with ycis_id=15 and the same min() and
 avg() values for each row.
 Removing the ycis_id after the select will return the aggregate values
 you want without the group by.


 I still assert that there will always only be one row to this query.
 This
 is an aggregate query, so all the rows with ycis_id = 15, will be
 aggregated. Since ycis_id is the identifying part of the query, it
 should
 not need to be grouped.

 My question, is it a syntactic technicality that PostgreSQL asks for a
 group by, or a bug in the parser?



 AFAIK what you want is not per sql spec. What if you had instead written


   select ycis_id, min(tindex), avg(tindex) from y where frobnitz(ycis_id)
 = 15;


 ? I think you are expecting too much reasoning from the engine.

Regardless, I can get the results I need and have already worked around
this. The reason why I posted the question to hackers was that I think it
is a bug.

The output column ycis_id is unambiguously a single value with regard to
the query. Shouldn't PostgreSQL know this? AFAIR, I think I've used this
exact type of query before either on PostgreSQL or another system, maybe
Oracle, and it did work.



Re: [HACKERS] Syntax bug? Group by?

2006-10-17 Thread Mark Woodward
 Mark Woodward wrote:
 select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;


 I still assert that there will always only be one row to this query.
 This
 is an aggregate query, so all the rows with ycis_id = 15, will be
 aggregated. Since ycis_id is the identifying part of the query, it
 should
 not need to be grouped.

 My question, is it a syntactic technicality that PostgreSQL asks for a
 group by, or a bug in the parser?

 I think your point is that every non-aggregate column in the results of
 the query also appears in the where clause and is given a single value
 there, so conceivably, an all-knowing, all-powerful postgres could
 recognize this and do the implied GROUP by on these columns.

Not exactly.

 I'm not in a position to give a definitive answer on this, but I suspect
 that adjusting the query parser/planner to allow an implied GROUP BY
 either gets prohibitively complicated, or fits too much of a special
 case to be worth implementing.

 select
   ycis_id,
   some_other_id,
   min(tindex),
   avg(tindex)
   from
   y
   where
   ycis_id = 15
   group by
   some_other_id;

This is not, in fact, like the example I gave and confuses the point I am
trying to make.


The original query:
select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

ycis_id is unambiguous and MUST be only one value; there should be no
requirement of grouping. In fact, a GROUP BY implies multiple result
rows in an aggregate query.

As I said in other branches of this thread, this isn't a SQL question; it
is a question of whether the PostgreSQL parser is correct, and I do not
believe that it is working correctly.




Re: [HACKERS] Syntax bug? Group by?

2006-10-17 Thread Mark Woodward
 On Tue, Oct 17, 2006 at 02:41:25PM -0400, Mark Woodward wrote:

 The output column ycis_id is unabiguously a single value with regards
 to
 the query. Shouldn't PostgreSQL know this? AFAIR, I think I've used
 this
 exact type of query before either on PostgreSQL or another system, maybe
 Oracle, and it did work.

 Doesn't work in Oracle 10g:

 SELECT ycis_id, tindex from x where ycis_id = 15;
 YCIS_ID  TINDEX
 ===  ==
  15  10
  15  20

 SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15;
 ORA-00937: not a single-group group function

 SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15 GROUP
 BY ycis_id;
 YCIS_ID  MIN(TINDEX)  AVG(TINDEX)
 ===  ===  ===
  15   10   15


That's interesting. I am digging through the SQL99 spec, and am trying to
find a definitive answer.





Re: [HACKERS] Syntax bug? Group by?

2006-10-17 Thread Mark Woodward
 Mark Woodward wrote:
 Shouldn't this work?

 select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

 ERROR:  column y.ycis_id must appear in the GROUP BY clause or be
 used in an aggregate function

 This would require a great deal of special-casing, in particular
 knowledge of the = operator, and then the restriction to a particular
 form of the WHERE clause.  For overall consistency, I don't think this
 should be allowed.


Well, this started out as a "huh, that's funny, that should work, is that
a bug?" and is turning into a search through the SQL99 spec for a clear
answer. I've already worked around it, but to me, at least, it seems it
should work.



Re: [HACKERS] Syntax bug? Group by?

2006-10-17 Thread Mark Woodward
 On Oct 17, 2006, at 15:19, Peter Eisentraut wrote:

 Mark Woodward wrote:
 Shouldn't this work?

 select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

 ERROR:  column y.ycis_id must appear in the GROUP BY clause or be
 used in an aggregate function

 This would require a great deal of special-casing, in particular
 knowledge of the = operator, and then the restriction to a particular
 form of the WHERE clause.  For overall consistency, I don't think this
 should be allowed.

 In this particular case, the client constructing the query *knows*
 the value of ycis_id (since the client is generating the ycis_id =
 15 clause). It's technically just a waste of bandwidth and server
 resources to recalculate it. If you really want to replicate the
 output of the query you proposed, you could rewrite it on the client as:

  select 15 as ycis_id, min(tindex), avg(tindex) from y where
 ycis_id = 15;

 You could argue that the server should do this for you, but it seems
 ugly to do in the general case. And, like Peter points out, would
 need a lot of special-casing. I guess the parser could do it for
 expressions in the SELECT clause that exactly match expressions in
 the WHERE clause.


But, and here's the rub, which is the correct way to handle it? I'm
looking through the SQL99 spec to see if I can find an answer.



Re: [HACKERS] Hints WAS: Index Tuning Features

2006-10-12 Thread Mark Woodward
 Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled into
 her beard:
 What is the point of writing a proposal if there is a threat of
 "will be rejected" if one of the people who would do the rejection
 doesn't at least outline what would be acceptable?

 If your proposal is merely "let's do something like Oracle", it should
 be obvious why that would be rejected.  There is considerable legal
 danger to slavish emulation.  Further, since PostgreSQL isn't Oracle,
 slavish emulation wouldn't work anyways.

I don't actually like Oracle's hinting system.

 If a proposal is too fuzzy to be considered a source of a
 specification, it should be obvious that that would be rejected.

Well, "fuzzy" isn't a bad place to start gathering information
for an eventual proposal.


 If you have an idea clear enough to turn into a meaningful proposal,
 put it in for the usual to and fro; that generally leads to enormous
 improvements.

Absolutely.


 I'm not sure what a good hinting system ought to look like; what I
 *do* know is that a fuzzy proposal won't be much good.

That is sort of the stumbling block. None of us knows what it should look
like, but leaving the topic as "if you want it, go do the work and submit
a patch" isn't going to get it done.

First we should decide if it is, in fact, something that ought to happen;
then, if that happens, we should think about what it should be.

Again, what would be the point of writing a proposal if there is *no*
consensus on what would be acceptable?



Re: [HACKERS] Index Tuning Features

2006-10-11 Thread Mark Woodward
 On 10/10/06, Mark Woodward [EMAIL PROTECTED] wrote:
 I think the idea of virtual indexes is pretty interesting, but
 ultimately a lesser solution to a more fundamental issue, and that would
 be hands-on control over the planner. Estimating the effect of an index
 on a query prior to creating the index is a great idea; how that is done
 is something different than building consensus that it should be done.

 Another thing that this brings up is hints to a query. Over the years, I
 have run into situations where the planner wasn't great. It would be nice
 to try forcing different strategies on the planner and see if performance
 can be improved.


 you can do this by setting enable_access_method type parameters.

Here's your hammer, all your problems are now nails.

The enable_xxx settings are OK for simple queries gone wrong, but if you
have a more complex query, any one of those settings may help or hinder
different parts of the query; you would then be left with choosing which of
them helps more than hurts the overall query.

Being able to alter the query plan would help in areas where there are
data patterns in a database that the ANALYZE command can't pick up because
it is not designed to.

Imagine you have a street map database ordered by zip, street, number. The
primary order is zipcode, the secondary order is street. There is a
relationship of number to street, and zip to street. The analyzer, at
least the last time I checked, does not recognize these relationships. So,
a search by street and number would probably use a sequential scan rather
than the street index.






Re: [HACKERS] Index Tuning Features

2006-10-11 Thread Mark Woodward

 Mark Woodward [EMAIL PROTECTED] writes:

 The analyzer, at least the last time I checked, does not recognize these
 relationships.

 The analyzer is imperfect, but arguing from any particular imperfection is
 weak because someone will just come back and say we should work on that
 problem -- though I note nobody's actually volunteering to do so, whereas
 they appear to be for hints.

 I think the stronger argument is to say that there are some statistical
 properties that the analyzer _cannot_ be expected to figure out. Either
 because

 a) they're simply too complex to ever expect to be able to find
    automatically,

 b) too expensive to make it worthwhile in the general case, or

 c) because of some operational issue such as the data changing frequently
    enough that the analyzes that would be necessary to keep the statistics
    up to date would become excessively expensive or even be impossible to
    perform rapidly enough.

Well, from a purely data-domain standpoint, it is impossible to characterize
the exact nature of a data set without enough information to recreate it.
Anything less must be designed for a fixed set of assumptions. There is no
way that every specific trend can be covered by a fixed number of
assumptions.

The argument that "all we need is better statistics" completely misses the
point. There will *always* be a number of cases where the planner will not
work optimally. I would say that a simpler planner with better hints
will always be capable of creating a better query plan.




Re: [HACKERS] Index Tuning Features

2006-10-11 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:
 I would say that a simpler planner with better hints
 will always be capable of creating a better query plan.

 This is demonstrably false: all you need is an out-of-date hint, and
 you can have a worse plan.

That doesn't make it false, it makes it higher maintenance. Hints are
understood to require maintenance.


 The argument against hints is not about whether someone could knock
 together a crappy hint facility and be able to get some use out of it.
 It is about how much work it would take to design a *good* hint facility
 that makes it easy to maintain hints that are robust in the face of data
 and query changes.  If someone were to sit down and design and build
 such a thing, it'd very likely get accepted into core Postgres --- but
 personally, I think the equivalent amount of effort would be better
 spent on improving the planner and the statistics.

While it is always true that something can be improved, there comes a
point where work outweighs benefits. I can't say that the planner is at
that point, but I think that isn't even an issue.

The notion of hints would probably be one of the biggest steps toward
improving the planner. Like I said, it is inarguable that there will
always be queries that the planner cannot execute efficiently based on
the statistics gathered by ANALYZE. Since that number must be greater than
zero, some methodology to deal with it should be created.



 As Josh already noted, Oracle-like hints are pretty likely to get
 rejected ... not only because of doubts about their true usefulness,
 but out of fear of falling foul of some Oracle patent or other.

Well, if it would get rejected if it looked like Oracle, assuming you
would probably be one of the people rejecting it, what do you envision as
not being rejected?



Re: [HACKERS] Hints WAS: Index Tuning Features

2006-10-11 Thread Mark Woodward
 Mark,

 First off, I'm going to request that you (and other people) stop hijacking
 Simon's thread on hypothetical indexes.   Hijacking threads is an
 effective way to get your ideas rejected out of hand, just because the
 people whose thread you hijacked are angry with you.

 So please observe the thread split, thanks.

 Well, if it would get rejected if it looked like Oracle, assuming you
 would probably be one of the people rejecting it, what do you envision
 as not being rejected?

 Something better than Oracle.

 Since you're the one who wants hints, that's kind of up to you to define.
 Write a specification and make a proposal.


What is the point of writing a proposal if there is a threat of "will be
rejected" if one of the people who would do the rejecting doesn't at least
outline what would be acceptable?





Re: [HACKERS] Hints WAS: Index Tuning Features

2006-10-11 Thread Mark Woodward

 Since you're the one who wants hints, that's kind of up to you to
 define.
 Write a specification and make a proposal.


 What is the point of writing a proposal if there is a threat of "will be
 rejected" if one of the people who would do the rejecting doesn't at least
 outline what would be acceptable?

 Oh come on Mark, you have been here long enough to know how this works.

Exactly. IMHO, it is a frustrating environment. PostgreSQL is a great
system, and while I completely respect the individuals involved, I think
the "management," for lack of a better term, is difficult.




Re: [HACKERS] Index Tuning Features

2006-10-10 Thread Mark Woodward
 Simon Riggs [EMAIL PROTECTED] writes:

 - RECOMMEND command

 Similar in usage to an EXPLAIN, the RECOMMEND command would return a
 list of indexes that need to be added to get the cheapest plan for a
 particular query (no explain plan result though).

 Both of these seem to assume that EXPLAIN results, without EXPLAIN
 ANALYZE results to back them up, are sufficient for tuning.  I find
 this idea a bit dubious, particularly for cases of marginal indexes.


I think the idea of virtual indexes is pretty interesting, but
ultimately a lesser solution to a more fundamental issue, and that would
be hands-on control over the planner. Estimating the effect of an index
on a query prior to creating the index is a great idea; how that is done
is something different than building consensus that it should be done.

Another thing that this brings up is hints to a query. Over the years, I
have run into situations where the planner wasn't great. It would be nice
to try forcing different strategies on the planner and see if performance
can be improved.



Re: [HACKERS] Upgrading a database dump/restore

2006-10-09 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:
 Whenever someone actually writes a pg_upgrade, we'll institute a policy
 to restrict changes it can't handle.

 IMHO, *before* any such tool *can* be written, a set of rules must be
 enacted regulating catalog changes.

 That one is easy: there are no rules.  We already know how to deal with
 catalog restructurings --- you do the equivalent of a pg_dump -s and
 reload.  Any proposed pg_upgrade that can't cope with this will be
 rejected out of hand, because that technology was already proven five
 years ago.

 The issues that are actually interesting have to do with the contents
 of user tables and indexes, not catalogs.

It is becoming virtually impossible to recreate databases. Data storage
sizes are increasing faster than the transmission speeds of the media on
which they are stored or the systems by which they are connected. The
world is looking at a terabyte as merely a very large database these
days; tens of terabytes are not far from being common.

Dumping out a database is bad enough, but that's only the data, and that
takes (mostly) only hours. Recreating a large database with complex
indexes can take days: hours for the data, hours per index; it adds up.

No one could expect that this could happen by 8.2, or the release after
that, but as a direction for the project, the "directors" of the
PostgreSQL project must realize that dump/restore is becoming like
the old locking-vacuum problem. It is a *serious* issue for PostgreSQL
adoption and arguably a real design flaw.

If the barrier to upgrade is too high, people will not upgrade. If people
do not upgrade, then older versions will have to be supported longer or
users will have to be abandoned. If users are abandoned and there are
critical bugs in previous versions of PostgreSQL, then users who eventually
have to migrate their data will probably not use PostgreSQL, in an
attempt to avoid repeating this situation.

While the economics of open source / free software are different, there is
still a penalty for losing customers, and word of mouth is a dangerous
thing. Once or twice in a customer's product usage history you can expect
to get away with this sort of inconvenience, but if every new major
version requires a HUGE process, then the TCO of PostgreSQL gets very high
indeed.

If it is a data format issue, maybe there should be a forum for a next-gen
version of the current data layout that is extensible without
restructuring. This is not something that a couple of people can go off and
do and submit a patch; it is something that has to be supported and
promoted by the core team, otherwise it won't happen. We all know that.

The question is whether or not you all think it is worth doing. I've done
consulting work for some very large companies that everyone has heard of.
These sorts of things matter.



Re: [HACKERS] Upgrading a database dump/restore

2006-10-09 Thread Mark Woodward
 On Mon, Oct 09, 2006 at 11:50:10AM -0400, Mark Woodward wrote:
  That one is easy: there are no rules.  We already know how to deal
 with
  catalog restructurings --- you do the equivalent of a pg_dump -s and
  reload.  Any proposed pg_upgrade that can't cope with this will be
  rejected out of hand, because that technology was already proven five
  years ago.

 snip

 Dumping out a database is bad enough, but that's only the data, and that
 can takes (mostly) only hours. Recreating a large database with complex
 indexes can take days or hours for the data, hours per index, it adds
 up.

 I think you missed the point of the email you replied to. *catalog*
 changes are quick and (relativly) easy. Even with 10,000 tables, it
 would only take a few moments to rewrite the entire catalog to a new
 version.

 If it is a data format issue, maybe there should be a forum for a next
 gen version of the current data layout that is extensible without
 restructuring. This is not something that a couple people can go off and
 do and submit a patch, it is something that has to be supported and
 promoted from the core team, otherwise it won't happen. We all know
 that.

 Actually, the data format is not the issue either. The tuple structure
 hasn't changed that often. What has changed is the internal format of a
 few types, but postgresql could support both the old and the new types
 simultaneously. There has already been a statement from core-members
 that if someone comes up with a tool to handle the catalog upgrade,
 they'd be willing to keep code from older types around with the
 original oid so they'd be able to read the older version.

That's good to know.


 The question is whether or not you all think it is worth doing. I've
 done
 consulting work for some very large companies that everyone has heard
 of.
 These sorts of things matter.

 People are working it, someone even got so far as dealing with most
 catalog upgrades. The hard part going to be making sure that even if
 the power fails halfway through an upgrade that your data will still be
 readable...

Well, I think that any *real* DBA understands and accepts that issues like
power failure and hardware failure create situations where suboptimal
conditions exist. :-) Stopping the database and copying the pg directory
addresses this problem: upon failure, a simple "mv bkdir pgdir" gets
you started again.

A system on a good UPS and on reliable hardware is exactly the sort of
deployment that would benefit most from an in-place upgrade. There is no
universal panacea where there is zero risk; one can only mitigate risk.

That being said, it should be the preferred method of upgrade, with new
versions not being released until they can migrate cleanly. A
dump/restore should be a last resort. Don't you think?




Re: [HACKERS] Upgrading a database dump/restore

2006-10-09 Thread Mark Woodward
 Mark,

 No one could expect that this could happen by 8.2, or the release after
 that, but as a direction for the project, the directors of the
 PostgreSQL project must realize that the dump/restore is becomming like
 the old locking vacuum problem. It is a *serious* issue for PostgreSQL
 adoption and arguably a real design flaw.

 directors?  (looks around)  Nobody here but us chickens, boss.

 If you're really interested in pg_upgrade, you're welcome to help out.
 Gavin
 Sherry, Zdenek, and Jonah Harris are working on it (the last separately,
 darn
 it).

This is the most frustrating thing: I *want* to do these things, but I
can't find any companies that are willing to pay me to do it, and having
kids, I don't have the spare time to do it.

I *have* a recommendations system already, but I can't even find the time
to do the NetFlix Prize thing.



Re: [HACKERS] Upgrading a database dump/restore

2006-10-08 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:
 Not to cause any arguments, but this is sort a standard discussion that
 gets brought up periodically and I was wondering if there has been any
 softening of the attitudes against an in place upgrade, or movement
 to
 not having to dump and restore for upgrades.

 Whenever someone actually writes a pg_upgrade, we'll institute a policy
 to restrict changes it can't handle.  But until we have a credible
 upgrade tool it's pointless to make any such restriction.  (Credible
 means able to handle system catalog restructurings, IMHO --- without
 that, you'd not have any improvement over the current rules for minor
 releases.)

IMHO, *before* any such tool *can* be written, a set of rules must be
enacted regulating catalog changes. If there are no rules and no process
by which changes get approved, requiring a was-to-is conversion strategy,
then the tool has to change with every major version, which will, of
course, put it at risk of losing support in the long term.

Like I said, I understand the reluctance to do these things; it isn't an
easy thing to do. Designing and planning for the future is, however, the
hallmark of a good engineer.



[HACKERS] Query Failed, out of memory

2006-10-05 Thread Mark Woodward
I am using the netflix database:
Table public.ratings
 Column |   Type   | Modifiers
+--+---
 item   | integer  |
 client | integer  |
 day| smallint |
 rating | smallint |


The query was executed as:
psql -p 5435 -U pgsql -t -A -c "select client, item, rating, day from
ratings order by client" netflix > netflix.txt


My question: it looks like the kernel killed psql, and not the postmaster. The
PostgreSQL log file complained about a broken pipe.

Question, is this a bug in psql? It took over 4 hours of run time before
the crash.

dmesg:
Free pages:   13192kB (112kB HighMem)
Active:124664 inactive:124330 dirty:0 writeback:0 unstable:0 free:3298
slab:2188 mapped:248080 pagetables:1939
DMA free:12160kB min:16kB low:32kB high:48kB active:0kB inactive:0kB
present:16384kB pages_scanned:12602 all_unreclaimable? yes
protections[]: 0 0 0
Normal free:920kB min:928kB low:1856kB high:2784kB active:438608kB
inactive:437656kB present:901120kB pages_scanned:978318 all_unreclaimable?
yes
protections[]: 0 0 0
HighMem free:112kB min:128kB low:256kB high:384kB active:60176kB
inactive:59536kB present:131008kB pages_scanned:134673 all_unreclaimable?
yes
protections[]: 0 0 0
DMA: 6*4kB 3*8kB 3*16kB 3*32kB 3*64kB 2*128kB 1*256kB 0*512kB 1*1024kB
1*2048kB 2*4096kB = 12160kB
Normal: 0*4kB 1*8kB 7*16kB 1*32kB 0*64kB 0*128kB 1*256kB 1*512kB 0*1024kB
0*2048kB 0*4096kB = 920kB
HighMem: 0*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB
0*2048kB 0*4096kB = 112kB
Swap cache: add 548633, delete 548633, find 11883/13748, race 0+0
0 bounce buffer pages
Free swap:0kB
262128 pages of RAM
32752 pages of HIGHMEM
3593 reserved pages
608 pages shared
0 pages swap cached
Out of Memory: Killed process 9143 (psql).




Re: [HACKERS] Query Failed, out of memory

2006-10-05 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:
 psql -p 5435 -U pgsql -t -A -c "select client, item, rating, day from
 ratings order by client" netflix > netflix.txt

 My question, it looks like the kernel killed psql, and not postmaster.

 Not too surprising.

 Question, is this a bug in psql?

 It's really a libpq design issue: since libpq provides random access to
 a PGresult, and has no mechanism for handling failures after returning
 the PGresult to the client, it has to slurp the whole query result into
 memory first.

 FWIW, there's a feature in CVS HEAD to instruct psql to try to use a
 cursor to break up huge query results like this.  For the moment I'd
 suggest using COPY instead.


That's sort of what I was afraid of. I am trying to get 100 million
records into a text file in a specific order.

Sigh, I have to write a quick program to use a cursor. :-(
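For reference, a minimal sketch of such a cursor-based dump program
(connection parameters and the FETCH batch size are illustrative; the
column list matches the query above):

#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    /* conninfo is illustrative; adjust port/user/dbname as needed */
    PGconn *conn = PQconnectdb("port=5435 user=pgsql dbname=netflix");
    PGresult *res;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connect failed: %s", PQerrorMessage(conn));
        return 1;
    }

    PQclear(PQexec(conn, "BEGIN"));
    PQclear(PQexec(conn, "DECLARE c NO SCROLL CURSOR FOR "
                         "SELECT client, item, rating, day "
                         "FROM ratings ORDER BY client"));

    for (;;)
    {
        int i, ntuples;

        res = PQexec(conn, "FETCH 10000 FROM c");
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            fprintf(stderr, "fetch failed: %s", PQerrorMessage(conn));
            PQclear(res);
            break;
        }
        ntuples = PQntuples(res);
        if (ntuples == 0)
        {
            PQclear(res);
            break;              /* all rows fetched */
        }
        for (i = 0; i < ntuples; i++)
            printf("%s|%s|%s|%s\n",
                   PQgetvalue(res, i, 0), PQgetvalue(res, i, 1),
                   PQgetvalue(res, i, 2), PQgetvalue(res, i, 3));
        PQclear(res);
    }

    PQclear(PQexec(conn, "CLOSE c"));
    PQclear(PQexec(conn, "COMMIT"));
    PQfinish(conn);
    return 0;
}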





Re: [HACKERS] Query Failed, out of memory

2006-10-05 Thread Mark Woodward
 On Thu, Oct 05, 2006 at 11:56:43AM -0400, Mark Woodward wrote:
 The query was executed as:
 psql -p 5435 -U pgsql -t -A -c "select client, item, rating, day from
 ratings order by client" netflix > netflix.txt


 My question, it looks like the kernel killed psql, and not postmaster.
 The
 postgresql log file complained about a broken pipe.

 Question, is this a bug in psql? It took over 4 hours of run time before
 the crash.

 Well, psql tried to store the entire resultset in memory at once, and
 failed. I'm not sure how many records you were trying to display, but
 try to estimate how much memory that would take to store...

 What were you trying to do?

It's the stupid NetFlix prize thing, I need to dump out the data in a
specific order. This is just *one* such query I want to try. I guess, like
I told Tom, I have to write a small program that uses a cursor. :-(



Re: [HACKERS] Query Failed, out of memory

2006-10-05 Thread Mark Woodward

  FWIW, there's a feature in CVS HEAD to instruct psql to try to use a
  cursor to break up huge query results like this.  For the moment I'd
  suggest using COPY instead.


 That's sort of what I was afraid off. I am trying to get 100 million
 records into a text file in a specific order.

 Sigh, I have to write a quick program to use a cursor. :-(

 Why don't you try the psql client from 8.2beta1 then? This way you don't
 have to write the program yourself and you're helping out with beta
 testing as well :-)
 See FETCH_COUNT in
 http://developer.postgresql.org/pgdocs/postgres/app-psql.html


Well, maybe next time; this one only took about 10 minutes to write. It is a
simple program.



Re: [HACKERS] Query Failed, out of memory

2006-10-05 Thread Mark Woodward
 Tom Lane wrote:
 Mark Woodward [EMAIL PROTECTED] writes:

 psql -p 5435 -U pgsql -t -A -c "select client, item, rating, day from
 ratings order by client" netflix > netflix.txt


 FWIW, there's a feature in CVS HEAD to instruct psql to try to use a
 cursor to break up huge query results like this.  For the moment I'd
 suggest using COPY instead.




 but COPY doesn't guarantee any order.

 BTW, I just this morning discovered the hard way that our linux boxes
 didn't have strict memory allocation turned on, and then went and set
 it. I'd advise Mark to do the same, if he hasn't already.


Yeah, I've been toying with the idea of that setting lately; I can't for
the life of me understand why it isn't the default behavior.



Re: [HACKERS] Query Failed, out of memory

2006-10-05 Thread Mark Woodward
 On Thu, 2006-10-05 at 14:53 -0400, Luke Lonergan wrote:
 Is that in the release notes?

 Yes: Allow COPY to dump a SELECT query (Zoltan Boszormenyi, Karel Zak)

I remember this discussion, it is cool when great features get added.



[HACKERS] Upgrading a database dump/restore

2006-10-05 Thread Mark Woodward
Not to cause any arguments, but this is sort of a standard discussion that
gets brought up periodically, and I was wondering if there has been any
softening of the attitudes against an in-place upgrade, or movement toward
not having to dump and restore for upgrades.

I am aware that this is a difficult problem and I understand that if there
is a radical restructuring of the database then a dump/restore is
justified, but wouldn't it be a laudable goal to *not* require this with
each new release?

Can't we use some release as a standard whose binary format shall not be
changed? I know the arguments about predicting the future, and all, but
standards and stability are important too. I'm not saying it should never
ever change or never ever require a dump/restore, but make it, as policy,
difficult to get past the group, and make not requiring a d/r the norm.

The issue is that as disks get bigger and bigger, databases get bigger and
bigger, and this process becomes more and more onerous. If you haven't
noticed, data transmission speeds are not accelerating at the rate disk
space is growing.

I am currently building a project that will have a huge number of records,
1/2tb of data. I can't see how I would ever be able to upgrade PostgreSQL
on this system.



Re: [HACKERS] Upgrading a database dump/restore

2006-10-05 Thread Mark Woodward
 Mark Woodward wrote:
 I am currently building a project that will have a huge number of
 records,
 1/2tb of data. I can't see how I would ever be able to upgrade
 PostgreSQL
 on this system.



 Slony will help you upgrade (and downgrade, for that matter) with no
 downtime at all, pretty much. Of course, you do need double the
 resources 

 You other suggestion of setting the on disk format in high viscosity
 jello, if not in concrete, seems doomed to failure. Cool features that
 you and other people want occasionally rely on format changes.

I disagree with the all-or-nothing attitude; I'm generally a pragmatist.
It is unreasonable to expect that things will never change; by the same
token, never attempting to standardize or enforce some level of stability
is equally unreasonable.

From an enterprise DB perspective, a d/r of a database is a HUGE process
and one that isn't taken lightly.

I just think that an amount of restraint in this area would pay off well.



 Of course, you don't have to upgrade every release. Many people
 (including me) don't.




Re: [HACKERS] Upgrading a database dump/restore

2006-10-05 Thread Mark Woodward

 Indeed. The main issue for me is that the dumping and replication
 setups require at least 2x the space of one db. That's 2x the
 hardware which equals 2x $$$. If there were some tool which modified
 the storage while postgres is down, that would save lots of people
 lots of money.

It's time and money. Stopping a database and starting with new software is a
lot faster than dumping the data out (disallowing updates or inserts) and
restoring it, which can take hours or days *and* twice the hardware.



[HACKERS] Netflix Prize data

2006-10-04 Thread Mark Woodward
I signed up for the Netflix Prize (www.netflixprize.com), downloaded
their data, and imported it into PostgreSQL. Here is how I created the
table:
Table public.ratings
 Column |  Type   | Modifiers
+-+---
 item   | integer |
 client | integer |
 rating | integer |
 rdate  | text|
Indexes:
ratings_client btree (client)
ratings_item btree (item)

[EMAIL PROTECTED]:~/netflix$ time psql netflix -c "select count(*) from ratings"
   count
---
 100480507
(1 row)


real2m6.270s
user0m0.004s
sys 0m0.005s


The one thing I notice is that it is REAL slow. I know it is, in fact, 100
million records, but I don't think PostgreSQL is usually slow like this.
I'm going to check with some other machines to see if there is a problem
with my test machine or if something is weird about PostgreSQL and large
numbers of rows.

I tried to cluster the data along a particular index but had to cancel it
after 3 hours.

I'm using 8.1.4. The rdate field looks something like "2005-09-06". So
the raw data is 23 bytes; the date string will probably be rounded up to
12 bytes, so that's 24 bytes per row of data. What is the overhead per
variable? Per row?

Is there any advantage to using varchar(10) over text?





Re: [HACKERS] Netflix Prize data

2006-10-04 Thread Mark Woodward
 I signed up for the Netflix Prize. (www.netflixprize.com)
 and downloaded their data and have imported it into PostgreSQL.
 Here is how I created the table:

 I signed up as well, but have the table as follows:

 CREATE TABLE rating (
   movie  SMALLINT NOT NULL,
   person INTEGER  NOT NULL,
   rating SMALLINT NOT NULL,
   viewed DATE NOT NULL
 );

 I also recommend not loading the entire file until you get further
 along in the algorithm solution. :)

 Not that I have time to really play with this

As luck would have it, I wrote a recommendations system based on music
ratings a few years ago.

After reading the NYT article, it seems as though one or more of the guys
behind Net Perceptions is either helping them or did their system, I'm
not sure. I wrote my system because Net Perceptions was too slow and did a
lousy job.

I think the notion of communities in general is an interesting study in
statistics, but everything I've seen in the form of bad recommendations
shows that while [N] people may share certain tastes, that doesn't
necessarily mean that what one likes the others do. This is especially
flawed with movie rentals because it is seldom a 1:1 ratio of movies to
people. There are often multiple people in a household. Also, movies are
almost always for multiple people.

Anyway, good luck! (Not better than me, of course :-)



Re: [HACKERS] Netflix Prize data

2006-10-04 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:
 The one thing I notice is that it is REAL slow.

 How fast is your disk?  Counting on my fingers, I estimate you are
 scanning the table at about 47MB/sec, which might or might not be
 disk-limited...

 I'm using 8.1.4. The rdate field looks something like: 2005-09-06

 So why aren't you storing it as type date?


You are assuming I gave it any thought at all. :-)

I converted it to a date type (create table ratings2 as )
[EMAIL PROTECTED]:~/netflix/download$ time psql -c "select count(*) from
ratings" netflix
   count
---
 100480507
(1 row)


real1m29.852s
user0m0.002s
sys 0m0.005s

That's about the right increase based on the reduction in data size.

OK, I guess I am crying wolf; 47MB/sec isn't all that bad for the system.



Re: [HACKERS] Netflix Prize data

2006-10-04 Thread Mark Woodward

 Greg Sabino Mullane [EMAIL PROTECTED] writes:

 CREATE TABLE rating (
   movie  SMALLINT NOT NULL,
   person INTEGER  NOT NULL,
   rating SMALLINT NOT NULL,
   viewed DATE NOT NULL
 );

 You would probably be better off putting the two smallints first followed
 by
 the integer and date. Otherwise both the integer and the date field will
 have
 an extra two bytes of padding wasting 4 bytes of space.

 If you reorder the fields that way you'll be down to 28 bytes of tuple
 header
 overhead and 12 bytes of data. There's actually another 4 bytes in the
 form of
 the line pointer so a total of 44 bytes per record. Ie, almost 73% of the
 disk
 i/o you're seeing is actually per-record overhead.


That's good advice; however, it is said that Netflix has more than 64K
movies, so while the test data may work with a smallint, I doubt the
overall system would.

The rating, however, is one char, 1~9. Would making it a char(1) buy anything?

I wonder...

If I started screwing around with movie ID and rating, I could move them into
one int: one byte for rating, three bytes for movie ID. That could reduce
the data size by at least half a gig.
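Purely as a sketch of that packing idea (the widths simply follow the
three-bytes-for-movie, one-byte-for-rating split suggested above; nothing
PostgreSQL-specific about it):

#include <stdint.h>
#include <stdio.h>

/* Pack a movie id (0..16777215) and a rating (0..255) into one 32-bit int:
   three bytes for the movie id, one byte for the rating. */
static uint32_t pack_movie_rating(uint32_t movie_id, uint8_t rating)
{
    return (movie_id << 8) | rating;
}

static void unpack_movie_rating(uint32_t packed, uint32_t *movie_id, uint8_t *rating)
{
    *movie_id = packed >> 8;
    *rating = packed & 0xFF;
}

int main(void)
{
    uint32_t packed = pack_movie_rating(12345, 4);
    uint32_t movie;
    uint8_t rating;

    unpack_movie_rating(packed, &movie, &rating);
    printf("movie=%u rating=%u\n", movie, (unsigned) rating);
    return 0;
}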



Re: [HACKERS] Netflix Prize data

2006-10-04 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:
 The rating, however, is one char 1~9. Would making it a char(1) buy
 anything?

 No, that would actually hurt because of the length word for the char
 field.  Even if you used the char type, which really is only one byte,
 you wouldn't win anything because of alignment issues.  Personally I'd
 just go for three ints and a date, rather than trying to be cute with
 the rating.


Actually, the date is just days, right? I don't actually need it too much.
So, create a smallint for the date and do this: smalldate =
date('1970-01-01') - rdate. And use a smallint for the rating.

 Column |   Type   | Modifiers
+--+---
 movie  | integer  |
 client | integer  |
 day| smallint |
 rating | smallint |





[HACKERS] Mapping arbitrary and hierarchical XML to tuple

2006-09-08 Thread Mark Woodward
I have a system by which I store complex data in PostgreSQL as an XML
string. I have a simple function that can return a single value.

I would like to return sets and sets of rows from the data. This is not a
huge problem, as I've written a few of these functions. The question I'd
like to put out there is: how would you represent hierarchical data such as:

<foo>
  <bar>
    <ndx>0</ndx>
    <val>1.00</val>
    <meta>2.5</meta>
    <froboz>3.5</froboz>
    <kludge>
      <item>5</item>
      <life>10</life>
    </kludge>
  </bar>
  <bar>
    <ndx>1</ndx>
    <val>1.10</val>
    <meta>2.2</meta>
    <froboz>3.53</froboz>
    <kludge>
      <item>3</item>
      <life>9</life>
    </kludge>
  </bar>
</foo>


The biggest problem with XML is that storing data is easy; getting it back out
in a sane way is less so. How would you guys think to represent this?
(Obviously, this is a bogus example; real life would be much worse!)



Re: [HACKERS] update/insert,

2006-07-05 Thread Mark Woodward
 On Tue, Jul 04, 2006 at 11:59:27AM +0200, Zdenek Kotala wrote:
 Mark,
 I don't know how it will exactly works in postgres but my expectations
 are:

 Mark Woodward wrote:
 Is there a difference in PostgreSQL performance between these two
 different strategies:
 
 
 if (!exec("update foo set bar='blahblah' where name = 'xx'"))
     exec("insert into foo(name, bar) values('xx','blahblah')");
 or

 The update code generates new tuple in the datafile and pointer has been
 changed in the indexfile to the new version of tuple. This action does
 not generate B-Tree structure changes. If the update fails then the insert
 command creates new tuple in the datafile and it adds new item into
 B-Tree. It should be generate B-Tree node split.

 Actually, not true. Both versions will generate a new row and create a
 new index tuple. The only difference may be that in the update case there
 may be a ctid link from the old version to the new one, but that's
 about it...

 Which is faster will probably depends on what is more common in your DB:
 row already exists or not. If you know that 99% of the time the row
 will exist, the update will probably be faster because you'll only
 execute one query 99% of the time.

OK, but the point of the question is that constantly updating a single row
steadily degrades performance; would delete/insert also do the same?



[HACKERS] update/insert, delete/insert efficiency WRT vacuum and MVCC

2006-07-03 Thread Mark Woodward
Is there a difference in PostgreSQL performance between these two
different strategies:


if (!exec("update foo set bar='blahblah' where name = 'xx'"))
    exec("insert into foo(name, bar) values('xx','blahblah')");

or

exec("delete from foo where name = 'xx'");
exec("insert into foo(name, bar) values('xx','blahblah')");

In my session handler code I can do either, but am curious if it makes any
difference. Yes, name is unique.
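For concreteness, a minimal libpq sketch of the first strategy (update, then
fall back to insert when nothing was updated); the table, columns, and
connection string are just the ones from the example, and error handling is
abbreviated:

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

/* Returns 1 on success, 0 on error. Tries UPDATE first; if it touched no
   rows, falls back to INSERT, mirroring the pseudocode above. */
static int upsert_foo(PGconn *conn)
{
    PGresult *res;

    res = PQexec(conn, "update foo set bar='blahblah' where name = 'xx'");
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        PQclear(res);
        return 0;
    }

    if (strcmp(PQcmdTuples(res), "0") == 0)      /* no row was updated */
    {
        PQclear(res);
        res = PQexec(conn, "insert into foo(name, bar) values('xx','blahblah')");
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
        {
            PQclear(res);
            return 0;
        }
    }

    PQclear(res);
    return 1;
}

int main(void)
{
    /* conninfo is illustrative; adjust for your environment */
    PGconn *conn = PQconnectdb("dbname=test");

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }
    if (!upsert_foo(conn))
        fprintf(stderr, "upsert failed: %s", PQerrorMessage(conn));
    PQfinish(conn);
    return 0;
}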



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Mark Woodward
 On Mon, 2006-06-26 at 09:10, Mark Woodward wrote:
  On Fri, 2006-06-23 at 17:27, Bruce Momjian wrote:
  Jonah H. Harris wrote:
   On 6/23/06, Tom Lane [EMAIL PROTECTED] wrote:
What I see in this discussion is a huge amount of the grass must
 be
greener on the other side syndrome, and hardly any recognition
 that
every technique has its downsides and complications.
  
   I'm being totally objective.  I don't think we should abandon
   PostgreSQL's overall design at all, because we do perform INSERTs
 and
   DELETEs much better than most systems.  However, I've looked at
 many
   systems and how they implement UPDATE so that it is a scalable
   operation.  Sure, there are costs and benefits to each
 implementation,
   but I think we have some pretty brilliant people in this community
 and
   can come up with an elegant design for scalable UPDATEs.
 
  I think the UPDATE case is similar to the bitmap index scan or
 perhaps
  bitmap indexes on disk --- there are cases we know can not be handled
  well by our existing code, so we have added (or might add) these
  features to try to address those difficult cases.
 
  Not really. Bitmap index scan and bitmap index are both new additions
  working well with existing framework.
 
  While the problem of slowdown on frequent updates is real, the
 suggested
  fix is just plain wrong, as it is based on someones faulty assumption
 on
  how index lookup works, and very much simplified view of how different
  parts of the system work to implement MVCC.

 Yes, the suggestion was based on MVCC concepts, not a particular
 implementation.

 On the contrary - afaik, it was loosely based on how Oracle does it with
 its rollback segments, only assuming that rollback segments are kept in
 heap and that indexes point only to the oldest row version :p

Well, give me a little more credit than that. Yes, Oracle did play a small
part in my thinking, but only in as much as "they can't do it, why can't
we?" The problem was how to get the most recent tuple to be more efficient
and not have tuples that will never be used impact performance without
excessive locking or moving data around.

It was just a quick idea. Bruce's solution, you have to admit, is
somewhat similar.


  The original fix he suggests was to that imagined behaviour and thus
  ignored all the real problems of such change.

 The original suggestion, was nothing more than a hypothetical for the
 purpose of discussion.

 The problem was the steady degradation of performance on frequent
 updates.
 That was the point of discussion.  I brought up one possible way to
 start a brainstorm. The discussion then morphed into criticizing the
 example and not addressing the problem.

 The problem is heatedly discussed every 3-4 months.

And yet, here we are again.


 Anyway, I think some decent discussion about the problem did happen, and
 that is good.

 Agreed.

 Maybe this _was_ the best way to bring up the discussion again.

I have a way, for better or worse, I guess, of stirring up the pot. :-)

Cry as we may about MySQL, I have a sneaking suspicion that this is
one of the issues that puts PostgreSQL at a serious disadvantage.

While heavily updated rows are a single type of problem, these days I
think *most* database deployments are as back-ends for web sites. This
problem is *very* critical to that type of application, which is
probably why PostgreSQL has difficulty in that space.

If PostgreSQL can be made *not* to suffer performance degradation on
heavily updated rows, then that is really the last issue in the way of it
being a completely credible medium-to-large enterprise back end. This,
combined with its amazing programmability, should make it unstoppable.




Re: [HACKERS] vacuum, performance, and MVCC

2006-06-27 Thread Mark Woodward
 On Fri, Jun 23, 2006 at 06:37:01AM -0400, Mark Woodward wrote:
 While we all know session data is, at best, ephemeral, people still want
 some sort of persistence, thus, you need a database. For mcache I have a
 couple plugins that have a wide range of opitions, from read/write at
 startup and shut down, to full write through cache to a database.

 In general, my clients don't want this, they want the database to store
 their data. When you try to explain to them that a database may not be
 the
 right place to store this data, they ask why, sadly they have little
 hope
 of understanding the nuances and remain unconvinced.

 Have you done any benchmarking between a site using mcache and one not?
 I'll bet there's a huge difference, which translates into hardware $$.
 That's something managers can understand.


Last benchmark I did was at the pure data level, a couple of years ago:
PostgreSQL could handle about 800 session transactions a second but
degraded over time, while MCache held steady at about 7500 session
transactions a second. I should dig up that code and make it available on
my site.

I have a couple users that tell me that their sites couldn't work without
it, not even with MySQL.



Re: [HACKERS] SO_SNDBUF size is small on win32?

2006-06-27 Thread Mark Woodward

I would set the SO_SNDBUF to 32768.

 Hi,

 I see a performance issue on win32. This problem is causes by the
 following URL.

 http://support.microsoft.com/kb/823764/EN-US/

 On win32, default SO_SNDBUF value is 8192 bytes. And libpq's buffer is
 8192 too.

 pqcomm.c:117
   #define PQ_BUFFER_SIZE 8192

 send() may take as long as 200ms. So, I think we should increase
 SO_SNDBUF to more than 8192. I attache the patch.

 Regards,
 --
 Yoshiyuki Asaba
 [EMAIL PROTECTED]
 Index: pqcomm.c
 ===
 RCS file: /projects/cvsroot/pgsql/src/backend/libpq/pqcomm.c,v
 retrieving revision 1.184
 diff -c -r1.184 pqcomm.c
 *** pqcomm.c  5 Mar 2006 15:58:27 -   1.184
 --- pqcomm.c  27 Jun 2006 15:17:18 -
 ***
 *** 593,598 
 --- 593,608 
   return STATUS_ERROR;
   }

 + #ifdef WIN32
 + on = PQ_BUFFER_SIZE * 2;
 + if (setsockopt(port->sock, SOL_SOCKET, SO_SNDBUF,
 +                (char *) &on, sizeof(on)) < 0)
 + {
 + elog(LOG, "setsockopt(SO_SNDBUF) failed: %m");
 + return STATUS_ERROR;
 + }
 + #endif
 +
   /*
* Also apply the current keepalive parameters.  If we fail to 
 set a
* parameter, don't error out, because these aren't universally






Re: [HACKERS] SO_SNDBUF size is small on win32?

2006-06-27 Thread Mark Woodward
 We have definitly seen weird timing issues sometimes when both client
 and server were on Windows, but have been unable to pin it exactly on
 what. From Yoshiykis other mail it looks like this could possibly be it,
 since he did experience a speedup in the range we've been looking for in
 those cases.


 What I would think might help is a patch on the libpq side (because it
 *does* use a nonblocking socket) to avoid sending more than
 8K per WSASend call.  The effect would just be to break a
 long send into a series of shorter sends, which wouldn't
 really do anything useful on a well-designed TCP stack, but
 then this is Windows we're talking about...

 It could definitly be a good idea to have a patch there *as well*, but I
 think they'd both be affected.

As I said earlier, I would boost the socket buffer to something larger
than merely 2x the packet size. I'd try 32K (32768); that way we have
some headroom before we hit the problem. The presumption is that the
socket buffer should hold at least as much data as we expect to send
while waiting for the deferred ACK.
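
For reference, the change I have in mind is roughly this (a minimal POSIX
sketch of setting a 32K send buffer on an already-created socket, not the
actual backend patch; the descriptor handling here is a placeholder, and
on win32 the equivalent call goes through winsock):

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Sketch: bump SO_SNDBUF to 32K on an existing TCP socket. */
static int set_send_buffer(int sock)
{
    int sndbuf = 32768;    /* 32K, comfortably above the 8K libpq buffer */

    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                   (char *) &sndbuf, sizeof(sndbuf)) < 0)
    {
        perror("setsockopt(SO_SNDBUF)");
        return -1;
    }
    return 0;
}

int main(void)
{
    int sock = socket(AF_INET, SOCK_STREAM, 0);

    if (sock < 0 || set_send_buffer(sock) < 0)
        return 1;
    printf("SO_SNDBUF set to 32768\n");
    return 0;
}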



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Mark Woodward
 On Fri, 2006-06-23 at 17:27, Bruce Momjian wrote:
 Jonah H. Harris wrote:
  On 6/23/06, Tom Lane [EMAIL PROTECTED] wrote:
   What I see in this discussion is a huge amount of the grass must be
   greener on the other side syndrome, and hardly any recognition that
   every technique has its downsides and complications.
 
  I'm being totally objective.  I don't think we should abandon
  PostgreSQL's overall design at all, because we do perform INSERTs and
  DELETEs much better than most systems.  However, I've looked at many
  systems and how they implement UPDATE so that it is a scalable
  operation.  Sure, there are costs and benefits to each implementation,
  but I think we have some pretty brilliant people in this community and
  can come up with an elegant design for scalable UPDATEs.

 I think the UPDATE case is similar to the bitmap index scan or perhaps
 bitmap indexes on disk --- there are cases we know can not be handled
 well by our existing code, so we have added (or might add) these
 features to try to address those difficult cases.

 Not really. Bitmap index scan and bitmap index are both new additions
 working well with existing framework.

 While the problem of slowdown on frequent updates is real, the suggested
 fix is just plain wrong, as it is based on someones faulty assumption on
 how index lookup works, and very much simplified view of how different
 parts of the system work to implement MVCC.

Yes, the suggestion was based on MVCC concepts, not a particular
implementation.

 The original fix he suggests was to that imagined behaviour and thus
 ignored all the real problems of such change.

The original suggestion was nothing more than a hypothetical for the
purpose of discussion.

The problem was the steady degradation of performance on frequent updates.
That was the point of discussion. I brought up one possible way to
start a brainstorm. The discussion then morphed into criticizing the
example rather than addressing the problem.

Anyway, I think some decent discussion about the problem did happen, and
that is good.





Re: [HACKERS] vacuum, performance, and MVCC

2006-06-26 Thread Mark Woodward
 Heikki Linnakangas wrote:
 On Mon, 26 Jun 2006, Jan Wieck wrote:

  On 6/25/2006 10:12 PM, Bruce Momjian wrote:
  When you are using the update chaining, you can't mark that index row
 as
  dead because it actually points to more than one row on the page,
 some
  are non-visible, some are visible.
 
  Back up the truck ... you mean in the current code base we have heap
 tuples
  that are visible in index scans because of heap tuple chaining but
 without
  index tuples pointing directly at them?

 In current code, no. Every heap tuple has corresponding index tuples.

 In Bruce's proposal, yes. You would have heap tuples without index
 tuples
 pointing directly at them. An index scan could only find them by
 following
 t_ctid chains.

 Correct me if I understood you incorrectly, Bruce.

 Correct!  We use the same pointers used by normal UPDATEs, except we set
 a bit on the old tuple indicating it is a single-index tuple, and we
 don't create index entries for the new tuple.  Index scan routines will
 need to be taught about the new chains, but because only one tuple in
 the chain is visible to a single backend, the callers should not need to
 be modified.

 (All tuples in the chain have page item ids.  It is just that when they
 are freed, the pointers are adjusted so the index points to the chain
 head.)

 One problem is that once you find the row you want to update, it is
 difficult to see if it is part of a single-index chain because there are
 only forward pointers, so I think we have to scan the entire page to
 find the chains.  To reduce that overhead, I am thinking we free the
 non-visible tuples only when the page has no more free space.  This
 allows us to free not just our own non-visible tuples, but perhaps
 others as well.

This sort of incorporates the "vacuum row" I suggested.


 We have never been able to free non-visible tuples before because of
 index cleanup overhead, but with single-index chains, we can, and reduce
 the requirements of vacuum for many workloads.


This is great!



Re: [HACKERS] vacuum row?

2006-06-26 Thread Mark Woodward
 On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote:
 I originally suggested a methodology for preserving MVCC and everyone is
 confusing it as update in place, this isnot what I intended.

 Actually, you should've presented your idea as performing MVCC the way
 Firebird does... the idea is basically the same.  Doing some research
 never hurts... especially with this crowd.

Is it really necessary to make personal comments like this? Let's discuss
ideas, not personalities or people.

The whole issue was how to address updates steadily degrading performance.
I wanted to brainstorm the issue and find a solution. I tossed out a first
guess at an algorithm to start the ball rolling.

Was it perfect? No. Was it intended to be? No. It was intended to spark a
discussion: get people first to recognize the problem, and then to think
about possible solutions.

I find that this process, while chaotic, usually finds the best solutions.
There are a lot of good and smart people here who understand this process
and see it for what it is. Unfortunately, some don't.

It isn't about research, per se, because it is assumed that we all know
the various issues involved to some degree. It is about using the
collective knowledge of the group and coming up with an answer.

Over email, this can sometimes come off badly, and for that I apologize.
But imagine we were sitting at a table in the Cambridge Brewing Company,
with laptops and pitchers of beer, discussing the problem.

I'm at a stark disadvantage: I use PostgreSQL a lot, but I don't have the
luxury of being able to work on it in any real depth. I'd really love to.



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-25 Thread Mark Woodward
 On 6/24/2006 9:23 AM, Mark Woodward wrote:

 On Sat, 24 Jun 2006, Mark Woodward wrote:

 I'm probably mistaken, but aren't there already forward references in
 tuples to later versions? If so, I'm only sugesting reversing the
 order
 and referencing the latest version.

 I thought I understood your idea, but now you lost me again. I thought
 what you want is that the older heap tuple has a pointer to the
 newer one. Which it already has, it's called t_ctid.

 Perfect!

 Can you try to explain more carefully how the whole thing would work?
 What would an index tuple point to? What pointers would a heap tuple
 have? What would an index scan do to find the row version it's
 interested
 in? What exactly would an update do?


 Since we already allocate space for some notion of linked list, then all
 I'm suggesting is we reverse the order, sort of. Currently it looks like
 this:

  ver001 -> ver002 -> ver003 -> ... -> verN

 That's what t_ctid does now, right? Well, that's sort of stupid. Why not
 have it do this:

  ver001 -> verN -> ... -> ver003 -> ver002--+
    ^-----------------------------------------+

 This will speed up almost *all* queries when there are more than two
 version of rows.

 OK, here is the behavior of an update:
 (1) Find the latest version of the row
 (2) Duplicate row and modify as per plan
 (3) Set the t_ctid of the new row to the last latest
 (4) Set the t_ctid of the first row to that of the new row
 (5) Attempt to index the row
 (6) If the first version of the row is in the index already (ver001)
 Don't
 modify the index, otherwise, add the new version (just as before)

 When you vacuum, simply make the latest version (verN) the key row
 (ver001).

 This isn't done simply. Currently, vacuum collects a trivial array of
 ctid's it is removing and every now and then does a bulk remove of the
 index tuples pointing to them. Now lets consider a table with two
 indexed columns with the following row versions resulting from an insert
 and 3 updates to that same row:

v1:  a,b
v2:  a,c
v3:  a,d
v4:  b,d

 In your new scheme, there would be two index tuples for column 1 (one
 pointing to v1, one pointing to v4) and 3 index tuples for column 2 (one
 for each different value pointing to v1, v2 and v3). Did I get that
 right so far?

 If vacuum now can remove v1, it has to update index 1 to point to v2 and
 remove the pointer to v1 from index 2. If it can remove v1 and v2, it
 has to update index 1 to point to v3 and remove v1 and v2 from index 2.
 If it can remove v1, v2 and v3 it must delete the index 1 tuple pointing
 to v1, delete the index 2 entries pointing to v1 and v2 and update the
 index 2 entry for v3 to point to v4. Figuring out which index tuples to
 remove and which ones to update can only be done by comparing each and
 every indexed columns old and new values. To do so, vacuum will have to
 fetch all the row versions, which can be scattered all over the place,
 with all possible locking issues including but not limited to deadlocks.

I'm not sure why vacuum can't run similarly to the way it does now.





Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Mark Woodward
 On 6/23/2006 3:10 PM, Mark Woodward wrote:

 This is NOT an in-place update. The whole MVCC strategy of keeping old
 versions around doesn't change. The only thing that does change is one
 level of indirection. Rather than keep references to all versions of all
 rows in indexes, keep only a reference to the first or key row of each
 row, and have the first version of a row form the head of a linked list
 to
 subsequent versions of each row. The list will be in decending order.

 Where exactly do you intend to keep all those links (for a table with N
 indexes)?


I'm probably mistaken, but aren't there already forward references in
tuples to later versions? If so, I'm only suggesting reversing the order
and referencing the latest version.



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Mark Woodward
 On Sat, 24 Jun 2006, Mark Woodward wrote:

 I'm probably mistaken, but aren't there already forward references in
 tuples to later versions? If so, I'm only sugesting reversing the order
 and referencing the latest version.

 I thought I understood your idea, but now you lost me again. I thought
 what you want is that the older heap tuple has a pointer to the
 newer one. Which it already has, it's called t_ctid.

Perfect!

 Can you try to explain more carefully how the whole thing would work?
 What would an index tuple point to? What pointers would a heap tuple
 have? What would an index scan do to find the row version it's interested
 in? What exactly would an update do?


Since we already allocate space for some notion of a linked list, all I'm
suggesting is that we reverse the order, sort of. Currently it looks like
this:

ver001 -> ver002 -> ver003 -> ... -> verN

That's what t_ctid does now, right? Well, that's sort of stupid. Why not
have it do this:

ver001 -> verN -> ... -> ver003 -> ver002--+
  ^-----------------------------------------+

This will speed up almost *all* queries when there are more than two
version of rows.

OK, here is the behavior of an update:
(1) Find the latest version of the row
(2) Duplicate the row and modify it as per the plan
(3) Set the t_ctid of the new row to the previous latest version
(4) Set the t_ctid of the first row to that of the new row
(5) Attempt to index the row
(6) If the first version of the row (ver001) is already in the index, don't
modify the index; otherwise, add the new version (just as before)

When you vacuum, simply make the latest version (verN) the key row (ver001).

There are, no doubt, issues that need to be resolved (I can think of a
couple off the top of my head), but overall I think it is workable. I
don't think this will affect performance in the simple case, and it should
improve performance in the cases where there are more than one or two
versions of a row.
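
To make that concrete, here is a toy in-memory sketch of the reversed
chain (my own illustration, not PostgreSQL code: "key" stands in for the
heap item the index points at, t_ctid is just an array index, and
visibility checks are omitted):

#include <stdio.h>

#define NO_LINK (-1)

/* Toy heap: each version has a value and a t_ctid-style link. */
typedef struct { int value; int next; } Version;

static Version heap[100];
static int nversions = 0;

static int new_row(int value)
{
    heap[nversions].value = value;
    heap[nversions].next  = NO_LINK;
    return nversions++;
}

/* Proposed update: ver001 -> latest -> ... -> ver002, so the newest
 * version is always one hop from the entry the index points at. */
static void update_row(int key, int value)
{
    int old_latest = (heap[key].next == NO_LINK) ? key : heap[key].next;
    int new_ver    = new_row(value);

    heap[new_ver].next = (old_latest == key) ? NO_LINK : old_latest;
    heap[key].next     = new_ver;   /* index entry for "key" is untouched */
}

static int latest_value(int key)
{
    return (heap[key].next == NO_LINK) ? heap[key].value
                                       : heap[heap[key].next].value;
}

int main(void)
{
    int key = new_row(1);

    update_row(key, 2);
    update_row(key, 3);
    printf("latest = %d\n", latest_value(key));  /* prints 3, one hop from key */
    return 0;
}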



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Mark Woodward
 On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote:
 Currently it looks like this:

 ver001 -> ver002 -> ver003 -> ... -> verN

 That's what t_ctid does now, right? Well, that's sort of stupid. Why not
 have it do this:

 ver001 -> verN -> ... -> ver003 -> ver002

 Heh, because that's crazy.  The first time you insert a key into the
 index it will point to v1 of a tuple... say after 5 updates you have
 v2,v3,v4,v5... your c_tid pointer chain looks like v1
 (original) -> v2 -> v3 -> v4 -> v5 (newest).  However, your whole idea is based
 on not having to do another index insert for unchanged keys, so the
 index still points to v1... which means you have to follow the c_tid
 chain to get to the newest version just like a sequential scan.  I
 don't see how you think you can reverse pointer it.

In the scenario, as previously outlined:

ver001 -> verN -> ... -> ver003 -> ver2--+
  ^---------------------------------------+

The index points to version 1 (ver001) which points to the latest version
(verN).




 This will speed up almost *all* queries when there are more than two
 version of rows.

 Nope.

Of course it will.


 When you vacuum, simply make the latest version (verN) the key row
 (ver001).

 How are you going to do this without a ton of locking... remember, the
 index is pointing to v1 with a tid... so you'll have to physically
 move the newest version v5 to v1's tid from wherever it was... like a
 vacuum full on steroids.  Unless of course, you rebuild the index...
 but that's not a solution either.

I don't understand how you can assume this. In fact, it will probably
reduce locking and disk I/O by not having to modify the indexes.



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Mark Woodward
 On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote:
 In the scenario, as previously outlined:

 ver001 -> verN -> ... -> ver003 -> ver2--+
   ^---------------------------------------+

 So you want to always keep an old version around?

Prior to vacuum, it will be there anyway, and after vacuum, the new
version will become ver001.



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-24 Thread Mark Woodward
 On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote:
  On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote:
  In the scenario, as previously outlined:
 
  ver001 -> verN -> ... -> ver003 -> ver2--+
    ^---------------------------------------+
 
  So you want to always keep an old version around?

 Prior to vacuum, it will be there anyway, and after vacuum, the new
 version will become ver001.

 So you do intend to move verN into ver001's slot?  What about the
 other conditions you had mentioned where you have to follow
 PostgreSQL's current behavior?  How are you going to have a pointer
 chain in that case?

Who said anything about moving anything? When vacuum comes along, it
cleans out the previous versions of rows. Very little will change.



[HACKERS] vacuum row?

2006-06-24 Thread Mark Woodward
I originally suggested a methodology for preserving MVCC, and everyone is
confusing it with update-in-place; this is not what I intended.

How about a form of vacuum that targets a particular row? Is this
possible? Would it have to be by transaction?



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
 The example is a very active web site, the flow is this:

 query for session information
 process HTTP request
 update session information

 This happens for EVERY http request. Chances are that you won't have
 concurrent requests for the same row, but you may have well over 100
 HTTP
 server processes/threads answering queries in your web server farm.

 You're crazy :)  Use memcache, not the DB :)


I actually have what I consider a better and more complete session handler
system: MCache, formerly MSession (http://www.mohawksoft.org/?q=node/8). It
implements a lot of nifty features: loadable function modules,
collision-safe counters and operators, and the ability to save session data
to a file or SQL database with varying levels of caching. But that doesn't
mean it gets used.

Technologies like memcached and my mcache are a separate data store. Your
session data is not usable anywhere but in your web system. I have gone as
far as to write a session serializer for PHP that outputs XML, a
PostgreSQL plugin that can extract data from the XML session string, and a
set of functions for interfacing mcache with PostgreSQL and I still have a
hard time convincing clients that this is the right way to go.

While we all know session data is, at best, ephemeral, people still want
some sort of persistence; thus, you need a database. For mcache I have a
couple of plugins with a wide range of options, from read/write at
startup and shutdown to a full write-through cache to a database.

In general, my clients don't want this; they want the database to store
their data. When you try to explain to them that a database may not be the
right place to store this data, they ask why. Sadly, they have little hope
of understanding the nuances and remain unconvinced.



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
  I suppose you have a table memberships (user_id, group_id) or something
 like it ; it should have as few columns as possible ; then try regularly
 clustering on group_id (maybe once a week) so that all the records for a
 particular group are close together. Getting the members of a group to
 send them an email should be faster (less random seeks).

 It is like this, and some more bookkeeping data which must be there...
 we could split the table for smaller records or for updatable/stable
 fields, but at the end of the day it doesn't make much sense, usually
 all the data is needed and I wonder if more big/shallow tables instead
 of one big/wider makes sense...

 Regularly clustering is out of question as it would render the system
 unusable for hours. There's no 0 activity hour we could use for such
 stuff. There's always something happening, only the overall load is
 smaller at night...


Let me ask a question: you have this hundred-million-row table. OK, how
much of that table is read/write? Would it be possible to divide the
table into two (or more) tables, where one is basically static, with only
infrequent inserts and deletes, and the other is heavily updated?

The big thing in performance is the amount of disk I/O. If you have a
smaller active table with only a single index, then you may be able to cut
your disk I/O time way down. The smaller the row size, the more rows fit
into a block. The fewer the blocks, the less disk I/O. The less disk I/O,
the better the performance.

Also, and anyone listening correct me if I'm wrong, you NEED to vacuum
frequently because the indexes grow and vacuuming them doesn't remove
everything; sometimes a REINDEX or a drop/recreate is the only way to get
performance back. So if you wait too long between vacuums, your indexes
grow and spread across more disk blocks than they should, and thus use
more disk I/O to search and/or shared memory to cache.



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
 Let me ask a question, you have this hundred million row table. OK, how
 much of that table is read/write? Would it be posible to divide the
 table into two (or more) tables where one is basically static, only
 infrequent inserts and deletes, and the other is highly updated?

 Well, all of it is read write... some of the data might be updated less
 frequently, but there's no way I would know which part of the data is
 that. Logically is just the same type of data... so unless I find a way
 to continuously move back and forth the data between an archive table
 and the live table, based on how active the groups are, I can't imagine
 any other way of partitioning it. And that would also mean some quite
 big load given the pretty high dynamics of the groups.

 The big thing in performance is the amount of disk I/O, if you have a
 smaller active table with only a single index, then you may be able to
 cut
 your disk I/O time really down. The smaller the row size, the more rows
 fit into a block. The fewer blocks the less dissk I/O. The less disk I/O
 the bbetter the performance.

 I agree, but it is quite hard to achieve that when the data set is both
 big AND the partitioning criteria is highly dynamic. Not to mention that
 deleting from that table is also a PITA performance-wise, so I wonder
 how well the continuous back and forth between the active and inactive
 table would do.

 Also, and anyone listening correct me if I'm wrong, you NEED to vacuum
 frequently because the indexes grow and vacuuming them doesnt remove
 everything, sometimes a REINDEX or a drop/recreate is the only way to
 get
 performance back. So if you wait too long between vacuums, your indexes
 grow  and spread across more disk blocks than they should and thus use
 more disk I/O to search and/or shared memory to cache.

 This is nice in theory, but kills performance. I vacuum the big tables
 only overnight, otherwise the server is sluggish.

Well, the only thing left is to cluster the database. There are a couple
of ways to do this: switch to a platform that supports clustering, or
create an API to wrap multiple databases. If your queries are simple and
limited, you could create an HTTP/XML service that wraps a number of
PostgreSQL databases, issues a query across all databases, merges the
multiple result sets, and returns one homogeneous stream.

Inserts would be handled by hashing to a machine, weighted by the number of
records on each machine.

Updates and deletes would have two keys: machine and ID.

It sounds like you have a big problem and you need a big solution.
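
A rough sketch of the routing I have in mind (hypothetical, my own
illustration: the node counts and weights are made up, and a real system
would also have to handle rebalancing and nodes coming and going):

#include <stdio.h>

/* Pick a node for an insert, biased toward machines that currently hold
 * fewer records: "capacity[i] - records[i]" is used as the weight of
 * node i, and the key's hash lands proportionally. */
static int choose_node(unsigned long key_hash,
                       const long *records, const long *capacity, int nnodes)
{
    long total = 0, pick, run = 0;
    int  i;

    for (i = 0; i < nnodes; i++)
        total += capacity[i] - records[i];
    if (total <= 0)
        return (int) (key_hash % nnodes);   /* everything full: plain hash */

    pick = (long) (key_hash % (unsigned long) total);
    for (i = 0; i < nnodes; i++)
    {
        run += capacity[i] - records[i];
        if (pick < run)
            return i;
    }
    return nnodes - 1;
}

int main(void)
{
    long records[]  = { 900, 100, 500 };
    long capacity[] = { 1000, 1000, 1000 };

    printf("key 12345 goes to node %d\n",
           choose_node(12345UL, records, capacity, 3));
    return 0;
}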



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
 On Thu, 2006-06-22 at 12:41, Mark Woodward wrote:

  Depending on exact details and optimisations done, this can be either
  slower or faster than postgresql's way, but they still need to do
  something to get transactional visibility rules implemented.

 I think they have a different strategy. I think they maintain the notion
 of current version of a row, and hunt for previous versions when
 needed,
 at least that's how I suspect Oracle does it with redo logs.

 Not current but last :)

True

 And one side effect of redo logs is that it is practically impossible to
 do large deletes on production databases. So you design around that,
 like you have to design around limitations of MVCC.

If you think that's bad, try doing an update in PostgreSQL on a table with
20 million rows and a few indexes. I had to write a script to chunk the
update into segments and vacuum between each one.


There has to be a more linear way of handling this scenario.
  
   So vacuum the table often.
 
  It's easy to say VACUUM often... but I'd bet that vacuuming is going
  to lessen the throughput in his tests even more; no matter how it's
  tuned.
 
  Running VACUUM often/continuously will likely keep his update rate
  fluctuatons within a corridor of maybe 5-10%, at the cost of 1-2%
 extra
  load. At least if vacuum is configured right and the server is not
  already running at 100% IO saturation, in which case it will be worse.

 Assuming the table is a reasonable size, the I/O required for vacuum
 doesn't kill everything else!

 I have solved the problem of unneccessary IO by keeping active and
 finished rows in separate tables, with the finish() function moving the
 row between tables.

Sorry, but an RDBMS is a relational database management system; if you are
doing the database management yourself, it isn't a very good RDBMS.


 In case of the number of actively modified rows being in only tens or
 low hundreds of thousands of rows, (i.e. the modified set fits in
 memory) the continuous vacuum process shows up as just another backend,
 not really taking order of magnitude more resources. It mainly generates
 WAL traffic, as modified pages are already in memory/cache and are
 mostly synced by background writer and/or checkpoint.

 Of course you have to adjust vacuum_cost_* variables so as to not
 saturate IO.

These sorts of solutions, IMHO, don't show how good PostgreSQL is, but
show where it is very lacking.


  The max throughput figure is not something you actually need very
 often
  in production.

 No, but you need to have some degree of certainty and predictability in
 the system you are developing.

 Yup. You have to design it so it has.

I was referring to the system as a whole and to the individual components.
PostgreSQL's performance under some pathological conditions is not very
predictable or reliable.


  What is interesting is setting up the server so that you
  can service your loads comfortably. Running the server at 100% lead is
  not anything you want to do on production server. There will be things
  you need to do anyway and you need some headroom for that.

 Of course, you design it so peaks are easily managed, but unless you run
 vacuum continuously, and that has its own set of problems, you run into
 this problem, and it can get really really bad.

 Usually it gets really bad if you *don't* run vacuum continuously, maybe
 hopeing to do it in slower times at night. For high-update db you have
 to run it continuously, maybe having some 5-15 sec pauses between runs.

And how much I/O does this take?



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
 Mark Woodward wrote:

  In case of the number of actively modified rows being in only tens or
  low hundreds of thousands of rows, (i.e. the modified set fits in
  memory) the continuous vacuum process shows up as just another
 backend,
  not really taking order of magnitude more resources. It mainly
 generates
  WAL traffic, as modified pages are already in memory/cache and are
  mostly synced by background writer and/or checkpoint.
 
  Of course you have to adjust vacuum_cost_* variables so as to not
  saturate IO.

 These sort of solutions, IMHO, don't show how good PostgreSQL is, but
 show
 where it is very lacking.

 We all know Postgres is lacking; some of us try to improve it (some with
 more success than others).  People who know the current limitations but
 like the capabilities, try to find workarounds to the problems. What
 surprises me is that, if you have such a low opinion of Postgres, you
 still use it.

Actually, I love PostgreSQL; I've been using it for about 10 years on a lot
of projects. There are some serious issues with it, however, and it is
important to expose them, discuss them, and resolve them. Workarounds are
great, but in the end, they are workarounds.




Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward

 Bottom line: there's still lots of low-hanging fruit.  Why are people
 feeling that we need to abandon or massively complicate our basic
 architecture to make progress?

   regards, tom lane

I, for one, see a particularly nasty, unscalable behavior in the
implementation of MVCC with regard to updates.

For each update to a row, additional work needs to be done to access that
row. Surely a better strategy is possible, especially considering that the
problem being solved is a brief one.

The only reason you need previous versions of a row is for transactions
that started before or during the transaction that modified the row. After
that, the previous versions continue to affect performance and take up
space even though they are of no value.
 (Caveats for rollback, etc., but the point is still valid.)

This is a very pessimistic behavior and penalizes the more common and
optimistic operations. Now, if a tool were created that could roll back an
entire database to some arbitrary transaction ID between vacuums, then I
could see the usefulness of the older versions.

I still think an in-place indirection to the current row could fix the
problem and speed up the database. There are some sticky situations that
need to be considered, but it shouldn't break much.
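
To illustrate what I mean by one level of indirection, here is a toy
sketch (my own illustration, not a design: "slot" plays the role of a
stable per-row entry that the indexes point at, and it is repointed in
place when a new version commits; old versions stay behind for old
snapshots, and reclaiming them is not shown):

#include <stdio.h>

#define MAX_ROWS     16
#define MAX_VERSIONS 64

/* Indexes store a slot number, never a heap location, so an UPDATE only
 * has to repoint the slot; readers reach the current version in one hop. */
static int current_version[MAX_ROWS];   /* slot -> position in the toy heap */
static int heap_values[MAX_VERSIONS];   /* toy "heap" of row versions */
static int nheap = 0;

static void insert_row(int slot, int value)
{
    heap_values[nheap] = value;
    current_version[slot] = nheap++;
}

static void update_row(int slot, int value)
{
    heap_values[nheap] = value;          /* old version stays behind */
    current_version[slot] = nheap++;     /* indirection updated in place */
}

static int read_current(int slot)
{
    return heap_values[current_version[slot]];
}

int main(void)
{
    insert_row(0, 100);
    update_row(0, 101);
    update_row(0, 102);
    printf("row 0 = %d\n", read_current(0));   /* 102, found in one hop */
    return 0;
}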




Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
 On 6/23/06, Mark Woodward [EMAIL PROTECTED] wrote:
 I, for one, see a particularly nasty unscalable behavior in the
 implementation  of MVCC with regards to updates.

 I think this is a fairly common acceptance.  The overhead required to
 perform an UPDATE in PostgreSQL is pretty heavy.  Actually, it's not
 really PostgreSQL's implementation, but anything that employs basic
 multi-version timestamp ordering (MVTO) style MVCC.  Basically,
 MVTO-style systems require additional work to be done in an UPDATE so
 that queries can find the most current row more quickly.

 This is a very pessimistic behavior

 Yes, and that's basically the point of MVTO in general.  The nice
 thing about MVTO-style MVCC is that it isn't super complicated.  No
 big UNDO strategy is needed because the old versions are always there
 and just have to satisfy a snapshot.

 I still think an in-place indirection to the current row could fix the
 problem and speed up the database, there are some sticky situations that
 need to be considered, but it shouldn't break much.

 I agree, but should make clear that moving to an in-place update isn't
 a quick-fix; it will require a good amount of design and planning.

This is NOT an in-place update. The whole MVCC strategy of keeping old
versions around doesn't change. The only thing that does change is one
level of indirection. Rather than keep references to all versions of all
rows in the indexes, keep only a reference to the first or key version of
each row, and have that first version form the head of a linked list to
the subsequent versions of the row. The list will be in descending order.

In the vast majority of cases, the overhead of this will be trivial. For
an unmodified row, you're there. For a modified row, you have one extra
lookup. In extreme cases, you may have to go back a few versions, but I
don't see that as a common behavior.

On a heavily updated row, you are never more than one jump away, and the
indexes shouldn't grow very much.


 What I find in these discussions is that we always talk about over
 complicating vacuum in order to fix the poor behavior in MVCC.  Fixing
 autovacuum does not eliminate the overhead required to add index
 entries and everything associated with performing an UPDATE... it's
 just cleaning up the mess after the fact.  As I see it, fixing the
 root problem by moving to update-in-place may add a little more
 complication to the core, but will eliminate a lot of the headaches we
 have in overhead, performance, and manageability.

Vacuum is a tool for removing old versions. I think there is an overly
eager tendency to have it fix other problems.



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
 Tom Lane wrote:
 If you're doing heavy updates of a big table then it's likely to end up
 visiting most of the table anyway, no?  There is talk of keeping a map
 of dirty pages, but I think it'd be a win for infrequently-updated
 tables, not ones that need constant vacuuming.

 I think a lot of our problems in this area could be solved with fairly
 straightforward tuning efforts on the existing autovacuum
 infrastructure.  In particular, someone should be looking into
 recommendable default vacuum-cost-delay settings so that a background
 vacuum doesn't affect performance too much.  Another problem with the
 current autovac infrastructure is that it doesn't respond very well to
 the case where there are individual tables that need constant attention
 as well as many that don't.  If you have N databases then you can visit
 a particular table at most once every N*autovacuum_naptime seconds, and
 *every* table in the entire cluster gets reconsidered at that same rate.
 I'm not sure if we need the ability to have multiple autovac daemons
 running at the same time, but we definitely could use something with a
 more flexible table-visiting pattern.  Perhaps it would be enough to
 look through the per-table stats for each database before selecting the
 database to autovacuum in each cycle, instead of going by least
 recently autovacuumed.

 Bottom line: there's still lots of low-hanging fruit.  Why are people
 feeling that we need to abandon or massively complicate our basic
 architecture to make progress?

 I think at some point we have to admit that _polling_ the tables, which
 is what autovacuum does, just isn't going to work well, no matter how
 much it is tweeked, and another approach should be considered for
 certain workload cases.

Thank you, that is *exactly* the analogy I have been unable to formulate.
It was on my mind, but I could not put my finger on it.

Vacuum is fundamentally inefficient, as it does not know what has changed
and must go through the entire table to find the specific rows each time.
Going through the whole table each time is messed up and wasteful.


 At some point, the autovacuum approach starts to look like a car with
 fifty bumper stickers.  The first few were fine, but at some point, the
 tweeks (bumper stickers) start to overwhelm the car, and it is time to
 look for a new car.

 I think particularly for the UPDATE with no index key changes, a new
 approach must be considred.


I have been ranting about a "first row" strategy, one where the first
version of a row is the head of a linked list of versions.

(1) The indexes point to the first (key) row.
(2) When a row is updated, it is found in the various indexes; if the key
row currently exists in the index, no changes to the index are made. If it
is not found, the old version of the row is orphaned and behaves as
PostgreSQL always behaves.
(3) If the row is not orphaned, its last-version reference is updated.

For the most part, this should only affect updates where the index entries
don't change. If the index value always changes, PostgreSQL will behave
as it currently does. If the index values do not change, updates will be
faster and won't impact queries.




Re: [HACKERS] vacuum, performance, and MVCC

2006-06-23 Thread Mark Woodward
 Just out of curiosity Mark, didn't you write your session daemon so
 that you don't have to put sessions in postgres anymore?

The original project started as a shared key/value system for a Beowulf
cluster in the late '90s, but it was reworked into a session handler for
PHP, MSession, when I worked with Stig.

 Or are you
 just giving that as an example of a very wide, very heavily updated
 table?  My session tables have been an extreme case of this problem,
 but no other table that I have is so adversely affected by this
 behavior.  My decision was not to pull postgres out entirely, just
 using other session handlers.

I have been working as a consultant since 2001, and prior to that as CTO
at a dot-com startup. MSession (the previous name) was used to
circumvent shortcomings in PostgreSQL, specifically the problem we are
talking about.

As a consultant, I have to convince the customer that all is well. My
MCache system does not guarantee that no session data is lost, nor does
memcached or any other non-ACID system.

The technical arguments we can make, no matter how correct, leave us on
the defensive when asked, "What if the server crashes, do you lose data?"
Of course the answer is yes. Then we get drawn into a conversation about
transient and unimportant data vs. persistent and valuable data, at which
point you've lost the customer. A solid SQL database is the de facto
standard: perhaps not the best choice, but unavoidable.

The update behavior of PostgreSQL is probably the *last* serious issue.
Debate it all you want; vacuum mitigates the problem to varying degrees,
but fixing it would be a huge win. If the update behavior gets fixed,
I can't think of a single issue with PostgreSQL that would be a show
stopper.



 Rick

 On Jun 22, 2006, at 7:59 AM, Mark Woodward wrote:

 After a long battle with technology, [EMAIL PROTECTED] (Mark
 Woodward), an earthling, wrote:
 Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward)
 mumbled into
 her beard:
 [snip]

 1.  The index points to all the versions, until they get
 vacuumed out.

 It can't point to all versions, it points to the last current
 version
 as  updated by vacuum, or the first version of the row.

 No, it points to *all* the versions.

 Suppose I take a table with two rows:

 INFO:  analyzing public.test
 INFO:  test: 1 pages, 2 rows sampled, 2 estimated total rows
 VACUUM

 Then, over and over, I remove and insert one entry with the same PK:

 sample=# delete from test where id = 2;insert into test (id)
 values (2);
 DELETE 1

 [snip]

 Now, I vacuum it.

 sample=# vacuum verbose analyze test;
 INFO:  vacuuming public.test
 INFO:  index test_id_key now contains 2 row versions in 2 pages
 DETAIL:  10 index row versions were removed.
 0 index pages have been deleted, 0 are currently reusable.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  test: removed 10 row versions in 1 pages
 DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  test: found 10 removable, 2 nonremovable row versions in
 1 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 There were 0 unused item pointers.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  analyzing public.test
 INFO:  test: 1 pages, 2 rows sampled, 2 estimated total rows
 VACUUM

 Notice that the index contained 10 versions of that one row.

 It pointed to *ALL* the versions.

 Hmm, OK, then the problem is more serious than I suspected.
 This means that every index on a row has to be updated on every
 transaction that modifies that row. Is that correct?

 I am attaching some code that shows the problem with regard to
 applications such as web server session management, when run, each
 second
 the system can handle fewer and fewer connections. Here is a brief
 output:

 [EMAIL PROTECTED]:~/pgfoo$ ./footest
 1307 sessions per second, elapsed: 1
 1292 sessions per second, elapsed: 2
 1287 sessions per second, elapsed: 3
 
 1216 sessions per second, elapsed: 25
 1213 sessions per second, elapsed: 26
 1208 sessions per second, elapsed: 27
 
 1192 sessions per second, elapsed: 36
 1184 sessions per second, elapsed: 37
 1183 sessions per second, elapsed: 38
 
 1164 sessions per second, elapsed: 58
 1170 sessions per second, elapsed: 59
 1168 sessions per second, elapsed: 60

 As you can see, in about a minute at high load, this very simple table
 lost about 10% of its performance, and I've seen worse based on update
 frequency.  Before you say this is an obscure problem, I can tell
 you it
 isn't. I have worked with more than a few projects that had to
 switch away
 from PostgreSQL because of this behavior.

 Obviously this is not a problem with small sites, but this is a real
 problem with an enterprise level web site with millions of visitors
 and
 actions a day. Quite frankly it is a classic example of something that
 does not scale. The more and more updates there are, the higher the
 load
 becomes. You can see it on top as the footest program runs.

 There has to be a more

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
 Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled into
 her beard:
 We all know that PostgreSQL suffers performance problems when rows are
 updated frequently prior to a vacuum. The most serious example can be
 seen
 by using PostgreSQL as a session handler for a busy we site. You may
 have
 thousands or millions of active sessions, each being updated per page
 hit.

 Each time the record is updated, a new version is created, thus
 lengthening the correct version search each time row is accessed,
 until,
 of course, the next vacuum comes along and corrects the index to point
 to
 the latest version of the record.

 Is that a fair explanation?

 No, it's not.

 1.  The index points to all the versions, until they get vacuumed out.

It can't point to all versions; it points to the last current version
as updated by vacuum, or to the first version of the row.


 2.  There may simultaneously be multiple correct versions.  The
 notion that there is one version that is The Correct One is wrong, and
 you need to get rid of that thought.

Sorry, this is a misunderstanding. By "correct version search" I meant the
version correct for this transaction. Later I mention finding the first
row with a transaction ID lower than the current one.


 If my assertion is fundimentally true, then PostgreSQL will always
 suffer
 performance penalties under a heavy modification load. Of course, tables
 with many inserts are not an issue, it is mainly updates. The problem is
 that there are classes of problems where updates are the primary
 operation.

 The trouble with your assertion is that it is true for *all* database
 systems except for those whose only transaction mode is READ
 UNCOMMITTED, where the only row visible is the Latest version.

Not true. Oracle does not seem to exhibit this problem.


 I was thinking, just as a hypothetical, what if we reversed the
 problem, and always referenced the newest version of a row and
 scanned backwards across the versions to the first that has a lower
 transacton number?

 That would require an index on transaction number, which is an
 additional data structure not in place now.  That would presumably
 worsen things.

All things being equal, perhaps not. It would probably be a loser if you
have a static database, but in a database that undergoes modification, it
would be the same amount of work or less if the average row has two
versions (assuming nothing else changes).

 One possible implementation: PostgreSQL could keep an indirection array
 of
 index to table ref for use by all the indexes on a table. The various
 indexes return offsets into the array, not direct table refs. Because
 the
 table refs are separate from the index, they can be updated each time a
 transaction is commited.

 You mean, this index would be VACUUMed as a part of each transaction
 COMMIT?  I can't see that turning out well...

No, it would not be vacuumed!!!

Right now, the indexes point to the lowest row version. When an index
returns a row ID, it is checked whether there are newer versions; if so,
the newer versions are searched until the last one is found or one exceeds
the current TID.


 This way, the newest version of a row is always the first row
 found. Also, on a heavily updated site, the most used rows would
 always be at the end of the table, reducing amount of disk reads or
 cache memory required to find the correct row version for each
 query.

 I can't see how it follows that most-used rows would migrate to the
 end of the table.

Sorry, OK, as an assumption it ignores the FSM, but the idea is that there
is only one lookup.

 That would only be true in a database that is never
 VACUUMed; as soon as a VACUUM is done, free space opens up in the
 interior, so that new tuples may be placed in the interior.


Regardless, the point is that you have to search the [N] versions of a row
to find the latest correct version of the row for your transaction. This
is done, AFAICT, from first to last version, meaning that the work
required to find a row increases with every update prior to vacuum.
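
Roughly, the walk I am describing looks like this (a simplified sketch of
the visibility hunt, not the real heap code: the snapshot test is reduced
to a single xmin comparison and dead-tuple details are ignored):

#include <stdio.h>

#define NO_LINK (-1)

/* Simplified tuple: creating transaction id plus a t_ctid-style
 * forward link to the next (newer) version. */
typedef struct { int xmin; int next; } Version;

/* Walk forward from the version the index gave us and return the newest
 * version visible to "my_xid".  Every intermediate version still costs a
 * hop until vacuum removes it. */
static int find_visible(const Version *heap, int slot, int my_xid)
{
    int visible = NO_LINK;

    while (slot != NO_LINK)
    {
        if (heap[slot].xmin <= my_xid)   /* stand-in for the snapshot test */
            visible = slot;
        slot = heap[slot].next;
    }
    return visible;
}

int main(void)
{
    /* One row updated three times, by transactions 10, 20, 30, 40. */
    Version heap[] = { {10, 1}, {20, 2}, {30, 3}, {40, NO_LINK} };

    printf("xid 35 sees slot %d\n", find_visible(heap, 0, 35));  /* slot 2 */
    return 0;
}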

PostgreSQL fails miserably as a web session handler because of this
behavior; it requires too-frequent vacuums and delivers inconsistent
performance.

OK, forget the version array; it was just an off-the-top-of-my-head idea.
How about this:

Currently a row does this:

row_TID[0] -> row_TID[1] -> row_TID[2] -> ... -> row_TID[LAST-1] -> row_TID[LAST]

Pointing to each subsequent row. What if it did this:

row_TID[0] -> row_TID[LAST] -> row_TID[LAST-1] -> ... -> row_TID[2] ->
row_TID[1]

The base tuple of a version chain gets updated to point to the latest
committed row. It should have fairly low impact on performance for a static
database, but it would REALLY speed up PostgreSQL on a heavily modified
database, provide more consistent performance between vacuums, and require
fewer vacuums to maintain performance.



Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
 After a long battle with technology, [EMAIL PROTECTED] (Mark
 Woodward), an earthling, wrote:
 Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled into
 her beard:
[snip]

 1.  The index points to all the versions, until they get vacuumed out.

 It can't point to all versions, it points to the last current
 version
 as  updated by vacuum, or the first version of the row.

 No, it points to *all* the versions.

 Suppose I take a table with two rows:

 INFO:  analyzing public.test
 INFO:  test: 1 pages, 2 rows sampled, 2 estimated total rows
 VACUUM

 Then, over and over, I remove and insert one entry with the same PK:

 sample=# delete from test where id = 2;insert into test (id) values (2);
 DELETE 1

[snip]

 Now, I vacuum it.

 sample=# vacuum verbose analyze test;
 INFO:  vacuuming public.test
 INFO:  index test_id_key now contains 2 row versions in 2 pages
 DETAIL:  10 index row versions were removed.
 0 index pages have been deleted, 0 are currently reusable.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  test: removed 10 row versions in 1 pages
 DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  test: found 10 removable, 2 nonremovable row versions in 1 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 There were 0 unused item pointers.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  analyzing public.test
 INFO:  test: 1 pages, 2 rows sampled, 2 estimated total rows
 VACUUM

 Notice that the index contained 10 versions of that one row.

 It pointed to *ALL* the versions.

Hmm, OK, then the problem is more serious than I suspected.
This means that every index on a row has to be updated on every
transaction that modifies that row. Is that correct?

I am attaching some code that shows the problem with regard to
applications such as web server session management; when it runs, the
system can handle fewer and fewer connections each second. Here is a brief
output:

[EMAIL PROTECTED]:~/pgfoo$ ./footest
1307 sessions per second, elapsed: 1
1292 sessions per second, elapsed: 2
1287 sessions per second, elapsed: 3

1216 sessions per second, elapsed: 25
1213 sessions per second, elapsed: 26
1208 sessions per second, elapsed: 27

1192 sessions per second, elapsed: 36
1184 sessions per second, elapsed: 37
1183 sessions per second, elapsed: 38

1164 sessions per second, elapsed: 58
1170 sessions per second, elapsed: 59
1168 sessions per second, elapsed: 60

As you can see, in about a minute at high load, this very simple table
lost about 10% of its performance, and I've seen worse based on update
frequency.  Before you say this is an obscure problem, I can tell you it
isn't. I have worked with more than a few projects that had to switch away
from PostgreSQL because of this behavior.

Obviously this is not a problem with small sites, but it is a real
problem for an enterprise-level web site with millions of visitors and
actions a day. Quite frankly, it is a classic example of something that
does not scale: the more updates there are, the higher the load
becomes. You can see it in top as the footest program runs.

There has to be a more linear way of handling this scenario.
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <libpq-fe.h>
#include <sys/times.h>
#include <assert.h>
#include <string.h>

// Create these items in a database named sessionfoo
// create table foo_sessions(session_id text, session_values text);
// create index foo_sessions_ndx on foo_sessions(session_id);

#define MAX_SESSIONS	100



char *sessions[MAX_SESSIONS];


void PQexecClear(PGconn *pg, char *sql)
{
	PGresult *pgres = PQexec(pg, sql);
	assert(pgres);
	PQclear(pgres);
}

void createRandSession(PGconn *pg, int ndx)
{
	char session[64];
	char qbuf[256];
	snprintf(session, sizeof(session), "%8X-%8X-%8X", time(0), rand(), times(NULL));
	snprintf(qbuf, sizeof(qbuf), 
		"insert into foo_sessions(session_id, session_values)values('%s','%08X')",
			session, times(NULL));
	PQexecClear(pg,qbuf);
	sessions[ndx] =  strdup(session);
}

void updateSession(PGconn *pg, int ndx)
{
	PGresult *pgres;
	char qbuf[256];
	char *session = sessions[ndx];

	snprintf(qbuf, sizeof(qbuf),
		"select * from foo_sessions where session_id = '%s'",
			session);
	PQexecClear(pg,qbuf);

	snprintf(qbuf, sizeof(qbuf),
		"update foo_sessions set session_values = '%08X' where session_id = '%s'",
			times(NULL), session);
	PQexecClear(pg,qbuf);
}



int main()
{
	int startTime;
	int loopTime;
	
	int count=0;
	int i;
	PGresult * pgres;
	PGconn *pg = PQconnectdb("dbname=sessionfoo");

	assert(pg);
	PQexecClear(pg, "delete from foo_sessions");
	PQexecClear(pg, "vacuum foo_sessions");
	
	for(i=0; i < MAX_SESSIONS; i++)
		createRandSession(pg,i);


	loopTime = time(0);
	while(loopTime == time(0)) // Wait for a fraction
		;

	startTime = loopTime = time(0);
	while(1)
	{
		int theTime = time(0);
		if(loopTime != theTime)
		{
			loopTime = theTime;
			printf("%d sessions per second, elapsed: %d\n", count, loopTime-startTime

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
 On Thu, 2006-06-22 at 09:59, Mark Woodward wrote:
  After a long battle with technology, [EMAIL PROTECTED] (Mark
  Woodward), an earthling, wrote:
  Clinging to sanity, [EMAIL PROTECTED] (Mark Woodward) mumbled
 into
  It pointed to *ALL* the versions.

 Hmm, OK, then the problem is more serious than I suspected.
 This means that every index on a row has to be updated on every
 transaction that modifies that row. Is that correct?

 Yes.

 I am attaching some code that shows the problem with regard to
 applications such as web server session management, when run, each
 second
 the system can handle fewer and fewer connections. Here is a brief
 output:

 [EMAIL PROTECTED]:~/pgfoo$ ./footest
 1307 sessions per second, elapsed: 1
 1292 sessions per second, elapsed: 2
 1287 sessions per second, elapsed: 3
 
 1216 sessions per second, elapsed: 25
 1213 sessions per second, elapsed: 26
 1208 sessions per second, elapsed: 27
 
 1192 sessions per second, elapsed: 36
 1184 sessions per second, elapsed: 37
 1183 sessions per second, elapsed: 38
 
 1164 sessions per second, elapsed: 58
 1170 sessions per second, elapsed: 59
 1168 sessions per second, elapsed: 60

 As you can see, in about a minute at high load, this very simple table
 lost about 10% of its performance, and I've seen worse based on update
 frequency.  Before you say this is an obscure problem, I can tell you it
 isn't. I have worked with more than a few projects that had to switch
 away
 from PostgreSQL because of this behavior.

 You mean systems that are designed so exactly, that they can't take 10%
 performance change ?

No, that's not really the point: performance degrades over time; in one
minute it degraded 10%.

The update-to-session ratio has a HUGE impact on PostgreSQL. If you have a
thousand active sessions, it may take a minute to degrade 10%, depending on
the ratio of active sessions to operations per session.

If an active user causes a session update once a second, that is not too
bad, but if an active user updates a session more often, then it is worse.

Generally speaking, sessions aren't updated when they change, they are
usually updated per HTTP request. The data in a session may not change,
but the session handling code doesn't know this and simply updates anyway.

In a heavily AJAX site, you may have many smaller HTTP requests returning
items in a page. So, a single page may consist of multiple HTTP requests.
Worse yet, as a user drags an image around, there are lots of background
requests being made. Each request typically means a session lookup and a
session update. This is compounded by the number of active users. Since
the object of a site is to have many active users, this is always a
problem. It is less intrusive now that non-locking vacuum is there, but
that doesn't mean it isn't a problem.



 Or just that they did not vacuum for so long, that performance was less
 than needed in the end?

In an active site or application, vacuuming often enough to prevent this
is, itself, a load on the system.


 btw, what did they switch to ?

One switched to Oracle and one is using a session handler I wrote for PHP.
One company I did work for tried to maintain a table with a single row
that indicated state; this single row would sometimes take more than a
second to query. It was horrible. I'm not sure what they ended up using,
but I wrote a shared-memory variable C function that got rid of that specific
problem. They were trying to use PostgreSQL as the database to implement a
HUGE redundant networked file system. My personal opinion was that their
biggest problem was that they decided to use Java as the programming
environment, but that's another issue.



 Obviously this is not a problem with small sites, but this is a real
 problem with an enterprise level web site with millions of visitors and
 actions a day.

 On such site you should design so that db load stays below 50% and run
 vacuum often, that may even mean that you run vacuum continuously with
 no wait between runs. If you run vacuum with right settings,

Yea, but that, at least in my opinion, is a poor design.

 Quite frankly it is a classic example of something that
 does not scale. The more and more updates there are, the higher the load
 becomes. You can see it on top as the footest program runs.

 Yes, you understood correctly - the more updates, the higher the load :)

Imagine this:

Each row in a table has a single entry that represents that row. Let's call
it the key entry. Whether or not the key entry maintains data is an
implementation detail.

When indexing a table, the index always points to the key entry for a row.

When a row is updated, in the spirit of MVCC, a new data row is created.
The key entry is then updated to point to the new version of the row. The
new row points to the previous version of the row, and the previous entry
continues to point to its previous entry, etc.

When a row is found by the index, the key entry is followed to the newest
version, and older versions are visited only when the transaction's snapshot
requires them.

Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
 Ühel kenal päeval, N, 2006-06-22 kell 10:20, kirjutas Jonah H. Harris:
 On 6/22/06, Alvaro Herrera [EMAIL PROTECTED] wrote:
   Hmm, OK, then the problem is more serious than I suspected.
   This means that every index on a row has to be updated on every
   transaction that modifies that row. Is that correct?
 
  Add an index entry, yes.

 Again, this is a case for update-in-place.  No need to write an extra
 index entry and incur the WAL associated with it.

 I guess that MySQL on its original storage does that, but they allow
 only one concurrent update per table and no transactions.

 Imagine a table
 with 3 indexes on it... I would estimate that we perform at least 3 to
 6 times more overhead than any commercial database on such an update.

 One way to describe what commercial databases do to keep constant
 update rates is saying that they do either vacuuming as part of
 update, or they just use locks anf force some transactions to wait or
 fail/retry.

 Depending on exact details and optimisations done, this can be either
 slower or faster than postgresql's way, but they still need to do
 something to get transactional visibility rules implemented.

I think they have a different strategy. I think they maintain the notion
of current version of a row, and hunt for previous versions when needed,
at least that's how I suspect Oracle does it with redo logs.


   There has to be a more linear way of handling this scenario.
 
  So vacuum the table often.

 It's easy to say VACUUM often... but I'd bet that vacuuming is going
 to lessen the throughput in his tests even more; no matter how it's
 tuned.

 Running VACUUM often/continuously will likely keep his update rate
 fluctuatons within a corridor of maybe 5-10%, at the cost of 1-2% extra
 load. At least if vacuum is configured right and the server is not
 already running at 100% IO saturation, in which case it will be worse.

Assuming the table is a reasonable size, the I/O required for vacuum
doesn't kill everything else!

 The max throughput figure is not something you actually need very often
 in production.

No, but you need to have some degree of certainty and predictability in
the system you are developing.

 What is interesting is setting up the server so that you
 can service your loads comfortably. Running the server at 100% lead is
 not anything you want to do on production server. There will be things
 you need to do anyway and you need some headroom for that.

Of course, you design it so peaks are easily managed, but unless you run
vacuum continuously, and that has its own set of problems, you run into
this problem, and it can get really really bad.


 --
 
 Hannu Krosing
 Database Architect
 Skype Technologies OÜ
 Akadeemia tee 21 F, Tallinn, 12618, Estonia

 Skype me:  callto:hkrosing
 Get Skype for free:  http://www.skype.com



 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?

http://archives.postgresql.org



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
 Christopher Browne [EMAIL PROTECTED] writes:
 After a long battle with technology, [EMAIL PROTECTED] (Mark
 Woodward), an earthling, wrote:
 Not true. Oracle does not seem to exhibit this problem.

 Oracle suffers a problem in this regard that PostgreSQL doesn't; in
 Oracle, rollbacks are quite expensive, as recovery requires doing
 extra work that PostgreSQL doesn't do.

 The Oracle design has got other drawbacks: if you need to access a row
 version other than than the very latest, you need to go searching in the
 rollback segments for it.  This is slow (no index help) and creates
 significant amounts of contention (since lots of processes are competing
 to touch the rollback segments).

But it is all probability; in most cases, the VAST majority, older
versions aren't much needed outside the concurrency of active
transactions.


 Plus there's the old bugaboo that
 long-running transactions require indefinite amounts of rollback space,
 and Oracle is apparently unable to enlarge that space on-the-fly.
 (This last seems like a surmountable problem, but maybe there is some
 non-obvious reason why it's hard.)

Yea, Oracle has a million ways to die. And when you think you know all one
million, you find one million and one.


 Basically there's no free lunch: if you want the benefits of MVCC it's
 going to cost you somewhere.  In the Postgres design you pay by having
 to do VACUUM pretty often for heavily-updated tables.  I don't think
 that decision is fundamentally wrong --- the attractive thing about it
 is that the overhead is pushed out of the foreground query-processing
 code paths.

Under certain circumstances, it is a very poor design. Think of a single
row table that keeps a scoreboard or a session table that keeps a limited
number of rows that are updated very frequently.


 We still have lots of work to do in making autovacuum
 smarter, avoiding vacuuming parts of relations that have not changed,
 and so on.  But I have no desire to go over to an Oracle-style solution
 instead.  We can't beat them by trying to be like them, and we run no
 small risk of falling foul of some of their patents if we do.

I proposed having a key row entry for each logical row. The key row
entry points to the latest version of the row. Then, each row entry is a
linked list, in descending order, of previous row versions. The vast
majority of the time, the latest version will be the first version. It is
only when you have a previously started long-running or concurrent
transaction that you will ever look at previous versions.

I'm not saying it is an easy slam dunk, as I can think of a few
difficulties off the top of my head, but it would solve the steady
degradation of performance between vacuums and, to a possibly lesser
extent, the cost of updating a row in a heavily indexed table.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward
  You mean systems that are designed so exactly, that they can't take
 10%
  performance change ?

 No, that's not really the point, performance degrades over time, in one
 minute it degraded 10%.

 The update to session ratio has a HUGE impact on PostgreSQL. If you have
 a
 thousand active sessions, it may take a minute to degrade 10% assuming
 some level of active vs operations per session per action.

 So don't do an update. Multiple updates to the same row block anyway
 which is generally not something you want anyway.

The example is a very active web site, the flow is this:

query for session information
process HTTP request
update session information

This happens for EVERY http request. Chances are that you won't have
concurrent requests for the same row, but you may have well over 100 HTTP
server processes/threads answering queries in your web server farm.
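
As a minimal sketch of that per-request pattern (the table and column names
here are illustrative, not from the original post), each hit issues roughly
this pair of statements, and it is the unconditional UPDATE that leaves a
dead row version behind every time:

-- read the session at the start of the request
SELECT session_data FROM sessions WHERE session_id = 'abc123';

-- ... process the HTTP request ...

-- rewrite the session at the end, whether or not anything changed
UPDATE sessions
   SET session_data = '...serialized state...', last_seen = now()
 WHERE session_id = 'abc123';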


 If you INSERT into multiple partitions (by time -- say one per minute)
 and TRUNCATE periodically (30 minute old partitions for 30 minute
 expiry) it works much better. Expiring the session is quite fast as well
 since they'll go away with the truncate.

 Index on sessionid and time and grab the row with the most recent time.

I doubt that that approach (1) answers the problem or (2) would be more
efficient.
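
For reference, a rough sketch of the quoted partition-and-truncate scheme
(the names, the two slices, and the 30-minute window are invented for
illustration):

-- one table per time slice; session state is INSERTed, never UPDATEd
CREATE TABLE sessions_1200 (session_id text, session_data text, created timestamptz);
CREATE TABLE sessions_1230 (LIKE sessions_1200);
CREATE INDEX sessions_1230_sid_idx ON sessions_1230 (session_id, created);

-- each request appends the current state to the newest slice
INSERT INTO sessions_1230 VALUES ('abc123', '...state...', now());

-- reads take the most recent row for the session
SELECT session_data FROM sessions_1230
 WHERE session_id = 'abc123'
 ORDER BY created DESC LIMIT 1;

-- expiry is a TRUNCATE of the old slice: no dead tuples to vacuum
TRUNCATE sessions_1200;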
 --



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward

 As you can see, in about a minute at high load, this very simple table
 lost about 10% of its performance, and I've seen worse based on update
 frequency.  Before you say this is an obscure problem, I can tell you it
 isn't. I have worked with more than a few projects that had to switch
 away
 from PostgreSQL because of this behavior.

 Obviously this is not a problem with small sites, but this is a real
 problem with an enterprise level web site with millions of visitors and
 actions a day. Quite frankly it is a classic example of something that
 does not scale. The more and more updates there are, the higher the load
 becomes. You can see it on top as the footest program runs.


   I believe sessions should not be stored in a SQL database.
Alas, this is a long debate, and while I fundamentally agree with this
position, there is an inconvenient truth that it is often necessary.

http://www.mohawksoft.org/?q=node/8


   It makes no sense. Updates and Inserts to the database should only be
 done where there is an interesting thing to record, when the user does an
 action like posting to a forum, making a purchase, sending a message, etc.

Again, preaching to the choir.


   I believe sessions should be stored in the memory of the application
 server, as native objects of the whatever language the application is
 written in. This way, sessions incur no serializing overhead and can be
 quite large and complex, which allows storage of interesting things, like
 the result of a complicated search query which is to be later paginated,
 for instance. It really makes sense to use native language objects too, as
 these have a lot more power and versatility than a database row. Think
 about rights management, for instance.

What you seem not to grasp at this point is a large web-farm, about 10 or
more servers running PHP, Java, ASP, or even perl. The database is usually
the most convenient and, aside from the particular issue we are talking
about, best suited.


   When the framework used lacks this power (most do and this is sad), then
 sessions incur serializing overhead ; but they should be serialized to
 filesystem files, or better, to memory using memcached, for instance.

I actually have a good number of years of experience in this topic, and
memcached or file system files are NOT the best solutions for a server
farm.

 It
 makes no sense to pay the performance penalty of a COMMIT (disk seek delay
 etc) and the database overhead for sessions, which are by nature volatile
 data changing all the time.

Very true.


   I don't think postgres should be tweaked to allow better handling of
 this. It would only make a better foot-gun.

I think the debate is over; it may be a bad use of a database, but there
are few alternatives, and SQL databases have become the de facto
methodology for dealing with this type of problem.




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Mark Woodward

 What you seem not to grasp at this point is a large web-farm, about 10
 or
 more servers running PHP, Java, ASP, or even perl. The database is
 usually
 the most convenient and, aside from the particular issue we are talking
 about, best suited.

   The answer is sticky sessions : each user is assigned to one and only 
 one
 webserver in the cluster and his session is maintained locally, in RAM. No
 locks, no need to manage distributed session...

 I actually have a good number of years of experience in this topic, and
 memcached or file system files are NOT the best solutions for a server
 farm.

   If sessions are distributed, certainly, but if sessions are sticky to
 their own server ?

And what if a particular server goes down? Or gets too high a percentage of
the load?

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] vacuum, performance, and MVCC

2006-06-21 Thread Mark Woodward
We all know that PostgreSQL suffers performance problems when rows are
updated frequently prior to a vacuum. The most serious example can be seen
by using PostgreSQL as a session handler for a busy web site. You may have
thousands or millions of active sessions, each being updated per page hit.

Each time the record is updated, a new version is created, thus
lengthening the correct-version search each time the row is accessed, until,
of course, the next vacuum comes along and corrects the index to point to
the latest version of the record.

Is that a fair explanation?

If my assertion is fundamentally true, then PostgreSQL will always suffer
performance penalties under a heavy modification load. Of course, tables
with many inserts are not an issue, it is mainly updates. The problem is
that there are classes of problems where updates are the primary
operation.

I was thinking, just as a hypothetical, what if we reversed the problem,
and always referenced the newest version of a row and scanned backwards
across the versions to the first that has a lower transaction number?

One possible implementation: PostgreSQL could keep an indirection array of
index to table ref for use by all the indexes on a table. The various
indexes return offsets into the array, not direct table refs. Because the
table refs are separate from the index, they can be updated each time a
transaction is committed.

This way, the newest version of a row is always the first row found. Also,
on a heavily updated site, the most used rows would always be at the end
of the table, reducing amount of disk reads or cache memory required to
find the correct row version for each query.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Preventing DELETE and UPDATE without a WHERE clause?

2006-06-17 Thread Mark Woodward
 On 6/16/06, Mark Woodward [EMAIL PROTECTED] wrote:
  Chris Campbell [EMAIL PROTECTED] writes:
  I heard an interesting feature request today: preventing the
  execution of a DELETE or UPDATE query that does not have a WHERE
 clause.
 
  These syntaxes are required by the SQL spec.  Furthermore, it's easy
  to imagine far-more-probable cases in which the system wouldn't detect
  that you'd made a mistake, eg
 
DELETE FROM tab WHERE key  1
 
  where you meant to type
 
DELETE FROM tab WHERE key  1000
 
  I suggest counseling your client to learn how to use BEGIN/ROLLBACK.
  This proposal strikes me as falling squarely within the rule about
  design a system that even a fool can use, and only a fool will want
  to use it.
 
 Just a theory, couldn't a trigger be set up that would cause the query to
 tank if it touches too many rows?


 i haven't tried but maybe a FOR STATEMENT trigger AFTER the event can
 ask ROW_COUNT using GET DIAGNOSTICS?

Well, if you *can't* do it in a trigger, maybe that's a valid modification
for Hackers to consider.
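
For what it's worth, on releases that have statement-level transition tables
(PostgreSQL 10 and later, long after this thread; EXECUTE FUNCTION needs 11),
a sketch of the kind of guard being discussed might look like this. The table
name, function name, and the 1000-row threshold are all made up for
illustration:

CREATE FUNCTION guard_mass_delete() RETURNS trigger AS $$
DECLARE
    n bigint;
BEGIN
    -- old_rows is the transition table holding every row the statement deleted
    SELECT count(*) INTO n FROM old_rows;
    IF n > 1000 THEN
        RAISE EXCEPTION 'DELETE touched % rows; refusing', n;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tab_delete_guard
    AFTER DELETE ON tab
    REFERENCING OLD TABLE AS old_rows
    FOR EACH STATEMENT
    EXECUTE FUNCTION guard_mass_delete();

Raising the exception aborts the offending statement, so the mass delete is
rolled back.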

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Preventing DELETE and UPDATE without a WHERE clause?

2006-06-16 Thread Mark Woodward
 Chris Campbell [EMAIL PROTECTED] writes:
 I heard an interesting feature request today: preventing the
 execution of a DELETE or UPDATE query that does not have a WHERE clause.

 These syntaxes are required by the SQL spec.  Furthermore, it's easy
 to imagine far-more-probable cases in which the system wouldn't detect
 that you'd made a mistake, eg

   DELETE FROM tab WHERE key  1

 where you meant to type

   DELETE FROM tab WHERE key  1000

 I suggest counseling your client to learn how to use BEGIN/ROLLBACK.
 This proposal strikes me as falling squarely within the rule about
 design a system that even a fool can use, and only a fool will want
 to use it.

Just a theory, couldn't a trigger be set up that would cause the query to
tank if it touches too many rows?



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] How to avoid transaction ID wrap

2006-06-08 Thread Mark Woodward
 On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:
 I guess what I am saying is that PostgreSQL isn't smooth, between
 checkpoints and vacuum, it is near impossible to make a product that
 performs consistently under high load.

 Have you tuned the bgwriter and all the vacuum_cost stuff? I've get to
 find a case where I couldn't smooth out the IO load so that it wasn't an
 issue.

In several projects that I have been involved with, PostgreSQL had most of
the important features to be used, but in one project, checkpoints caused
us to time out under load. In this current project I am researching, I
know that vacuum may be an issue. The load is brutally constant.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] How to avoid transaction ID wrap

2006-06-07 Thread Mark Woodward
 On Wed, Jun 07, 2006 at 11:47:45AM -0400, Tom Lane wrote:
 Zdenek Kotala [EMAIL PROTECTED] writes:
  Koichi Suzuki wrote:
  I've once proposed a patch for 64bit transaction ID, but this causes
  some overhead to each tuple (XMIN and XMAX).

  Did you check performance on 32-bit or 64-bit systems and 64-bit
 binary
  version of PGSQL? I think that today is not problem to have 64-bit
  architecture and 64-bit ID should increase scalability of Postgres.

 The percentage increase in I/O demand is the main reason the patch was
 rejected, not so much the arithmetic.

 Before considering 64 bit XIDs, it'd be very helpful to know why Mark
 can't vacuum frequently enough to handle rollover...

The system is under heavy load, and while there are tricks that can be
done, vacuum is a process which puts extra load on the system while it is
running. It is a sliding scale; as always, you may get the system to the
point where it can vacuum AND perform as needed, but the database is
growing constantly. Eventually you will get to the point where you can't
run vacuum *and* keep up with the data stream.

I guess what I am saying is that PostgreSQL isn't smooth, between
checkpoints and vacuum, it is near impossible to make a product that
performs consistently under high load.

Now don't flame me, I really do love PostgreSQL, it is just that I bump up
against these issues from time to time and it would be nice if there were
some way to work around them.



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] How to avoid transaction ID wrap

2006-06-06 Thread Mark Woodward
OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.
We want to keep about a years worth of data at any specific time.
We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.
The numbers I have amount to 466,560,000 transactions per month, lasting a
maximum of about 9 months until XID wrap.

I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number
of transactions, COPY, etc., so I'm not dead in the water, but I would be
interested in any observations you may have.
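
To make the batching idea concrete (the table and column names are invented
for illustration): outside an explicit transaction block every INSERT
consumes its own XID, so wrapping a second's worth of frames in one
BEGIN/COMMIT, or bulk-loading a staged batch with COPY, cuts XID consumption
by a couple of orders of magnitude:

BEGIN;
INSERT INTO frames (camera_id, captured_at, frame_no) VALUES (1, now(), 101);
INSERT INTO frames (camera_id, captured_at, frame_no) VALUES (1, now(), 102);
-- ... the remaining frames from all 10 cameras for this interval ...
COMMIT;  -- the whole batch used a single transaction ID

-- or stage frames in a file and load the batch in one command
COPY frames (camera_id, captured_at, frame_no) FROM '/tmp/frame_batch.csv' CSV;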

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] AGREGATE FUNCTIONS

2006-06-06 Thread Mark Woodward
 Hello, I would like to know where in the source-code of postgres is
 located the code of the aggregate functions min, max, avg.
 I wish to develop more statistical aggregate functions, and I prefer to
 use C than to write then in the PL/R.

There is a library in contrib called intagg. I wrote it a few years
ago, and I have to laugh at the README file because I must have been
stoned or something, because I can't understand it.

Anyways, if you want to make an aggregate function, it covers what you
need. Feel free to ignore the array stuff, because you probably won't need
to deal with it.

Aggregates have basically two functions: a state-transition function called
on every iteration of the query (or GROUP BY group), and a function that is
called at the end, called the final function. The PostgreSQL docs are
pretty good as well.
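
As a small SQL-level sketch of that shape (the sum-of-squares example and the
names are mine, not from intagg; a C-language aggregate registers its support
functions the same way), the state function runs once per row and the
optional final function, omitted here, would post-process the accumulated
state:

CREATE FUNCTION sumsq_trans(double precision, double precision)
RETURNS double precision AS
$$ SELECT $1 + $2 * $2 $$ LANGUAGE SQL IMMUTABLE;

CREATE AGGREGATE sumsq (double precision) (
    SFUNC    = sumsq_trans,       -- called for every input row
    STYPE    = double precision,  -- type of the running state
    INITCOND = '0'                -- initial state
    -- FINALFUNC = ...            -- the optional "final" step would go here
);

SELECT sumsq(x) FROM generate_series(1, 3) AS t(x);   -- 1 + 4 + 9 = 14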


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] How to avoid transaction ID wrap

2006-06-06 Thread Mark Woodward
 Mark Woodward wrote:
 OK, here's my problem, I have a nature study where we have about 10
 video
 cameras taking 15 frames per second.
 For each frame we make a few transactions on a PostgreSQL database.

 Maybe if you grouped multiple operations on bigger transactions, the I/O
 savings could be enough to buy you the ability to vacuum once in a
 while.  Or consider buffering somehow -- save the data elsewhere, and
 have some sort of daemon to put it into the database.  This would allow
 to cope with the I/O increase during vacuum.

The problem is sufficiently large that any minor modification can easily
hide the problem for a predictable amount of time. My hope was that someone
would have a real long-term workaround.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much

2006-06-05 Thread Mark Woodward
 Jim C. Nasby [EMAIL PROTECTED] writes:
 On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
 I'm reading this as just another uninformed complaint about libpq's
 habit of buffering the whole query result.  It's possible that there's
 a memory leak in the -A path specifically, but nothing said so far
 provided any evidence for that.

 Certainly seems like it. It seems like it would be good to allow for
 libpq not to buffer, since there's cases where it's not needed...

 See past discussions.  The problem is that libpq's API says that when it
 hands you back the completed query result, the command is complete and
 guaranteed not to fail later.  A streaming interface could not make that
 guarantee, so it's not a transparent substitution.

 I wouldn't have any strong objection to providing a separate API that
 operates in a streaming fashion, but defining it is something no one's
 bothered to do yet.  In practice, if you have to code to a variant API,
 it's not that much more trouble to use a cursor...


Wouldn't the COPY (select ...) TO STDOUT format being discussed solve
this for free?

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much

2006-06-05 Thread Mark Woodward
 Mark Woodward wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:

 On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:

 I'm reading this as just another uninformed complaint about libpq's
 habit of buffering the whole query result.  It's possible that
 there's
 a memory leak in the -A path specifically, but nothing said so far
 provided any evidence for that.

 Certainly seems like it. It seems like it would be good to allow for
 libpq not to buffer, since there's cases where it's not needed...

 See past discussions.  The problem is that libpq's API says that when
 it
 hands you back the completed query result, the command is complete and
 guaranteed not to fail later.  A streaming interface could not make
 that
 guarantee, so it's not a transparent substitution.

 I wouldn't have any strong objection to providing a separate API that
 operates in a streaming fashion, but defining it is something no one's
 bothered to do yet.  In practice, if you have to code to a variant API,
 it's not that much more trouble to use a cursor...



 Wouldn't the COPY (select ...) TO STDOUT format being discussed solve
 this for free?




 It won't solve it in the general case for clients that expect a result
 set. ISTM that use a cursor is a perfectly reasonable answer, though.

I'm not sure I agree -- surprise!

psql is often used as a command line tool and using a cursor is not
acceptable.

Granted, with an unaligned output, perhaps psql should not buffer the
WHOLE result at once, but without rewriting that behavior, a COPY from
query may be close enough.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] COPY (query) TO file

2006-06-02 Thread Mark Woodward
Tom had posted a question about file compression with copy. I thought
about it, and I want to throw this out and see if anyone thinks it is a
good idea.

Currently, the COPY command only copies a table, what if it could operate
with a query, as:

COPY (select * from mytable where foo='bar') as BAR TO stdout

I have no idea if it is doable, but I can see uses for replication

psql -h source mydb -c "COPY (select * from mytable where ID > x) as
mytable TO STDOUT" | psql -h target mydb -c "COPY mytable FROM stdin"



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] COPY (query) TO file

2006-06-02 Thread Mark Woodward
 Mark Woodward wrote:
 Tom had posted a question about file compression with copy. I thought
 about it, and I want to through this out and see if anyone things it is
 a
 good idea.

 Currently, the COPY command only copies a table, what if it could
 operate
 with a query, as:

 COPY (select * from mytable where foo='bar') as BAR TO stdout

 I have no idea if it is doable, but I can see uses for replication

 I doubt it be really usefull (apart from maybe saving some work
 coding a client app) but did you actually test it with

 create table as select ...; followed by a copy of that table
 if it really is faster then just the usual select  fetch?

Why create table?

The idea is that you would have one or more redundant databases and use
the COPY TO/FROM to keep them up to date.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] COPY (query) TO file

2006-06-02 Thread Mark Woodward
 Mark Woodward wrote:
 Mark Woodward wrote:
 Tom had posted a question about file compression with copy. I thought
 about it, and I want to through this out and see if anyone things it
 is
 a
 good idea.

 Currently, the COPY command only copies a table, what if it could
 operate
 with a query, as:

 COPY (select * from mytable where foo='bar') as BAR TO stdout

 I have no idea if it is doable, but I can see uses for replication
 I doubt it be really usefull (apart from maybe saving some work
 coding a client app) but did you actually test it with

 create table as select ...; followed by a copy of that table
 if it really is faster then just the usual select  fetch?

 Why create table?

 Just to simulate and time the proposal.
 SELECT ... already works over the network and if COPY from a
 select (which would basically work like yet another wire
 protocol) isnt significantly faster, why bother?

Because the format of COPY is a common transmitter/receiver for PostgreSQL,
like this:

pg_dump -t mytable | psql -h target -c "COPY mytable FROM STDIN"

With a more selective copy, you can use pretty much this mechanism to
limit a copy to a subset of the records in a table.


 The idea is that you would have one or more redundent databases and use
 the COPY TO/FROM to keep them up to date.

 Well, if you have databases you would have regular tables - and
 can use copy as it is now :-)

But COPY copies all the records, not some of the records.

 Regards
 Tino



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] COPY (query) TO file

2006-06-02 Thread Mark Woodward
 Mark Woodward wrote:
 ...
 create table as select ...; followed by a copy of that table
 if it really is faster then just the usual select  fetch?
 Why create table?
 Just to simulate and time the proposal.
 SELECT ... already works over the network and if COPY from a
 select (which would basically work like yet another wire
 protocol) isnt significantly faster, why bother?

 Because the format of COPY is a common transmiter/receiver for
 PostgreSQL,
 like this:

 pg_dump -t mytable | psql -h target -c COPY mytable FROM STDIN

 With a more selective copy, you can use pretty much this mechanism to
 limit a copy to a sumset of the records in a table.

 Ok, but why not just implement this into pg_dump or psql?
 Why bother the backend with that functionality?

Because COPY runs on the back-end, not the front end, and the front end
may not even be in the same city as the backend. When you issue a COPY,
the file it reads or writes is local to the backend. True, the examples I
gave may not show how that is important, but consider this:

psql -h remote masterdb -c "COPY (select * from mytable where ID >
xxlastxx) as mytable TO '/replicate_backup/mytable-060602.pgc'"

This runs completely in the background and can serve as a running backup.


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] COPY (query) TO file

2006-06-02 Thread Mark Woodward
 Mark Woodward wrote:
 ...

 pg_dump -t mytable | psql -h target -c COPY mytable FROM STDIN

 With a more selective copy, you can use pretty much this mechanism to
 limit a copy to a sumset of the records in a table.
 Ok, but why not just implement this into pg_dump or psql?
 Why bother the backend with that functionality?

 Because COPY runs on the back-end, not the front end, and the front
 end
 may not even be in the same city as the backend. When you issue a COPY
 the file it reads or writes local to the backend. True, the examples I
 gave may not show how that is important, but consider this:


 We were talking about COPY to stdout :-) Copy to file is another
 issue :-) Copy to (server fs) file has so many limitations I dont see
 wide use for it. (Of course there are usecases)

"Wide use for" is not always the same as "useful". Sometimes "useful" is
something not easily doable in other ways, or something that completes a feature set.


 psql -h remote masterdb -c COPY (select * from mytable where ID 
 xxlastxx) as mytable TO '/replicate_backup/mytable-060602.pgc'

 This runs completely in the background and can serve as a running
 backup.

 And you are sure it would be much faster then a server local running
 psql just dumping the result of a query?

No, I can't be sure of that at all, but ... The COPY command has a
specific use that is understood, and an operation that is separate from the
normal query mechanism.

 (And you could more easy avoid raceconditions in contrast to several
 remote clients trying to trigger your above backup )

Again, the examples may not have been precise in presenting "why"; the
focus was mostly "what", so it could be discussed. As a generic feature it
has many potential uses. Trying to debate and defend a specific use limits
the potential scope of the feature.

Why have COPY anyway? Why not just use SELECT * FROM TABLE?

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] COPY (query) TO file

2006-06-02 Thread Mark Woodward

  Allow COPY to output from views
  Another idea would be to allow actual SELECT statements in a COPY.

 Personally I strongly favor the second option as being more flexible
 than the first.


 I second that - allowing arbitrary SELECT statements as a COPY source
 seems much more powerful and flexible than just supporting COPY FROM
 VIEW.

 Not to be a sour apple or anything but I don't see how any of this is
 needed in the backend since we can easily use Psql to do it, or pretty
 much any other tool.

There is an important difference between a capability in the backend vs
one synthesized in the frontend.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Possible TODO item: copy to/from pipe

2006-06-01 Thread Mark Woodward
 After re-reading what I just wrote to Andreas about how compression of
 COPY data would be better done outside the backend than inside, it
 struck me that we are missing a feature that's fairly common in Unix
 programs.  Perhaps COPY ought to have the ability to pipe its output
 to a shell command, or read input from a shell command.  Maybe something
 like

   COPY mytable TO '| gzip > /home/tgl/mytable.dump.gz';

 (I'm not wedded to the above syntax, it's just an off-the-cuff thought.)

 Of course psql would need the same capability, since the server-side
 copy would still be restricted to superusers.

 You can accomplish COPY piping now through psql, but it's a bit awkward:

   psql -c "COPY mytable TO stdout" mydb | gzip ...

 Thoughts?  Is this worth doing, or is the psql -c approach good enough?


To be honest, I don't see much benefit in it. You can already accomplish
what you want to accomplish easily enough.

If you want to muck with COPY, I'd like to see it accept a query as:

psql -c "COPY select * from mytable where foo='bar' TO stdout" mydb | gzip
...




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] LIKE, leading percent, bind parameters and indexes

2006-05-26 Thread Mark Woodward
 On Thu, May 25, 2006 at 08:41:17PM -0300, Rodrigo Hjort wrote:
 
 I think more exactly, the planner can't possibly know how to plan an
 indexscan with a leading '%', because it has nowhere to start.
 

 The fact is that index scan is performed on LIKE expression on a string
 not
 preceded by '%', except when bound parameter is used.

 select * from table where field like 'THE NAME%'; -- index scan
 select * from table where field like '%THE NAME%'; -- seq scan
 select * from table where field like :bind_param; -- seq scan (always)

 Since I'm somewhat doubtful of coming up with a generic means for
 dealing with plan changes based on different bound parameter values any
 time soon...

 How difficult would it be to make LIKE check the value of the bound
 parameter for a starting % and use that information to decide on a query
 plan? IMHO this is worth making into a special case in the planner,
 because it's very easy to detect and makes a tremendous difference in
 the query plan/performance.


My solution is a function in one of my libraries called strrev(), which
returns the reverse of a string. I make a functional index on
strrev(field). Then, just search where strrev(field) like
strrev('%the name');
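
A sketch of that trick expressed purely in SQL (the strrev() below is a
plpgsql stand-in for the C function mentioned above; text_pattern_ops keeps
the prefix match indexable under non-C locales; table and column names are
illustrative):

CREATE FUNCTION strrev(text) RETURNS text AS $$
DECLARE
    result text := '';
    i      int;
BEGIN
    FOR i IN REVERSE length($1)..1 LOOP
        result := result || substr($1, i, 1);
    END LOOP;
    RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

-- index the reversed value of the column
CREATE INDEX tab_field_rev_idx ON tab (strrev(field) text_pattern_ops);

-- the leading-% pattern reverses into a trailing-% (prefix) pattern
SELECT * FROM tab WHERE strrev(field) LIKE strrev('%THE NAME');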



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Performance Issues

2006-05-23 Thread Mark Woodward
 Dhanaraj M wrote:
 I have the following doubts.

 1. Does postgres create an index on every primary key?  Usually, queries
 are performed against a table on the primary key, so, an index on it
 will be very useful.

 Yes, a unique index is used to enforce the primary-key.

Well, here is an interesting question that I have suddenly become very
curious about: if you have a primary key, and obviously a unique index, does
it, in fact, use this index regardless of analyzing the table?
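
One quick way to check, as a sketch (table name invented): EXPLAIN a
primary-key lookup and see whether the planner reaches for the PK's unique
index. The index is usable even before ANALYZE has ever run, although on a
tiny table the cost estimates may still favor a sequential scan:

CREATE TABLE t (id int PRIMARY KEY, payload text);

EXPLAIN SELECT * FROM t WHERE id = 42;
-- look for "Index Scan using t_pkey on t" in the plan;
-- rerun after ANALYZE (or after loading real data) and compare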



 2. If 'm executing a complex query and it takes 10 seconds to return the
 results -- it takes 10 seconds to execute the next time also.  I'm
 wondering if there's any kind of caching that can be enabled -- so, the
 next time it takes 10 seconds to return the results.

 Not of query results. Obviously data itself might be cached. You might
 want to look at memcached for this sort of thing.


I am looking at this string of posts and it occurs to me that he should
run ANALYZE. Maybe I'm jumping in at the wrong point.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] String Similarity

2006-05-22 Thread Mark Woodward
 Try contrib/pg_trgm...

Tri-graphs are interesting, and I'll try to reconsider whether they fit or
not, but I suspect they do not. (You are the second to recommend it)

Anything based on a word parser is probably not appropriate, the example I
first gave is a little misleading in that it is not the whole of the
problem. Consider this:

pinkfloyd darkside of the moon - money

Again, we humans see that this string is almost identical to the others.

I have a working system right now, and it strips everything that is not
alnum() out of the strings, then searches the strings for runs, and compiles
a list of runs from longest to shortest.

The algorithm is strlen1*strlen2*N where N is the number of runs detected.
As you can see it is merely brute force.

Secondly, there is a subtle difference between comparing a known string
for which you are searching and comparing two arbitrary strings. The
"known" string is assumed to be in some sort of regular form, and the
string to be compared must break down into that form based on your
algorithm. When trying to understand the similarity of two arbitrary
strings, you don't always know what the similarities are or what the parsing
rules should be.

ThisIsSomethinThatHumansCanReadButSpaceBasedParsersCanNot.
tHISiSaLSOsOMETHING
Yo uca nalmos trea dthi s

can you see the similarity in these two strings?
CanYouSeeTheSimilarityInTheseTwoStrings?

Ideally I would metaphone the individual words, strip out all the white
space, and find the run lengths that compare in the two strings, and
calculate the similarity based on the number and size of the runs. I do
not currently metaphone (it isn't clear to me that end cases can be handled
correctly) and I'm not sure how to best calculate similarity. I've been
trying best run, total match, and a host of others, but haven't found one
I really like.


 Chris

 Mark Woodward wrote:
 I have a side project that needs to intelligently know if two strings
 are contextually similar. Think about how CDDB information is collected
 and sorted. It isn't perfect, but there should be enough information to
 be
 usable.

 Think about this:

 pink floyd - dark side of the moon - money
 dark side of the moon - pink floyd - money
 money - dark side of the moon - pink floyd
 etc.

 To a human, these strings are almost identical. Similarly:

 dark floyd of money moon pink side the

 Is a puzzle to be solved by 13 year old children before the movie
 starts.

 My post has three questions:

 (1) Does anyone know of an efficient and numerically quantified method
 of
 detecting these sorts of things? I currently have a fairly inefficient
 and
 numerically bogus solution that may be the only non-impossible solution
 for the problem.

 (2) Does any one see a need for this feature in PostgreSQL? If so, what
 kind of interface would be best accepted as a patch? I am currently
 returning a match liklihood between 0 and 100;

 (3) Is there also a desire for a Levenshtein distence function for text
 and varchars? I experimented with it, and was forced to write the
 function
 in item #1.


 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

 --
 Christopher Kings-Lynne

 Technical Manager
 CalorieKing
 Tel: +618.9389.8777
 Fax: +618.9389.8444
 [EMAIL PROTECTED]
 www.calorieking.com


 ---(end of broadcast)---
 TIP 6: explain analyze is your friend



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [pgsql-advocacy] [HACKERS] [OT] MySQL is bad, but THIS bad?

2006-05-22 Thread Mark Woodward
 On Sat, May 20, 2006 at 02:29:01PM +0200, Dawid Kuroczko wrote:
 On 5/20/06, Lukas Smith [EMAIL PROTECTED] wrote:
 The improvements to the installer are great, but there simply needs to
 be a packaged solution that adds more of the things people are very
 likely to use. From my understanding Bizgres goes in that direction? I
 just think that whatever highly packaged solution PostgreSQL picks,
 this
 should be the download that is pushed at conferences, in articles and
 books. People with a clue will still know where they can get the clean
 base.

 Hmm, a Comprehensive PostgreSQL Archive Network? ;)

 I mean, something like CPAN, CTAN or CRAN? :)

 I mean, the -contrib is great, but pushing other things there is a bit
 tricky (to say the least) from the maintenance point of view.  (Every
 bugfix, a new release of -contrib, etc, etc...).

 Then again PGfoundry is great to keep development centered, but
 finding and building a new package is not really a one-liner, and
 if you're unlucky you might get alpha-quality code installed. :)

 I don't see any reason why CPgAN would need to change pgFoundry at all.
 In fact, my thought was that any such system should use pgFoundry as
 it's backend/repository.

 I think a CPgAN-like solution would be the best.  A uniform method
 of getting approved Pg extensions.  It would simplify installing the
 extensions, and would encourage distributions to package such
 extensions.  Somebody suggested apt-get install postgresql-contrib.
 Imagine:
 apt-get install postgresql-datatype-fqdn
 apt-get install postgresql-gist-ltree
 ...and so on.

 Except that apt doesn't work on all platforms. Though it would certainly
 make sense to look at lifting the framework for CPgAN from somewhere,
 rather than coding it ourselves.


A CPgAN would be a great idea in theory, but I have reservations.

As a software developer, I'm fine with pgFoundry, but as a DB admin, and
one who deploys data centers from time to time, I'd like to see something
closer to the contrib.

If I could have any influence at all, I'd like to see "contrib"
essentially go away in the main distribution and be replaced by, or renamed,
"extensions". Then, some advisory group blesses extensions, and those
extensions get packaged into a PostgreSQL extensions pack. I, again as a
DB admin, would have NO problem with PostgreSQL playing favorites and
picking best of breed for these extensions.




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] String Similarity

2006-05-20 Thread Mark Woodward

 What I was hoping someone had was a function that could find the substring
 runs in something less than a strlen1*strlen2 number of operations and a
 numerically sane way of representing the similarity or difference.

Actually, it is more like strlen1*strlen2*N, where N is the number of valid
runs.

Unless someone has a GREAT algorithm, I think it will always be at least
strlen1*strlen2. The amount of processing for N is the question. Is N *
(strlen1*strlen2) less than sorting an array of N elements, scanning
through those elements and eliminating duplicate character matches?

Depending on the max value of N, I could save all the runs, sort by max
length, then exclude based on overlap, but it isn't clear that this is a
performance win unless the strings are long; even then, I'm not completely
convinced, as N still has some strlen ramifications for removing
duplicates.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [OT] MySQL is bad, but THIS bad?

2006-05-20 Thread Mark Woodward

 My question is whether psql using libreadline.so has to be GPL, meaning
 the psql source has to be included in a binary distribution.

If I understand what I have been told by lawyers, here's what using a GPL,
and NOT LGPL, library means:

According to RMS, the definition of a derivative work is one which shares
the same address space when running. The in-memory process separation also
separates works. One may argue this definition, but it is in supporting
documents to the GPL and likely to be considered as the intention of the
GPL in a court of law.

There is no requirement of shipping source with a binary. One must make
available the source. This can be done by a web site or alternate CD
distribution. It need not be free, as in beer, but must be free of any
restrictions beyond those of the GPL.

There is no requirement that one would need to make the source of the 3rd
party GPL library available, as it is available from the original source
from whence it was obtained in the first place. Any changes, however,
made, by you, to that library must be made available. (If you do not make
modifications to libreadline, you don't even need to worry about it.)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [OT] MySQL is bad, but THIS bad?

2006-05-20 Thread Mark Woodward
 On Fri, May 19, 2006 at 07:04:47PM -0400, Bruce Momjian wrote:
  libreadline is not a problem because you can distribute postgresql
  compiled with readline and comply with all licences involved
  simultaneously. It doesn't work with openssl because the licence
  requires things that are incompatable with the GPL.

 My question is whether psql using libreadline.so has to be GPL, meaning
 the psql source has to be included in a binary distribution.

 IANAL, but yes. Or any other of the methods allowed, like providing a
 written voucher valid for at least three years. People who feel they
 need to keep the source to psql secret should link against libeditline
 instead.

 The way I understand it, the GPL affects programs in two main ways:

 1. A program which is GPL'd must, when distributed, be able to provide
 all source used to build it under terms compatable with the GPL.

This is not technically true. If you incorporate GPL code that is
publicly available and unchanged, you needn't provide the 3rd-party
packages.


 2. A program which includes a GPL'd header file while building, must,
 when distributed, provide its own source and the library under GPL
 compatable terms, but not necessariliy the source of anything else
 needed to build it. This is why it's OK that psql links against openssl
 and readline.

This is sort of a disputable position, and RMS himself isn't clear. If the
header files are simply definitions and declarations, then no GPL material
is actually included in a binary. However, inline functions and macros may
constitute code.


 These are obviously only relevent when distributing precompiled
 binaries. If you are only distributing source, none of the above
 applies to you.

Of course.

 There's a third method that some people claim, but I don't buy. This
 where a program using an interface of a GPL'd library somehow become a
 derived work of said library. That's just way whacked out.

There is no supporting argument for that; however, RMS's supporting writings
indicate that he defines "derived" as being in the same process space.


 You may ofcourse disagree with any of the above, and hey, if you have a
 lawyer to back you up, who am I to argue?

I have talked to too many lawyers, sigh, about this stuff.


 As for why you don't solve the problem by distributing a libpq not
 compiled against OpenSSL, well, that's a different question. Back when
 SSL was considered an arms exports by the US, having both SSL and
 non-SSL versions was common (and a big PITA). When that disappeared,
 the main reason for the split went away and people started compiling
 SSL by default. This solved the problem for 99% of programs.

 However, one tiny subset remains problematic:
 - A library implements SSL, but only using OpenSSL
 - The library doesn't use the GPL, or doesn't have an OpenSSL exception
 clause.
 - A GPL'd program uses this library, without an OpenSSL exception
 clause.

 In this subset of a subset of a subset of programs, it's a problem.
 Many libraries that implement SSL provide an alternative to OpenSSL,
 many programs using such libraries have exception clauses so that
 there's just a handful of programs and libraries that are problematic.

 As long as there's a possibility that the situation can change (either
 every GPL program using postgresql gains an exception clause, or
 postgresql might someday support some other library) it will probably
 stay this way.

 If the the postgresql core decides that OpenSSL will be the only SSL
 ever supported, no matter what, well, the split distribution may yet
 happen. In the meantime, we have status quo.

 Have a nice day,
 --
 Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to
 litigate.



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [OT] MySQL is bad, but THIS bad?

2006-05-19 Thread Mark Woodward
 Andrew Dunstan [EMAIL PROTECTED] writes:
 Mark Woodward wrote:
 Again, there is so much code for MySQL, a MySQL emulation layer, MEL
 for
 short, could allow plug and play compatibility for open source, and
 closed
 source, applications that otherwise would force a PostgreSQL user to
 hold
 his or her nose and use MySQL.

 If we had infinite resources this might make sense. We don't, so it
 doesn't. There is a real cost to producing a compatibility layer, and
 the cost will be those spiffy new features.

 The real problem is that there's a whole lot of stuff, such as mysql's
 weak error checking, that I don't think a compatibility layer could
 sanely provide.

I kind of agree with this statement, but while I was playing devil's
advocate and just grousing a bit about having to use MySQL, there is a
sort of reality of openomics where mind-share is everything.

The more mind-share you have, the more opportunities you have and the more
resources become available. Not always, of course, look at OpenSSH, but
for the most part.

As MySQL adds features, no matter how poorly implemented, and maintains a
migration path, we will never reach their users.

PostgreSQL is better, true, but it is not ideal in many ways. It can be
best said that the difference between PostgreSQL and MySQL is similar to
the difference between Linux/BSD and Windows.



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [OT] MySQL is bad, but THIS bad?

2006-05-19 Thread Mark Woodward

 Actually, I think it's a lot more accurate to compare PostgreSQL and
 MySQL as FreeBSD vs Linux from about 5 years ago. Back then FreeBSD was
 clearly superior from a technology standpoint, and clearly playing
 second-fiddle when it came to users. And now, Linux is actually
 technically superior in most ways thanks to all the mindshare that's
 been poured into it.

 And with that, I am going to sit in a lawn chair and watch the bonfire.


Even I know that is NOT a discussion we want to start.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] String Similarity

2006-05-19 Thread Mark Woodward
 Mark Woodward wrote:
 I have a side project that needs to intelligently know if two strings
 are contextually similar. Think about how CDDB information is collected
 and sorted. It isn't perfect, but there should be enough information to
 be
 usable.

 Think about this:

 pink floyd - dark side of the moon - money
 dark side of the moon - pink floyd - money
 money - dark side of the moon - pink floyd
 etc.

 To a human, these strings are almost identical. Similarly:

 dark floyd of money moon pink side the

 Is a puzzle to be solved by 13 year old children before the movie
 starts.
[snip]

 Hmmm...  I think I like this problem.  Maybe I'll work on it a bit as a
 contrib
 module.

I *have* a working function, but it is not very efficient and it is not
what I would call numerically predictable. And it does find the various
sub-strings between the two strings in question.

Email me offline and we can make something for contrib.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] String Similarity

2006-05-19 Thread Mark Woodward

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1


 I have a side project that needs to intelligently know if two strings
 are contextually similar.

 The examples you gave seem heavy on word order and whitespace
 consideration,
 before applying any algorithms. Here's a quick perl version that does the
 job:

[SNIP]

This is a case where the example was too simple to explain the problem,
sorry. I have an implementation of Oracle's contains function for
PostgreSQL, and it does basically what you are doing, and, in fact, also
has Mohawk Software Extensions (LOL) that provide metaphone. The problem
is that parsing white space realy isn't reliable. Sometimes it is
pinkfloyd-darksideofthemoon.

Also, I have been thinking of other applications.

I have a piece of code that does this:

apps$ ./stratest "pink foyd dark side of the moon money" "money dark side
of the moon pink floyd"
Match:  dark side of the moon
Match: pink f
Match: money
Match: oyd

apps$ ./stratest "pinkfoyddarksideofthemoonmoney"
"moneydarksideofthemoonpinkfloyd"
Match: darksideofthemoon
Match: pinkf
Match: money
Match: oyd

I need to come up with a numerically sane way of taking this information
and understanding overall similarity.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

