Re: [HACKERS] SCMS question

2007-02-22 Thread Lukas Kahwe Smith

Hey,

Is anyone who followed the thread willing to list the mentioned 
requirements, as well as the pros and cons of the different options, in 
the developer wiki [1]? I think this would help a lot in making a 
decision, and it could also be a lot of help for other OSS projects 
considering a switch.


If you do not have an account with write access, just open an account on 
the wiki and email Greg to get your ACLs upgraded so you can write to 
the wiki.


I can also try to work on this a bit over the weekend.

regards,
Lukas

[1] http://developer.postgresql.org

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Simple Column reordering

2007-02-22 Thread Simon Riggs
On Thu, 2007-02-22 at 23:49 -0300, Alvaro Herrera wrote:
> Andrew Dunstan wrote:
> > Simon Riggs wrote:
> > >
> > > I propose that at CREATE TABLE time, the column ordering is re-ordered
> > > so that the table columns are packed more efficiently. This would be a
> > > physical re-ordering, so that SELECT * and COPY without explicit column
> > > definitions would differ from the original CREATE TABLE statement.
> > >
> > > This would be an optional feature, off by default, controlled by a
> > > USERSET GUC
> > >   optimize_column_order = off (default) | on
> > 
> > Umm, you want a GUC setting to enable standards-breaking behaviour and
> > that will be obsolete when we do column ordering right, which is not
> > likely to be more than one release away, and could even still happen in
> > this coming release?
> 
> Given that we already seem to have a patch implementing a complete
> solution, or part thereof, this would seem a rather shortsighted
> proposal.  Why not develop the whole thing and be done with it?

That would be my preference, but if it doesn't happen, I wanted to have
a clear secondary proposal documented.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Robert Treat
On Thursday 25 January 2007 12:51, Oleg Bartunov wrote:
> On Thu, 25 Jan 2007, Nikolay Samokhvalov wrote:
> > On 1/25/07, Teodor Sigaev <[EMAIL PROTECTED]> wrote:
> >> It's should clear enough for now - dump data from old db and load into
> >> new one.
> >> But dump should be without any contrib/tsearch2 related functions.
> >
> > Upgrading from 8.1.x to 8.2.x was not trivial because of a very trivial
> > change in the API (actually not really the API but the content of the
> > "pg_ts_*" tables): the Russian snowball stemming function was forked into
> > 2 different ones, for the koi8 and utf8 encodings. So, as I dumped my
> > pg_ts_* tables' data (to keep my tsearch2 settings), I saw errors during
> > restoration (btw, why didn't you keep the old Russian stemmer function
> > name as a synonym for the koi8 variant?) -- so I had to change my dump
> > file manually, because I didn't manage to follow "tsearch2 best practices"
>
> sed and grep did the trick.
>
> > (to use some kind of "bootstrap" script that creates tsearch2
> > configuration you need from default one -- using several INSERTs and
> > UPDATEs). And there were no upgrade notes for tsearch2.
>
> This is unfair; you promised to write the upgrade notes, we discussed the
> problem with the name change before the release, and I relied on you. It was
> my fault, of course.
>

I got bit by this today, and AFAICT the best solution for the status quo would 
be to change the install schema to something like tsearch2, which would then 
allow much easier dump and restore handling. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] SCMS question

2007-02-22 Thread Warren Turkal
On Wednesday 21 February 2007 21:23, Warren Turkal wrote:
> Are there any plans to move to another SCMS in the future? I am curious, I
> guess.

Speaking of which... are there any plans to upgrade the CVS server to the 
latest version?

wt
-- 
Warren Turkal (w00t)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [Monotone-devel] Re: [HACKERS] SCMS question

2007-02-22 Thread Markus Schiltknecht

Hi,

Richard Levitte - VMS Whacker wrote:
> In message <[EMAIL PROTECTED]> on Thu, 22 Feb 2007 17:38:26 +0100, Markus
> Schiltknecht <[EMAIL PROTECTED]> said:
>
> markus> > So far, I'm getting the sense that there are a lot of
> markus> > opinions on what replacement system to use, a bit carelessly
> markus> > before having answered the above questions thoroughly.
> markus> 
> markus> How did you get that impression?
>
> You said it yourself: Most PostgreSQL developers currently want to
> stay with CVS.

Uh, yeah. But I was referring to the "lots of opinions on what replacement 
system to use". That doesn't have much to do with the want or need (for lack 
of a better alternative) to stay with CVS, IMO.

> Unless there's a majority that wants to move on, I doubt there will be
> a move.  In the end, it has to be a group effort, or it will simply
> not happen.

I absolutely agree. And I'm quite sure most PostgreSQL developers also 
know that, so I don't see much point in warning them - they are 
resistant enough ;-)

As you might have noticed, I myself don't consider monotone ready for 
use by the PostgreSQL project yet. And I've never advocated 
switching *now*. I only made Tom's point...


Regards

Markus


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Simple Column reordering

2007-02-22 Thread Gregory Stark
"Alvaro Herrera" <[EMAIL PROTECTED]> writes:

> Given that we already seem to have a patch implementing a complete
> solution

we do? 


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] SCMS question

2007-02-22 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Gregory Stark <[EMAIL PROTECTED]> writes:
>> [much snipped]
>> Why are so few people committers?
>> ...
>> The answer to both questions is because CVS limitations make it hard to do
>> better.
>
> Uh, no.  The reason there are so few committers is that there are so few
> people qualified not to break things.

Not to break the codebase? Or not to break the archive? It's only due to CVS's
awkwardness that the latter is a concern at all.

> If we had a community with hundreds of people with commit access, very
> possibly we'd be feeling the need of a better SCM system.  But right
> now, CVS is not the bottleneck, and I don't see that we'd get a payback
> for the pain of changing to something else.

I'm not suggesting changing the number of people who can commit code to HEAD,
but rather getting work into the revision control system sooner, before it's
ready to be merged into HEAD. Right now we're giving up most of the advantages
revision control systems exist for; we're using CVS as a glorified archival
system.

You're still merging and reviewing patches by hand, without any of the tools
to, for example, view incremental changes in a branch, view the branch's logs,
or merge the branch into the code automatically, taking the known common
ancestor into account. Instead of receiving a 20k patch with no tools to work
with it, you would be given a branch name and be able to view and merge it
into the main branch using those tools.

Moreover, work on things like bitmapped indexes that other people want to help
with is hampered by this need to mail patches around. If two or three
people submit changes (possibly based on different old versions of the patch),
the main developer has to merge them into his version of the patch by hand and
mail out a new patch. The whole point of a revision control system is to
provide tools that make that easier.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] HOT WIP Patch - version 2

2007-02-22 Thread Jim C. Nasby
There's a fair amount of added work to be done when updating tuples.
Will it be possible to postpone some of that to the bgwriter in a later
version? I realize that sometimes you'll still want to do the work up
front, like if it means we can stay on the same page instead of going
cold...

On Tue, Feb 20, 2007 at 12:08:14PM +0530, Pavan Deolasee wrote:
> Reposting - looks like the message did not get through in the first
> attempt. My apologies if multiple copies are received.
> 
> 
> This is the next version of the HOT WIP patch. Since the last patch that
> I sent out, I have implemented the HOT-update chain pruning mechanism.
> 
> When following a HOT-update chain from an index fetch, if we notice that
> the root tuple is dead and it is HOT-updated, we try to prune the chain to
> the smallest possible length. To do that, the share lock is upgraded to an
> exclusive lock and the tuple chain is followed until we find a
> live/recently-dead tuple. At that point, the root t_ctid is made to point
> to that tuple. In order to preserve the xmax/xmin chain, the xmax of the
> root tuple is also updated to the xmin of the found tuple. Since this xmax
> is also < RecentGlobalXmin and belongs to a committed transaction, the
> visibility of the root tuple remains the same.
> 
> The intermediate heap-only tuples are removed from the HOT-update chain.
> The HOT-updated status of these tuples is cleared and their respective
> t_ctids are made to point to themselves. These tuples are no longer
> reachable and are ready for vacuuming. This entire action is logged in a
> single WAL record.
> 
> During vacuuming, we keep track of the number of root tuples vacuumed.
> If this count is zero, then the index cleanup step is skipped. This
> avoids unnecessary index scans whenever possible.
> 
> This patch should apply cleanly on current CVS HEAD and pass all regression
> tests. I am still looking for review comments on the first WIP patch. If
> anyone has already looked through it and is interested in the incremental
> changes, please let me know; I can post them.
> 
> What's Next?
> -
> 
> ISTM that basic HOT updates and the ability to prune the HOT-update chain
> should help us reduce index bloat, limit the overhead of ctid-following in
> index fetches, and efficiently vacuum heap-only tuples. IMO the next
> important but rather less troublesome thing to tackle is reusing space
> within a block without a complete vacuum of the table. This would let us
> do many more HOT updates and thus further reduce index/heap bloat.
> 
> I am thinking of reusing the DEAD heap-only tuples which get removed from
> the HOT-update chain as part of the pruning operation. Since these tuples,
> once removed from the chain, are neither reachable nor have any index
> references, they could readily be used for storing newer versions of the
> same or other rows in the block. How about setting LP_DELETE on these
> tuples as part of the prune operation? LP_DELETE is unused for heap
> tuples, if I am not mistaken. Other information, like length and offset,
> is maintained as it is. When we run out of space for an
> update-within-the-block, we traverse all the line pointers looking for
> LP_DELETEd items. If any of these items has enough space to store the new
> tuple, that item is reused. Does anyone see any issue with doing this?
> Also, any suggestions about doing it in a better way?
> 
> If the page gets really fragmented, we can try to grab a VACUUM-strength
> lock on the page and de-fragment it. The lock is taken conditionally to
> avoid any deadlocks. This is done in the heap_update() code path, so it
> would add some overhead, but may still prove better than putting the tuple
> in a different block and having the corresponding index insert(s). Also,
> since we are more concerned about large tables, the chances of being able
> to upgrade the exclusive lock to a vacuum-strength lock are high. Comments?
> 
> If there are no objections, I am planning to work on the first part
> while Nikhil would take up the second task of block level retail-vacuum.
> Your comments on these issues and the patch are really appreciated.
> 
> Thanks,
> Pavan
> 
> -- 
> 
> EnterpriseDB http://www.enterprisedb.com


> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster
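
As a rough illustration of the chain pruning described in the quoted mail,
here is a simplified, C-flavoured sketch; the types and helper names are
invented stand-ins rather than code from the patch, and the locking and WAL
logging steps are left out:

#include <stddef.h>

/* Simplified stand-ins for the real structures; for illustration only. */
typedef unsigned int TransactionId;

typedef struct ToyTuple
{
    TransactionId xmin;
    TransactionId xmax;
    struct ToyTuple *next;      /* plays the role of following t_ctid */
    int hot_updated;            /* stand-in for the HOT-updated flag */
    int dead;                   /* result of a visibility check */
} ToyTuple;

/* Walk from the dead root until a live/recently-dead tuple is found,
 * then point the root at it and unlink the dead heap-only tuples. */
static void
prune_hot_chain(ToyTuple *root)
{
    ToyTuple *tup = root->next;

    while (tup != NULL && tup->dead)
    {
        ToyTuple *next = tup->next;

        tup->hot_updated = 0;   /* clear HOT-updated status */
        tup->next = tup;        /* "t_ctid" now points to itself */
        tup = next;
    }

    if (tup != NULL)
    {
        root->next = tup;       /* root "t_ctid" points at the survivor */
        root->xmax = tup->xmin; /* preserve the xmax/xmin chain */
    }
}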


-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] autovacuum next steps, take 2

2007-02-22 Thread Jim C. Nasby
On Thu, Feb 22, 2007 at 10:32:44PM -0500, Matthew T. O'Connor wrote:
> Jim C. Nasby wrote:
> >On Thu, Feb 22, 2007 at 09:32:57AM -0500, Matthew T. O'Connor wrote:
> >  
> >>So the heuristic would be:
> >>* Launcher fires off workers into a database at a given interval 
> >>(perhaps configurable?)
> >>* Each worker works on tables in size order. 
> >>* If a worker ever catches up to an older worker, then the younger 
> >>worker exits.
> >>
> >>This sounds simple and workable to me, perhaps we can later modify this 
> >>to include some max_workers variable so that a worker would only exit if 
> >>it catches an older worker and there are max_workers currently active.
> >>
> >
> >That would likely result in a number of workers running in one database,
> >unless you limited how many workers per database. And if you did that,
> >you wouldn't be addressing the frequently update table problem.
> >
> >A second vacuum in a database *must* exit after a fairly short time so
> >that we can go back in and vacuum the important tables again (well or
> >the 2nd vacuum has to periodically re-evaluate what tables need to be
> >vacuumed).
> >  
> 
> I'm not sure this is a great idea, but I don't see how this would result
> in large numbers of workers working in one database.  If workers work
> on tables in size order, and exit as soon as they catch up to an older
> worker, I don't see the problem.  Newer workers are going to catch up to
> older workers pretty quickly since small tables will vacuum fairly quickly.

The reason that won't necessarily happen is that large tables can pop up
as needing vacuuming at any time.

Round 1:
Fire up autovac worker; starts working and soon hits 100G table

Round 2:
Another worker starts. Since round 1, a 98G table now needs vacuuming,
which this worker soon hits.

Round 3:
89G table now needs vacuuming. Worker 3 starts up and soon hits it.

So now we have 3 workers, all hammering away in the same database, and
likely causing a lot of random IO.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] SCMS question

2007-02-22 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> [much snipped]
> Why are so few people committers?
> ...
> The answer to both questions is because CVS limitations make it hard to do
> better.

Uh, no.  The reason there are so few committers is that there are so few
people qualified not to break things.

If we had a community with hundreds of people with commit access, very
possibly we'd be feeling the need of a better SCM system.  But right
now, CVS is not the bottleneck, and I don't see that we'd get a payback
for the pain of changing to something else.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] SCMS question

2007-02-22 Thread Warren Turkal
On Thursday 22 February 2007 20:39, Alvaro Herrera wrote:
> > Git is also pretty cool, too. You can even present a CVS interface on a
> > git repository. That might address the build farm issue.
>
> But it wasn't portable, last time I checked.

Git is in the FreeBSD ports. The cvs gateway server comes with it AFAIK.

wt
-- 
Warren Turkal (w00t)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] SCMS question

2007-02-22 Thread Andrew Dunstan

Alvaro Herrera wrote:
> Warren Turkal wrote:
>> On Thursday 22 February 2007 11:00, Joshua D. Drake wrote:
>>> 1. It has an api that can be written to, that may or may not be helpful
>>> to buildfarm.
>>>
>>> 2. It has mindshare. I know that isn't a big deal to a lot of people
>>> here, but the it is becoming the new cvs. There are others of course
>>> (like monotone) but most projects I know are moving from cvs to svn or
>>> starting with svn.
>>
>> Git is also pretty cool, too. You can even present a CVS interface on a git
>> repository. That might address the build farm issue.
>
> But it wasn't portable, last time I checked.

Besides that, the whole idea is that buildfarm should mimic as exactly 
as possible how we checkout and build by hand.


cheers

andrew

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] SCMS question

2007-02-22 Thread Alvaro Herrera
Warren Turkal wrote:
> On Thursday 22 February 2007 11:00, Joshua D. Drake wrote:
> > 1. It has an api that can be written to, that may or may not be helpful
> > to buildfarm.
> >
> > 2. It has mindshare. I know that isn't a big deal to a lot of people
> > here, but the it is becoming the new cvs. There are others of course
> > (like monotone) but most projects I know are moving from cvs to svn or
> > starting with svn.
> 
> Git is also pretty cool, too. You can even present a CVS interface on a git 
> repository. That might address the build farm issue.

But it wasn't portable, last time I checked.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] autovacuum next steps, take 2

2007-02-22 Thread Matthew T. O'Connor

Jim C. Nasby wrote:
> On Thu, Feb 22, 2007 at 09:32:57AM -0500, Matthew T. O'Connor wrote:
>> So the heuristic would be:
>> * Launcher fires off workers into a database at a given interval
>> (perhaps configurable?)
>> * Each worker works on tables in size order.
>> * If a worker ever catches up to an older worker, then the younger
>> worker exits.
>>
>> This sounds simple and workable to me, perhaps we can later modify this
>> to include some max_workers variable so that a worker would only exit if
>> it catches an older worker and there are max_workers currently active.
>
> That would likely result in a number of workers running in one database,
> unless you limited how many workers per database. And if you did that,
> you wouldn't be addressing the frequently updated table problem.
>
> A second vacuum in a database *must* exit after a fairly short time so
> that we can go back in and vacuum the important tables again (well, or
> the 2nd vacuum has to periodically re-evaluate which tables need to be
> vacuumed).

I'm not sure this is a great idea, but I don't see how this would result 
in large numbers of workers working in one database.  If workers work 
on tables in size order, and exit as soon as they catch up to an older 
worker, I don't see the problem.  Newer workers are going to catch up to 
older workers pretty quickly since small tables will vacuum fairly quickly.
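
To make the control flow of that heuristic concrete, here is a small,
self-contained toy sketch; the table list, worker position and all names are
invented for illustration, and this is not autovacuum code:

#include <stdbool.h>
#include <stdio.h>

/* Toy sketch of the heuristic under discussion: each worker walks the
 * tables of one database in size order and exits as soon as it catches
 * up to (or passes) an older worker. */

typedef struct { const char *name; long size; bool needs_vacuum; } Table;

/* Pretend vacuum: in the real system this would be a full VACUUM pass. */
static void vacuum_table(const Table *t) { printf("vacuuming %s\n", t->name); }

/* 'tables' is assumed sorted by size; 'older_pos' is the index the older
 * worker has reached, or a negative value if no older worker is active. */
static void
worker_main(Table *tables, int ntables, int older_pos)
{
    for (int i = 0; i < ntables; i++)
    {
        if (older_pos >= 0 && i >= older_pos)
            return;                      /* caught the older worker: exit */
        if (tables[i].needs_vacuum)
            vacuum_table(&tables[i]);
    }
}

int main(void)
{
    Table tables[] = { {"small", 10, true}, {"medium", 1000, true},
                       {"huge", 100000, true} };
    worker_main(tables, 3, 2);           /* an older worker is at "huge" */
    return 0;
}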



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] SCMS question

2007-02-22 Thread Warren Turkal
On Thursday 22 February 2007 11:00, Joshua D. Drake wrote:
> 1. It has an api that can be written to, that may or may not be helpful
> to buildfarm.
>
> 2. It has mindshare. I know that isn't a big deal to a lot of people
> here, but the it is becoming the new cvs. There are others of course
> (like monotone) but most projects I know are moving from cvs to svn or
> starting with svn.

Git is also pretty cool, too. You can even present a CVS interface on a git 
repository. That might address the build farm issue.

wt
-- 
Warren Turkal (w00t)

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Simple Column reordering

2007-02-22 Thread Alvaro Herrera
Andrew Dunstan wrote:
> Simon Riggs wrote:
> >
> > I propose that at CREATE TABLE time, the column ordering is re-ordered
> > so that the table columns are packed more efficiently. This would be a
> > physical re-ordering, so that SELECT * and COPY without explicit column
> > definitions would differ from the original CREATE TABLE statement.
> >
> > This would be an optional feature, off by default, controlled by a
> > USERSET GUC
> > optimize_column_order = off (default) | on
> 
> Umm, you want a GUC setting to enable standards-breaking behaviour and
> that will be obsolete when we do column ordering right, which is not
> likely to be more than one release away, and could even still happen in
> this coming release?

Given that we already seem to have a patch implementing a complete
solution, or part thereof, this would seem a rather shortsighted
proposal.  Why not develop the whole thing and be done with it?

I don't understand the reluctance to implement all of it.  The most
serious objection I've seen, from Andreas IIRC, is that it would make
drivers' lives more difficult; but really, drivers already have to cope with
dropped columns today, which is a pain, and ISTM this proposal (not this
one here, but the three-column proposal) would make that a bit simpler.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Simple Column reordering

2007-02-22 Thread Andrew Dunstan
Simon Riggs wrote:
>
> I propose that at CREATE TABLE time, the column ordering is re-ordered
> so that the table columns are packed more efficiently. This would be a
> physical re-ordering, so that SELECT * and COPY without explicit column
> definitions would differ from the original CREATE TABLE statement.
>
> This would be an optional feature, off by default, controlled by a
> USERSET GUC
>   optimize_column_order = off (default) | on
>


Umm, you want a GUC setting to enable standards-breaking behaviour and
that will be obsolete when we do column ordering right, which is not
likely to be more than one release away, and could even still happen in
this coming release?

I hope I haven't misunderstood.

cheers

andrew


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Grouped Index Tuples

2007-02-22 Thread Simon Riggs
On Thu, 2007-02-22 at 12:47 +, Heikki Linnakangas wrote:

> One question that I'm sure someone will ask is do we need this if we 
> have bitmap indexes? Both aim at having a smaller index, after all.
> The use cases are quite different; GIT is effective whenever you have
> a table that's reasonably well-clustered. Unlike the bitmap indexam,
> GIT's effectiveness doesn't depend on the number of distinct values,
> in particular it works well with unique indexes. GIT is comparable to 
> clustered indexes in other DBMSs (in fact we might want to call GIT
> that in the end).

A few thoughts:

Whether it's proof or merely strong interest, we need to see where the
cross-over point is between bitmap indexes (BMI) and GIT. I'm assuming
there is one, but others may need convincing that there is a clear win
in common use cases. Graphs, numbers, etc.

On your TODO list, it would be good to differentiate between ideas and
must-complete items. My suggested list would be:

Must complete
- cost model for optimizer
- user documentation
- performance tests

Good to complete
- duplicate key flag (should increase range of overlap with BMI)
- range re-check optimization (based upon your results)
- better index AM
- HOT, so that GIT can provide longer term benefits

Others
...

I seem to recall you mentioning there was a way to record that the heap
tuples are stored in monotonic order and so the sort can be avoided when
you return the tuples. Is that still true?

Also:
"the patch roughly doubles the amount of WAL generated by WAL"
I'm not sure what that means.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Simple Column reordering

2007-02-22 Thread Simon Riggs
Column storage position has been the subject of many long threads in recent
times. The solutions proposed for it have been fairly complex, and the
discussions long enough, that nothing seems likely to happen for 8.3. If I'm
wrong, then of course this proposal would be superseded.

I propose that at CREATE TABLE time, the column ordering is re-ordered
so that the table columns are packed more efficiently. This would be a
physical re-ordering, so that SELECT * and COPY without explicit column
definitions would differ from the original CREATE TABLE statement.

This would be an optional feature, off by default, controlled by a
USERSET GUC
optimize_column_order = off (default) | on

When the full column ordering proposal is implemented,
optimize_column_order would default to on. The feature would be supported
for at least one more release after that to allow bug analysis.

The proposed ordering would be:
1. All fixed-length columns, arranged so that alignment is efficient
2. All variable-length columns

All column ordering would stay as close as possible to the original order.

No changes would be made other than at CREATE TABLE time.

The ordering would be repeatable, so that the order would not change on
repeated dump/restore of a table with no changes.
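
As a sketch of the kind of packing rule proposed above (fixed-length columns
first, wider alignment first, variable-length columns last, ties broken by
the original position so the result is repeatable), a comparator along these
lines would do; the struct and names are invented for illustration, not
proposed code:

#include <stdbool.h>
#include <stdlib.h>

/* Illustrative only: one column's packing-relevant properties. */
typedef struct
{
    int  orig_pos;      /* position in the CREATE TABLE statement */
    bool is_varlena;    /* variable-length column? */
    int  align;         /* alignment requirement in bytes (1, 2, 4, 8) */
} ColInfo;

static int
column_order_cmp(const void *a, const void *b)
{
    const ColInfo *ca = a, *cb = b;

    if (ca->is_varlena != cb->is_varlena)
        return ca->is_varlena ? 1 : -1;     /* fixed-length first */
    if (ca->align != cb->align)
        return cb->align - ca->align;       /* wider alignment first */
    return ca->orig_pos - cb->orig_pos;     /* stable w.r.t. original order */
}

/* usage: qsort(cols, ncols, sizeof(ColInfo), column_order_cmp); */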

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [Monotone-devel] Re: [HACKERS] SCMS question

2007-02-22 Thread patrick
On Thu, Feb 22, 2007 at 03:13:49PM +0100, Markus Schiltknecht wrote:
> one sparc (osol). So far all gcc compiled, AFAIK.
I think that buildbot was gcc on solaris9/sparc. I look after support for 
monotone built with sunpro on solaris10
(and opensolaris) on x86 and sparc (but there is no buildbot for those).

There was once some work on msvc support, but I have no idea what happened 
to that.


patrick georgi

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Howto change db cluster locale on-the-fly

2007-02-22 Thread Jakub Ouhrabka
Thanks for your answer. Is there any risk other than wrong answers when 
running with the wrong locale?


So maybe the best bet would be:

1) drop all text/varchar user indexes
2) stop database, change the locale
3) in single user mode reindex shared tables and system tables in all 
databases and templates

4) start the database
5) create all text/varchar user indexes

Does this sound about right? I'd like to minimize downtime...

How do I do step 2) - change the locale?

Thanks a lot,

Kuba

Martijn van Oosterhout wrote:
> On Mon, Feb 19, 2007 at 09:27:06AM +0100, Jakub Ouhrabka wrote:
>> But I guess something like this would work:
>>
>> a)
>> 1) drop all indexes on text/varchar columns
>> 2) change cluster locale
>> 3) create all indexes on text/varchar columns
>
> You're going to miss the "name" columns, i.e. every string index in
> pg_catalog. Also, there are shared tables which are used in every DB.
> You need to log into every DB in the cluster (don't forget template[01])
> and reindex everything.
>
> So, REINDEX DATABASE; seems to be a safer bet. In general this doesn't
> actually work since changing the locale may make two strings equal that
> weren't before, thus possibly breaking a unique index, but it may be
> possible.
>
> I'd suggest single user mode at least, and make backups!
>
> Have a nice day,


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [Monotone-devel] Re: [HACKERS] SCMS question

2007-02-22 Thread Richard Levitte - VMS Whacker
If I may, I'll add a few words to this discussion:

Basically, I'm seeing that three things need to be decided upon:

  1. Do you want to stay with CVS or do you want to move to something
 else?
  2. If you want to move, when?  Is now a good time, or is it better
 to look at it another time.  This may be a question of what
 people you have who'd do the job, what kind of time they have for
 the moment and so on.
  3. What would you want a replacement to be able to do?

When those questions are answered and people are behind it, then it's
time to look at the different systems and see what's the best match for
your desires.

So far, I'm getting the sense that there are a lot of opinions on what
replacement system to use, a bit carelessly before having answered the
above questions thoroughly.

HTH.

Cheers,
Richard

-- 
Richard Levitte [EMAIL PROTECTED]
http://richard.levitte.org/

"When I became a man I put away childish things, including
 the fear of childishness and the desire to be very grown up."
-- C.S. Lewis

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [Monotone-devel] Re: [HACKERS] SCMS question

2007-02-22 Thread Richard Levitte - VMS Whacker
In message <[EMAIL PROTECTED]> on Thu, 22 Feb 2007 17:38:26 +0100, Markus 
Schiltknecht <[EMAIL PROTECTED]> said:

markus> > So far, I'm getting the sense that there are a lot of
markus> > opinions on what replacement system to use, a bit carelessly
markus> > before having answered the above questions thoroughly.
markus> 
markus> How did you get that impression?

You said it yourself: Most PostgreSQL developers currently want to
stay with CVS.

Unless there's a majority that wants to move on, I doubt there will be
a move.  In the end, it has to be a group effort, or it will simply
not happen.

Cheers,
Richard

-
Please consider sponsoring my work on free software.
See http://www.free.lp.se/sponsoring.html for details.

-- 
Richard Levitte [EMAIL PROTECTED]
http://richard.levitte.org/

"When I became a man I put away childish things, including
 the fear of childishness and the desire to be very grown up."
-- C.S. Lewis

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [Monotone-devel] Re: [HACKERS] SCMS question

2007-02-22 Thread Richard Levitte - VMS Whacker
In message <[EMAIL PROTECTED]> on Thu, 22 Feb 2007 09:09:48 -0800, "Joshua D. 
Drake" <[EMAIL PROTECTED]> said:

jd> I believe that is much more accurate. The reality is, switching to
jd> something else will be painful. I would prefer not to be on CVS as
jd> well but it would take a lot of work and cvs does what we need it
jd> to.

I can only tell you my own story, and that's a story of being sick and
damn (oh, I have a much stronger word, but I'm too polite to use it
here) tired of CVS, especially having had to deal with its absolute
lack of merging capability.  It took trying a number of other SCMs,
which were all a PITA, until I found monotone and decided I had
found a worthy replacement.

To each his or her own, I say, and as you say, changing (or "moving
on") takes some amount of effort, and you will not do it, as a group,
before more or less everyone (especially any core group) thinks it's worth
more than staying with the current system.

Cheers,
Richard

-- 
Richard Levitte [EMAIL PROTECTED]
http://richard.levitte.org/

"When I became a man I put away childish things, including
 the fear of childishness and the desire to be very grown up."
-- C.S. Lewis

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [Monotone-devel] Re: SCMS question

2007-02-22 Thread Chris Browne
[EMAIL PROTECTED] (Andrew Dunstan) writes:
> Tom has pointed out, our biggest pain point is
> the occasional wish to move things across directories.

That's the biggest pain that people are normally aware of.

There are things that people don't even bother to try to do with CVS
because they are so impossible as to be stupid ideas.  

Notably, CVS, being an inherently centralized system, doesn't support
the notion of a secondary repository tracking work that hasn't yet
been committed into the central repository.
-- 
(reverse (concatenate 'string "ofni.secnanifxunil" "@" "enworbbc"))
http://linuxfinances.info/info/finances.html
Rules of  the Evil Overlord #189. "I  will never tell the  hero "Yes I
was the one who  did it, but you'll never be able  to prove it to that
incompetent  old fool."  Chances  are, that  incompetent  old fool  is
standing behind the curtain."  

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Saving space for common kinds of numeric values

2007-02-22 Thread Gregory Stark

The numeric data type's minimum data size is 8 bytes and it can only even get
that small for "0". Storing even "1" requires 10 bytes. That seems pretty
abysmal.

It occurs to me that we could assign special-case meanings for any datum
smaller than 8 bytes. In just 2 or 4 bytes (including the 1-byte varlena
header) I think we could encode many common small values.

This would be a pretty straightforward change to numeric.c because all the
hard work there is done using an internal representation that never reaches
disk. As a result only set_var_from_num and make_result and a handful of
simple functions that work directly on the packed Numeric representation would
need to be adjusted at all.

I'm thinking of the following two cases:

1 byte (plus 1 byte header): an integer between -128..127, with dscale and weight
implicitly defined to display as an integer. I.e., weight is always the number
of digits in the value and dscale is always 0. That doesn't restrict the
values Numeric supports, only the values it can use this representation for; so
if you store "-1.0" it would be stored normally (or using the following option).

3 bytes (plus 1 byte header): 1 byte to store weight and dscale (one nibble
each) and 2 bytes to store the value. This would let us handle the extremely
common case of currency quantities which have 2 decimal places. It could store
-327.68 .. 327.67 in four bytes including the varlena header.

Alternatively we could do away with weight entirely for the 3 byte
representation, as with the 1 byte representation. That would let us store
-10485.76 .. 10485.75 in 21 bits and use the remaining 3 bits to store a
dscale of up to 8. I actually favour this option.

There are lots of options and we could go nuts defining a meaning for every
possible length up to 8, but really I think just defining a 1+1 byte encoding
for small integers and a 3+1 byte encoding for many common applications would
be reasonable.
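
To make the favoured 3+1 byte alternative concrete, here is a small,
self-contained sketch of one possible packing (a 21-bit signed scaled value
plus a 3-bit dscale); the helper names and exact bit layout are illustrative
assumptions, not numeric.c code, and the 1-byte varlena header is left out:

#include <stdbool.h>
#include <stdint.h>

/* Illustrative packing: e.g. 123.45 is stored as value 12345 with dscale 2;
 * at dscale 2 the range is -10485.76 .. 10485.75. */

static bool
encode_short_numeric(int64_t value, int dscale, uint8_t out[3])
{
    if (dscale < 0 || dscale > 7)
        return false;                       /* only 3 bits of dscale */
    if (value < -(1 << 20) || value >= (1 << 20))
        return false;                       /* only 21 bits of value */

    uint32_t packed = ((uint32_t) (value & 0x1FFFFF) << 3) | (uint32_t) dscale;
    out[0] = (uint8_t) (packed >> 16);
    out[1] = (uint8_t) (packed >> 8);
    out[2] = (uint8_t) packed;
    return true;
}

static void
decode_short_numeric(const uint8_t in[3], int64_t *value, int *dscale)
{
    uint32_t packed = ((uint32_t) in[0] << 16) | ((uint32_t) in[1] << 8) | in[2];

    *dscale = (int) (packed & 0x7);

    /* sign-extend the 21-bit value */
    int32_t v = (int32_t) (packed >> 3);
    if (v & (1 << 20))
        v -= (1 << 21);
    *value = v;
}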


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Log levels for checkpoint/bgwriter monitoring

2007-02-22 Thread Jim C. Nasby
On Mon, Feb 19, 2007 at 10:59:38PM -0500, Greg Smith wrote:
> I have a WIP patch that adds the main detail I have found I need to 
> properly tune checkpoint and background writer activity.  I think it's 
> almost ready to submit (you can see the current patch against 8.2 at 
> http://www.westnet.com/~gsmith/content/postgresql/patch-checkpoint.txt ) 
> after making it a bit more human-readable.  But I've realized that along 
> with that, I need some guidance in regards to what log level is 
> appropriate for this information.
It would also be extremely useful to make checkpoint stats visible
somewhere in the database (presumably via the existing stats mechanism).
The log output is great for doing initial tuning, but you'd want to also
be able to monitor things, which would be impractical via logging. I'm
thinking just tracking how many pages had to be flushed during a
checkpoint would be a good start.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] autovacuum next steps, take 2

2007-02-22 Thread Zeugswetter Andreas ADI SD

> > > vacuum should be a process with the least amount of voodoo.
> > > If we can just have vacuum_delay and vacuum_threshold, where
> > > threshold allows an arbitrary setting of how much bandwidth we will
> > > allot to the process, then that is a beyond wonderful thing.
> > > 
> > > It is easy to determine how much IO you have, and what you can spare.
> > 
> > The tricky part is what metric to use. Imho "IO per second" would be
> > good.
> > In a typical DB scenario that is the IO bottleneck, not the Mb/s.
> 
> Well, right now they're one in the same... but yeah, IO/sec 
> probably does make more sense.

Hopefully not :-) Otherwise you have no readahead, and that is imho the
problem. You need to anticipate how many physical IOs your logical IOs
cause, and this is nearly impossible unless we group IOs in pg itself.

Andreas

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Column storage positions

2007-02-22 Thread Zeugswetter Andreas ADI SD

> >> I agree, I haven't thought of drop column :-( Drop column should have
> >> relabeled attnum. Since it was not done then, my comments are 
> >> probably moot.
> >
> > We can correct this problem now.
> 
> How?  If attnum is serving as both physical position and 
> logical order, how can you make it be logical position 
> without breaking physical position?

If you ask me, attnum would be the logical position and would be used
in all other system tables; attphypos would only be used in
pg_attribute.
It would be quite some work to rearrange attnum in all system tables for
"drop column" and "add column before", but it would be nice for jdbc.

But it seems others want this: attnum being an arbitrary number
that is used in all system tables, plus 2 extra columns in pg_attribute,
one for logical position and one for physical position.
If you want the column name corresponding to a pg_index attnum, you need
a map.
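
To make the two alternatives concrete, here is a purely illustrative sketch
of the numbering schemes being compared; the field names are invented:

/* Alternative A (argued for above): attnum stays the logical position and
 * remains what all other system tables reference; only the physical
 * position is new, and it lives in pg_attribute alone. */
typedef struct
{
    int attnum;         /* logical position, referenced everywhere */
    int attphypos;      /* physical storage position, pg_attribute only */
} AltA;

/* Alternative B (what others seem to want): attnum becomes an arbitrary,
 * stable identifier and two extra columns carry the positions; mapping a
 * pg_index attnum back to a column name then needs an extra lookup. */
typedef struct
{
    int attnum;         /* arbitrary, stable identifier */
    int attlogpos;      /* logical (display) position */
    int attphyspos;     /* physical (storage) position */
} AltB;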

Andreas

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] autovacuum next steps, take 2

2007-02-22 Thread Jim C. Nasby
On Thu, Feb 22, 2007 at 09:35:45AM +0100, Zeugswetter Andreas ADI SD wrote:
> 
> > vacuum should be a process with the least amount of voodoo. 
> > If we can just have vacuum_delay and vacuum_threshold, where 
> > threshold allows an arbitrary setting of how much bandwidth 
> > we will allot to the process, then that is a beyond wonderful thing.
> > 
> > It is easy to determine how much IO you have, and what you can spare.
> 
> The tricky part is what metric to use. Imho "IO per second" would be
> good.
> In a typical DB scenario that is the IO bottleneck, not the Mb/s.

Well, right now they're one and the same... but yeah, IO/sec probably
does make more sense.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] autovacuum next steps, take 2

2007-02-22 Thread Jim C. Nasby
On Thu, Feb 22, 2007 at 09:32:57AM -0500, Matthew T. O'Connor wrote:
> Jim C. Nasby wrote:
> >On Wed, Feb 21, 2007 at 05:40:53PM -0500, Matthew T. O'Connor wrote:
> >  
> >>My Proposal:  If we require admins to identify hot tables, then: 
> >>1) Launcher fires-off a worker1 into database X.
> >>2) worker1 deals with "hot" tables first, then regular tables.
> >>3) Launcher continues to launch workers to DB X every autovac naptime. 
> >>4) worker2 (or 3 or 4 etc...) sees it is alone in DB X, if so it acts as 
> >>worker1 did above.  If worker1 is still working in DB X then worker2 
> >>looks for hot tables that are being starved because worker1 got busy. 
> >>If worker2 finds no hot tables that need work, then worker2 exits.
> >>
> >
> >Rather than required people to manually identify hot tables, what if we
> >just prioritize based on table size? So if a second autovac process hits
> >a specific database, it would find the smallest table in need of
> >vacuuming that it should be able to complete before the next naptime and
> >vacuum that. It could even continue picking tables until it can't find
> >one that it could finish within the naptime. Granted, it would have to
> >make some assumptions about how many pages it would dirty.
> >
> >ISTM that's a lot easier than forcing admins to mark specific tables.
> 
> So the heuristic would be:
> * Launcher fires off workers into a database at a given interval 
> (perhaps configurable?)
> * Each worker works on tables in size order. 
> * If a worker ever catches up to an older worker, then the younger 
> worker exits.
> 
> This sounds simple and workable to me, perhaps we can later modify this 
> to include some max_workers variable so that a worker would only exit if 
> it catches an older worker and there are max_workers currently active.

That would likely result in a number of workers running in one database,
unless you limited how many workers per database. And if you did that,
you wouldn't be addressing the frequently updated table problem.

A second vacuum in a database *must* exit after a fairly short time so
that we can go back in and vacuum the important tables again (well, or
the 2nd vacuum has to periodically re-evaluate which tables need to be
vacuumed).
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Column storage positions

2007-02-22 Thread Zeugswetter Andreas ADI SD

> > I agree, I haven't thought of drop column :-( Drop column should
have 
> > relabeled attnum.
> > Since it was not done then, my comments are probably moot. 
> 
> We can correct this problem now.

Do you mean fix it with the 3rd column in pg_attribute and use that,
or fix attnum ? :-)

Imho it is a pain to need 2 numbers and a mapping in drivers.

Andreas

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Column storage positions

2007-02-22 Thread Alvaro Herrera
Kris Jurka wrote:
> 
> 
> On Thu, 22 Feb 2007, Alvaro Herrera wrote:
> 
> >Zeugswetter Andreas ADI SD wrote:
> >>
> >>I agree, I haven't thought of drop column :-( Drop column should have 
> >>relabeled attnum. Since it was not done then, my comments are probably 
> >>moot.
> >
> >We can correct this problem now.
> 
> How?  If attnum is serving as both physical position and logical order, 
> how can you make it be logical position without breaking physical 
> position?

By patching the code, of course, so that it doesn't serve as both
things, which is what is being discussed.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Column storage positions

2007-02-22 Thread Kris Jurka



On Thu, 22 Feb 2007, Alvaro Herrera wrote:
> Zeugswetter Andreas ADI SD wrote:
>> I agree, I haven't thought of drop column :-( Drop column should have
>> relabeled attnum. Since it was not done then, my comments are probably
>> moot.
>
> We can correct this problem now.

How?  If attnum is serving as both physical position and logical order, 
how can you make it be logical position without breaking physical 
position?


Kris Jurka

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Column storage positions

2007-02-22 Thread Alvaro Herrera
Zeugswetter Andreas ADI SD wrote:
> 
> > > And I also see a lot of unhappiness from users of system tables
> > > when column numbers all over the system tables would not be
> > > logical column positions any more.
> > 
> > Right now the fact that attnum presents the logical order but 
> > not the logical position is a problem for the JDBC driver.  
> > In the presence of dropped columns there is no easy way to 
> > get from a pg_attribute entry to logical position.  I would 
> > hope that a new logical position column would reflect the 
> > actual position and solve this problem.
> 
> I agree, I haven't thought of drop column :-( Drop column should have
> relabeled attnum.
> Since it was not done then, my comments are probably moot. 

We can correct this problem now.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] SCMS question

2007-02-22 Thread Alvaro Herrera
Gregory Stark wrote:
> "Andrew Dunstan" <[EMAIL PROTECTED]> writes:
> 
> > 2. Many people (and some buildfarm members) operate against mirrors of the 
> > main
> > repo which are created with rsync or CVSup. I am not aware of any way to do 
> > the
> > equivalent with SVN -  any info would be gratefully received. Of course, 
> > SVN is
> > better at disconnected operation than CVS, so it might be a non-issue for 
> > many.
> > Even so, it might be a pity to have to forego the facility.
> 
> Well SVN basically works by having that mirror all the time. That kind of
> sucks for people with many checkouts since it takes more disk space but it
> provides the same benefits of having a local mirror of the CVS repository
> which is really just working around the problems with CVS.
> 
> The general point about the build farms is a strong argument in favor of SVN
> over the new-fangled revision control systems. It would involve the least
> change in the operational model and the main build farm maintainer is familiar
> with it...

Nonsense.  Getting a checkout is almost as easy in Monotone as it is in
CVS or subversion.  You just get an initial copy of the database and
then get all your checkouts from there.

In SVN there's this strange notion of keeping the latest revision within
the checked out copy: if you have more than one working copy, you get
the same stuff multiple times.  If you "grep" the whole source tree you
get a lot of extraneous files in the grep result, which you then have to
"grep -v" out, which is annoying.

In Monotone the database is separate from the working copy, and from one
database you can get as many working copies as you want.  You don't need
to mirror the repository, because you _have_ the repository.  (I guess
another way to put it is that mirroring the repository is part of
everyday operation).

We use SVN internally in Command Prompt, and I will change to anything
else any day :-)

One thing that Monotone doesn't do, which we are used to, is
$Id$-style keyword expansion.  The only other problem I see currently is
the long time it takes to get an initial pull.

> It's also the easiest to get ahold of. Easier I would say than CVS which you
> have to download some bug fixes from various unofficial sites to get a good
> working version of.

Sure, anything is better than CVS :-)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Column storage positions

2007-02-22 Thread Zeugswetter Andreas ADI SD

> > And I also see a lot of unhappiness from users of system tables when

> > column numbers all over the system tables would not be logical
column 
> > positions any more.
> 
> Right now the fact that attnum presents the logical order but 
> not the logical position is a problem for the JDBC driver.  
> In the presence of dropped columns there is no easy way to 
> get from a pg_attribute entry to logical position.  I would 
> hope that a new logical position column would reflect the 
> actual position and solve this problem.

I agree, I haven't thought of drop column :-( Drop column should have
relabeled attnum.
Since it was not done then, my comments are probably moot. 

Andreas

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] What is CheckPoint.undo needed for?

2007-02-22 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> I think you should increase pg_control version.

And the WAL page-header version, since this also changes WAL contents.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] SCMS question

2007-02-22 Thread Stefan Kaltenbrunner
Gregory Stark wrote:
> "Andrew Dunstan" <[EMAIL PROTECTED]> writes:
> 

[...]
> It's also the easiest to get ahold of. Easier I would say than CVS which you
> have to download some bug fixes from various unofficial sites to get a good
> working version of.

Just to mention it - the OpenBSD guys are working on a BSD-licensed CVS
rewrite too (and I believe they strive for full compatibility first and
will improve on the main weak points later on).
Some information about that effort is available here:
http://www.opencvs.org/.


Stefan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Column storage positions

2007-02-22 Thread Kris Jurka



On Thu, 22 Feb 2007, Zeugswetter Andreas ADI SD wrote:
> And I also see a lot of unhappiness from users of system tables when
> column numbers all over the system tables would not be logical column
> positions any more.

Right now the fact that attnum presents the logical order but not the 
logical position is a problem for the JDBC driver.  In the presence of 
dropped columns there is no easy way to get from a pg_attribute entry to 
logical position.  I would hope that a new logical position column would 
reflect the actual position and solve this problem.


Kris Jurka

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Pavel Stehule


>> CREATE TABLE foo (id serial, names text FULLTEXT);
>>
>> Anything more complicated is a waste of cycles.
>>
>> Joshua D. Drake
>
> I agree. Question: what about multilanguage fulltext.
>
> CREATE INDEX foo USING FULLTEXT(bar) [ WITH czech_dictionary ];
> CREATE TABLE foo (id serial, names text FULLTEXT [ (czech_dictionary,
> english_dictionary) ] );
>
> all others can we do via SP.

> That works for me with perhaps a default mapping to locales? For example
> if our locale is en_us.UTF8 we are pretty assured that we are using
> english.

90% yes, 10% no. In Czech a typical task is to find a word without accents,
or to find a German, English or Czech stemmed word in multilanguage
documents (or in different languages, depending on topology). Lots of
databases are at minimum bilingual (in the Czech Republic, German and
Czech).


Pavel

P.S. Missing collations are a big minus for PostgreSQL in the EU (we have 
some workarounds).




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-22 Thread Zeugswetter Andreas ADI SD

> I think the word "line pointer" is causing some confusion 
> here. Let me explain the idea again: each page has a set of 
> line pointers, or item-ids as they are referred to in the code (I 
> shall use the word item-id hereafter).
> The item-id stores the offset (15 bits), length (15 bits) and 
> two flags, LP_USED and LP_DELETE (2 bits).
> 
> The root tuple is pointed to by some item-id (say, I1) on the 
> page. When the tuple is hot updated, a heap-only tuple is 
> added which is linked to the root tuple by its t_ctid and is 
> pointed by another item-id I2 on the page.
> 
> Now, when the root tuple subsequently becomes DEAD, I am 
> proposing to store I2 in the offset field of I1. The length 
> field in I1 can be set to zero (or some other special value) 
> to mark that I1 is now just a redirection to I2 and does not 
> point to any real tuple, dead or live.

Oh, thanks for explaining. I had really misunderstood.
Is that possible? I thought item-ids (slots) need to be in physical
order.

> > If we do reuse dead tuples without vacuum we should probably, as already
> > suggested, disconnect the "what is dead enough for reuse/vacuum" from
> > global xmin right from the start.
> 
> I did not get that completely. Can you elaborate on that?
> 
> Once we find a heap-only DEAD tuple, we remove it from the
> tuple chain (and thus remove its only reference) and set LP_DELETE
> on its item-id. When we run out of free space in a page, we search for
> an item-id whose LP_DELETE is set and whose length is at least equal
> to the new tuple length. We reuse that item-id and the associated
> storage for storing the new tuple.
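
As a rough sketch of the item-id layout and the redirection/reuse idea in the
quoted mail (simplified types and invented names, not the actual ItemIdData
definition or patch code):

#include <stdbool.h>

/* Toy item-id: offset (15 bits), two flags (2 bits), length (15 bits). */
typedef struct
{
    unsigned offset : 15;       /* offset of the tuple on the page      */
    unsigned flags  : 2;        /* TOY_LP_USED / TOY_LP_DELETE          */
    unsigned length : 15;       /* length of the tuple                  */
} ToyItemId;

#define TOY_LP_USED    0x01
#define TOY_LP_DELETE  0x02

/* Turn I1 into a pure redirection to the item-id at index 'i2': the offset
 * field now stores an item-id number, and length 0 marks that I1 no longer
 * points at any real tuple. */
static void
redirect_item(ToyItemId *i1, unsigned i2)
{
    i1->offset = i2;
    i1->length = 0;
}

static bool
is_redirect(const ToyItemId *item)
{
    return item->length == 0 && (item->flags & TOY_LP_USED);
}

/* Reuse rule sketched in the mail: a slot whose LP_DELETE flag is set and
 * whose recorded length is large enough can hold a new tuple. */
static bool
slot_reusable(const ToyItemId *item, unsigned newlen)
{
    return (item->flags & TOY_LP_DELETE) && item->length >= newlen;
}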

With the recent discussion about flashback (explicitly setting a
snapshot
to 5 min ago) and forensics, I think we should have a way to delay
what is considered "DEAD and available for immediate reuse".
I know this can reduce the efficiency of HOT when delayed too far,
but I think we should have that possibility if we reuse without vacuum.

Andreas

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] What is CheckPoint.undo needed for?

2007-02-22 Thread Alvaro Herrera
Florian G. Pflug wrote:
> Alvaro Herrera wrote:
> >Florian G. Pflug wrote:
> >>Heikki Linnakangas wrote:
> >>>No you're right, it's related to the WAL undo stuff that was never 
> >>>actually implemented. It's dead code.
> >>>
> >>>Teodor Sigaev wrote:
> Opps, sorry, I missed checkpoint keyword
> 
> Teodor Sigaev wrote:
> >>What am I missing?
> >Seems, it's about that
> >http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php
> >>Thanks - everything makes much more sense now ;-)
> >
> >So if you want to submit a patch to remove it, it'll make sense for
> >future readers as well ;-)
> 
> Here you go. I'm not subscribe to pgsql-patches, so I posted this here.

I think you should increase pg_control version.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Joshua D. Drake

>> CREATE TABLE foo (id serial, names text FULLTEXT);
>>
>> Anything more complicated is a waste of cycles.
>>
>> Joshua D. Drake
> 
> I agree. Question: what about multilanguage fulltext.
> 
> CREATE INDEX foo USING FULLTEXT(bar) [ WITH czech_dictionary ];
> CREATE TABLE foo (id serial, names text FULLTEXT [ (czech_dictionary,
> english_dictionary) ] );
> 
> all others can we do via SP.

That works for me, with perhaps a default mapping to locales? For example,
if our locale is en_us.UTF8 we are pretty well assured that we are using
English.

Joshua D. Drake



> 
> Pavel Stehule
> 
> 
> 
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>   choose an index scan if your joining column's datatypes do not
>   match
> 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Column storage positions

2007-02-22 Thread Zeugswetter Andreas ADI SD

> > Yes, that was the idea (not oid but some number), and I am arguing 
> > against it. Imho people are used to seeing the logical position in e.g.
> > pg_index.
> >
> 
> Which people are you talking about? In my commercial PG work 
> I hardly ever look at a system table at all, and users 
> shouldn't have to IMNSHO. 

You are probably right. I tend to resort to command-line tools, schema
dumps and system tables; probably not many other "people" do that. I
often don't get to use my preferred toolset because it is not installed.


> If you mean tools developers, then accommodating to catalog 
> changes is par for the course, I should think.

The question is, whether the distributed work needed to get all the
tools and interfaces (like jdbc, odbc, pgadmin) to work again isn't more
work, than doing it in the backend would be.

Since we want plan invalidation anyway, I am not sure the rest is so
much.

Andreas

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [Monotone-devel] Re: [HACKERS] SCMS question

2007-02-22 Thread Peter Eisentraut
Markus Schiltknecht wrote:
> Most PostgreSQL developers currently want to stay with CVS. Only some
> desperate souls including myself are fiddling with other VCSes.

I think if you took a head count, a majority of developers would 
probably want to switch, but I doubt that there would be a consensus on 
what to.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] SCMS question

2007-02-22 Thread Joshua D. Drake
Andrew Dunstan wrote:
> Warren Turkal wrote:
>> On Thursday 22 February 2007 05:26, Andrew Dunstan wrote:
>>  
>>> 2. Many people (and some buildfarm members) operate against mirrors of
>>> the main repo which are created with rsync or CVSup. I am not aware of
>>> any way to do the equivalent with SVN - 

Well you could just use rsync with svn ;)

>>> any info would be gratefully
>>> received. Of course, SVN is better at disconnected operation than CVS,
>>> so it might be a non-issue for many. Even so, it might be a pity to have
>>> to forego the facility.
>>> 
>>
>> Check out svnsync for SVN. It's part of a normal SVN installation.
>>

That is cool, I didn't know about that.

Other items I find interesting about SVN:

1. It has an api that can be written to, that may or may not be helpful
to buildfarm.

2. It has mindshare. I know that isn't a big deal to a lot of people
here, but it is becoming the new cvs. There are others of course
(like monotone) but most projects I know are moving from cvs to svn or
starting with svn.

Joshua D. Drake

>>   
> 
> Nifty.
> 
> Thanks.
> 
> andrew
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>   http://archives.postgresql.org
> 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Pavel Stehule

I am not talking about stored procedures. I am talking about a very
ugly, counter intuitive syntax above.

Initializing full text should be as simple as:

CREATE INDEX foo USING FULLTEXT(bar);

(or something similar)

Or:

CREATE TABLE foo (id serial, names text FULLTEXT);

Anything more complicated is a waste of cycles.

Joshua D. Drake


I agree. Question: what about multilanguage fulltext.

CREATE INDEX foo USING FULLTEXT(bar) [ WITH czech_dictionary ];
CREATE TABLE foo (id serial, names text FULLTEXT [ (czech_dictionary, 
english_dictionary) ] );


All the rest we can do via stored procedures.

Pavel Stehule




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] SCMS question

2007-02-22 Thread Andrew Dunstan

Warren Turkal wrote:

On Thursday 22 February 2007 05:26, Andrew Dunstan wrote:
  

2. Many people (and some buildfarm members) operate against mirrors of
the main repo which are created with rsync or CVSup. I am not aware of
any way to do the equivalent with SVN -  any info would be gratefully
received. Of course, SVN is better at disconnected operation than CVS,
so it might be a non-issue for many. Even so, it might be a pity to have
to forego the facility.



Check out svnsync for SVN. It's part of a normal SVN installation.

  


Nifty.

Thanks.

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [Monotone-devel] Re: [HACKERS] SCMS question

2007-02-22 Thread Alvaro Herrera
Andrew Dunstan wrote:

> It's also fair to say that this is a subject about which we usually get 
> much more noise from partisans of other SCM systems than from the 
> relatively small number of people who actually have to maintain the 
> postgresql code. (As Tom has pointed out, our biggest pain point is the 
> occasional wish to move things across directories.)

There are more features we are missing -- we just don't know about them
:-)

For example, currently if I have a patch and somebody reviews it and
opines that I have to change foo to bar; then I resubmit the patch.  How
do they find out whether I actually changed foo to bar?  Currently there
are two alternatives:

1. trust that I did it
2. review the whole patch again

With a distributed SCM, I could just patch the code and commit a new
revision in my branch to just change foo to bar, and then the reviewer
can check that I truly did what he wanted.


Another easy thing to do is to track the current HEAD in a branch of
mine.  Keeping patches up to date in parallel with other developments is
easier.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Joshua D. Drake
Pavel Stehule wrote:
>> > And users are constantly complaining that PostgreSQL doesn't have
>> > fulltext indexing capabilities (if they don't know about tsearch2) or
>> > about how hard it is to use tsearch2.
>> >
>> >> SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'],
>> >> ARRAY['...']) is readable.
>> >
>> > Hardly. Because it's not like SQL:
>>
>> I have to agree here.
>>
>> SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'],
>> ARRAY['...']) is readable.
>>
>> Is a total no op. We might as well just leave it in contrib.
>>
> 
> I am for integrating tsearch into core, why not. But I don't see a reason
> for special syntax. Stored procedures are exactly the right tool for it.

I am not talking about stored procedures. I am talking about a very
ugly, counter intuitive syntax above.

Initializing full text should be as simple as:

CREATE INDEX foo USING FULLTEXT(bar);

(or something similar)

Or:

CREATE TABLE foo (id serial, names text FULLTEXT);

Anything more complicated is a waste of cycles.

Joshua D. Drake



-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Pavel Stehule

> And users are constantly complaining that PostgreSQL doesn't have
> fulltext indexing capabilities (if they don't know about tsearch2) or
> about how hard it is to use tsearch2.
>
>> SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'],
>> ARRAY['...']) is readable.
>
> Hardly. Because it's not like SQL:

I have to agree here.

SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'],
ARRAY['...']) is readable.

Is a total no op. We might as well just leave it in contrib.



I am for integrating tsearch into core, why not. But I don't see a reason
for special syntax. Stored procedures are exactly the right tool for it.


Fulltext is standardised in SQL/MM (SQL Multimedia and Application Packages,
Part 2: Full-Text).


Why implement an extensive proprietary solution? If our solution is
proprietary, then at least it should be simple and cheap, and it shouldn't
complicate future conformance with ANSI SQL.


Regards
Pavel Stehule



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] SCMS question

2007-02-22 Thread Gregory Stark
"Andrew Dunstan" <[EMAIL PROTECTED]> writes:

> 2. Many people (and some buildfarm members) operate against mirrors of the 
> main
> repo which are created with rsync or CVSup. I am not aware of any way to do 
> the
> equivalent with SVN -  any info would be gratefully received. Of course, SVN 
> is
> better at disconnected operation than CVS, so it might be a non-issue for 
> many.
> Even so, it might be a pity to have to forego the facility.

Well SVN basically works by having that mirror all the time. That kind of
sucks for people with many checkouts since it takes more disk space, but it
provides the same benefits as having a local mirror of the CVS repository,
which is really just working around the problems with CVS.

The general point about the build farms is a strong argument in favor of SVN
over the new-fangled revision control systems. It would involve the least
change in the operational model and the main build farm maintainer is familiar
with it...

It's also the easiest to get hold of; easier, I would say, than CVS, for
which you have to download bug fixes from various unofficial sites to get a
good working version.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Column storage positions

2007-02-22 Thread Andrew Dunstan

Zeugswetter Andreas ADI SD wrote:


Yes, that was the idea (not oid but some number), and I am arguing
against it. Imho people are used to seeing the logical position in e.g.
pg_index.


  


Which people are you talking about? In my commercial PG work I hardly 
ever look at a system table at all, and users shouldn't have to IMNSHO. 
If you mean tools developers, then accommodating to catalog changes is 
par for the course, I should think.


cheers

andrew


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] SCMS question

2007-02-22 Thread Warren Turkal
On Thursday 22 February 2007 05:26, Andrew Dunstan wrote:
> 2. Many people (and some buildfarm members) operate against mirrors of
> the main repo which are created with rsync or CVSup. I am not aware of
> any way to do the equivalent with SVN -  any info would be gratefully
> received. Of course, SVN is better at disconnected operation than CVS,
> so it might be a non-issue for many. Even so, it might be a pity to have
> to forego the facility.

Check out svnsync for SVN. It's part of a normal SVN installation.

With git, a checkout brings the whole repository anyway, I think.

wt
-- 
Warren Turkal (w00t)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Column storage positions

2007-02-22 Thread Zeugswetter Andreas ADI SD

> > And I also see a lot of unhappiness from users of system tables when
> > column numbers all over the system tables would not be logical column
> > positions any more.
> 
> Are you arguing against the feature? Or against the suggested design?

Against the design.

> I should have thought (without much looking) one possible way 
> to implement it would be to put Oids on pg_attribute for the 
> permanent id, and keep attnum for the (now mutable) logical 
> order, adding a further column for the physical order.

Yes, that was the idea (not oid but some number), and I am arguing
against it. Imho people are used to seeing the logical position in e.g.
pg_index.

I know it is a lot of work to update all those dependencies in the
system tables to reorder logical position, but that is the path I think
should be taken. And the first step in that direction is Phil's patch.

Andreas

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-22 Thread Pavan Deolasee

On 2/22/07, Zeugswetter Andreas ADI SD <[EMAIL PROTECTED]> wrote:




I think you are still misunderstanding me, sorry if I am not being clear
enough. When the row is hot-updated it is too late. You do not have room
in the root for the line pointer.



I think the word "line pointer" is causing some confusion here. Let me
explain the idea again: Each page has a set of line pointers OR item-ids
as they are referred to in the code (I shall use the term item-id hereafter).
The item-id stores the offset (15 bits), the length (15 bits) and two
flags, LP_USED and LP_DELETE (2 bits).

The root tuple is pointed to by some item-id (say, I1) on the page. When
the tuple is hot updated, a heap-only tuple is added which is linked to
the root tuple by its t_ctid and is pointed by another item-id I2 on
the page.

Now, when the root tuple subsequently becomes DEAD, I am proposing
to store I2 in the offset field of I1. The length field in I1 can be set to
zero (or some other special value) to mark that I1 is now just a
redirection to I2 and does not point to any real tuple, dead or live.
The actual storage used by the root tuple is now released (or we can
add another item pointer to keep track of it so that it can be reused
without a vacuum of the page, but let's defer that for the time being).

In short, the pointer is NOT stored in the tuple header, but in the
item-id.
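
To make that concrete, here is a minimal sketch of the redirection idea,
using the item-id layout described above. It is illustrative only, not the
actual HOT patch; in particular the "length = 0 means redirect" convention
is just the proposal above spelled out in code.

#include <stdbool.h>

/* Item-id layout as described above: 15-bit offset, 2 flag bits, 15-bit length. */
typedef struct ItemIdData
{
    unsigned    lp_off:15,      /* normally: byte offset of the tuple in the page */
                lp_flags:2,     /* LP_USED and/or LP_DELETE */
                lp_len:15;      /* normally: tuple length in bytes */
} ItemIdData;

#define LP_USED     0x01
#define LP_DELETE   0x02

/*
 * Turn root item-id I1 into a redirect to item-id number i2: the offset
 * field is reinterpreted as the target item-id number, and lp_len = 0
 * marks "this slot is only a redirection, there is no tuple behind it".
 */
static void
ItemIdSetRedirect(ItemIdData *i1, unsigned i2)
{
    i1->lp_off = i2;
    i1->lp_len = 0;
    i1->lp_flags = LP_USED;
}

static bool
ItemIdIsRedirect(const ItemIdData *i1)
{
    return (i1->lp_flags & LP_USED) != 0 && i1->lp_len == 0;
}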

If we do reuse dead tuples without vacuum we should probably, as already
suggested, disconnect the "what is dead enough for reuse/vacuum" from
global xmin right from the start.



I did not get that completely. Can you elaborate on that ?

Once we find a heap-only DEAD tuple, we remove it from the
tuple-chain (and thus remove its only reference) and set LP_DELETE
on its item-id. When we run out of free-space in a page, we search for
an item-id whose LP_DELETE is set and whose length is at least equal
to the new tuple length. We reuse that item-id and the associated
storage for storing the new tuple.
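
A rough sketch of that reuse search, building on the item-id layout
sketched earlier in this mail (illustrative only; real code would use the
bufpage.h accessors rather than a bare array):

/*
 * Find a previously deleted item-id whose storage can hold a new tuple of
 * newlen bytes.  Returns the slot number, or -1 if nothing is reusable and
 * the caller has to fall back to a vacuum or a new page.
 */
static int
find_reusable_slot(const ItemIdData *itemids, int nslots, unsigned newlen)
{
    for (int i = 0; i < nslots; i++)
    {
        const ItemIdData *lp = &itemids[i];

        if ((lp->lp_flags & LP_DELETE) != 0 && lp->lp_len >= newlen)
            return i;           /* reuse this slot and its storage */
    }
    return -1;
}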

I hope things would be more clear once I post next version of the
HOT-patch. But I really appreciate these comments.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Joshua D. Drake
> 
> And users are constantly complaining that PostgreSQL doesn't have
> fulltext indexing capabilities (if they don't know about tsearch2) or
> about how hard it is to use tsearch2.
> 
>> SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'],
>> ARRAY['...']) is readable.
> 
> Hardly. Because it's not like SQL:

I have to agree here.

SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'],
ARRAY['...']) is readable.

Is a total no op. We might as well just leave it in contrib.

Joshua D. Drake

-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Column storage positions

2007-02-22 Thread Andrew Dunstan

Zeugswetter Andreas ADI SD wrote:
And I also see a lot of unhappiness from users of system tables
when column numbers all over the system tables would not be logical column
positions any more.


  


Are you arguing against the feature? Or against the suggested design?

I should have thought (without much looking) one possible way to 
implement it would be to put Oids on pg_attribute for the permanent id, 
and keep attnum for the (now mutable) logical order, adding a further 
column for the physical order.
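
Just to illustrate the shape of that (the field names attid, attlognum and
attphysnum are invented for this sketch and are not part of any patch), the
catalog row would carry three distinct numbers:

typedef unsigned int Oid;   /* stand-in for the real typedef */
typedef short        int2;  /* stand-in for the real typedef */

/* Hypothetical pg_attribute layout; only the ordering-related fields shown. */
typedef struct HypotheticalPgAttribute
{
    Oid     attrelid;       /* table this column belongs to */
    Oid     attid;          /* permanent identity, never changes */
    int2    attlognum;      /* logical order: what SELECT * and COPY use */
    int2    attphysnum;     /* physical order: where the value sits on disk */
    /* ... the existing pg_attribute fields would follow ... */
} HypotheticalPgAttribute;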



cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [Monotone-devel] Re: [HACKERS] SCMS question

2007-02-22 Thread Joshua D. Drake
Andrew Dunstan wrote:
> Markus Schiltknecht wrote:
>>
>> Richard Levitte - VMS Whacker wrote:
>>>   1. Do you want to stay with CVS or do you want to move to something
>>>  else?
>>
>> Most PostgreSQL developers currently want to stay with CVS. Only some
>> desperate souls including myself are fiddling with other VCSes.
> 
> 
> I really don't think this is a correct characterisation. What is true, I
> think, is that many remain to be convinced that the benefits of any
> proposed change outweigh the likely pain, and I suspect many are also
> uncertain about what's happening in the SCM space and prefer to wait
> until the dust settles some.


I believe that is much more accurate. The reality is, switching to
something else will be painful. I would prefer not to be on CVS as well
but it would take a lot of work and cvs does what we need it to.

Joshua D. Drake

-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Markus Schiltknecht

Hi,

Pavel Stehule wrote:

Functions maybe don't seem effective, but users don't have to learn any new syntax.


Are you serious? That argument speaks exactly *for* extending the 
grammar. From other databases, users are used to:


CREATE TABLE ... (SQL)
CREATE INDEX ... (SQL)
CREATE FULLTEXT INDEX ... (Transact-SQL)
CREATE TABLE (... FULLTEXT ...) (MySQL)
CREATE INDEX ... INDEXTYPE IS ctxsys.context PARAMETERS ... (Oracle Text)

And users are constantly complaining that PostgreSQL doesn't have 
fulltext indexing capabilities (if they don't know about tsearch2) or 
about how hard it is to use tsearch2.


SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], 
ARRAY['...']) is readable.


Hardly. Because it's not like SQL:

 - it's counter-intuitive to have to SELECT, when you want to CREATE 
something.

 - it's confusing to have two actions (select create)
 - why do I have to write ARRAYs to list parameters?
 - it's not obvious what you're selecting (return value?)
 - you have to keep track of the brackets, which can easily get messed 
up with two levels of them. Especially if the command gets multiple 
lines long.



I agree that enhancing the parser with a non-standard construct isn't good.


Generally? Wow! This would mean PostgreSQL would always lag behind 
other RDBMSes regarding ease of use. Please don't do that!


Regards

Markus


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [Monotone-devel] Re: [HACKERS] SCMS question

2007-02-22 Thread Andrew Dunstan

Markus Schiltknecht wrote:


Richard Levitte - VMS Whacker wrote:

  1. Do you want to stay with CVS or do you want to move to something
 else?


Most PostgreSQL developers currently want to stay with CVS. Only some 
desperate souls including myself are fiddling with other VCSes.



I really don't think this is a correct characterisation. What is true, I 
think, is that many remain to be convinced that the benefits of any 
proposed change outweigh the likely pain, and I suspect many are also 
uncertain about what's happening in the SCM space and prefer to wait 
until the dust settles some.


It's also fair to say that this is a subject about which we usually get 
much more noise from partisans of other SCM systems than from the 
relatively small number of people who actually have to maintain the 
postgresql code. (As Tom has pointed out, our biggest pain point is the 
occasional wish to move things across directories.)



cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] What is CheckPoint.undo needed for?

2007-02-22 Thread Florian G. Pflug

Alvaro Herrera wrote:

Florian G. Pflug wrote:

Heikki Linnakangas wrote:
No you're right, it's related to the WAL undo stuff that was never 
actually implemented. It's dead code.


Teodor Sigaev wrote:

Opps, sorry, I missed checkpoint keyword

Teodor Sigaev wrote:

What am I missing?

Seems, it's about that
http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php

Thanks - everything makes much more sense now ;-)


So if you want to submit a patch to remove it, it'll make sense for
future readers as well ;-)


Here you go. I'm not subscribed to pgsql-patches, so I posted this here.

greetings, Florian Pflug


Index: src/backend/access/transam/xlog.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.264
diff -a -u -r1.264 xlog.c
--- src/backend/access/transam/xlog.c   14 Feb 2007 05:00:40 -  1.264
+++ src/backend/access/transam/xlog.c   22 Feb 2007 15:47:52 -
@@ -4039,7 +4039,6 @@
/* Set up information for the initial checkpoint record */
checkPoint.redo.xlogid = 0;
checkPoint.redo.xrecoff = SizeOfXLogLongPHD;
-   checkPoint.undo = checkPoint.redo;
checkPoint.ThisTimeLineID = ThisTimeLineID;
checkPoint.nextXidEpoch = 0;
checkPoint.nextXid = FirstNormalTransactionId;
@@ -4698,9 +4697,8 @@
wasShutdown = (record->xl_info == XLOG_CHECKPOINT_SHUTDOWN);
 
ereport(LOG,
-(errmsg("redo record is at %X/%X; undo record is at %X/%X; shutdown 
%s",
+(errmsg("redo record is at %X/%X; ; shutdown %s",
 checkPoint.redo.xlogid, checkPoint.redo.xrecoff,
-checkPoint.undo.xlogid, checkPoint.undo.xrecoff,
 wasShutdown ? "TRUE" : "FALSE")));
ereport(LOG,
(errmsg("next transaction ID: %u/%u; next OID: %u",
@@ -4730,20 +4728,17 @@
if (XLByteLT(RecPtr, checkPoint.redo))
ereport(PANIC,
(errmsg("invalid redo in checkpoint record")));
-   if (checkPoint.undo.xrecoff == 0)
-   checkPoint.undo = RecPtr;
 
/*
 * Check whether we need to force recovery from WAL.  If it appears to
 * have been a clean shutdown and we did not have a recovery.conf file,
 * then assume no recovery needed.
 */
-   if (XLByteLT(checkPoint.undo, RecPtr) ||
-   XLByteLT(checkPoint.redo, RecPtr))
+   if (XLByteLT(checkPoint.redo, RecPtr))
{
if (wasShutdown)
ereport(PANIC,
-   (errmsg("invalid redo/undo record in shutdown 
checkpoint")));
+   (errmsg("invalid redo record in shutdown 
checkpoint")));
InRecovery = true;
}
else if (ControlFile->state != DB_SHUTDOWNED)
@@ -5850,10 +5845,9 @@
{
CheckPoint *checkpoint = (CheckPoint *) rec;
 
-   appendStringInfo(buf, "checkpoint: redo %X/%X; undo %X/%X; "
+   appendStringInfo(buf, "checkpoint: redo %X/%X; "
 "tli %u; xid %u/%u; oid %u; 
multi %u; offset %u; %s",
 checkpoint->redo.xlogid, 
checkpoint->redo.xrecoff,
-checkpoint->undo.xlogid, 
checkpoint->undo.xrecoff,
 checkpoint->ThisTimeLineID,
 checkpoint->nextXidEpoch, 
checkpoint->nextXid,
 checkpoint->nextOid,
Index: src/bin/pg_controldata/pg_controldata.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/pg_controldata/pg_controldata.c,v
retrieving revision 1.32
diff -a -u -r1.32 pg_controldata.c
--- src/bin/pg_controldata/pg_controldata.c 8 Dec 2006 19:50:53 -   
1.32
+++ src/bin/pg_controldata/pg_controldata.c 22 Feb 2007 15:47:52 -
@@ -168,9 +168,6 @@
printf(_("Latest checkpoint's REDO location:%X/%X\n"),
   ControlFile.checkPointCopy.redo.xlogid,
   ControlFile.checkPointCopy.redo.xrecoff);
-   printf(_("Latest checkpoint's UNDO location:%X/%X\n"),
-  ControlFile.checkPointCopy.undo.xlogid,
-  ControlFile.checkPointCopy.undo.xrecoff);
printf(_("Latest checkpoint's TimeLineID:   %u\n"),
   ControlFile.checkPointCopy.ThisTimeLineID);
printf(_("Latest checkpoint's NextXID:  %u/%u\n"),
Index: src/include/catalog/pg_control.h
===
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_control.h,v
retrieving revision 1.35
diff -a -u -r1.35 pg_control.h
--- src/include/catalog/pg_control.h   

Re: [Monotone-devel] Re: [HACKERS] SCMS question

2007-02-22 Thread Markus Schiltknecht

Hello Richard,

you should probably have read the thread on the PostgreSQL -hackers 
mailing list I've linked to... at least you didn't make Tom's point ;-)


Richard Levitte - VMS Whacker wrote:

  1. Do you want to stay with CVS or do you want to move to something
 else?


Most PostgreSQL developers currently want to stay with CVS. Only some 
desperate souls including myself are fiddling with other VCSes.



  3. What would you want a replacement to be able to do?


That's being debated, with many voices saying: CVS (plus our own 
hackery) provides all we need.  (And be warned again: as soon as you 
point out an advantage of your favourite VCS, you're making Tom's point. 
;-) )



So far, I'm getting the sense that there are a lot of opinions on what
replacement system to use, a bit carelessly before having answered the
above questions thoroughly.


How did you get that impression? I'm currently *using* monotone for 
Postgres-R development, doing cvs_import and propagating to my branch. 
And I know others did the same already, too.


Regards

Markus


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Column storage positions

2007-02-22 Thread Zeugswetter Andreas ADI SD

> > In any case I think it's foolish not to tackle both issues at once.
> > We know we'd like to have both features and we know that 
> all the same 
> > bits of code need to be looked at to implement either.
> 
> I guess I disagree with that sentiment.  I don't think it's 
> necessary to bundle these two features together, even if some 
> analysis will be duplicated between them, since they are 
> completely distinct in a functional sense and will touch 
> different places in the code.

I fully agree with Phil here. 

And I also see a lot of unhappiness from users of system tables 
when column numbers all over the system tables would not be logical
column 
positions any more.

Andreas


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-22 Thread Zeugswetter Andreas ADI SD

> > > Yes, that's one option. Though given a choice I would rather waste four
> > > bytes in the heap-page than insert a new index entry.
> >
> > No question about that. My point was that it would mean wasting the 2
> > (2 must be enough for a slot pointer) bytes on every heap tuple, hot
> > or not. And then the decision is not so obvious anymore.
> > If you don't have the room for the back pointer on every slot, there
> > is no room to add one later.
> >
> >
> Oh yes, I agree. I was referring to the idea of line pointer 
> redirection which would waste four bytes (for the root line 
> pointer)  per hot-update chain. That occurs only when a tuple 
> is hot-updated.
> So there is no overhead for normal tuples.

I think you are still misunderstanding me, sorry if I am not being clear
enough. When the row is hot-updated it is too late. You do not have room
in the root for the line pointer.

Say a table's tuple size is typically 40 bytes. Your root slot has room
for exactly 40 bytes. When the new tuple also (as expected) needs 40 bytes,
there is no room for the line pointer.

Or are you suggesting that vacuum resizes all root tuples to make room for
the extra bytes, and only then you can use it? Imho that would not be good.

Imho the "relink root to newest dead tuple during update" path is more
promising, and does not depend on vacuum.

If we do reuse dead tuples without vacuum we should probably, as already
suggested, disconnect the "what is dead enough for reuse/vacuum" from
global xmin right from the start.

Andreas


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Column storage positions

2007-02-22 Thread Robert Treat
On Thursday 22 February 2007 09:06, Phil Currier wrote:
> On 2/22/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> > Andrew Dunstan <[EMAIL PROTECTED]> writes:
> > > Alvaro Herrera wrote:
> > >> Right, I'm not advocating not doing that -- I'm just saying that the
> > >> first step to that could be decoupling physical position with attr id
> > >>
> > >> :-) Logical column ordering (the order in which SELECT * expands to)
> > >>
> > >> seems to me to be a different feature.
> > >
> > > Except in the sense that divorcing the id from the storage order makes
> > > it possible to do sanely. :-)
> >
> > They are different features, but they are going to hit all the same
> > code, because the hardest part of this remains making sure that every
> > piece of the code is using the right kind of column number.  The
> > suggestion I posted awhile ago amounts to saying that we might be able
> > to solve that by default, by making sure that only one definition of
> > "column number" is relevant to the majority of the backend and we can
> > figure out exactly where the other definitions need to apply.  But
> > that's handwaving until someone actually does it :-(
>
> I don't really think it's just handwaving at this point because I've
> done a lot of it :).  I'm not saying the work is done, or that a lot
> more testing isn't required, but at the moment I have a working system
> that seems to do what it needs to do to separate storage position from
> permanent ID/display position.  And the changes to accomplish this
> were quite localized - namely the tuple access routines in
> heaptuple.c, and the small handful of places that need to construct
> tuple descriptors.  That's pretty much it - the rest of the codebase
> remains untouched.
>
> > In any case I think it's foolish not to tackle both issues at once.
> > We know we'd like to have both features and we know that all the same
> > bits of code need to be looked at to implement either.
>
> I guess I disagree with that sentiment.  I don't think it's necessary
> to bundle these two features together, even if some analysis will be
> duplicated between them, since they are completely distinct in a
> functional sense and will touch different places in the code.
> Smaller, more incremental changes make more sense to me.
>

Can you post a patch of what you have now to -patches? 

> But if both-features-at-once is what the community wants, that's fine,
> no worries.  I'll just pull my own personal hat out of the ring until
> someone comes along who's interested in implementing them both at the
> same time.
>

Are you that opposed to working on the display portions as well?  You'll be a 
hero to thousands of mysql users if you do it. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Markus Schiltknecht

Hi,

Andrew Dunstan wrote:
If we are worried about the size of the transition table and keeping it 
in cache (see remarks from Tom upthread) then adding more keywords seems 
a bad idea, as it will surely expand the table.  OTOH, I'd hate to make 
that a design criterion. 


Yeah, me too. Especially because it's an implementation issue against 
ease of use. (Or can somebody convince me that functions would provide a 
simple interface?)


My main worry has been that the grammar would 
be stable.


You mean stability of the grammar for the new additions or for all the 
grammar? Why are you worried about that?


Just to quantify all this, I did a quick check on the grammar using 
bison -v - we appear to have 473 terminal symbols, and 420 non-terminal 
symbols in 1749 rules, generating 3142 states. The biggest tables 
generated are yytable and yycheck, each about 90kb on my machine.


That already sounds somewhat better than Tom's 300 kb. And considering 
that these caches most probably grow faster than our grammar...


Regards

Markus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Pavel Stehule

   CREATE FULLTEXT CONFIGURATION myfts LIKE template_cfg AS DEFAULT;
   SELECT add_fulltext_config('myfts', 'template_cfg', True);



That's simple, but what about
CREATE FULLTEXT MAPPING ON cfgname FOR lexemetypename[, ...] WITH 
dictname1[, ...];

?

SELECT create_fulltext_mapping(cfgname, '{lexemetypename[, ...]}'::text[],
'{dictname1[, ...]}'::text[]);

Seems rather ugly for me...


Functions maybe don't seem effective, but users don't have to learn any new syntax.

SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) 
is readable.


I agree that enhancing the parser with a non-standard construct isn't good.

And the function interface does not provide autocompletion and online help in 
psql. \df shows only the types of the arguments, not their meaning.


Yes, I miss better support for functions in psql too. But that's a different
topic. I don't see a reason why \h couldn't support functions better.

Nice a day
Pavel Stehule




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] XLOG_NO_TRAN and XLogRecord.xl_xid

2007-02-22 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Florian G. Pflug wrote:
>> * Note: xlog record is marked as outside transaction control, since we
>> * want it to be redone whether the invoking transaction commits or not.

> That comment is a bit misleading, I agree. We don't skip xlog entries, 
> they're always replayed.

Yeah, this distinction is another bit of effectively-dead code left over
from Vadim's original plan of using WAL for UNDO.  I haven't worried
about ripping it out because it doesn't cost much and it seems that
distinguishing transactional from nontransactional changes might be
useful for log analysis if nothing else.

> Yep, that's right. The reconstructed page is not always byte-to-byte 
> identical to the original.

We don't worry about recovering cmin/cmax since only the originating
transaction would have cared.  I think physical location of tuples on
a page isn't reliably reproduced either.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] XLOG_NO_TRAN and XLogRecord.xl_xid

2007-02-22 Thread Heikki Linnakangas

Heikki Linnakangas wrote:

Florian G. Pflug wrote:

seems to imply that (some?) wal redo records only actually get redone
if the transaction that caused them eventually committed. But given the
way postgres MVCC works that doesn't make sense to me, and I also can't
find any code that would actually skip xlog entries.


That comment is a bit misleading, I agree. We don't skip xlog entries, 
they're always replayed.


The xid in the WAL record is used by some WAL resource managers to 
reconstruct the original data. For that purpose, it might as well not be 
in the header, but in the data portion.


It's also used in PITR to recover up to a certain transaction, and it's 
used to advance the next xid counter to the next unused xid after replay.


Also, we skip clog update and writing the commit record if the 
transaction hasn't written any WAL records that are tied to the transaction.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] XLOG_NO_TRAN and XLogRecord.xl_xid

2007-02-22 Thread Heikki Linnakangas

Florian G. Pflug wrote:

Hi

After further reading I fear I have to bother you with another question ;-)
There is a flag XLOG_NO_TRAN passed via the info parameter to XLogInsert.

Now, for example the following comment in clog.c
/*
 * Write a TRUNCATE xlog record
 *
 * We must flush the xlog record to disk before returning --- see notes
 * in TruncateCLOG().
 *
 * Note: xlog record is marked as outside transaction control, since we
 * want it to be redone whether the invoking transaction commits or not.
 */
static void
WriteTruncateXlogRec(int pageno)
...

seems to imply that (some?) wal redo records only actually get redone
if the transaction that caused them eventually committed. But given the
way postgres MVCC works that doesn't make sense to me, and I also can't
find any code that would actually skip xlog entries.


That comment is a bit misleading, I agree. We don't skip xlog entries, 
they're always replayed.


The xid in the WAL record is used by some WAL resource managers to 
reconstruct the original data. For that purpose, it might as well not be 
in the header, but in the data portion.


It's also used in PITR to recover up to a certain transaction, and it's 
used to advance the next xid counter to the next unused xid after replay.
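
For the nextXid part, a simplified sketch of what the replay loop does with
each record's xid; this fragment uses backend-internal symbols, so take it
as an illustration of the mechanism rather than standalone code:

/* Keep the next-XID counter ahead of every xid seen in the WAL stream. */
if (TransactionIdFollowsOrEquals(record->xl_xid,
                                 ShmemVariableCache->nextXid))
{
    ShmemVariableCache->nextXid = record->xl_xid;
    TransactionIdAdvance(ShmemVariableCache->nextXid);
}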



On a related note - Looking at e.g. heap_xlog_insert, it seems that
the original page (before the crash), and the one reconstructed via
heap_xlog_insert are "only" functionally equivalent, but not the same
byte-wise? At least this is what doing
HeapTupleHeaderSetCmin(htup, FirstCommandId);
seems to imply - surely the original command id could have been higher, no?


Yep, that's right. The reconstructed page is not always byte-to-byte 
identical to the original.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] XLOG_NO_TRAN and XLogRecord.xl_xid

2007-02-22 Thread Florian G. Pflug

Hi

After further reading I fear I have to bother you with another question ;-)
There is a flag XLOG_NO_TRAN passed via the info parameter to XLogInsert.

Now, for example the following comment in clog.c
/*
 * Write a TRUNCATE xlog record
 *
 * We must flush the xlog record to disk before returning --- see notes
 * in TruncateCLOG().
 *
 * Note: xlog record is marked as outside transaction control, since we
 * want it to be redone whether the invoking transaction commits or not.
 */
static void
WriteTruncateXlogRec(int pageno)
...

seems to imply that (some?) wal redo records only actually get redone
if the transaction that caused them eventually committed. But given the
way postgres MVCC works that doesn't make sense to me, and I also can't
find any code that would actually skip xlog entries.

On a related note - Looking at e.g. heap_xlog_insert, it seems that
the original page (before the crash), and the one reconstructed via
heap_xlog_insert are "only" functionally equivalent, but not the same
byte-wise? At least this is what doing
HeapTupleHeaderSetCmin(htup, FirstCommandId);
seems to imply - surely the original command id could have been higher, no?

greetings, Florian Pflug


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Andrew Dunstan

Teodor Sigaev wrote:
In that proposed syntax, I would drop all "=", ",", "(", and ")".  
They don't seem necessary and they are untypical for SQL commands.  
I'd compare with CREATE FUNCTION or CREATE SEQUENCE for SQL commands 
that do similar things.


I was looking at CREATE TYPE mostly. Removing "=", ",", "(", and
")" from CREATE/ALTER FULLTEXT would require adding several items to the
unreserved_keyword list, and growing gram.y with new rules
similar to OptRoleList instead of the

simple opt_deflist:
'(' def_list ')'   { $$ = $2; }
  | /*EMPTY*/  { $$ = NIL; }
  ;

Is it acceptable?
List of new keywords is: LOCALE, LEXIZE, INIT, OPT, GETTOKEN, 
LEXTYPES, HEADLINE


So, the syntax would be:

CREATE FULLTEXT DICTIONARY dictname
    LEXIZE lexize_function
    [ INIT init_function ]
    [ OPT opt_text ];

CREATE FULLTEXT DICTIONARY dictname
    [ { LEXIZE lexize_function | INIT init_function | OPT opt_text } [...] ]
    LIKE template_dictname;




If we are worried about the size of the transition table and keeping it 
in cache (see remarks from Tom upthread) then adding more keywords seems 
a bad idea, as it will surely expand the table.  OTOH, I'd hate to make 
that a design criterion. My main worry has been that the grammar would 
be stable.


Just to quantify all this, I did a quick check on the grammar using 
bison -v - we appear to have 473 terminal symbols, and 420 non-terminal 
symbols in 1749 rules, generating 3142 states. The biggest tables 
generated are yytable and yycheck, each about 90kb on my machine.


cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] SCMS question

2007-02-22 Thread Markus Schiltknecht

Hi,

Andrew Dunstan wrote:
CVSup is not required, and is absent from most existing clients. I don't 
use it any more since the Fedora project stopped supporting it.


...which is quite understandable, considering the PITA that compiling
Modula-3 gives you (or at least has given me; it still hurts).


The point you are missing is that, while we know existing buildfarm 
members all have CVS installed, we don't know that they have SVN or 
whatever, and requiring them to install it will involve significant 
distributed pain. 


Okay, I certainly agree that CVS is much more wide spread and available 
than most (if not all) other VCSes. Let's change that ;-)


It will also involve some considerable localised pain 
(probably on my part) in rewriting the client. Right now I'm thinking it 
might make some sense to future-proof buildfarm by creating some sort of 
snapshot server. OTOH, if we avoid use of whatever SCM system that the 
project uses, we aren't testing that part of the process.


Did I mention that monotone is a very good snapshot server? *duck*

You probably don't want to reinvent the weel, as 'snapshot serving' is 
exactly what a VCS should do (among other things).



You're making Tom's point again :-)


Yeah, sorry, couldn't resist :-)

IIRC you don't need to be connected to the repo to run "svn diff", 
whereas you do to run "cvs diff".


Yes, in the simplest case of comparing against the immediate successor 
revision. But certainly not for: svn diff -r${FURTHER_IN_THE_PAST}, as 
subversion does not have that data available (nor does CVS, for that 
matter).


We know the warts. If this were a green fields project there is no doubt 
we would not use CVS. But many proponents of other systems ignore the 
downside of changing.


Well, I guess many advocates for other VCSes (like myself) simply don't 
particularly like to talk about the downsides...  But they are probably 
more aware of them than most other people.


One thing I want to know is that whatever we change to will still be 
there, maintained and in widespread use, many years down the track. So 
far I am not sure about that for any possible replacement, with the 
possible exception of SVN.


That's certainly a valid concern, too. Probably *the* one where monotone 
is weaker compared to git and mercurial. :-(  We are working on that 
issue, though.


Regards

Markus

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] What is CheckPoint.undo needed for?

2007-02-22 Thread Alvaro Herrera
Florian G. Pflug wrote:
> Heikki Linnakangas wrote:
> >No you're right, it's related to the WAL undo stuff that was never 
> >actually implemented. It's dead code.
> >
> >Teodor Sigaev wrote:
> >>Opps, sorry, I missed checkpoint keyword
> >>
> >>Teodor Sigaev wrote:
> >>>
> What am I missing?
> >>>Seems, it's about that
> >>>http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php
> 
> Thanks - everything makes much more sense now ;-)

So if you want to submit a patch to remove it, it'll make sense for
future readers as well ;-)

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] What is CheckPoint.undo needed for?

2007-02-22 Thread Florian G. Pflug

Heikki Linnakangas wrote:
No you're right, it's related to the WAL undo stuff that was never 
actually implemented. It's dead code.


Teodor Sigaev wrote:

Opps, sorry, I missed checkpoint keyword

Teodor Sigaev wrote:



What am I missing?

Seems, it's about that
http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php


Thanks - everything makes much more sense now ;-)

greetings, Florian Pflug

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] What is CheckPoint.undo needed for?

2007-02-22 Thread Heikki Linnakangas
No you're right, it's related to the WAL undo stuff that was never 
actually implemented. It's dead code.


Teodor Sigaev wrote:

Opps, sorry, I missed checkpoint keyword

Teodor Sigaev wrote:



What am I missing?

Seems, it's about that
http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php






--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] autovacuum next steps, take 2

2007-02-22 Thread Matthew T. O'Connor

Jim C. Nasby wrote:

On Wed, Feb 21, 2007 at 05:40:53PM -0500, Matthew T. O'Connor wrote:
  
My Proposal:  If we require admins to identify hot tables, then: 
1) Launcher fires-off a worker1 into database X.

2) worker1 deals with "hot" tables first, then regular tables.
3) Launcher continues to launch workers to DB X every autovac naptime. 
4) worker2 (or 3 or 4 etc...) sees it is alone in DB X, if so it acts as 
worker1 did above.  If worker1 is still working in DB X then worker2 
looks for hot tables that are being starved because worker1 got busy. 
If worker2 finds no hot tables that need work, then worker2 exits.



Rather than requiring people to manually identify hot tables, what if we
just prioritize based on table size? So if a second autovac process hits
a specific database, it would find the smallest table in need of
vacuuming that it should be able to complete before the next naptime and
vacuum that. It could even continue picking tables until it can't find
one that it could finish within the naptime. Granted, it would have to
make some assumptions about how many pages it would dirty.

ISTM that's a lot easier than forcing admins to mark specific tables.


So the heuristic would be:
* Launcher fires off workers into a database at a given interval 
(perhaps configurable?)
* Each worker works on tables in size order. 
* If a worker ever catches up to an older worker, then the younger 
worker exits.


This sounds simple and workable to me, perhaps we can later modify this 
to include some max_workers variable so that a worker would only exit if 
it catches an older worker and there are max_workers currently active.
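
A tiny sketch of that heuristic (purely illustrative; the struct and
function names are invented, and a real implementation would read this
state from shared memory):

#include <stdbool.h>
#include <time.h>

typedef struct WorkerProgress
{
    time_t  started_at;     /* when this worker was launched */
    int     table_index;    /* position in the size-ordered table list */
} WorkerProgress;

/*
 * A younger worker exits as soon as it reaches the table an older
 * (earlier-started) worker is currently processing.
 */
static bool
younger_worker_should_exit(const WorkerProgress *me,
                           const WorkerProgress *others, int nothers)
{
    for (int i = 0; i < nothers; i++)
    {
        if (others[i].started_at < me->started_at &&
            me->table_index >= others[i].table_index)
            return true;
    }
    return false;
}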



Thoughts?

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] SCMS question

2007-02-22 Thread Andrew Dunstan

Markus Schiltknecht wrote:

Hi,

Andrew Dunstan wrote:
1. The buildfarm is very heavily dependent on CVS, and any change to 
anything else will be quite painful. There is no guarantee that all 
the members even have SVN installed,


But you can guarantee they have CVS or even cvsup installed? That 
seems dubious to me.


CVSup is not required, and is absent from most existing clients. I don't 
use it any more since the Fedora project stopped supporting it.


Buildfarm was designed to be able to run anywhere that a build from our 
repo could run, without requiring anything extra - I have even tried to 
keep to a minimum the perl modules required.


The point you are missing is that, while we know existing buildfarm 
members all have CVS installed, we don't know that they have SVN or 
whatever, and requiring them to install it will involve significant 
distributed pain. It will also involve some considerable localised pain 
(probably on my part) in rewriting the client. Right now I'm thinking it 
might make some sense to future-proof buildfarm by creating some sort of 
snapshot server. OTOH, if we avoid use of whatever SCM system that the 
project uses, we aren't testing that part of the process.




let alone anything else. And someone would have to code and test 
significant client changes. That said, a lot of the tortuous logic 
could be removed - change detection would almost just resolve to 
comparing two tree numbers with SVN, for example.


..and a *real* VCS (as in monotone :-) ) would provide not only that, 
but give you correctness guarantees, built in certification of 
revisions (i.e. each buildfarm member could issue a cert on successful 
testing) and lightweight branches, so you could much easier test 
experimental patches of different authors. Just to name a few 
additional advantages.



You're making Tom's point again :-)

Of course, SVN is better at disconnected operation than CVS, 


Really? I've dropped subversion exactly because it sucks big time when 
disconnected. But again, I'm probably not comparing against CVS...



IIRC you don't need to be connected to the repo to run "svn diff", 
whereas you do to run "cvs diff".




I have no doubt we'll change someday to something better. I don't 
know what it is and I don't think we need to be in any hurry. This 
space is still very fluid.


Yup. Good to hear you see it that way. As I understand, you have good 
reasons to be still using CVS, but are open to good suggestions. 
That's a very good thing, but easily slips by when reading all the 
critics and pro-CVS statements. ;-)



We know the warts. If this were a green fields project there is no doubt 
we would not use CVS. But many proponents of other systems ignore the 
downside of changing.


One thing I want to know is that whatever we change to will still be 
there, maintained and in widespread use, many years down the track. So 
far I am not sure about that for any possible replacement, with the 
possible exception of SVN.



cheers

andrew


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] autovacuum next steps, take 2

2007-02-22 Thread Jim C. Nasby
On Wed, Feb 21, 2007 at 05:40:53PM -0500, Matthew T. O'Connor wrote:
> My Proposal:  If we require admins to identify hot tables, then: 
> 1) Launcher fires-off a worker1 into database X.
> 2) worker1 deals with "hot" tables first, then regular tables.
> 3) Launcher continues to launch workers to DB X every autovac naptime. 
> 4) worker2 (or 3 or 4 etc...) sees it is alone in DB X, if so it acts as 
> worker1 did above.  If worker1 is still working in DB X then worker2 
> looks for hot tables that are being starved because worker1 got busy. 
> If worker2 finds no hot tables that need work, then worker2 exits.

Rather than requiring people to manually identify hot tables, what if we
just prioritize based on table size? So if a second autovac process hits
a specific database, it would find the smallest table in need of
vacuuming that it should be able to complete before the next naptime and
vacuum that. It could even continue picking tables until it can't find
one that it could finish within the naptime. Granted, it would have to
make some assumptions about how many pages it would dirty.

ISTM that's a lot easier than forcing admins to mark specific tables.
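
A sketch of that selection loop (invented names and a crude throughput
estimate; only meant to illustrate the "pick what fits in one naptime"
idea):

typedef struct TableToVacuum
{
    const char *name;
    double      pages;          /* estimated pages the vacuum would touch */
} TableToVacuum;

/*
 * From a list of tables needing vacuum, sorted smallest first, pick as many
 * as are expected to fit into one naptime.  Returns how many were picked.
 */
static int
pick_tables(const TableToVacuum *tabs, int ntabs,
            double pages_per_second, double naptime_seconds,
            const TableToVacuum **picked)
{
    double  budget = pages_per_second * naptime_seconds;
    int     npicked = 0;

    for (int i = 0; i < ntabs; i++)
    {
        if (tabs[i].pages > budget)
            break;              /* nothing larger will fit either */
        picked[npicked++] = &tabs[i];
        budget -= tabs[i].pages;
    }
    return npicked;
}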
-- 
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] What is CheckPoint.undo needed for?

2007-02-22 Thread Teodor Sigaev

Opps, sorry, I missed checkpoint keyword

Teodor Sigaev wrote:



What am I missing?

Seems, it's about that
http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php



--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] SCMS question

2007-02-22 Thread Markus Schiltknecht

Hi,

[ I've CCed the monotone-devel list, as I'm sure those people are 
interested, too. ]


Stefan Kaltenbrunner wrote:
Beside that - are all of the currently supported Platforms officially 
supported by the proposed SCMSes ?


I can only speak for monotone. We have (had) buildbots for x86 (linux, 
netbsd, freebsd, win32), amd64 (linux), ppc (osx) and one sparc (osol). 
So far all gcc compiled, AFAIK.


We are very interested in increasing portability of monotone. If you 
could give me (or other monotone devels) ssh access to some of the more 
obscure boxes, that would help a lot. Please contact me privately.


most of the issues with CVS in that regard have already been worked 
around (and are therefore "solved").


Huh? How do you guarantee the correctness of a local checkout? At best, 
you can check an md5 sum of a tar archive, but CVS itself does almost no 
integrity checking. Does the buildfarm code check somehow? Against what? 
(Note that we've already had quite some disk failures uncovered by 
monotone, which does extensive integrity checking. But I'm sure database 
people know how important that is, don't you?)


Or quickly test experimental patches? Is that solved?

Or merging between branches, to add another major annoyance of CVS (and 
subversion, for that matter).


I currently fetch the whole PostgreSQL repository via cvsup and then 
import it into monotone to be able to do serious work. Of course that's 
possible, and you can work around all the other limitations of CVS 
somehow, but it's annoying.


But I agree that developers, especially those that are doing large
patches over a long period of time, might gain something from another
SCMS; it is just not really clear what that SCMS should be or if it
warrants the imho enormous switching costs (and the potential disruption
in development until that switch is done, which might take days if not
weeks).


I certainly agree that switching to another VCS is a major undertaking. 
And I'm working on easing migration to monotone. And I'll quite 
certainly try again to convince you again, *at some point in the 
future*. I would not vote for switching the PostgreSQL repository to 
monotone, yet. (As if I had a vote...;-) )


Regards

Markus


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] What is CheckPoint.undo needed for?

2007-02-22 Thread Teodor Sigaev



What am I missing?

Seems, it's about that
http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Column storage positions

2007-02-22 Thread Phil Currier

On 2/22/07, Tom Lane <[EMAIL PROTECTED]> wrote:

Andrew Dunstan <[EMAIL PROTECTED]> writes:
> Alvaro Herrera wrote:
>> Right, I'm not advocating not doing that -- I'm just saying that the
>> first step to that could be decoupling physical position with attr id
>> :-) Logical column ordering (the order in which SELECT * expands to)
>> seems to me to be a different feature.

> Except in the sense that divorcing the id from the storage order makes
> it possible to do sanely. :-)

They are different features, but they are going to hit all the same
code, because the hardest part of this remains making sure that every
piece of the code is using the right kind of column number.  The
suggestion I posted awhile ago amounts to saying that we might be able
to solve that by default, by making sure that only one definition of
"column number" is relevant to the majority of the backend and we can
figure out exactly where the other definitions need to apply.  But
that's handwaving until someone actually does it :-(


I don't really think it's just handwaving at this point because I've
done a lot of it :).  I'm not saying the work is done, or that a lot
more testing isn't required, but at the moment I have a working system
that seems to do what it needs to do to separate storage position from
permanent ID/display position.  And the changes to accomplish this
were quite localized - namely the tuple access routines in
heaptuple.c, and the small handful of places that need to construct
tuple descriptors.  That's pretty much it - the rest of the codebase
remains untouched.
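
To picture the split, here is a minimal, self-contained C sketch (all
struct, field, and function names are invented for illustration and are
not the ones used in the patch): the descriptor carries the permanent
attribute number plus a separate storage position, and only the
tuple-deforming loop cares about the physical order.

#include <stdio.h>

typedef struct AttrDesc
{
    int         attnum;      /* permanent ID (and display position here) */
    int         storagepos;  /* physical position in the on-disk tuple   */
    const char *attname;
} AttrDesc;

/* Build a storage-order -> descriptor-index map, once per tuple descriptor. */
static void
build_storage_map(const AttrDesc *atts, int natts, int *storage_to_att)
{
    for (int i = 0; i < natts; i++)
        storage_to_att[atts[i].storagepos - 1] = i;
}

int
main(void)
{
    /* "CREATE TABLE t (a int, b text, c int)", but stored as (a, c, b). */
    AttrDesc    atts[] = {
        {1, 1, "a"},
        {2, 3, "b"},
        {3, 2, "c"},
    };
    int         storage_to_att[3];

    build_storage_map(atts, 3, storage_to_att);

    /* A deform-tuple style loop walks the attributes in physical order... */
    for (int pos = 0; pos < 3; pos++)
        printf("storage slot %d holds column \"%s\" (attnum %d)\n",
               pos + 1,
               atts[storage_to_att[pos]].attname,
               atts[storage_to_att[pos]].attnum);

    /* ...while SELECT * and clients keep addressing columns by attnum. */
    return 0;
}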


In any case I think it's foolish not to tackle both issues at once.
We know we'd like to have both features and we know that all the same
bits of code need to be looked at to implement either.


I guess I disagree with that sentiment.  I don't think it's necessary
to bundle these two features together, even if some analysis will be
duplicated between them, since they are completely distinct in a
functional sense and will touch different places in the code.
Smaller, more incremental changes make more sense to me.

But if both-features-at-once is what the community wants, that's fine,
no worries.  I'll just pull my own personal hat out of the ring until
someone comes along who's interested in implementing them both at the
same time.

phil

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Teodor Sigaev

CREATE FULLTEXT CONFIGURATION myfts LIKE template_cfg AS DEFAULT;
SELECT add_fulltext_config('myfts', 'template_cfg', True);

That's simple, but what about
CREATE FULLTEXT MAPPING ON cfgname FOR lexemetypename[, ...] WITH dictname1[, ...];
?

SELECT create_fulltext_mapping(cfgname, '{lexemetypename[, ...]}'::text[],
'{dictname1[, ...]}'::text[]);

Seems rather ugly to me...

And the function interface does not provide autocompletion or online help in psql; 
\df shows only the argument types, not their meaning.



--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] What is CheckPoint.undo needed for?

2007-02-22 Thread Florian G. Pflug

Hi

I'm trying to gain a better understanding of how the postgres
xlog works - especially about the corner cases of wal replay.

One thing that I do not understand is what CheckPoint.undo is
used for. I grepped through the source and see only a few
references to it, which either just print it, or set it to
the checkpoint location or the redo location.

The only place where it is checked is
if (XLByteLT(checkPoint.undo, RecPtr) || XLByteLT(checkPoint.redo, RecPtr))
but I don't see the reason for this check, given that undo never
seems to be assigned any interesting value.
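
For reference, XLByteLT is just an ordering comparison on WAL positions;
a minimal compilable sketch (reconstructed from memory of xlogdefs.h, so
treat the exact layout as an assumption) is:

#include <stdint.h>
#include <stdio.h>

typedef uint32_t uint32;

typedef struct XLogRecPtr
{
    uint32      xlogid;     /* log file #, 0 based             */
    uint32      xrecoff;    /* byte offset within the log file */
} XLogRecPtr;

/* "a is strictly before b" in WAL order */
#define XLByteLT(a, b) \
    ((a).xlogid < (b).xlogid || \
     ((a).xlogid == (b).xlogid && (a).xrecoff < (b).xrecoff))

int
main(void)
{
    XLogRecPtr  redo = {3, 0x1000};
    XLogRecPtr  rec  = {3, 0x2000};

    /* The check quoted above simply asks whether checkPoint.undo or
     * checkPoint.redo lies strictly before RecPtr in WAL order. */
    printf("redo < rec: %d\n", XLByteLT(redo, rec));
    return 0;
}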

Is there any in-depth documentation about the postgres xlog
implementation?
What am I missing?

greetings, Florian Pflug

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Teodor Sigaev
In that proposed syntax, I would drop all "=", ",", "(", and ")".  They 
don't seem necessary and they are untypical for SQL commands.  I'd 
compare with CREATE FUNCTION or CREATE SEQUENCE for SQL commands that 
do similar things.


I was looking at CREATE TYPE mostly. Removing "=", ",", "(", and ")" from 
CREATE/ALTER FULLTEXT means adding several items to the unreserved_keyword 
list, and growing gram.y with new rules similar to OptRoleList instead of the
simple opt_deflist:
'(' def_list ')'   { $$ = $2; }
  | /*EMPTY*/  { $$ = NIL; }
  ;

Is it acceptable?
The list of new keywords is: LOCALE, LEXIZE, INIT, OPT, GETTOKEN, LEXTYPES, HEADLINE.

So, syntax will be
CREATE FULLTEXT DICTIONARY dictname
LEXIZE  lexize_function
[ INIT init_function ]
[ OPT opt_text ];

CREATE FULLTEXT DICTIONARY dictname
   [ { LEXIZE  lexize_function | INIT init_function | OPT opt_text } [...] ]
LIKE template_dictname;

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-22 Thread Pavan Deolasee

On 2/22/07, Zeugswetter Andreas ADI SD <[EMAIL PROTECTED]> wrote:


>
> Yes, that's one option. Though given a choice I would rather waste
> four bytes in the heap page than insert a new index entry.

No question about that. My point was that it would mean wasting
those 2 bytes (2 must be enough for a slot pointer) on every heap tuple,
hot or not. And then the decision is not so obvious anymore.
If you don't have room for the back pointer on every slot,
there is no room to add one later.



Oh yes, I agree. I was referring to the idea of line pointer redirection,
which would waste four bytes (for the root line pointer) per
hot-update chain. That occurs only when a tuple is hot-updated,
so there is no overhead for normal tuples. Also, since it's outside
the tuple header, we don't have issues of additional space wastage
because of alignment.

We would need to teach the code to ignore all such pointers which
don't point to a real tuple but only redirect us to another line pointer.
We arrive at this line pointer from the index and then get redirected
to another line pointer on the same page. Vacuum would need to
delay freeing this line pointer until the hot-update chain is dead.
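
To make that concrete, here is a minimal, self-contained C sketch (all
type and function names are invented for illustration; this is not the
actual patch) of arriving at the root line pointer from an index entry
and being redirected to another line pointer on the same page:

#include <stdbool.h>
#include <stdio.h>

typedef struct LinePointer
{
    bool    redirect;   /* true: "offset" names another line pointer slot */
    int     offset;     /* redirect target slot, or tuple offset in page  */
} LinePointer;

/* Follow redirects until we reach a slot that holds a real tuple. */
static int
resolve_root(const LinePointer *lp, int slot)
{
    while (lp[slot].redirect)
        slot = lp[slot].offset;
    return slot;
}

int
main(void)
{
    /* Slot 0 is the root the index entry points at; after a HOT update
     * its heap tuple is gone and the slot merely redirects to slot 2,
     * costing only the root line pointer itself. */
    LinePointer lp[] = {
        {true, 2},      /* root: redirected          */
        {false, 100},   /* some unrelated tuple      */
        {false, 140},   /* live member of the chain  */
    };

    printf("index -> slot 0 -> tuple at slot %d\n", resolve_root(lp, 0));
    return 0;
}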

I am waiting for feedback on this since I would like to work on
this next. Does anybody see any issue with this approach? Comments
please.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] SCMS question

2007-02-22 Thread Stefan Kaltenbrunner

Markus Schiltknecht wrote:

Hi,

Andrew Dunstan wrote:
1. The buildfarm is very heavily dependent on CVS, and any change to 
anything else will be quite painful. There is no guarantee that all 
the members even have SVN installed,


But you can guarantee they have CVS or even cvsup installed? That seems 
dubious to me.


getting CVS onto a box is still way easier than SVN (I don't even want 
to talk about the more esoteric ones), especially on older and/or special 
platforms. As someone who operates a large number of buildfarm members, 
switching to something else would put a large burden on me (both in terms 
of installation and of configuration changes/upgrades of the buildfarm 
client) for no apparent gain.
Besides that - are all of the currently supported platforms officially 
supported by the proposed SCMSes?




let alone anything else. And someone would have to code and test 
significant client changes. That said, a lot of the tortuous logic 
could be removed - change detection would almost just resolve to 
comparing two tree numbers with SVN, for example.


..and a *real* VCS (as in monotone :-) ) would provide not only that, 
but give you correctness guarantees, built-in certification of revisions 
(i.e. each buildfarm member could issue a cert on successful testing) 
and lightweight branches, so you could much more easily test experimental 
patches from different authors. Just to name a few additional advantages.


most of the issues with CVS in that regard have already been worked 
around (and are therefore "solved").


But I agree that developers, especially those doing large patches over 
a long period of time, might gain something from another SCMS. It is 
not really clear, though, what that SCMS should be, or whether it 
warrants the IMHO enormous switching costs (and the potential disruption 
in development until the switch is done, which might take days if not 
weeks).



Stefan

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] SCMS question

2007-02-22 Thread Markus Schiltknecht

Hi,

Andrew Dunstan wrote:
1. The buildfarm is very heavily dependent on CVS, and any change to 
anything else will be quite painful. There is no guarantee that all the 
members even have SVN installed,


But you can guarantee they have CVS or even cvsup installed? That seems 
dubious to me.


let alone anything else. And someone 
would have to code and test significant client changes. That said, a lot 
of the tortuous logic could be removed - change detection would almost 
just resolve to comparing two tree numbers with SVN, for example.


..and a *real* VCS (as in monotone :-) ) would provide not only that, 
but give you correctness guarantees, built-in certification of revisions 
(i.e. each buildfarm member could issue a cert on successful testing) 
and lightweight branches, so you could much more easily test experimental 
patches from different authors. Just to name a few additional advantages.


2. Many people (and some buildfarm members) operate against mirrors of 
the main repo which are created with rsync or CVSup. I am not aware of 
any way to do the equivalent with SVN -  any info would be gratefully 
received. 


You might want to have a look at svk. It can do exactly that. And 
Thomas's blog explains how; see [1].


Of course, SVN is better at disconnected operation than CVS, 


Really? I've dropped subversion exactly because it sucks big time when 
disconnected. But again, I'm probably not comparing against CVS...


I have no doubt we'll change someday to something better. I don't know 
what it is and I don't think we need to be in any hurry. This space is 
still very fluid.


Yup. Good to hear you see it that way. As I understand it, you have good 
reasons to still be using CVS, but are open to good suggestions. That's 
a very good thing, but it easily slips by when reading all the criticism 
and pro-CVS statements. ;-)


Regards

Markus

[1]: Remote Backup Of A Subversion Repository
http://moelhave.dk/2006/07/remote-mirroring-a-subversion-svn-repository/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-22 Thread Zeugswetter Andreas ADI SD

> > Imho we should follow the swing idea.
> 
> 
> Yes, that's one option. Though given a choice I would rather waste 
> four bytes in the heap page than insert a new index entry.

No question about that. My point was that it would mean wasting 
those 2 bytes (2 must be enough for a slot pointer) on every heap tuple,
hot or not. And then the decision is not so obvious anymore.
If you don't have room for the back pointer on every slot,
there is no room to add one later.

Andreas

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] Grouped Index Tuples

2007-02-22 Thread Heikki Linnakangas
I've brought the GIT patch up-to-date with CVS head. The latest version 
can be found at http://community.enterprisedb.com/git/


I also reran the CPU bound test cases with the latest patch.

I want this in 8.3 in some form, and I have the time to do any required 
changes. If someone wants to see more tests, I can arrange that as well.


The patch is pretty big at the moment. I think the best way to proceed 
with this is to extract some smaller, incremental patches from it that 
just refactor the current b-tree code. After that, the final patch that 
implements GIT should be much smaller and more readable. And there's 
still a bunch of todo items there as well...


But before I start doing that, I need some review and general agreement 
on the design. What I don't want to happen is that three days after the 
feature freeze, someone finally looks at it and finds a major issue or 
just thinks it's an unreadable mess, and we no longer have the time to 
fix it.


One question that I'm sure someone will ask is do we need this if we 
have bitmap indexes? Both aim at having a smaller index, after all. The 
use cases are quite different; GIT is effective whenever you have a 
table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's 
effectiveness doesn't depend on the number of distinct values, in 
particular it works well with unique indexes. GIT is comparable to 
clustered indexes in other DBMSs (in fact we might want to call GIT that 
in the end).


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] tsearch in core patch, for inclusion

2007-02-22 Thread Markus Schiltknecht

Hi,

Peter Eisentraut wrote:

Oleg Bartunov wrote:

It's not so big addition to the gram.y, see a list of commands
http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html.


As we're still discussing the syntax: is there a proposal for what a 
function-based syntax would look like?


CREATE FULLTEXT CONFIGURATION myfts LIKE template_cfg AS DEFAULT;

just seems so much more SQL-like than:

SELECT add_fulltext_config('myfts', 'template_cfg', True);

I admit that's a very simple and not fully thought-through example. But as 
long as those who prefer not to extend the grammar don't come up with a 
better alternative syntax, one easily gets the impression that extending 
the grammar in general is evil.


In that proposed syntax, I would drop all "=", ",", "(", and ")".  They 
don't seem necessary and they are untypical for SQL commands.  I'd 
compare with CREATE FUNCTION or CREATE SEQUENCE for SQL commands that 
do similar things.


Yup, I'd second that.

Regards

Markus


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] SCMS question

2007-02-22 Thread Andrew Dunstan

Tom Lane wrote:

Gregory Stark <[EMAIL PROTECTED]> writes:
  

If we want to minimize the pain of changing and keep the same mode of
operation Subversion is definitely the right choice. Its goal was to provide
the same operational model as CVS and fix the implementation and architectural
problems.



Erm ... but this is not an argument in favor of changing.

AFAIR the only real disadvantage of CVS that we've run up against is
that it's hard to shuffle files around to different directories without
losing their change history (or more accurately, making the history
harder to find).  Now that is a pretty considerable annoyance on some
days, but it's not sufficient reason to change to something else.
I have no doubt that every other SCMS has annoyances of its own.

  


Oh, goody! My favourite non-productive debate! :-)

I work daily with SVN, and it certainly has some of the CVS pain points 
fixed, plus one or two nice gadgets. It's annoyed me a couple of times 
too, although I can't remember exactly how.


Let me throw another couple of data points into the mix.

1. The buildfarm is very heavily dependent on CVS, and any change to 
anything else will be quite painful. There is no guarantee that all the 
members even have SVN installed, let alone anything else. And someone 
would have to code and test significant client changes. That said, a lot 
of the tortuous logic could be removed - change detection would almost 
just resolve to comparing two tree numbers with SVN, for example.


2. Many people (and some buildfarm members) operate against mirrors of 
the main repo which are created with rsync or CVSup. I am not aware of 
any way to do the equivalent with SVN -  any info would be gratefully 
received. Of course, SVN is better at disconnected operation than CVS, 
so it might be a non-issue for many. Even so, it might be a pity to have 
to forego the facility.


I have no doubt we'll change someday to something better. I don't know 
what it is and I don't think we need to be in any hurry. This space is 
still very fluid.


cheers

andrew


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-22 Thread Pavan Deolasee

On 2/22/07, Zeugswetter Andreas ADI SD <[EMAIL PROTECTED]> wrote:



> > I very much like Hannu's idea, but it does present some issues.
> >
> >
> I too liked Hannu's idea initially, but Tom raised a valid
> concern that it does not address the basic issue of root
> tuples. According to the idea, a DEAD root tuple can be used
> for a subsequent update of the same row.

If you are reusing the existing slot of a root tuple, how will that
slot likely have room for an extra pointer and a live tuple?
If the idea does not cover root reuse, we don't need pointers.



Hannu talked about using one of xmin/xmax for storing
back-pointers. There were objections to that since it breaks
the xmax/xmin matching robustness that we have today.



Imho we should follow the swing idea.



Yes, that's one option. Though given a choice I would rather waste
four bytes in the heap page than insert a new index entry.
The heap tuples can be vacuumed more easily than the index
entries which, if I am not mistaken, cannot be reused even after
being marked LP_DELETEd.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] Column storage positions

2007-02-22 Thread Simon Riggs
On Wed, 2007-02-21 at 16:57 -0300, Alvaro Herrera wrote:
> Andrew Dunstan escribió:
> > Simon Riggs wrote:
> > >
> > >I agree with comments here about the multiple orderings being a horrible
> > >source of bugs, as well as lots of coding even to make it happen at all
> > >http://archives.postgresql.org/pgsql-hackers/2006-12/msg00859.php
> > 
> > I thought we were going with this later proposal of Tom's (on which he's 
> > convinced me): 
> > http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php - if 
> > not I'm totally confused (situation normal). The current thread started 
> > with this sentence:
> > 
> > >Inspired by this thread [1], and in particular by the idea of storing
> > >three numbers (permanent ID, on-disk storage position, display
> > >position) for each column, I spent a little time messing around with a
> > >prototype implementation of column storage positions to see what kind
> > >of difference it would make.
> > 
> > I haven't understood Alvaro to suggest not keeping 3 numbers.
> 
> Right, I'm not advocating not doing that -- I'm just saying that the
> first step to that could be decoupling physical position with attr id
> :-) Logical column ordering (the order in which SELECT * expands to)
> seems to me to be a different feature.

Not disagreed. :-)

Something very, very simple seems most likely to be an effective
additional feature for 8.3. We can implement the 2/3 position version
for 8.4.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] SCMS question

2007-02-22 Thread Gavin Sherry
On Thu, 22 Feb 2007, Tom Lane wrote:

> Gregory Stark <[EMAIL PROTECTED]> writes:
> > If we want to minimize the pain of changing and keep the same mode of
> > operation Subversion is definitely the right choice. Its goal was to provide
> > the same operational model as CVS and fix the implementation and 
> > architectural
> > problems.
>
> Erm ... but this is not an argument in favor of changing.
>
> AFAIR the only real disadvantage of CVS that we've run up against is
> that it's hard to shuffle files around to different directories without
> losing their change history (or more accurately, making the history
> harder to find).  Now that is a pretty considerable annoyance on some
> days, but it's not sufficient reason to change to something else.
> I have no doubt that every other SCMS has annoyances of its own.

It's not a problem for the project but I personally experience pain with
CVS. I often want to take a bunch of commits and merge them into separate
trees (like Greenplum DB or my private bitmap index tree). This is a lot
easier with patch-set-based SCMs like darcs/monotone/git/etc.

Just my thoughts.

Gavin

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] SCMS question

2007-02-22 Thread Gregory Stark

"Peter Eisentraut" <[EMAIL PROTECTED]> writes:

> Gregory Stark wrote:
>> [on a related note, is something wrong with my cvs rsync tree or is
>> configure in the CVS repository? It's causing my patches to bloat
>> considerably since I added one line to configure.in]
>
> cat CVS/Entries

$ cat CVS/Entries
D/config
D/contrib
D/doc
D/src
/configure.in/1.501/Wed Feb 14 11:43:15 2007//
/COPYRIGHT/1.13/Thu Feb 15 14:31:14 2007//
/GNUmakefile.in/1.46/Thu Feb 15 14:31:14 2007//
/Makefile/1.13/Thu Feb 15 14:31:14 2007//
/README/1.32/Thu Feb 15 14:31:14 2007//
/README.CVS/1.3/Thu Feb 15 14:31:14 2007//
/aclocal.m4/1.18/Thu Feb 15 14:31:14 2007//
/configure/1.534/Wed Feb  7 00:28:54 2007//

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-22 Thread Zeugswetter Andreas ADI SD

> > I very much like Hannu's idea, but it does present some issues.
> >
> >
> I too liked Hannu's idea initially, but Tom raised a valid 
> concern that it does not address the basic issue of root 
> tuples. According to the idea, a DEAD root tuple can be used 
> for a subsequent update of the same row.

If you are reusing the existing slot of a root tuple, how will that
slot likely have room for an extra pointer and a live tuple?
If the idea does not cover root reuse, we don't need pointers.

Imho we should follow the swing idea. It should even be possible 
to point the root to the newest dead tuple during update (if it 
was the index path), and reuse an older dead slot from the chain. 
Then we can limit the chain to the number of potentially visible 
tuples + root + 2 without vacuum.

Andreas

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

