Re: [HACKERS] advisory locks and permissions

2006-09-21 Thread Dimitri Fontaine
On Thursday, September 21, 2006 at 01:52, Tom Lane wrote:
 Or we could try to do something about limiting the number of such locks
 that can be granted, but that seems nontrivial to tackle at such a late
 stage of the devel cycle.

 Thoughts?

What about reserving some part of the shared lock space, the way ext2
reserves some disk space for root in case of emergency?

I don't know how easy (or how error-prone) that would be to do, though.

On Thursday, September 21, 2006 at 16:22, Tom Lane wrote:
 Another reason for restricting access to the advisory-lock functions
 is that an uninformed application might take the wrong locks, and
 bollix up your intended usage accidentally.

This sounds like one more attempt to protect against idiots, which the
universe tends to produce at a pretty quick rate :)
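
For instance, with the advisory lock functions as they stand in this devel
cycle (a sketch; any two applications picking the same key will do):

-- session A: application X serializes its batch runs on key 12345
SELECT pg_advisory_lock(12345);

-- session B: an unrelated application happens to pick the same key,
-- and now blocks application X's intended usage
SELECT pg_advisory_lock(12345);  -- waits until session A releases it

-- session A, later:
SELECT pg_advisory_unlock(12345);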

My 2¢,
-- 
Dimitri Fontaine
Technical Director
Tel: 06 74 15 56 53




Re: [HACKERS] [PATCH] Cleanup of GUC units code

2008-09-04 Thread Dimitri Fontaine
On Thursday, September 4, 2008, Robert Treat wrote:
 To paraphrase, if you can't write a config file correctly before
 restarting, I do not want you anywhere near any instance of a production
 system

Do you really want the TCO of PostgreSQL to rise that much when the software
could help lower it?

If you're a shop that can't staff the level 1 night shift with experts in
every domain, maybe you'd still like those people to be able to follow
procedures that involve a server config change.
On what grounds should that not be possible?
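
For the record, the units question shows up at the SQL level too; a sketch of
the kind of spelling variation at issue (whether variants like the second
line are accepted is part of what the cleanup patch touches):

SET work_mem = '64MB';   -- accepted today
SET work_mem = '64 mb';  -- spelling variants like this are the patch's domain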

Regards,
-- 
dim




Re: [HACKERS] reducing statistics write overhead

2008-09-07 Thread Dimitri Fontaine


Hi,

On September 7, 2008 at 00:45, Tom Lane wrote:

I dislike the alternative of communicating through shared memory,
though.  Right now the stats collector isn't even connected to shared
memory.


Maybe Markus Wanner's work on Postgres-R internal messaging, now that it has
been reworked to follow your advice, could be of some use here?

  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01114.php
  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01420.php

Regards,
--
dim



--
Dimitri Fontaine
PostgreSQL DBA, Architect





Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-09 Thread Dimitri Fontaine
Hi,

On Tuesday, September 9, 2008, Heikki Linnakangas wrote:
 The tricky part is, how does A know if it should wait, and for how long?
 commit_delay sure isn't ideal, but AFAICS the log shipping proposal
 doesn't provide any solution to that.

It might just be that I'm not understanding what it's all about, but it seems
to me that with a WALSender, process A will wait whatever happens, either
until the WAL is sent to the slave or until it is written to disk on the slave.

I naively read Simon's proposal as saying that Group Commit comes with this
new feature. A is already waiting (for some external event to complete), so
why can't we use that wait to include some other transactions' commits in the
local deal?

Regards,
-- 
dim




Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-09 Thread Dimitri Fontaine
On Tuesday, September 9, 2008, Markus Wanner wrote:
 ..and it will still have to wait until WAL is written to disk on the
 local node, as we do now. These are two different things to wait for.
 One is a network socket operation, the other is an fsync(). As these
 don't work together too well (blocking), you better run that in two
 different processes.

Exactly the point. The process is already waiting in all cases now, so maybe
we could just make it wait for some WALSender signal before issuing the
fsync(), and we would have Group Commit.
I'm not sure this is a good idea at all; it's just how I understand that
adding a WALSender process to the mix could give us the Group Commit feature
for free.

Regards,
-- 
dim




Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-09 Thread Dimitri Fontaine
On Tuesday, September 9, 2008, Simon Riggs wrote:
 If the WALWriter|Sender is available, it can begin the task immediately.
 There is no need for it to wait if you want synchronous behaviour.

OK. Now I'm as lost as anyone with respect to how you get Group Commit :)
-- 
dim




Re: [HACKERS] Synchronous Log Shipping Replication

2008-09-10 Thread Dimitri Fontaine
Hi,

On Wednesday, September 10, 2008, Heikki Linnakangas wrote:
 Sure. That's the fundamental problem with synchronous replication.
 That's why many people choose asynchronous replication instead. Clearly
 at some point you'll want to give up and continue without the slave, or
 kill the master and fail over to the slave. I'm wondering how that's
 different than the lag between master and server in asynchronous
 replication from the client's point of view.

As a future user of these new facilities, the difference from the client's
POV is simple: in normal operation, we want a strong guarantee that any
COMMIT has made it to both the master and the slave at commit time. No lag
whatsoever.

You're considering lag as an option in case of failure, but I don't see that
as acceptable when you need synchronous commit. On a network timeout, the
cluster is down. So you either continue servicing in degraded mode or take
the service down while you repair the cluster, but neither of those choices
can be transparent to the admins, I'd argue.

Of course, the main use case is high availability, which tends to mean you do
not have the option of stopping the service, and seems to dictate continuing
to service in degraded mode: the slave can't keep up (whatever the error
domain), the master is alone, so advertise that to the monitoring solutions
and continue servicing. And provide some way for the slave to rejoin, maybe,
too.

 I'm not sure I understand that paragraph. Who's the user? Do we need to
 expose some new information to the client so that it can do something?

Maybe with some GUCs to set the acceptable timeout for the WAL sync process,
and whether reaching the timeout is a warning or an error. With a USERSET GUC
we could even have replication-error-level transactions running concurrently
with non-critical ones...
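
Something like this, say (every name here is hypothetical, only to give the
proposal a shape):

SET replication_timeout = '5s';              -- how long COMMIT waits for the slave
SET replication_timeout_action = 'warning';  -- or 'error': is a timeout fatal?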

Now what to do exactly in case of error remains to be decided...

HTH, Regards,
-- 
dim




Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-11 Thread Dimitri Fontaine
On Thursday, September 11, 2008, Heikki Linnakangas wrote:
 Well, yes, but you can fall behind indefinitely that way. Imagine that
 each transaction on the slave lasts, say 10 minutes, with a new
 transaction starting every 5 minutes. On the master, there's a table
 that's being vacuumed (or HOT-updated) frequently, say after each
 transaction for simplicity. What can happen is that every transaction
 that finishes on the slave will only let the WAL replay advance by one
 XID before blocking on the snapshot of the next slave transaction. The
 WAL replay will advance at a rate of 0.2 TPM, while the master is
 generating 1.0 TPM.

What would prevent the slave from choosing to replay all of the currently
lagging WAL each time it's given the chance to advance a little?
-- 
dim




Re: [HACKERS] Transaction Snapshots and Hot Standby

2008-09-11 Thread Dimitri Fontaine
On Thursday, September 11, 2008, Csaba Nagy wrote:
 Well now that I think I understand what Heikki meant, I also think the
 problem is that there's no choice at all to advance, because the new
 queries will simply have the same snapshot as currently running ones as
 long as WAL replay is blocked... further blocking the WAL replay. When
 saying this I suppose that the snapshot is in fact based on the last
 recovered XID, and not on any slave-local XID. In that case once WAL
 recovery is blocked, the snapshot is stalled too, further blocking WAL
 recovery, and so on...

Well, it may be possible to instruct the WAL replay daemon to stop being
polite sometimes: when a given max_lag_delay is reached, it could take locks
and, as soon as it obtains them, replay all remaining WAL.
The max_lag_delay would be a GUC for setting the threshold between continuing
the replay and running queries.
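
In other words (max_lag_delay being the hypothetical GUC proposed above):

SET max_lag_delay = '60s';  -- past this replay lag, stop being polite to queries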

There could be other nice ideas that don't invent yet another GUC, but this
one was easier for me to think about ;)
-- 
dim




Re: [HACKERS] Where to Host Project

2008-09-19 Thread Dimitri Fontaine
On Thursday, September 18, 2008, David E. Wheeler wrote:
 So I'm wondering, given the various discussions of PostgreSQL module
 hosting in the past, where would be a good place to put a PostgreSQL
 module project? The things I would like to have are:

* SVN or git hosting (I've not used git, but would try it)
* Ability to hand out commit bits to other folks
* A project home page and/or wiki
* Good search results rankings in Google et al.
* Mail lists
* Bug tracking
* Release management

There's a French non-profit team offering those:
  http://tuxfamily.org/en/main

You can even take their open source hosting software and offer your own
services based on it, and/or extend their Perl code to add new features.
I tried to talk the pgFoundry admins into this solution in the past, but I
understand maintaining pgFoundry is a PITA.

HTH,
-- 
dim




Re: [HACKERS] Where to Host Project

2008-09-20 Thread Dimitri Fontaine


Hi,

On September 20, 2008 at 09:42, Dave Page wrote:
On Sat, Sep 20, 2008 at 8:37 AM, Joshua D. Drake  
[EMAIL PROTECTED] wrote:

Dave Page wrote:


Well that's not strictly true - I persuaded one of the GForge
developers to work on the upgrade. As far as I'm aware, we're still
waiting for the hardware/OS platform to be sorted out after some
initial problems. I suspect JD will tell me something different, though

- that being the case, perhaps we can work out the issues and get on
with the upgrade.


I suppose the plan is to upgrade to a newer GForge. Is there still time to
propose something completely different? I have heard really good feedback
about VHFFS, a Perl-based clean-room re-implementation of it, if you want to
see it that way.

  http://www.vhffs.org/wiki/index.php
  http://fr.wikipedia.org/wiki/VHFFS (easy to grasp keywords)

Hope this helps, regards,
--
dim





Re: [HACKERS] PostgreSQL future ideas

2008-09-20 Thread Dimitri Fontaine


Hi,

Who can resist the programming language game?

On September 19, 2008 at 22:37, D'Arcy J.M. Cain wrote:

On Fri, 19 Sep 2008 20:57:36 +0100
Dave Page [EMAIL PROTECTED] wrote:
On Fri, Sep 19, 2008 at 8:54 PM, Gevik Babakhani [EMAIL PROTECTED]  
wrote:
Has there been any idea to port PG to a more modern programming  
language
like C++? Of course there are some minor obstacles like a new OO  
design,


The plan is to start porting it to Java after the next release -
probably at the beginning of April.


I don't think that we should rush into any one language without
checking the alternatives.  Personally I think we should port  
everything

to Intercal.


May I recall Greenspun's Tenth Rule of Programming: "any sufficiently
complicated C or Fortran program contains an ad hoc, informally-specified,
bug-ridden, slow implementation of half of Common Lisp."


Regards,
--
dim





Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Dimitri Fontaine
On Monday, September 22, 2008, Andrew Dunstan wrote:
  You'd really want the latter anyway for some cases, ie, when you don't
  want the restore trying to hog the machine.  Maybe the right form for
  the extra option is just a limit on how many connections to use.  Set it
  to one to force the exact restore order, and to other values to throttle
  how much of the machine the restore tries to eat.

 My intention is to have single-thread restore remain the default, at
 least for this go round, and have the user be able to choose
 --multi-thread=nn to specify the number of concurrent connections to use.

What about make's famous -j option?

   -j [jobs], --jobs[=jobs]
Specifies the number of jobs (commands) to run simultaneously.  If
there  is  more than one -j option, the last one is effective.  If
the -j option is given without an argument, make  will  not  limit
the number of jobs that can run simultaneously.

Regards,
-- 
dim




Re: [HACKERS] parallel pg_restore

2008-09-22 Thread Dimitri Fontaine
On Monday, September 22, 2008, Joshua Drake wrote:
 I will not argue vehemently here, but I will say that "jobs" doesn't
 seem correct. The term "workers" seems more appropriate.

Mmmm, it sounds like it depends on the implementation (whether all workers
will share the same serializable transaction or just be independent jobs),
but my point here is more about giving the user a name they are used to.
As in: "oh, pg_restore -j, I see, thanks".

Now, if your argument is that the make concept of a "job" does not match the
parallel pg_restore concept of workers, I'll simply bow to your choice:
barring other limits, English not being my native language makes it hard for
me to follow there ;)

Regards,
-- 
dim




Re: [HACKERS] parallel pg_restore

2008-09-24 Thread Dimitri Fontaine
Hi,

On Tuesday, September 23, 2008, Andrew Dunstan wrote:
 In any case, my agenda goes something like this:

 * get it working with a basic selection algorithm on Unix (nearly
   done - keep your eyes open for a patch soon)
 * start testing
 * get it working on Windows
 * improve the selection algorithm
 * harden code

I'm not sure whether your work will feature single-table restore splitting,
but if that's the case, you could consider having a look at what I've done in
pgloader. The parallel loading work there was asked for by Simon Riggs and
Greg Smith, and lets you test two different parallel algorithms.
The aim was to have a simple testbed allowing PostgreSQL hackers to choose
what to implement in pg_restore, so I still hope it'll prove useful someday :)

Regards,
-- 
dim




Re: [HACKERS] parallel pg_restore

2008-09-24 Thread Dimitri Fontaine
On Wednesday, September 24, 2008, Andrew Dunstan wrote:
 No. The proposal will perform exactly the same set of steps as
 single-threaded pg_restore, but in parallel. The individual steps won't
 be broken up.

OK, good: a solid, trustworthy parallel restore, which is exactly what we
want. Just out of curiosity, do you plan to use the Postgres-R helper
backends infrastructure?

 Quite apart from anything else, parallel data loading of individual
 tables will defeat clustering, as well as making it impossible to avoid
 WAL logging of the load (which I have made provision for).

That depends on whether the different workers are able to work from the same
transaction or not, I imagine. Some work has been done to allow multiple
backends to work in the exact same transaction (Simon's snapclone and the
Postgres-R helper backend infrastructure), so one of them could TRUNCATE the
table and give the go signal for the workers to fill the table, all in the
same transaction.
OK, I need to wake up now... :)

 The fact that custom archives are compressed by default would in fact
 make parallel loading of individual tables' data difficult with the
 present format. We'd have to do something like expanding it on the
 client (which might not even have enough disk space) and then split it
 before loading it to the server. That's pretty yucky. Alternatively,
 each loader thread would need to start decompressing the data from the
 start and throw away data until it got to the point it wanted to start
 restoring from. Also pretty yucky.

Another alternative is the round-robin reader implemented in pgloader, where
all the archive reading is done by a single worker, which then splits what it
reads among any number of coworkers, filling the next queue(s) while the
previous one(s) are busy COPYing to the server.

 Far better would be to provide for multiple data members in the archive
 and teach pg_dump to split large tables as it writes the archive. Then
 pg_restore would need comparatively little adjustment.

Well, that's another possibility, but I tend to prefer having the parallelism
mechanics on the restore side of things. It may be only an illusion, but this
way I have far more trust in my backups.

 Also, of course, you can split tables yourself by partitioning them.
 That would buy you parallel data load with what I am doing now, with no
 extra work.

And that's excellent :)

 In any case, data loading is very far from being the only problem. One
 of my clients has long running restores where the data load takes about
 20% or so of the time - the rest is in index creation and the like. No
 amount of table splitting will make a huge difference to them, but
 parallel processing will. 

Oh yes, I'm running into this too (not on the same scale, but still).
Parallel seqscan should help create indexes in parallel without the disks
going crazy over read-write-read-write sequences, and POSIX advice
(posix_fadvise) should help a lot here too.
Does the dependency tracker in pg_restore allow FK creation to be treated as
dependent on the matching PK already being there?

 As against that, if your problem is in loading 
 one huge table, this won't help you much. However, this is not a pattern
 I see much - most of my clients seem to have several large tables plus a
 boatload of indexes. They will benefit a lot.

The use case given by Greg Smith at the time was loading a multi-terabyte
table on a RAID array with a lot of spindles. It then becomes impossible for
a single CPU to take full advantage of the available write bandwidth. No idea
how common this situation is in the field, though.

Regards,
-- 
dim




Re: [HACKERS] parallel pg_restore

2008-09-24 Thread Dimitri Fontaine


On September 24, 2008 at 18:56, Andrew Dunstan wrote:
This is purely a patch to pg_restore. No backend changes at all (and if I
did that, it would not use anything that isn't in core anyway).


OK, good.
I'm eager to see what -core hackers will want to do with the Postgres-R
patches, but that shouldn't be a reason to distract them, sorry...


Also, you ignored the point about clustered data. Maybe that doesn't  
matter to some people, but it does to others. This is designed to  
provide the same result as a single threaded pg_restore. Splitting  
data will break that.


I'm not sure I understand what you mean by clustered data here, in fact...


Having pg_dump do the split would mean you get it for free, pretty  
much. Rejecting that for a solution that could well be a bottleneck  
at restore time would require lots more than just a feeling. I don't  
see how it would give you any less reason to trust your backups.


Well, when pg_restore's COPY fails, the table is not loaded and you get an
ERROR, and if you're running with the -1 option, the restore stops there and
you get a nice ROLLBACK.
With this latter option, even if pg_dump did split your tables, the ROLLBACK
still happens.


Now, what happens when only one part of the data cannot be restored, but you
didn't pg_restore -1? I guess you're simply left with a partially restored
table. How will you know which part contains the error? How will you replay
the restore of that part only?


If the answer is to play with the restore catalogue, OK; if that's not it, I
feel the dumps are less trustworthy with the split option than they were
before.


Of course, all this remains hypothetical, as your work does not include such
a feature, which as we see is yet to be designed.


I still think the multiple data members of the archive approach  
would be best here. One that allowed you to tell pg_dump to split  
every nn rows, or every nn megabytes. Quite apart from any  
parallelism issues, that could help enormously when there is a data  
problem as happens from time to time, and can get quite annoying if  
it's in the middle of a humungous data load.


Agreed, but it depends a lot on the means to control which part failed,
IMVHO. And I think we'd prefer to have a version of COPY FROM with the
capability to continue loading on failure...


Regards,
--
dim





Re: [HACKERS] parallel pg_restore - WIP patch

2008-09-29 Thread Dimitri Fontaine
On Monday, September 29, 2008, Tom Lane wrote:
 * Extend the archive format to provide some indication that restoring
 this object requires exclusive access to these dependencies.

 * Hardwire knowledge into pg_restore that certain types of objects
 require exclusive access to their dependencies.

Well, it seems to me that currently the FK needs in terms of existing indexes
and locks, and some other objects' lock needs, are all hardwired. Is it even
safe to consider not hardwiring the locks needed by certain commands?

Provided I'm not all wrong here, I don't see how having something more
flexible at restore time than at build time is a win. The drawback is that
whenever you change a command's lock needs, you have to remember to teach
pg_restore about it too.

So my vote here is in favor of hardwired knowledge in pg_restore, matching
the target server code's assumptions and needs.

Regards,
-- 
dim




Re: [HACKERS] FSM rewrite committed, loose ends

2008-09-30 Thread Dimitri Fontaine
Hi,

On Tuesday, September 30, 2008, Heikki Linnakangas wrote:
 pg_relation_size() doesn't include the size of the FSM. Should it? I'm
 thinking no, but pg_total_relation_size() should.

What's practical about pg_relation_size() and pg_total_relation_size() as of
8.3 is that the difference is the cumulative index storage volume. Your
proposal makes it harder to get this information, but sounds good otherwise.
Would it be possible to add some new APIs too?
 a. pg_relation_size()
 b. pg_relation_fsm_size()
 c. pg_relation_indexes_size()
 d. pg_total_relation_size() = a + b + c

This scheme will need new functions for each new kind of fork, but I think
it's a good idea for the user to be able to know which fork is responsible
for what on-disk volume.

 VACUUM VERBOSE output no longer prints the number of pages with usable
 free space, because we no longer track such a value during the vacuum.
 You can use contrib/pg_freespacemap to view the contents of the FSM in
 detail, but should VACUUM VERBOSE still print something about the amount
 of free space on the relation? Perhaps the total amount of free space in
 the relation?

What about another function to get just this information?
 e. pg_relation_free_space()

Question for the slow readers: this new FSM scheme being dynamic, it's no
longer possible to have table bloat, right?
(Where table bloat means pages full of dead-for-any-transaction tuples, and
you have to CLUSTER or VACUUM FULL to be able to reuse the space they take.)

Regards, keep up the good (team) work :)
-- 
dim




Re: [HACKERS] FSM rewrite committed, loose ends

2008-09-30 Thread Dimitri Fontaine
On Tuesday, September 30, 2008, Heikki Linnakangas wrote:
 Dimitri Fontaine wrote:
  Question for the slow readers: this new FSM scheme being dynamic, it's no
  longer possible to have table bloat, right?
  (where table bloat is full of dead-for-any-transaction tuples, and you
  have to CLUSTER or VACUUM FULL to be able to reuse the space it takes)

 No, it didn't change that. Regular VACUUMing or autovacuum is still needed.

But IIUC, it's no longer possible for PostgreSQL to forget where the bloat
is, so regular vacuuming will know how to clean out any bloat at any time?

For example, a purge script doing a large DELETE could previously overrun the
max_fsm_pages setting, resulting in permanent bloat (until a table rewrite).
Such a large DELETE will now be tracked completely by the new FSM, so
autovacuum will be able to get the space reused later?
-- 
dim




Re: [HACKERS] FSM rewrite committed, loose ends

2008-09-30 Thread Dimitri Fontaine
On Tuesday, September 30, 2008, Heikki Linnakangas wrote:
  You forgot the toast size.

 Yeah, pg_total_relation_size() - pg_relation_size() is not equal to the
 total size of indexes because of that.

Oops. Thanks for pointing this out to me...

 But you can do SUM(pg_relation_size(index)) across all the indexes for
 that:

For convenience, would it be possible to have all of this provided by
PostgreSQL?
 a. pg_relation_size()
 b. pg_relation_toast_size()
 c. pg_relation_fsm_size()
 d. pg_relation_indexes_size()
 e. pg_total_relation_size() = a + b + c + d

Are there other things to add to the mix?

Maybe I'm the only one who needs some simple functions covering all the
bases, but I thought I'd ask nevertheless :)
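
Meanwhile, the SUM() workaround quoted above, spelled out as best I
understand it (a sketch):

-- total index volume for one table, summing pg_relation_size() over its indexes
SELECT sum(pg_relation_size(indexrelid))
  FROM pg_index
 WHERE indrelid = 'footable'::regclass;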

Regards,
-- 
dim




Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-30 Thread Dimitri Fontaine


Hi,

On September 30, 2008 at 20:03, Tom Lane wrote:

set_read_position(tupstore, local_read_position);
tuple = tuplestore_gettuple(tupstore, ...);
get_read_position(tupstore, local_read_position);

rather than just tuplestore_gettuple.  The set/get functions will be
cheap enough that this is no big deal.  (Or maybe we should just
provide a wrapper function that does this sequence?)


This seems to share some ideas with the MemoryContext concept: what about a
TupstoreContext associated with the tuplestore? You'd get a common default
one if you don't register your own, and would use

tuplestore_gettuple(MyTupstoreContext, ...);

Maybe some other APIs would benefit from the idea?

Regards,
--
dim




Re: [HACKERS] FSM rewrite committed, loose ends

2008-10-02 Thread Dimitri Fontaine
On Thursday, October 2, 2008, Heikki Linnakangas wrote:
 pg_relation_size('footable') for size of the main data fork
 pg_relation_size('footable', 'fsm') for FSM size

As good as possible, if you ask me!
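
So one could write, if I read the proposal right (a sketch):

SELECT pg_relation_size('footable')        AS heap_bytes,
       pg_relation_size('footable', 'fsm') AS fsm_bytes;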

Regards,
-- 
dim




Re: [HACKERS] Contrib, schema, and load_module

2008-10-11 Thread Dimitri Fontaine


Hi,

On October 11, 2008 at 01:50, Tom Lane wrote:

Alvaro Herrera [EMAIL PROTECTED] writes:

It seems that the real way forward is to improve on that patch.


Yeah.  If the schema-per-module answer were really a good answer,
we'd have done it before now.


I spent some time thinking about those issues and came up with a package
management proposal, which I still think has merit. One of its points is to
reuse existing stuff and Tom Dunstan's preliminary work.

  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01098.php

Hope this helps,
--
dim




Re: [HACKERS] Contrib, schema, and load_module

2008-10-11 Thread Dimitri Fontaine


Hi,

On October 11, 2008 at 21:10, Josh Berkus wrote:
Am I correct in assuming, however, that you're not at all likely to  
complete this for 8.4?


Not only that, but as I have yet to discover the PostgreSQL internal code, it
would take a lot of help and effort to get something into shape even in the
8.5 timeframe. If recent events show anything, we can trust the #postgresql
crowd to provide the help, and I could organise myself around the time &
effort.

Just don't assume this could realistically happen for 8.4... that is, by the
end of this month.


Regards,
--
dim



Re: [HACKERS] EXPLAIN CREATE TABLE AS

2008-10-23 Thread Dimitri Fontaine
On Thursday, October 23, 2008, Tom Lane wrote:
 Peter Eisentraut [EMAIL PROTECTED] writes:
  It turns out you cannot EXPLAIN on CREATE TABLE AS, but it seems to work
  fine if I extend the grammar as below:

 This seems to me to be something that will look like a wart, not a
 feature, from the user's point of view.  You can't explain CREATE TABLE,
 but you can explain CREATE TABLE AS?  It's just weird.  It feels to me
 like exposing an implementation artifact.

As a user, I disagree about the weirdness of it. +1 for Peter's
initiative.
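
That is, with the grammar extension under discussion one could write (a
sketch; table and column names made up):

EXPLAIN CREATE TABLE foo AS SELECT * FROM bar WHERE baz > 42;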

Regards,
-- 
dim




Re: [HACKERS] WIP patch: convert SQL-language functions to return tuplestores

2008-10-28 Thread Dimitri Fontaine
Hi,

In the Python language, functions that lazily return collections are called
generators, and use the yield keyword instead of return.
http://www.python.org/doc/2.5.2/tut/node11.html#SECTION0011100

Maybe having such a concept in PostgreSQL would allow the user to choose
between the current behavior (materializing) and lazy computation, perhaps
with a new internal API in the executor.

CREATE FUNCTION mygenerator()
  RETURNS SETOF integer
  LANGUAGE plpgsql
AS $f$
DECLARE
  v_foo integer;
BEGIN
  FOR v_foo IN SELECT foo FROM my_table LOOP
    YIELD my_expensive_function(v_foo);  -- YIELD is the proposed new statement
  END LOOP;
  RETURN;
END;
$f$;

At the plain SQL level, we could expose this with a new function attribute,
GENERATOR maybe?

CREATE FUNCTION my_generator_example(integer, integer)
  RETURNS SETOF integer
  GENERATOR   -- proposed new attribute
  LANGUAGE sql
AS $f$
  SELECT generate_series($1, $2);
$f$;

Maybe we should prefer adding the GENERATOR (or LAZY, or whatever sounds good
to a native English speaker) attribute to PL functions instead of providing
YIELD, with RETURN doing a YIELD in that case.
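
The point being that a caller could then stop early and pay only for the rows
it consumes, e.g. (a sketch):

-- with lazy evaluation only ~10 calls to my_expensive_function() happen,
-- instead of materializing the whole set first
SELECT * FROM mygenerator() LIMIT 10;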

On Tuesday, October 28, 2008, Tom Lane wrote:
 I suppose, but short of a fundamental rethink of how PL functions work
 that's not going to happen.  There's also the whole issue of when do
 side-effects happen (such as before/after statement triggers).

Would it be possible to forbid generators in those cases?

 Agreed, but I think the fundamental solution there, for simple-select
 functions, is inlining.

Would it be possible to maintain the current behavior of the ROWS estimator
for functions, even when inlining, as a way to trick the planner when you
can't feed it good enough stats?

 I think the PL side of the problem is the hard part --- if we knew how
 to solve these issues for plpgsql then SQL functions would surely be
 easy.

What about this Python idea of generators and the YIELD statement for lazy
evaluation of functions?
-- 
dim




Re: [HACKERS] WIP patch: convert SQL-language functions to return tuplestores

2008-10-29 Thread Dimitri Fontaine
On Tuesday, October 28, 2008, Pavel Stehule wrote:
 2008/10/28 Dimitri Fontaine [EMAIL PROTECTED]:
  Hi,
 
  In the python language, functions that lazily return collections are
  called generators and use the yield keyword instead of return.
  http://www.python.org/doc/2.5.2/tut/node11.html#SECTION0011100
 
  Maybe having such a concept in PostgreSQL would allow the user to choose
  between current behavior (materializing) and lazy computing, with a new
  internal API to get done in the executor maybe.

 lazy computing is a good idea, but I am afraid it would be really
 very hard to implement. You have to store the current state somewhere,
 stop execution, return back from the node, and you should be able to
 restore the PL state and continue the process. I can't see it without
 thread support.

I'm not sure I understand what the current situation is, then. Reading this
commit message from Tom:
  Extend ExecMakeFunctionResult() to support set-returning functions that
  return via a tuplestore instead of value-per-call
  ...
  For the moment, SQL functions still do things  the old way.
  
http://git.postgresql.org/?p=postgresql.git;a=commit;h=6d5301be5ece6394433d73288e0fafaed6326485


I had the impression we already had a lazy implementation: this
value-per-call code path, which still exists for SQL functions.

  CREATE FUNCTION my_generator_example(integer, integer)
    RETURNS SETOF integer
    GENERATOR
    LANGUAGE sql
  AS $f$
    SELECT generate_series($1, $2);
  $f$;

So my idea would be to have the function choose to return values either via a
tuplestore or via value-per-call, depending on whether the user set
"generator" or "lazy".
Done this way, the user could also choose for the function to be lazy or to
use a tuplestore, whatever the language it's written in.

The current behaviour would then mean the default depends on the language:
lazy for SQL and tuplestore for PL/pgSQL. Well, it will have to be
documented, whatever the final choice is.

Is it possible? A good idea?
-- 
dim




Re: [HACKERS] WIP patch: convert SQL-language functions to return tuplestores

2008-10-29 Thread Dimitri Fontaine
On Wednesday, October 29, 2008, Tom Lane wrote:
 So the fact that it's possible for SQL-language functions is an
 idiosyncrasy of that language, not something we should cram into the
 general CREATE FUNCTION syntax in the vain hope that having syntax
 might cause an implementation to appear someday.

OK, that confirms that lazy evaluation and value-per-call are distinct
things, and that what you were after was not an easy bit of syntax. :)

 Therefore, if we were going to expose a knob to the user to control this
 behavior, I'd be inclined to make it a part of the language-specific
 syntax of SQL function bodies.

How would we support the option for SQL functions?

 Mind you, I'm not exactly *for* this, because I think it will result
 in making functions.c a whole lot more complex and hard to maintain
 than it needs to be, in exchange for a behavior that I don't believe
 is especially useful in most cases, and can easily be worked around
 when it is useful.

From what I understand, the people asking for lazy evaluation of functions do
not see it as easy to work around.

 But if people are going to be sticky about the 
 point, something like this might be a workable compromise.

What's against PL/pgSQL implementing a YIELD statement? Wouldn't it be
simpler to integrate, for both hackers and users?

This would maybe even allow a new API in the executor for this, and each PL
would be free to add support for it whenever it best suits them. Maybe that's
exactly what you're calling "a whole lot more complex and hard to maintain
than it needs to be", though.

Regards,
-- 
dim




Re: [HACKERS] WIP patch: convert SQL-language functions to return tuplestores

2008-10-29 Thread Dimitri Fontaine
On Wednesday, October 29, 2008, Tom Lane wrote:
 Well, call-per-value is *necessary* for lazy evaluation, but it's not
 *sufficient*.  You need a function implementation that can suspend and
 resume execution, and that's difficult in general.

OK, I think I'm beginning to understand how things tie together. Thanks again
for your patience in explaining :)

 Well, we'd just tweak how the executor gets called inside functions.c.
 The main problem is that we'd have to have two different sets of
 behavior there, depending on whether we are trying to evaluate commands
 a row at a time or all at once, plus interlocks to disallow cases like
 using LAZY with a RETURNING query.  It's certainly possible but I
 believe it will make functions.c a lot longer and uglier than it would
 be without it.

And I fail to see how the user would control which behavior gets chosen,
which I think was part of the "going further with your ideas" subthread.

 Nobody has refuted the argument that sticking a LIMIT into the function
 would accomplish the same result.

Fair enough.
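
Indeed, something like this covers the stop-early case with no executor
change (a sketch; my_expensive_function and my_table are the hypothetical
names from before):

CREATE FUNCTION first_results(n integer)
  RETURNS SETOF integer
  LANGUAGE sql
AS $f$
  -- the limit lives inside the function, so only n expensive calls are made
  SELECT my_expensive_function(foo) FROM my_table LIMIT $1;
$f$;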

  What's against PLpgSQL implementing a YIELD statement?

 Feel free to try it, if you want.

Hehe, not this year.
But being able to ask questions and get clarifications from hackers certainly 
is a step in this direction. Feeling ready and organizing one's time around 
it is the next :)

 When you get done you might have some 
 grasp of why it'll be nearly impossible for PLs that we don't control
 the entire implementation of.

Hence the YIELD / new API idea, with the LAZY property being optional for
PLs, and implemented first only in PL/pgSQL (and maybe PL/Python, as Python
supports the generator function concept).
Maybe having optional features for PLs just hasn't been done yet?

But again, I got started on this only by misunderstanding your call here:

  I think the PL side of the problem is the hard part --- if we knew how
  to solve these issues for plpgsql then SQL functions would surely be
  easy.

I'm not being sticky about the feature request, just struggling to correctly
understand the issues at hand, while recognizing that an easy choice between
EAGER and LAZY function evaluation would be great as a user, even if
unsupported in a number of PLs.

Regards,
-- 
dim




Re: [HACKERS] WIP patch: convert SQL-language functions to return tuplestores

2008-10-29 Thread Dimitri Fontaine
On Wednesday, October 29, 2008, Tom Lane wrote:
 Now of course the bigger problem with either this syntax or yours is
 that attaching such a property to a function is arguably the Wrong Thing
 in the first place.  Which one is the best way is likely to depend on
 the calling query more than it does on the function.

Let the planner figure this out, maybe with some startup cost considerations
added in? That sounds even better, yes.

 However, I see no 
 solution to that problem except function inlining; and if the function
 gets inlined then all this discussion is moot anyhow.

How would we inline PL functions?
-- 
dim




Re: [HACKERS] WIP patch: convert SQL-language functions to return tuplestores

2008-11-03 Thread Dimitri Fontaine
On Friday, October 31, 2008, Tom Lane wrote:
 With the attached patch, SQL functions support returning the results of
 INSERT/UPDATE/DELETE RETURNING clauses.

Thanks for your work, and for having considered user whining in passing! :)
-- 
dim




Re: [HACKERS] blatantly a bug in the documentation

2008-11-25 Thread Dimitri Fontaine


Hi,

On November 25, 2008 at 20:29, Ron Mayer wrote:

psql=# install module sampledb;
Downloading sampledb from pgfoundry...
Installing sampledb
Connecting to sampledb
sampledb=#


This could be part of an installer for PostgreSQL extensions. See the
following email for a proposal on how to deal with extension packaging:

  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01098.php

The proposal purposefully leaves the fetching and building steps out of the
database manager itself, and makes building the distributors' concern. So it
would be something like:

  create sampledb
  pg_pkg fetch pgsampledb

and either
  pg_pkg install pgsampledb sampledb
  psql sampledb
or
  psql sampledb
  sampledb=# install package pgsampledb;


it'd remove the need for pre-installing a rarely-needed add-on,
as well as being useful for other projects.  For example:
 psql=# install module US_Census_Tiger_Maps;
 Installing dependency Postgis...
 Installing module US_Census_Tiger_Maps
to install a GIS system with all the road networks.


The dependency system is yet to be thought about, but it's definitely in the
scope of the extension manager.


While we're at it, calling all those things extensions rather than packages
would probably help avoid confusing people with Oracle compatibility, etc.
Last time I checked I didn't find any mention of packages in the standard,
but still.


So PostgreSQL could have an extension manager at the SQL level (CREATE OR
REPLACE EXTENSION, INSTALL EXTENSION, or maybe LOAD EXTENSION, which would
LOAD the associated modules, etc.), with a system command line tool to
fetch & build, etc. (pg_ext?) that source-level users and packagers
(distributors) would use.
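
For instance (purely hypothetical syntax, just following the vocabulary
above):

CREATE EXTENSION pgsampledb;  -- register an extension whose files pg_ext put in place
LOAD EXTENSION pgsampledb;    -- LOAD the shared modules it provides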


What do you, dear readers, think about the "extension" vocabulary?

Regards,
--
dim





Re: [HACKERS] what is necessary for filling SysCache?

2008-11-26 Thread Dimitri Fontaine
Hi,

On Wednesday, November 26, 2008, Tom Lane wrote:
 You might want to pull the diffs for some past pg_proc addition from
 CVS and go over the changes.  This one is a good minimal example:
 http://archives.postgresql.org/pgsql-committers/2005-03/msg00433.php

The following link should help the lazy click & browse types among us:
http://git.postgresql.org/?p=postgresql.git;a=commitdiff;h=578ce39692571e39fd0e677c079b05fad52d

Hope this helps ;)
-- 
dim




Re: [HACKERS] Multiplexing SIGUSR1

2008-12-09 Thread Dimitri Fontaine
Hi,

I hope I'm not disturbing hackers at work by talking about completely 
unrelated things but...

On Tuesday, December 9, 2008, Tom Lane wrote:
 I think we need something closer to the postmaster signal multiplexing
 mechanism, wherein there is a dedicated shared memory area of static
 layout that holds the signaling flags.  And it needs to be driven off
 of knowing the target's PID, not anything else.

...this reminds me of the IMessage queues from Postgres-R, reworked by Markus
to follow your advice about the postmaster and shared memory.
  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01420.php

Could it be the implementation we need for multiplexing signals from one
backend to others?

Regards,
-- 
dim




Re: [HACKERS] actualised funcs typmod patch

2009-11-17 Thread Dimitri Fontaine
Hi,

Pavel Stehule pavel.steh...@gmail.com writes:
 I am sending actualised patch.

I've volunteered to review this patch, and before getting to read the code
and documentation, then test it, I have a more general question.

The idea of adding support for typmods in function signatures came from the
need to have more than one anyelement at a time in a function, and Pavel
proposed that we spell that, e.g., anyelement(0) and anyelement(1).

Is that how we want to solve it?

Now, whatever the answer to that is, I guess the patch is still interesting
in itself for allowing a function f(x numeric(10,4)), for example. Right?
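
Concretely, as I understand it, the patch would allow something like this (a
sketch):

CREATE FUNCTION f(x numeric(10,4)) RETURNS numeric(10,4)
  LANGUAGE sql
AS $$ SELECT $1 $$;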

Regards,
-- 
dim



Re: [HACKERS] actualised funcs typmod patch

2009-11-17 Thread Dimitri Fontaine
On November 17, 2009 at 20:33, Tom Lane wrote:
 We could talk about it now. We are not in a hurry. But I would like to see
 some progress in this area in the next two months. This patch is simple
 and doesn't create any new rules or change any behavior.
 
 What do you mean it doesn't change the behavior?  It establishes a
 specific set of behaviors for functions with non-default typmods in
 their arguments.  If we just apply whatever was the easiest thing to
 implement, without any discussion, we are very likely to regret it
 later.
 
 It might be that what you've done is all fine, but I'd like some
 discussion and consensus on the issues.  Submitting an entirely
 documentation-free patch is not the way to establish consensus.

I'll try to help there; it's not really a review any more, but it still seems
needed. Here's what I gather the specs of Pavel's work to be, from
quick-reading through his patch:

/*
+* Don't allow change of input typmodes. Any change should break
+* stored casts in prepared plans.
+*/

The return type can now be given a typmod other than -1.

[implementation details of parameterTypmodes and allParameterTypmodes left
out; not well understood yet, but they seem to be details rather than
spec-level things]

+   if (rettypmod != resttypmod && rettypmod != -1)
+       ereport(ERROR,
+               (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+                errmsg("return type mismatch in function declared to return %s",
+                       format_type_with_typemod(rettype, rettypmod)),
+                errdetail("Actual return type is %s.",
+                          format_type_with_typemod(restype, resttypmod))));

So you need to return a decorated value, I guess, or assign it to a retval of
the right type, including the typmod. Declaring a retval of type text to
handle RETURNS varchar(15) won't do it.


+   /* when typmods are different, then force coercion too */
+   force_coerce = declared_typmod != -1 && declared_typmod != actual_typmod;

So if you declare typmods, they are NOT part of the polymorphism (also per a
comment upthread), but you cannot change them, and there's automatic coercion
when only the typmod mismatches. I think that's what Tom wanted to avoid
(because it breaks existing code assumptions, and typmod coercion is not well
defined).

Here are some tests showing either coercion of the argument (and failures to
coerce), or enforcement of the return type's typmod:
+ ERROR:  cannot change parameter typmod of existing function

+ select typmodtest('a','');   -- outside plpgsql
+ ERROR:  value too long for type character varying(3)

+ select typmodtest('','bbb'); -- return value
+ ERROR:  value too long for type character varying(6)
+ CONTEXT:  PL/pgSQL function typmodtest while casting return value to 
function's return type


Then there's a great deal of changes that make me cry out for something
human-friendly around the internal catalog representation, all this BKI
stuff, IIUC.

So the bulk of it is supporting return typmod declaration. This extends to
OUT parameters, which can be cool:

+ create or replace function typmodtest(out a numeric(5,2),out b numeric(5,2), 
out c numeric(5,2))


Hope this helps,
-- 
dim

PS: about supporting more than one anyelement type in functions, I'd rather
have a nice SQL-ish syntax around it. My proposal was something like this:

CREATE FUNCTION foo(a anyelement x, b anyelement x, c anyelement y)
 RETURNS anyelement y[]
AS $$
... 
$$;


Re: [HACKERS] operator exclusion constraints

2009-11-18 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Forgive me if this is discussed before, but why does this store the
 strategy numbers of the relevant operators instead of the operators
 themselves?  It seems like this could lead to surprising behavior if
 the user modifies the definition of the operator class.

Wild guess:

  http://www.postgresql.org/docs/8.4/static/xindex.html

  34.14.2. Index Method Strategies

  The operators associated with an operator class are identified by
  strategy numbers, which serve to identify the semantics of each
  operator within the context of its operator class. For example,
  B-trees impose a strict ordering on keys, lesser to greater, and so
  operators like "less than" and "greater than or equal to" are
  interesting with respect to a B-tree. Because PostgreSQL allows the
  user to define operators, PostgreSQL cannot look at the name of an
  operator (e.g., < or >=) and tell what kind of comparison it
  is. Instead, the index method defines a set of strategies, which can
  be thought of as generalized operators. Each operator class specifies
  which actual operator corresponds to each strategy for a particular
  data type and interpretation of the index semantics.

Regards,
-- 
dim



Re: [HACKERS] Application name patch - v4

2009-11-29 Thread Dimitri Fontaine
Hi,

On November 29, 2009 at 18:22, Tom Lane wrote:
 I think we should use GUC_NO_RESET_ALL.
 
 I agree with you, but it seems we have at least as many votes to not do
 that.  Any other votes out there?

Driven by the pooler use case (pgbouncer, even), I'd say RESET ALL should
also reset the application name. The connection-time value is no longer tied
to anything sensible as soon as you have pooling in there...
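
That is, in the pooler scenario (a sketch):

SET application_name = 'batch_invoicing';  -- set by the previous client
RESET ALL;                                 -- pooler cleanup between clients
SHOW application_name;                     -- should no longer say 'batch_invoicing'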

Regards,
-- 
dim



Re: [HACKERS] Application name patch - v4

2009-11-30 Thread Dimitri Fontaine
On November 30, 2009 at 00:25, Tom Lane wrote:
 The thing is that the libpq API treats application_name as a *property
 of the connection*.

Oh. Yeah.

 We could add a third keyword, say SET DEFAULT, that would have the
 behavior of setting the value in a fashion that would persist across
 resets.  I'm not sure that DEFAULT is exactly le mot juste here, but
 agreeing on a keyword would probably be the hardest part of making it
 happen.

I vaguely remember you explaining how hard it would be to predict the value
we RESET to once we add this or that possibility. That's very vague, sorry,
but it leaves me with a bad feeling about the keyword choice (bikeshedding;
I should open a club).

So what about SET CONNECTION application_name TO 'whatever'?

Regards,
-- 
dim







Re: [HACKERS] Application name patch - v4

2009-11-30 Thread Dimitri Fontaine
On November 30, 2009 at 22:38, Robert Haas wrote:
 I still don't really understand why we wouldn't want RESET ALL to
 reset the application name.  In what circumstances would you want the
 application name to stay the same across a RESET ALL?

I can't see any use case, but SET/RESET is tied to the SESSION, whereas
application_name is a CONNECTION property. So it's a hard sell that resetting
the session would change connection properties.

Regards,
-- 
dim


Re: [HACKERS] Page-level version upgrade

2009-12-02 Thread Dimitri Fontaine
Hi,

As we're talking about crazy ideas...

Bruce Momjian br...@momjian.us writes:
 Well, yea, the idea would be that the 8.5 server would either convert
 the page to the new format on read (assuming there is enough free space,
 perhaps requiring a pre-upgrade script), or have the server write the
 page in the old 8.4 format and not do CRC checks on the page.  My guess
 is the former.

We have already had demand for read-only tables (some on-disk format
optimisation would then be possible). What about having a page-level
read-only restriction, allowing the newer server version to operate in
read-only mode on the older version's pages, and to convert on write by
allocating whole new page(s)?

Then we go even crazier, with a special recovery mode on the new version able
to read the older version's WAL format, producing older-version pages. That
sounds like code maintenance hell, but it would allow a $new standby to
restore from a $old WAL stream, and be read-only. Then you switch over to the
slave, and it goes out of recovery and creates new-format pages on writes.

How about going this crazy?

Regards,
-- 
dim



Re: [HACKERS] Page-level version upgrade

2009-12-02 Thread Dimitri Fontaine
Greg Stark gsst...@mit.edu writes:
 On Wed, Dec 2, 2009 at 11:26 AM, Dimitri Fontaine
 dfonta...@hi-media.com wrote:
 We already have had demand for read only tables (some on-disk format
 optimisation would then be possible). What about having page level
 read-only restriction, thus allowing the newer server version to operate
 in read-only mode on the older server version pages, and convert on
 write by allocating whole new page(s)?

 I'm a bit confused. Read-only tables are tables that the user has said
 they don't intend to modify.  We can throw an error if they try. What
 you're proposing are pages that the system treats as read-only but
 what do you propose to do if the user actually does try to update or
 delete (or lock) a record in those pages? 

Well, it's still a pretty rough idea, so I'll need help from this forum to
get to something concrete enough for someone to be able to implement it...
and there you go:

 If we want to avoid
 converting them to new pages we need to be able to at least store an
 xmax and set the ctid on those tuples. And probably we would need to
 do other things like set hint bits or set fields in the page header.

My idea was more that any non-read-only access to the page forces a rewrite
in the new format and retirement of the old page. Maybe like what vacuum
would do to it as soon as it realises the page no longer contains any visible
tuples, but done by the backend at the time of the modification.

That makes the first modification of a page quite costly, but lets you
somewhat choose when that happens. And you still have read-only access, so
you could test parts of your application on a hot standby running the next
version.

Maybe there's just too much craziness in there now.
-- 
dim



Re: [HACKERS] First feature patch for plperl - draft [PATCH]

2009-12-04 Thread Dimitri Fontaine
On December 4, 2009 at 20:40, Tim Bunce wrote:
 Robert's comparison with mod_perl is very apt. Preloading code gives
 dramatic performance gains in production situations where there's a
 significant codebase and connections are frequent.

How far can you get by using a connection pooler such as pgbouncer?

-- 
dim
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby, recent changes

2009-12-06 Thread Dimitri Fontaine
Le 6 déc. 2009 à 23:26, Robert Haas a écrit :
 Consider this scenario:
 
 0. You have a master and a standby configured properly, and up and running.
 1. You shut down master for some reason.
 2. You restart standby. For some reason. Maybe by accident, or you want
 to upgrade minor version or whatever.
 3. Standby won't accept connections until the master is started too.
 Admin says WTF?
 
 I would rather document it as a known caveat and be done.

+1

 For what it's worth, this doesn't seem particularly unlikely or
 unusual to me.

I'm sorry to have to disagree here. Shutting down the master, in my book, 
means you're upgrading it (minor or major) or the box under it. It's not a 
frequent event. More frequent than a crash, but still.

Now the master is offline, you have a standby, and you're restarting it too, 
but you don't mean that as a switchover. I'm with Simon here: the WTF is "are 
you looking for trouble?" more than anything else. I don't think shutting down 
the standby while the master is offline is considered good practice if 
your goal is HA...

Regards,
-- 
dim
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] YAML Was: CommitFest status/management

2009-12-07 Thread Dimitri Fontaine
Hi,

Greg Smith g...@2ndquadrant.com writes:
 Robert Haas wrote:
 The main point here for me is that the JSON format is already
 parseable by YAML parsers, and can probably be turned into YAML using
 a very short Perl script - possibly even using a sed script.  I think
 that it's overkill to support two formats that are that similar.
   
 It's not the case that JSON can be turned into YAML or that it just happens
 that it can be parsed by YAML parsers.  While there was some possible
 divergence in earlier versions, a JSON 1.2 document *is* in YAML format
 already.  JSON is actually a subset of YAML that uses one of the many
 possible YAML styles--basically, YAML accepts anything in JSON format, along
 with others.  This means that by providing JSON output, we've *already*
 provided YAML output, too.  Just not the nice looking output people tend to
 associate with YAML.

Well, we have JSON and agreed it was a good idea to have it. Now, JSON is
a subset of YAML, and some of us (me included) would prefer another YAML style.
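
To illustrate the subset claim, here is the same document written first
in JSON, which any YAML 1.2 parser accepts as-is, and then in the block
style people usually have in mind when they say YAML (keys shaped like
the new machine-readable EXPLAIN output):

  {"Plan": {"Node Type": "Seq Scan", "Plan Rows": 100}}

  Plan:
    Node Type: Seq Scan
    Plan Rows: 100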

If the problem is supporting 2 formats in core rather than 3, what about
replacing the current JSON support with the YAML one?

At a later point we could even get JSON support back by making the YAML
printer able to output different YAML styles, but I guess that's not
where we are now.

Vote: +1 for YAML even if that means dropping JSON.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Installing PL/pgSQL by default

2009-12-07 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Right, just like every other thing that's pre-installed.  If a
 particular installation wishes to let individual DB owners control this,
 the superuser can drop plpgsql from template1.  It's not apparent to me
 why we need to allow non-superusers to override the project's decisions
 about what should be installed by default.

I guess it boils down to being nice to hosting platforms, which will want
to give as much power as possible to database owners without making those
hosted users superusers.

So should the decision to remove plpgsql be in the hosting platform's
hands, or the hosted database owner's?

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Installing PL/pgSQL by default

2009-12-07 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Why not?  If they really want to prohibit use of a feature the upstream
 project has decided should be standard, that's their privilege.

Well, I guess they could also automate their database creation to fix
the privileges and assign the ownership of the language to the owner of
the database. Then whether or not to have plpgsql there is up to the
owner.

For non-hosted environments, you always want to tweak some things, like
installing plpgsql in the first place. So...

 The argument against seems to be basically this should work exactly
 like it did before, but if that's the standard then we can never
 have plpgsql installed by default at all.

Don't get me wrong, I'm all for having plpgsql installed by default. 

I thought we were talking about how to provide that, and trying to decide
whether having to be a superuser to drop plpgsql after creating the
database blocks the way forward, knowing that installing the language
only requires being the database owner.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Installing PL/pgSQL by default

2009-12-10 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 It's not impossible that we'll have to tweak pg_dump a bit; it's
 never had to deal with languages that shouldn't be dumped ...

Ah, the best would be to have extensions maybe. Then you could do this
in initdb, filling in template0:
  CREATE EXTENSION plpgsql ...;

Then at createdb time, what would become automatic is:
  INSTALL EXTENSION plpgsql;

And that's it. pg_dump would know about extensions and only issue this
latter statement in its dump.

Bruce, there are some mails in the archive with a quite advanced design
proposal that has been discussed and not objected to, and I even
provided a rough sketch of how I wanted to attack the problem.

  http://archives.postgresql.org/pgsql-hackers/2009-06/msg01281.php
  http://archives.postgresql.org/pgsql-hackers/2009-07/msg01425.php
  http://archives.postgresql.org/pgsql-hackers/2009-07/msg01468.php

  The major version dependent SQL code is now much less of a problem
  than before because we have inline DO statements. So you don't need to
  create a function for this anymore.
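
For instance, a minimal sketch of guarding the version dependent parts
of an install script with DO, using the server_version_num GUC:

  DO $$
  BEGIN
    IF current_setting('server_version_num')::int < 80500 THEN
      RAISE EXCEPTION 'this extension needs PostgreSQL 8.5 or later';
    END IF;
    -- version dependent install commands go here
  END;
  $$;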

Real life has kept me from having the time to prepare the code patch,
and I don't think that will change a bit in the 8.5 release cycle,
whatever my hopes were earlier this year. 

But getting everyone to talk about the feature and come to an agreement
on what it should provide and how was the hard part of it, I think, and
that is now about done.

Would you be up for writing the extension facility?

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Installing PL/pgSQL by default

2009-12-12 Thread Dimitri Fontaine
Hi,

Le 11 déc. 2009 à 01:43, Bruce Momjian a écrit :
 Would you be up for writing the extension facility?
 
 Uh, well, I need to help with the patch commit process at this point ---
 if I find I have extra time, I could do it.   I will keep this in mind.

If you ever find the time to do it, that would be excellent!
The extension facility is at the top of Josh Berkus's list of missing things 
we'll have to provide soon (or something), and a very annoying gap: the last 
missing stone of PostgreSQL's highly praised extensibility.
  
http://it.toolbox.com/blogs/database-soup/postgresql-development-priorities-31886

It could also mean that pg_migrator would have an easier time handling user 
data types etc., if extension authors were able to implement hooks to migrate 
their data from the previous to the next major version's on-disk format.

Anyway, thanks for considering it!
-- 
dim

PS: of course I've developed some ideas of how I'd like this to work and some 
use cases too, so bug me on IRC or whatever for details etc :)
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] compiling with Visual Studio

2009-12-13 Thread Dimitri Fontaine
Le 13 déc. 2009 à 19:48, Magnus Hagander a écrit :
 The buildscript that's there can generate buildfiles for contrib. I
 haven't used it to build external contrib modules, but it's probably
 a pretty good start. If not, just steal the project file from another
 contrib module and modify the files it references (yeah, I know,
 ugly..)

Or, well, you have loads of time on your hands, so you can port PGXS support 
to Windows builds; that way, by the time extensions are another PostgreSQL 
facility, this problem is already solved. It was so easy to say I couldn't 
resist :)

Regards,
-- 
dim
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-14 Thread Dimitri Fontaine
Scott Bailey arta...@comcast.net writes:

 So basically I have an anyrange pseudo type with the functions prev, next,
 last, etc defined. So instead of hard coding range types, we would allow the
 user to define their own range types. Basically if we are able to determine
 the previous and next values of the base types we'd be able to define a
 range type. I'm envisioning in a manner much like defining an enum
 type.

It's not clear how to define those functions for the prefix_range
datatype, where '123' represents any text beginning with those chars and
'123[4-6]' any text beginning with '123' then either 4, 5 or 6.
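
For illustration, containment looks like this (a sketch, assuming the
prefix extension's @> operator as documented on pgfoundry):

  SELECT '123'::prefix_range @> '123456';    -- true
  SELECT '123[4-6]'::prefix_range @> '1237'; -- false, '7' is out of range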

What is SELECT next('123'::prefix_range); supposed to return?

Regards,
-- 
dim

PS: as I usually use this in a telephony context, the example contains
digits, but it's a text-based type, and given the questions and reports
in pgsql-general, people do use it with text ranges too.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-16 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
   foreach p2_member in unnest(p2) loop
 p1 := array(select period_except(p1_member, p2_member)
 from unnest(p1) p1_member);
   end loop;

 But maybe it can be done in a single SQL command.

Yeah, as soon as you have LATERAL, I think. Without it there's no way to
compose SRFs in SQL, AFAIK.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range types

2009-12-17 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Dimitri Fontaine dfonta...@hi-media.com writes:
 Tom Lane t...@sss.pgh.pa.us writes:
 foreach p2_member in unnest(p2) loop
 p1 := array(select period_except(p1_member, p2_member)
 from unnest(p1) p1_member);
 end loop;
 
 But maybe it can be done in a single SQL command.

 Yeah, as soon as you have LATERAL, I think. Without it there's no way to
 compose SRF in SQL, AFAIK.

 Hm, how would you do it with LATERAL?  The problem is not so much
 composition as the need for a variable number of rounds of
 composition.

Let's have a try at it:

select p2_member, array_accum(p1)
  from unnest(p2) as p2_member,
       lateral (select period_except(p1_member, p2_member)
                  from unnest(p1) p1_member) as x(p1)
 group by p2_member;

I'm not sure I understand how the explicit looping over unnest(p2) is
different from using lateral, or even if that's what you're talking
about when mentioning a variable number of rounds.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Dimitri Fontaine
Hi,

Le 17 déc. 2009 à 19:39, Josh Berkus a écrit :
 Mind you, returning (arbitrary expression) would be even better, but if
 we can get returning TEXT[] for 8.5, I think it's worth doing on its own.

Well, you already have it as soon as you have text[] (remember that
PostgreSQL arrays are 1-based):

 INSERT INTO destination
 SELECT row[1], row[2], myfunction(row[1], row[2]), row[3]::int + 1
   FROM (COPY RETURNING text[] FROM '/path/to/file.csv' CSV HEADER) as file(row);

Of course, as Andrew already said, what this syntax does not cover is
ragged file processing, that is, accepting file content where rows do
not all have the same number of columns.

But once you have ragged input reading and COPY as a relation in a
query, you're able to apply any expression you want in the query itself,
such as transforming the input slightly to conform to PostgreSQL
datatype input syntaxes.
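
For instance, still using the hypothetical COPY RETURNING syntax above,
normalizing a European date format and papering over a missing trailing
column could look like:

 INSERT INTO destination(day, label)
 SELECT to_date(row[1], 'DD/MM/YYYY'), coalesce(row[2], 'unknown')
   FROM (COPY RETURNING text[] FROM '/path/to/file.csv' CSV HEADER) as file(row);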

Regards,
-- 
dim

Let's deprecate pgloader.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Backup history file should be replicated in Streaming Replication?

2009-12-18 Thread Dimitri Fontaine
Hi,

Le 18 déc. 2009 à 19:21, Heikki Linnakangas a écrit :
 On Fri, Dec 18, 2009 at 12:22 PM, Florian Pflug fgp.phlo@gmail.com 
 wrote:
 It'd prefer if the slave could automatically fetch a new base backup if it
 falls behind too far to catch up with the available logs. That way, old logs
 don't start piling up on the server if a slave goes offline for a long time.

Well, a while ago I did propose considering a state machine with clear 
transitions for such problems, and I think my remarks still apply:
  http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg131511.html

Sorry for the non-archives.postgresql.org link; I couldn't find the mail there.

 Yeah, for small databases, it's probably a better tradeoff. The problem
 with keeping WAL around in the master indefinitely is that you will
 eventually run out of disk space if the standby disappears for too long.

I'd vote for having a setting on the master for how long to keep WALs. If 
the slave loses sync then comes back, either you still have the required WALs 
and you're back to catchup, or you don't and you're back to the base/init dance.

Maybe you want to add a control on the slave to require explicit DBA action 
before getting back to taking a base backup from the master, though, as that 
could be provided from a nightly PITR backup rather than the live server.

 but it's almost certainly much harder
 to implement.  In particular, there's no hard and fast rule for
 figuring out when you've dropped so far behind that resnapping the
 whole thing is faster than replaying the WAL bit by bit.
 
 I'd imagine that you take a new base backup only if you have to, ie. the
 old WAL files the slave needs have already been deleted from the master.

Well, consider that a slave can be in one of these states: base, init, setup, 
catchup, sync. What you just said then reduces to saying which transitions you 
can do without resorting to a base backup, and I don't see that many once the 
last sync point is no longer available on the master.

 I think (as I did/do with Hot Standby) that the most important thing
 here is to get to a point where we have a reasonably good feature that
 is of some use, and commit it. It will probably have some annoying
 limitations; we can remove those later.  I have a feel that what we
 have right now is going to be non-robust in the face of network
 breaks, but that is a problem that can be fixed by a future patch.
 
 Agreed. About a year ago, I was vocal about not relying on the file
 based shipping, but I don't have a problem with relying on it as an
 intermediate step, until we add the other options. It's robust as it is,
 if you set up WAL archiving.

I think I'd like the guarantee that a slave never pretends it's in sync, or 
soon to be, when clearly it's not. For the asynchronous case, we can live 
with it. As soon as we're talking synchronous, you really want the master to 
skip any not-in-sync slave at COMMIT. To be even more clear, a slave that is 
not in sync is NOT a slave as far as synchronous replication is concerned.

Regards,
-- 
dim


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Allow read only connections during recovery, known as Hot

2009-12-19 Thread Dimitri Fontaine
Le 19 déc. 2009 à 03:01, Robert Haas a écrit :
 On Fri, Dec 18, 2009 at 8:32 PM, Simon Riggs sri...@postgresql.org wrote:
 Log Message:
 ---
 Allow read only connections during recovery, known as Hot Standby.
 
 Congratulations!  And, may I be the first to say - woo hoo!

+1!

-- 
dim
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support

2009-12-20 Thread Dimitri Fontaine
Hi from a real user :)

Le 20 déc. 2009 à 22:08, Tom Lane a écrit :
 Pavel Stehule pavel.steh...@gmail.com writes:
 b) general support for preordered aggregates.
 
 I think that we've already expanded the capabilities of aggregates
 a great deal for 8.5, and we should let it sit as-is for a release
 or two and see what the real user demand is for additional features.

Anything we can have in PostgreSQL without resorting to either PLs or 
application code is worth it from here, and I can already picture the smiles 
on our developers' faces when I tell them median() is there by default.

 I'm particularly concerned by the fact that the feature set is already
 far out in front of what the planner can optimize effectively (e.g.,
 there's no ability to combine the work when multiple aggregates need the
 same sorted data).  The more features we add on speculation, the harder
 it's going to be to close that gap.
 
 Another risk is that features added now might preclude adding others
 later.

Now, I have no idea whether augmenting the aggregate properties with an 
optional sorting step is the right approach, but it sounds spot on (general 
enough without being over-engineered). I guess it would give the planner the 
same information as if the user had typed the extra ORDER BY himself, so I'm 
not sure how much your remarks apply here?

I mean, we already have explicit user ordering in aggregates at the call 
site; adding the exact same information to the aggregate definition itself 
surely isn't going to change much there?
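
For reference, the call-site ordering we already have makes a poor man's
median a one-liner (a sketch, picking the lower median for even counts):

  SELECT (array_agg(x ORDER BY x))[((count(x) + 1) / 2)::int] AS median
    FROM t;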

Regards,
-- 
dim
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] About the CREATE TABLE LIKE indexes vs constraints issue

2009-12-24 Thread Dimitri Fontaine

Hi, sorry for posting style,

--  
dim


Le 23 déc. 2009 à 23:58, Jeff Davis pg...@j-davis.com a écrit :
Honestly, I've never used LIKE in a table definition aside from one-off
design experiments. For that kind of thing, what I want is to just get
everything (except perhaps FKs if the above situation applies), and I
adjust it from there. Are there people out there who use LIKE in their
production schema files?


I do use LIKE in scripts for adding providers of federated data. In some
cases you want to INHERIT, in others you want to move incoming data to
another set of tables.


Regards,
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Dimitri Fontaine
Hi,

Le 28 déc. 2009 à 21:33, Kevin Grittner a écrit :
 We often see posts from people who have more active connections than
 is efficient.

How would your proposal better solve the problem than using pgbouncer?

mad proposal time
I'd be in favor of considering how to get pgbouncer into -core, and now that 
we have Hot Standby, maybe implement a mode in which, as soon as a real XID 
is needed, or maybe upon receiving a START TRANSACTION READ WRITE command, 
the connection is transparently handed over to the master.
/

Regards,
-- 
dim
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Dimitri Fontaine
Le 28 déc. 2009 à 22:46, Andres Freund a écrit :
 mad proposal time
 I'd be in favor of considering how to get pgbouncer into -core, and now
 that we have Hot Standby maybe implement a mode in which as soon as a
 real XID is needed, or maybe upon receiving start transaction read write
 command, the connection is handled transparently to the master. /

 Thats not as easy as it sounds - the master may not have all data needed by 
 the snapshot on the slave anymore.

I suppose that if it were easy, some patch would already be around for the 
next commitfest? :)

Seriously, your point is why I'd be tempted to only consider getting to the 
master at transaction start time, that is, before any snapshot is taken.

Regards,
-- 
dim
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Dimitri Fontaine
Le 28 déc. 2009 à 22:59, Kevin Grittner a écrit :
 With my current knowledge of pgbouncer I can't answer that
 definitively; but *if* pgbouncer, when configured for transaction
 pooling, can queue new transaction requests until a connection is
 free, then the differences would be:

It does that, yes. You set up a pool, which is per database/user pair, and 
when there's no server-side connection left in the pool, the clients are held 
in the cl_waiting state.
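
A minimal pgbouncer.ini sketch of that setup (option names from the
pgbouncer documentation, values made up):

  [databases]
  mydb = host=127.0.0.1 port=5432 dbname=mydb

  [pgbouncer]
  ; one of: session, transaction, statement
  pool_mode = transaction
  ; server connections per database/user pair
  default_pool_size = 20
  ; extra clients wait in cl_waiting
  max_client_conn = 500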

 (1)  According to pgbouncer documentation, transaction pooling is a
 hack as it breaks application expectations of backend connection.
 You can use it only when application cooperates with such usage by
 not using features that can break.  This would not be an issue with
 an ACP.

That's why there's both transaction and session pooling. The benefit of session 
pooling is to avoid forking backends, reusing them instead, and you still get 
the pooling control.

 (2)  For the active connection aspect of the policy, you could let
 through superuser requests while other requests were queuing.

The superuser is just another user and gets its own pool; I'm not sure you 
can size it differently, though (yet). It's possible to cheat a little by 
defining another (virtual) database that forces the user in the connection 
string to the server, then telling your application to use this special 
database.
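
In pgbouncer.ini terms the trick looks like this (the left-hand name is
what the application connects to; user= is forced on the server
connection):

  [databases]
  ; virtual database forcing the superuser on the server connection
  mydb_admin = host=127.0.0.1 dbname=mydb user=postgres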

 (3)  With the ACP, the statements would be parsed and optimized
 before queuing, so they would be ready to execute as soon as a
 connection was freed.

There's a pgfoundry project called preprepare, which can be used along with 
pgbouncer to get this effect. If you use 8.4, you can even get the effect 
without pgbouncer.

  http://preprepare.projects.postgresql.org/README.html
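
From memory, the 8.4-only setup is roughly the following; check the
README above for the authoritative names, as the GUC spellings here are
assumptions:

  # postgresql.conf
  custom_variable_classes = 'preprepare'
  local_preload_libraries = 'preprepare'
  preprepare.relation     = 'preprepare.statements'
  preprepare.at_init      = on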

 (4)  Other factors than active connection count could be applied,
 like expected memory consumption, or more esoteric metrics.

Anything you can put in connection strings or per-role settings can be used 
to rig a virtual database and have it pre-set, but that means different pools 
(they add up, then) and different connection strings for the application. The 
only advantage is that it works with released and proven code! (Except for 
preprepare... well, I've been told it's running in production somewhere.)

 In favor of pgbouncer (or other connection poolers) they don't
 require the overhead of a process and connection for each idle
 connection, so I would recommend a connection pooler even with an
 ACP.  They cover overlapping ground, but I see them as more
 complementary than competing.

Yeah, just trying to understand what you're proposing in terms of what I 
already know :)
-- 
dim


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Dimitri Fontaine
Le 28 déc. 2009 à 23:35, Kevin Grittner a écrit :
 So the application would need to open and close a pgbouncer
 connection for each database transaction in order to share the
 backend properly?

No, in session pooling you get the same backend connection for the entire 
pgbouncer connection; it's a 1-1 mapping.

 Well, I don't know that you can very accurately predict a plan or
 what its memory usage would be.  Trying to work out all permutations
 in advance and send each query to the right pool doesn't seem
 workable on a large scale.

True. I was just trying to see what components we already have, while you're 
explaining what's missing: teamwork? :)

 If we had a pooler bundled into the backend and defaulted to a
 halfway reasonable configuration, it's possible that implementing an
 active connection limit the second tier ACP would be covering close
 enough to the same ground as to be redundant.  I'm not quite
 convinced, however, that your proposed use of pgbouncer for this,
 given the multiple pools which would need to be configured and the
 possible application awareness and cooperation with policy would be
 better than a fairly simple ACP.  It seems a bit like driving nails
 with a wrench.  I like wrenches, I use them to turn things, but I
 don't like using them to drive nails when I can help it.  :-)

Hehe, pushing what we already have to its limits is often a nice way to 
describe what we want but still don't have... I think...
-- 
dim


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Dimitri Fontaine
Le 28 déc. 2009 à 23:56, Kevin Grittner a écrit :
  http://preprepare.projects.postgresql.org/README.html
 
 I just reviewed the documentation for preprepare -- I can see a use
 case for that, but I really don't think it has a huge overlap with
 my point.  The parsing and planning mentioned in my point 3 would
 apply to any query -- ad hoc, generated by an ORM, etc.  The
 preprepare project seems to be a way to create persistent prepared
 statements which are automatically materialized upon connection.

Just that, right.
-- 
dim
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Thoughts on statistics for continuously advancing columns

2009-12-31 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Actually, in the problematic cases, it's interesting to consider the
 following strategy: when scalarineqsel notices that it's being asked for
 a range estimate that's outside the current histogram bounds, first try
 to obtain the actual current max() or min() of the column value --- this
 is something we can get fairly cheaply if there's a btree index on the
 column.  If we can get it, plug it into the histogram, replacing the
 high or low bin boundary.  Then estimate as we currently do.  This would
 work reasonably well as long as re-analyzes happen at a time scale such
 that the histogram doesn't move much overall, ie, the number of
 insertions between analyzes isn't a lot compared to the number of rows
 per bin.  We'd have some linear-in-the-bin-size estimation error because
 the modified last or first bin actually contains more rows than other
 bins, but it would certainly work a lot better than it does now.

I know very little about statistics in general, but your proposal seems
straightforward enough for me to understand it, and looks good: +1.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Add hstore_to_json()

2010-01-01 Thread Dimitri Fontaine
Andrew Dunstan and...@dunslane.net writes:
 Does anyone have any real-world experience with any of the JSON C libraries?

 I do not, but I see that YAJL http://lloyd.github.com/yajl/ is now in
 Fedora, and has a BSDish license

It's there in debian too, unstable and testing, and should be there on
the next stable (squeeze):

  http://packages.debian.org/source/sid/yajl

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] problem with gist index amdb utility

2010-01-02 Thread Dimitri Fontaine
Sergej Galkin sergej.gal...@gmail.com writes:
 I realized my own gist index, and now I want to debug it :) 

I used Gevel for that:
  http://www.sai.msu.su/~megera/wiki/Gevel
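
A quick sketch of the kind of queries it gives you (function names from
memory, see the wiki page above for details):

  SELECT gist_stat('my_gist_idx');     -- page and tuple level statistics
  SELECT gist_tree('my_gist_idx', 2);  -- dump of the tree, two levels deep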

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status of plperl inter-sp calling

2010-01-06 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 I think what we should do is either (1) implement a poor man's caching
 that doesn't try to cope with any of these issues, and document that
 you get what you pay for or (2) reject this idea in its entirety.
 Trying to reimplement all of our normal function call semantics in a
 caching layer does not seem very sane.

What about (3), implementing the caching layer in the core code so that
any caller benefits from it? I guess the size of the project is not the
same, though.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Serializable Isolation without blocking

2010-01-06 Thread Dimitri Fontaine
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 I've read through some git tutorials, but
 there's a lot to digest and I'm not entirely sure this is a good way
 to proceed.

I found the following video really helpful for grasping the concepts of
git, which it exposes pretty directly even though it's meant to promote
a particular GUI for it. If you happen to use Emacs, consider using
magit; it's really good at what it does.

  http://alexvollmer.com/index.php/2009/01/18/meet-magit/

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unresolved bugs

2010-01-06 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 If we're going to use a bug-tracker, Bugzilla wouldn't be my first
 choice, I don't think.  Honestly what I'd like better than a
 full-fledged trackers is just a webapp that lists all the unreplied-to
 emails in the pgsql-bugs archives.

For a flexible mail archiving solution, what about having a look at
ArchiveOpteryx, which offers IMAP service atop a PostgreSQL database
containing the emails, along with full-text search, virtual search
folders, and a web interface too?

  http://www.archiveopteryx.org/
  http://archives.aox.org/archives/pgsql-announce

I don't know the details because I have yet to operate it, but it seems
like it could help us.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: PostgreSQL Add-On Network

2010-01-07 Thread Dimitri Fontaine
Dave Page dp...@pgadmin.org writes:
 We have discussed this sort of facility at previous developer
 meetings, and as I recall came to the conclusion that we need to have
 the ability to distribute pre-built binaries, not just source code as
 virtually no Windows users are ever going to have a build environment
 setup. Similarly, neither are Mac users, if they didn't install XCode.

And I think that's exactly why David and you are both right. The way I
see it, we need *both* a source-level distribution infrastructure, which
David is proposing to implement under the PGAN name, and a binary one
too, which is implemented by distributions like debian, Fedora and the
like, by Devrim when it comes to the PGDG RPM repos and… by you when we
talk about MacOSX and Windows binary distribution.

I think the aim of the PGAN should be to make it damn easy for binary
distributions to include whatever PGXS extension they wish, with almost
no effort.

 We also discussed extension management at the DBMS level, which I
 believe Dimitri was working on in his spare time. You should look at
 what he's been doing.

We've been talking about that together on IRC, and my only reservation
here isn't technical, as we already have a somewhat advanced notion of
an extension. What we're missing is support for dump and restore, so my
reservation is that pushing the current system to point-and-click users,
when it's not yet able to let you just pg_restore a dump of a database
using any PGXS extension (contribs included), is going to make us *lose*
users, not gain any.

Unfortunately I was overly optimistic last May when I thought I could
get to a patch submission in the 8.5 cycle. I'd put it down to too much
work during the day and no free time left with the required energy while
my son is so young (yes, that's a good problem to have). But still.

What I have been able to do is determine what features to include in a
first extensions implementation, and propose a syntax that no one
refused (after some less successful attempts). The remaining work would
easily represent a month of my evening and weekend time, and maybe one
of Tom's mornings. But he wouldn't have the fun of learning anything
new, and, to paraphrase him, source code contributors don't grow on trees :)

Anyway I've tried to summarize it all (with some code perspective) in the
following wiki page:

  http://wiki.postgresql.org/wiki/ExtensionPackaging

Essentially, it's all about coding a new backend function to execute
commands from a file on the server, in the spirit of \i for psql, say
e.g. pg_execute_commands_from_file('path/to/file.sql'). It would not
send any result back to the client because you can only describe the
portal once for any given single query.

Then you need to add a catalog holding the extension metadata: the path
of the install file, the uninstall file, the current version, the
upgrade function, the custom class if any, the path of the setup file (GUCs).

And dependencies between extensions, and between a given extension and
what core provides too (plpgsql e.g., or the major PostgreSQL version).

Now you can hack a CREATE EXTENSION command to fill in the catalog, and
the commands INSTALL EXTENSION and DROP EXTENSION to execute the given
files. You have to invent a trick to assign any SQL object created while
executing the install file a dependency entry targeting the extension's
catalog oid. As this is all done in a single given backend, my guess is
that a static variable holding the oid would be enough, but you still
have to touch all object creation code paths.
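
Put together, the user-visible side reduces to something like this
(proposed syntax, nothing final):

  CREATE EXTENSION hstore ...;  -- fill in the new catalog: files, version, dependencies
  INSTALL EXTENSION hstore;     -- run the install file, created objects depend on the extension
  DROP EXTENSION hstore;        -- run the uninstall file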

You do *not* have to care about schemas in any way *BUT* to prepend
pg_extension to the search_path just before executing the install
file. It's a new schema to provide in template databases, so that
extensions default to creating their objects there. This new schema
should be placed just before pg_catalog in the search_path in places
that look up objects from their non-qualified names. Full stop on this
topic.

Oh and special bonus if CREATE EXTENSION, given a newer version of an
already installed extension, will register things so as to run the
upgrade function from the newer version at INSTALL EXTENSION time. Or
invent yet another syntax for upgrading.

And. Wait. That's it.

Oh, you could also try to turn plpgsql, plperl, plpython and their
untrusted variants into built-in extensions too, and contribs too. It
would be nice if contribs were all CREATEd into template0 and template1
and you only had to INSTALL EXTENSION hstore; to enjoy it. That'd be an
easy evening for Tom, I guess :)

Anyone with the hours and skill required to do it? It's great fun,
rewarding, everybody and his neighbour *wants* it done, and I'll be
as available as possible to answer mails, be there on IRC, and do the
necessary political work if there's still some more to be done.

Consider it as a gift I'm giving you: I'm refusing easily earned
reputation by allowing you to code that for me ;)

Regards,
-- 
dim

PS: that ought to do it.

Re: [HACKERS] Streaming replication and postmaster signaling

2010-01-07 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Hmm.  There's something to what you say, but what about the people who
 were expecting their patches to be reviewed and perhaps committed in
 the forthcoming CommitFest.  I proposed a schedule for this release
 that involved only three CommitFests and it was rejected, so it seems
 a bit unfair to pull the rug out from under people at the eleventh
 hour.  Will we lose developers if we do this?

Well, we RRR people will not be able to help much with SR, will we?

So I'm not sure about what you say, but running the commitfest as usual
seems entirely feasible while continuing the efforts on SR. Now, only
the last action item of the commitfest is in question, namely the one we
always struggle with: finding committer time to finish up the work. I
guess the 4 new committers will help, even if I suspect Simon will be
exclusively focused on HS+SR issues and review.

 Unfortunately, there are some patches that I probably will not feel
 confident to commit without your input - in particular, writeable
 CTEs, listen/notify, more frame options in window functions - and I
 venture to say there may not be too many other takers either.  So
 we're going to have to confront the question of whether it's fair to
 make those people wait a year.  Maybe that is the right decision and
 maybe it's not, but I want to make sure we are thinking about our
 developer community as well as our user community, because without
 them we are dead.

What about asking for input from the authors themselves? Like: would you
be really upset if we had SR in 8.5, surely meaning lots of new users (and
development contract opportunities), at the cost of not being able to
properly review your work and postponing it to 8.6?

That's a hard attitude, but it's not clear to me how to avoid it, or
whether as a project there's a better way to face the issue. I've been
bitten by cultural issues before, so if you find this so utterly harsh
as to be shocking, please accept my apologies; I'm not able to propose
anything better on the practical front.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming replication and postmaster signaling

2010-01-07 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 No, I don't think so.  HS without SR means you still have to fool with
 setting up WAL-file-based replication, which despite the existence of
 pg_standby is a PITA.  And you have to make a tradeoff of how often to
 flush WAL files to the standby.  To be a real candidate for it just
 works replication, we've *got* to have SR.

There are also walmgr.py from Skytools and pitrtools from CMD, both of
which are simpler to install and get working. In my view the big ticket
item with SR would be the synchronous part for a full HA setup; without
that, I guess walmgr+HS is plenty good enough. But as a project that
still means needing an external piece of software to operate replication,
so that's still not "PostgreSQL 8.5 comes with replication support".

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: PostgreSQL Add-On Network

2010-01-07 Thread Dimitri Fontaine
David E. Wheeler da...@kineticode.com writes:

 On Jan 7, 2010, at 2:11 PM, Peter Eisentraut wrote:

 You might want to clarify in your prose what an extension is.  I
 suspect I know what you mean, but perhaps not everyone does.

 Good suggestion, thanks. How about this in the FAQ?

 * WTF is an extension?

 An extension is a piece of software that adds functionality to
 PostgreSQL itself. Examples are data types (CITEXT, PERIOD), utilities
 (newsysviews, pgTAP), and procedural languages (PL/Ruby, PL/R), among
 others. An extension is *not* a piece of software designed to run on
 top of PostgreSQL (Bricolage, Drupal).

Maybe with a link to:
  http://www.postgresql.org/docs/8.4/static/extend.html

  In the sections that follow, we will discuss how you can extend the
  PostgreSQL SQL query language by adding:

   - functions (starting in Section 34.3)
   - aggregates (starting in Section 34.10)
   - data types (starting in Section 34.11)
   - operators (starting in Section 34.12)
   - operator classes for indexes (starting in Section 34.14)

  The PostgreSQL server can moreover incorporate user-written code into
  itself through dynamic loading. That is, the user can specify an
  object code file (e.g., a shared library) that implements a new type
  or function, and PostgreSQL will load it as required. Code written in
  SQL is even more trivial to add to the server. This ability to modify
  its operation on the fly makes PostgreSQL uniquely suited for rapid
  prototyping of new applications and storage structures.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: PostgreSQL Add-On Network

2010-01-07 Thread Dimitri Fontaine
Dimitri Fontaine dfonta...@hi-media.com writes:
 e.g. pg_execute_commands_from_file('path/to/file.sql'). It would not
[...]
 Then you need to add a catalog for holding the extensions metadata, like
[...]
 Now you can hack a CREATE EXTENSION command to fill-in the catalog, and
 the commands INSTALL EXTENSION and DROP EXTENSION to execute the given
 files. You have to invent a trick to be able to assign any SQL object
 created while executing the install file a dependency entry targeting
 the extension's catalog oid. As this is all done in a single given
 backend, my guess is that a static variable holding the oid would be
 enough, but you still have to touch all object creation code path.

Now that you have dependency information tracked, don't forget to add
pg_dump support by skipping any objects installed as part of an
extension and only issuing INSTALL EXTENSION foo; instead.

Then let pg_restore run this and complain loudly when the needed
extension entry does not exist yet: you have to CREATE and INSTALL the
extension in the new database before getting to restore your dump.
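
So the relevant part of a dump for a database using hstore would reduce
to something like (a sketch):

  INSTALL EXTENSION hstore;  -- no extension objects in the dump, just this

  CREATE TABLE user_prefs (
      username text PRIMARY KEY,
      prefs    hstore
  );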

 You do *not* have to care about schema in any way *BUT* to prepend the
 search_path with pg_extension just before executing the install
 file. It's a new schema to provide in template databases, so that
 extensions default to creating their objects there. This new schema
 should be installed just before pg_catalog in the search_path in places
 that search for objects from their non qualified name. Full stop on this
 topic.

 Oh and special bonus if CREATE EXTENSION, given a newer version of an
 already installed extension, will register things so as to run the
 upgrade function from the newer version at INSTALL EXTENSION time. Or
 invent yet another syntax for upgrading.

 And. Wait. That's it.

 Oh, you could also try to turn plpgsql, plperl, plpython and their
 unstrusted variants into built-in extensions too, and contribs too. It
 would be nice if contribs where all CREATEd into template0 and template1
 and you only had to INSTALL EXTENSION hstore; for enjoying it. That'd be
 a Tom's easy evening I guess :)

 Anyone having the hours and skill required to do it? It's great fun,
 rewarding, everybody and his neighbour *want* it to be done, and I'll be
 as available as possible to answer mails, be there on IRC, and do the
 necessary political work if there's still some more to be done.

 Consider it as a gift I'm giving you: I'm refusing easily earned
 reputation by allowing you to code that for me ;)

 Regards,
 -- 
 dim

 PS: that ought to do it.

Hey, there's about a week left. :)
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] damage control mode

2010-01-08 Thread Dimitri Fontaine
David Fetter da...@fetter.org writes:
 If we *must* have SR and it's not in by the 15th, let's do another
 Commitfest rather than jack the people who played by the rules.

If we do add another CommitFest, what we do is exactly jacking the people
who played by the rules: all those patches already part of alpha3 were
worked on by people who expected a 4-CF development cycle, adjusted their
agendas accordingly, and want a mid-year release.

Now, I'll second Greg Smith and Tom here, in that I think we need to run
the last commitfest as usual, knowing that the outcome of the commitfest
for any given patch is not "it made it" but "we reviewed it". It's still
legitimate for the project to bump a patch on resource grounds rather
than on technical merit at the end of the commitfest.

The reason we can do it this way is that we're not starved for
reviewers; we're starved for committer time. And seeing this:

  https://commitfest.postgresql.org/action/commitfest_view?id=5

  Status Summary. Needs Review: 19, Waiting on Author: 5, Ready for
  Committer: 2, Committed: 9, Returned with Feedback: 4. Total: 39.

I don't see any reason not to consider all of the 24 patches requiring
our attention.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] damage control mode

2010-01-08 Thread Dimitri Fontaine
Magnus Hagander mag...@hagander.net writes:
 Why we can do it this way is because we're not starving on
 reviewers. We're starving on commiters time. And seeing this:

 Well, we're actually somewhat starving on senior reviewers as well.
 That can take on things like the index patches, Writable CTE or SR.
 We're not starving on reviewers for small-to-medium patches.

We've been talking about having specialized reviewers, or multi-layered
reviewing. There are several things we do in reviewing, and for big
enough patches there's no need to have the same reviewer do all of them.

[...searching the archives for a proposal I did already send...]

  http://archives.postgresql.org/pgsql-hackers/2009-08/msg00764.php

So this mail proposes that we treat these as separate items to be
handled in review:

 - patch (applies, merges, compiles, passes regression)
 - code reading (looks like it was already there, no WTF?) [1]
 - documentation (covers code, targets users, is sufficient)
 - testing (code behavior is what is documented, works well)
 - creative testing (tried hard to crash it)
 - perf testing (profiling, no regression in non optimized cases...)
 - you name it

Now, the senior reviewers you're talking about are mostly required for
code reading. We can certainly still have an army of junior reviewers,
or reviewers who don't want to become hackers, checking the other
points. That'd push the bottleneck back some.

Regards,
-- 
dim

[1] http://www.osnews.com/images/comics/wtfm.jpg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Allow substring/replace() to get/set bit values

2010-01-08 Thread Dimitri Fontaine
Alvaro Herrera alvhe...@commandprompt.com writes:
 Leonardo F wrote:
 How can I keep up with who's doing what?

 Read this list and pgsql-committers.

Or subscribe to the RSS feed from:

  http://git.postgresql.org/gitweb?p=postgresql.git;a=summary

-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] damage control mode

2010-01-09 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
  I have always
 felt that the purpose of a CommitFest was to give everyone a fair
 shake at getting their patch reviewed, provided that they followed
 certain ground rules.  

Yes, like for example submitting the patch before the commit fest
begins.

 And I thought we had agreement that one of
 those ground rules was don't submit new, large patches to the final
 CommitFest in a particular release cycle.  No?

I don't remember this having been agreed upon. What I think has been
said before is that doing so would not help stabilize the tree before
release.

You seem to want to put a lot of energy into sticking to the current
release schedule, which others seem to see as a hint or a wish more than
anything else (it's the expected schedule, not one we're committed to,
I'd venture).

Is it more important to follow the calendar, or to release the best
possible 8.5 even if we can't know the date to within a month's precision?
Again, there's a compromise to find. You're pushing towards the calendar;
we're advocating staying fair (open?) with contributors even when it
means taking risks on the schedule.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] damage control mode

2010-01-09 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Basically, here's my feeling.  Either we have a rule that we can
 bounce large, previously-unseen patches from the final CommitFest of
 the release cycle, or we don't.  If we do, then we should go ahead and
 do it, and we should do it early when it will have more effect rather
 than putting a lot of time into those patches and doing it only once
 the release is already late.  On the other hand, if we don't, then we
 should have to core team publish a clear statement that all
 CommitFests are equal and we're just going to slip the schedule if
 there are too many patches for the last one.

Yeah, the problem being that we're trying to solve at least two different
problems here: make contributors happier to contribute to PostgreSQL by
giving them early feedback and releasing their code sooner rather than
later, and have a time-based release.

The two points contradict each other exactly at the end of the cycle,
when we have to decide whether to give priority to the schedule or to
the feature set, knowing that being late now means being even later next
cycle, so contributions that miss the boat are impacted even more.

All of this is stating the obvious one more time, but I think that's the
reason why the rule is not written on any wall, and why nobody has tried
to enforce it in any way yet.

 What I think have been
 said before is that doing so would not help stabilizing the tree before
 release.

 Sorry, I'm not following this sentence.

Trying to state some more of the obvious, to be sure we're talking about
the same things.

 I am definitely pushing for the schedule.  It's a maxim of software
 development that you can have time-based releases or feature-based
 releases, but not both.  In this community, we have time-based
 releases early in the cycle and then they change to feature-based
 releases late in the cycle.  As we found out with 8.4, trying to be
 both on time and feature-complete can result in failing at both.  I
 feel that we've been eminently fair to contributors in the 8.5 cycle -
 it's something that I have personally worked very hard on - and I
 actually also feel that what I am proposing now is also fair.  It may
 not be very popular, though.

This has already said before, but let's try it again.

One way to solve the problem could be to have a dedicated release
management team, taking responsibility from the last alpha of the cycle
until release: open items, getting to beta, fixing any bugs testers
find, some advocacy people to recruit more testers, etc., then release
candidate management, and then the .0 release.

While this team works on that, we could maybe open the next cycle for
development, with its first CommitFest happening while 8.5.0 is not yet
out the door.

Of course, it has been said more than once that some resources will have
to be on both teams, and that we want people dedicated to beta testing
rather than to new features (which are always more fun).

My guess is that the current state of affairs is not working that well:
forcing people to concentrate on stabilizing the current beta will push
them to procrastinate if that's not what they want to do. If instead
they work some more on their next patch, what do we lose?

Now, running release management in parallel with the first one or two
commitfests of the next cycle would mean fewer resources to review and
commit those, but maybe a better overall average.

The advantages of doing so, if not already clear, are that development
never stops, and it's even easier to return a patch in the last
commitfest: we know when the next one begins.

Frankly, forcing people into release management and quality assurance
when they do not want to do it does not sound like the best way to ship
the best stable code possible at .0 time. And opening a commitfest where
it's possible nothing will get committed (only reviewed) because
resources are not available sounds only fair: if you really want your
stuff committed, go help stabilize the beta.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Congrats Alvaro!

2010-01-10 Thread Dimitri Fontaine

Congratulations, and all the best to the whole family!

Devrim GÜNDÜZ dev...@gunduz.org writes:

 Alvaro, one of our hackers and committers and my colleague more than 4
 years, had a new baby today. 

 Congrats Alvaro for his second daughter !
   
 -committers, please commit your patches for our new baby elephant!

-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mailing list archiver chewing patches

2010-01-10 Thread Dimitri Fontaine
Andrew Dunstan and...@dunslane.net writes:
 That is assuming that the MUA gives you the option of specifying the
 attachment MIME type. Many (including mine) do not. It would mean an extra
 step - I'd have to gzip each patch or something like that. That would be
 unfortunate,as well as imposing extra effort, because it would make the
 patch not display inline in many MUAs (again, like mine).

Bad MUA, change MUA, as they say…

More seriously though, it's not the first time we've had some
difficulties with the MHonArc setup, and I think it's also related to
the poor thread following on the archives website at month boundaries.

MHonArc (http://hydra.nac.uci.edu/indiv/ehood/mhonarc.html) seems to be
about converting the mails into HTML pages and offering a web interface
to use them, with some indexing and search facilities.

Are our indexing and search provided by MHonArc, or maintained by the
community? How helpful would it be to consider alternatives such as AOX
(which runs atop PostgreSQL and would offer an anonymous IMAP facility
over the archives)?

Of course it'll boil down to who's maintaining the current solution and
how much time is allocated to this; the research and migration of a new
solution would have to fit in there, I suppose. Same as pgfoundry. But
still, should we talk about it?

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add .gitignore files to CVS?

2010-01-10 Thread Dimitri Fontaine
Hi,

Another occasion to show my ignorance; I couldn't resist!

Tom Lane t...@sss.pgh.pa.us writes:
  What you're
 talking about would require a great deal more maintenance effort, and
 I don't see the point compared to using a VPATH build.

I've discovered VPATH builds pretty recently, in the context of
packaging extensions. The concept is simple but it could be helpful if
spelled out in simple terms: VPATH is about finding the sources.

So you can build wherever you want without messing up your checkout: you
run the build from the target directory, and VPATH tells make where the
sources are.
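
A minimal sketch, assuming the sources live in ~/src/postgresql and the
build happens in a scratch directory (all paths made up):

  $ mkdir -p ~/build/pg && cd ~/build/pg
  $ ~/src/postgresql/configure --enable-debug   # configure records the VPATH
  $ make                                        # objects land here, not in ~/src

The checkout stays pristine, so one source tree can serve several builds.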

HTH, regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mailing list archiver chewing patches

2010-01-11 Thread Dimitri Fontaine
Alvaro Herrera alvhe...@commandprompt.com writes:
 Absolutely.  The month boundary problem boils down to the fact that
 Mhonarc does not scale very well, so we can't have mboxes that are too
 large.  This is why most people split their archives per month, and then
 each month is published as an independent Mhonarc output archive.  It's
 a horrid solution.

 Are our indexing and searches provided by MHonArc or maintained by the
 community?

 Searches are completely external to mhonarc.

Changing the MHonArc solution would probably mean adapting them, I
guess, or proposing a new solution with compatible output so that the
searching still works…

 How helpful would it be to consider alternatives, such as AOX (which
 runs atop PostgreSQL and would offer an anonymous IMAP facility over
 the archives)?

 Of course it'll boil down to who's maintaining the current solution and
 how much time is allocated to this; the solution research and migration
 would have to fit in there, I suppose. Same as pgfoundry. But still,
 should we talk about it?

 There's some talk about writing our own archiving system,
 database-backed.  There have been a few false starts but no concrete
 result so far.  We need a lot more manpower invested in this problem.
 If there's interest, let's talk about it.

AOX is already a database-backed email solution, offering an archive
page with searching. I believe the searching is backed by tsearch
indexing. That's why I think it'd be suitable.

They already archive and offer search over one of our mailing lists, and
from there it seems like we'd only miss the user interface bits:

  http://archives.aox.org/archives/pgsql-announce

I hope the UI bits are not the most time-demanding part.

Is there someone with enough time to install aox somewhere and have it
subscribed to our lists?

 My daughter was born yesterday and I'm having a bit of a calm before the
 storm because she's not coming home until Tuesday or so (at this time of
 the day, that is, because I have to take care of the other daughter).
 I'll be probably away for (at least) a week when she does; and I'll
 probably have somewhat of a shortage of spare time after that.

Ahaha :)
IME it's not the shortage of spare time that ruins you the most so much
as the lack of energy when you do get that little precious resource
again, a few pieces of it at a time.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mailing list archiver chewing patches

2010-01-11 Thread Dimitri Fontaine
Magnus Hagander mag...@hagander.net writes:
 As for AOX, my understanding is that it is no longer maintained, so
 I'd be worried about choosing such a solution for a complex problem.
 But it's open for discussion.

Ouch.
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mailing list archiver chewing patches

2010-01-11 Thread Dimitri Fontaine
Dave Page dp...@pgadmin.org writes:
 I recall having tried AOX a long time ago but I can't remember the reason
 why I was not satisfied. I guess I can give it another try by setting
 up a test ML archive.

 I tried it too, before I started writing the new prototype archiver
 from scratch. I too forget why I gave up on it, but it was a strong
 enough reason for me to start coding from scratch.

 BTW, we only need to replace the archiver/display code. The search
 works well already.

What does the current archiver look like? A PG database containing the
raw mails and attachments? If that's the case, the missing piece would
be to plug a browsing UI on top of that, right?

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mailing list archiver chewing patches

2010-01-11 Thread Dimitri Fontaine
Dimitri Fontaine dfonta...@hi-media.com writes:
 Magnus Hagander mag...@hagander.net writes:
 As for AOX, my understanding is that it is no longer maintained, so
 I'd be worried about choosing such a solution for a complex problem.
 But it's open for discussion.

 Ouch.

It seems that the company backing the development is dead, but the
developers are still working on the product in their spare time. A new
release is ahead.

They're not working on the archive UI part.
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Compression Library and Usages

2010-01-11 Thread Dimitri Fontaine
Hi,

The topic came up on IRC, and it might be that the latest attempts at
using another library missed one of the offerings, namely FastLZ. It's
made to be quick rather than to minimize size; it's MIT-licensed, 551
lines of portable ANSI C code, already tested on a host of systems and
compilers.

  http://www.fastlz.org/
  http://www.fastlz.org/compilers.htm
  http://www.fastlz.org/benchmarks.htm
  http://code.google.com/p/fastlz/source/browse/trunk/fastlz.c

Example call sites:
  http://code.google.com/p/fastlz/source/browse/trunk/6pack.c
  http://code.google.com/p/fastlz/source/browse/trunk/6unpack.c

Maybe the only drawback is that it does not seem to be well maintained
any more, which could mean the code would have to be considered for
inclusion into the PostgreSQL tree. Still, I thought I'd mention the lib.

As far as the usage part is concerned, not only could TOAST benefit,
but some IRC users are talking about protocol-level compression. Should
this one be fast enough and an option on the other grounds, how open are
we to libpq compression?

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compression Library and Usages

2010-01-11 Thread Dimitri Fontaine
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Are the concerns from previous discussions off-base?
  
 http://archives.postgresql.org/pgsql-performance/2009-08/msg00053.php

I knew I was forgetting about something, thanks for the reminder. Damn
it, patents.

-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mailing list archiver chewing patches

2010-01-12 Thread Dimitri Fontaine
Dave Page dp...@pgadmin.org writes:
 2010/1/12 Matteo Beccati p...@beccati.com:
 So, I've decided to spend a bit more time on this and here is a proof of 
 concept web app that displays mailing list archives reading from the AOX 
 database:

 http://archives.beccati.org/

 Seems to work.

Hehe, a nice beginning!

 So just to put this into perspective and give anyone paying attention
 an idea of the pain that lies ahead should they decide to work on
 this:

 - We need to import the old archives (of which there are hundreds of
 thousands of messages, the first few years of which have, umm, minimal
 headers).

Does anyone have a local copy of these in their mailboxes? At some point
there was an NNTP gateway, so maybe there's a copy that way.

 - We need to generate thread indexes

We have CTEs :)
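
For instance, a minimal, untested sketch, assuming a simplified
messages(message_id, in_reply_to, subject) table (the real schema will
differ), walking each thread down from its root:

  WITH RECURSIVE thread AS (
    SELECT message_id, subject, 1 AS depth
      FROM messages
     WHERE in_reply_to IS NULL            -- thread roots
    UNION ALL
    SELECT m.message_id, m.subject, t.depth + 1
      FROM messages m
      JOIN thread t ON m.in_reply_to = t.message_id
  )
  SELECT * FROM thread;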

 - We need to re-generate the original URLs for backwards compatibility

I guess the message-id one ain't the tricky one... and it should be
possible to fill a relation table like
  mhonarc_compat(message_id, list, year, month, message_number);

Then we'd need some help from the webserver (rewrite rules I guess) so
that the current URL is transformed to call a catch-all script:

   http://archives.postgresql.org/pgsql-xxx/YYYY-MM/msg01234.php
-> http://archives.postgresql.org/compat.php?l=xxx&y=YYYY&m=MM&n=01234
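
With Apache, a hypothetical RewriteRule along those lines (the URL
layout and parameter names are just what the compat table above
assumes):

  RewriteRule ^/pgsql-([^/]+)/([0-9]{4})-([0-9]{2})/msg([0-9]+)\.php$ \
      /compat.php?l=$1&y=$2&m=$3&n=$4 [L]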

In that compat.php script you then issue the following query, or the
like, to get the message_id, then use the newer infrastructure to
display it:

  SELECT message_id
    FROM mhonarc_compat
   WHERE list = ? and year = ? and month = ? and message_number = ?;

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] damage control mode

2010-01-12 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 I agree.  My main concern in terms of dealing with these outstanding
 is that it will distract us, particularly Tom, from stabilizing the
 tree, especially HS, VF, and SR.  If the tree were in a releasable
 state today I wouldn't be worrying about it.

You sound like you want to drop the last Commit Fest and prepare beta
instead.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mailing list archiver chewing patches

2010-01-12 Thread Dimitri Fontaine
Aidan Van Dyk ai...@highrise.ca writes:

 I'll note that the whole idea of an email archive interface might be a
 very good advocacy project as well.  AOX might not be a perfect fit,
 but it could be a good learning experience... Really, all the PG mail
 archives need is:

 1) A nice normalized DB schema representing mail messages and their
    relations to other messages and recipients (or folders)

We're now hoping that this one will fit:

  http://www.archiveopteryx.org/schema

 2) An injector that can parse an email message, decompose it into
    the various parts/tables of the DB schema, and insert it

aox has that either as a bulk importer or as an MDA.

 3) A nice set of SQL queries to return message, parts, threads,
folders based on $criteria (search, id, folder, etc)

I guess Matteo's working on that…

 4) A web interface to view the messages/thread/parts #3 returns

And that too.

 The largest part of this is #1, but a good schema would be a very good
 candidate to show off some of PG's more powerful features in a way that
 others could see (like the movie store sample somewhere), such as:
   1) full text search
   2) text vs bytea handling (thinking of all the mime parts, and encoding,
  etc)
   3) CTEs, ltree, recursion, etc, for threading/searching
   4) Triggers for materialized views (for quick threading/folder queries)
   5) expression indexes

And Tsearch, too, maybe. Oh, and pg_trgm might be quite good at
providing suggestions as you type, or "Did you mean?" stuff.
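
A quick, untested sketch of the latter, assuming contrib/pg_trgm is
installed and a hypothetical messages table with a subject column:

  -- trigram index to make fuzzy subject matching fast
  CREATE INDEX messages_subject_trgm
      ON messages USING gist (subject gist_trgm_ops);

  -- % is pg_trgm's similarity operator: misspellings still match
  SELECT subject, similarity(subject, 'advisry locks') AS sml
    FROM messages
   WHERE subject % 'advisry locks'
   ORDER BY sml DESC
   LIMIT 5;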

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mailing list archiver chewing patches

2010-01-12 Thread Dimitri Fontaine
Aidan Van Dyk ai...@highrise.ca writes:
 aox has that either as a bulk importer or as an MDA.

 Yup, LMTP is ideally suited for that too.

Yes.

  3) A nice set of SQL queries to return message, parts, threads,
 folders based on $criteria (search, id, folder, etc)
 
 I guess Matteo's working on that…

 Right, but this is where I want to see the AOX schema improve... In
 ways like adding persistent tables for threading, which are updated by
 triggers as new messages are delivered, etc.  Documented queries that
 show how to use CTEs, ltree, etc to get threaded views, good FTS support
 (with indexes and triggers managing them), etc.

+1.

I just didn't understand how much your proposal fits into the current work :)
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] per-user pg_service.conf

2010-01-13 Thread Dimitri Fontaine
Peter Eisentraut pete...@gmx.net writes:
 I was surprised/annoyed to find out that there is no way to have
 per-user pg_service.conf, something like ~/.pg_service.conf (well,
 except by export PGSYSCONFDIR).  That would be easy to add.  Comments?

+1.

I'll use it the day it exists.
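
Presumably it would use the same INI-style format as the system-wide
pg_service.conf; a made-up example:

  # hypothetical ~/.pg_service.conf
  [prod]
  host=db.example.com
  port=5432
  dbname=myapp
  user=dim

  $ psql "service=prod"     # or: PGSERVICE=prod psql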
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] segmentation fault in function

2010-01-14 Thread Dimitri Fontaine
Sergej Galkin sergej.gal...@gmail.com writes:

 I am implementing a GiST index and hit a bug that crashes the DB. I've
 debugged my program as Robert (thanks!) advised me, and I know which
 procedure crashed.

Using gdb you should get the line number in the source code and should
be able to look up the variable values. For that you need a custom
PostgreSQL build configured with --enable-cassert and --enable-debug.
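
Something like this, give or take your paths (the PID is the backend
serving your session, e.g. from SELECT pg_backend_pid()):

  $ ./configure --enable-debug --enable-cassert && make && make install
  $ gdb -p 12345              # attach to that backend
  (gdb) continue              # then trigger the crash from psql
  (gdb) backtrace             # file:line for every frame
  (gdb) frame 2
  (gdb) print *entry          # hypothetical variable: inspect your state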

Then report some more details if you still need help.

Also have a look at Gevel to be able to inspect your index:
  http://www.sai.msu.su/~megera/wiki/Gevel

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mailing list archiver chewing patches

2010-01-14 Thread Dimitri Fontaine
Matteo Beccati p...@beccati.com writes:
 I've extended AOX with a trigger that takes care of filling a separate table
 that's used to display the index pages. The new table also stores threading
 information (standard headers + Exchange headers support) and whether or not
 the email has attachments.

 Please check the updated PoC: http://archives.beccati.org/

Looks pretty good, even if some threads are still separated (this one
for example), and the ordering looks strange.

Seems to be right on track, that said :)

Thanks for your work,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch to implement ECPG side tracing / tracking ...

2010-01-14 Thread Dimitri Fontaine
Michael Meskes mes...@postgresql.org writes:
 On Wed, Jan 13, 2010 at 10:30:32PM +0100, Hans-Juergen Schoenig wrote:
 performance tune your precompiler application. in PostgreSQL it is
 currently a little hard to get from the log what is executed how
 often by which application in which speed and so on. so, we came up

 Hard or impossible? I agree with the other replies that this looks more
 like functionality you'd want in the server rather than the client.

PgFouine partly answers that, and with application_name in 8.5 it
should improve further:

  http://pgfouine.projects.postgresql.org/
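
The idea being that each client tags its own sessions so the log
analysis can group by application; a sketch (names made up):

  -- in the client, right after connecting:
  SET application_name = 'billing-batch';

  -- and in postgresql.conf, so the log lines carry the tag:
  -- log_line_prefix = '%t [%p] %a '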

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mailing list archiver chewing patches

2010-01-14 Thread Dimitri Fontaine
Matteo Beccati p...@beccati.com writes:
 WITH RECURSIVE t (mailbox, uid, date, subject, sender, has_attachments,
                   parent_uid, idx, depth) AS (
   SELECT mailbox, uid, date, subject, sender, has_attachments,
          parent_uid, uid::text, 1
     FROM arc_messages
    WHERE parent_uid IS NULL AND mailbox = 15
   UNION ALL
   SELECT a.mailbox, a.uid, a.date, a.subject, a.sender,
          a.has_attachments, a.parent_uid,
          t.idx || '.' || a.uid::text, t.depth + 1
     FROM t JOIN arc_messages a USING (mailbox)
    WHERE t.uid = a.parent_uid
 ) SELECT * FROM t ORDER BY idx

 Any improvements to sorting are welcome :)

What I'd like would be to have it sorted by activity, showing first the
threads which received the latest messages. I've yet to play with CTEs
and window functions myself, so without a database example to play with
I won't come up with a nice query, but I guess a more educated reader
will solve this without breaking a sweat, as it looks easier than
sudoku solving, which has been done already :)
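
A rough, untested attempt on top of Matteo's query, just to make the
idea concrete: carry the thread root down the recursion, then rank
whole threads by their newest message with a window function:

  WITH RECURSIVE t AS (
    SELECT uid, date, subject, uid AS root, uid::text AS idx
      FROM arc_messages
     WHERE parent_uid IS NULL AND mailbox = 15
    UNION ALL
    SELECT a.uid, a.date, a.subject, t.root,
           t.idx || '.' || a.uid::text
      FROM arc_messages a
      JOIN t ON a.parent_uid = t.uid
     WHERE a.mailbox = 15
  )
  SELECT uid, date, subject
    FROM (SELECT t.*,
                 max(date) OVER (PARTITION BY root) AS last_activity
            FROM t) s
   ORDER BY last_activity DESC, root, idx;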

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming replication, retrying from archive

2010-01-14 Thread Dimitri Fontaine
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 If we don't fix that within the server, we will need to document that
 caveat and every installation will need to work around that one way or
 another. Maybe with some monitoring software and an automatic restart. Ugh.

 I wasn't really asking if it's possible to fix, I meant Let's think
 about *how* to fix that.

Did I mention my viewpoint on that already?
  http://archives.postgresql.org/pgsql-hackers/2009-07/msg00943.php

It could well be I'm talking about things that have no relation at all
to what is in the patch currently, and that make no sense for where we
want the patch to go. But I'd like to know about that so that I'm not
banging my head on the nearest wall each time the topic surfaces.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXPLAIN, utility statement parameters, and recent plpgsql changes

2010-01-14 Thread Dimitri Fontaine

Tom Lane t...@sss.pgh.pa.us writes:
 This works well enough for regular DML statements, but it falls down for
 EXPLAIN which is a utility statement, because *parse analysis of utility
 statements doesn't do anything*.  EXPLAIN actually does the parse
 analysis of its contained statement at the beginning of execution.
 And that is too late, in the scenario Pavel exhibited.  Why is it too
 late?  Because SPI_cursor_open_internal() intentionally freezes the
 ParamListInfo struct after doing initial parsing: what it copies into
 the cursor portal is just a static list of data values without the
 parser hooks (see copyParamList).

Would it make any sense for this function to get to call the hook in
the case where a utility statement is being processed?

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

