Re: [HACKERS] Posix Shared Mem patch
On Jun 27, 2012, at 7:34 AM, Robert Haas wrote: On Wed, Jun 27, 2012 at 12:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: So, here's a patch. Instead of using POSIX shmem, I just took the expedient of using mmap() to map a block of MAP_SHARED|MAP_ANONYMOUS memory. The sysv shm is still allocated, but it's just a copy of PGShmemHeader; the real shared memory is the anonymous block. This won't work if EXEC_BACKEND is defined so it just falls back on straight sysv shm in that case. Um. I hadn't thought about the EXEC_BACKEND interaction, but that seems like a bit of a showstopper. I would not like to give up the ability to debug EXEC_BACKEND mode on Unixen. Would Posix shmem help with that at all? Why did you choose not to use the Posix API, anyway? It seemed more complicated. If we use the POSIX API, we've got to have code to find a non-colliding name for the shm, and we've got to arrange to clean it up at process exit. Anonymous shm doesn't require a name and goes away automatically when it's no longer in use. With respect to EXEC_BACKEND, I wasn't proposing to kill it, just to make it continue to use a full-sized sysv shm. I solved this by unlinking the posix shared memory segment immediately after creation. The file descriptor to the shared memory is inherited, so, by definition, only the postmaster children can access the memory. This ensures that shared memory cleanup is immediate after the postmaster and all children close, as well. The fcntl locking is not required to protect the posix shared memory- it can protect itself. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
On Jun 26, 2012, at 5:44 PM, Josh Berkus wrote: On that, I used to be of the opinion that this is a good compromise (a small amount of interlock space, plus mostly posix shmem), but I've heard since then (I think via AgentM indirectly, but I'm not sure) that there are cases where even the small SysV segment can cause problems -- notably when other software tweaks shared memory settings on behalf of a user, but only leaves just-enough for the software being installed. This is most likely on platforms that don't have a high SysV shmem limit by default, so installers all feel the prerogative to increase the limit, but there's no great answer for how to compose a series of such installations. It only takes one installer that says whatever, I'm just catenating stuff to sysctl.conf that works for me to sabotage Postgres' ability to start. Personally, I see this as rather an extreme case, and aside from AgentM himself, have never run into it before. Certainly it would be useful to not need SysV RAM at all, but it's more important to get a working patch for 9.3. This can be trivially reproduced if one runs an old (SysV shared memory-based) postgresql alongside a potentially newer postgresql with a smaller SysV segment. This can occur with applications that bundle postgresql as part of the app. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
On Jun 26, 2012, at 6:12 PM, Daniel Farina wrote: (Emphasis mine). I don't think that -hackers at the time gave the zero-shmem rationale much weight (I also was not that happy about the safety mechanism of that patch), but upon more reflection (and taking into account *other* software that may mangle shmem settings) I think it's something at least worth thinking about again one more time. What killed the patch was an attachment to the deemed-less-safe strategy for avoiding bogus shmem attachments already in it, but I don't seem to recall anyone putting a whole lot of thought at the time into the zero-shmem case from what I could read on the list, because a small interlock with nattach seemed good-enough. I'm simply suggesting that for additional benefits it may be worth thinking about getting around nattach and thus SysV shmem, especially with regard to safety, in an open-ended way. Maybe there's a solution (like Robert's FIFO suggestion?) that is not too onerous and can satisfy everyone. I solved this via fcntl locking. I also set up gdb to break in critical regions to test the interlock and I found no flaw in the design. More eyes would be welcome, of course. https://github.com/agentm/postgres/tree/posix_shmem Cheers, M
Re: [HACKERS] Posix Shared Mem patch
On 06/26/2012 07:30 PM, Tom Lane wrote: A.M. age...@themactionfaction.com writes: On Jun 26, 2012, at 6:12 PM, Daniel Farina wrote: I'm simply suggesting that for additional benefits it may be worth thinking about getting around nattach and thus SysV shmem, especially with regard to safety, in an open-ended way. I solved this via fcntl locking. No, you didn't, because fcntl locks aren't inherited by child processes. Too bad, because they'd be a great solution otherwise. You claimed this last time and I replied: http://archives.postgresql.org/pgsql-hackers/2011-04/msg00656.php I address this race condition by ensuring that a lock-holding violator is the postmaster or a postmaster child. If such a condition is detected, the child exits immediately without touching the shared memory. POSIX shmem is inherited via file descriptors. This is possible because the locking API allows one to request which PID violates the lock. The child expects the lock to be held and checks that the PID is the parent. If the lock is not held, that means that the postmaster is dead, so the child exits immediately. Cheers, M
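The F_GETLK check described above -- asking the kernel which PID would block an exclusive lock -- can be sketched as follows. This is a minimal illustration of the mechanism, not the patch itself; the file path and helper name are invented:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sketch: the parent (standing in for the postmaster) holds a read
 * lock; a second process probes with F_GETLK, which reports the PID of
 * the conflicting lock holder. Note that fcntl locks are NOT inherited
 * across fork(), which is why the child must probe for itself. */
static pid_t demo_lock_owner(void)
{
    char path[] = "/tmp/pg_lock_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_RDLCK;             /* shared lock on the whole file */
    fl.l_whence = SEEK_SET;
    if (fcntl(fd, F_SETLK, &fl) < 0)
        return -1;

    int pipefd[2];
    pipe(pipefd);
    pid_t child = fork();
    if (child == 0)
    {
        /* a would-be second postmaster asks who would block a write lock */
        int fd2 = open(path, O_RDWR);
        struct flock probe;
        memset(&probe, 0, sizeof(probe));
        probe.l_type = F_WRLCK;
        probe.l_whence = SEEK_SET;
        fcntl(fd2, F_GETLK, &probe);  /* fills in probe.l_pid on conflict */
        write(pipefd[1], &probe.l_pid, sizeof(pid_t));
        _exit(0);
    }
    pid_t owner = -1;
    read(pipefd[0], &owner, sizeof(pid_t));
    waitpid(child, NULL, 0);
    unlink(path);
    return owner;                    /* PID of the lock holder (the parent) */
}
```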
Re: [HACKERS] Posix Shared Mem patch
On 06/26/2012 07:15 PM, Alvaro Herrera wrote: Excerpts from Tom Lane's message of mar jun 26 18:58:45 -0400 2012: Even if you actively try to configure the shmem settings to exactly fill shmmax (which I concede some installation scripts might do), it's going to be hard to do because of the 8K granularity of the main knob, shared_buffers. Actually it's very easy -- just try to start postmaster on a system with not enough shmmax and it will tell you how much shmem it wants. Then copy that number verbatim in the config file. This might fail on picky systems such as MacOSX that require some exact multiple or power of some other parameter, but it works fine on Linux. Except that we have to account for other installers. A user can install an application in the future which clobbers the value and then the original application will fail to run. The options to get the first app working are: a) to re-install the first app (potentially preventing the second app from running) b) to have the first app detect the failure and readjust the value (guessing what it should be) and potentially forcing a reboot c) to have the user manually adjust the value and potentially force a reboot The failure usually gets blamed on the first application. That's why we had to nuke SysV shmem. Cheers, M
Re: [HACKERS] pg_upgrade --logfile option documentation
On Mar 7, 2012, at 11:39 PM, Bruce Momjian wrote: On Thu, Mar 01, 2012 at 09:06:10PM -0500, Bruce Momjian wrote: OK, combining your and Robert's ideas, how about I have pg_upgrade write the server log to a file, and the pg_dump output to a file (with its stderr), and if pg_upgrade fails, I report the failure and mention those files. If pg_upgrade succeeds, I remove the files? pg_upgrade already creates temporary files that it removes on completion. OK, I have completed a rework of pg_upgrade logging. pg_upgrade had 4 logging options, -g, -G, -l, and -v, and still it wasn't possible to get useful logging. :-( What I have done with this patch is to remove -g, -G, and -l, and unconditionally write to 4 log files in the current directory (in addition to the 3 SQL files I already create). If pg_upgrade succeeds, the files are removed, but if it fails (or if the new -r/retain option is used), the files remain. Here is a sample failure when I create a plpgsql function in the old server, but truncate plpgsql.so in the new server: It looks like the patch will overwrite the logs in the current working directory, for example, if pg_upgrade is run twice in the same place. Is that intentional? I had imagined that the logs would have been dumped in the /tmp directory so that one can compare results if the first pg_upgrade run had been errant. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade --logfile option documentation
On Mar 8, 2012, at 10:00 AM, Bruce Momjian wrote: Yes. I was afraid that continually appending to a log file on every run would be too confusing. I could do only appends, or number the log files, but those seemed confusing. the /tmp directory so that one can compare results if the first pg_upgrade run had been errant. You would have to copy the file to a new name before re-running pg_upgrade. The only reason I truncate them on start is that I am appending to them in many places in the code, and it was easier to just truncate them on start rather than to remember where I first write to them. mktemps?
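The mktemp suggestion above could look something like this -- a hypothetical sketch only (the template prefix is invented, and pg_upgrade does not actually do this): each run gets a uniquely named log file, so a second run cannot clobber the first run's logs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical: create a per-run log file with a unique name via
 * mkstemp(), which atomically creates the file and returns an fd. */
static int make_run_log(char *path_out, size_t len)
{
    char template_buf[] = "/tmp/pg_upgrade_server_XXXXXX";
    int fd = mkstemp(template_buf);
    if (fd < 0)
        return -1;
    strncpy(path_out, template_buf, len - 1);
    path_out[len - 1] = '\0';
    return fd;
}
```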
Re: [HACKERS] pg_upgrade --logfile option documentation
On Mar 8, 2012, at 4:37 PM, Peter Eisentraut wrote: On tor, 2012-03-08 at 10:06 -0500, A.M. wrote: The only reason I truncate them on start is that I am appending to them in many places in the code, and it was easier to just truncate them on start rather than to remember where I first write to them. mktemps? I don't want to see some tool unconditionally writing files (log or otherwise) with unpredictable names. That would make it impossible to clean up in a wrapper script. The point of writing temp files to the /tmp/ directory is that they don't need to be cleaned up. You really prefer having log files written to your current working directory? I don't know of any utility that pollutes the cwd like that- it seems like an easy way to forget where one left the log files. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade --logfile option documentation
On Feb 29, 2012, at 6:02 PM, Bruce Momjian wrote: On Wed, Feb 29, 2012 at 04:34:24PM -0500, Robert Haas wrote: On Tue, Feb 28, 2012 at 9:45 PM, Bruce Momjian br...@momjian.us wrote: OK, I have implemented both Robert's and Àlvaro's ideas in my patch. I only add the .old suffix to pg_controldata when link mode is used, and I now do it after the schema has been created (the most common failure case for pg_upgrade), and just before we actually link files --- both very good ideas. Thanks for working on this. I think this will be a significant usability improvement. Glad I got such good feedback and ideas. Any ideas about improving the error reporting more generally, so that when reloading the dump fails, the user can easily see what went belly-up, even if they didn't use -l? The only idea I have is to write the psql log to a temporary file and report the last X lines from the file in case of failure. Does that help? Perhaps pg_upgrade can print the path to the temp file containing the log and instruct the user to look there for more detail. Cheers, M
Re: [HACKERS] leakproof
On Feb 26, 2012, at 10:39 AM, Peter Eisentraut wrote: On ons, 2012-02-22 at 10:56 -0500, Andrew Dunstan wrote: The trouble with leakproof is that it doesn't point to what it is that's not leaking, which is information rather than memory, as many might imagine (and I did) without further hints. I'm not sure any single English word would be as descriptive as I'd like. Well, we have RETURNS NULL ON NULL INPUT, so maybe DOES NOT LEAK INFORMATION. ;-) If you are willing to go full length, then the computer science term is referential transparency, no? http://en.wikipedia.org/wiki/Referential_transparency_(computer_science) So a function could be described as REFERENTIALLY TRANSPARENT. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
On Jan 23, 2012, at 2:49 PM, Tom Lane wrote: Marko Kreen mark...@gmail.com writes: [ bytea_output doesn't need to be GUC_REPORT because format is autodetectable ] Fair enough. Anyway we're really about two years too late to revisit that. Btw, it does not seems that per-request metainfo change requires major version. It just client can send extra metainfo packet before bind+execute, if it knows server version is good enough. That is nonsense. You're changing the protocol, and then saying that clients should consult the server version instead of the protocol version to know what to do. 2. Can we postpone minor data format changes on the wire until there is proper way for clients to request on-the-wire formats? I think that people are coming around to that position, ie, we need a well-engineered solution to the versioning problem *first*, and should not accept incompatible minor improvements until we have that. One simple way clients could detect the binary encoding at startup would be to pass known test parameters and match against the returned values. If the client cannot match the response, then it should choose the text representation. Alternatively, the 16-bit int in the Bind and RowDescription messages could be incremented to indicate a new format and then clients can specify the highest version of the binary format which they support. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
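The versioned-format idea above could reduce to something like the following hypothetical helper (this is not part of any actual protocol; the function and its semantics are invented for illustration): client and server each advertise the highest format version they support, and the connection uses the minimum of the two, falling back to the text representation when nothing else matches.

```c
/* Hypothetical: if the 16-bit result-format code in Bind/RowDescription
 * were treated as a version, both sides could settle on the highest
 * format both support; 0 (text) is always supported. */
static int negotiate_format(int client_max, int server_max)
{
    int v = client_max < server_max ? client_max : server_max;
    return v > 0 ? v : 0;   /* 0 = text representation, the safe fallback */
}
```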
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
On Jan 23, 2012, at 4:45 PM, Merlin Moncure wrote: On Mon, Jan 23, 2012 at 2:00 PM, A.M. age...@themactionfaction.com wrote: One simple way clients could detect the binary encoding at startup would be to pass known test parameters and match against the returned values. If the client cannot match the response, then it should choose the text representation. Alternatively, the 16-bit int in the Bind and RowDescription messages could be incremented to indicate a new format and then clients can specify the highest version of the binary format which they support. Prefer the version. But why send this over and over with each bind? Wouldn't you negotiate that when connecting? Most likely, optionally, doing as much as you can from the server version? Personally I'm not really enthusiastic about a solution that adds a non-avoidable penalty to all queries. Also, a small nit: this problem is not specific to binary formats. Text formats can and do change, albeit rarely, with predictable headaches for the client. I see no reason to deal with text/binary differently. The only difference between text/binary wire formats in my eyes is that the text formats are documented. merlin In terms of backwards compatibility (to support the widest range of clients), wouldn't it make sense to freeze each format option? That way, an updated text version could also assume a new int16 format identifier. The client would simply pass its preferred format. This could also allow for multiple in-flight formats; for example, if a client anticipates a large in-bound bytea column, it could specify format X which indicates the server should gzip the result before sending. That same format may not be preferable on a different request. Cheers, M
Re: [HACKERS] POSIX shared memory patch status
On Jun 16, 2011, at 11:51 AM, Heikki Linnakangas wrote: What's the current state of the POSIX shared memory patch? I grabbed the patch from http://archives.postgresql.org/message-id/d9edacf7-53f1-4355-84f8-2e74cd19d...@themactionfaction.com and it doesn't seem to apply cleanly any more. Are you planning to continue working on it? If I understood the conclusion of the discussions correctly, the current plan is to continue using a small SysV shared memory segment for the interlock, and POSIX shared memory for the rest. That lets us stay below SHMMAX even if it's small, which is convenient for admins. Was there a conclusion on whether we should use fcntl() to provide some extra safety in the current interlock mechanism? I'm feeling that that should probably be split off to a separate patch, it would be easier to review separately. Hello, Please try a merge from my github branch: https://github.com/agentm/postgres/tree/posix_shmem I don't believe any conclusions were reached because the debate concerned whether or not fcntl locking was sufficient. I thought so while others pointed out that the proposed interlock would not work with multi-client NFSv3 despite the fact that the current interlock doesn't either. I originally made the patch because SysV memory sometimes requires reboots which is especially annoying when expanding an existing production db server. Even if the next version of postgresql incorporates a hybrid SysV/POSIX shmem setup, reboots may still be required if one runs any other processes requiring SysV shmem (such as older versions of postgresql). In any case, I lost interest in maintaining the patch and would not object to having the patch removed from the CommitFest. Cheers, M
Re: [HACKERS] Process wakeups when idle and power consumption
On May 5, 2011, at 4:08 PM, Alvaro Herrera wrote: Excerpts from Peter Geoghegan's message of jue may 05 16:49:25 -0300 2011: I'll need to take a look at statistics, autovacuum and Logger processes too, to see if they present more subtle opportunities for reduced idle power consumption. More subtle? Autovacuum wakes up once per second and it could sleep a lot longer if it weren't for the loop that checks for signals. I think that could be improved a lot. Could kqueue be of use here? Non-kqueue-supporting platforms could always fall back to the existing select(). Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 13, 2011, at 9:30 PM, Robert Haas wrote: On Wed, Apr 13, 2011 at 6:11 PM, A.M. age...@themactionfaction.com wrote: I don't see why we need to get rid of SysV shared memory; needing less of it seems just as good. 1. As long as one keeps SysV shared memory around, the postgresql project has to maintain the annoying platform-specific document on how to configure the poorly named kernel parameters. If the SysV region is very small, that means I can run more postgresql instances within the same kernel limits, but one can still hit the limits. My patch allows the postgresql project to delete that page and the hassles with it. 2. My patch proves that SysV is wholly unnecessary. Are you attached to it? (Pun intended.) With all due respect, I think this is an unproductive conversation. Your patch proves that SysV is wholly unnecessary only if we also agree that fcntl() locking is just as reliable as the nattch interlock, and Tom and I are trying to explain why we don't believe that's the case. Saying that we're just wrong without responding to our points substantively doesn't move the conversation forward. Sorry- it wasn't meant to be an attack- just a dumb pun. I am trying to argue that, even if the fcntl is unreliable, the startup procedure is just as reliable as it is now. The reasons being: 1) the SysV nattch method's primary purpose is to protect the shmem region. This is no longer necessary in my patch because the shared memory is unlinked immediately after creation, so only the initial postmaster and its children have access. 
2) the standard postgresql lock file remains the same Furthermore, there is indeed a case where the SysV nattch cannot work while the fcntl locking can indeed catch: if two separate machines have a postgresql data directory mounted over NFS, postgresql will currently allow both machines to start a postmaster in that directory because the SysV nattch check fails and then the pid in the lock file is the pid on the first machine, so postgresql will say starting anyway. With fcntl locking, this can be fixed. SysV only has presence on one kernel. In case it's not clear, here again is what we're concerned about: A System V shm *cannot* be removed until nobody is attached to it. A lock file can be removed, or the lock can be accidentally released by the apparently innocuous operation of closing a file descriptor. Both you and Tom have somehow assumed that the patch alters current postgresql behavior. In fact, the opposite is true. I haven't changed any of the existing behavior. The robust behavior remains. I merely added fcntl interlocking on top of the lock file to replace the SysV shmem check. This seems contradictory. If you replaced the SysV shmem check, then it's not there, which means you altered the behavior. From what I understood, the primary purpose of the SysV check was to protect the shared memory from multiple stompers. The interlock was a neat side-effect. The lock file contents are currently important to get the pid of a potential, conflicting postmaster. With the fcntl API, we can return a live conflicting PID (whether a postmaster or a stuck child), so that's an improvement. This could be used, for example, for STONITH, to reliably kill a dying replication clone- just loop on the pids returned from the lock. Even if the fcntl check passes, the pid in the lock file is checked, so the lock file behavior remains the same. If you were to implement a daemon with a shared data directory but no shared memory, how would implement the interlock? 
Would you still insist on SysV shmem? Unix daemons generally rely on lock files alone. Perhaps there is a different API on which we can agree. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 14, 2011, at 8:22 AM, Florian Weimer wrote: * Tom Lane: Well, the fundamental point is that ignoring NFS is not the real world. We can't tell people not to put data directories on NFS, and even if we did tell them not to, they'd still do it. And NFS locking is not trustworthy, because the remote lock daemon can crash and restart (forgetting everything it ever knew) while your own machine and the postmaster remain blissfully awake. Is this still the case with NFSv4? Does the local daemon still keep the lock state? The lock handling has been fixed in NFSv4. http://nfs.sourceforge.net/ NFS Version 4 introduces support for byte-range locking and share reservation. Locking in NFS Version 4 is lease-based, so an NFS Version 4 client must maintain contact with an NFS Version 4 server to continue extending its open and lock leases. http://linux.die.net/man/2/flock flock(2) does not lock files over NFS. Use fcntl(2) instead: that does work over NFS, given a sufficiently recent version of Linux and a server which supports locking. I would need some more time to dig up what recent version of Linux specifies, but NFSv4 is likely required. None of this is to say that an fcntl lock might not be a useful addition to what we do already. It is to say that fcntl can't just replace what we do already, because there are real-world failure cases that the current solution handles and fcntl alone wouldn't. If it requires NFS misbehavior (possibly in an older version), and you have to start postmasters on separate nodes (which you normally wouldn't do), doesn't this make it increasingly unlikely that it's going to be triggered in the wild? With the patch I offer, it would be possible to use shared storage and failover postgresql nodes on different machines over NFS. (The second postmaster blocks and waits for the lock to be released.) Obviously, such a setup isn't as strong as using replication, but given a sufficiently fail-safe shared storage setup, it could be made reliable. 
Cheers, M
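The blocking-handoff behavior described above (the second postmaster waiting in F_SETLKW until the first releases the lock) can be sketched as a toy two-process demo; names are invented and this compresses the handoff into one program purely for illustration:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sketch: the "standby" blocks in F_SETLKW until the "primary" releases
 * its lock (here released explicitly; process exit would release it
 * too), then takes over. */
static int demo_failover_handoff(void)
{
    char path[] = "/tmp/pg_failover_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fcntl(fd, F_SETLK, &fl);          /* "primary" takes the lock */

    int pipefd[2];
    pipe(pipefd);
    pid_t child = fork();
    if (child == 0)
    {
        /* "standby": blocks here until the primary's lock goes away */
        int fd2 = open(path, O_RDWR);
        struct flock wait_lock;
        memset(&wait_lock, 0, sizeof(wait_lock));
        wait_lock.l_type = F_WRLCK;
        wait_lock.l_whence = SEEK_SET;
        fcntl(fd2, F_SETLKW, &wait_lock);
        char ok = 1;
        write(pipefd[1], &ok, 1);     /* standby has taken over */
        _exit(0);
    }

    fl.l_type = F_UNLCK;              /* primary "dies": release the lock */
    fcntl(fd, F_SETLK, &fl);

    char got = 0;
    read(pipefd[0], &got, 1);
    waitpid(child, NULL, 0);
    unlink(path);
    return got;
}
```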
Re: [HACKERS] POSIX shared memory redux
On Apr 13, 2011, at 11:37 PM, Tom Lane wrote: A.M. age...@themactionfaction.com writes: 1. As long one keeps SysV shared memory around, the postgresql project has to maintain the annoying platform-specific document on how to configure the poorly named kernel parameters. No, if it's just a small area, I don't see that that's an issue. You're going to max out on other things (like I/O bandwidth) long before you run into the limit on how many postmasters you can have from this. The reason that those parameters are problematic now is that people tend to want *large* shmem segments and the typical defaults aren't friendly to that. That's assuming that no other processes on the system are using up the available segments (such as older postgresql instances). 2. My patch proves that SysV is wholly unnecessary. Are you attached to it? (Pun intended.) You were losing this argument already, but ad hominem attacks are pretty much guaranteed to get people to tune you out. I apologized to Robert Haas in another post- no offense was intended. There are real, substantive, unfixable reasons to not trust fcntl locking completely. ...on NFS which the postgresql community doesn't recommend anyway. But even in that case, the existing lock file (even without the fcntl lock), can catch that case via the PID in the file contents. That is what I meant when I claimed that the behavior remains the same. I would encourage you to take a look at the patch. Just to be perfectly clear: I have not read your patch, and am not likely to before the next commitfest starts, because I have approximately forty times too many things to do already. I'm just going off your own abbreviated description of the patch. But from what I know about fcntl locking, it's not a sufficient substitute for the SysV shmem interlock, because it has failure modes that the SysV interlock doesn't, and those failure modes occur in real-world cases. 
Yeah, it'd be nice to also be able to lock against other postmasters on other NFS clients, but we hardly ever hear of somebody getting burnt by the lack of that (and fcntl wouldn't be a bulletproof defense anyway). On the other hand, accidentally trying to start a duplicate postmaster on the same machine is an everyday occurrence. I really do appreciate the time you have put into feedback. I posed this question also to Robert Haas: is there a different API which you would find acceptable? I chose fcntl because it seemed well-suited for this task, but the feedback has been regarding NFS v4 concerns. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 13, 2011, at 8:36 PM, Robert Haas wrote: I don't see why we need to get rid of SysV shared memory; needing less of it seems just as good. 1. As long as one keeps SysV shared memory around, the postgresql project has to maintain the annoying platform-specific document on how to configure the poorly named kernel parameters. If the SysV region is very small, that means I can run more postgresql instances within the same kernel limits, but one can still hit the limits. My patch allows the postgresql project to delete that page and the hassles with it. 2. My patch proves that SysV is wholly unnecessary. Are you attached to it? (Pun intended.) In answer to your off-list question, one of the principal ways I've seen fcntl() locking fall over and die is when someone removes the lock file. You might think that this could be avoided by picking something important like pg_control as the lock file, but it turns out that doesn't really work: http://0pointer.de/blog/projects/locking.html Tom's point is valid too. Many storage appliances present themselves as an NFS server, so it's very plausible for the data directory to be on an NFS server, and there's no guarantee that flock() won't be broken there. If our current interlock were known to be unreliable also maybe we wouldn't care very much, but AFAICT it's been extremely robust. Both you and Tom have somehow assumed that the patch alters current postgresql behavior. In fact, the opposite is true. I haven't changed any of the existing behavior. The robust behavior remains. I merely added fcntl interlocking on top of the lock file to replace the SysV shmem check. If someone deletes the postgresql lock file over NFS, the data directory is equally screwed, but with my patch there is a chance that two machines sharing a properly-configured NFS mount can properly interlock- postgresql cannot offer that today, so this is a feature upgrade with no loss. The worst case scenario is today's behavior. 
My original goal remains to implement POSIX shared memory, but Tom Lane was right to point out that the current interlocking check relies on SysV, so, even though the startup locking is really orthogonal to shared memory, I implemented what could be considered a separate patch for that and rolled it into one. I would encourage you to take a look at the patch. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 11, 2011, at 6:06 PM, Robert Haas wrote: On Sun, Apr 10, 2011 at 5:03 PM, A.M. age...@themactionfaction.com wrote: To ensure that no two postmasters can startup in the same data directory, I use fcntl range locking on the data directory lock file, which also works properly on (properly configured) NFS volumes. Whenever a postmaster or postmaster child starts, it acquires a read (non-exclusive) lock on the data directory's lock file. When a new postmaster starts, it queries if anything would block a write (exclusive) lock on the lock file which returns a lock-holding PID in the case when other postgresql processes are running. This seems a lot leakier than what we do now (imagine, for example, shared storage) and I'm not sure what the advantage is. I was imagining keeping some portion of the data in sysv shm, and moving the big stuff to a POSIX shm that would operate alongside it. What do you mean by leakier? The goal here is to extinguish SysV shared memory for portability and convenience benefits. The mini-SysV proposal was implemented and shot down by Tom Lane. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 11, 2011, at 7:25 PM, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Sun, Apr 10, 2011 at 5:03 PM, A.M. age...@themactionfaction.com wrote: To ensure that no two postmasters can startup in the same data directory, I use fcntl range locking on the data directory lock file, which also works properly on (properly configured) NFS volumes. Whenever a postmaster or postmaster child starts, it acquires a read (non-exclusive) lock on the data directory's lock file. When a new postmaster starts, it queries if anything would block a write (exclusive) lock on the lock file which returns a lock-holding PID in the case when other postgresql processes are running. This seems a lot leakier than what we do now (imagine, for example, shared storage) and I'm not sure what the advantage is. BTW, the above-described solution flat out doesn't work anyway, because it has a race condition. Postmaster children have to reacquire the lock after forking, because fcntl locks aren't inherited during fork(). And that means you can't tell whether there's a just-started backend that hasn't yet acquired the lock. It's really critical for our purposes that SysV shmem segments are inherited at fork() and so there's no window where a just-forked backend isn't visible to somebody checking the state of the shmem segment. Then you haven't looked at my patch because I address this race condition by ensuring that a lock-holding violator is the postmaster or a postmaster child. If such a condition is detected, the child exits immediately without touching the shared memory. POSIX shmem is inherited via file descriptors. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 11, 2011, at 7:13 PM, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Apr 11, 2011 at 3:11 PM, A.M. age...@themactionfaction.com wrote: What do you mean by leakier? The goal here is to extinguish SysV shared memory for portability and convenience benefits. The mini-SysV proposal was implemented and shot down by Tom Lane. I mean I'm not convinced that fcntl() locking will be as reliable. I'm not either. Particularly not on NFS. (Although on NFS you have other issues to worry about too, like postmasters on different machines being able to reach the same data directory. I wonder if we should do both SysV and fcntl locking ...) Is there an example of a recent system where fcntl is broken (ignoring NFS)? I believe my patch addresses all potential race conditions and uses the APIs properly to guarantee single-postmaster data directory usage and I tested on Darwin and a two-year-old Linux kernel. In the end, fcntl locking relies on the same kernel which provides the SysV user count, so I'm not sure what makes it less reliable, but I have heard that twice now, so I am open to hearing about your experiences. I know Tom shot that down before, but I still think it's probably the best way forward. Did I? I think I pointed out that there's zero gain in portability as long as we still depend on SysV shmem to work. However, if you're doing it for other reasons than portability, it might make sense anyway. The question is whether there are adequate other reasons. I provided an example of postmaster-failover relying on F_SETLKW in the email with the patch. Also, as you point out above, fcntl locking at least has a chance of working over NFS. The advantage I see is that we would be able to more easily allocate larger chunks of shared memory without changing kernel parameters, Yes, getting out from under the SHMMAX bugaboo would be awfully nice. Yes, please! That is my primary motivation for this patch. 
and perhaps even to dynamically resize shared memory chunks. This I don't really believe will ever work reliably, especially not in 32-bit machines. Whatever your kernel API is, you still have the problem of finding address space contiguous to what you were already using. Even if expanding shmem involves copying large regions of memory, it could be at least useful to adjust buffer sizes live without a restart. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lowering privs in SECURITY DEFINER function
On Apr 8, 2011, at 7:20 PM, Alvaro Herrera wrote: Excerpts from A.M.'s message of mié abr 06 19:08:35 -0300 2011: That's really strange considering that the new role may not normally have permission to switch to the original role. How would you handle the case where the security definer role is not the super user? As I said to Jeff, it's up to the creator of the wrapper function to ensure that things are safe. Perhaps this new operation should only be superuser-callable, for example. How would you prevent general SQL attacks when manually popping the authentication stack is allowed? The popping and pushing operations would be restricted. You can only pop a single frame, and pushing it back before returning is mandatory. It might be worth thinking about extending this functionality to cover the case for connection pooling. If some SQL can re-tool an existing connection to have the properties of a new connection by a different role, then that would reduce the use-case of connection pooling. If that authorization chain can be pushed and popped by a password or some security token, for example, then that would cover the use cases I mention in this thread: http://archives.postgresql.org/pgsql-general/2006-04/msg00917.php Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lowering privs in SECURITY DEFINER function
On Apr 6, 2011, at 5:33 PM, Alvaro Herrera wrote: Hi, A customer of ours has for a long time the desire to be able to return to the previous privilege level (i.e. the caller privs) inside a SECURITY DEFINER function. I find that this notion is not at all covered in the SQL standard, yet the use case is certainly valid from a security-conscious point of view. (Consider, for example, that you may want to enable a user to run some operation to which he is authorized, but you want to carry out some privileged operation before/after doing so: for example, disable triggers, run an update, re-enable triggers.) An easy way to somewhat solve this problem is to provide another security definer function that calls the intermediate operation, owned by a role with lower privileges. But this doesn't really solve the problem, because you are then providing a way to return to an arbitrary role, not to the specific role that's calling the function. I think part of the solution here would be to be able to tell what's the previous role, i.e. the one just below the topmost stack item in the authorization stack. Then, at least you know what to call SET SESSION AUTHORIZATION to. Thoughts? This area seems fraught with security problems, yet it is a necessary piece of the security puzzle. That's really strange considering that the new role may not normally have permission to switch to the original role. How would you handle the case where the security definer role is not the super user? How would you prevent general SQL attacks when manually popping the authentication stack is allowed? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 14
On Feb 9, 2011, at 12:25 PM, Robert Haas wrote: On Wed, Feb 9, 2011 at 10:38 AM, Markus Wanner mar...@bluegap.ch wrote: On 02/09/2011 04:16 PM, David Fetter wrote: On Tue, Feb 08, 2011 at 09:09:48PM -0500, Robert Haas wrote: Frankly, I think this is an example of how our current shared memory model is a piece of garbage. What other model(s) might work better? Thread based, dynamically allocatable and resizeable shared memory, as most other projects and developers use, for example. Or less invasively, a small sysv shm to prevent the double-postmaster problem, and allocate the rest using POSIX shm. Such a patch was proposed and rejected: http://thread.gmane.org/gmane.comp.db.postgresql.devel.general/94791 Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] test_fsync label adjustments
On Jan 18, 2011, at 3:55 PM, Bruce Momjian wrote: I have modified test_fsync to use test labels that match wal_sync_method values, and added more tests for open_sync with different sizes. This should make the program easier for novices to understand. Here is a test run for Ubuntu 11.04:

$ ./test_fsync
2000 operations per test

Compare file sync methods using one 8k write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync (non-direct I/O)*    85.127 ops/sec
        open_datasync (direct I/O)         87.119 ops/sec
        fdatasync                          81.006 ops/sec
        fsync                              82.621 ops/sec
        fsync_writethrough                 n/a
        open_sync (non-direct I/O)*        84.412 ops/sec
        open_sync (direct I/O)             91.006 ops/sec

* This non-direct I/O mode is not used by Postgres.

I am curious how this is targeted at novices. A naive user might enable the fastest option, which could be exactly wrong. For this to be useful to novices, I suspect the tool will need to generate platform-specific suggestions, no? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] test_fsync label adjustments
On Jan 18, 2011, at 5:16 PM, Bruce Momjian wrote: A.M. wrote: On Jan 18, 2011, at 3:55 PM, Bruce Momjian wrote: I have modified test_fsync to use test labels that match wal_sync_method values, and added more tests for open_sync with different sizes. This should make the program easier for novices to understand. Here is a test run for Ubuntu 11.04: $ ./test_fsync 2000 operations per test Compare file sync methods using one 8k write: (in wal_sync_method preference order, except fdatasync is Linux's default) open_datasync (non-direct I/O)* 85.127 ops/sec open_datasync (direct I/O) 87.119 ops/sec fdatasync 81.006 ops/sec fsync 82.621 ops/sec fsync_writethrough n/a open_sync (non-direct I/O)* 84.412 ops/sec open_sync (direct I/O) 91.006 ops/sec * This non-direct I/O mode is not used by Postgres. I am curious how this is targeted at novices. A naive user might enable the fastest option which could be exactly wrong. For this to be useful to novices, I suspect the tool will need to generate platform-specific suggestions, no? Uh, why isn't the fastest option right for them? It is hardware/kernel specific when you run it --- how could it be better? Because the fastest option may not be syncing to disk. For example, the only option that makes sense on OS X is fsync_writethrough- it would be helpful if the tool pointed that out (on OS X only, obviously). Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] test_fsync label adjustments
On Jan 18, 2011, at 5:21 PM, Bruce Momjian wrote: A.M. wrote: On Jan 18, 2011, at 5:16 PM, Bruce Momjian wrote: A.M. wrote: On Jan 18, 2011, at 3:55 PM, Bruce Momjian wrote: I have modified test_fsync to use test labels that match wal_sync_method values, and added more tests for open_sync with different sizes. This should make the program easier for novices to understand. Here is a test run for Ubuntu 11.04: $ ./test_fsync 2000 operations per test Compare file sync methods using one 8k write: (in wal_sync_method preference order, except fdatasync is Linux's default) open_datasync (non-direct I/O)* 85.127 ops/sec open_datasync (direct I/O) 87.119 ops/sec fdatasync 81.006 ops/sec fsync 82.621 ops/sec fsync_writethrough n/a open_sync (non-direct I/O)* 84.412 ops/sec open_sync (direct I/O) 91.006 ops/sec * This non-direct I/O mode is not used by Postgres. I am curious how this is targeted at novices. A naive user might enable the fastest option which could be exactly wrong. For this to be useful to novices, I suspect the tool will need to generate platform-specific suggestions, no? Uh, why isn't the fastest option right for them? It is hardware/kernel specific when you run it --- how could it be better? Because the fastest option may not be syncing to disk. For example, the only option that makes sense on OS X is fsync_writethrough- it would be helpful if the tool pointed that out (on OS X only, obviously). Yes, that would be a serious problem. :-( I am not sure how we would address this --- your point is a good one. One general idea I had would be to offer some heuristics such as "this sync rate is comparable to that of one SATA drive" or "comparable to RAID 10 with X drives" or "this rate is likely too fast to actually be syncing". But then you are stuck with making sure that the heuristics are kept up-to-date, which would be annoying. Otherwise, the only option I see is to detect the kernel and compare against a list of known problematic methods. 
Perhaps it would be easier to compare against a whitelist. Also, the tool would likely need to parse mount output to account for problems with specific filesystems. I am just throwing around some ideas... Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] test_fsync label adjustments
On Jan 18, 2011, at 5:41 PM, Bruce Momjian wrote: A.M. wrote: Because the fastest option may not be syncing to disk. For example, the only option that makes sense on OS X is fsync_writethrough- it would be helpful if the tool pointed that out (on OS X only, obviously). Yes, that would be a serious problem. :-( I am not sure how we would address this --- your point is a good one. One general idea I had would be to offer some heuristics such as "this sync rate is comparable to that of one SATA drive" or "comparable to RAID 10 with X drives" or "this rate is likely too fast to actually be syncing". But then you are stuck with making sure that the heuristics are kept up-to-date, which would be annoying. That fails for RAID BBUs. Well, it's nothing more than a heuristic- it is still nice to know whether or not the fancy hardware RAID I just set up is similar to Josh Berkus' RAID setup or a single SATA drive (which would hint at a misconfiguration). As you said, perhaps a wiki is better for this. But a wiki won't integrate with this tool, which is why I would hesitate to point novices to this tool... should the tool point to the wiki? Otherwise, the only option I see is to detect the kernel and compare against a list of known problematic methods. Perhaps it would be easier to compare against a whitelist. Also, the tool would likely need to parse mount output to account for problems with specific filesystems. I am just throwing around some ideas... That sounds pretty complicated. One idea would be the creation of a wiki where people could post their results, or ideally a tool that could read the output and load it into a database for analysis with other results. The OS X example is pretty cut-and-dried- it would be nice if there were some kind of hints in the tool pointing in the right direction, or at least a few words of warning: the fastest option may not be the safest- read the docs. 
Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] making an unlogged table logged
On Jan 5, 2011, at 2:37 PM, Robert Haas wrote: On Wed, Jan 5, 2011 at 2:36 PM, Simon Riggs si...@2ndquadrant.com wrote: The lock strength selected on the master doesn't need to be the same as the lock strength on the standby. You could quite easily generate AEL lock records to send to standby, without actually taking that lock level on the master. True. Question: what does an unlogged table look like on the standby? Are they visible at all, does it throw an error, or do they just look empty. We probably need some docs in the HS section to explain that. I hope the answer isn't look empty since that is effectively data loss for people spreading queries across multiple nodes. Error. Hm- if the unlogged tables are being used as HTTP transient state storage, it would be handy to have that (admittedly non-essential) data on the standby when it becomes master, even if there are no guarantees surrounding the data beyond it looked like this at some point. Since the tables are not writing WAL, would it be possible to allow for writing to unlogged tables on the standby to allow for out-of-band syncing? Otherwise, it seems the only alternative is to push changes to a separate database on the standby machine and then suck the data in when it becomes master. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Nov 17, 2010, at 4:00 PM, Kevin Grittner wrote: Robert Haas robertmh...@gmail.com wrote: OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). 2. UNLOGGED (what this patch currently implements). 3. UNSYNCED (future work). 4. GLOBAL TEMPORARY (future work). 5. LOCAL TEMPORARY (our current temp tables). All of the above would have real uses in our shop. It's possible to imagine a few more stops on this hierarchy. Some of these might be slightly preferred over the above in certain circumstances, but that's getting down to fine tuning. I think the five listed above are more important than the speculative ones mentioned. I don't particularly care for the name UNSYNCED ... EVANESCENT? I'm starting not to like UNLOGGED much either ... EPHEMERAL? Actually, the UNSYNCED and UNLOGGED seem fairly clear. Unless one thinks that the types could be combined- perhaps a table declaration could use both UNLOGGED and UNSYNCED? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] create c function with void argument bug?
Hello, It seems that this:

CREATE OR REPLACE FUNCTION test_fsync_speed() RETURNS float
  AS '$libdir/test_fsync_speed', 'test_fsync_speed'
  LANGUAGE C IMMUTABLE STRICT;

is not equivalent to this (note void argument):

CREATE OR REPLACE FUNCTION test_fsync_speed(void) RETURNS float
  AS '$libdir/test_fsync_speed', 'test_fsync_speed'
  LANGUAGE C IMMUTABLE STRICT;

Two functions are created.

test=# \df
                              List of functions
 Schema |       Name       | Result data type | Argument data types |  Type
--------+------------------+------------------+---------------------+--------
 public | test_fsync_speed | double precision |                     | normal
 public | test_fsync_speed | double precision | void                | normal

Furthermore, I can't figure out how to call the void argument variant. Why is void accepted as an argument?

create function testvoid(void) returns void as '' LANGUAGE PLPGSQL;
ERROR:  PL/pgSQL functions cannot accept type void

I punched the void argument in just by chance- am I just stupid? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing an installation's default value of unix_socket_directory
On Oct 21, 2010, at 4:19 PM, Robert Haas wrote: 2010/10/21 Tom Lane t...@sss.pgh.pa.us: Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Cédric Villemain's message of jue oct 21 16:01:30 -0300 2010: I agree this is interesting information to get, but wonder how pg_config can know that and it looks to me that this information as nothing to do in pg_config pg_config is all about installation, socket_dir is a postgresql.conf setting. Yeah -- how is pg_config to know? All it can tell you is what was the compiled-in default. That's what I wanted, actually. If you've set a non-default value in postgresql.conf, SHOW will tell you about that, but it fails to expose the default value. Maybe you should go the SHOW route. The user could connect via TCP and find out the socket directory that way. Yeah, the SHOW case is not useless by any means. I think adding this to pg_config is sensible. Sure, the user could have moved the socket directory. But it's a place to start looking. So why not? Because pg_config is supposed to return the current state of a cluster? Because it might indicate a connection to the wrong server? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] O_DSYNC broken on MacOS X?
On Oct 19, 2010, at 11:22 AM, Bruce Momjian wrote: Greg Smith wrote: A.M. wrote: Perhaps a simpler tool could run a basic fsyncs-per-second test and prompt the DBA to check that the numbers are within the realm of possibility. This is what the test_fsync utility that already ships with the database should be useful for. The way Bruce changed it to report numbers in commits/second for 9.0 makes it a lot easier to use for this purpose than it used to be. I think there's still some additional improvements that could be made there, but it's a tricky test to run accurately. The test_fsync was designed to test various things like whether several open-sync writes are better than two writes and an fsync, and whether you can fsync data written on a different file descriptor. It is really a catch-all test right now, not one specific for choosing sync methods. I am working on simplifying the test_fsync tool and making it a contrib function which can be run by the superuser based on the configured fsync method. That way, the list can ask a user to run it to report fsyncs-per-second for suspiciousness. The goal is to make it more accessible. I was also thinking about adding some notes along the lines of "Your drive fsync speed rates between a 5400 RPM SATA drive and a 7200 RPM SATA drive." or "Your drive fsync speed rates as high as RAM- your fsync method may be wrong." Currently, the test tool is not even compiled by default. Thoughts? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] O_DSYNC broken on MacOS X?
On Oct 7, 2010, at 12:26 PM, Robert Haas wrote: On Thu, Oct 7, 2010 at 11:45 AM, Greg Smith g...@2ndquadrant.com wrote: Robert Haas wrote: Proposed doc patch attached. Looks accurate to me. I like the additional linking to the Reliability page you put in there too. Heavily referencing that important page from related areas is a good thing, particularly now that it's got a lot more details than it used to. Cool, thanks for the fast review. I suspect there are more details that could stand to be added to the WAL reliability page as well, but I don't know what they are so I can't add them. I still have the feeling that we have not put quite a large enough red, blinking light around this issue, but I don't have a concrete suggestion. I think the general problem is that there is no simple way to verify that a PostgreSQL commit is pushing the bits to persistent storage. It would be helpful if there were a platform-specific, volume-specific tool to deduce this. Currently, there is no warning light that goes on when commits are not persistent. On Linux, a tool could check filesystem parameters, hdparm (if relevant), and hard drive and controller specs (possibly against a blacklist of known liars). Perhaps a simpler tool could run a basic fsyncs-per-second test and prompt the DBA to check that the numbers are within the realm of possibility. How else can a DBA today ensure that a commit is a commit? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] O_DSYNC broken on MacOS X?
On Sep 30, 2010, at 5:02 PM, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: Oh, I missed that. Actually, I wasn't really so concerned with whether his benchmark is correct. I *am* concerned about being broken out of the box on MacOS X. Actually, the problem with OSX is that OSX is broken out of the box, at least by that standard. The system's normal configuration is that fsync() does nothing, That is not correct. fsync and friends on Darwin synchronizes I/O and flushes dirty kernel caches to the disk which meets the specification and is distinctly different from doing nothing. The fsync() function can be used by an application to indicate that all data for the open file description named by fildes is to be transferred to the storage device associated with the file described by fildes in an implementation-dependent manner. http://opengroup.org/onlinepubs/007908799/xsh/fsync.html On MacOS X, fsync() always has and always will flush all file data from host memory to the drive on which the file resides. http://lists.apple.com/archives/Darwin-dev/2005/Feb/msg00072.html I'm not sure whether we should select fsync_writethrough as the default on OSX. We don't make an equivalent attempt to prevent OS or storage malfeasance on other Unixoid platforms --- in fact, I'd say OSX is a bit ahead of the game in that you *can* force writethrough without resorting to arcane hacks with hdparm or some such. We could definitely stand to be a bit more verbose about documenting the platform-specific issues in this area. Not only is this issue platform-specific, it is also bus-, controller- and disk-specific. Luckily, hardware that ships from Apple responds properly to F_FULLFSYNC. It's too bad there is no cross-platform way to ask what level of hardware-syncing is available. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Multi-branch committing in git, revisited
On Sep 21, 2010, at 11:19 PM, Tom Lane wrote: Offhand I think I like Andrew's recommendation of a shortlived branch better. In essence your idea is using the tip of master itself as a shortlived branch, which is maybe a bit too cute. If you get distracted and need to do something else for awhile, the tip of master is not where you want your not-yet-pushable work to be. For uncommitted work, see also git stash. http://www.kernel.org/pub/software/scm/git/docs/git-stash.html Cheers, M
Re: [HACKERS] string function - format function proposal
On Aug 31, 2010, at 5:07 PM, Pavel Stehule wrote: Hello attached WIP patch. I implement only basic format's tags related to SQL: string, value, literal, sql identifier. These tags are basic, but there is no obstacle to implementing any other formats or enhancing the syntax. The mix with the to_char function is more complex than I expected - so I am not thinking about it for now (there is more than one to_char function). snip It would be pretty handy if plpgsql EXECUTE could operate like this with USING to support identifiers. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] new notify payload as string
With 9.0b4, I am testing the new NOTIFY payload feature. One thing I noticed is that it seems impossible to differentiate at the receiving end from: NOTIFY test; and NOTIFY test,''; So, it is impossible to differentiate between a notification with an empty string payload and a notification without a payload due to the backend protocol defining the payload as a string. Perhaps a boolean could be added to the backend protocol to account for this and then extra could be set to NULL in libpq. This could be handy when converting codebases that use LISTEN/NOTIFY when one wishes to be warned of old NOTIFYs being used without a payload. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Committers info for the git migration - URGENT!
On Aug 25, 2010, at 6:49 PM, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: The current mapping used is the same one as on git.postgresql.org (see attached file). BTW, I noticed that this list omits several old committers: Julian Assange pr...@suburbia.net That is _the_ Julian Assange who is in the news now. Very cool! http://en.wikipedia.org/wiki/Julian_Assange#Career_as_computer_programmer -M
Re: [HACKERS] Progress indication prototype
On Aug 18, 2010, at 9:02 AM, Robert Haas wrote: On Wed, Aug 18, 2010 at 8:45 AM, Greg Stark st...@mit.edu wrote: On Tue, Aug 17, 2010 at 11:29 PM, Dave Page dp...@pgadmin.org wrote: Which is ideal for monitoring your own connection - having the info in the pg_stat_activity is also valuable for monitoring and system administration. Both would be ideal :-) Hm, I think I've come around to the idea that having the info in pg_stat_activity would be very nice. I can just picture sitting in pgadmin while a bunch of reports are running and seeing progress bars for all of them... But progress bars alone aren't really the big prize. I would really love to see the explain plans for running queries. This would improve the DBA's view of what's going on in the system immensely. Currently you have to grab the query and try to set up a similar environment for it to run explain on it. If analyze has run since or if the tables have grown or shrunk or if the query was run with some constants as parameters it can be awkward. If some of the tables in the query were temporary tables it can be impossible. You can never really be sure you're looking at precisely the same plan as the other user's session is running. But stuffing the whole json or xml explain plan into pg_stat_activity seems like it doesn't really fit the same model that the existing infrastructure is designed around. It could be quite large and if we want to support progress feedback it could change quite frequently. We do stuff the whole query there (up to a limited size) so maybe I'm all wet and stuffing the explain plan in there would be fine? It seems to me that progress reporting could add quite a bit of overhead. For example, in the whole-database vacuum case, the most logical way to report progress would be to compute pages visited divided by pages to be visited. But the total number of pages to be visited is something that doesn't need to be computed in advance unless someone cares about progress. 
I don't think we want to incur that overhead in all cases just on the off chance someone might ask. We need to think about ways to structure this so that it only costs when someone's using it. I wish that I could get explain analyze output step-by-step while running a long query instead of seeing it jump out at the end of execution. Some queries never end and it would be nice to see which step is spinning (explain can be a red herring). To me the progress bar is nice, but I don't see how it would be reliable enough to draw any inferences (such as execution time). If I could get the explain analyze results *and* the actual query results, that would be a huge win, too. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Testing with concurrent sessions
On Jan 7, 2010, at 12:39 PM, Greg Sabino Mullane wrote: I'm still *very* interested in making a libpq-less pure perl driver, if anyone feels like funding it, let me know! :) You mean this one: http://search.cpan.org/~arc/DBD-PgPP-0.07/lib/DBD/PgPP.pm ? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Has anyone used CLANG yet?
On Dec 9, 2009, at 4:23 PM, Chris Browne wrote: This is a C front end for the LLVM compiler... I noticed that it entered Debian/Unstable today: http://packages.debian.org/sid/main/clang I thought it would be interesting to see if PostgreSQL compiles with this, as an alternative compiler that should presumably become more and more available on Linux et al. (And I suppose that the randomly selected .sig is supremely apropos!) configure blows up here at the following:

conftest.c:75:28: error: invalid token after top level declarator
extern unsigned int PASCAL accept (unsigned int, void *, void *);

I suspect there's something about PASCAL that's a problem, as clang is nominally supposed to be a C compiler ;-). I haven't looked deeper, so haven't the remotest idea how deep the issue lies. At any rate, I should poke at this further soon, but if it seems interesting to others, well, CLANG is now an easy install on some number of systems!

Clang works for me on MacOS 10.6.2:

    /Developer/usr/bin/clang --version
    clang version 1.0.1 (http://llvm.org/svn/llvm-project/cfe/tags/Apple/clang-24 exported)
    Target: x86_64-apple-darwin10

    CC=/Developer/usr/bin/clang ./configure --prefix=/Users/agentm/pgsql841/
    make -j 8
    /Users/agentm/pgsql841/initdb -E UTF8 ../data
    ./pg_ctl -D ../data/ start
    server starting
    RD07:bin agentm$ LOG: database system was shut down at 2009-12-09 17:01:51 EST
    LOG: database system is ready to accept connections
    LOG: autovacuum launcher started

    /Users/agentm/pgsql841/psql postgres
    psql (8.4.1)
    Type "help" for help.

    postgres=# select 1;
     ?column?
    ----------
            1
    (1 row)

I do see lots of warnings regarding unsupported compiler flags:

    clang: warning: argument unused during compilation: '-no-cpp-precomp'
    clang: warning: argument unused during compilation: '-O2'
    clang: warning: argument unused during compilation: '-Wall'
    clang: warning: argument unused during compilation: '-Wmissing-prototypes'
    clang: warning: argument unused during compilation: '-Wpointer-arith'
    clang: warning: argument unused during compilation: '-Wdeclaration-after-statement'
    clang: warning: argument unused during compilation: '-Wendif-labels'
    clang: warning: argument unused during compilation: '-fno-strict-aliasing'
    clang: warning: argument unused during compilation: '-fwrapv'

and some code-based warnings:

    print.c:1105:24: warning: field width should have type 'int', but argument has type 'unsigned int' [-Wformat]
            fprintf(fout, "%-s%*s", hlineptr[line_count].ptr,
    pl_exec.c:3529:6: warning: expression result unused [-Wunused-value]
            ItemPointerSetInvalid(&(tmptup.t_self));
    ../../../../src/include/storage/itemptr.h:134:2: note: instantiated from:
            BlockIdSet(&((pointer)->ip_blkid), InvalidBlockNumber), \
    ../../../../src/include/storage/block.h:86:2: note: instantiated from:
            AssertMacro(PointerIsValid(blockId)), \
    ../../../../src/include/postgres.h:675:39: note: instantiated from:
    #define AssertMacro(condition)  ((void)true)
    ../../../../src/include/c.h:185:15: note: instantiated from:
    #define true    ((bool) 1)

You are probably running configure with gcc, no?

FYI: with clang:

    time make (not -j 8)
    real    1m46.511s
    user    1m26.295s
    sys     0m14.639s

with gcc:

    time make
    real    2m41.934s
    user    2m20.778s
    sys     0m17.441s

    du -h pgsql841gcc/bin/*
     52K    pgsql841gcc/bin/clusterdb
     52K    pgsql841gcc/bin/createdb
     60K    pgsql841gcc/bin/createlang
     52K    pgsql841gcc/bin/createuser
     52K    pgsql841gcc/bin/dropdb
     60K    pgsql841gcc/bin/droplang
     52K    pgsql841gcc/bin/dropuser
    616K    pgsql841gcc/bin/ecpg
     72K    pgsql841gcc/bin/initdb
     32K    pgsql841gcc/bin/pg_config
     28K    pgsql841gcc/bin/pg_controldata
     36K    pgsql841gcc/bin/pg_ctl
    280K    pgsql841gcc/bin/pg_dump
     68K    pgsql841gcc/bin/pg_dumpall
     36K    pgsql841gcc/bin/pg_resetxlog
    128K    pgsql841gcc/bin/pg_restore
    4.6M    pgsql841gcc/bin/postgres
    4.0K    pgsql841gcc/bin/postmaster
    340K    pgsql841gcc/bin/psql
     52K    pgsql841gcc/bin/reindexdb
     32K    pgsql841gcc/bin/vacuumdb

    du -h pgsql841/bin/* (clang build)
     52K    pgsql841/bin/clusterdb
     52K    pgsql841/bin/createdb
     60K    pgsql841/bin/createlang
     52K    pgsql841/bin/createuser
     48K    pgsql841/bin/dropdb
     60K    pgsql841/bin/droplang
     48K    pgsql841/bin/dropuser
    612K    pgsql841/bin/ecpg
     72K    pgsql841/bin/initdb
     28K    pgsql841/bin/pg_config
     28K    pgsql841/bin/pg_controldata
     36K    pgsql841/bin/pg_ctl
    272K    pgsql841/bin/pg_dump
     68K    pgsql841/bin/pg_dumpall
     36K    pgsql841/bin/pg_resetxlog
    124K    pgsql841/bin/pg_restore
    4.5M    pgsql841/bin/postgres
    4.0K    pgsql841/bin/postmaster
    344K    pgsql841/bin/psql
     52K    pgsql841/bin/reindexdb
     32K    pgsql841/bin/vacuumdb

Cheers, M
Re: [HACKERS] Listen / Notify rewrite
On Nov 11, 2009, at 4:25 PM, Joachim Wieland wrote: Hi, Attached is a patch for a new listen/notify implementation. In a few words, the patch reimplements listen/notify as an SLRU-based queue which works similarly to the sinval structure. Essentially it is a ring buffer on disk with pages mapped into shared memory for read/write access. Additionally the patch does the following (see below for details): 1. It removes the pg_listener relation and 2. adds the possibility to specify a payload parameter, i.e. executing in SQL NOTIFY foo 'payload'; and 'payload' will be delivered to any listening backend. 3. Every distinct notification is delivered. 4. Order is preserved, i.e. if txn 1 first does NOTIFY foo, then NOTIFY bar, a backend (listening to both foo and bar) will always first receive the notification foo and then the notification bar. 5. It's now listen to a channel, not listen to a relation anymore... Hi Joachim, Thank you for implementing this; LISTEN/NOTIFY without a payload has been a major problem for me to work around. I understand that coalescing multiple notifications of the same name happens now, but I have never understood why. I had hoped that someone revisiting this feature would remove it. My use case is autonomous transactions. From a developer's standpoint, NOTIFY specifies a form of remote trigger for further action, outside the transaction, to occur. From that point of view, I don't see why NOTIFY foo; NOTIFY foo; is equivalent to NOTIFY foo;. I understand the use case where a per-row trigger could generate lots of spurious notifications, but it seems that if anything is generating spurious notifications, the notification is in the wrong place. The documentation strongly implies that notification names are usually table names ("I changed this table, take a look at it to see what's new."), but with the new payload, this becomes even less the case. And no such association is enforced by the NOTIFY and LISTEN commands anyway. 
http://www.postgresql.org/docs/8.4/static/sql-notify.html With the coalescing, I feel like I am being punished for using NOTIFY with something other than a table-name use case. At least with this new payload, I can set the payload to the transaction ID and be certain that all the notifications I sent are processed (and in order, even!), but could you explain why the coalescing is still necessary? While coalescing could be achieved in the receiving-end NOTIFY callback ( if(payload ID was already processed) continue; ), non-coalescing behavior cannot be achieved when the backend does the coalescing. For backwards compatibility, payload-less NOTIFY could coalesce while NOTIFYs with a payload would not coalesce, but, on the other hand, that might be confusing. In any case, thank you for improving this long-neglected subsystem! Best regards, M
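The transaction-ID workaround mentioned above might look like the following sketch. The channel name, table, and payload value are illustrative, and the two-argument NOTIFY form assumes the payload syntax proposed in this patch:

```sql
-- Under the proposed patch, notifications with distinct payloads are not
-- coalesced, so tagging each one with the sending transaction's ID makes
-- every notification unique and guarantees delivery of all of them.
LISTEN jobs;

-- In the notifying transaction (payload value illustrative):
INSERT INTO job_queue (task) VALUES ('rebuild-index');
NOTIFY jobs, 'txid:12345';
```

The receiving side can still deduplicate by remembering which payload IDs it has already processed, which is the direction of coalescing that remains possible when the backend does not coalesce.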
Re: [HACKERS] Listen / Notify rewrite
On Nov 11, 2009, at 10:43 PM, Tom Lane wrote: Andrew Chernow a...@esilo.com writes: I thought of a compromise: add the number of times a notification was generated (coalesced count+1) to the callback data. That would satisfy any backwards compatibility concerns and my use case too! If you are suggesting that the server poke data into the notifier's opaque payload, I vote no. Maybe the NOTIFY command can include a switch to enable this behavior. No syntax suggestions at this point. I agree, we should not have the system modifying the payload string for this. And don't bother suggesting a third column in the result --- we'd have to change the FE/BE protocol for that, and it's not going to happen. The existing precedent is that the system collapses identical notifications without payloads. So we could possibly get away with saying that identical payload-less notifies are collapsed but those with a payload are not. That doesn't really seem to satisfy the POLA though. I think Joachim's definition is fine, and anyone who needs delivery of distinct notifications can easily make his payload strings unique to ensure it. The notification count could be a secondary payload which does not affect the first, but I guess I'm the only one complaining about the coalescing... -M
Re: [HACKERS] SE-PostgreSQL Specifications
On Jul 25, 2009, at 11:06 AM, Tom Lane wrote: Sam Mason s...@samason.me.uk writes: Yes, that seems reasonable. The fact that you're still talking about confined users is slightly worrying and would seem to imply that there is still a superuser/normal user divide--it's probably just a terminology thing though. There had better still be superusers. Or do you want the correctness of your backups to depend on whether your SELinux policy is correct? The first time somebody loses critical data because SELinux suppressed it from their pg_dump output, they're going to be on the warpath. This behavior is no different from taking/using an SE-enabled filesystem backup. And woe to the admin who doesn't test his backups; caveat emptor. Still, it would be nice if pg_dump warned or stopped if the backup it created was completely useless (missing data dependencies), no? Cheers, M
Re: [HACKERS] [RFC] obtaining the function call stack
On Jul 13, 2009, at 4:51 PM, decibel wrote: On Jul 13, 2009, at 2:33 PM, Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: The performance and error recovery implications are unfavorable. Just how badly do you need this, and for what? Mainly for debugging. The situation is such that there are a lot of functions and very high load. The functions have embedded debug elogs and the intention is to call them only if the function was called in a particular context. I can't really see that as sufficiently widely useful to justify inserting such a mechanism. I suspect also that you are defining the problem the wrong way --- this user doesn't want a generic fmgr call stack, he wants a plpgsql stack. Which is something the plpgsql debugger could be taught to do, if it doesn't already, thus avoiding the overhead the 99.9% of the time that you don't need it. Actually, this could conceivably be called from other languages, such as plPerl. But it sounds like this can be done via an add-on, so no need to add it directly to the backend. How would I go about generating a meaningful backtrace for a plpgsql function that calls a plperl function? One would also expect a C function which calls a plpgsql function to appear, too, no? Shouldn't there be a unified backtrace subsystem? Cheers, M
Re: [HACKERS] idea: global temp tables
On Apr 27, 2009, at 4:44 PM, Pavel Stehule wrote: Hello I am thinking about global temp tables. One possible solution is creating a global temporary table like a normal table and, in the planner stage, checking use of this table. When some global temporary table is detected, then a real temporary table is created and used in the execution plan. It's like: CREATE GLOBAL TEMP TABLE foo(a varchar); -- create global empty table foo SELECT * FROM foo; a) if a relevant temp table for foo exists, use it; b) when not, then CREATE TEMP TABLE pg_temp_1.foo(LIKE foo INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES); and transform the origin query to SELECT * FROM pg_temp_1.foo; Ideas? Notes? Objections? When will postgresql offer global temporary tables with data which are shared among sessions? Such tables are great for transient data such as web session data where writing to the WAL is a waste. (On DB startup, the tables would simply be empty.) We're currently stuck with the memcached plugin, which makes it impossible to use database constructs such as foreign keys against the temporary data. Cheers, M
Re: [HACKERS] idea: global temp tables
On Apr 27, 2009, at 5:39 PM, Dimitri Fontaine wrote: Hi, On 27 Apr 2009, at 23:32, A.M. wrote: When will postgresql offer global temporary tables with data which are shared among sessions? Such tables are great for transient data such as web session data where writing to the WAL is a waste. (On DB startup, the tables would simply be empty.) We're currently stuck with the memcached plugin which makes it impossible to use database constructs such as foreign keys against the temporary data. If using 8.3 you can SET LOCAL synchronous_commit TO off; for web session management transactions, it'll skip the WAL fsync'ing, which is already a good start. That's pretty close, but it's not table specific and wouldn't let us reliably mix transient data changes with real data changes. Cheers, M
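The 8.3-era workaround discussed above looks like this in practice (table and values are illustrative); note that, as the reply points out, it applies per transaction rather than per table, so transient and durable writes in one transaction share the same durability setting:

```sql
BEGIN;
SET LOCAL synchronous_commit TO off;  -- skip the commit-time WAL fsync wait
INSERT INTO web_session (id, data) VALUES ('abc123', '...');
COMMIT;  -- returns without waiting for the WAL record to reach disk
```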
Re: [HACKERS] idea: global temp tables
On Apr 27, 2009, at 6:01 PM, Kevin Grittner wrote: A.M. age...@themactionfaction.com wrote: On Apr 27, 2009, at 5:39 PM, Dimitri Fontaine wrote: On 27 Apr 2009, at 23:32, A.M. wrote: When will postgresql offer global temporary tables with data which are shared among sessions? Such tables are great for transient data such as web session data where writing to the WAL is a waste. (On DB startup, the tables would simply be empty.) We're currently stuck with the memcached plugin which makes it impossible to use database constructs such as foreign keys against the temporary data. If using 8.3 you can SET LOCAL synchronous_commit TO off; for web session management transactions, it'll skip the WAL fsync'ing, which is already a good start. That's pretty close, but it's not table specific and wouldn't let us reliably mix transient data changes with real data changes. Yeah, we have a dozen or so tables we use with the pattern you describe; so the feature you describe would also have some value for us. To avoid confusion, we don't refer to these as temporary tables, but rather as permanent work tables. Again, I can't comment on practical issues regarding implementation; but it would be a nice feature to add some day. The tricky bit would be to figure out how to ensure that it got cleaned up properly, especially if PostgreSQL went down or client processes went down before tidying up. Actually, for our usage, that's the easiest part: truncate all the permanent work tables whenever the db starts. That's really the only sane thing to do anyway. That's what I mean by transient data: if it's there, that's great; if not, I can re-generate it (cache) or I don't care, because if the database goes down, then the data is useless on restart anyway. Cheers, M
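A minimal sketch of the "permanent work table" pattern described above, assuming an initialization step that runs once at server start, before application connections arrive (all names are illustrative):

```sql
-- An ordinary table that holds only transient, regenerable data.
CREATE TABLE session_cache (
    session_id text PRIMARY KEY,
    payload    text,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Run from a startup script on every database start:
TRUNCATE session_cache;
```

Because the table is a normal table, foreign keys and other constraints work against it, which is exactly what the memcached approach cannot offer.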
Re: [HACKERS] gcc: why optimize for size flag is not the default
On Mar 11, 2009, at 3:18 PM, Grzegorz Jaskiewicz wrote: On 11 Mar 2009, at 13:51, Marko Kreen wrote: Linux kernel is moving to use -Os everywhere. AFAIK their argument is that kernel code should not be doing anything CPU-intensive, thus minimal cache usage is more important than unrolled loops. This also seems to hint that -Os is not really appropriate for Postgres. Although it would be a good fit for e.g. PgBouncer. While it might be right in the case of the linux kernel (which I won't totally agree with personally), I don't see any reason to compare it with postgresql. The kernel is extensively used by everything in the system, hence their reasoning. Postgresql is an application. MacOS X defaults to and recommends -Os with the rationale that smaller code causes less paging and less CPU instruction cache thrashing. http://developer.apple.com/ReleaseNotes/DeveloperTools/RN-GCC3/index.html For deployment builds, the recommended setting is -Os, which produces the smallest possible binary size. Generally, a binary that's smaller is also faster. That's because a large application spends much of its time paging its binary code in and out of memory. The smaller the binary, the less the application needs to page. For example, say a binary uses aggressive function inlining. That binary saves time with fewer function calls, but it could easily spend far more time paging the binary code containing those inlined functions in and out of memory. -Os Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size. -Os is still optimizing, but using a slightly different heuristic as to what optimization means. That said, if postgresql is paging out, the DBA probably has postgresql or the server misconfigured. Cheers, M
Re: [HACKERS] Prepping to break every past release...
On Mar 4, 2009, at 6:07 PM, Josh Berkus wrote: Andrew, Back on that track, I'd like to see a facility whereby we could provide an alias (or synonym, to use a nearby subject) for columns and other objects. That would help to overcome naming glitches without breaking things quite so much. Believe it or not, a large PostgreSQL user in LA just buttonholed me about that particular feature idea at SCALE. So it might be generally useful as well -- not just for the system catalogs, but to allow businesses with long-use databases to manage change over time. Schema change is a strong motivator for applications to access the database through views and functions only. A column with multiple names would likely make it *more* painful to migrate schemata. Cheers, M
Re: [HACKERS] QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)
On Jan 5, 2009, at 1:16 PM, Stephen R. van den Berg wrote: Upon reading the DFSG, it seems you have a point... However... QuickLZ is dual licensed: a. Royalty-free-perpetuous-use as part of the PostgreSQL backend or any derived works of PostgreSQL which link in *at least* 50% of the original PostgreSQL codebase. How does one even define 50% of the original PostgreSQL codebase? What nonsense. -M
Re: [HACKERS] PostgreSQL future ideas
On Sep 25, 2008, at 5:50 PM, Chris Browne wrote: [EMAIL PROTECTED] (Gevik Babakhani) writes: An advantage of C++ is that it would reduce a lot of the OO code written in C in PostgreSQL, but it is a big effort for a small gain. It will increase the number of bugs. Do not forget also that C++ compilers are not so common (or so good) on different platforms. If somebody is interested in that, yes, but like a fork (PostgreSQL++ :-). Reducing OO code that is written in C is one of my major interests. After some investigating myself, it appears that having the codebase fully rewritten in C++ will have an impact on the performance. So I guess such an effort will result in the code being more C++ish rather than fully OO: a mixture of C with some OO taste. I'm not convinced that it would be a good idea at all to make the system fully OO, nor that C++ would be a meaningful tool to use to that end. After all, C++ can certainly be used in decidedly non-OO ways. For instance, the STL is NOT an OO framework, and the author of the STL, obviously something of a fan of C++, characterizes OO as almost as much of a hoax as Artificial Intelligence. http://en.wikipedia.org/wiki/Object-oriented_programming#Criticism I tend to agree with that characterization. Further, C++ suffers from the same not-OO-at-its-base problem as Java, which contributes complexity as well as hurting the OO-ness of it. Better idea is to start to use C99 in PostgreSQL ;-). I have not investigated this yet, but I am very interested to know what the advantages would be of upgrading the code to C99 standards. It would give us heartburn on any platforms where the preferred compiler doesn't grok C99, for sure. As much as I'm OK with using GCC, it would seem unfortunate to force people into using GCC everywhere, and preclude using other compilers. (And actually, I'm more ambivalent about GCC than that; I'm not totally happy with how GCC has gone, but that's another tale for another day...) 
Speaking of language choice, no one said that _all_ the source code would need to be rewritten. It would be nice, for example, if PostgreSQL rewrote the current GUC system with a glue language like Lua (which is also very C-like). Cheers, M
Re: [HACKERS] Proposal of SE-PostgreSQL patches (for CommitFest:Sep)
On Sep 24, 2008, at 2:38 PM, Josh Berkus wrote: Peter, Yeah, but do we even have the slightest bit of information about what exactly would be required to achieve the required levels? And whether this patch does it? And whether there would be alternative, more desirable ways to achieve a similar level? Actually, I have some direct communication that SEPostgres will lead to PostgreSQL being widely adopted in at least one US government agency. Can't say more, of course. ;-) And the consideration is predicated on the PostgreSQL community accepting the patch? This sounds like an opportunity for one of the numerous enterprise spin-offs. Cheers, M
Re: [HACKERS] Proposal of SE-PostgreSQL patches (for CommitFest:Sep)
On Sep 19, 2008, at 1:42 PM, Robert Haas wrote: It's too early to vote. :-) The second and third options have a prerequisite. The purpose of them is to match the granularity of access controls provided by SE-PostgreSQL and native PostgreSQL. However, I have not seen a clear reason why these different security mechanisms have to have the same granularity in access controls. Have you seen a clear reason why they should NOT have the same granularity? I realize that SELinux has become quite popular and that a lot of people use it - but certainly not everyone. There might be some parts of the functionality that are not really severable, and if that is the case, fine. But I think there should be some consideration of which parts can be usefully exposed via SQL and which can't. If the parts that can be are independently useful, then I think they should be available, but ultimately that's a judgment call and people may come to different conclusions. If the SE-PostgreSQL effort simply implemented SQL interfaces to increase security granularity, it wouldn't be SE-PostgreSQL at all. SE-PostgreSQL integrates with a set of optional system-wide access controls and is only marginally related to SQL-level database features. Since it relies on an optional component, it doesn't really make sense that anyone is surprised that the patch doesn't improve the security granularity of vanilla PostgreSQL. Cheers, M
Re: [HACKERS] Protocol 3, Execute, maxrows to return, impact?
On Jul 28, 2008, at 1:54 PM, Dave Cramer wrote: On 28-Jul-08, at 12:45 PM, Stephen R. van den Berg wrote: Dave Cramer wrote: On 27-Jul-08, at 3:00 PM, Stephen R. van den Berg wrote: Stephen R. van den Berg wrote: The driver beats libpq in speed by about 62%. Anyone interested in taking a peek at the (GPL copyright) driver, I temporarily put up a small package which contains the working driver in Pike at: http://admin.cuci.nl/psgsql.pike.tar.gz This is very exciting news, I'd love to look at it. Is there any way it could be re-licensed so that it can be incorporated into, say, the JDBC driver? Since I wrote it, I can relicense it any which way I want. What kind of license would you like to have? As Joshua mentioned, BSD is the preferred postgresql license. As I understand it, I can't even look at your code and subsequently use anything in the JDBC driver. The GPL does not cover implementation ideas, not to mention that the author just described the implementation. Furthermore, one could not take anything from the Pike driver for the JDBC driver because it is a completely different language. It seems like you have confused the GPL with an NDA. Anyway, what does "The driver beats libpq in speed by about 62%" mean? Cheers, M
Re: [HACKERS] digest
On Jul 10, 2008, at 1:20 PM, Fabrízio de Royes Mello wrote: set pgsql-hackers digest Postgresql hackers have been successfully digested. *burp* -M
Re: [HACKERS] MERGE Specification
On Apr 22, 2008, at 1:47 PM, Simon Riggs wrote: On Mon, 2008-04-21 at 22:27 -0400, Gregory Stark wrote: Simon Riggs [EMAIL PROTECTED] writes: Unrelated to rule processing, you did read the bit about MERGE and race conditions? ISTM that MERGE as it stands isn't very useful for anything other than large data loads since it's going to cause problems if used concurrently. If there are race conditions, what advantage does it offer over writing plpgsql or client code to do it? That's an excellent question. I'm not trying to sell you anything here. MERGE is a SQL Standard command, supported by Oracle, DB2, SQLServer etc, so there is reason enough to implement it. There may be also reasons to implement other syntaxes, other behaviours, which would be non-standard. If people want the latter first/second/not at all then please speak; it's not an either-or situation. I would expect MERGE to be slightly faster than a well coded PL/pgSQL function, but there won't be too much in it. It will allow the statement to be more easily parallelised in the form it currently takes, I would note. I thought the whole advantage of having a built-in command is that it could do the kind of locking our unique constraints do to avoid race conditions. As I've said elsewhere, we could have it lock each row; it's just more overhead if we do, and not necessary at all for bulk data merging. I was hoping to use MERGE alongside the other standard DML. Its purpose is to set the truth regardless of past states. Why should it be relegated to the bulk-loading basement alongside COPY? Cheers, M
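For comparison, the hand-coded PL/pgSQL upsert that MERGE is being measured against typically looks like this sketch (table, column, and function names are illustrative); without the row locking discussed above, it is subject to exactly the race in question:

```sql
CREATE FUNCTION merge_counter(k text) RETURNS void AS $$
BEGIN
    UPDATE counters SET n = n + 1 WHERE key = k;
    IF NOT FOUND THEN
        -- A concurrent session can insert the same key between the UPDATE
        -- and this INSERT, so callers must be prepared to retry on a
        -- unique_violation error.
        INSERT INTO counters (key, n) VALUES (k, 1);
    END IF;
END;
$$ LANGUAGE plpgsql;
```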
Re: [HACKERS] MERGE Specification
On Apr 21, 2008, at 4:08 PM, Simon Riggs wrote: The following two files specify the behaviour of the MERGE statement and how it will work in the world of PostgreSQL. In places, this supersedes my recent commentary on MERGE, particularly with regard to triggers. Neither of these files is intended for CVS. The HTML file was generated from SGML source, though the latter is not included here for clarity. The test file shows how I expect a successful test run to look when a regression test is executed with a working version of the final MERGE patch applied. It has behavioural comments in it also, to make it slightly more readable. If anybody has any questions, ask them now please, before I begin detailed implementation. MERGE will not invoke Rules. Does this imply that MERGE cannot be used on views, or that the resulting INSERTs or UPDATEs do not work on views? Cheers, M
Re: [HACKERS] An idea for parallelizing COPY within one backend
On Feb 27, 2008, at 9:11 AM, Florian G. Pflug wrote: Dimitri Fontaine wrote: Of course, the backends still have to parse the input given by pgloader, which only pre-processes data. I'm not sure having the client prepare the data some more (binary format or whatever) is a wise idea, as you mentioned and wrt Tom's follow-up. But maybe I'm all wrong, so I'm all ears! As far as I understand, pgloader starts N threads or processes that open up N individual connections to the server. In that case, moving the text-binary conversion from the backend into the loader won't give any additional performance, I'd say. The reason that I'd love some within-one-backend solution is that it'd allow you to utilize more than one CPU for a restore within a *single* transaction. This is something that a client-side solution won't be able to deliver, unless major changes to the architecture of postgres happen first... It seems like multiple backends should be able to take advantage of 2PC for transaction safety. Cheers, M
Re: [HACKERS] FW: bitemporal functionality for PostgreSQL
On Feb 1, 2008, at 10:42 AM, Luke Porter wrote: All Is there an interest in developing bitemporal functionality in PostgreSQL? Regards Luke I can only speak for myself, but: definitely! Based on the googling I did on "bitemporal database", I kind of do this already with PostgreSQL. Some of my tables are insert-only and each row includes a "committed time" timestamp. That way, I don't need a separate audit log table, and fixing someone's mistake is as simple as copying old rows. The downside to this is that I need a view to represent the current truth, and calculating the truth is more expensive than a simple table would be. Can you explain in more detail or provide references to how PostgreSQL could potentially handle temporal data better? One idea I had would be to blow the transaction ID up to 128 bits (no more wrapping!) and have it represent the nanoseconds since the epoch. Cheers, M
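The insert-only pattern described above can be sketched like this (table and column names are hypothetical): every change is a new row stamped with its commit time, and a view derives the "current truth" as the newest row per key:

```sql
-- Insert-only history table: corrections are new rows, never updates.
CREATE TABLE account_history (
    account_id integer NOT NULL,
    balance    numeric NOT NULL,
    committed  timestamptz NOT NULL DEFAULT now()
);

-- "Current truth": the most recently committed row for each account.
-- This view is the extra computation the email mentions as the downside.
CREATE VIEW account_current AS
    SELECT DISTINCT ON (account_id) account_id, balance
    FROM account_history
    ORDER BY account_id, committed DESC;
```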
Re: [HACKERS] Naming of the prefab snowball stemmer dictionaries
On Aug 22, 2007, at 11:10 , Tom Lane wrote: I notice that the existing tsearch documentation that we've imported fairly consistently refers to Snowball dictionaries with names like en_stem, ru_stem, etc. However, CVS HEAD is set up to create them with names english, russian, etc. As I've been absorbing more of the docs I'm starting to wonder whether this is a good idea. ISTM that these names encourage a novice to think that the one dictionary is all you could need for a given language; and there are enough examples of more-complex setups in the docs to make it clear that in fact Snowball is not the be-all and end-all of dictionaries. I'm thinking that going back to the old naming convention (or something like it --- maybe english_stem, russian_stem, etc) would be better. It'd help to give the right impression, namely that these dictionaries are a component of a solution but not necessarily all you need. Please use ISO 639 codes plus any qualifiers to reduce confusion. http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes -M
Re: [HACKERS] no cascade triggers?
On Jun 26, 2007, at 10:04 , Chris Mair wrote: Hello, On the Italian list we're discussing a case where a user reportedly worked around this (i.e. got rid of unwanted cascading calls) by writing an on-insert trigger procedure something along the lines of: ALTER TABLE tab DISABLE TRIGGER USER; -- do more inserts into the same table ALTER TABLE tab ENABLE TRIGGER USER; While this reportedly worked well in 8.2.1, it does not in 8.2.4, resulting in an error: ERROR: relation distinta_base1 is being used by active queries in this session SQL state: 55006 Now -- while we agree that disabling a trigger from inside itself is a somewhat strange thing to do, we cannot see a good and easy solution to the problem (of avoiding cascading trigger calls). General question: would a no cascade clause for triggers be a todo item? Instead of putting a trigger on the table, put a rule+trigger on a wrapper view. Cheers, M
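For the record, later PostgreSQL releases grew a direct answer to the no-cascade question: pg_trigger_depth() (available in 9.2 and up) reports how many trigger levels deep the current statement is, so a trigger can simply skip its body when fired by its own cascaded insert. A minimal sketch, with all table and column names invented:

```sql
-- Hypothetical trigger that would otherwise recurse: it inserts into
-- the same table it is attached to. The pg_trigger_depth() guard
-- (PostgreSQL 9.2+) breaks the cascade without disabling the trigger.
CREATE OR REPLACE FUNCTION tab_ins_trig() RETURNS trigger AS $$
BEGIN
    IF pg_trigger_depth() > 1 THEN
        RETURN NEW;  -- fired by a cascaded insert: do nothing
    END IF;
    -- do more inserts into the same table
    INSERT INTO tab (val) VALUES (NEW.val || '-copy');
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tab_ins AFTER INSERT ON tab
    FOR EACH ROW EXECUTE PROCEDURE tab_ins_trig();
```

This avoids both the ALTER TABLE ... DISABLE TRIGGER workaround and its 55006 failure mode, since no catalog change is attempted mid-statement.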
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
On Mar 15, 2007, at 11:31 , Ron Mayer wrote: Josh Berkus wrote: And then what? dynamically construct all your SQL queries? Sure, sounds like a simple solution to me... Not to mention DB security issues. How do you secure your database when your web client has DDL access? So, Edward, the really *interesting* idea would be to come up with a secure, normalized way to do UDFs *without* EAV tables. People would be very impressed. I have a system with many essentially user-defined fields, and was thinking of creating something similar to an Array type and writing some GiST indexes for it. My current workaround is to store them as a YAML document and use tsearch to index it (with application logic to further refine the results) - but an EAV datatype that could be put in tables and effectively indexed would be of quite a bit of interest here. And yes, a better way to do UDFs would be even cooler. Out of all the databases that I have used, PostgreSQL offers the most flexible DDL - mostly for one reason: it can operate within transactions. To handle arbitrary strings as column identifiers, the column names could actually be stripped down to lower-case letters and the real title could be stored in a separate table or as column comments. Mr. Berkus' concern regarding the security implications is already handled by privilege separation or security-definer functions. The OP's concern about the difficulty of querying a schema structure is alleviated via any number of APIs in Perl, JDBC, etc. It seems to me that PostgreSQL is especially well-suited to run DDL at runtime, so what's the issue? -M
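The transactional-DDL point above can be made concrete; every identifier in this sketch is invented for illustration:

```sql
-- Sketch: add a user-defined field at runtime. Because PostgreSQL DDL
-- is transactional, the column, its human-readable title, and its
-- comment either all appear together or not at all.
BEGIN;

ALTER TABLE user_data ADD COLUMN fav_color text;

-- Map the sanitized column name back to the user's arbitrary title.
INSERT INTO column_titles (table_name, column_name, title)
VALUES ('user_data', 'fav_color', 'Favorite Color!');

COMMENT ON COLUMN user_data.fav_color IS 'Favorite Color!';

COMMIT;
```

If any step fails, the rollback removes the half-created field cleanly, which is the property most other databases' non-transactional DDL cannot offer.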
Re: [HACKERS] COMMIT NOWAIT Performance Option
On Mar 3, 2007, at 23:19 , Robert Treat wrote: A similar idea we've been kicking around would be having a set storage parameter = nologging option for ALTER TABLE which would, as its name implies, cause the system to ignore writing WAL logs for the table, much like it does for temp tables now. One caveat would be that you would probably need to forbid such a table from being the parent side of an FK relationship, but otherwise this should be fairly safe even for replay, since ALTER TABLE needs an exclusive lock, so you have finite points where data for the table would be written or ignored. I can't think of any reason why a global temp table couldn't be referenced by another global temp table, but maybe I'm missing something. Whenever postgres starts, it would simply truncate the tables to ensure temporary compliance (the no-foot-gun approach). I could then place such tables in a ramdisk tablespace and make PostgreSQL a transaction-safe memcached replacement. This would also be great for materialized views. Cheers, M
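For the record, this idea did eventually land in core: PostgreSQL 9.1 added UNLOGGED tables, which skip WAL exactly as proposed and are truncated after a crash (the "no-foot-gun approach" above). A minimal sketch with an invented table:

```sql
-- PostgreSQL 9.1+: no WAL is written for this table; it is emptied
-- automatically after an unclean shutdown and is not replicated.
CREATE UNLOGGED TABLE session_cache (
    key   text PRIMARY KEY,
    value text,
    seen  timestamptz DEFAULT now()
);
```

Combined with a tablespace on fast storage, this is close to the "transaction-safe memcached replacement" the post imagines.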
Re: [HACKERS] COMMIT NOWAIT Performance Option
On Feb 26, 2007, at 18:58 , Simon Riggs wrote: On Mon, 2007-02-26 at 23:25 +0000, Richard Huxton wrote: Simon Riggs wrote: Proposal: Implement a new option for COMMIT, for enhancing performance, providing a MySQL-like trade-off between performance and robustness for *only* those that want it. COMMIT NOWAIT Isn't the domain of transient data relegated to certain tables instead of specific transactions? Wouldn't it be easier to create un-WAL-logged global temp tables? -M
Re: [HACKERS] [Monotone-devel] Re: SCMS question
On Feb 23, 2007, at 11:24 , Andreas Pflug wrote: It probably _can_ never happen, because that would have to be a one-for-all solution, embracing both centric and distributed repositories, combining contradictory goals. So the first question to answer is: Will PostgreSQL continue with a single repository (the project was managed very successfully this way for a long time), or try a distributed approach. IMHO the facts would vote for a central repository, which would drastically reduce SCM candidates. Any distributed SCM can inherently be used as a central repo. The project leaders would merely designate one place from whence builds are generated and developers would simply sync with the central repo. In fact, distributed SCMs fit the open source development model better because any failure of the central repo (crash, buy-out, sabotage, needing a project fork) cannot cause chaos: a new central repo is designated - essentially an instant failover. Cheers, M
Re: [HACKERS] Three weeks left until feature freeze
On Thu, July 13, 2006 11:03 am, Jonah H. Harris wrote: This is my point exactly. As with many things, we keep skirting the real issue by going with an "improve the smaller component" approach such as "promote pgfoundry more". I have never seen this approach work, but maybe someone has an example of another OSS project that has successfully excluded major components like this? Perl? CPAN? Many modules are included but how they are chosen is somewhat arbitrary. However, those modules can be updated from CPAN (without redownloading Perl).
Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as
On Wed, June 28, 2006 1:14 pm, [EMAIL PROTECTED] wrote: On Wed, Jun 28, 2006 at 06:39:16PM +0200, Thomas Hallgren wrote: Instead, I would like to humbly request the inclusion of a UUID datatype (or an opaque 128-bit datatype) in the core package. It's increasingly common and some databases (MS SQL Server) already have built-in support for it. We have it. We're just not putting in the effort required to have it included in core, as it's too much effort to convince people that the type has value, that it is generic, and would be widely used without being abused. All the geometric types that I'll never use in core, with few or no uses, including functions to operate on these types, and no UUID type... Hehe... To me, that's irony... :-) Is it on pgfoundry? From past discussions, the new criterion for getting something into core is to first determine if it is successful on pgfoundry.
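For the record, this request was eventually granted: a core uuid type arrived in PostgreSQL 8.3, with generation functions first in contrib (uuid-ossp, pgcrypto) and, from PostgreSQL 13, in core itself. A sketch with an invented table:

```sql
-- PostgreSQL 8.3+ has a core 128-bit uuid type.
-- gen_random_uuid() is core in 13+; in earlier versions it is
-- provided by the pgcrypto extension.
CREATE TABLE widget (
    id   uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    name text
);
```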
Re: [HACKERS] vacuum row?
On Mon, June 26, 2006 9:37 am, Mark Woodward wrote: On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote: I originally suggested a methodology for preserving MVCC and everyone is confusing it as update in place, this is not what I intended. Actually, you should've presented your idea as performing MVCC the way Firebird does... the idea is basically the same. Doing some research never hurts... especially with this crowd. Is it really necessary to make personal comments like this? Let's discuss ideas, not personalities or people. The whole issue was how to address updates steadily degrading performance. I wanted to brainstorm the issue and find a solution. I tossed out a first guess at an algorithm to start the ball rolling. Was it perfect? No. Was it intended to be? No. It was intended to spark a discussion, get people, first, to recognize the problem, and then to think about possible solutions. I find that this, while chaotic, usually finds the best solutions. There are a lot of good and smart people here who understand this process and see it for what it is. Unfortunately, some don't. It isn't about research, per se, because it is assumed that we all know the various issues involved to some degree. It is about using the collective knowledge of the group and coming up with an answer. Actually, it is. There are plenty of databases that don't need an expensive separate process to clean out dead space, so it would be wise to see how those alternatives handle MVCC to see what this project can glean from others' work - that is the point of open source. Firebird, for example, has a half-dozen articles on how it handles MVCC and dead tuples. In particular, it puts the vacuum burden on the sessions and fires off a separate cleanup (sweep) thread. After all, what will know better than the transaction itself on what needs to be cleaned up?
Linking goodness: http://www.firebirdsql.org/doc/whitepapers/fb_vs_ibm_vs_oracle.htm http://www.ibphoenix.com/main.nfs?a=ibphoenixpage=ibp_expert4 -M
Re: [HACKERS] vacuum, performance, and MVCC
On Fri, June 23, 2006 9:56 am, Martijn van Oosterhout wrote: On Fri, Jun 23, 2006 at 03:30:45PM +0200, Csaba Nagy wrote: It sounds like you have a big problem and you need a big solution. Well, Postgres does a decent job as it is. The problem is under peak load, sometimes it gets bogged down and the usual things like vacuum will not help immediately. I think a few more features like the dead space map for quick vacuum and even something like the original post's proposition would make postgres fly under heavy load too... I know there have been a number of suggestions in the past to deal with this thing. Some I don't remember being mentioned in this thread are: - Once a tuple has been determined to be invisible to everyone, truncate it to just the header. This would probably work wonders for frequently updated wide tables. However, this requires keeping track of the oldest active xact; I'm not sure how that works at the moment. - Have the bgwriter do cleanup work before writing out a block. It could probably do the truncation bit above, but totally removing old tuples requires cleaning out the indexes too, which AIUI is the hard part of vacuuming. One totally whacked-out idea I just thought of: Instead of just truncating tuples when they're invisible, mark them dying and make the data section store an array of CTIDs pointing to the index tuples pointing to it. Lookups that find the tuple via an index could store the CTID of the index tuple before continuing. If the bgwriter sees it has a full set, it can efficiently remove the tuple straight away. There are of course drawbacks to this approach; you'd probably need something like the half-dirty pages to avoid a large increase in write load. If it's even beneficial at all given concurrency issues. A lot of these recommendations sound like garbage collection ideas found in modern programming languages.
Perhaps it would be worth considering allowing sessions to keep track of which pages they alter and spawn a separate process per connection to sweep up slowly along the way. Also, it's nice that vacuum now has slow-down settings, but why isn't there an option to autovacuum during periods of idleness and pause when busy? -M
Re: [HACKERS] let's meet
On Thu, June 22, 2006 1:42 pm, Josh Berkus wrote: Marc, Sorry folks, my fault ... hit the 'accept' button too fast So, was Wilbur really as attractive as she claimed? In all seriousness, it's actually a pretty clever spam ploy. 1) Spam mailing lists of nerdy, desperate guys with thinly-veiled fake solicitations for sex. 2) Thousands of naive mailing list subscribers respond to the email. 3) Email addresses of said desperate men are harvested and sold. 4) Aforementioned men are spammed with ads for B16 p3n15 pills. 5) ??? 6) Profit! Best regards, Wilbur
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
It would be nice to keep the gettimeofday()s wherever they are most useful on hardware/software where they are cheap. Perhaps a compile-time option? On Fri, June 9, 2006 11:18 am, Martijn van Oosterhout wrote: On Fri, Jun 09, 2006 at 10:00:20AM -0400, Tom Lane wrote: To tell you the truth, this information makes me even less pleased with the sampling-gettimeofday patch than I was before. If gettimeofday() in itself increases the runtime of a node by a factor of 10, then just trying to subtract off that time is no solution. There's too much impact on surrounding nodes, and too much roundoff error anyhow. I had thought we were applying an order-of-ten-percent correction by subtracting SampleOverhead, not an order-of-10x correction :-( Eh? The whole point is to call gettimeofday() much less often. If you call it 1000th as often, then the correction is only on the order of one hundredth of the normal query time... Subtracting SampleOverhead is only a correction on the order of a few percent, it's the reduced calling of gettimeofday() that provides the benefit. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate.
Re: Ending EXPLAIN ANALYZE early (was Re: [HACKERS] That EXPLAIN
I think what he meant was a separate EXPLAIN-CANCEL message on a cancel-type connection, which would be completely backwards compatible. Old clients simply wouldn't be able to use the special EXPLAIN cancel, just like it is now. On Thu, June 8, 2006 3:01 pm, Tom Lane wrote: Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: Would it be possible to make a whole new protocol message for EXPLAIN results? I'm really unwilling to get into that. For one thing, that would absolutely positively break *all* use of EXPLAIN from un-fixed clients, whether you were trying to stop early or not. The reason I'm suggesting that is because it might make it easier to implement the SIGINFO handler that sends incremental EXPLAIN results on demand that I was describing earlier. Doesn't matter, because that's not happening ;-) SIGINFO isn't portable, and even if it were, a signal handler couldn't possibly generate EXPLAIN output (remember those catalog accesses). regards, tom lane
Re: [HACKERS] [PATCHES] Implementing RESET CONNECTION ...
To complete the connection pooling for multiple users, it would be great to have a protocol-level option to change roles semi-permanently (to reduce permissions). RESET SESSION AUTHORIZATION would then bounce back to that (new, set) role until another protocol-level role rollback. This would allow completely reusable connections per database while maintaining a real sandbox for each connection. On Tue, April 25, 2006 10:19 am, Bruce Momjian wrote: OK, what would people like done with this patch? Our TODO list has: * -Add RESET CONNECTION command to reset all session state This would include resetting of all variables (RESET ALL), dropping of temporary tables, removing any NOTIFYs, cursors, open transactions, prepared queries, currval()s, etc. This could be used for connection pooling. We could also change RESET ALL to have this functionality. The difficulty of this feature is allowing RESET ALL to not affect changes made by the interface driver for its internal use. One idea is for this to be a protocol-only feature. Another approach is to notify the protocol when a RESET CONNECTION command is used. This patch does everything except reset currval(), but the big missing item is that it doesn't handle the protocol issues outlined in the TODO item. However, there also has been very little discussion on exactly how the protocol stuff would work. Should we add it for 8.2 and see if we get any problem reports?
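For the record, most of this TODO item was later satisfied by DISCARD ALL, added in PostgreSQL 8.3. A sketch of how a pooler would use it today (the role name is invented; note that RESET SESSION AUTHORIZATION returns to the login role, not to a pool-set role as proposed above):

```sql
-- PostgreSQL 8.3+: reset session state before reusing a pooled
-- connection. Covers GUCs, temporary tables, prepared statements,
-- cursors, and session authorization.
DISCARD ALL;

-- Rough SQL-level analogue of the role-sandbox idea in the post:
SET SESSION AUTHORIZATION app_user;  -- pool hands the connection out
RESET SESSION AUTHORIZATION;         -- pool takes it back
```

The protocol-level "semi-permanent" role change the post asks for - one a client inside the sandbox cannot undo - is still not what SET SESSION AUTHORIZATION provides, which is why the sandbox here is only advisory.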
Re: [HACKERS] control pg_hba.conf via SQL
Could postgres offer at least a read-only view of the data in the interim? Ordering could be controlled by line number. On Thu, March 30, 2006 10:14 am, Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: If your pg_hba.conf looks like host all all 0.0.0.0/32 md5 there's not much call to update it dynamically ... There'll be a call to update it once - to 0.0.0.0/0 ;-) Doh ;-). Should make more effort to check my throwaway examples ... But it's not clear to me why a CONNECT right shouldn't encompass all the things that hba does, i.e. connect method, source address and auth method. Because that stuff doesn't fit into either the syntax of GRANT or the system tables that store grant information. It's talking about concepts that don't even exist in the SQL world (while users and databases certainly do). Also, we know from experience that there's value in applying an ordered set of tests in pg_hba.conf --- in particular, rules about local vs local net vs anywhere connections are most easily expressed that way. We would need some substitute rule or concept in order to do the same work in GRANT, and I don't see what that would be. Recently in another thread someone was remarking about how ugly MySQL's authentication methods are. I think that's in part because they have chosen to wedge the client hostname into their concept of user. It doesn't fit nicely. regards, tom lane
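The read-only view suggested at the top of this message did eventually appear: PostgreSQL 10 added the pg_hba_file_rules system view, with ordering by line number exactly as proposed:

```sql
-- PostgreSQL 10+: inspect the parsed pg_hba.conf without touching
-- the file; rules are reported in their on-disk order.
SELECT line_number, type, database, user_name, address, auth_method
FROM pg_hba_file_rules
ORDER BY line_number;
```

Writing to pg_hba.conf via SQL, by contrast, never made it in, for the GRANT-mismatch reasons Tom Lane gives above.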
Re: [HACKERS] [NOVICE] Question on TRUNCATE privleges
The author doesn't mention why he got a 600x increase- perhaps he bypassed the delete triggers which was OK for his situation. I don't like the notion that an optimization requires additional privileges...why not detect an unqualified delete and call truncate instead IFF there are no delete triggers on the table? I'm not entirely sure that requiring ownership of the table is the appropriate restriction for TRUNCATE. It made some sense back when TRUNCATE wasn't transaction-safe, but now that it is, you could almost argue that ordinary DELETE privilege should allow TRUNCATE. Almost. The hole in the argument is that TRUNCATE doesn't run ON DELETE triggers and so it could possibly be used to bypass things the table owner wants to have happen. You could equate TRUNCATE to DROP TRIGGER(s), DELETE, CREATE TRIGGER(s) ... but DROP TRIGGER requires ownership. CREATE TRIGGER only requires TRIGGER privilege which is grantable. So one answer is to change DROP TRIGGER to require TRIGGER privilege (which would mean user A could remove a trigger installed by user B, if both have TRIGGER privileges on the table) and then say you can TRUNCATE if you have both DELETE and TRIGGER privileges. It looks to me like the asymmetry between CREATE TRIGGER and DROP TRIGGER is actually required by SQL99, though, so changing it would be a hard sell (unless SQL2003 fixes it?). Comments anyone? regards, tom lane
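For the record, PostgreSQL 8.4 settled this debate a different way: TRUNCATE became its own grantable privilege, rather than being derived from DELETE plus TRIGGER or tied to ownership. A sketch with invented names:

```sql
-- PostgreSQL 8.4+: TRUNCATE is a distinct, explicitly grantable right.
GRANT TRUNCATE ON TABLE log_entries TO batch_loader;

-- ON DELETE triggers are still bypassed by TRUNCATE, so the grant is
-- an explicit statement of trust by the table owner - which resolves
-- the "bypass the owner's triggers" objection raised above.
```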
[HACKERS] Documentation: Fast Backward/Forward
A simple documentation enhancement request: please provide Fast Backward/Forward links at the bottom of the page as well.
[HACKERS] Errors uniquely identified
There doesn't seem to be any way to get error reporting with unique identifiers for each type of error (or warning or notice). Is this something planned for the future? Would this be a daunting task because the various messages are strewn throughout the sources? Would centralized error strings (and ids) be appreciated for easier automated parsing? Just some thoughts... I might offer to tackle this unless there is some reason not to do so.
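Much of this request is answered today by PostgreSQL's SQLSTATE scheme: every error carries a five-character machine-readable code (catalogued in Appendix A of the documentation), even though the message strings themselves remain scattered through the sources. A sketch of catching one by its code (the DO block needs PostgreSQL 9.0+; the table name is invented):

```sql
-- The error's unique identifier is its SQLSTATE; undefined_table
-- corresponds to code 42P01.
DO $$
BEGIN
    INSERT INTO nonexistent_table VALUES (1);
EXCEPTION WHEN undefined_table THEN
    RAISE NOTICE 'caught SQLSTATE %', SQLSTATE;  -- reports 42P01
END;
$$;
```

Client libraries expose the same code (e.g. via the error's SQLSTATE field in libpq), so automated parsing does not need to match message text.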