Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-02 Thread Russell Smith
On 28/05/10 04:00, Josh Berkus wrote:
>>  Consider a table that is
>> regularly written but append-only.  Every time autovacuum kicks in,
>> we'll go and remove any dead tuples and then mark the pages
>> PD_ALL_VISIBLE and set the visibility map bits, which will cause
>> subsequent vacuums to ignore the all-visible portions of the table...
>> until anti-wraparound kicks in, at which point we'll vacuum the entire
>> table and freeze everything.
>>
>> If, however, we decree that you can't write a new tuple into a
>> PD_ALL_VISIBLE page without freezing the existing tuples, then you'll
>> still have the small, incremental vacuums but those are pretty cheap,
>> 
> That only works if those pages were going to be autovacuumed anyway.  In
> the case outlined above (which I've seen at 3 different production sites
> this year), they wouldn't be; a table with less than 2% updates and
> deletes does not get vacuumed until max_freeze_age for any reason.  For
> that matter, pages which are getting autovacuumed are not a problem,
> period; they're being read and written and freezing them is not an issue.
>
> I'm not seeing a way of fixing this common issue short of overhauling
> CLOG, or of creating a freeze_map.  Darn.
>   
Wouldn't you get a positive enough effect by adjusting the table's
autovacuum_freeze_min_age and autovacuum_freeze_max_age?  If you set
those numbers small, it appears to me that you would very quickly reach
a state where vacuum would examine only the most recent part of the
table rather than the whole thing.  Does that give you enough of a win
that it stops the scanning and writing of the whole table, which reduces
the performance problem being experienced?  It's not a complete
solution, but does it go some way?
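
A minimal sketch of what that per-table tuning might look like (assuming
the autovacuum_freeze_min_age/autovacuum_freeze_max_age storage parameters
are available on the server version in question; the table name and values
are purely illustrative):

ALTER TABLE append_only_log SET (
    autovacuum_freeze_min_age = 10000,       -- freeze tuples almost as soon as they are vacuumed
    autovacuum_freeze_max_age = 100000000    -- per-table anti-wraparound trigger
);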

Regards

Russell




Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.

2008-11-19 Thread Russell Smith
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
>   
>> Bruce Momjian wrote:
>> 
>>> This is not something we would typically backpatch because of the danger
>>> of introducing some unexpected change in libpq.  We can provide a patch
>>> to anyone who needs it, or if the community wants it backpatched I can
>>> certainly do that.
>>>   

If we start deciding we are not backpatching fixes that we know cause
crashes, where is the limit?

>> It isn't?  It does seem like a bug, which we do typically backpatch ...
>> 
>
> Well, it's a risk-reward tradeoff.  In this case it seems like there's
> a nontrivial risk of creating new bugs against fixing a problem that
> evidently affects very few people.  I concur with Bruce's feeling that
> we shouldn't backpatch ... at least not now.  Once the patch has been
> through beta testing we could reconsider.
>
>   regards, tom lane
>   
I would like to see this backpatched.  Even though the PostgreSQL
community hasn't seen a lot of complaints, there have been a number of
reports where the bug has caused crashes.  Ubuntu's Launchpad has 6
duplicates for this bug.  PHP has a bug report for it.  So it's not that
people don't know about it; they just didn't know how to fix it.  All
that said, I agree it's safer to wait until the 8.4 beta cycle has given
this code change a good run before proceeding.  In the meantime,
distributions can either backpatch it themselves or wait for the
PostgreSQL community to apply the patch.

For the environment where I have this problem, I think it's still going
to be an uphill battle to get Red Hat to incorporate the fix into RHEL5,
whichever route the community takes with backpatching.

Russell.



Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.

2008-11-13 Thread Russell Smith
Bruce Momjian wrote:
> Yes, my defines were very messed up;  updated version attached.
>   
Hi,

I've not done a review of this patch; however, I did backport it to 8.3
(attached as a unified diff).  The backport wasn't prepared to PG
submission conventions, so it's not a context diff.  I tested the
backported patch and the issues I was experiencing in the initial bug
report have stopped, so the fix works for the originally reported problem.

Will this be back patched when it's committed?


Regards

Russell
diff -uNr postgresql-8.3.3/src/interfaces/libpq/fe-secure.c postgresql-8.3.3.new/src/interfaces/libpq/fe-secure.c
--- postgresql-8.3.3/src/interfaces/libpq/fe-secure.c	2008-01-29 13:03:39.0 +1100
+++ postgresql-8.3.3.new/src/interfaces/libpq/fe-secure.c	2008-11-13 20:57:40.0 +1100
@@ -142,12 +142,10 @@
 #define ERR_pop_to_mark()	((void) 0)
 #endif
 
-#ifdef NOT_USED
-static int	verify_peer(PGconn *);
-#endif
 static int	verify_cb(int ok, X509_STORE_CTX *ctx);
 static int	client_cert_cb(SSL *, X509 **, EVP_PKEY **);
 static int	init_ssl_system(PGconn *conn);
+static void destroy_ssl_system(void);
 static int	initialize_SSL(PGconn *);
 static void destroy_SSL(void);
 static PostgresPollingStatusType open_client_SSL(PGconn *);
@@ -156,11 +154,19 @@
 static void SSLerrfree(char *buf);
 #endif
 
-#ifdef USE_SSL
 static bool pq_initssllib = true;
 
 static SSL_CTX *SSL_context = NULL;
+#ifdef ENABLE_THREAD_SAFETY
+static int ssl_open_connections = 0;
+ 
+#ifndef WIN32
+static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER;
+#else
+static pthread_mutex_t ssl_config_mutex = NULL;
+static long win32_ssl_create_mutex = 0;
 #endif
+#endif   /* ENABLE_THREAD_SAFETY */
 
 /*
  * Macros to handle disabling and then restoring the state of SIGPIPE handling.
@@ -839,40 +845,53 @@
 init_ssl_system(PGconn *conn)
 {
 #ifdef ENABLE_THREAD_SAFETY
-#ifndef WIN32
-	static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
-#else
-	static pthread_mutex_t init_mutex = NULL;
-	static long mutex_initlock = 0;
-
-	if (init_mutex == NULL)
-	{
-		while (InterlockedExchange(&mutex_initlock, 1) == 1)
-			 /* loop, another thread own the lock */ ;
-		if (init_mutex == NULL)
-			pthread_mutex_init(&init_mutex, NULL);
-		InterlockedExchange(&mutex_initlock, 0);
-	}
-#endif
-	pthread_mutex_lock(&init_mutex);
-
-	if (pq_initssllib && pq_lockarray == NULL)
-	{
-		int			i;
-
-		CRYPTO_set_id_callback(pq_threadidcallback);
-
-		pq_lockarray = malloc(sizeof(pthread_mutex_t) * CRYPTO_num_locks());
-		if (!pq_lockarray)
-		{
-			pthread_mutex_unlock(&init_mutex);
-			return -1;
-		}
-		for (i = 0; i < CRYPTO_num_locks(); i++)
-			pthread_mutex_init(&pq_lockarray[i], NULL);
-
-		CRYPTO_set_locking_callback(pq_lockingcallback);
-	}
+#ifdef WIN32
+  if (ssl_config_mutex == NULL)
+  {
+  while (InterlockedExchange(&win32_ssl_create_mutex, 1) == 1)
+   /* loop, another thread own the lock */ ;
+  if (ssl_config_mutex == NULL)
+  {
+  if (pthread_mutex_init(&ssl_config_mutex, NULL))
+  return -1;
+  }
+  InterlockedExchange(&win32_ssl_create_mutex, 0);
+  }
+#endif
+   if (pthread_mutex_lock(&ssl_config_mutex))
+   return -1;
+ 
+   if (pq_initssllib)
+   {
+   if (pq_lockarray == NULL)
+   {
+   int i;
+   
+   pq_lockarray = malloc(sizeof(pthread_mutex_t) * CRYPTO_num_locks());
+   if (!pq_lockarray)
+   {
+   pthread_mutex_unlock(&ssl_config_mutex);
+   return -1;
+   }
+   for (i = 0; i < CRYPTO_num_locks(); i++)
+   {
+   if (pthread_mutex_init(&pq_lockarray[i], NULL))
+   {
+   free(pq_lockarray);
+   pq_lockarray = NULL;
+   pthread_mutex_unlock(&ssl_config_mutex);
+   return -1;
+   }
+   }
+   }
+   
+   if (ssl_open_connections++ == 0)
+   {
+   /* These are only required for threaded SSL applications */
+   CRYPTO_set_id_callback(pq_threadidcallback);
+   CRYPTO_set_locking_callback(pq_lockingcallback);
+   }
+}
 #endif
 	if (!SSL_context)
 	{
@@ -894,18 +913,61 @@
 			  err);
 			SSLerrfree(err);
 #ifdef ENABLE_THREAD_SAFETY
-			pthread_mutex_unlock(&init_mutex);
+			pthread_mutex_unlock(&ssl_config_mutex);
 #endif
 			return -1;
 		}
 	}
 #ifdef ENABLE_THREAD_SAFETY
-	pthread_mutex_unlock(&init_mutex);
+	pthread_mutex_unlock(&ssl_config_mutex);
 #endif
 	return 0;
 }
 
 /*
+ * This function is needed because if the libpq library is unloaded
+ * from the application, the callback functions will no longer exist when
+ * SSL is used by other parts of the system.  For this reason,
+ * we unregister the SSL callback functions when the last libpq
+ * connection is closed.
+ */
+static void
+destroy_ssl_system(void)
+{
+#ifdef ENABLE_THREAD_SAFETY
+  /* Assume mutex is already created */
+  if (pthread_mutex_lock(&ssl

Re: [HACKERS] SSL README

2008-10-23 Thread Russell Smith
Alvaro Herrera wrote:
> While you're cleaning up SSL, how about the thread with this email on
> it:
>
> 19212172.post%40talk.nabble.com
>
>   
Yeah, I mentioned this to Magnus this morning (my time) and he said
Bruce was compiling a patch in time for the next commit fest.

I'm not sure where it's all at, but an "official" update would be helpful.

Russell.



Re: [HACKERS] parallel pg_restore - WIP patch

2008-09-26 Thread Russell Smith
Andrew Dunstan wrote:
>> Do we know why we experience "tuple concurrently updated" errors if we
>> spawn threads too fast?
>>   
>
> No. That's an open item.

Okay, I'll see if I can have a little more of a look into it.  No
promises, as the restore isn't playing nicely.
>
>>
>> the memory context is shared across all threads.  Which means that it's
>> possible the memory contexts are stomping on each other.  My GDB skills
>> are not up to reproducing this in a gdb session as there are
>> forks going on all over the place.  And if you process them in a serial
>> fashion, there aren't any errors.  I'm not sure of the fix for this.
>> But in a parallel environment it doesn't seem possible to store the
>> memory context in the AH.
>>   
>
>
> There are no threads, hence nothing is shared. fork() create s new
> process, not a new thread, and all they share are file descriptors.
>
> However, there does seem to be something odd happening with the
> compression lib, which I will investigate. Thanks for the report.

I'm sorry, I meant processes there.  I'm aware there are no threads.
But my feeling was that when you fork with open files you inherit all of
the open file properties, including the file position, and because the
descriptor is dup'ed, everything it points to is shared with every other
copy of the descriptor.  My brief research shows that in 2005 there was
a kernel mailing list discussion on this issue.
http://mail.nl.linux.org/kernelnewbies/2005-09/msg00479.html was quite
informative for me.  Again, I could be wrong, but it's worth a read.  If
it is true, then the file needs to be reopened by each child; it can't
use the duplicated descriptor.  I haven't had a chance to implement and
test this as it's late here, but I'd take a stab that it will solve the
compression library problems.

I hope this helps, not hinders

Russell.



Re: [HACKERS] parallel pg_restore - WIP patch

2008-09-26 Thread Russell Smith
Hi,

As I'm interested in this topic, I thought I'd take a look at the
patch.  I have no capability to test it on high end hardware but did
some basic testing on my workstation and basic review of the patch.

I somehow had the impression that instead of creating a new connection
for each restore item we would create the processes at the start and
then send them the dumpId's they should be restoring.  That would allow
the controller to batch dumpId's together and expect the worker to
process them in a transaction.  But this is probably just an idea I
created in my head.

Do we know why we experience "tuple concurrently updated" errors if we
spawn threads too fast?

I completed some test restores using the pg_restore from head with the
patch applied.  The dump was a custom dump created with pg 8.2 and
restored to an 8.2 database.  To confirm this would work, I completed a
restore using the standard single threaded mode.  The schema restored
successfully.  The only errors reported involved non-existent roles.

When I attempt to restore using parallel restore I get out of memory
errors reported from _PrintData.   The code returning the error is;

_PrintData(...
while (blkLen != 0)
{
if (blkLen + 1 > ctx->inSize)
{
free(ctx->zlibIn);
ctx->zlibIn = NULL;
ctx->zlibIn = (char *) malloc(blkLen + 1);
if (!ctx->zlibIn)
die_horribly(AH, modulename, " out of memory\n");

ctx->inSize = blkLen + 1;
in = ctx->zlibIn;
}


It appears from my debugging and looking at the code that in _PrintData;
lclContext *ctx = (lclContext *) AH->formatData;

the memory context is shared across all threads.  Which means that it's
possible the memory contexts are stomping on each other.  My GDB skills
are not up to reproducing this in a gdb session as there are
forks going on all over the place.  And if you process them in a serial
fashion, there aren't any errors.  I'm not sure of the fix for this. 
But in a parallel environment it doesn't seem possible to store the
memory context in the AH.

I also receive messages saying "pg_restore: [custom archiver] could not
read from input file: end of file".  I have not investigated these
further as my current guess is they are linked to the out of memory error.

Given I ran into this error on my first testing attempt, I haven't
evaluated much else at this point in time.  Now, all this could be
because I'm using an 8.2 archive, but it works fine in single-threaded
restore mode.  The dump file is about 400MB compressed, and an entire
schema was removed from the restore path with a custom restore list.

Command line used;  PGPORT=5432 ./pg_restore -h /var/run/postgresql -m4
--truncate-before-load -v -d tt2 -L tt.list
/home/mr-russ/pg-index-test/timetable.pgdump 2> log.txt

I sent the log and this email originally to the list, but I think the
attachment was too large, so I've resent it without any attachments.  Since
my initial testing, Stefan has confirmed the problem I am having.

If you have any questions, would like me to run other tests or anything,
feel free to contact me.

Regards

Russell




Re: [HACKERS] Where to Host Project

2008-09-19 Thread Russell Smith
Josh Berkus wrote:
>>> That's kind of what I'm doing now. But I'm wondering if I should
>>> bother with pgFoundry at all. It seems pretty dead (see Josh Berkus's
>>> reply).
>>>   
>
> Actually, pgFoundry remains extremely popular.  Currently, we're getting an 
> average of 2-3 new projects a week.
>
> The issue with pgFoundry is that it's based on a hacked version of the 
> GForge code, which had legacy problems to begin with and is now no longer 
> upgradable.  And while lots of people want to complain about it, nobody 
> wants to put in the 15-25 hours of work required to fix it up so that it 
> supports SVN and code snippets (including me).  
>   
I'd be willing to investigate and have a go at this.  Until now I
hadn't heard that we were in such a situation with pgFoundry.  Does it
even have a roadmap for future work?  It is a problem that GForge
development in general appears to have slowed/stalled for a couple of
years.  There have been some recent developments, but I'm not convinced
it will get back to the same development rate, as Tim Perdue is working
on the AS version now.

I personally had such high hopes for pgfoundry as the GBorg site was not
that great.  But it seems that we haven't been able to make the
pgfoundry dream a reality.
> However, I agree with Robert that maintaining a collab site is, today, a 
> bad use of our limited administration resources, which could be better 
> turned to developing directory and build services (for "kitchen sink" 
> packages).  Realistically, though, shutting down pgFoundry might take as 
> much work as fixing it.
>   
Currently it's managed by core developers.  I'm not convinced it's the
best use of their time either.  But others in the community may be at
their best supporting something like pgFoundry.  But whether it's safe
to hand out that level of clearance to other community members is the
decision that has to be made.

So if somebody is interested in contacting me about moving pgFoundry
forward, please do so.

Regards

Russell Smith




Re: [HACKERS] no XLOG during COPY?

2008-09-16 Thread Russell Smith
Andrew Dunstan wrote:
> [snip]
>>  
>
> Er, who doesn't know what for certain, exactly? pg_restore will
> certainly know that it has created the table in another session and
> can thus safely truncate the table in the same transaction as the data
> load.
>
> cheers
>
> andrew
>
I'm confused about why table definition and data can't be loaded in the
same backend and transaction.  Can somebody explain that?

All items in the tree like  A -> B -> C -> D  should all be loaded in
the same transaction as they are serially dependent.  I can't think of a
way that the table data requires more than just the table to load. 
Foreign keys may produce this situation, but if all tables are loaded
with their data I can't see how it can happen, as foreign key tables must
be loaded before the referencing table.  But then I think these
constraints are loaded at the end anyway.

The first cut of this may not have the dependency resolution smarts to
work out how best to group restore items together to send to a backend.
My research into how the directed graph dependency
information is stored should allow for dishing out the data to backends
in the best possible way.  But currently there is no graph as such, just
a serial list of items that are safe to load.  Producing the graph will
give a better idea of maximum concurrency based on what's dependent on
each other.  But the graph has to be built from the dependency
information that's stored.

Is it also feasible to have the -1 (single transaction) option complete
the largest possible work unit inside a single transaction?  This means
there would be one transaction per backend work unit, eg (A, B, C, D in
the above).  I don't know if indexes can skip WAL when they are created
in the table creation transaction, but that would seem like another win
if they were added at the same time as the table.  That does play
against the idea of running all of the index creation statements in
parallel to get the benefit of synchronized scans.  I don't know what's
going to be the biggest win on big hardware as I don't have any.  Just
something to think about.
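
For reference, a minimal sketch of the WAL-skip case being discussed
(the table, index and file names are made up, and it assumes WAL
archiving is disabled): because the table is created in the same
transaction as the COPY, the data load can bypass WAL and the relation
is simply fsynced at commit.

BEGIN;
CREATE TABLE restore_target (id integer, payload text);
COPY restore_target FROM '/tmp/restore_target.dat';  -- new table in this transaction, so WAL can be skipped
CREATE INDEX restore_target_id_idx ON restore_target (id);
COMMIT;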

Thanks

Russell Smith



Re: [HACKERS] DROP DATABASE always seeing database in use

2008-08-05 Thread Russell Smith
Gregory Stark wrote:
> It seems there's something wrong with CheckOtherDBBackends() but I haven't
> exactly figured out what. There are no other sessions but drop database keeps
> saying "regression" is being accessed by other users. I do see Autovacuum
> touching tables in regression but CheckOtherDBBackends() is supposed to send
> it a sigkill if it finds it and it doesn't seem to be doing so.
>
> I've been hacking on unrelated stuff in this database and have caused multiple
> core dumps and autovacuum is finding orphaned temp tables. It's possible some
> state is corrupted in some way here but I don't see what.
>
>   
Autovacuum does this as well.  I know that on 8.1 I've been bitten by it
a number of times.  I don't know about CVS or versions newer than 8.1,
but it's an option worth considering, as autovacuum doesn't show up in
pg_stat_activity.
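
A quick way to see which sessions the server thinks are connected to the
database (a sketch; the column names are from the 8.1-8.3 era, and as
noted above autovacuum workers may not be listed at all):

SELECT procpid, usename, current_query, query_start
FROM pg_stat_activity
WHERE datname = 'regression';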

Regards

Russell.




Re: [HACKERS] Identifier case folding notes

2008-07-08 Thread Russell Smith
Andrew Dunstan wrote:
> I'm not sure if you've read all the archive history on this. Here are
> the pointers from the TODO list:
>
> http://archives.postgresql.org/pgsql-hackers/2004-04/msg00818.php
> http://archives.postgresql.org/pgsql-hackers/2006-10/msg01527.php
> http://archives.postgresql.org/pgsql-hackers/2008-03/msg00849.php
>
>
> The fact is that we have substantial groups of users who want
> different things:
> . current users who want no change so there is no breakage in existing
> apps
> . users on other DBs who want Spec compliance
> . users on yet other DBs who want case preservation
>
> The last group should not be lightly dismissed - it is quite common
> behaviour on MSSQL as well as on MySQL, so we have some incentive to
> make this possible to encourage migration.
>
> I'm strongly of the opinion therefore that this should be behaviour
> determined at initdb time (can't make it later because of shared
> catalogs). I suspect that we won't be able to do all this by simple
> transformations in the lexer, unlike what we do now. But I do think
> it's worth doing.
>
> cheers
>
> andrew
>
Hi,

as part of the
http://archives.postgresql.org/pgsql-hackers/2008-03/msg00849.php
thread, I did a reasonable amount of discovery work on making the 3
options Andrew presents a reality.  As I'm not skilled enough, I never
got far enough to make them all work at once.  I did, however, get lower
case and case preservation working.

To make those two work, the catalogs need no changes.  Some of the
regression tests expect case folding, so they did need changing to
operate correctly.  I was unable to make the input files to initdb
correctly fold the case of system catalogs for the upper case version.
I'm sure somebody with more experience would not find it as difficult as
I did.  Function names tended to be where all the gotchas were: Count()
vs count() vs COUNT(), for example.

Once the db was up and running, the issue became all the supporting
tools.  psql was made to autocomplete with case preservation, and I was
going to make pg_dump just quote everything.  I then got to the point of
adding a fixed GUC, along the lines of lc_ctype, that allows psql to read
the case folding situation and act accordingly.  That is where my
progress ended.

Attached is what I had worked on.  It's a patch against 8.3.1.  I know
it's not CVS head, but it is what I was using at the time to experiment.

Regards

Russell
=== modified file 'src/backend/access/transam/xlog.c'
--- src/backend/access/transam/xlog.c	2008-03-27 12:10:18 +
+++ src/backend/access/transam/xlog.c	2008-03-27 14:15:13 +
@@ -4040,6 +4040,9 @@
 	PGC_INTERNAL, PGC_S_OVERRIDE);
 	SetConfigOption("lc_ctype", ControlFile->lc_ctype,
 	PGC_INTERNAL, PGC_S_OVERRIDE);
+	/* Make the fixed case folding visible as GUC variables, too */
+	SetConfigOption("identifier_case_folding", ControlFile->identifierCaseFolding,
+	PGC_INTERNAL, PGC_S_OVERRIDE);
 }
 
 void
@@ -4290,6 +4293,10 @@
 	ControlFile->time = checkPoint.time;
 	ControlFile->checkPoint = checkPoint.redo;
 	ControlFile->checkPointCopy = checkPoint;
+
+	/* Set the case folding option */	
+	strncpy(ControlFile->identifierCaseFolding, "preserved", 9);
+
 	/* some additional ControlFile fields are set in WriteControlFile() */
 
 	WriteControlFile();

=== modified file 'src/backend/catalog/information_schema.sql'
--- src/backend/catalog/information_schema.sql	2008-03-27 12:10:18 +
+++ src/backend/catalog/information_schema.sql	2008-03-27 12:12:15 +
@@ -23,7 +23,7 @@
  */
 
 CREATE SCHEMA information_schema;
-GRANT USAGE ON SCHEMA information_schema TO PUBLIC;
+GRANT usage ON SCHEMA information_schema TO public;
 SET search_path TO information_schema, public;
 
 
@@ -33,7 +33,7 @@
 
 /* Expand any 1-D array into a set with integers 1..N */
 CREATE FUNCTION _pg_expandarray(IN anyarray, OUT x anyelement, OUT n int)
-RETURNS SETOF RECORD
+RETURNS SETOF record
 LANGUAGE sql STRICT IMMUTABLE
 AS 'select $1[s], s - pg_catalog.array_lower($1,1) + 1
 from pg_catalog.generate_series(pg_catalog.array_lower($1,1),
@@ -214,7 +214,7 @@
 CREATE VIEW information_schema_catalog_name AS
 SELECT CAST(current_database() AS sql_identifier) AS catalog_name;
 
-GRANT SELECT ON information_schema_catalog_name TO PUBLIC;
+GRANT SELECT ON information_schema_catalog_name TO public;
 
 
 /*
@@ -241,9 +241,9 @@
 FROM pg_auth_members m
  JOIN pg_authid a ON (m.member = a.oid)
  JOIN pg_authid b ON (m.roleid = b.oid)
-WHERE pg_has_role(a.oid, 'USAGE');
+WHERE pg_has_role(a.oid, 'usage');
 
-GRANT SELECT ON applicable_roles TO PUBLIC;
+GRANT SELECT ON applicable_roles TO public;
 
 
 /*
@@ -256,7 +256,7 @@
 FROM applicable_roles
 WHERE is_grantable = 'YES';
 
-GRANT SELECT ON administrable_role_authorizations TO PUBLIC;
+GRANT SELECT ON administrable_role_authorizations TO public;
 
 
 /*
@@ -353,7 +353,7 @@
   AND a.attnum > 0 AND NOT a.attisdropped
   AND c.relkind 

Re: [HACKERS] [PATCHES] Removal of the patches email list

2008-07-03 Thread Russell Smith
Bruce Momjian wrote:
> We have come to agreement that there is no longer a need for a separate
> 'patches' email list --- the size of patches isn't a significant issue
> anymore, and tracking threads between the patches and hackers lists is
> confusing.
>
> I propose we close the patches list and tell everyone to start using
> only the hackers list.  This will require email server changes and web
> site updates, and some people who are only subscribed to patches have to
> figure out if they want to subscribe to hackers.
>
> I have CC'ed hackers, patches, and www because this does affect all
> those lists.
>
>   
I think this is a good idea, and was expecting it to have happened
already.  Is there any timeline or consensus that this is going to happen?

Regards

Russell Smith



Re: [HACKERS] TODO Item: Consider allowing control of upper/lower case folding of unquoted, identifiers

2008-03-27 Thread Russell Smith

Hi,

It looks like most of the hard yards will be in getting some form of 
consensus about what should be done for this TODO.  I can't see a reason 
not to get started on the design now.  If a decision is not able to be 
made after 4 years since the original discussion, is it worth removing 
the TODO or letting it sit for another 4?  But to the actual issue at hand.


Andrew Dunstan attempted to summarize the original 2004 thread in
http://archives.postgresql.org/pgsql-hackers/2006-10/msg01545.php:


--
There was some discussion a couple of years ago on the -hackers list 
about it, so you might like to review the archives. The consensus seemed 
to be that behaviour would need to be set no later than createdb time.


The options I thought of were:

1. current postgres behaviour (we need to do this for legacy reasons, of 
course, as well as to keep happy the legions who hate using upper case 
for anything)


2. strictly spec compliant (same as current behaviour, but folding to 
upper case for unquoted identifiers rather than lower)


3. fully case sensitive even for unquoted identifiers (not spec 
compliant at all, but nevertheless possibly attractive especially for 
people migrating from MS SQLServer, where it is an option, IIRC).

--
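
For reference, a minimal sketch of the current (option 1) behaviour that
the proposals above would make configurable (the table and column names
are made up):

CREATE TABLE Foo (Bar integer);   -- unquoted identifiers fold to lower case: table foo, column bar
SELECT BAR FROM FOO;              -- works; both identifiers fold to bar/foo
SELECT "Bar" FROM "Foo";          -- fails; quoted identifiers are case preserved and case sensitive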


Supporting all 3 of these behaviours at initdb time is not too invasive 
or complicated from my initial investigation.  The steps appear to be;


1. parser has to parse incoming identifiers with the correct casing 
changes. (currently downcase_truncate_identifier)
2. The output quoting needs to quote identifiers using the same rules as 
the parser. (currently quote_identifier)
3. the client needs to know what quote rules are in place. (libpq: 
PQfname, PQfnumber)
4. psql's \ commands need to be taught about the fact that case can
mean different things to different servers.
5. bootstrap needs to correctly case the tables and insert values when 
bootstrapping at initdb time. This is only really an issue for upper 
case folding.


Many people appear to advocate a 4th option where only column names are
case preserved or upper cased.  They expect other identifiers to behave
as they do now.  This doesn't really bring us any closer to the spec; it
takes us away from it, as Tom has suggested in the past.  It also
appears to increase the complexity and invasiveness of a patch.
Being able to support case preservation/sensitivity for all identifiers
at initdb time appears to be no more work than supporting the upper and
lower folding versions.


The discussions around having a name as supplied and a quoted version 
allow lots of flexibility, probably even down to the session level.  
However I personally am struggling to get my head around the corner 
cases for that approach.


If this needs to be at createdb time, I think we add at least the 
following complexities;


1. all relation names must have their case altered when copied from the
template database, or be quoted when copied.
  We have no idea what a template database might look like; all views
and functions would need to be parsed to ensure they point to valid tables.
2. shared relations must be able to be accessed using different names in
different databases, eg PG_DATABASE, pg_database.
3. The data in shared relations appears different to the same user in
different databases.
  eg my unquoted username is MrRuss; in db1 (upper): MRRUSS, db2 (case
sensitive): MrRuss, db3 (lower): mrruss

  My gut tells me that's going to lead to user confusion.


Dumping and restoring databases to different foldings can/will present 
an interesting challenge and I'm not sure how to support that.  I don't 
even know if we want to support that officially.


I'm leaning towards initdb time, mainly because I think a patch can be
produced that isn't too invasive and is much easier to review and
actually get applied.  I also think that adding the createdb time flags
would push this task beyond my ability to write up a patch.  Either way
though, consensus on what implementation we actually want going forward
will enable a more skilled developer to do this without the pain of
having to flesh out the design.


In light of this email and the other comments Tom and Andrew have made,
it's very easy to say 'too hard, we can't get agreement'.  I would have
thought that standards compliance would have been one good reason to
push forward with at least the upper case folding ability.  Both of the
previous threads on this issue raised lots of questions about possible
options, but there never seemed to be a phase of knocking the ideas
around and reaching consensus.  I would like to at least nail down some
of the requirements, if not all.  I have put forward my personal opinion,
but I expect that is not of significant value as there are many others
with much more experience than I.


Regards

Russell Smith


Re: [HACKERS] TODO Item: Consider allowing control of upper/lower case folding of unquoted, identifiers

2008-03-25 Thread Russell Smith

Tom Lane wrote:

Russell Smith <[EMAIL PROTECTED]> writes:
  
As an attempt at a first PostgreSQL patch, I'd like to see if I can do 
anything about this issue.



Trying that as a first patch is a recipe for failure... the short answer
is that no one can think of a solution that will be generally acceptable.

regards, tom lane
  
What makes this change particularly challenging?  The fact that nobody 
has agreed on how it should work, or the actual coding?


regards

Russell



[HACKERS] TODO Item: Consider allowing control of upper/lower case folding of unquoted, identifiers

2008-03-23 Thread Russell Smith

Hi,

As an attempt at a first PostgreSQL patch, I'd like to see if I can do 
anything about this issue.


I've read both the attached threads;

http://archives.postgresql.org/pgsql-hackers/2004-04/msg00818.php
http://archives.postgresql.org/pgsql-hackers/2006-10/msg01527.php


There seems to be no consensus about how to go about this.  I have made
some initial changes and found some problems with my attempts.  initdb
always creates pg_catalog relations with lowercase names, as does the
function list; eg with upper-case folding, count() requires "count"().
All of these can be altered on database copy, so it shouldn't be a
problem.  However, I see shared relations as a big problem.  The 2004
thread suggests that we want a per-database setting.  I am unable to see
how we share shared relations between databases with different case
folding.


pg_user is an example of this;

Lowercase database;  CREATE ROLE mrruss LOGIN;   results in -> mrruss  
as data in pg_user
Uppercase database;  CREATE ROLE mrruss LOGIN;   results in -> MRRUSS as
data in pg_user


Now both of those can be accessed from any database.  And you will get a 
different user based on the source database.


Overall, I'd like to concentrate on the implementation as I'm a 
beginner.  But I see this being mainly a problem with nailing down the 
actual requirement for the implementation.   So I'll try to start the 
discussion to allow me or somebody else to eventually develop a patch 
for this.


The first question is, are all the requirements of the 2004 thread still 
true now?


Setting case folding at initdb time seems the easiest method, but I'm not
sure if that's what people want.  Any GUC variables seem to play havoc
with the pg_catalog schema and the data in the catalogs.


Ideas and comments?

Thanks

Russell



Re: [HACKERS] VACUUM/ANALYZE counting of in-doubt tuples

2007-11-17 Thread Russell Smith
n, then the rule should be to not count
INSERT_IN_PROGRESS tuples at all --- they will be added to the 
appropriate count when the inserting transaction commits or aborts.

And DELETE_IN_PROGRESS tuples should be counted as live --- if the
deleting transaction commits, it will increment the dead-tuple count
at that time, and if it aborts then the tuple remains live.

I feel fairly comfortable with this analysis for ANALYZE, and the
patch I posted yesterday can easily be adjusted to accommodate it.
However, what of VACUUM?  As that code stands, every non-removable
tuple (including RECENTLY_DEAD ones) is counted as live, and the
dead-tuples count gets reset to zero.  That seems clearly bogus.
But the other-transaction-commits-second hypothesis seems a good bit
more dubious for VACUUM than it is for ANALYZE.

Should we attempt to adjust VACUUM's accounting as well, or leave it
for 8.4?  For that matter, should adjusting ANALYZE be left for 8.4?
Thoughts?
  
Given all my unqualified statements above, I'd be tempted to make sure we
can measure the problem with 8.3 and get a proper solution into 8.4.
Analyze seems safe to change now.  It doesn't actually make changes to
the live/dead tuple counts (I don't think) so it is lower risk.  I feel
uneasy about the vacuum stuff, as we don't know the exact side effects a
change like that could have.  It could have a larger impact than
suspected.  And at beta3/rc1 I feel it's too late in the cycle.


Regards

Russell Smith



Re: [HACKERS] [BUGS] Removing pg_auth_members.grantor (was Grantor name gets lost when grantor role dropped)

2007-05-15 Thread Russell Smith

Alvaro Herrera wrote:

Russell Smith wrote:
  

Alvaro Herrera wrote:


Alvaro Herrera wrote:

 
  

2. decide that the standard is braindead and just omit dumping the
  grantor when it's no longer available, but don't remove
  pg_auth_members.grantor

Which do people feel should be implemented?  I can do whatever we
decide; if no one has a strong opinion on the matter, my opinion is we
do (2) which is the easiest.


Here is a patch implementing this idea, vaguely based on Russell's.
  
I haven't had time to finalize my research about this, but the admin 
option with revoke doesn't appear to work as expected.


Here is my sample SQL for 8.2.4

create table test (x integer);
\z
create role test1 noinherit;
create role test2 noinherit;
grant select on test to test1 with grant option;
grant select on test to test2;
\z test
set role test1;
revoke select on test from test2;
\z test
set role test2;
select * from test;
reset role;
revoke all on test from test2;
revoke all on test from test1;
drop role test2;
drop role test1;
drop table test;
\q


The privilege doesn't appear to be revoked by test1 from test2.  I'm not 
sure if this is related, but I wanted to bring it up in light of the 
options we have for grantor.



Humm, but the privilege was not granted by test1, but by the user you
were using initially.  The docs state in a note that

A user can only revoke privileges that were granted directly by
that user.

I understand that this would apply to the grantor stuff being discussed
in this thread as well, but I haven't seen anyone arguing that we should
implement that for GRANT ROLE (and I asked three times if people felt it
was important and nobody answered).

  
Well, I would vote for implementing this in GRANT ROLE.  I wish to use 
it in my security model.  I don't think the spec is brain dead when you 
understand what it's trying to achieve.


Example:

2 Groups of administrators who are allowed to grant a role to users of 
the system


App_Admin_G1
App_Admin_G2
App_User

SET ROLE App_Admin_G1
GRANT App_User TO "Fred";
SET ROLE App_Admin_G2
GRANT App_User TO "John";
SET ROLE App_Admin_G1
REVOKE App_User FROM "John";

As App_Admin_G1 did not grant App_User rights to John, he should not be 
able to take them away.


I currently have a situation where I would like to be able to do the 
above.  I have two separate departments who might grant privileges for 
the same application to the same user.  One department administrator 
should not be able to revoke the privileges set by the other one.


I would expect superusers to be able to revoke from anybody, or the 
"owner".  I'm not sure what the owner is when we talk about granting roles.


Regards

Russell Smith




Re: [HACKERS] [BUGS] Removing pg_auth_members.grantor (was Grantor name gets lost when grantor role dropped)

2007-05-15 Thread Russell Smith

Alvaro Herrera wrote:

Alvaro Herrera wrote:
  

Alvaro Herrera wrote:



2. decide that the standard is braindead and just omit dumping the
   grantor when it's no longer available, but don't remove
   pg_auth_members.grantor

Which do people feel should be implemented?  I can do whatever we
decide; if no one has a strong opinion on the matter, my opinion is we
do (2) which is the easiest.
  

Here is a patch implementing this idea, vaguely based on Russell's.



Applied to CVS HEAD, 8.2 and 8.1.

If we want to start tracking the grantor as a shared dependency, and
have REVOKE work per spec (i.e. only revoke the privileges actually
granted by the role executing REVOKE), those are separate patches (and
they should be applied only to HEAD).  This patch merely fixes the fact
that pg_dumpall failed to work for busted databases.

  
Should there also be a doc patch for this?  The documentation
descriptions seemed different from what is actually implemented.  I'll
check that before I make any further comments.


Russell



[HACKERS] Re: [BUGS] Removing pg_auth_members.grantor (was Grantor name gets lost when grantor role dropped)

2007-05-15 Thread Russell Smith

Alvaro Herrera wrote:

Alvaro Herrera wrote:

  

2. decide that the standard is braindead and just omit dumping the
   grantor when it's no longer available, but don't remove
   pg_auth_members.grantor

Which do people feel should be implemented?  I can do whatever we
decide; if no one has a strong opinion on the matter, my opinion is we
do (2) which is the easiest.



Here is a patch implementing this idea, vaguely based on Russell's.
  


I haven't had time to finalize my research about this, but the admin 
option with revoke doesn't appear to work as expected.


Here is my sample SQL for 8.2.4

create table test (x integer);
\z
create role test1 noinherit;
create role test2 noinherit;
grant select on test to test1 with grant option;
grant select on test to test2;
\z test
set role test1;
revoke select on test from test2;
\z test
set role test2;
select * from test;
reset role;
revoke all on test from test2;
revoke all on test from test1;
drop role test2;
drop role test1;
drop table test;
\q


The privilege doesn't appear to be revoked by test1 from test2.  I'm not 
sure if this is related, but I wanted to bring it up in light of the 
options we have for grantor.





[HACKERS] Re: [BUGS] Removing pg_auth_members.grantor (was Grantor name gets lost when grantor role dropped)

2007-05-05 Thread Russell Smith

Stephen Frost wrote:

* Tom Lane ([EMAIL PROTECTED]) wrote:
  

Stephen Frost <[EMAIL PROTECTED]> writes:


If you're saying we don't currently warn if a revoke leaves the
priviledges in-tact for the right and target, I'm not sure you can
currently get in a state where it'd be possible to run into that.
  

I'm thinking of the case that comes up periodically where newbies think
that revoking a right from a particular user overrides a grant to PUBLIC
of the same right.



Technically, the grant to public is a different target from the target
of the revoke in such a case.  Following the spec would mean that even
when the grant and the revoke target is the same (unless you're the
original grantor) the right won't be removed.  I'm not against adding a
warning in the case you describe though, but I don't see it being as
necessary for that case.  What the spec describes is, at least in my
view, much more counter-intuitive than how PG currently works.


  
If we were to follow the spec, I would expect that it would be possible 
for the object owner to revoke privileges no matter what role granted 
them.  It need not be the default, but as an object owner, I'd expect to 
be able to say that I want all privileges for a role revoked, no matter 
who granted them.


8.2 docs state this on the revoke page:
--

REVOKE can also be done by a role that is not the owner of the affected 
object, but is a member of the role that owns the object, or is a member 
of a role that holds privileges WITH GRANT OPTION on the object. In this 
case the command is performed as though it were issued by the containing 
role that actually owns the object or holds the privileges WITH GRANT 
OPTION. For example, if table t1 is owned by role g1, of which role u1 
is a member, then u1 can revoke privileges on t1 that are recorded as 
being granted by g1. This would include grants made by u1 as well as by 
other members of role g1.


If the role executing REVOKE holds privileges indirectly via more than 
one role membership path, it is unspecified which containing role will 
be used to perform the command. In such cases it is best practice to use 
SET ROLE to become the specific role you want to do the REVOKE as. 
Failure to do so may lead to revoking privileges other than the ones you 
intended, or not

revoking anything at all.

--

Paragraph 1 implies that we are meeting the standard now.  I think 
paragraph two is stating that if you are a member of multiple roles 
which could have granted privileges, then you don't know which one you 
are revoking.  Makes sense if we are implementing the SQL standard.  
Does this mean we were intending to be SQL compliant when we wrote the 
documentation?

I also note that 8.1 says the same thing in its documentation.
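
For what it's worth, a minimal sketch of the practice the quoted docs
describe (the role, table and user names are made up): become the
specific role whose grant you want to undo before issuing the REVOKE.

SET ROLE g1;
REVOKE SELECT ON t1 FROM end_user;  -- only touches the privilege recorded as granted by g1
RESET ROLE;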

My possible suggestions are:
1. Implement the standard for revoking only your privileges by default.
2. Allow the object owner to revoke privileges assigned by any role, as,
if you drop and recreate the object, you can achieve this anyway.


Regards

Russell Smith








Re: [HACKERS] Feature thought: idle in transaction timeout

2007-03-31 Thread Russell Smith

Joshua D. Drake wrote:

Hello,

I ran into an interesting problem with a customer today. They are 
running Jabber XCP (not the one we use). Unfortunately, the product 
has a bug that causes it to leave connections persistent in a 
transaction state. This is what it does:


BEGIN; SELECT 1;

Basically it is verifying that the connection is live. However, it 
never calls commit. So what happens? We can't vacuum ;).


Anyway, my thought is, we know when a transaction is idle, why not 
have an idle timeout where we will explicitly close the connection or 
rollback or something? User configurable of course.
I agree with this; it reduces the long running transaction problem a
little where the user forgot to commit/rollback their session.  It may be
worth having a transaction_timeout as well, and setting it to something
like a few hours by default.  That way you can't have really long running
transactions unless you specifically set that.
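
Until such a timeout exists, a sketch of how the stuck sessions can at
least be spotted (assuming an 8.2-era pg_stat_activity; the column names
differ in later releases):

SELECT procpid, usename, query_start
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY query_start;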


Sincerely,

Joshua D. Drake








Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-22 Thread Russell Smith

Simon Riggs wrote:

On Thu, 2007-03-22 at 16:16 -0400, Tom Lane wrote:
  

"Simon Riggs" <[EMAIL PROTECTED]> writes:


There is a slight hole in that SERIALIZABLE transactions won't be able
to use any indexes they build during their transaction, since they may
need to be able to see prior data, but I don't think anybody is going to
complain about that restriction. Anyone?
  

Practically every statement I've seen in this thread that used the
phrase "SERIALIZABLE transaction" was wrong to some extent, and this
one is no different.

The issue is not whether the whole transaction is serializable or not,
it's how old is the oldest still-live snapshot, a thing that CREATE
INDEX can't tell with any certainty in READ COMMITTED mode.  So if your
solution involves any explicit dependence on the transaction
serializability mode, it's probably wrong.  I'm not totally sure if you
are expecting to be able to tell that, but I do know that the planner
has no idea what snapshots a plan it makes will be used with.



Thanks for correcting me.

Reworded: There is a slight hole in that snapshots older than the CREATE
INDEX must never be allowed to use the index. That means that
SERIALIZABLE transactions and some other situations will need to be
restricted. Personally, I would argue that such a restriction was an
acceptable loss of functionality, since I can't think of a situation
where such a thing would need to occur, though one may turn up.

Currently, I don't know how to prevent this from happening. We'll need
to examine this in more detail to see if there is a way.
  
I have seen and used transactions that create indexes inside a 
transaction, use them for the life of the transaction, and then drop 
them at the end.
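
For example, a sketch of the kind of pattern being referred to (the
table, column and index names are made up), which the proposed
restriction would affect on HOT-enabled tables:

BEGIN;
CREATE INDEX report_tmp_idx ON big_table (report_key);
-- run the reporting queries that benefit from the index
DROP INDEX report_tmp_idx;
COMMIT;
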
I think not being able to use the index in the current transaction is an
acceptable restriction if the table you are building the index on is
HOT enabled.  That way it is not really a functionality loss; it's just
a restriction put in place if you are using a certain feature.  We do
not want to go breaking existing code.


However, if HOT is enabled by default on tables, then we have a different
situation.  And if the expectation is that HOT will be enabled by
default in future releases, then this needs to be considered now.


Regards

Russell Smith




Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)

2007-03-22 Thread Russell Smith

Jim C. Nasby wrote:

On Mon, Mar 19, 2007 at 12:05:19PM +, Simon Riggs wrote:
  

I was unwilling to compromise to have HOT if only one index existed, but
IMHO allowing HOT with <= 3 indexes is an acceptable compromise for this
release. (We can always use vertical partitioning techniques to allow
additional access paths to be added to the same table - I'd be very
happy to document that with worked examples, if requried).



I'm not sure where we're sitting with this, but I've got another idea I
haven't seen (one that I think is better than an arbitrary limit on the
number of indexes)... what if we just disallow non-concurrent index
builds on hot tables? It sounds like the additional pain involved in
chilling an entire table and keeping it chilled for the index build is
even more overhead than just doing a concurrent index build.
  
I thought about making it even simpler: only allow CREATE INDEX builds
on non-HOT tables.  However, as I mentioned in another thread, this idea
dies if you expect to have HOT enabled by default in any future release.
Chilling needs to be doable with a regular vacuum-style lock for it to
be a usable reality.


I'm sure there are use cases for this, but it seems unlikely that a high
update table is going to have an index added to it.  Am I a long way
from reality in saying that?



Regards

Russell Smith




Re: [HACKERS] Plan invalidation design

2007-02-18 Thread Russell Smith

Gregory Stark wrote:


[snip]



Hm. The set of output columns could change? How?

If you prepare "select *" and add a column, you're saying the query should
start failing? That seems strange given the behaviour of views, which is that
once parsed the list of columns is written in stone. It seems prepared queries
should work the same way that views work and remember which physical column
they were referring to previously. (Personally I don't like that behaviour but
it feels like this should be consistent with it.)

I guess you do have a serious problem if you redefine the type of a column or
redefine a view (though I think you would have to drop and recreate it, CREATE
OR REPLACE wouldn't let you change the output columns).
  


I would think it best to move towards changing views to not have output
columns set in stone.  It seems unreasonable that you can add/drop/alter
columns in a table as much as you like, but you can't touch a view.  I'm
also not excited about the current view restrictions, which means we
don't want to start entrenching that behaviour by putting in more code
that acts in the same way.


I'm guessing from what Tom is saying that the reason we have views set
in stone is because they are, or can be, an example of inlined SQL,
particularly when views are built on views.  Any further enlightenment
is welcome, but it's probably off topic for this thread.


Russell Smith



Re: [HACKERS] Plan invalidation design

2007-02-18 Thread Russell Smith
een pre-computed by constant-folding wouldn't get updated without inval.
  
If you replan an immutable function, aren't you possibly messing up a
functional index that is using the old function?  Then again, if you
change an immutable function that has an index on it, you are in trouble
already.  So the implications of an immutable function change are much
more wide ranging than plan invalidation problems.
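
A minimal sketch of that existing trouble (the function, table and index
names are made up): the index stores values computed with the old
definition, so redefining the supposedly immutable function leaves the
index inconsistent until it is rebuilt.

CREATE FUNCTION norm_key(text) RETURNS text
    AS 'SELECT lower($1)' LANGUAGE sql IMMUTABLE;
CREATE INDEX items_norm_key_idx ON items (norm_key(name));
-- Redefining norm_key() with different behaviour now silently invalidates
-- the index contents; a REINDEX is required to make them match again.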



* if you have a SQL function that's been inlined into a plan, a change
  in the function definition wouldn't get reflected into the plan without
  inval.
* if you alter a function and change its volatility property, that might
  possibly affect the shape of plans that use the function (for instance
  some optimization transformation might now be allowed or not).
  
Replanning pl/pgsql functions that use CREATE TEMP TABLE would be a good
use here.  You lose the preplanning benefits, but we remove the ongoing
problem where people report that their temp table isn't working.
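
The classic shape of that problem, as a sketch (the function and table
names are made up, and plpgsql is assumed to be installed): without plan
invalidation, a second call in the same session still uses plans that
reference the first, now-dropped temp table, and the usual workaround was
to rewrite the statements with EXECUTE.

CREATE FUNCTION build_report() RETURNS integer AS $$
DECLARE
    n integer;
BEGIN
    CREATE TEMP TABLE report_tmp (id integer);
    INSERT INTO report_tmp VALUES (1);
    SELECT count(*) INTO n FROM report_tmp;  -- plan cached on first call
    DROP TABLE report_tmp;
    RETURN n;
END;
$$ LANGUAGE plpgsql;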


Even function alterations to pl/pgsql should cause a replan.  But of more
interest is being able to use the old function for currently running
transactions when the function is changed.  Last time I tried to edit a
pl/pgsql function while it was being used by a transaction, the
transaction failed because the function definition changed.  I'm not
100% sure of the use case here, as I'm writing this email at too late an
hour.

To my memory none of these problems have been complained of from the
field.  Making the cache module able to detect function-related
invalidations would be a bit of work --- for example, if a function has
been inlined, there is no recognizable reference to it at all in the plan
tree, so we'd have to modify the planner to track such things and report
them somehow.  (The corresponding problem for views doesn't exist, because
there is still a rangetable entry for a view after it's been expanded.)
So I think this is a "maybe do someday" part, not something to do in the
first release.

One interesting point is that to avoid race conditions, the function that
checks for is-plan-update-required will have to acquire locks on the
tables mentioned in the plan before it can be sure there's not a pending
invalidation event on them.  This doesn't seem like a problem, though it
might mean we want to refactor the executor API a bit to avoid duplicate
effort.
  
Is the race condition here any more likely to happen than the failure of
a replan when something has changed from underneath the original query?
My very brief thought gives me the impression that they are the same
thing; however, they may not be.

Comments?
  
Again, as a person who has only a limited understanding of the code, I'm
happy to be wrong about anything I have written.


Regards

Russell Smith



Re: [HACKERS] STOP all user access except for admin for a few minutes?

2007-01-22 Thread Russell Smith

[EMAIL PROTECTED] wrote:

Hi there,

Is is possible to stop all user access to postgres, but still give 
access to admin?

Just temporarily, not a security setup.

Something like, stop all users but allow user x and y.
You could restart in single user mode, or alter pg_hba.conf to allow the
users you want and disallow all other users.

Single user mode will require that you have direct access to the machine
to do the alterations.

Using pg_hba.conf will not disconnect existing users, as far as I'm aware,
and the server configuration has to be reloaded for the change to take
effect.
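
A rough sketch of what such a pg_hba.conf might look like (the role name
and addresses are made up, and entries are matched top to bottom, so the
admin lines must come first):

# admin keeps access
local   all   admin_role                    md5
host    all   admin_role    127.0.0.1/32    md5
# everyone else is refused new connections (existing sessions stay connected)
local   all   all                           reject
host    all   all           0.0.0.0/0       reject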

That's the best advice I can offer, maybe somebody else will be able to 
give you more


thx








Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-19 Thread Russell Smith

Darcy Buskermolen wrote:

[snip]

Another thought, is it at all possible to do a partial vacuum?  ie spend the 
next 30 minutes vacuuming foo table, and update the fsm with what we have
learned over the 30 mins, even if we have not done a full table scan ?
  

There was a proposal for this, but it was dropped on 2 grounds.
1. partial vacuum would mean that parts of the table are missed; the
user could never vacuum certain parts, and transaction wraparound would
get you.  You may also have other performance issues as you have
forgotten certain parts of the table.


2. Index cleanup is the most expensive part of vacuum.  So doing a 
partial vacuum actually means more I/O as you have to do index cleanup 
more often.


If we are talking about autovacuum, 1 doesn't become so much of an issue 
as you just make the autovacuum remember what parts of the table it's 
vacuumed.  This really has great power when you have a dead space map.


Item 2 will still be an issue.  But if you define "partial" as either 
filling maintenance_work_mem or finishing the table, you are not increasing 
I/O at all, as when maintenance_work_mem is full you have to clean up 
all the indexes anyway.  This is really more like VACUUM SINGLE, but the 
same principle applies.


I believe all planning really needs to consider how a dead space map 
will affect what vacuum is going to be doing in the future.



Strange idea that I haven't researched: given VACUUM can't be run inside a 
transaction block, it is possible at a certain point to quit the current 
transaction and start another one.  There has been much chat, and now a 
TODO item, about allowing multiple vacuums so they do not starve small 
tables.  But if a big table has a long running vacuum, the vacuum of the 
small table won't be effective anyway, will it?  If the vacuum of a big 
table were done in multiple transactions you could reduce the effect of the 
long running vacuum.  I'm not sure how this affects the rest of the system 
though.


Russell Smith


  




Re: [HACKERS] Dead Space Map for vacuum

2006-12-29 Thread Russell Smith

Simon Riggs wrote:

On Fri, 2006-12-29 at 10:49 -0500, Tom Lane wrote:
  

"Simon Riggs" <[EMAIL PROTECTED]> writes:


I would suggest that we tracked whether a block has had 0, 1 or 1+
updates/deletes against it. When a block has 1+ it can then be
worthwhile to VACUUM it and to place it onto the FSM. Two dead tuples is
really the minimum space worth reclaiming on any block.
  

How do you arrive at that conclusion?



FSM code ignores any block with less space than 1 average tuple, which
is a pretty reasonable rule.
  
FSM serves a different purpose than DSM and therefore has an entirely 
different set of rules governing what it should and shouldn't be doing.  
This is a reasonable rule for FSM, but not for DSM.

If you only track whether a block has been updated, not whether it has
been updated twice, then you will be VACUUMing lots of blocks that have
only a 50% chance of being usefully stored by the FSM. As I explained,
the extra bit per block is easily regained from storing less FSM data.
  
Well, it seems that implementing the DSM would be a great time to move the 
FSM from its current location in shared memory to somewhere else, possibly 
the same place as the DSM.  A couple of special blocks per file segment 
would be a good place.  Also, I'm not sure that the point of VACUUMing is 
always to be able to immediately reuse the space.  There are cases where 
large DELETEs are done and you just want to decrease the index size.  In 
Tom's counterexample of large tuples, you certainly want to vacuum the 
index when only a single update/delete occurs.

My understanding was that DSM was meant to increase VACUUM efficiency,
so having a way to focus in on blocks most worth vacuuming makes sense
using the 80/20 rule.
  
Possibly true.  I don't have anything to indicate what usage patterns 
produce what requirements in vacuum patterns.  If there are significant 
numbers of blocks with one update, is it a loss to actually vacuum 
those?  I know it could be faster if we didn't, but would it still be 
faster than what we do now?
  

Counterexample: table in which all tuples exceed half a page.



Current FSM code will ignore those too, if they are less than the
average size of the tuple so far requested. That's a pretty weird
counterexample, even if it is a case that needs handling.
  
Again I'd be careful saying that FSM = DSM for handling of what should 
be done for a particular block.


Russell Smith.



Re: [HACKERS] Interface for pg_autovacuum

2006-12-20 Thread Russell Smith

Jim Nasby wrote:

On Dec 20, 2006, at 7:56 AM, Florian G. Pflug wrote:

Jim Nasby wrote:
I'm teaching a class this week and a student asked me about OIDs. He 
related the story of how in Sybase, if you moved a database from one 
server from another, permissions got all screwed up because user IDs 
no longer matched. I explained that exposing something like an 
integer ID in a user interface or an API is just a bad idea and 
PostgreSQL doesn't do that.

Then I got to pg_autovacuum
So... is there any reason there isn't a prescribed interface to 
pg_autovacuum that doesn't expose vacrelid? Can we get that added to 
TODO?


Wouldn't it be sufficient to change the type of vacrelid from oid
to regclass? Then just dumping and restoring pg_autovacuum like any
other table should Just Work.


I think that would work, though as I mentioned we'd also want to set 
reasonable defaults on the table if we decide to keep that as our 
interface.


On the other hand, this would be the only part of the system where the 
official interface/API is a system catalog table. Do we really want to 
expose the internal representation of something as our API? That 
doesn't seem wise to me...


Additionally, AFAIK it is not safe to go poking data into catalogs 
willy-nilly. Having one table where this is the interface to the 
system seems like it could lead to some dangerous confusion.
I thought the plan was to change the ALTER TABLE command to allow vacuum 
settings to be set.  I may be totally off the mark.  But if that were the 
case, it would mean that dumps would just need an ALTER TABLE statement to 
maintain autovacuum information.  There is an advantage that if you only 
dump some tables, their autovac settings would go with them.  But is that a 
good thing?
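
Purely as an illustration of what such an interface might look like (no such 
syntax exists at the time of writing, and the parameter names below are 
invented for the example), a dump could then carry per-table settings as 
something like:

ALTER TABLE orders SET (autovacuum_enabled = true,
                        autovacuum_vacuum_scale_factor = 0.05,
                        autovacuum_vacuum_threshold = 500);

which would avoid exposing the vacrelid oid anywhere.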


Regards

Russell Smith

--
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Russell Smith

Andrew Dunstan wrote:

Russell Smith wrote:

Tom Lane wrote:

Stephen Frost <[EMAIL PROTECTED]> writes:
 
Force references to go through macros which implement the lookup 
for the

appropriate type?  ie: LOGICAL_COL(table_oid,2) vs.
PHYSICAL_COL(table_oid,1)  Perhaps that's too simplistic.



It doesn't really address the question of how you know which one to
use at any particular line of code; or even more to the point, what
mechanism will warn you if you use the wrong one.

My gut feeling about this is that we could probably enforce such a
distinction if we were using C++, but while coding in C I have no
confidence in it.  (And no, that's not a vote to move to C++ ...)
  

What about a compromise...

The 8.1 documentation for ALTER TABLE states the following.

Adding a column with a non-null default or changing the type of an 
existing column will require the entire table to be rewritten. This 
may take a significant amount of time for a large table; and it will 
temporarily require double the disk space.



Now, we are rewriting the table from scratch anyway; the on disk 
format is changing.  What is stopping us from switching the column 
order at the same time?  The only thing I can think of is that the 
catalogs will need more work to update them.  It's a moderate price 
to pay for being able to reorder the columns in the table.  One of 
the problems I have is wanting to add a column in the middle of the 
table, but FK constraints stop me dropping the table to do the 
reorder.  If ALTER TABLE would let me stick it in the middle and 
rewrite the table on disk, I wouldn't care.  It's likely that I would 
be rewriting the table anyway.  And by specifying AT POSITION, or 
BEFORE/AFTER, you know for big tables it's going to take a while.




This isn't really a compromise. Remember that this discussion started 
with consideration of optimal record layout (minimising space use by 
reducing or eliminating alignment padding). The above proposal really 
does nothing for that.


cheers

andrew


This is partly true.  If you have the ability to rewrite the table and 
put columns in a specific order you can "manually" minimize the 
alignment padding.  However, that will probably produce a table that is 
not in the logical order you would like.  I still see plenty of use cases 
for both my initial case and the alignment-padding case, even without 
logical layout being different to disk layout.


Also, there has been a large amount of discussion on performance, relating 
to having firm numbers for proposals such as different compiler options.  
Does anybody have tested numbers, or known information about where/how you 
can eliminate padding by column ordering?  Tom suggests in this thread 
that lots of types have padding issues, so how much is it really going 
to buy us space-wise if we re-order the table into an optimal format?  What 
is the optimal ordering to reduce disk usage?
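
As a rough illustration of the padding question (the byte counts are a 
back-of-the-envelope guess for a platform that aligns int8 on 8-byte 
boundaries, not measured figures):

-- interleaved: pad bytes are inserted after the boolean and after the int2
-- so that each int8 starts on an aligned boundary
CREATE TABLE padded  (flag boolean, big1 int8, small int2, big2 int8);

-- widest columns first: most of that padding disappears
CREATE TABLE compact (big1 int8, big2 int8, small int2, flag boolean);

On such a platform the first layout burns somewhere around 8 bytes per row 
on alignment padding that the second avoids, but real numbers would need to 
be measured before drawing any conclusions.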


Russell.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Russell Smith

Tom Lane wrote:

Stephen Frost <[EMAIL PROTECTED]> writes:
  

Force references to go through macros which implement the lookup for the
appropriate type?  ie: LOGICAL_COL(table_oid,2) vs.
PHYSICAL_COL(table_oid,1)  Perhaps that's too simplistic.



It doesn't really address the question of how you know which one to
use at any particular line of code; or even more to the point, what
mechanism will warn you if you use the wrong one.

My gut feeling about this is that we could probably enforce such a
distinction if we were using C++, but while coding in C I have no
confidence in it.  (And no, that's not a vote to move to C++ ...)
  

What about a compromise...

The 8.1 documentation for ALTER TABLE states the following.

Adding a column with a non-null default or changing the type of an 
existing column will require the entire table to be rewritten. This may 
take a significant amount of time for a large table; and it will 
temporarily require double the disk space.



Now, we are rewriting the table from scratch anyway; the on disk format 
is changing.  What is stopping us from switching the column order at the 
same time?  The only thing I can think of is that the catalogs will need 
more work to update them.  It's a moderate price to pay for being able to 
reorder the columns in the table.  One of the problems I have is wanting 
to add a column in the middle of the table, but FK constraints stop me 
dropping the table to do the reorder.  If ALTER TABLE would let me stick 
it in the middle and rewrite the table on disk, I wouldn't care.  It's 
likely that I would be rewriting the table anyway.  And by specifying AT 
POSITION, or BEFORE/AFTER, you know for big tables it's going to take a 
while.


Not that I'm able to code this at all, but I'm interested in feedback on 
this option.


Regards

Russell Smith

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


  



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Nasty VACUUM/bgwriter/segmentation bug

2006-11-18 Thread Russell Smith

Tom Lane wrote:

While working on fixing the recently reported hash-index problem,
I was using a test build with a very small RELSEG_SIZE (128K),
so that I could trigger the reported bug with a reasonably small
amount of data.  And I started finding some unexpected data corruption.
I eventually reduced it to this test case:

checkpoint;
create table foo (f1 int);
insert into foo select x from generate_series(1,10) x;
-- wait 30 seconds
delete from foo;
vacuum verbose foo;
insert into foo select x from generate_series(1,10) x;
\q
stop and restart postmaster, then
vacuum verbose foo;

This vacuum will generate a whole lot of 
WARNING:  relation "foo" page 16 is uninitialized --- fixing

WARNING:  relation "foo" page 17 is uninitialized --- fixing
...
and when the dust settles, most of the second batch of 10 rows
is gone.

What is happening is that during that 30-second wait, the bgwriter is
dumping out all the dirty pages, and acquiring open file references
to each segment of table "foo" as it does so.  The VACUUM then truncates
"foo" back to zero size, since it contains no data after the DELETE,
and then the second INSERT bulks it up again.  The problem is that the
bgwriter still has open file references to the deleted segments after
the first one, and that's where it'll write the data if given a chance.
So the updates disappear into the ether as far as any other process is
concerned, for each segment except the first.
  

Does TRUNCATE suffer from the same issue?

There's a rather indirect mechanism that's supposed to prevent similar
problems between two backends: a file truncation is supposed to be
associated with a forced relcache flush, and that causes an smgrclose(),
so other backends will be forced to reopen the file(s) before they can
do any more work with the truncated relation.  On reflection I think
I don't trust this though, because in the case where a backend writes a
dirty buffer because it needs to reclaim a buffer, it doesn't try to
open the relation at the relcache level (it cannot, because the rel
might be in a different database).  So it could use a stale smgr
relation, same as the bgwriter.  The bgwriter does not participate
in shared cache inval, not having a relcache in the first place, and
so this mechanism doesn't help it anyway.

This is a fairly low-probability bug in real-world cases, because it
could only happen when a relation is truncated and then re-expanded
across a 1GB segment boundary.  Moreover, because the bgwriter flushes
all its open files after each checkpoint, the window for trouble only
extends to the next checkpoint.  But it definitely could happen, and
it might explain some irreproducible corruption reports.
  
Regular imports that delete data or truncate relations would increase 
this probability, wouldn't they?
Autovac is also likely to run on such a relation in the "wait" phase, 
while other relations are being truncated by an import process.



I think that the easiest fix might be to not remove no-longer-used
segment files during a truncate, but simply reduce them to zero size
rather than delete them.  Then any open file pointers aren't
invalidated.  The only alternative I can see is to invent some new
signaling mechanism to force closure of open files, but that seems
ugly, complex, and perhaps subject to race conditions.

Thoughts?
  
Seems reasonable from my lowly user point of view.  Would there be a 
requirement to remove the extra segments at any point in the future or 
would they hang around on the disk forever?


Russell Smith

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] new feature: LDAP database name resolution

2006-02-21 Thread Russell Smith

Albe Laurenz wrote:

Thanks to everybody who answered.

Maybe it is really the best thing to use a tool like postgresql-relay or
pgpool - I will investigate these.
I'm not eager to reinvent the wheel.

We have considered relocating DNS entries, but the problem is that a
changed
DNS entry takes long to propagate; in particular Windows has a caching
problem there.


So even if you specify the TTL of the DNS records to be 60 seconds for 
the front end labels you put on your servers, Windows will not refresh 
after that period of time, even though it should?


[snip]


Yours,
Laurenz Albe

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq





---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Pgfoundry and gborg: shut one down

2006-02-18 Thread Russell Smith

Bruce Momjian wrote:

Tom Lane wrote:


Thomas Hallgren <[EMAIL PROTECTED]> writes:


Bruce Momjian wrote:


Having run had both pgfoundary and gborg for several years, I think we
have to conclude that any clean migration is never going to happen, so
let's just pick a server and announce date, and shut one of them off.


I've repeatedly asked for help moving my PL/Java stuff over to pgfoundry and offered my help 
in the process,


Indeed, we haven't made any particular effort to encourage gborg
projects to move.  I think it's a bit premature to hold a gun to
their heads.



If we don't push folks, nothing will happen, which is what has happened
for years now.  Let's set a date and tell people to move, or else.
Keeping our stuff split like this is not helping us.



Slowly disabling things is also an option to encourage people to move, 
while not ending up with a huge number of projects trying to move in the 
same week.


Disabling the ability to create new accounts and projects will tell both 
existing and new people that this is not the place to be going forward. 
 If you need a new developer or project, you need to put in the effort 
to move your project.


Disabling the ability to upload files will make people create a project 
on PgFoundry when they make a new release, putting more pressure on them to 
move across.


Even with the above two items changed, it would soon encourage people to 
move, or at least create a project on PgFoundry and move their file 
releases there.  CVS and mailing lists will need to be moved by admins, 
but that process doesn't need to be done in a single day.  It creates 
more operational overhead for each project in the short term, but that 
will continue to push them to migrate.


Who are the people who can help move projects across and how can they be 
contacted?  Maybe posting some news items on gborg about it would 
encourage people.  Having the people who can help available to assist 
people to move will mean that more projects are likely to move.


I agree dates need to be set, not necessarily for the total shutdown, 
but feature removal dates will mean people are much more likely to 
"want" to move.


Regards

Russell Smith



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-03 Thread Russell Smith
On Sun, 3 Jul 2005 04:47 pm, Greg Stark wrote:
> 
> Bruce Momjian  writes:
> 
> > I have an idea!  Currently we write the backup pages (copies of pages
> > modified since the last checkpoint) when we write the WAL changes as
> > part of the commit.  See the XLogCheckBuffer() call in XLogInsert().
> 
> Can someone explain exactly what problem is being defeated by writing whole
> pages to the WAL log? Like, if a page is half flushed to disk and contains half
> the old data and half of the new data, what exactly would go wrong with
> recovery? When postgres sees the write why would it have trouble recovering
> the correct contents for the page based on the half-written page and the
> regular WAL entry?

The problem we are solving:
A checkpoint happens (ensuring that ALL data makes it to disk).
  - This means we don't need the information in WAL from before the checkpoint.
We write some updates to the db, changing some pages.
The power is pulled and one page is half written.  There is no way to know if 
the page is in a valid state.  There is no way to recover the page from WAL, 
because unless we wrote the full page into WAL, we don't have a "starting 
point" to apply the WAL modifications to.

That's probably very unclear, but that is the idea.

Regards

Russell Smith

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] contrib/pgcrypto functions not IMMUTABLE?

2005-07-02 Thread Russell Smith
On Sun, 3 Jul 2005 03:32 pm, Michael Fuhr wrote:
> I've noticed that contrib/pgcrypto/pgcrypto.sql.in doesn't include
> a volatility category in its CREATE FUNCTION statements, so the
> functions are all created VOLATILE.  Shouldn't most of them be
> IMMUTABLE?  Or do the algorithms have side effects?  So far I've
> found no discussion about this except for one person asking about
> it last year:
> 
> http://archives.postgresql.org/pgsql-admin/2004-12/msg00065.php
> 
I know the salt functions MUST stay volatile, as they produce different output
every time you call them.  I've not looked at the pgcrypto code, so I can't
comment further than that.
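
Just to illustrate the distinction (assuming pgcrypto is installed; this says 
nothing about the rest of the functions, which would each need checking):

-- gen_salt() produces a different salt on every call, so it must stay VOLATILE
SELECT gen_salt('bf'), gen_salt('bf');    -- two different values

-- a plain digest is deterministic for a given input, so IMMUTABLE looks plausible
SELECT digest('message', 'sha1') = digest('message', 'sha1');    -- true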

Regards

Russell Smith.

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Autovacuum in the backend

2005-06-17 Thread Russell Smith
On Fri, 17 Jun 2005 06:26 pm, Andreas Pflug wrote:
> Qingqing Zhou wrote:
> > "Tom Lane" <[EMAIL PROTECTED]> writes:
> > 
> >>Yeah --- a libpq-based solution is not what I think of as integrated at
> >>all, because it cannot do anything that couldn't be done by the existing
> >>external autovacuum process.  About all you can buy there is having the
> >>postmaster spawn the autovacuum process, which is slightly more
> >>convenient to use but doesn't buy any real new functionality.
> >>
> > 
> > 
> > One reason of not using lib-pq is that this one has to wait for the
> > completion of each vacuum (we don't has async execution in libpq right?),
> 
> There *is* async execution in libpq, and it works.

I would have thought the main reason for not using libpq is that it locks you
into only using commands that are available to all users via SQL.  If you don't 
use libpq, you open up the ability to use functions that can make use of 
information available to the backend, and also to run functions in a way that 
is not possible via SQL.

Regards

Russell Smith.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Autovacuum in the backend

2005-06-17 Thread Russell Smith
> >4) Related to this, I guess, is that a user's FSM settings might be
> >completely inappropriate. The 'Just read the manual' or 'Just read the
> >logs' argument doesn't cut it, because the main argument for autovacuum in
> >the backend is that people do not and will not.
> >  
> >
> 
> Agreed, it doesn't solve all problems, and I'm not arguing that the 
> integration of AV makes PostgreSQL newbie safe it just helps reduce the 
> newbie problem.   Again if the default FSM settings are inappropriate 
> for a database then the user is probably doing something more 
> complicated that a "my cat minka" database and will need to learn some 
> tuning skills anyway.
> 
> >5) It doesn't actually shrink tables -- ie, there's no VACUUM FULL. If
> >we're telling users about VACUUM less often than we are now, there's bound
> >to be bloating issues (see 4).
> >  
> >
> 
But what's stopping the implementation of a partial VACUUM FULL, where we lock 
the table and move enough blocks to shorten the relation so that there is, say, 
< 10% bloat, or whatever is appropriate for that table?  Or even just shorten 
the table by a few blocks, and repeat the process when you next have some time.

> Not totally true, regular VACUUM can shrink tables a little (I think 
> only if there is free space at the end of the table it can cutoff 
> without moving data around).  But if AV is on and the settings are 
> reasonable, then a table shouldn't bloat much or at all.  Also, I don't 
> think we are telling people to VACUUM less, in fact tables that need it 
> will usually get VACUUM'd more, we are just telling the users that if 
> they turn AV on, they don't have to manage all the VACUUMing.

Regards

Russell Smith

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Autovacuum in the backend

2005-06-17 Thread Russell Smith
> Added to TODO:
> 
>  * Create a bitmap of pages that need vacuuming
>  
>Instead of sequentially scanning the entire table, have the background
>writer or some other process record pages that have expired rows, then
>VACUUM can look at just those pages rather than the entire table.  In
>the event of a system crash, the bitmap would probably be invalidated.
> 
Further to this, is there any use case for allowing the FSM, or this DSM, to 
spill to disk if the space fills up?  It would allow unusual changes to the db 
to not lose space.  You could just load part of the overflow from disk back 
into the in-memory FSM and continue using free space.

Regards

Russell Smith

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Autovacuum in the backend

2005-06-17 Thread Russell Smith
> > The major reasons for autovacuum as I see it are as follows:
> > 
> > * Reduces administrative overhead having to keep track of what tables 
> > need to be vacuumed how often.
> 
> Creates more overhead and thus reduces performance.
Or reduces vacuum overhead because the vacuum strategy is much better than
it was when you used cron.  Especially as people get a chance to improve 
autovac.

> > * Reduces the total amount of time the system spends vacuuming since it 
> > only vacuums when needed.
> 
> Can be easily done with cron.
Can you do partial table vacuums with cron?
Can you work out the smartest time to vacuum with cron?  I thought it just 
scheduled tasks at certain times.

> 
> > * Keeps stats up-to-date automatically
> 
> Which can be done with cron
And what is the management strategy for adjusting analyze when things change 
that you weren't aware of? (eg, big table changes that were unexpected)


> 
> > * Eliminates newbie confusion
> 
> RTFM

RTFM = MySQL in a lot of cases to be honest.

> 
> > * Eliminates one of the criticisms that the public has against 
> > PostgreSQL (justifed or not)
> 
> Agreed.
This is really the same as the previous RTFM question/response.  People 
criticise because vacuum is foreign to them, and for newbies that equals "too 
hard, next db please".  As much as it is a technical issue, it's an advocacy 
issue too.

Plus we finally get XID wraparound protection.  We finally decided that for 8.1 
we needed some protection, which I think Tom committed.  This again may be a 
newbie thing, but there are a lot of newbies out there.  We've seen this 
problem pop up a number of times on the lists and on IRC.  And people say "Why 
didn't it tell me?"; RTFM is not exactly what they want to hear, especially 
when they thought they had read the manual and just missed understanding that 
bit.

> 
> 
> Just so everyone knows from the get go here. I am purposely playing a 
> little devils advocate. Autovacuum has some drawbacks. I think we should
> be **publicly** aware of them before we pursue integration.

It does have a number of issues.  But I feel the integration issue is being 
addressed with a very short term view.  There are a lot of patches, tweaks and 
changes that just can't be made until it is integrated.  The usefulness of 
some of the vacuum ideas that have been presented in the past will then be 
able to become a reality.  The dead space map is a perfect example: people 
have talked about it for most of the time I've been around, but until we have 
an integrated vacuum none of that can really happen.
> 
> Heaven knows it would make my life easier if it was integrated but anyway...
> 
I understand these are not necessarily Josh's views, but I thought I would 
offer comments on them.

> Sincerely,
> 
> Joshua D. Drake
> 
Regards

Russell Smith
> 
> 
> 
> > 
> > Also, as VACUUM improves, autovacuum will improve with it.
> > 
Or because of autovacuum, vacuum and autovacuum will improve.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Autovacuum in the backend

2005-06-15 Thread Russell Smith
On Thu, 16 Jun 2005 12:54 pm, Alvaro Herrera wrote:
> On Thu, Jun 16, 2005 at 11:07:31AM +1000, Gavin Sherry wrote:
> > On Wed, 15 Jun 2005, Bruce Momjian wrote:
> > 
> > > I am going to start working on it.  I am concerned it is a big job.
> > >
> > > I will post questions as I find them, and the one below is a good one.
> > 
> > I'm wondering if effort is being misdirected here. I remember when Mark
> > Wong at OSDL was running pg_autovacuum during a dbt run, he was seeing
> > significant performance loss -- I think on the order of 30% to 40% (I will
> > try and dig up a link to the results).
> 
> I think those are orthogonal issues.  One is fixing whatever performance
> issues there are because of VACUUM.  Note that the fact that Mark was
> having such a drop in performance with autovacuum does only mean that
> at the enormous load under which the OSDL tests are run, autovacuum is
> not the best solution.  Not everybody runs with that sort of load
> anyway.  (In fact lots of people don't.)

I agree here.  There have been a couple of patches for improvements to VACUUM 
rejected in the past, e.g. partial vacuum.  If we have autovacuum in the 
backend, it doesn't matter about people vacuuming the wrong part of a file; 
the system will manage it.

I'd also say there are a much greater number of people who will be able to 
fiddle with an implemented autovac to improve its performance and load.  
However, I think there are fewer people who can complete what Alvaro is doing.

> So, one issue is that at high loads, there are improvements to be made
> to VACUUM.  The other issue is to get VACUUM to run in the first place,
> which is what autovacuum addresses.

There are plenty of ideas to shoot around here.  Like:
- Only run one iteration of a vacuum so you only clean indexes once, then stop 
the vacuum till the next cycle.
- Create the dead space map stuff with the bgwriter.
- Make sure you have individual table analyze and vacuum stats so vacuum can be 
flexible for different tables.

Some of the autovac issues we have seen recently like O(n^2) with tables will 
go away by being in the backend.
So not everything will perform the same after the integration.

> 
> I can easily predict that we will make adjustments and improvements to
> VACUUM in the future, but I'm not so sure if it will happen before 8.1
> feature-freezes.  I have more confidence that we can integrate
> autovacuum for 8.1, which will be a leap forward.
> 
The big leap will be to get it in the backend, which will, as Chris KL 
suggested, stop people shooting themselves in the foot.
I'm not sure what strict rules are imposed by feature freeze, but there may be 
time for others to make some improvements before 8.1.
We have also looked at this for at least 2 releases now.  If it doesn't get in 
now, it will just get in for 8.2, and no improvements until 8.2.

Regards

Russell Smith

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] INSTEAD OF trigger on VIEWs

2005-05-24 Thread Russell Smith
On Tue, 24 May 2005 01:26 am, --= Tono =-- wrote:
> I have tried using INSTEAD rules but there are some
> conditional logic that needs to happen inside the rule
> (performing counts, getting and storing the primary
> key of the master record etc.).  AFAIK, rules only
> allows conditional logic to be check prior to
> execution of the rule and not inside the rule itself. 
> One way to get around this is to allow calling a
> stored procedure inside the rule. This stored
> procedure should have full access of NEW.* (and OLD.*
> in the case of UPDATE and DELETE).  This way the
> manual INSERT, UPDATE or DELETE on the actual tables
> can be performed from inside the stored procedure.
> 
Would it be possible to add an INSTEAD rule that calls
a function?  You could then use that function as the trigger
you wanted.  I'm not even sure if this is possible.

DO INSTEAD SELECT * FROM function(rowtype);
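
A minimal, untested sketch of that idea (the table, view, function and column 
names are all invented for the example):

CREATE FUNCTION handle_view_insert(p_name text, p_payload text) RETURNS void AS $$
BEGIN
    -- counts, lookups of the master primary key, other conditional logic...
    INSERT INTO detail (master_id, payload)
    VALUES ((SELECT id FROM master WHERE name = p_name), p_payload);
END;
$$ LANGUAGE plpgsql;

CREATE RULE master_view_ins AS ON INSERT TO master_view
    DO INSTEAD SELECT handle_view_insert(NEW.name, NEW.payload);

The SELECT in the rule action is only there to give the function somewhere to 
run; whether clients are happy to get a result set back from an INSERT is a 
separate question.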

Regards

Russell Smith. 

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: Learning curves and such (was Re: [HACKERS] pgFoundry)

2005-05-17 Thread Russell Smith
On Wed, 18 May 2005 04:31 am, Josh Berkus wrote:
> Andrew,
> 
> > Last time it came up I thought the problem was that there was not a
> > consensus on *which* bugtracker to use.
> 
> Or whether to use one.Roughly 1/3 bugzilla, 1/3 something else, and 1/3 
> don't want a bugtracker.  And, among the people who didn't want bugzilla, 
> some were vehemently opposed to it.  Bugtrackers discussed included GForge, 
> bugzilla, RT, Roundup, Jura (they offered a free license) and a few I don't 
> remember.
> 
> > Incidentally, I'm not advocating we use bugzilla (if anything I think
> > I'd lean towards using RT), but this seems like a good opportunity to
> > note that as of a week or two ago bugzilla's HEAD branch supports using
> > PostgreSQL as its backing store, and this will be maintained.
> 
> One of the things which came out of the bugtracker discussion is that 
> anything 
> we use must have the ability for developers to interact 100% by e-mail, as 
> some critical developers will not use a web interface.
> 
Doesn't pgfoundry offer this?  If not in 3.3, I'm sure it's in Gforge 4.0, or 
4.1  which will be
released soon.

Regards

Russell

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: Learning curves and such (was Re: [HACKERS] pgFoundry)

2005-05-17 Thread Russell Smith
> 
> I think it might also be valuable to have a list of things that are good
> 'starter projects'. Maybe some indication of TODO items that are
> simpler. Possibly a list of bugs, too.
> 
As someone who has looked at hacking the pg code, I agree it is difficult to
know what to look at to get started.  I love fixing bugs, but I'm used to the 
bug tracker type situation and being able to find something that looks 
relatively
easy.  That is how I've started on a number of other projects.  With no formal
bug tracker, I'm not sure what bugs I could look at.  I have talked to some 
people
on IRC about tackling the infinite date issue, but was warned off that, as the 
date/time
code is a scary place.

Then I looked into the possibility of working on the autovacuum stuff, and 
again got myself
in a little too deep.  Not low enough fruit for me.

The curve to get the meaning of some things is sometimes steep; PG_TRY and 
things like that just need a lot of code reading.

Not meaning to attack Tom, but for new people proposing new patches he can seem
intimidating.  I have been around for a little while now and am adjusting to 
the reality.
Maybe people shouldn't be hacking the code before being here a year.

> It would probably also be useful to point out ways people can help that
> don't involve hacking C code (or at least not much). Things like docs,
> the website, advocacy, performance testing, etc.

It would be useful to outline positions that are actually available for people 
to take.  It's easy to give a general list; I've asked, and seen many like it.  
For me, what does helping with advocacy mean?  What should be performance 
tested (I assume new code, like the bitmap scan)?  But at the same time, how 
do I not get into something that is being duplicated by somebody else?

I'm just trying to give a bit of a perspective on the way I see things as 
someone who has been around pg for about 12 months and attempted to hack the 
code a couple of times.

Regards

Russell Smith

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Catalog Security WAS: Views, views, views: Summary

2005-05-13 Thread Russell Smith
On Sat, 14 May 2005 04:34 am, Andrew Dunstan wrote:
> 
> Andrew - Supernews wrote:
> 
> >>
> >>1) The "ISP" case, where you want to hide all catalog information from the 
> >>users except the database owner or superuser.
> >>
> >>
> >
> >I don't believe this is ever feasible in practice, since client interfaces
> >at any level higher than libpq will need to access metadata corresponding
> >to the data they are retrieving.
> >
> >  
> >
> 
> In the general case you might well be right. Following a scheme like I 
> have in mind is not something that would be transparent to the 
> application - it will probably impose some serious limits on the app. 
> The little sample application I did for testing did everything by stored 
> procedure. Anyway, as I said, it's a project for the future.
> 
From a general user point of view, I do not know the system catalogs very 
well. I am very unsure of what level of information is available to every 
user on the system.

- Which parts of other databases can be seen by users?
- What is the best method to restrict connections to dbs people don't have 
permissions for?
- Are there some restrictions you can place on tables people don't have access 
to?  Otherwise they can see all the columns and table info.

These are just some of the questions I have.  I'm not sure where to get 
answers; searching the archives may help, but it's definitely not a final 
answer, especially since this stuff would be a moving target with each 
version change of PostgreSQL.
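
For example (only an illustration of the kind of exposure I mean, not a 
complete audit; "payroll" here is a made-up table the user has no privileges 
on), any user who can connect can still run things like:

SELECT datname FROM pg_database;                    -- every database in the cluster
SELECT relname FROM pg_class WHERE relkind = 'r';   -- every table in this database
SELECT attname, atttypid::regtype
  FROM pg_attribute
 WHERE attrelid = 'payroll'::regclass AND attnum > 0;  -- columns of a table they can't read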

Tom mentioned that he had not had these security concerns raised before.  From 
my point of view, I just have no idea about the level of information offered 
to any given user, and am scared to run PostgreSQL in an ISP shared 
environment because of it.  I am sure I can stop people from connecting to 
a db by refusing them access in pg_hba.conf.  But I'm unsure of exactly what 
that buys me, and what it doesn't.

A hardening script would be helpful, but some clear information on what is 
available to the average user would be good too.  I know I should probably 
step up to do this, but I don't have time at the moment.  I'm sure if I did, 
I would also miss a great number of things.

Regards

Russell Smith

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] New Contrib Build?

2005-05-12 Thread Russell Smith
On Thu, 12 May 2005 02:44 pm, Tom Lane wrote:
> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > Andrew Dunstan wrote:
> >> First, I *really* wish we'd call it something else. Contrib conveys
> >> "unsupported" to people.
> 
[snip]
 
> Which is as it should be, I think.  Contrib is essentially the "not
> quite ready for prime time" area.  If it were 100% up to speed then
> it'd be in the core backend already ... while if we required it to be
> 100% in advance, then it'd not have gotten out there in the first place.
> 
At which point do things move from not being 100% to being 100%?  From
what I understand some of the contrib modules have been there for a very 
long time.  Some of them may now be solved in other ways, e.g. the new system 
views and dbsize.

Other things like pgcrypto may enable simple things like changing somebody's
username without redoing their password, as we could use those functions 
instead of the current ones.
This may make some of our pg_shadow friends (with regard to recentish security 
threads) a bit happier as well.

I suppose the question is, at what point are contrib modules re-reviewed for 
inclusion
into core?  And if they are continuing not to make it, is there something else 
that should
be done with them?

> The real issue seems to be that we have a disconnect between what is
> presently in contrib and what is on gborg or pgfoundry.  There are
> certainly many contrib modules that are only there on seniority: if
> they were submitted today then they'd have gotten put on pgfoundry.
> But I'm not sure that there's much value in an enforced cleanup.

Maybe not an enforced cleanup.  But if there are people who manage certain 
modules, it may be worth asking them about getting their contrib module onto 
pgfoundry, if that is the best place for it, and then giving them a little bit 
of support in doing it, e.g. getting the cvs history out of the PostgreSQL cvs 
tree.

Regards

Russell Smith

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] pl/pgsql enabled by default

2005-05-06 Thread Russell Smith
On Fri, 6 May 2005 04:45 pm, Jim C. Nasby wrote:
> On Fri, May 06, 2005 at 02:59:04PM +1000, Neil Conway wrote:
> > Is there a good reason that pl/pgsql is not installed in databases by 
> > default?
> 
> The only reason I've seen was "if we start with including plpgsql, where
> do we draw the line?" 
Well, I thought (and I'm sure it's been said) that plpgsql was our attempt to
match Oracle's PL/SQL.  As Tom has already suggested in the thread regarding
whether we should move PLs out or not, plpgsql is the only one that is entirely
internal to the db.  This is where I would clearly draw the line.  If you have 
a PL that is only reliant on PostgreSQL being installed, then you may have a 
case for getting it enabled.  Otherwise, not a chance.  I would say plpgsql is 
likely to be the only PL in this situation for a long time, if not the only 
one ever.


> Personally, I think it should be installed by default.
I agree with everybody else, having it enabled by default is a good idea.


Regards

Russell Smith

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] pgFoundry

2005-05-06 Thread Russell Smith
On Fri, 6 May 2005 01:32 pm, Joshua D. Drake wrote:
> Hello,
> 
> PgFoundry has been brought up quite a bit lately. How we want
> it to succeed, how we want it to be the hub of PostgreSQL development.
> So my question is... Why isn't it?

Because it's not the hub of PostgreSQL development.  I think it will be 
difficult to build a culture of "this is the place to be" unless we actually 
kill gborg totally.  Currently there are still projects there, and I'm 
personally never sure where to look for a particular project.  Even some of 
the more high profile projects like Slony-I aren't on pgFoundry.  How can we 
expect people to take it seriously?

Issue two, which I know is being resolved, is that people find it painfully 
slow to navigate.  Who wants to search a site that is painfully slow?  Until 
the site is running at a good speed, and can support a reasonably large number 
of projects at that speed, are people going to be encouraged to move over?  I 
don't think so.

I know there are issues with the CVS statistics.  If I'm looking for a project 
to fulfil a function, looking at the statistics is a good way to determine if 
the project is going anywhere or not, as are releases.  Currently every 
project says "This project has no CVS history." and lists 0 commits and 0 
adds.  I don't think this generally looks good for us.  No information at all 
would be better than false information.

Also, a little more prominence on the PostgreSQL main page would be helpful.
I know there is a link, but to the unknowing user, what is pgFoundry about?
Maybe some advertising along the lines of: if you want something that runs 
with your PostgreSQL server, head on over to pgFoundry to find it.

We should encourage any OSS projects that are for PostgreSQL to use pgFoundry 
instead of any other hosting source.  One very basic example is the nss 
library I have been working on.  I recently found that in February another 
fork of the nss library had popped up on Debian's GForge site.  I had no idea 
it existed, and they had no idea I existed, and they use PostgreSQL fairly 
exclusively.  Where were they looking for an nss library when they needed one?  
Well, it obviously wasn't at pgFoundry.

> Why is PostgreSQL not hosted at pgFoundry?
> It seems that it has everything we would need?

This is possibly true; it gives the advantage of trackers and many functions 
that the lists are used for.  Maybe it's less likely we would lose patches 
and/or bugs.

I don't really have a lot of knowledge about the benefits and disadvantages, 
so I'll leave the people who actually use CVS and things like that to make a 
comment.
> 
> To keep this on track, consider my question as if it were 2 months from 
> now and pgFoundry was all up on the new servers etc...

Personally, this is a problem.  It's another 2 months away.  In that time, I 
think we also need to focus on getting rid of gborg and redirecting people to 
pgFoundry.  But can the current setup handle the load, and can we get the move 
from gborg done?  Also, is the upgrade path for moving servers easy?  If it 
is, that's probably more reason to push for the full closure of gborg.


Now, despite the apparently large number of complaints, I think pgFoundry is a 
very good idea, and will work well in the long run.  I just think we need to 
get some things going well to get people on the site more.  Once that happens, 
people will instinctively look there.

Sincerely,

Russell Smith

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [OT] Re: [pgsql-advocacy] [HACKERS] Increased company involvement

2005-05-04 Thread Russell Smith
On Wed, 4 May 2005 04:40 am, Tom Copeland wrote:
> On Tue, 2005-05-03 at 14:26 -0400, Mitch Pirtle wrote:
> > If you guys are planning on running Gforge, then you better make 'box' 
> > plural.
> > 
> > I'm running MamboForge.net, and the poor thing is getting beat into
> > the cold hard earth every day. We (Mambo) should really have two
> > servers, at least to dedicate hardware to the database. Most of the
> > servers in that class are dual Xeons as well - just as an indicator of
> > what you are getting yourselves into ;-)
> 
> Of course, Mitch is running the second largest GForge site on the planet
> (as far as I know) second only to https://helixcommunity.org/.
> Here's a list of them:
> 
> http://gforge.org/docman/view.php/1/52/gforge-sites.html
> 
Where does all the CPU/disk time go?  Do we have any idea which parts of the 
system are strained?

Is it the database?

Regards

Russell Smith.
> Yours,
> 
> Tom Copeland
> 
> 
> 
> ---(end of broadcast)---
> TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
> 
> 

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Feature freeze date for 8.1

2005-05-01 Thread Russell Smith
On Mon, 2 May 2005 03:05 pm, Neil Conway wrote:
> Tom Lane wrote:
> > We would?  Why?  Please provide a motivation that justifies the
> > considerably higher cost to make it count that way, as opposed to
> > time-since-BEGIN.
> 
> The specific scenario this feature is intended to resolve is 
> idle-in-transaction backends holding on to resources while the network 
> connection times out; it isn't intended to implement "I never want to 
> run a transaction that takes more than X seconds to execute." While 
> long-running transactions aren't usually a great idea, I can certainly 
> imagine installations in which some transactions might take, say, 30 
> minutes to execute but the admin would like to timeout idle connections 
> in less than that amount of time.
> 
The two big long running transactions I can think of are VACUUM on a large db 
and backups.  For VACUUM there is no way to shorten that time, since to stop 
wraparound you must vacuum the whole db.

Backups with pg_dump can run for quite a long time.

I would prefer an idle timeout if it's not costly, because otherwise estimates 
need to be made about how long VACUUM and backups could take, and the timeout 
set longer than that, which in some senses defeats the purpose of being able 
to clean up idle connections quickly.

The VACUUM issue may not be a problem, as if BEGIN is not issued then the 
transaction timeout would probably not apply.  But the issue would remain for 
backups.

Just some thoughts

Regards

Russell Smith

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] possible TODO: read-only tables, select from indexes only.

2005-04-22 Thread Russell Smith
On Sat, 23 Apr 2005 03:14 am, Bruce Momjian wrote:
> Hannu Krosing wrote:
> > On R, 2005-04-22 at 11:40 -0400, Bruce Momjian wrote:
> > > See this TODO:
> > >  
> > >  * Allow data to be pulled directly from indexes
> > >  
> > >Currently indexes do not have enough tuple visibility information
> > >to allow data to be pulled from the index without also accessing
> > >the heap.  One way to allow this is to set a bit to index tuples
> > >to indicate if a tuple is currently visible to all transactions
> > >when the first valid heap lookup happens. 
> > 
Storing visibility information in the index has always been put down as a 
cause of performance problems.  Would it be plausible to have an index type 
that included the information and one that didn't?  You could choose which way 
you wanted to go.

I know that for some tables especially, I would choose the index with 
visibility, as it would increase performance by not looking at the table at 
all for that information (count being a good example).  However, for general 
purpose indexes I would use the normal index with no visibility information.

The choice of the bit method or full tuples is probably a decision for others, 
but having the flexibility to choose here would be a great thing.

Regards

Russell Smith

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] REINDEX ALL

2005-04-05 Thread Russell Smith
On Wed, 6 Apr 2005 08:18 am, Andreas Pflug wrote:
> Joshua D. Drake wrote:
> > The question is coming from the TODO:
> > 
> > Allow REINDEX to rebuild all database indexes, remove
> > contrib/reindexdb
> > 
> > We can do whatever the community wants :) Just tell us what it is.
> 
Does this pose a problem where everything will run inside one transaction,
effectively blocking some db functions until every table has been reindexed?

Regards

Russell Smith

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] [HACKERS] plPHP in core?

2005-04-05 Thread Russell Smith
On Tue, 5 Apr 2005 06:01 am, Joshua D. Drake wrote:
> Tom Lane wrote:
> 
> >Andrew Dunstan <[EMAIL PROTECTED]> writes:
> >
> >>... If there are no license or build issues I'm in favor.
> >>
> >
> >Peter has pointed out that the problem of circular dependencies is a
> >showstopper for integrating plPHP.  The build order has to be
> > Postgres
> > PHP (since its existing DB support requires Postgres to build)
> > plPHP
> >so putting #1 and #3 into the same package is a no go.  Which is too
> >bad, but I see no good way around it.
> >
> O.k. I am confused here. You do not need PHP DB support for plPHP. You only
> need the php.so (once were done anyway). Which means that as long as PHP
> is installed it will work, just like plperl or plpython.
> 
> The ONLY reason you would build PHP separately is if your stock installed
> PHP didn't have a feature enabled that you want. This has nothing at all
> to do with plPHP.
> 
The issue also includes the fact that you can't install libpq without having 
postgresql
installed.  If you could do that, the circular dependency wouldn't exist.

Some systems build postgresql into php; given that is the case, what Tom says 
is correct.  First you would have to force postgresql to be installed without 
pl/php, then install php with postgresql support, then install pl/php.

OR

Install php without postgresql support
Install postgresql with pl/php
Rebuild php with postgresql support (Unless you only want it available in the 
db)

I may be a bad man for suggesting it...  But is it possible to ship libpq as a 
separate tarball that you can compile without the postgresql server?

Regards

Russell Smith

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Name change proposal

2005-04-01 Thread Russell Smith
On Fri, 1 Apr 2005 05:40 pm, Michael Fuhr wrote:
> I'd like to propose that we abandon the name "PostgreSQL" and rename the
> project "Postgre", to be pronounced either "post-greh" or "post-gree".
> This change would have a twofold purpose: it would meet popular demand,
> and it would reflect my next proposal, that we abandon SQL as the query
> language and replace it with Tutorial D.
> 
April 1 is nearly over.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] locks in CREATE TRIGGER, ADD FK

2005-03-22 Thread Russell Smith
On Wed, 23 Mar 2005 12:40 pm, Christopher Kings-Lynne wrote:
> If you want to be my friend forever, then fix CLUSTER so that it uses 
> sharerowexclusive as well :D
> 
I don't think it's as easy as that, because you have to move tuples
around in the cluster operation.  Same sort of issue as vacuum full I would 
suggest.

Russell Smith

> Chris
> 
> Neil Conway wrote:
> > Neil Conway wrote:
> > 
> >> AndrewSN pointed out on IRC that ALTER TABLE ... ADD FOREIGN KEY and 
> >> CREATE TRIGGER both acquire AccessExclusiveLocks on the table they are 
> >> adding triggers to (the PK table, in the case of ALTER TABLE). Is this 
> >> necessary? I don't see why we can't allow SELECT queries on the table 
> >> to proceed while the trigger is being added.
> > 
> > 
> > Attached is a patch that changes both to use ShareRowExclusiveLock, and 
> > updates the documentation accordingly. I'll apply this later today, 
> > barring any objections.
> 
> ---(end of broadcast)---
> TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
> 
> 

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] type unknown - how important is it?

2005-03-15 Thread Russell Smith
On Wed, 16 Mar 2005 05:17 am, Dave Cramer wrote:
> Shachar,
> 
> I think with type oid 705 (unknown) it's safe to treat it as text. 
> Certainly better than punting.
> 
> On another note are you aware of any issues with transactions? 
> Specifically with using the dated autocommit mode ?
> 
> Dave
> 
> Shachar Shemesh wrote:
> 
> > Tom Lane wrote:
> >
> >> Dave Cramer <[EMAIL PROTECTED]> writes:
> >>  
> >>
> >>> I just had a customer complain about this. My understanding is that 
> >>> unkown is a constant which has not been typed yet. Is it possible 
> >>> for it to be a binary type, if so how ?
> >>> I would think it should only ever be a string?
> >>>   
> >>
> >>
> >> You can read "unknown" as "string literal for which the query does not
> >> provide sufficient context to assign a definite type".  I dunno what the
> >> OLE driver really needs to do with the datatype info, but I suppose that
> >> treating this as equivalent to "text" is not unreasonable.  Ask the
> >> complainant what *he* thinks it should do.
> >>
> >> regards, tom lane
> >>  
> >>
[snip]
> >
> > On the good news front, Version 1.0.0.17 is about ready to be released 
> > (initial schema support). I am resuming development after about half a 
> > year of doing other stuff.
> >
> > Shachar
> >
> 
I have complained about this in the past, and would also suggest that it be 
treated as a
string value.

CREATE table b AS SELECT 'unknown', col2 from a;

This will even create a table with a column of type unknown, which doesn't 
have any operators to convert it to anything, including text.
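
The workaround, of course, is to cast the literal explicitly (same made-up 
tables as above):

CREATE TABLE b AS SELECT 'unknown'::text AS label, col2 FROM a;
-- or, equivalently, CAST('unknown' AS text)

but it would be nicer if the untyped literal simply defaulted to text.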

Regards

Russell Smith.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Data loss, vacuum, transaction wrap-around

2005-02-18 Thread Russell Smith
On Sat, 19 Feb 2005 04:10 am, Tom Lane wrote:
> [EMAIL PROTECTED] writes:
> > In fact, I think it is so bad, that I think we need to back-port a fix to
> > previous versions and issue a notice of some kind.
> 
> They already do issue notices --- see VACUUM.
> 
> A real fix (eg the forcible stop we were talking about earlier) will not
> be reasonable to back-port.
> 
Not to be rude, but if backporting is not an option, why do we not just
focus on the job of getting autovacuum into 8.1, and not have to think
about how a patch that will warn users will work?

Regards

Russell Smith

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Help me recovering data

2005-02-18 Thread Russell Smith
On Fri, 18 Feb 2005 08:53 pm, Jürgen Cappel wrote:
> Just wondering after this discussion:
> 
> Is transaction wraparound limited to a database or to an installation ?
> i.e. can heavy traffic in one db affect another db in the same installation ?
> 
XIDs are global to the pg cluster, or installation.  So even a database that 
is not being used will still hit XID wraparound as other databases in the 
cluster consume transaction IDs.

Regards

Russell Smith.
> 
> ---(end of broadcast)---
> TIP 8: explain analyze is your friend
> 
> 

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Help me recovering data

2005-02-18 Thread Russell Smith
On Fri, 18 Feb 2005 04:38 pm, Kevin Brown wrote:
> Tom Lane wrote:
> > Gaetano Mendola <[EMAIL PROTECTED]> writes:
> > > BTW, why not do an automatic vacuum instead of shutdown ? At least the
> > > DB do not stop working untill someone study what the problem is and
> > > how solve it.
> > 
> > No, the entire point of this discussion is to whup the DBA upside the
> > head with a big enough cluestick to get him to install autovacuum.
> > 
> > Once autovacuum is default, it won't matter anymore.
> 
> I have a concern about this that I hope is just based on some
> misunderstanding on my part.
> 
> My concern is: suppose that a database is modified extremely
> infrequently?  So infrequently, in fact, that over a billion read
> transactions occur before the next write transaction.  Once that write
> transaction occurs, you're hosed, right?  Autovacuum won't catch this
> because it takes action based on the write activity that occurs in the
> tables.
> 
> So: will autovacuum be coded to explicitly look for transaction
> wraparound, or to automatically vacuum every N number of transactions
> (e.g., 500 million)?
> 
autovacuum already checks for both transaction wraparound and table updates.
It vacuums individual tables as they need it, from a free-space/recovery point
of view.

It also checks that no database is nearing transaction wraparound; if one is,
it initiates a database-wide vacuum to resolve the issue.

Regards

Russell Smith


[HACKERS] Auto Vacuum

2004-11-28 Thread Russell Smith
Hi All,

I am doing some serious thinking about implementing Auto Vacuum as part of the
backend, not using libpq but calling internal functions directly.  It appears
to me that calling internal functions directly is a better implementation than
using the external library to do the job.

I know I might be stepping on Matthew's toes, but I don't really want to.  I am
a complete newbie to the PostgreSQL code, but I am trying.  Vacuum appears to
be one of the bigger sore points, with administrators having to configure it
via scheduled tasks.

I have outlined the things I have thought about below.  I have surely missed a
lot, and am open to feedback.  Others may prefer the tuning used by the current
external autovacuum; however, collecting stats at vacuum time and using them
for later vacuums would mean we do not need the stats collector running.

The major autovacuum issues 

1. Transaction Wraparound
2. Vacuum of relations
3. Tracking of when to do vacuums
4. Where to store information needed by auto vacuum

I would suggest the first step is to replicate the wraparound and relation
vacuuming from the current code, but I could be wrong.  The external autovacuum
currently has a lot of tuning options, and the best way to run vacuum
internally will need to be thought out.

1. Transaction Wraparound

It appears from the code that the best time to do a transaction-wraparound,
database-wide vacuum is when the frozen XID has wrapped, i.e. when
TransactionIdPrecedes(myXID, dbform->datfrozenxid) is true.  This is probably
the simplest vacuum to implement.
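
As an illustration of the decision this piece has to make (purely a sketch; the
500-million figure is an arbitrary example, not a proposed default), the
databases needing a wraparound vacuum could be picked out with a query along
these lines:

SELECT datname
  FROM pg_database
 WHERE age(datfrozenxid) > 500000000;  -- candidates for a database-wide vacuum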


2. Vacuuming of relations

Currently the entire heap must be vacuumed at one time.  It would possibly be
desirable to vacuum only part of the relation at a time, if we can work out
which parts of it hold the most slack space.  There is a TODO item about
tracking recent deletions so that the space can be reused; some form of this
would help decide what to vacuum.  Performance of this kind of activity may be
a concern, but I have no experience on which to base a comment, so I welcome
yours.

3. Tracking of when to vacuum

The current autovacuum relies on the stats collector running.  I would like to
use the stats only if they are available, and to have an option to vacuum
accurately without the stats collector running at all.

By adding certain histograms on tuple counts, file size and slack space, we can
gauge the time between vacuums:

the number of tuples shows inserts as opposed to updates.
the file size shows that the file is growing, and by how much between vacuums.
the slack space shows the deleted/updated records.

A new GUC and a per-relation option would need to be implemented to give a
target amount of slack space in a file.  This will help reduce growth in
relations if vacuum happens not to run frequently enough, and the same
information can tell autovacuum that it should be vacuuming more often.  The
number would be a percentage, e.g. 10% of the total file size is allowed to be
unused, and ALTER TABLE would let users set the slack level for each relation.
If a table gets much more than its target slack space, a "partial" VACUUM FULL
could be run to shrink it by moving tuples from the end of the table into slack
space near the beginning and then shortening the table.  That would require a
full table lock, but it may be possible to spread the work out and only do a
page at a time.

/* target percentage of slack space */
vacuum_default_target_slack_space = 10

ALTER TABLE SET TARGET SLACK SPACE = 10;
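
For what it is worth, a rough measurement of the slack currently in a table can
already be taken with the contrib pgstattuple module, which could help
sanity-check any histogram approach.  A sketch, assuming the module is
installed and using 'mytable' as a stand-in for a real table:

-- reports tuple counts, dead tuple space and free space for the table
SELECT * FROM pgstattuple('mytable');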

4. Where to store information required by auto vacuum.

Auto vacuum needs somewhere to store the information it needs about current
and future vacuuming.  I am unsure where to put this; it looks as if it will
need a number of fields.  pg_class feels like the best place, but again I have
no firm idea.

That's the best I can do for now.  I can clarify things further if required.

Regards

Russell.



[HACKERS] "unkown" columns

2004-11-06 Thread Russell Smith
Hi Hackers,

I have created a table with an unknown column in it.

CREATE TABLE test AS SELECT 'a' AS columna, 'b' AS columnb;

will create a table with columna and columnb as an unknown type.

This in itself is not strictly a problem.  However, there are no functions in
PostgreSQL to convert unknown to another type.  Casting an explicit literal
works, but once the data is in a table there is no way to cast the column's
values to anything else.  Attempting to change the column type on 8.0b4, or
even trying SELECT columna::text FROM test, results in the following error.

SQL error:

ERROR:  failed to find conversion function from "unknown" to text

In statement:
ALTER TABLE "test" ALTER COLUMN "columna" TYPE text

I would have assumed there was an implicit cast to text for literals of the
form 'a', but it seems not.

I have spoken to Gavin Sherry on IRC and he has written functions that allow
casting from unknown to text in this situation; however, he has not yet had an
opportunity to mail the list about it, so I am doing so.

Neil Conway also commented that unknown is an issue with low priority; however,
I think we need either to be able to cast away from unknown, or at least to
raise an error when attempting to create a table with an unknown column type.

I get the same error on 7.4.5 and 8.0b4
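
In the meantime the problem can be avoided entirely by casting the literals
when the table is created, so the columns get a real type from the start (a
minimal sketch based on the example above):

CREATE TABLE test AS SELECT 'a'::text AS columna, 'b'::text AS columnb;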

Regards

Russell Smith



Re: [HACKERS] CVS should die (was: Possible make_oidjoins_check ...)

2004-11-04 Thread Russell Smith
On Fri, 5 Nov 2004 07:02 am, Marc G. Fournier wrote:
> On Thu, 4 Nov 2004, Tom Lane wrote:
> 
> > "Marc G. Fournier" <[EMAIL PROTECTED]> writes:
> >> why would we lose CVS history?  I can physically move the files in
> >> /cvsroot to accomplish this ... just tell me what needs to move, and to
> >> where ...
> >
> > If you physically move the files, that would retroactively change their
> > placement in back versions, no?  ie, it would appear that all previous
> > releases had had 'em under src/tools as well.
> 
> Erk, yes, good point ...
You could always physically copy the file to the new location, giving you all
the history there, and then run cvs remove on the old location.  I can't see
how that is very different from cvs remove/cvs add, except that you keep the
history as well as the old version.

The second problem still exists: the file appears in two locations in previous
releases, unless you also cvs remove the new copy from those branches.  As
always, CVS is a bit messy with these things; I am just throwing ideas on the
pile that might work.

Regards

Russell Smith



Re: [HACKERS] Record unassigned yet

2004-10-01 Thread Russell Smith
On Fri, 1 Oct 2004 07:24 pm, Johann Robette wrote:
> Hello,
> 
> I'm experiencing a strange problem. Here it is :
> I've created a function with a FOR loop.
> 
> DECLARE
> Current RECORD;
> BEGIN
> FOR current IN SELECT * FROM employees LOOP
>   Tmp := current.id;
> END LOOP;
> ...
current != Current ?
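
For what it is worth, writing the loop with one consistent spelling and an
explicit declaration for the target variable runs cleanly here.  A minimal
sketch (the function name, the integer type of id, and the variable names are
assumptions, not taken from the original post):

CREATE FUNCTION last_employee_id() RETURNS integer AS '
DECLARE
    current_row RECORD;
    tmp integer;
BEGIN
    FOR current_row IN SELECT * FROM employees LOOP
        tmp := current_row.id;   -- remember the id of the row just fetched
    END LOOP;
    RETURN tmp;
END;
' LANGUAGE plpgsql;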

> 
> When I call the function, I get the error :
>  ERROR:  record " current " is unassigned yet
> 
> Any idea?
> 