Re: [HACKERS] [ADMIN] reindexdb hangs

2007-09-12 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > I am unsure if I should backpatch to 8.1: the code in cluster.c has
> > changed, and while it is relatively easy to modify the patch, this is a
> > rare bug and nobody has reported it in CLUSTER (not many people clusters
> > temp tables, it seems).  Should I patch only REINDEX?  How far back?
> 
> I'd say go as far back as you can conveniently modify the patch for.
> This is a potential data-loss bug (even if only for temporary data)
> so we ought to take it seriously.

OK, I fixed it all the way back that it was needed: 7.4 for CLUSTER and
8.1 for REINDEX.

Before 7.4 there wasn't a database-wide version of CLUSTER.  This wasn't
a very serious bug in any case, because it would have thrown an ERROR if
it tried to cluster a remote temp table.  So apparently no one ever saw
it, because I can't remember any report about it.


For REINDEX the story is very different, because what happens is that
queries start silently returning wrong data.  My test case was

alvherre=# create temp table foo (a int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY creará el índice implícito «foo_pkey» para 
la tabla «foo»
CREATE TABLE
alvherre=# insert into foo select * from generate_series(1,4);
INSERT 0 4

-- "reindex database alvherre" in another session

alvherre=# select * from foo where a = 400;
  a  
-
(0 rows)

If you now REINDEX this table in the current session, it correctly
returns one tuple.  So if somebody is executing a procedure which
involve temp tables and someone else concurrently does a REINDEX
DATABASE, the queries of the first session are automatically corrupted.
It seems like the worst kind of bug.

Maybe we need additional protections on the bufmgr to prevent this kind
of problem.

We only introduced REINDEX DATABASE as a way to reindex user indexes in
8.1.  Before that, it only considered system catalogs, which ISTM are
never temp.

Many thanks to dx k9 for the original report.

-- 
Alvaro Herrera   http://www.PlanetPostgreSQL.org/
"Endurecerse, pero jamás perder la ternura" (E. Guevara)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [ADMIN] reindexdb hangs

2007-09-10 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> I am unsure if I should backpatch to 8.1: the code in cluster.c has
> changed, and while it is relatively easy to modify the patch, this is a
> rare bug and nobody has reported it in CLUSTER (not many people clusters
> temp tables, it seems).  Should I patch only REINDEX?  How far back?

I'd say go as far back as you can conveniently modify the patch for.
This is a potential data-loss bug (even if only for temporary data)
so we ought to take it seriously.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [ADMIN] reindexdb hangs

2007-09-10 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > I'm not sure I follow.  Are you suggesting adding a new function,
> > similar to pg_class_ownercheck, which additionally checks for temp-ness?
> 
> No, I was just suggesting adding the check for temp-ness in cluster()
> and cluster_rel() where we do pg_class_ownercheck.  We already have the
> rel open there and so it's cheap to do the temp-ness check.

I applied a patch along these lines to HEAD and 8.2.

I am unsure if I should backpatch to 8.1: the code in cluster.c has
changed, and while it is relatively easy to modify the patch, this is a
rare bug and nobody has reported it in CLUSTER (not many people clusters
temp tables, it seems).  Should I patch only REINDEX?  How far back?

-- 
Alvaro Herrera  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"The eagle never lost so much time, as
when he submitted to learn of the crow." (William Blake)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [ADMIN] reindexdb hangs

2007-08-30 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> I'm not sure I follow.  Are you suggesting adding a new function,
> similar to pg_class_ownercheck, which additionally checks for temp-ness?

No, I was just suggesting adding the check for temp-ness in cluster()
and cluster_rel() where we do pg_class_ownercheck.  We already have the
rel open there and so it's cheap to do the temp-ness check.

I guess it's a question of which path you are more concerned about
making cheap, of course.  Your proposal was to filter before putting the
rel into the list at all, which certainly saves cycles when we reject a
remote temp table; but it adds cycles for all other tables.  I argue
that the remote-temp-table case is uncommon (else we'd have had many
more trouble reports) and therefore optimizing it at the cost of the
normal case isn't a win.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [ADMIN] reindexdb hangs

2007-08-30 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> Yeah, an extra fetch of the pg_class row doesn't seem all that nice.
> >> I think you'd want to check it in approximately the same two places
> >> where pg_class_ownercheck() is applied (one for the 1-xact and one for
> >> the multi-xact path).
> 
> > Actually, the 1-xact path does not need it, because the check is already
> > elsewhere.
> 
> Yeah, but if you do it there it's one added comparison
> (isOtherTempNamespace is very cheap, and you can get the namespace
> cheaply from the already-open rel).  This way you need an extra syscache
> lookup because you are insisting on doing the check in a place where you
> don't have easy access to the pg_class row.  Doesn't seem better.

I'm not sure I follow.  Are you suggesting adding a new function,
similar to pg_class_ownercheck, which additionally checks for temp-ness?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [ADMIN] reindexdb hangs

2007-08-29 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Yeah, an extra fetch of the pg_class row doesn't seem all that nice.
>> I think you'd want to check it in approximately the same two places
>> where pg_class_ownercheck() is applied (one for the 1-xact and one for
>> the multi-xact path).

> Actually, the 1-xact path does not need it, because the check is already
> elsewhere.

Yeah, but if you do it there it's one added comparison
(isOtherTempNamespace is very cheap, and you can get the namespace
cheaply from the already-open rel).  This way you need an extra syscache
lookup because you are insisting on doing the check in a place where you
don't have easy access to the pg_class row.  Doesn't seem better.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [ADMIN] reindexdb hangs

2007-08-29 Thread Alvaro Herrera
Tom Lane wrote:

> > I examined cluster.c and it does seem to be missing a check too.  I'm
> > not sure where to add one though; the best choice would be the place
> > where the list of rels is built, but that scans only pg_index, so it
> > doesn't have access to the namespace of each rel.  So one idea would be
> > to get the pg_class row for each candidate, but that seems slow.
> > Another idea would be to just add all the candidates and silently skip
> > the temp indexes in cluster_rel.
> 
> Yeah, an extra fetch of the pg_class row doesn't seem all that nice.
> I think you'd want to check it in approximately the same two places
> where pg_class_ownercheck() is applied (one for the 1-xact and one for
> the multi-xact path).

Actually, the 1-xact path does not need it, because the check is already
elsewhere.  We only need logic enough to skip temp tables silently while
building the list in the multi-xact path.  What this means is that very
few people, if any, clusters temp tables; because as soon as you do, a
plain CLUSTER command in another session errors out with

alvherre=# cluster;
ERROR:  cannot cluster temporary tables of other sessions

So, patch attached.

> Are there any other commands to be worried about?  I can't see any
> besides VACUUM/ANALYZE, and those seem covered.

I can't think of any.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Index: src/backend/commands/cluster.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/commands/cluster.c,v
retrieving revision 1.162
diff -c -p -r1.162 cluster.c
*** src/backend/commands/cluster.c	19 May 2007 01:02:34 -	1.162
--- src/backend/commands/cluster.c	29 Aug 2007 22:07:04 -
*** cluster_rel(RelToCluster *rvtc, bool rec
*** 276,281 
--- 276,287 
  		}
  
  		/*
+ 		 * Note: we don't need to check whether the table is a temp for a
+ 		 * remote session here, because it will be checked in
+ 		 * check_index_is_clusterable, below.
+ 		 */
+ 
+ 		/*
  		 * Check that the index still exists
  		 */
  		if (!SearchSysCacheExists(RELOID,
*** swap_relation_files(Oid r1, Oid r2, Tran
*** 995,1004 
  }
  
  /*
!  * Get a list of tables that the current user owns and
!  * have indisclustered set.  Return the list in a List * of rvsToCluster
!  * with the tableOid and the indexOid on which the table is already
!  * clustered.
   */
  static List *
  get_tables_to_cluster(MemoryContext cluster_context)
--- 1001,1037 
  }
  
  /*
!  * Returns whether a pg_class tuple belongs to a temp namespace which is not
!  * our backend's.
!  */
! static bool
! relid_is_other_temp(Oid class_oid)
! {
! 	HeapTuple	tuple;
! 	Form_pg_class classForm;
! 	bool		istemp;
! 
! 	tuple = SearchSysCache(RELOID,
! 		   ObjectIdGetDatum(class_oid),
! 		   0, 0, 0);
! 	if (!HeapTupleIsValid(tuple))
! 		ereport(ERROR,
! (errcode(ERRCODE_UNDEFINED_TABLE),
!  errmsg("relation with OID %u does not exist", class_oid)));
! 
! 	classForm = (Form_pg_class) GETSTRUCT(tuple);
! 	istemp = isOtherTempNamespace(classForm->relnamespace);
! 
! 	ReleaseSysCache(tuple);
! 
! 	return istemp;
! }
! 
! /*
!  * Get a list of tables that the current user owns, have indisclustered set,
!  * and are not temp tables of remote backends.  Return the list in a List * of
!  * rvsToCluster with the tableOid and the indexOid on which the table is
!  * already clustered.
   */
  static List *
  get_tables_to_cluster(MemoryContext cluster_context)
*** get_tables_to_cluster(MemoryContext clus
*** 1031,1036 
--- 1064,1072 
  		if (!pg_class_ownercheck(index->indrelid, GetUserId()))
  			continue;
  
+ 		if (relid_is_other_temp(index->indrelid))
+ 			continue;
+ 
  		/*
  		 * We have to build the list in a different memory context so it will
  		 * survive the cross-transaction processing
Index: src/backend/commands/indexcmds.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/commands/indexcmds.c,v
retrieving revision 1.162
diff -c -p -r1.162 indexcmds.c
*** src/backend/commands/indexcmds.c	25 Aug 2007 19:08:19 -	1.162
--- src/backend/commands/indexcmds.c	25 Aug 2007 22:11:18 -
*** ReindexDatabase(const char *databaseName
*** 1292,1297 
--- 1292,1301 
  		if (classtuple->relkind != RELKIND_RELATION)
  			continue;
  
+ 		/* Skip temp tables of other backends; we can't reindex them at all */
+ 		if (isOtherTempNamespace(classtuple->relnamespace))
+ 			continue;
+ 
  		/* Check user/system classification, and optionally skip */
  		if (IsSystemClass(classtuple))
  		{

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   

Re: [HACKERS] [ADMIN] reindexdb hangs

2007-08-25 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Hmm, there is not any filter in ReindexDatabase() to exclude temp tables
>> of other backends, but it sure seems like there needs to be.  CLUSTER
>> might have the same issue.  I think we fixed this in VACUUM long ago,
>> but we need to check the other commands that grovel over all of a database.

> Was this ever fixed?  I think it wasn't, because I don't see any check
> in ReindexDatabase.  Here is a patch to add one.

No, that's still on the todo list.  Your patch looks reasonable.

> I examined cluster.c and it does seem to be missing a check too.  I'm
> not sure where to add one though; the best choice would be the place
> where the list of rels is built, but that scans only pg_index, so it
> doesn't have access to the namespace of each rel.  So one idea would be
> to get the pg_class row for each candidate, but that seems slow.
> Another idea would be to just add all the candidates and silently skip
> the temp indexes in cluster_rel.

Yeah, an extra fetch of the pg_class row doesn't seem all that nice.
I think you'd want to check it in approximately the same two places
where pg_class_ownercheck() is applied (one for the 1-xact and one for
the multi-xact path).

Are there any other commands to be worried about?  I can't see any
besides VACUUM/ANALYZE, and those seem covered.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [ADMIN] reindexdb hangs

2007-08-25 Thread Alvaro Herrera
Tom Lane wrote:
> "dx k9" <[EMAIL PROTECTED]> writes:
> > [ stuck reindex ]
> > It turns out it was a temporary database and temporary table, that just 
> > wasn't there maybe it thought it was there from some type of snapshot then 
> > the next minute it was gone.
> 
> Hmm, there is not any filter in ReindexDatabase() to exclude temp tables
> of other backends, but it sure seems like there needs to be.  CLUSTER
> might have the same issue.  I think we fixed this in VACUUM long ago,
> but we need to check the other commands that grovel over all of a database.

Was this ever fixed?  I think it wasn't, because I don't see any check
in ReindexDatabase.  Here is a patch to add one.

I examined cluster.c and it does seem to be missing a check too.  I'm
not sure where to add one though; the best choice would be the place
where the list of rels is built, but that scans only pg_index, so it
doesn't have access to the namespace of each rel.  So one idea would be
to get the pg_class row for each candidate, but that seems slow.
Another idea would be to just add all the candidates and silently skip
the temp indexes in cluster_rel.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Index: src/backend/commands/indexcmds.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/commands/indexcmds.c,v
retrieving revision 1.162
diff -c -p -r1.162 indexcmds.c
*** src/backend/commands/indexcmds.c	25 Aug 2007 19:08:19 -	1.162
--- src/backend/commands/indexcmds.c	25 Aug 2007 22:11:18 -
*** ReindexDatabase(const char *databaseName
*** 1292,1297 
--- 1292,1301 
  		if (classtuple->relkind != RELKIND_RELATION)
  			continue;
  
+ 		/* Skip temp tables of other backends; we can't reindex them at all */
+ 		if (isOtherTempNamespace(classtuple->relnamespace))
+ 			continue;
+ 
  		/* Check user/system classification, and optionally skip */
  		if (IsSystemClass(classtuple))
  		{

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [ADMIN] reindexdb hangs

2007-05-02 Thread Tom Lane
"dx k9" <[EMAIL PROTECTED]> writes:
> [ stuck reindex ]
> It turns out it was a temporary database and temporary table, that just 
> wasn't there maybe it thought it was there from some type of snapshot then 
> the next minute it was gone.

Hmm, there is not any filter in ReindexDatabase() to exclude temp tables
of other backends, but it sure seems like there needs to be.  CLUSTER
might have the same issue.  I think we fixed this in VACUUM long ago,
but we need to check the other commands that grovel over all of a database.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly