Re: [HACKERS] Small patch to fix an O(N^2) problem in foreign keys

2015-09-11 Thread Kevin Grittner
Jan Wieck  wrote:

>>> The patch uses a heuristic method of detecting when the hash table
>>> should be destroyed and recreated. InvalidateConstraintCacheCallBack()
>>> adds the current size of the hash table to a counter. When that sum
>>> reaches 1,000,000, the hash table is flushed. This improves the schema
>>> restore of a database with 100,000 foreign keys by factor 3.
>>>
>>> According to my tests the patch does not interfere with the bulk
>>> updates, the original feature was supposed to improve.
>>
>> In the real-world customer case that caused you to look into this,
>> I thought 45ba424f drove schema-only restore time from 2 hours to
>> about 25 hours, and that this patch takes it back down to 2 hours.
>> Am I remembering right?  And this came about because it added
>> 20-some hours to a pg_upgrade run?
>
> From 2 hours to >50, but yes, this is that case.
>
>> If there are no objections, I will push this as a bug fix back to
>> 9.3, where the performance regression was introduced.

Hearing no objections, pushed with one minor tweak Jan and I
discussed off-list -- the original patch duplicated a bit of code
that we agreed should be split into a static function and called
from both places, to protect against someone later changing one and
missing the other.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Small patch to fix an O(N^2) problem in foreign keys

2015-09-10 Thread Jan Wieck

On 09/08/2015 08:49 AM, Kevin Grittner wrote:

Jan Wieck  wrote:


The problem is a cache introduced in commit 45ba4247 that improves


That's a bit off; 45ba424f seems to be what you mean.


Copy paste over paper.





foreign key lookups during bulk updates when the FK value does not
change. When restoring a schema dump from a database with many (say
100,000) foreign keys, this cache is growing very big and every ALTER
TABLE command is causing a InvalidateConstraintCacheCallBack(), which
does a sequential hash table scan.

The patch uses a heuristic method of detecting when the hash table
should be destroyed and recreated. InvalidateConstraintCacheCallBack()
adds the current size of the hash table to a counter. When that sum
reaches 1,000,000, the hash table is flushed. This improves the schema
restore of a database with 100,000 foreign keys by factor 3.

According to my tests the patch does not interfere with the bulk
updates, the original feature was supposed to improve.


In the real-world customer case that caused you to look into this,
I thought 45ba424f drove schema-only restore time from 2 hours to
about 25 hours, and that this patch takes it back down to 2 hours.
Am I remembering right?  And this came about because it added
20-some hours to a pg_upgrade run?


From 2 hours to >50, but yes, this is that case.



If there are no objections, I will push this as a bug fix back to
9.3, where the performance regression was introduced.



Regards, Jan


--
Jan Wieck
Senior Software Engineer
http://slony.info


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Small patch to fix an O(N^2) problem in foreign keys

2015-09-08 Thread Kevin Grittner
Jan Wieck  wrote:

> The problem is a cache introduced in commit 45ba4247 that improves

That's a bit off; 45ba424f seems to be what you mean.

> foreign key lookups during bulk updates when the FK value does not
> change. When restoring a schema dump from a database with many (say
> 100,000) foreign keys, this cache is growing very big and every ALTER
> TABLE command is causing a InvalidateConstraintCacheCallBack(), which
> does a sequential hash table scan.
>
> The patch uses a heuristic method of detecting when the hash table
> should be destroyed and recreated. InvalidateConstraintCacheCallBack()
> adds the current size of the hash table to a counter. When that sum
> reaches 1,000,000, the hash table is flushed. This improves the schema
> restore of a database with 100,000 foreign keys by factor 3.
>
> According to my tests the patch does not interfere with the bulk
> updates, the original feature was supposed to improve.

In the real-world customer case that caused you to look into this,
I thought 45ba424f drove schema-only restore time from 2 hours to
about 25 hours, and that this patch takes it back down to 2 hours.
Am I remembering right?  And this came about because it added
20-some hours to a pg_upgrade run?

If there are no objections, I will push this as a bug fix back to
9.3, where the performance regression was introduced.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Small patch to fix an O(N^2) problem in foreign keys

2015-09-03 Thread Jan Wieck

Attached is a small patch and a script to reproduce the issue.

The problem is a cache introduced in commit 45ba4247 that improves 
foreign key lookups during bulk updates when the FK value does not 
change. When restoring a schema dump from a database with many (say 
100,000) foreign keys, this cache is growing very big and every ALTER 
TABLE command is causing a InvalidateConstraintCacheCallBack(), which 
does a sequential hash table scan.


The patch uses a heuristic method of detecting when the hash table 
should be destroyed and recreated. InvalidateConstraintCacheCallBack() 
adds the current size of the hash table to a counter. When that sum 
reaches 1,000,000, the hash table is flushed. This improves the schema 
restore of a database with 100,000 foreign keys by factor 3.


According to my tests the patch does not interfere with the bulk 
updates, the original feature was supposed to improve.



Regards, Jan

--
Jan Wieck
Senior Software Engineer
http://slony.info
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 61edde9..d7023ce 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -183,6 +183,7 @@ typedef struct RI_CompareHashEntry
  * --
  */
 static HTAB *ri_constraint_cache = NULL;
+static long  ri_constraint_cache_seq_count = 0;
 static HTAB *ri_query_cache = NULL;
 static HTAB *ri_compare_cache = NULL;
 
@@ -2945,6 +2946,27 @@ InvalidateConstraintCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
 
 	Assert(ri_constraint_cache != NULL);
 
+	/*
+	 * Prevent an O(N^2) problem when creating large amounts of foreign
+	 * key constraints with ALTER TABLE, like it happens at the end of
+	 * a pg_dump with hundred-thousands of tables having references.
+	 */
+	ri_constraint_cache_seq_count += hash_get_num_entries(ri_constraint_cache);
+	if (ri_constraint_cache_seq_count > 100)
+	{
+		HASHCTL		ctl;
+
+		hash_destroy(ri_constraint_cache);
+		memset(&ctl, 0, sizeof(ctl));
+		ctl.keysize = sizeof(Oid);
+		ctl.entrysize = sizeof(RI_ConstraintInfo);
+		ri_constraint_cache = hash_create("RI constraint cache",
+		  RI_INIT_CONSTRAINTHASHSIZE,
+		  &ctl, HASH_ELEM | HASH_BLOBS);
+		ri_constraint_cache_seq_count = 0;
+		return;
+	}
+
 	hash_seq_init(&status, ri_constraint_cache);
 	while ((hentry = (RI_ConstraintInfo *) hash_seq_search(&status)) != NULL)
 	{


t1.sh
Description: application/shellscript

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers