Re: [HACKERS] File content logging during execution of COPY queries
On 10/20/2016 12:36 PM, Aleksander Alekseev wrote:
>>> According to my colleagues it would be very nice to have this feature. For instance, if you are trying to optimize PostgreSQL for an application that uses COPY and you don't have access to it, or something like this. It could also be useful in some other cases.
>>
>> This use-case doesn't really make much sense to me. Can you explain it in more detail? Is the goal here to replicate all of the statements that are changing data in the database?
>
> The idea is to record the application workload in a real environment and write a benchmark based on this record. Then, using this benchmark, we could try different OS/DBMS configurations (or maybe hardware), find an optimum, and then change the configuration in the production environment. It's not always possible to change the application or even the database (e.g. to use triggers) for this purpose, for instance if the DBMS is provided as a service.
>
> Currently PostgreSQL allows recording all of the workload _except_ COPY queries. Considering how easily this could be fixed, I think that is wrong. Basically, the only real question here is what it should look like in postgresql.conf.
>
> OK, how about introducing a new boolean parameter named log_copy? The corresponding patch is attached.

This is a useful feature I have been waiting on for some time. If the application whose workload you want to collect uses COPY statements, then recording network traffic was previously your only option.

--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
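For the record, if the log_copy patch proposed above were applied, enabling it would presumably look like any other boolean setting in postgresql.conf. This is a hypothetical sketch: log_copy is only the name proposed in the thread, while log_statement is an existing parameter shown for contrast:

```
# hypothetical postgresql.conf fragment, assuming the proposed log_copy GUC
log_copy = on            # proposed: also log the data carried by COPY FROM/TO
log_statement = 'mod'    # existing GUC: logs the COPY statement itself, not its data
```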
Re: [HACKERS] Fun fact about autovacuum and orphan temp tables
On 09/05/2016 04:34 PM, Alvaro Herrera wrote:
> Grigory Smolkin wrote:
>> The funny part is that it never drops them. So when the backend is finally terminated, it tries to drop them all and fails with an error:
>>
>> FATAL: out of shared memory
>> HINT: You might need to increase max_locks_per_transaction
>>
>> If I understand that right, we are trying to drop all these temp tables in one transaction and running out of locks to do so.
>
> Hmm, yeah, I suppose it does that, and it does seem pretty inconvenient. It is certainly pointless to hold onto these locks for temp tables. I wonder how ugly it would be to fix this problem ...

Thank you for your interest in this problem. I don't think that is the source of the problem, though. An ugly fix there would only force the backend to terminate properly; it would not help at all in case of a server crash or power outage. We need a way to tell autovacuum that we don't need orphan temp tables, so they can be removed using the existing routine.

The least invasive solution would be a GUC, something like 'keep_orphan_temp_tables' with a boolean value, which would determine an autovacuum worker's policy toward encountered orphan temp tables.

--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
[HACKERS] Fun fact about autovacuum and orphan temp tables
Hello, hackers!

We were testing how well a certain application works with PostgreSQL and stumbled upon an autovacuum behavior which I fail to understand. The application in question has a habit of heavily using temporary tables in funny ways. For example, it creates A LOT of them. Which is ok. The funny part is that it never drops them. So when the backend is finally terminated, it tries to drop them all and fails with an error:

FATAL: out of shared memory
HINT: You might need to increase max_locks_per_transaction

If I understand that right, we are trying to drop all these temp tables in one transaction and running out of locks to do so.

After that, postgresql.log is flooded at a rate of 1k/s with messages like this:

LOG: autovacuum: found orphan temp table "pg_temp_15"."tt38147" in database "DB_TEST"

It produces a noticeable load on the system, and it's getting worse with every terminated backend or restart.

I did some RTFS and it appears that autovacuum has no intention of cleaning those orphan tables up unless it's wraparound time (src/backend/postmaster/autovacuum.c):

    /* We just ignore it if the owning backend is still active */
    if (backendID == MyBackendId || BackendIdGetProc(backendID) == NULL)
    {
        /*
         * We found an orphan temp table (which was probably left
         * behind by a crashed backend).  If it's so old as to need
         * vacuum for wraparound, forcibly drop it.  Otherwise just
         * log a complaint.
         */
        if (wraparound)
        {
            ObjectAddress object;

            ereport(LOG,
                    (errmsg("autovacuum: dropping orphan temp table \"%s\".\"%s\" in database \"%s\"",
                            get_namespace_name(classForm->relnamespace),
                            NameStr(classForm->relname),
                            get_database_name(MyDatabaseId))));
            object.classId = RelationRelationId;
            object.objectId = relid;
            object.objectSubId = 0;
            performDeletion(&object, DROP_CASCADE, PERFORM_DELETION_INTERNAL);
        }
        else
        {
            ereport(LOG,
                    (errmsg("autovacuum: found orphan temp table \"%s\".\"%s\" in database \"%s\"",
                            get_namespace_name(classForm->relnamespace),
                            NameStr(classForm->relname),
                            get_database_name(MyDatabaseId))));
        }
    }

What is more troubling is that pg_statistic is starting to bloat badly:

LOG: automatic vacuum of table "DB_TEST.pg_catalog.pg_statistic": index scans: 0
    pages: 0 removed, 68225 remain, 0 skipped due to pins
    tuples: 0 removed, 2458382 remain, 2408081 are dead but not yet removable
    buffer usage: 146450 hits, 31 misses, 0 dirtied
    avg read rate: 0.010 MB/s, avg write rate: 0.000 MB/s
    system usage: CPU 3.27s/6.92u sec elapsed 23.87 sec

What is the purpose of keeping orphan tables around and not dropping them on the spot?

--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company