Re: [PERFORM] Blocking every 20 sec while mass copying.
Finally we solved our problem with a kind of trick. We have two kinds of tables: online tables for reads, and temp tables for the mass inserts. We load the temp tables (5 different tables) without any index, which is much faster than the previous method. We then create the indexes on these tables simultaneously, drop the online tables (also 5 tables) and rename the temp tables to online (takes less than 1 sec). This is the fastest way to insert our data that we have found. On our configuration it goes pretty fast: we reduced our execution time by about 50%, and the database no longer needs as much maintenance. Thanks for all the answers you gave us.

2014-07-21 10:02 GMT+02:00 Albe Laurenz :
> Please keep the list on CC: in your responses.
>
> Benjamin Dugast wrote:
> > 2014-07-18 13:11 GMT+02:00 Albe Laurenz :
> >> This sounds a lot like checkpoint I/O spikes.
> >>
> >> Check with the database server log if the freezes coincide with checkpoints.
> >>
> >> You can increase checkpoint_segments when you load data to have them occur less often.
> >>
> >> If you are on Linux and you have a lot of memory, you might hit spikes because too
> >> much dirty data are cached; check /proc/sys/vm/dirty_ratio and /proc/sys/vm/dirty_background_ratio.
>
> > The checkpoint_segments is set to 64 already
> >
> > the dirty_ratio was set by default to 10, I put it down to 5
> > the dirty_background_ratio was set to 5 and I changed it to 2
> >
> > There are fewer freezes, but the insert is slower than before.
>
> That seems to indicate that my suspicion was right.
>
> I would say that your I/O system is saturated.
> Have you checked with "iostat -mNx 1"?
>
> If you really cannot drop the indexes during loading, there's probably not much more
> you can do to speed up the load.
> You can try to increase checkpoint_segments beyond 64 and see if that buys you anything.
>
> Tuning the file system write cache will not reduce the amount of I/O necessary, but it
> should reduce the spikes (which is what I thought was your problem).
>
> Yours,
> Laurenz Albe
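For illustration, here is a minimal sketch of the load-and-swap pattern described above, in plain SQL. The table, column and index names and the COPY source path are all hypothetical placeholders, not taken from the original poster's schema:

    -- Stage the data in an unindexed copy of the online table.
    CREATE TABLE items_tmp (LIKE items INCLUDING DEFAULTS);

    -- Bulk load goes fast because there are no indexes to maintain yet.
    COPY items_tmp FROM '/path/to/load_file.csv' WITH (FORMAT csv);

    -- Build the indexes once, after all rows are in.
    CREATE INDEX items_tmp_code_idx ON items_tmp (code);
    CREATE INDEX items_tmp_date_idx ON items_tmp (created_at);

    -- Swap the staging table in; DDL is transactional, so readers see either
    -- the old table or the new one, never a missing table.
    BEGIN;
    DROP TABLE items;
    ALTER TABLE items_tmp RENAME TO items;
    ALTER INDEX items_tmp_code_idx RENAME TO items_code_idx;
    ALTER INDEX items_tmp_date_idx RENAME TO items_date_idx;
    COMMIT;

Note that dropping the old table also drops anything that depends on it (views, foreign keys, grants), which would have to be recreated after the rename; the pattern above assumes the schema allows that.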
[PERFORM] Very slow planning performance on partition table
Hi, I have a table partitioned into about 60 child tables. I have found that the planning time of a simple query on the partition key is very slow:

# explain analyze select count(*) as cnt from article where pid=88 and hash_code='2ca3ff8b17b163f0212c2ba01b80a064';
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=16.55..16.56 rows=1 width=0) (actual time=0.259..0.259 rows=1 loops=1)
   ->  Append  (cost=0.00..16.55 rows=2 width=0) (actual time=0.248..0.250 rows=1 loops=1)
         ->  Seq Scan on article  (cost=0.00..0.00 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: ((pid = 88) AND (hash_code = '2ca3ff8b17b163f0212c2ba01b80a064'::bpchar))
         ->  Index Scan using article_88_hash_idx on article_88 article  (cost=0.00..16.55 rows=1 width=0) (actual time=0.246..0.248 rows=1 loops=1)
               Index Cond: (hash_code = '2ca3ff8b17b163f0212c2ba01b80a064'::bpchar)
               Filter: (pid = 88)
 Total runtime: 3.816 ms
(8 rows)

Time: 30999.986 ms

You can see from the timing output that the 'explain analyze' actually took 30 seconds, while the select itself takes only 3 ms. My partition key is article.pid and the constraints are simple, like this: CONSTRAINT article_88_pid_check CHECK (pid = 88). What's wrong, and how can I improve the planning performance?
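For readers following along, here is a minimal reconstruction of the kind of schema implied by the plan above. Only pid, hash_code, the child table article_88 and its constraint and index names come from the original message; everything else is assumed:

    -- Parent table; roughly 60 children like article_88 are assumed to exist.
    CREATE TABLE article (
        pid       integer,
        hash_code char(32)
        -- other columns omitted
    );

    CREATE TABLE article_88 (
        CONSTRAINT article_88_pid_check CHECK (pid = 88)
    ) INHERITS (article);

    CREATE INDEX article_88_hash_idx ON article_88 (hash_code);

    -- With constraint exclusion the planner prunes the other ~59 children and
    -- plans only the parent plus article_88, as shown in the EXPLAIN output above.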
Re: [PERFORM] Very slow planning performance on partition table
On Wed, Jul 23, 2014 at 6:21 AM, Rural Hunter wrote:
> What's wrong and how can I improve the planning performance?

What is constraint_exclusion set to?

--
Douglas J Hunley (doug.hun...@gmail.com)
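For anyone checking the same thing, the setting can be inspected and changed per session without a restart; a quick sketch, reusing the article table from the question:

    -- Current value; 'partition' (the default) or 'on' enables pruning of children
    -- whose CHECK constraints contradict the WHERE clause.
    SHOW constraint_exclusion;

    -- Can be tried per session before touching postgresql.conf.
    SET constraint_exclusion = partition;
    EXPLAIN SELECT count(*) FROM article WHERE pid = 88;  -- should scan only article and article_88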
Re: [PERFORM] Building multiple indexes on one table.
>From: pgsql-performance-ow...@postgresql.org [pgsql-performance-ow...@postgresql.org] on behalf of Claudio Freire [klaussfre...@gmail.com]
>Sent: Friday, 18 July 2014 01:21
>To: Chris Ruprecht
>Cc: pgsql-performance@postgresql.org
>Subject: Re: [PERFORM] Building multiple indexes on one table.
>
>On Thu, Jul 17, 2014 at 7:47 PM, Chris Ruprecht wrote:
>> Is there any way that I can build multiple indexes on one table without
>> having to scan the table multiple times? For small tables, that's probably
>> not an issue, but if I have a 500 GB table that I need to create 6 indexes
>> on, I don't want to read that table 6 times.
>> Nothing I could find in the manual other than reindex, but that's not
>> helping, since it only rebuilds indexes that are already there and I don't
>> know if that reads the table once or multiple times. If I could create
>> indexes inactive and then run reindex, which then reads the table once, I
>> would have a solution. But that doesn't seem to exist either.
>
>Just build them with separate but concurrent connections, and the
>scans will be synchronized so it will be only one.
>
>Btw, reindex rebuilds one index at a time, so what I do is issue a
>separate reindex for each index in parallel, to avoid the repeated
>scans as well.
>
>Just make sure you've got the I/O and CPU capacity for it (you'll be
>writing many indexes at once, so there is a lot of I/O).

Index creation on large tables is mostly CPU bound as long as the sort does not spill to disk.
I/O may be an issue when all your indexes are similar, e.g. all on single int4 columns;
in other cases the writes will not all take place concurrently.
To reduce I/O due to spilling, you can consider increasing maintenance_work_mem on the
connections/sessions that build the indexes.

regards,

Marc Mamin
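A sketch of what "separate but concurrent connections" looks like in practice. The table, columns, index names and the maintenance_work_mem value are all hypothetical, and the per-session memory has to be sized so that all concurrent builds together fit in RAM:

    -- session 1
    SET maintenance_work_mem = '4GB';              -- per session: N sessions need N x this much
    CREATE INDEX big_table_a_idx ON big_table (a);

    -- session 2, opened as a separate connection at roughly the same time
    SET maintenance_work_mem = '4GB';
    CREATE INDEX big_table_b_idx ON big_table (b);

    -- session 3
    SET maintenance_work_mem = '4GB';
    CREATE INDEX big_table_c_idx ON big_table (c);

    -- With synchronized sequential scans (synchronize_seqscans, on by default),
    -- the concurrent builds tend to share a single pass over the heap rather
    -- than reading the 500 GB table once per index.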
Re: [PERFORM] Building multiple indexes on one table.
On Wed, Jul 23, 2014 at 4:40 PM, Marc Mamin wrote:
> [...]
>
> Index creation on large tables is mostly CPU bound as long as the sort does not spill to disk.
> I/O may be an issue when all your indexes are similar, e.g. all on single int4 columns;
> in other cases the writes will not all take place concurrently.
> To reduce I/O due to spilling, you can consider increasing maintenance_work_mem on the
> connections/sessions that build the indexes.

Usually there will always be some spill to disk, unless you've got toy indexes.

But that I/O is all sequential; with a good readahead setting there should be no problem.

It's the final writing step that can be a bottleneck if you have a lame I/O system and try to push 5 or 6 indexes at once.
Re: [PERFORM] Building multiple indexes on one table.
Your question: Is there any way that I can build multiple indexes on one table without having to scan the table multiple times?

My answer: I don't think so. Since each index has a different indexing rule, it will analyze the same table in a different way.

I've built indexes on a 100 GB table recently and it didn't take me too much time (Amazon EC2 with 8 CPU cores / 70 GB RAM). I don't remember how much time it took, but that's a good sign, right ;-) ? Painful jobs are always remembered... (ok, the hardware helped a lot).

So my advice is: get yourself a good maintenance window and just build the indexes; remember that they will help a lot of people querying this table.

2014-07-23 16:49 GMT-03:00 Claudio Freire :
> [...]
[PERFORM] High rate of transaction failure with the Serializable Isolation Level
Hello PGSQL performance community,

[By way of introduction, we are a TPC subcommittee that is developing a benchmark with cloud-like characteristics for virtualized databases. The end-to-end benchmarking kit will be publicly available, and will run on PGSQL]

I am running into very high failure rates when I run with the Serializable isolation level. I have simplified our configuration to a single database with a constant workload, a TPC-E workload if you will, to focus on this problem. We are running with PGSQL 9.2.4, ODBC 2.2.14 (as well as 2.3.3pre, which didn't help), RHEL 6.4, and a 6-way VM with 96GB of memory on a 4-socket Westmere server.

With our 9 transactions running with a mix of SQL_TXN_READ_COMMITTED and SQL_TXN_REPEATABLE_READ, we get less than 1% deadlocks, all of which occur because each row in one table, BROKER, may be read or written by multiple transactions at the same time. So there are legitimate conflicts, which we deal with using an exponential backoff algorithm that sleeps for 10ms/30ms/90ms/etc.

When we raise the Trade-Result transaction to SQL_TXN_SERIALIZABLE, we face a storm of conflicts. Out of 37,342 Trade-Result transactions, 15,707 hit an error and have to be rolled back and retried one or more times. The total failure count (due to many transactions failing more than once) is 31,388.

What is unusual is that the majority of the failures occur in a statement that should not have any isolation conflicts. About 17K of the failures are from the statement below:

2014-07-23 11:27:15 PDT 26085 ERROR: could not serialize access due to read/write dependencies among transactions
2014-07-23 11:27:15 PDT 26085 DETAIL: Reason code: Canceled on identification as a pivot, during write.
2014-07-23 11:27:15 PDT 26085 HINT: The transaction might succeed if retried.
2014-07-23 11:27:15 PDT 26085 CONTEXT: SQL statement "update TRADE set T_COMM = comm_amount, T_DTS = trade_dts, T_ST_ID = st_completed_id, T_TRADE_PRICE = trade_price where T_ID = trade_id"
        PL/pgSQL function traderesultframe5(ident_t,value_t,character,timestamp without time zone,trade_t,s_price_t) line 15 at SQL statement

This doesn't make sense, since at any given time only one transaction might possibly be accessing the row that is being updated. There should be no conflicts if we have row-level locking/isolation.

The second most common conflict happens 7.6K times, in the statement below:

2014-07-23 11:27:23 PDT 26039 ERROR: could not serialize access due to read/write dependencies among transactions
2014-07-23 11:27:23 PDT 26039 DETAIL: Reason code: Canceled on identification as a pivot, during conflict in checking.
2014-07-23 11:27:23 PDT 26039 HINT: The transaction might succeed if retried.
2014-07-23 11:27:23 PDT 26039 CONTEXT: SQL statement "insert into SETTLEMENT (SE_T_ID, SE_CASH_TYPE, SE_CASH_DUE_DATE, SE_AMT) values (trade_id, cash_type, due_date, se_amount)"
        PL/pgSQL function traderesultframe6(ident_t,timestamp without time zone,character varying,value_t,timestamp without time zone,trade_t,smallint,s_qty_t,character) line 23 at SQL statement

I don't understand why an insert would hit a serialization conflict.

We also have 4.5K conflicts when we try to commit:

2014-07-23 11:27:23 PDT 26037 ERROR: could not serialize access due to read/write dependencies among transactions
2014-07-23 11:27:23 PDT 26037 DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt.
2014-07-23 11:27:23 PDT 26037 HINT: The transaction might succeed if retried.
2014-07-23 11:27:23 PDT 26037 STATEMENT: COMMIT

Does PGSQL raise locks to page level when we run with SQL_TXN_SERIALIZABLE? Are there any knobs I can play with to alleviate this? FWIW, the same transactions on MS SQL Server see almost no conflicts.

Thanks,
Reza
Re: [PERFORM] Very slow planning performance on partition table
It's the default value (partition):

# grep exclusion postgresql.conf
#constraint_exclusion = partition       # on, off, or partition

Btw, I'm on PostgreSQL 9.2.4.

2014/7/24 1:35, Douglas J Hunley:
> On Wed, Jul 23, 2014 at 6:21 AM, Rural Hunter wrote:
>> What's wrong and how can I improve the planning performance?
>
> What is constraint_exclusion set to?
>
> --
> Douglas J Hunley (doug.hun...@gmail.com)
Re: [PERFORM] High rate of transaction failure with the Serializable Isolation Level
On 07/24/2014 09:18 AM, Reza Taheri wrote:
> What is unusual is that the majority of the failures occur in a
> statement that should not have any isolation conflicts. About 17K of
> failures are from the statement below:

It's not just that statement that is relevant. At SERIALIZABLE isolation the entire transaction's actions must be considered, as must the conflicting transaction.

> This doesn't make sense since at any given time, only one transaction
> might possibly be accessing the row that is being updated. There should
> be no conflicts if we have row-level locking/isolation.

Is that statement run standalone, or as part of a larger transaction?

> The second most common conflict happens 7.6K times in the statement below:
...
> I don't understand why an insert would hit a serialization conflict

If the INSERTing transaction previously queried for a key that was created by a concurrent transaction, this can occur, as there is no serial execution order of the transactions that could produce the same result.

This doesn't produce exactly the same error, but it demonstrates one such case:

regress=> CREATE TABLE demo (id integer primary key, value integer);
CREATE TABLE
regress=> INSERT INTO demo(id, value) VALUES (1, 42);
INSERT 0 1

then

regress=> BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN
regress=> SELECT id FROM demo WHERE id = 2;
 id
----
(0 rows)

session1=> BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN
session2=> BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN
session1=> SELECT id FROM demo WHERE id = 2;
 id
----
(0 rows)
session2=> SELECT id FROM demo WHERE id = 3;
 id
----
(0 rows)
session1=> INSERT INTO demo VALUES (3, 43);
INSERT 0 1
session2=> INSERT INTO demo VALUES (2, 43);
INSERT 0 1
session2=> COMMIT;
COMMIT
session1=> COMMIT;
ERROR:  could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during commit attempt.
HINT:  The transaction might succeed if retried.

> Does PGSQL raise locks to page level when we run with
> SQL_TXN_SERIALIZABLE?

From the documentation
(http://www.postgresql.org/docs/current/static/transaction-iso.html):

> Predicate locks in PostgreSQL, like in most other database systems, are based
> on data actually accessed by a transaction. These will show up in the
> pg_locks system view with a mode of SIReadLock. The particular locks acquired
> during execution of a query will depend on the plan used by the query, and
> multiple finer-grained locks (e.g., tuple locks) may be combined into fewer
> coarser-grained locks (e.g., page locks) during the course of the transaction
> to prevent exhaustion of the memory used to track the locks.

... so yes, it may raise locks to page level. That doesn't mean that's necessarily what's happening here.

> Are there any knobs I can play with to alleviate this?

A lower FILLFACTOR can spread data out at the cost of wasted space.

> FWIW, the same transactions on MS SQL Server see almost no conflicts.

Many DBMSs don't detect all serialization anomalies. PostgreSQL doesn't detect all possible anomalies either, but it detects many that other systems may not.

To see what's going on, and why MS SQL Server (version?) doesn't complain, it'd be best to boil each case down to a minimal reproducible test case that can be analyzed in isolation.

PostgreSQL's isolationtester tool, in src/test/isolation, can be handy for automating this kind of conflict, and provides some useful examples of cases that are detected.
--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
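Building on the documentation excerpt quoted above, one way to see whether the SIRead locks in this workload are in fact being combined to page or relation granularity is to watch pg_locks while the benchmark runs; a sketch:

    -- Each SIReadLock row is a predicate lock; locktype shows its granularity.
    SELECT locktype, relation::regclass AS relation, page, tuple, pid
    FROM   pg_locks
    WHERE  mode = 'SIReadLock'
    ORDER  BY locktype, relation, page, tuple;

    -- Summary by granularity: many 'page' or 'relation' rows while the workload
    -- only touches individual rows suggests that lock promotion is happening.
    SELECT locktype, count(*)
    FROM   pg_locks
    WHERE  mode = 'SIReadLock'
    GROUP  BY locktype;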
Re: [PERFORM] High rate of transaction failure with the Serializable Isolation Level
On 07/24/2014 09:18 AM, Reza Taheri wrote:
> Does PGSQL raise locks to page level when we run with
> SQL_TXN_SERIALIZABLE? Are there any knobs I can play with to alleviate
> this? FWIW, the same transactions on MS SQL Server see almost no conflicts.

Also, in the documentation
(http://www.postgresql.org/docs/current/static/transaction-iso.html):

> When the system is forced to combine multiple page-level predicate locks into
> a single relation-level predicate lock because the predicate lock table is
> short of memory, an increase in the rate of serialization failures may occur.
> You can avoid this by increasing max_pred_locks_per_transaction.

... so I suggest experimenting with higher max_pred_locks_per_transaction values.

http://www.postgresql.org/docs/9.1/static/runtime-config-locks.html#GUC-MAX-PRED-LOCKS-PER-TRANSACTION

... though that should only really affect object-level locks (tables, etc.) according to the docs. I'd need to dig further to determine how to reduce or eliminate the combining of row-level locks into page-level locks and of page-level locks into object-level locks.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
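For anyone trying that suggestion, a quick way to check the current value and whether a restart is required; the 256 mentioned in the comment is only an illustrative figure, not a recommendation:

    -- Default is 64; a 'postmaster' context means a change requires a server restart.
    SELECT name, setting, context
    FROM   pg_settings
    WHERE  name = 'max_pred_locks_per_transaction';

    -- To raise it, edit postgresql.conf (e.g. max_pred_locks_per_transaction = 256)
    -- and restart; the predicate-lock table is sized from this value multiplied by
    -- roughly max_connections, so shared memory use grows accordingly.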