Re: [PERFORM] Index creation running now for 14 hours
From: Tory M Blue [mailto:tmb...@gmail.com] Sent: Wednesday, August 26, 2015 3:26 PM To: Igor Neyman iney...@perceptron.com Cc: pgsql-performance pgsql-performance@postgresql.org Subject: Re: [PERFORM] Index creation running now for 14 hours On Wed, Aug 26, 2015 at 12:18 PM, Igor Neyman iney...@perceptron.commailto:iney...@perceptron.com wrote: From: pgsql-performance-ow...@postgresql.orgmailto:pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.orgmailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Tory M Blue Sent: Wednesday, August 26, 2015 3:14 PM To: pgsql-performance pgsql-performance@postgresql.orgmailto:pgsql-performance@postgresql.org Subject: [PERFORM] Index creation running now for 14 hours I'm running 9.3.4 with slon 2.2.3, I did a drop add last night at 9pm, it started this particular tables index creation at 10:16pm and it's still running. 1 single core is at 100% (32 core box) and there is almost zero I/O activity. CentOS 6.6 16398 | clsdb | 25765 | 10 | postgres | slon.remoteWorkerThread_1 | 10.13.200.232 | | 45712 | 2015-08-25 21:12:01.6 19819-07 | 2015-08-25 21:22:08.68766-07 | 2015-08-25 22:16:03.10099-07 | 2015-08-25 22:16:03.100992-07 | f | active | select _cls.fini shTableAfterCopy(143); analyze torque.impressions; I was wondering if there were underlying tools to see how it's progressing, or if there is anything I can do to bump the performance mid creation? Nothing I can do really without stopping postgres or slon, but that would start me back at square one. Thanks Tory i Check pg_locks in regards to the table in question. Regards, Igor Neyman thanks Igor I did, but not clear what that is telling me, there are 249 rows in there, nothing has a table name , they are all for the PID in the analyze torque.impressions line that I listed above pid 25765. Here is one for an exclusive lock, but what should I be looking for? There are no other processes on this box other than slon and this index creation. transactionid | || | || 93588453 | | | | 4/25823460 | 25765 | ExclusiveL ock | t | f Thanks Tory There are objects OIDs in pg_lock, not names. Find the OID of the table that you create your index for, and search pg_locks for the records referencing your table. It cannot be that all records in pg_locks are for pid running “analyze”, there should be records with pid running your “create index”. What’s the size of the table you are indexing? Also, take a look at pg_stat_activity for long running transactions/queries. Igor Neyman
Re: [PERFORM] Index creation running now for 14 hours
On Wed, Aug 26, 2015 at 12:18 PM, Igor Neyman iney...@perceptron.com wrote: *From:* pgsql-performance-ow...@postgresql.org [mailto: pgsql-performance-ow...@postgresql.org] *On Behalf Of *Tory M Blue *Sent:* Wednesday, August 26, 2015 3:14 PM *To:* pgsql-performance pgsql-performance@postgresql.org *Subject:* [PERFORM] Index creation running now for 14 hours I'm running 9.3.4 with slon 2.2.3, I did a drop add last night at 9pm, it started this particular tables index creation at 10:16pm and it's still running. 1 single core is at 100% (32 core box) and there is almost zero I/O activity. CentOS 6.6 16398 | clsdb | 25765 | 10 | postgres | slon.remoteWorkerThread_1 | 10.13.200.232 | | 45712 | 2015-08-25 21:12:01.6 19819-07 | 2015-08-25 21:22:08.68766-07 | 2015-08-25 22:16:03.10099-07 | 2015-08-25 22:16:03.100992-07 | f | active | select _cls.fini shTableAfterCopy(143); analyze torque.impressions; I was wondering if there were underlying tools to see how it's progressing, or if there is anything I can do to bump the performance mid creation? Nothing I can do really without stopping postgres or slon, but that would start me back at square one. Thanks Tory i Check pg_locks in regards to the table in question. Regards, Igor Neyman thanks Igor I did, but not clear what that is telling me, there are 249 rows in there, nothing has a table name , they are all for the PID in the analyze torque.impressions line that I listed above pid 25765. Here is one for an exclusive lock, but what should I be looking for? There are no other processes on this box other than slon and this index creation. transactionid | || | || 93588453 | | | | 4/25823460 | 25765 | ExclusiveL ock | t | f Thanks Tory
Re: [PERFORM] Index creation running now for 14 hours
From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Tory M Blue Sent: Wednesday, August 26, 2015 3:14 PM To: pgsql-performance pgsql-performance@postgresql.org Subject: [PERFORM] Index creation running now for 14 hours I'm running 9.3.4 with slon 2.2.3, I did a drop add last night at 9pm, it started this particular tables index creation at 10:16pm and it's still running. 1 single core is at 100% (32 core box) and there is almost zero I/O activity. CentOS 6.6 16398 | clsdb | 25765 | 10 | postgres | slon.remoteWorkerThread_1 | 10.13.200.232 | | 45712 | 2015-08-25 21:12:01.6 19819-07 | 2015-08-25 21:22:08.68766-07 | 2015-08-25 22:16:03.10099-07 | 2015-08-25 22:16:03.100992-07 | f | active | select _cls.fini shTableAfterCopy(143); analyze torque.impressions; I was wondering if there were underlying tools to see how it's progressing, or if there is anything I can do to bump the performance mid creation? Nothing I can do really without stopping postgres or slon, but that would start me back at square one. Thanks Tory iostat: sdb is the db directory avg-cpu: %user %nice %system %iowait %steal %idle 3.550.000.230.000.00 96.22 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 1.00 0.0012.00 0 24 sdb 0.00 0.00 0.00 0 0 avg-cpu: %user %nice %system %iowait %steal %idle 3.570.000.060.000.00 96.37 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 0.00 0.00 0.00 0 0 sdb 21.50 0.00 15484.00 0 30968 avg-cpu: %user %nice %system %iowait %steal %idle 3.720.000.060.000.00 96.22 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 2.00 0.0020.00 0 40 sdb 0.00 0.00 0.00 0 0 avg-cpu: %user %nice %system %iowait %steal %idle 4.060.000.050.020.00 95.87 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 4.00 0.0064.00 0128 sdb 3.50 0.00 108.00 0216 avg-cpu: %user %nice %system %iowait %steal %idle 3.360.000.030.000.00 96.61 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 0.00 0.00 0.00 0 0 sdb 0.00 0.00 0.00 0 0 avg-cpu: %user %nice %system %iowait %steal %idle 3.410.000.060.000.00 96.53 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 0.00 0.00 0.00 0 0 sdb 0.00 0.00 0.00 0 0 avg-cpu: %user %nice %system %iowait %steal %idle 3.450.000.270.000.00 96.28 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 0.00 0.00 0.00 0 0 sdb 1.00 0.0024.00 0 48 avg-cpu: %user %nice %system %iowait %steal %idle 3.500.000.300.000.00 96.20 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 1.50 0.00 344.00 0688 sdb 0.00 0.00 0.00 0 0 Check pg_locks in regards to the table in question. Regards, Igor Neyman
Re: [PERFORM] Index creation running now for 14 hours
On Wed, Aug 26, 2015 at 2:45 PM, Qingqing Zhou zhouqq.postg...@gmail.com wrote: On Wed, Aug 26, 2015 at 1:26 PM, Tory M Blue tmb...@gmail.com wrote: Right now the 100% cpu process which is this index is only using 3.5GB and has been for the last 15 hours If 100% cpu, you can do 'sudo perf top' to see what the CPU is busy about. Regards, Qingqing I appreciate the attempted help, but I know what it's doing, it's creating indexes for the last 14+ hours. I've killed it now, as it was about to run my machine out of disk space, stopped it at 97% full, could not go any longer. I will now clean up the table a bit but will still have 500million rows with 6 indexes on it. I will create the indexes after the data is laid down vs during, so it doesn't block my other table replications. I will then fire off my index creations in parallel for my other tables so I can actually use the hardware the DB is sitting on. But I guess the answer is, no real way to tell what the box is doing when it's creating an index. Yes there was a lock, no I could not find a way to see how it's progressing so there was no way for me to gauge when it would be done. Thanks Tory Tory
Re: [PERFORM] Index creation running now for 14 hours
Hi, On 08/26/2015 11:53 PM, Tory M Blue wrote: On Wed, Aug 26, 2015 at 2:45 PM, Qingqing Zhou zhouqq.postg...@gmail.com mailto:zhouqq.postg...@gmail.com wrote: On Wed, Aug 26, 2015 at 1:26 PM, Tory M Blue tmb...@gmail.com mailto:tmb...@gmail.com wrote: Right now the 100% cpu process which is this index is only using 3.5GB and has been for the last 15 hours If 100% cpu, you can do 'sudo perf top' to see what the CPU is busy about. Regards, Qingqing I appreciate the attempted help, but I know what it's doing, it's creating indexes for the last 14+ hours. Sure, but what exactly was it doing? 'perf top' might give us a hint which function is consuming most of the time, for example. I've killed it now, as it was about to run my machine out of disk space, stopped it at 97% full, could not go any longer. Which suggests it's using a lot of temp files. Indexes are built by reading all the necessary data from the table (just the columns), sorted and then an index is built using the sorted data (because it can be done very efficiently - much faster than when simply inserting the tuples into the btree index). The fact that you ran out of disk space probably means that you don't have enough space for the sort (it clearly does not fit into maintenance_work_mem), and there's no way around that - you need enough disk space. I will now clean up the table a bit but will still have 500million rows with 6 indexes on it. I will create the indexes after the data is laid down vs during, so it doesn't block my other table replications. I will then fire off my index creations in parallel for my other tables so I can actually use the hardware the DB is sitting on. That's a very bad idea, because each of the index builds will require disk space for the sort, and you're even more likely to run out of disk space. But I guess the answer is, no real way to tell what the box is doing when it's creating an index. Yes there was a lock, no I could not find a way to see how it's progressing so there was no way for me to gauge when it would be done. Had it been waiting on a lock, it wouldn't consume 100% of CPU. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index creation running now for 14 hours
On 08/26/2015 10:26 PM, Tory M Blue wrote: the table is 90GB without indexes, 285GB with indexes and bloat, The row count is not actually completing.. 125Million rows over 13 months, this table is probably close to 600million rows. You don't need to do SELECT COUNT(*) if you only need an approximate number. You can look at pg_class.reltuples: SELECT reltuples FROM pg_class WHERE relname = 'impressions'; That should be a sufficiently accurate estimate. The above is when it had finished copying the table and started on the index.. Well as I said I'm running out of storage as the index is creating some serious data on the filesystem, I'll have to kill it, try to massage the data a bit and increase the maintenance_work mem to use some of my 256GB of ram to try to get through this. Right now the 100% cpu process which is this index is only using 3.5GB and has been for the last 15 hours Please post details on the configuration (shared_buffer, work_mem, maintenance_work_mem and such). BTW while the the CREATE INDEX is reporting 3.5GB, it most likely wrote a lot of data into on-disk chunks when sorting the data. So it's actually using the memory through page cache (i.e. don't increase maintenance_work_mem too much, you don't want to force the data to disk needlessly). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index creation running now for 14 hours
On Wed, Aug 26, 2015 at 3:36 PM, Tomas Vondra tomas.von...@2ndquadrant.com wrote: But I guess the answer is, no real way to tell what the box is doing when it's creating an index. Yes there was a lock, no I could not find a way to see how it's progressing so there was no way for me to gauge when it would be done. Had it been waiting on a lock, it wouldn't consume 100% of CPU. When things are going out to disk anyway, you're often better off with a lower maintenance_work_mem (or work_mem). It's actually kind of bogus than run size is dictated by these settings. Reducing it will tend to make tuplesort's maintenance of the heap invariant inexpensive, while not really making the merge phase more painful. I would try 128MB of maintenance_work_mem. That could be significantly faster. Check how the I/O load on the system compares with a higher maintenance_work_mem setting. Often, this will make the sort less CPU bound, which is good here. I am currently working on making this a lot better in Postgres 9.6. Also, note that text and numeric sorts will be much faster in 9.5. Of course, as Tomas says, if you don't have the disk space to do the sort, you're not going to be able to complete it. That much is very clear. If you're really worried about these costs, I suggest enabling trace_sort locally, and monitoring the progress of this sort in the logs. -- Regards, Peter Geoghegan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index creation running now for 14 hours
On Wed, Aug 26, 2015 at 12:36 PM, Igor Neyman iney...@perceptron.com wrote: *From:* Tory M Blue [mailto:tmb...@gmail.com] *Sent:* Wednesday, August 26, 2015 3:26 PM *To:* Igor Neyman iney...@perceptron.com *Cc:* pgsql-performance pgsql-performance@postgresql.org *Subject:* Re: [PERFORM] Index creation running now for 14 hours On Wed, Aug 26, 2015 at 12:18 PM, Igor Neyman iney...@perceptron.com wrote: *From:* pgsql-performance-ow...@postgresql.org [mailto: pgsql-performance-ow...@postgresql.org] *On Behalf Of *Tory M Blue *Sent:* Wednesday, August 26, 2015 3:14 PM *To:* pgsql-performance pgsql-performance@postgresql.org *Subject:* [PERFORM] Index creation running now for 14 hours I'm running 9.3.4 with slon 2.2.3, I did a drop add last night at 9pm, it started this particular tables index creation at 10:16pm and it's still running. 1 single core is at 100% (32 core box) and there is almost zero I/O activity. CentOS 6.6 16398 | clsdb | 25765 | 10 | postgres | slon.remoteWorkerThread_1 | 10.13.200.232 | | 45712 | 2015-08-25 21:12:01.6 19819-07 | 2015-08-25 21:22:08.68766-07 | 2015-08-25 22:16:03.10099-07 | 2015-08-25 22:16:03.100992-07 | f | active | select _cls.fini shTableAfterCopy(143); analyze torque.impressions; I was wondering if there were underlying tools to see how it's progressing, or if there is anything I can do to bump the performance mid creation? Nothing I can do really without stopping postgres or slon, but that would start me back at square one. Thanks Tory i Check pg_locks in regards to the table in question. Regards, Igor Neyman thanks Igor I did, but not clear what that is telling me, there are 249 rows in there, nothing has a table name , they are all for the PID in the analyze torque.impressions line that I listed above pid 25765. Here is one for an exclusive lock, but what should I be looking for? There are no other processes on this box other than slon and this index creation. transactionid | || | || 93588453 | | | | 4/25823460 | 25765 | ExclusiveL ock | t | f Thanks Tory There are objects OIDs in pg_lock, not names. Find the OID of the table that you create your index for, and search pg_locks for the records referencing your table. It cannot be that all records in pg_locks are for pid running “analyze”, there should be records with pid running your “create index”. What’s the size of the table you are indexing? Also, take a look at pg_stat_activity for long running transactions/queries. Igor Neyman the table is 90GB without indexes, 285GB with indexes and bloat, The row count is not actually completing.. 125Million rows over 13 months, this table is probably close to 600million rows. Yes I have long running queries, my job started last night at 9pm, it appears 3 of the 6 indexes on this table are completed, but I'm about to blow out my disk space, so I won't be able to give it much longer to run Bah! 16398 | clsdb | 25765 | 10 | postgres | slon.remoteWorkerThread_1 | 10.13.200.232 | | 45712 | 2015-08-25 21:12:01.619819-07 | 2015-08-25 21:22:08.68766-07 | 2015-08-25 22:16:03.10099-07 | 2015-08-25 22:16:03.100992-07 | f | active | select _cls.finishTableAfterCopy(143); analyze torque.impressions; 16398 | lsdb | 25777 | 10 | postgres | slon.local_cleanup| 10.13.200.232 | | 45718 | 2015-08-25 21:12:01.624032-07 | 2015-08-25 21:23:09.103395-07 | 2015-08-25 21:23:09.103395-07 | 2015-08-25 21:23:09.103397-07 | t | active | begin;lock table _cls.sl_config_lock;select _cls.cleanupEvent('10 minutes'::interval);commit; there is nothing else in the pg_stat table other than a bunch of slony connections, these are the only 2 items that have been running since the index started last night at 10:16pm 2015-08-25 22:16:03 PDT CONFIG remoteWorkerThread_1: 67254824703 bytes copied for table torque.impressions The above is when it had finished copying the table and started on the index.. Well as I said I'm running out of storage as the index is creating some serious data on the filesystem, I'll have to kill it, try to massage the data a bit and increase the maintenance_work mem to use some of my 256GB of ram to try to get through this. Right now the 100% cpu process which is this index is only using 3.5GB and has been for the last 15 hours Tory
Re: [PERFORM] Index creation running now for 14 hours
On Wed, Aug 26, 2015 at 1:26 PM, Tory M Blue tmb...@gmail.com wrote: Right now the 100% cpu process which is this index is only using 3.5GB and has been for the last 15 hours If 100% cpu, you can do 'sudo perf top' to see what the CPU is busy about. Regards, Qingqing -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance