Re: [PERFORM] Index creation running now for 14 hours

2015-08-26 Thread Igor Neyman


From: Tory M Blue [mailto:tmb...@gmail.com]
Sent: Wednesday, August 26, 2015 3:26 PM
To: Igor Neyman iney...@perceptron.com
Cc: pgsql-performance pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Index creation running now for 14 hours



On Wed, Aug 26, 2015 at 12:18 PM, Igor Neyman 
iney...@perceptron.commailto:iney...@perceptron.com wrote:


From: 
pgsql-performance-ow...@postgresql.orgmailto:pgsql-performance-ow...@postgresql.org
 
[mailto:pgsql-performance-ow...@postgresql.orgmailto:pgsql-performance-ow...@postgresql.org]
 On Behalf Of Tory M Blue
Sent: Wednesday, August 26, 2015 3:14 PM
To: pgsql-performance 
pgsql-performance@postgresql.orgmailto:pgsql-performance@postgresql.org
Subject: [PERFORM] Index creation running now for 14 hours

I'm running 9.3.4 with slon 2.2.3, I did a drop add last night at 9pm, it 
started this particular tables index creation at 10:16pm and it's still 
running. 1 single core is at 100% (32 core box) and there is almost zero I/O 
activity.

CentOS 6.6


 16398 | clsdb | 25765 |   10 | postgres | slon.remoteWorkerThread_1 | 
10.13.200.232 | |   45712 | 2015-08-25 21:12:01.6
19819-07 | 2015-08-25 21:22:08.68766-07  | 2015-08-25 22:16:03.10099-07  | 
2015-08-25 22:16:03.100992-07 | f   | active | select _cls.fini
shTableAfterCopy(143); analyze torque.impressions;
I was wondering if there were underlying tools to see how it's progressing, or 
if there is anything I can do to bump the performance mid creation? Nothing I 
can do really without stopping postgres or slon, but that would start me back 
at square one.

Thanks
Tory


i

Check pg_locks in regards to the table in question.

Regards,
Igor Neyman

thanks Igor I did, but not clear what that is telling me, there are 249 rows in 
there, nothing has a table name , they are all for the PID in the analyze 
torque.impressions line that I listed above pid 25765.

Here is one for an exclusive lock, but what should I be looking for? There are 
no other processes on this box other than slon and this index creation.


 transactionid |  ||  |   ||  
93588453 | |   |  | 4/25823460 | 25765 | ExclusiveL
ock   | t   | f

Thanks
Tory

There are objects OIDs in pg_lock, not names.
Find the OID of the table that you create your index for, and search pg_locks 
for the records referencing your table.
It cannot be that all records in pg_locks are for pid running “analyze”, there 
should be records with pid running your “create index”.
What’s the size of the table you are indexing?
Also, take a look at pg_stat_activity for long running transactions/queries.

Igor Neyman



Re: [PERFORM] Index creation running now for 14 hours

2015-08-26 Thread Tory M Blue
On Wed, Aug 26, 2015 at 12:18 PM, Igor Neyman iney...@perceptron.com
wrote:





 *From:* pgsql-performance-ow...@postgresql.org [mailto:
 pgsql-performance-ow...@postgresql.org] *On Behalf Of *Tory M Blue
 *Sent:* Wednesday, August 26, 2015 3:14 PM
 *To:* pgsql-performance pgsql-performance@postgresql.org
 *Subject:* [PERFORM] Index creation running now for 14 hours



 I'm running 9.3.4 with slon 2.2.3, I did a drop add last night at 9pm, it
 started this particular tables index creation at 10:16pm and it's still
 running. 1 single core is at 100% (32 core box) and there is almost zero
 I/O activity.



 CentOS 6.6





  16398 | clsdb | 25765 |   10 | postgres | slon.remoteWorkerThread_1 |
 10.13.200.232 | |   45712 | 2015-08-25 21:12:01.6

 19819-07 | 2015-08-25 21:22:08.68766-07  | 2015-08-25 22:16:03.10099-07  |
 2015-08-25 22:16:03.100992-07 | f   | active | select _cls.fini

 shTableAfterCopy(143); analyze torque.impressions;

 I was wondering if there were underlying tools to see how it's
 progressing, or if there is anything I can do to bump the performance mid
 creation? Nothing I can do really without stopping postgres or slon, but
 that would start me back at square one.



 Thanks

 Tory





 i



 Check pg_locks in regards to the table in question.



 Regards,

 Igor Neyman


thanks Igor I did, but not clear what that is telling me, there are 249
rows in there, nothing has a table name , they are all for the PID in the
analyze torque.impressions line that I listed above pid 25765.

Here is one for an exclusive lock, but what should I be looking for? There
are no other processes on this box other than slon and this index creation.


 transactionid |  ||  |   ||
93588453 | |   |  | 4/25823460 | 25765 |
ExclusiveL

ock   | t   | f

Thanks
Tory


Re: [PERFORM] Index creation running now for 14 hours

2015-08-26 Thread Igor Neyman


From: pgsql-performance-ow...@postgresql.org 
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Tory M Blue
Sent: Wednesday, August 26, 2015 3:14 PM
To: pgsql-performance pgsql-performance@postgresql.org
Subject: [PERFORM] Index creation running now for 14 hours

I'm running 9.3.4 with slon 2.2.3, I did a drop add last night at 9pm, it 
started this particular tables index creation at 10:16pm and it's still 
running. 1 single core is at 100% (32 core box) and there is almost zero I/O 
activity.

CentOS 6.6


 16398 | clsdb | 25765 |   10 | postgres | slon.remoteWorkerThread_1 | 
10.13.200.232 | |   45712 | 2015-08-25 21:12:01.6
19819-07 | 2015-08-25 21:22:08.68766-07  | 2015-08-25 22:16:03.10099-07  | 
2015-08-25 22:16:03.100992-07 | f   | active | select _cls.fini
shTableAfterCopy(143); analyze torque.impressions;
I was wondering if there were underlying tools to see how it's progressing, or 
if there is anything I can do to bump the performance mid creation? Nothing I 
can do really without stopping postgres or slon, but that would start me back 
at square one.

Thanks
Tory


iostat: sdb is the db directory

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   3.550.000.230.000.00   96.22

Device:tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda   1.00 0.0012.00  0 24
sdb   0.00 0.00 0.00  0  0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   3.570.000.060.000.00   96.37

Device:tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda   0.00 0.00 0.00  0  0
sdb  21.50 0.00 15484.00  0  30968

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   3.720.000.060.000.00   96.22

Device:tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda   2.00 0.0020.00  0 40
sdb   0.00 0.00 0.00  0  0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   4.060.000.050.020.00   95.87

Device:tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda   4.00 0.0064.00  0128
sdb   3.50 0.00   108.00  0216

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   3.360.000.030.000.00   96.61

Device:tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda   0.00 0.00 0.00  0  0
sdb   0.00 0.00 0.00  0  0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   3.410.000.060.000.00   96.53

Device:tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda   0.00 0.00 0.00  0  0
sdb   0.00 0.00 0.00  0  0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   3.450.000.270.000.00   96.28

Device:tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda   0.00 0.00 0.00  0  0
sdb   1.00 0.0024.00  0 48

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   3.500.000.300.000.00   96.20

Device:tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda   1.50 0.00   344.00  0688
sdb   0.00 0.00 0.00  0  0

Check pg_locks in regards to the table in question.

Regards,
Igor Neyman


Re: [PERFORM] Index creation running now for 14 hours

2015-08-26 Thread Tory M Blue
On Wed, Aug 26, 2015 at 2:45 PM, Qingqing Zhou zhouqq.postg...@gmail.com
wrote:

 On Wed, Aug 26, 2015 at 1:26 PM, Tory M Blue tmb...@gmail.com wrote:
 
  Right now the 100% cpu process which is this index is only using 3.5GB
  and has been for the last 15 hours
 

 If 100% cpu, you can do 'sudo perf top' to see what the CPU is busy about.

 Regards,
 Qingqing



I appreciate the attempted help, but I know what it's doing, it's creating
indexes for the last 14+ hours. I've killed it now,  as it was about to run
my machine out of disk space, stopped it at 97% full, could not go any
longer.

I will now clean up the table a bit but will still have 500million rows
with 6 indexes on it. I will create the indexes after the data is laid down
vs during,  so it doesn't block my other table replications. I will then
fire off my index creations in parallel  for my other tables so I can
actually use the hardware the DB is sitting on.

But I guess the answer is, no real way to tell what the box is doing when
it's creating an index. Yes there was a lock, no I could not find a way to
see how it's progressing so there was no way for me to gauge when it would
be done.

Thanks
Tory

Tory


Re: [PERFORM] Index creation running now for 14 hours

2015-08-26 Thread Tomas Vondra

Hi,

On 08/26/2015 11:53 PM, Tory M Blue wrote:



On Wed, Aug 26, 2015 at 2:45 PM, Qingqing Zhou
zhouqq.postg...@gmail.com mailto:zhouqq.postg...@gmail.com wrote:

On Wed, Aug 26, 2015 at 1:26 PM, Tory M Blue tmb...@gmail.com
mailto:tmb...@gmail.com wrote:
 
  Right now the 100% cpu process which is this index is only using
3.5GB
  and has been for the last 15 hours
 

If 100% cpu, you can do 'sudo perf top' to see what the CPU is busy
about.

Regards,
Qingqing



I appreciate the attempted help, but I know what it's doing, it's
creating indexes for the last 14+ hours.


Sure, but what exactly was it doing? 'perf top' might give us a hint 
which function is consuming most of the time, for example.


 I've killed it now,  as it was

about to run my machine out of disk space, stopped it at 97% full, could
not go any longer.


Which suggests it's using a lot of temp files.

Indexes are built by reading all the necessary data from the table (just 
the columns), sorted and then an index is built using the sorted data 
(because it can be done very efficiently - much faster than when simply 
inserting the tuples into the btree index).


The fact that you ran out of disk space probably means that you don't 
have enough space for the sort (it clearly does not fit into 
maintenance_work_mem), and there's no way around that - you need enough 
disk space.



I will now clean up the table a bit but will still have 500million rows
with 6 indexes on it. I will create the indexes after the data is laid
down vs during,  so it doesn't block my other table replications. I will
then fire off my index creations in parallel  for my other tables so I
can actually use the hardware the DB is sitting on.


That's a very bad idea, because each of the index builds will require 
disk space for the sort, and you're even more likely to run out of disk 
space.




But I guess the answer is, no real way to tell what the box is doing
when it's creating an index. Yes there was a lock, no I could not find a
way to see how it's progressing so there was no way for me to gauge when
it would be done.


Had it been waiting on a lock, it wouldn't consume 100% of CPU.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Index creation running now for 14 hours

2015-08-26 Thread Tomas Vondra



On 08/26/2015 10:26 PM, Tory M Blue wrote:



the table is 90GB without indexes,  285GB with indexes and bloat,  The
row count is not actually completing.. 125Million rows over 13 months,
this table is probably close to 600million rows.


You don't need to do SELECT COUNT(*) if you only need an approximate 
number. You can look at pg_class.reltuples:


   SELECT reltuples FROM pg_class WHERE relname = 'impressions';

That should be a sufficiently accurate estimate.


The above is when it had finished copying the table and started on the
index..

Well as I said I'm running out of storage as the index is creating some
serious data on the filesystem, I'll have to kill it, try to massage the
data a bit and increase the maintenance_work mem to use some of my 256GB
of ram to try to get through this. Right now the 100% cpu process which
is this index is only using 3.5GB and has been for the last 15 hours


Please post details on the configuration (shared_buffer, work_mem, 
maintenance_work_mem and such).


BTW while the the CREATE INDEX is reporting 3.5GB, it most likely wrote 
a lot of data into on-disk chunks when sorting the data. So it's 
actually using the memory through page cache (i.e. don't increase 
maintenance_work_mem too much, you don't want to force the data to disk 
needlessly).


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Index creation running now for 14 hours

2015-08-26 Thread Peter Geoghegan
On Wed, Aug 26, 2015 at 3:36 PM, Tomas Vondra
tomas.von...@2ndquadrant.com wrote:
 But I guess the answer is, no real way to tell what the box is doing
 when it's creating an index. Yes there was a lock, no I could not find a
 way to see how it's progressing so there was no way for me to gauge when
 it would be done.


 Had it been waiting on a lock, it wouldn't consume 100% of CPU.

When things are going out to disk anyway, you're often better off with
a lower maintenance_work_mem (or work_mem). It's actually kind of
bogus than run size is dictated by these settings. Reducing it will
tend to make tuplesort's maintenance of the heap invariant
inexpensive, while not really making the merge phase more painful. I
would try 128MB of maintenance_work_mem. That could be significantly
faster. Check how the I/O load on the system compares with a higher
maintenance_work_mem setting. Often, this will make the sort less CPU
bound, which is good here.

I am currently working on making this a lot better in Postgres 9.6.
Also, note that text and numeric sorts will be much faster in 9.5.

Of course, as Tomas says, if you don't have the disk space to do the
sort, you're not going to be able to complete it. That much is very
clear.

If you're really worried about these costs, I suggest enabling
trace_sort locally, and monitoring the progress of this sort in the
logs.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Index creation running now for 14 hours

2015-08-26 Thread Tory M Blue
On Wed, Aug 26, 2015 at 12:36 PM, Igor Neyman iney...@perceptron.com
wrote:





 *From:* Tory M Blue [mailto:tmb...@gmail.com]
 *Sent:* Wednesday, August 26, 2015 3:26 PM
 *To:* Igor Neyman iney...@perceptron.com
 *Cc:* pgsql-performance pgsql-performance@postgresql.org
 *Subject:* Re: [PERFORM] Index creation running now for 14 hours







 On Wed, Aug 26, 2015 at 12:18 PM, Igor Neyman iney...@perceptron.com
 wrote:





 *From:* pgsql-performance-ow...@postgresql.org [mailto:
 pgsql-performance-ow...@postgresql.org] *On Behalf Of *Tory M Blue
 *Sent:* Wednesday, August 26, 2015 3:14 PM
 *To:* pgsql-performance pgsql-performance@postgresql.org
 *Subject:* [PERFORM] Index creation running now for 14 hours



 I'm running 9.3.4 with slon 2.2.3, I did a drop add last night at 9pm, it
 started this particular tables index creation at 10:16pm and it's still
 running. 1 single core is at 100% (32 core box) and there is almost zero
 I/O activity.



 CentOS 6.6





  16398 | clsdb | 25765 |   10 | postgres | slon.remoteWorkerThread_1 |
 10.13.200.232 | |   45712 | 2015-08-25 21:12:01.6

 19819-07 | 2015-08-25 21:22:08.68766-07  | 2015-08-25 22:16:03.10099-07  |
 2015-08-25 22:16:03.100992-07 | f   | active | select _cls.fini

 shTableAfterCopy(143); analyze torque.impressions;

 I was wondering if there were underlying tools to see how it's
 progressing, or if there is anything I can do to bump the performance mid
 creation? Nothing I can do really without stopping postgres or slon, but
 that would start me back at square one.



 Thanks

 Tory





 i



 Check pg_locks in regards to the table in question.



 Regards,

 Igor Neyman



 thanks Igor I did, but not clear what that is telling me, there are 249
 rows in there, nothing has a table name , they are all for the PID in the
 analyze torque.impressions line that I listed above pid 25765.



 Here is one for an exclusive lock, but what should I be looking for? There
 are no other processes on this box other than slon and this index creation.





  transactionid |  ||  |   ||
 93588453 | |   |  | 4/25823460 | 25765 |
 ExclusiveL

 ock   | t   | f



 Thanks

 Tory



 There are objects OIDs in pg_lock, not names.

 Find the OID of the table that you create your index for, and search
 pg_locks for the records referencing your table.

 It cannot be that all records in pg_locks are for pid running “analyze”,
 there should be records with pid running your “create index”.

 What’s the size of the table you are indexing?

 Also, take a look at pg_stat_activity for long running
 transactions/queries.



 Igor Neyman




the table is 90GB without indexes,  285GB with indexes and bloat,  The row
count is not actually completing.. 125Million rows over 13 months, this
table is probably close to 600million rows.

Yes I have long running queries, my job started last night at 9pm, it
appears 3 of the 6 indexes on this table are completed, but I'm about to
blow out my disk space, so I won't be able to give it much longer to
run Bah!

16398 | clsdb | 25765 |   10 | postgres | slon.remoteWorkerThread_1 |
10.13.200.232 | |   45712 | 2015-08-25
21:12:01.619819-07 | 2015-08-25 21:22:08.68766-07  | 2015-08-25
22:16:03.10099-07  | 2015-08-25 22:16:03.100992-07 | f   | active |
select _cls.finishTableAfterCopy(143); analyze torque.impressions;

 16398 | lsdb | 25777 |   10 | postgres | slon.local_cleanup|
10.13.200.232 | |   45718 | 2015-08-25
21:12:01.624032-07 | 2015-08-25 21:23:09.103395-07 | 2015-08-25
21:23:09.103395-07 | 2015-08-25 21:23:09.103397-07 | t   | active |
begin;lock table _cls.sl_config_lock;select _cls.cleanupEvent('10
minutes'::interval);commit;
there is nothing else in the pg_stat table other than a bunch of slony
connections, these are the only 2 items that have been running since the
index started last night at 10:16pm

2015-08-25 22:16:03 PDT CONFIG remoteWorkerThread_1: 67254824703 bytes
copied for table torque.impressions

The above is when it had finished copying the table and started on the
index..

Well as I said I'm running out of storage as the index is creating some
serious data on the filesystem, I'll have to kill it, try to massage the
data a bit and increase the maintenance_work mem to use some of my 256GB of
ram to try to get through this. Right now the 100% cpu process which is
this index is only using 3.5GB and has been for the last 15 hours


Tory


Re: [PERFORM] Index creation running now for 14 hours

2015-08-26 Thread Qingqing Zhou
On Wed, Aug 26, 2015 at 1:26 PM, Tory M Blue tmb...@gmail.com wrote:

 Right now the 100% cpu process which is this index is only using 3.5GB
 and has been for the last 15 hours


If 100% cpu, you can do 'sudo perf top' to see what the CPU is busy about.

Regards,
Qingqing


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance