> On Jun 7, 2018, at 4:18 PM, Robert <dom...@logicalchaos.org> wrote:
> 
>> You can capture the execution plan of the bad statement by using 
>> auto_explain,
>> that would certainly shed more light on the problem.
> 

A different query has started showing up as the problem. The auto_explain output 
(with analyze) shows an oddity: the logged statement duration is roughly 11,000 
seconds (about 3 hours), while the plan's actual time is only about 3.5 seconds.  
I captured a ZFS snapshot as soon as the problem was noticed.
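
For reference, the plan below was captured with auto_explain roughly along these 
lines (a sketch with placeholder values, not the exact settings in use; in 
production the module would normally be loaded via shared_preload_libraries in 
postgresql.conf):

LOAD 'auto_explain';
SET auto_explain.log_analyze = on;         -- include actual times and row counts, as below
SET auto_explain.log_min_duration = '1s';  -- placeholder threshold for which statements get logged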

Jun 16 23:15:30 blackpearl postgres[9860]: [79-1] db=tapesystem,user=Administrator,app=[unknown],client=127.0.0.1 LOG:  duration: 10856644.336 ms  plan:
Jun 16 23:15:30 blackpearl postgres[9860]: [79-2]       Query Text: DELETE FROM ds3.blob WHERE EXISTS (SELECT * FROM ds3.s3_object WHERE id = ds3.blob.object_id AND (bucket_id = $1))
Jun 16 23:15:30 blackpearl postgres[9860]: [79-3]       Delete on blob  (cost=1308.79..1312.82 rows=1 width=12) (actual time=3465.919..3465.919 rows=0 loops=1)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-4]         ->  Nested Loop  (cost=1308.79..1312.82 rows=1 width=12) (actual time=50.293..2435.271 rows=300000 loops=1)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-5]               ->  Bitmap Heap Scan on s3_object  (cost=634.39..636.41 rows=1 width=22) (actual time=50.269..153.885 rows=300000 loops=1)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-6]                     Recheck Cond: (bucket_id = 'bc6e6b10-80ad-4329-9fb9-1a66d8c1505e'::uuid)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-7]                     Heap Blocks: exact=3704
Jun 16 23:15:30 blackpearl postgres[9860]: [79-8]                     ->  Bitmap Index Scan on ds3_s3_object__bucket_id  (cost=0.00..634.39 rows=1 width=0) (actual time=49.552..49.552 rows=300000 loops=1)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-9]                           Index Cond: (bucket_id = 'bc6e6b10-80ad-4329-9fb9-1a66d8c1505e'::uuid)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-10]              ->  Bitmap Heap Scan on blob  (cost=674.39..676.41 rows=1 width=22) (actual time=0.005..0.006 rows=1 loops=300000)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-11]                    Recheck Cond: (object_id = s3_object.id)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-12]                    Heap Blocks: exact=300000
Jun 16 23:15:30 blackpearl postgres[9860]: [79-13]                    ->  Bitmap Index Scan on ds3_blob__object_id  (cost=0.00..674.39 rows=1 width=0) (actual time=0.004..0.004 rows=1 loops=300000)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-14]                          Index Cond: (object_id = s3_object.id)


Doing a ZFS rollback and re-executing the query shows what is happening, although 
not to the extent seen above.  If I read this correctly, it’s the foreign-key 
constraint checks that are making the query take so long.  I don’t see any server 
configuration setting that would allow those checks to be parallelized.  Is that 
possible?

tapesystem=# PREPARE foo(uuid) AS DELETE FROM ds3.blob WHERE EXISTS (SELECT * FROM ds3.s3_object WHERE id = ds3.blob.object_id AND (bucket_id = $1)); EXPLAIN ANALYZE EXECUTE foo('bc6e6b10-80ad-4329-9fb9-1a66d8c1505e');
PREPARE
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Delete on blob  (cost=9555.07..21134.01 rows=299993 width=12) (actual time=1516.140..1516.140 rows=0 loops=1)
   ->  Hash Join  (cost=9555.07..21134.01 rows=299993 width=12) (actual time=237.816..621.306 rows=300000 loops=1)
         Hash Cond: (s3_object.id = blob.object_id)
         ->  Seq Scan on s3_object  (cost=0.00..7454.04 rows=299993 width=22) (actual time=0.027..148.503 rows=300000 loops=1)
               Filter: (bucket_id = 'bc6e6b10-80ad-4329-9fb9-1a66d8c1505e'::uuid)
               Rows Removed by Filter: 3
         ->  Hash  (cost=5805.03..5805.03 rows=300003 width=22) (actual time=235.219..235.219 rows=300003 loops=1)
               Buckets: 524288  Batches: 1  Memory Usage: 19917kB
               ->  Seq Scan on blob  (cost=0.00..5805.03 rows=300003 width=22) (actual time=0.038..114.107 rows=300003 loops=1)
 Trigger for constraint blob_tape_blob_id_fkey: time=5389.627 calls=300000
 Trigger for constraint multi_part_upload_placeholder_blob_id_fkey: time=4566.305 calls=300000
 Trigger for constraint multi_part_upload_part_blob_id_fkey: time=3597.896 calls=300000
 Trigger for constraint blob_pool_blob_id_fkey: time=4631.851 calls=300000
 Trigger for constraint blob_target_blob_id_fkey: time=4688.731 calls=300000
 Trigger for constraint suspect_blob_tape_blob_id_fkey: time=4473.787 calls=300000
 Trigger for constraint suspect_blob_pool_blob_id_fkey: time=4475.286 calls=300000
 Trigger for constraint suspect_blob_target_blob_id_fkey: time=4353.237 calls=300000
 Trigger for constraint blob_s3_target_blob_id_fkey: time=4451.687 calls=300000
 Trigger for constraint blob_azure_target_blob_id_fkey: time=4448.613 calls=300000
 Trigger for constraint suspect_blob_azure_target_blob_id_fkey: time=4353.467 calls=300000
 Execution time: 85175.581 ms
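
Those trigger lines are the row-level foreign-key checks on tables that reference 
ds3.blob; each deleted row fires one lookup per referencing constraint 
(calls=300000 each), which is where much of the execution time goes.  A catalog 
query along these lines (a sketch, not output from this system) maps the 
constraint names back to their referencing tables and definitions:

SELECT conname,
       conrelid::regclass AS referencing_table,
       pg_get_constraintdef(oid) AS fk_definition
FROM pg_constraint
WHERE contype = 'f'
  AND confrelid = 'ds3.blob'::regclass
ORDER BY conname;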

I executed a VACUUM ANALYZE and then a VACUUM FULL ANALYZE; neither changed the 
timing.  Other than removing the constraints, is there any way to address this?

Best,
Robert
