Re: Help to understand Actual Rows vs Plan Rows from the query planner output

2018-09-28 Thread Arup Rakshit
Hi,

> This is a broad topic, and I can only give you some hints.

Yes when I am reading this 
https://www.postgresql.org/docs/10/static/using-explain.html 
  I found the doc 
said like “...Plan-reading is an art that requires some experience to master, 
but this section attempts to cover the basics..”

Lets see how can I get there. :) I probably need to ask a lot of questions.


Thanks for sharing those points. I made a note about them.


Thanks,

Arup Rakshit
a...@zeit.io



> On 25-Sep-2018, at 11:45 AM, Laurenz Albe  wrote:
> 
> Arup Rakshit wrote:
>> Thanks for the link. I read the documentation you linked, and part of it
>> I understood and rest went above my head. Probably I need to read it multiple
>> times to understand what is going on. I am learning how indexing works in 
>> DBMS.
>> Mostly I understood Btree so far. I am an application developer.
>> Being an application developer I think I need to know which column should be
>> indexed and what kind of index to apply in which case.
> 
> That is true; a truth that many developers unfortunately ignore.
> 
>> Most of the time, when
>> I see slow query, I saw people ask to run the explain to see the plan.
>> And explain statement shows lot of data about the query. So my questions is 
>> that:
>> Which part I should read from the plan output to figure out reason of 
>> slowness
>> or what need to be used to improve it. What are basic things I should know 
>> about it.
>> I think, I don’t need to internal math for this, am I right? The query in
>> this post is not a slow query, it is something I ran to see how index merge 
>> happens.
>> I am asking generally. Can you give me some directions on this, so that I can
>> build up some bases on this subject.
> 
> This is a broad topic, and I can only give you some hints.
> In order to understand EXPLAIN output and to improve your query, you need
> to know some of how the database is implemented.
> 
> You have to understand index scans, index only scans and bitmap index scans.
> You have to understand nested loop, hash and merge joins.
> You have to understand table statistics, dead tuples and table bloat.
> 
> The first things to look for in EXPLAIN (ANALYZE, BUFFERS) output is in which
> nodes the time is spent, and where the estimated number of rows diverges
> significantly from the actual number of rows.
> The former are the spots where there is room for improvement, and the latter
> is often the root cause of a bad plan choice.
> Also, watch out for the nodes that touch a lot of blocks.
> They can cause intermittent slow-down if the blocks are not in cache.
> 
>> Also what are the best resources to learn GIST, GIN indexes — something which
>> teaches it from the ground up? 
> 
> The documentation, and for what goes beyond that, the source.
> 
> Yours,
> Laurenz Albe
> -- 
> Cybertec | https://www.cybertec-postgresql.com
> 



Re: Help to understand Actual Rows vs Plan Rows from the query planner output

2018-09-25 Thread Laurenz Albe
Arup Rakshit wrote:
> Thanks for the link. I read the documentation you linked, and part of it
> I understood and rest went above my head. Probably I need to read it multiple
> times to understand what is going on. I am learning how indexing works in 
> DBMS.
> Mostly I understood Btree so far. I am an application developer.
> Being an application developer I think I need to know which column should be
> indexed and what kind of index to apply in which case.

That is true; a truth that many developers unfortunately ignore.

> Most of the time, when
> I see slow query, I saw people ask to run the explain to see the plan.
> And explain statement shows lot of data about the query. So my questions is 
> that:
> Which part I should read from the plan output to figure out reason of slowness
> or what need to be used to improve it. What are basic things I should know 
> about it.
> I think, I don’t need to internal math for this, am I right? The query in
> this post is not a slow query, it is something I ran to see how index merge 
> happens.
> I am asking generally. Can you give me some directions on this, so that I can
> build up some bases on this subject.

This is a broad topic, and I can only give you some hints.
In order to understand EXPLAIN output and to improve your query, you need
to know some of how the database is implemented.

You have to understand index scans, index only scans and bitmap index scans.
You have to understand nested loop, hash and merge joins.
You have to understand table statistics, dead tuples and table bloat.

The first things to look for in EXPLAIN (ANALYZE, BUFFERS) output is in which
nodes the time is spent, and where the estimated number of rows diverges
significantly from the actual number of rows.
The former are the spots where there is room for improvement, and the latter
is often the root cause of a bad plan choice.
Also, watch out for the nodes that touch a lot of blocks.
They can cause intermittent slow-down if the blocks are not in cache.

> Also what are the best resources to learn GIST, GIN indexes — something which
> teaches it from the ground up? 

The documentation, and for what goes beyond that, the source.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Help to understand Actual Rows vs Plan Rows from the query planner output

2018-09-24 Thread Arup Rakshit
Hello Thomas,

Thanks for the link. I read the documentation you linked, and part of it I 
understood and rest went above my head. Probably I need to read it multiple 
times to understand what is going on. I am learning how indexing works in DBMS. 
Mostly I understood Btree so far. I am an application developer. Being an 
application developer I think I need to know which column should be indexed and 
what kind of index to apply in which case. Most of the time, when I see slow 
query, I saw people ask to run the explain to see the plan. And explain 
statement shows lot of data about the query. So my questions is that: Which 
part I should read from the plan output to figure out reason of slowness or 
what need to be used to improve it. What are basic things I should know about 
it. I think, I don’t need to internal math for this, am I right? The query in 
this post is not a slow query, it is something I ran to see how index merge 
happens. I am asking generally. Can you give me some directions on this, so 
that I can build up some bases on this subject.


Also what are the best resources to learn GIST, GIN indexes — something which 
teaches it from the ground up? 


Thanks,

Arup Rakshit
a...@zeit.io



> On 24-Sep-2018, at 4:11 AM, Tomas Vondra  wrote:
> 
> 
> 
> On 09/23/2018 10:21 PM, Arup Rakshit wrote:
>> Hello I have some questions related to the query plan output about the
>> planned and actual rows. In the following example:
>> 
>> # explain (analyze true, costs true, format yaml) select * from users
>> where lower(city) = 'melanyfort' and lower(state) = 'ohio';
>>  QUERY PLAN 
>>
>> 
>>  - Plan:
>>   +
>>  Node Type: "Bitmap Heap Scan"  
>>   +
>>  Parallel Aware: false  
>>   +
>>  Relation Name: "users" 
>>   +
>>  Alias: "users" 
>>   +
>>  Startup Cost: 10.78
>>   +
>>  Total Cost: 14.80  
>>   +
>>  Plan Rows: 1   
>>   +
>>  Plan Width: 73 
>>   +
>>  Actual Startup Time: 0.155 
>>   +
>>  Actual Total Time: 0.155   
>>   +
>>  Actual Rows: 0 
>>   +
>>  Actual Loops: 1
>>   +
>>  Recheck Cond: "((lower((city)::text) = 'melanyfort'::text) AND
>> (lower((state)::text) = 'ohio'::text))"+
>>  Rows Removed by Index Recheck: 0   
>>   +
>>  Exact Heap Blocks: 0   
>>   +
>>  Lossy Heap Blocks: 0   
>>   +
>>  Plans: 
>>   +
>>- Node Type: "BitmapAnd" 
>>   +
>>  Parent Relationship: "Outer"   
>>   +
>>  Parallel Aware: false  
>>   +
>>  Startup Cost: 10.78
>>   +
>>  Total Cost: 10.78  
>>   +
>>  Plan Rows: 1   
>>   +
>>  Plan Width: 0  
>>   +
>>  Actual Startup Time: 0.153 
>>   +
>>  Actual Total Time: 0.153   
>>   +
>>  Actual Rows: 0

Re: Help to understand Actual Rows vs Plan Rows from the query planner output

2018-09-23 Thread Tomas Vondra



On 09/23/2018 10:21 PM, Arup Rakshit wrote:
> Hello I have some questions related to the query plan output about the
> planned and actual rows. In the following example:
> 
> # explain (analyze true, costs true, format yaml) select * from users
> where lower(city) = 'melanyfort' and lower(state) = 'ohio';
>                                                  QUERY PLAN             
>                                    
> 
>  - Plan:                                                                
>                                   +
>      Node Type: "Bitmap Heap Scan"                                      
>                                   +
>      Parallel Aware: false                                              
>                                   +
>      Relation Name: "users"                                             
>                                   +
>      Alias: "users"                                                     
>                                   +
>      Startup Cost: 10.78                                                
>                                   +
>      Total Cost: 14.80                                                  
>                                   +
>      Plan Rows: 1                                                       
>                                   +
>      Plan Width: 73                                                     
>                                   +
>      Actual Startup Time: 0.155                                         
>                                   +
>      Actual Total Time: 0.155                                           
>                                   +
>      Actual Rows: 0                                                     
>                                   +
>      Actual Loops: 1                                                    
>                                   +
>      Recheck Cond: "((lower((city)::text) = 'melanyfort'::text) AND
> (lower((state)::text) = 'ohio'::text))"+
>      Rows Removed by Index Recheck: 0                                   
>                                   +
>      Exact Heap Blocks: 0                                               
>                                   +
>      Lossy Heap Blocks: 0                                               
>                                   +
>      Plans:                                                             
>                                   +
>        - Node Type: "BitmapAnd"                                         
>                                   +
>          Parent Relationship: "Outer"                                   
>                                   +
>          Parallel Aware: false                                          
>                                   +
>          Startup Cost: 10.78                                            
>                                   +
>          Total Cost: 10.78                                              
>                                   +
>          Plan Rows: 1                                                   
>                                   +
>          Plan Width: 0                                                  
>                                   +
>          Actual Startup Time: 0.153                                     
>                                   +
>          Actual Total Time: 0.153                                       
>                                   +
>          Actual Rows: 0                                                 
>                                   +
>          Actual Loops: 1                                                
>                                   +
>          Plans:                                                         
>                                   +
>            - Node Type: "Bitmap Index Scan"                             
>                                   +
>              Parent Relationship: "Member"                              
>                                   +
>              Parallel Aware: false                                      
>                                   +
>              Index Name: "users_lower_idx"                              
>                                   +
>              Startup Cost: 0.00                                         
>                                   +
>              Total Cost: 4.66                                           
>                                   +
>              Plan Rows: 50                                              
>                                   +
>              Plan Width: 0                                              
>                                   +
>              Actual Startup Time: 0.048                                 
>                                   +
>              Actual Total Time: 0.048                   

Help to understand Actual Rows vs Plan Rows from the query planner output

2018-09-23 Thread Arup Rakshit
Hello I have some questions related to the query plan output about the planned 
and actual rows. In the following example:

# explain (analyze true, costs true, format yaml) select * from users where 
lower(city) = 'melanyfort' and lower(state) = 'ohio';
 QUERY PLAN 
   

 - Plan:
   +
 Node Type: "Bitmap Heap Scan"  
   +
 Parallel Aware: false  
   +
 Relation Name: "users" 
   +
 Alias: "users" 
   +
 Startup Cost: 10.78
   +
 Total Cost: 14.80  
   +
 Plan Rows: 1   
   +
 Plan Width: 73 
   +
 Actual Startup Time: 0.155 
   +
 Actual Total Time: 0.155   
   +
 Actual Rows: 0 
   +
 Actual Loops: 1
   +
 Recheck Cond: "((lower((city)::text) = 'melanyfort'::text) AND 
(lower((state)::text) = 'ohio'::text))"+
 Rows Removed by Index Recheck: 0   
   +
 Exact Heap Blocks: 0   
   +
 Lossy Heap Blocks: 0   
   +
 Plans: 
   +
   - Node Type: "BitmapAnd" 
   +
 Parent Relationship: "Outer"   
   +
 Parallel Aware: false  
   +
 Startup Cost: 10.78
   +
 Total Cost: 10.78  
   +
 Plan Rows: 1   
   +
 Plan Width: 0  
   +
 Actual Startup Time: 0.153 
   +
 Actual Total Time: 0.153   
   +
 Actual Rows: 0 
   +
 Actual Loops: 1
   +
 Plans: 
   +
   - Node Type: "Bitmap Index Scan" 
   +
 Parent Relationship: "Member"  
   +
 Parallel Aware: false  
   +
 Index Name: "users_lower_idx"  
   +
 Startup Cost: 0.00 
   +
 Total Cost: 4.66   
   +
 Plan Rows: 50  
   +
 Plan Width: 0  
   +
 Actual Startup Time: 0.048 
   +
 Actual Total Time: 0.048   
   +
 Actual Rows: 1 
   +