Create array of data from JSONB in PG 9.5

2019-11-15 Thread Arup Rakshit
Hi,

I wrote the SQL below, which works and gives me what I want, but it doesn't work
on version 9.5. I wrote this query on PG 10. I want to change it so that I can
run it on 9.5, since production is using 9.5. I need help with this.

SELECT DISTINCT
ARRAY [jsonb_array_elements(data) ->> 'group', 
jsonb_array_elements(jsonb_array_elements(data) -> 'properties') ->> 'name']
FROM
"vessels"
WHERE
"vessels"."deleted_at" IS NULL
AND "vessels"."company_id" = '7d105acd-be5a-4225-a2db-b549105e4172';


When I run the same query on production I get this error:

ERROR:  set-valued function called in context that cannot accept a set
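
For 9.5 one common rewrite (a sketch only, not tested against the schema above) is to move the set-returning calls out of the select list and into the FROM clause with LATERAL, which both 9.5 and 10 accept:

SELECT DISTINCT
    ARRAY[elem.value ->> 'group', prop.value ->> 'name']
FROM
    "vessels",
    LATERAL jsonb_array_elements(vessels.data) AS elem,
    LATERAL jsonb_array_elements(elem.value -> 'properties') AS prop
WHERE
    "vessels"."deleted_at" IS NULL
    AND "vessels"."company_id" = '7d105acd-be5a-4225-a2db-b549105e4172';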


Thanks,

Arup Rakshit
a...@zeit.io







Re: Extend inner join to fetch not yet connected rows also

2019-09-22 Thread Arup Rakshit


> On 22-Sep-2019, at 6:55 PM, Jan Kohnert  
> wrote:
> 
> Hi Arup,
> 
> Am Sonntag, 22. September 2019, 14:48:20 CEST schrieb Arup Rakshit:
>> Hi Jan,
>> 
>>> On 22-Sep-2019, at 5:38 PM, Jan Kohnert 
>>> wrote:
>>> maybe something like
>>> 
>>> select
>>> 
>>>   c.id,
>>>   c.name,
>>>   case when cs.user_id = 8 then true else false end as has
>>> 
>>> from craftsmanships c
>>> left join contractor_skills cs
>>> 
>>>   on cs.craftsmanship_id = c.craftmanship_id;
>> 
>> But this query fetched duplicate data:
> 
> yeah, that's possible, since I don't exactly know your data model. If only 
> the 
> values above are required, you could simply use distinct:

When someone adds a craftsmanship to their skill set, the contractor_skills
table holds that relationship. I don't think distinct is the correct tool, as
it will eliminate correct data. users and craftsmanships have an m:n
relationship via the join table contractor_skills.

SELECT
craftsmanships.id,
craftsmanships.name,
CASE WHEN contractor_skills.user_id IS NULL THEN
FALSE
ELSE
TRUE
END AS has
FROM
"craftsmanships"
LEFT JOIN "contractor_skills" ON "contractor_skills"."craftsmanship_id" 
= "craftsmanships"."id"
LEFT JOIN "users" ON "users"."id" = "contractor_skills"."user_id"
WHERE (contractor_skills.user_id = 8
OR contractor_skills.user_id IS NULL)
ORDER BY
"craftsmanships"."id”;

This gives the correct result. I am not sure whether this query still has a bug in it.

id | name  | has
+---+-
  1 | paint | t
  2 | drywall   | t
  3 | bathrooms | f
  4 | kitchens  | f
  5 | flooring  | f
  6 | basements | f
  7 | carpentry | f
  8 | decks (displayed as decks and patios) | f
  9 | windows (windows and doors)   | f
 10 | countertops   | f
 11 | landscaping   | f
 12 | electrical| f
 13 | plumbing  | f
 14 | handyman  | f
(14 rows)
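
A sketch of an alternative that avoids both the duplicates and the extra joins (assuming contractor_skills holds at most one row per user and craftsmanship) is to move the user filter into the join condition itself:

SELECT
    c.id,
    c.name,
    (cs.user_id IS NOT NULL) AS has
FROM
    craftsmanships c
    LEFT JOIN contractor_skills cs ON cs.craftsmanship_id = c.id
        AND cs.user_id = 8
ORDER BY
    c.id;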




> 
> select distinct 
>c.id,
>c.name,
>case when cs.user_id = 8 then true else false end as has 
> from craftsmanships c
> left join contractor_skills cs
>on cs.craftsmanship_id = c.id
> order by
>   c.id;
> 
> -- 
> MfG Jan
> 
> 
> 
> 





Re: Extend inner join to fetch not yet connected rows also

2019-09-22 Thread Arup Rakshit
Hi Jan,

> On 22-Sep-2019, at 5:38 PM, Jan Kohnert  
> wrote:
> 
> Hey,
> 
> Am Sonntag, 22. September 2019, 13:21:46 CEST schrieb Arup Rakshit:
>> I have craftsmanships table which has (id, name) and users table (id, email,
>> ..). When a user has some craftsmanships, they are stored inside the
>> contractor_skills(user_id, craftsmanship_id, id) table.
> 
> [...]
> 
>> But I want to list all craftsmanships and has column should have `t` when
>> user_id #8 has it, else `f`. How can I extend this   query?
> 
> maybe something like
> 
> select
>c.id,
>c.name,
>case when cs.user_id = 8 then true else false end as has
> from craftsmanships c
> left join contractor_skills cs
>on cs.craftsmanship_id = c.craftmanship_id;
> 
> -- 
> MfG Jan
> 


But this query fetched duplicate data:

 id | name  | has
+---+-
  2 | drywall   | t
  1 | paint | t
  1 | paint | f
 11 | landscaping   | f
 12 | electrical| f
 10 | countertops   | f
 13 | plumbing  | f
  5 | flooring  | f
  8 | decks (displayed as decks and patios) | f
  6 | basements | f
  4 | kitchens  | f
  3 | bathrooms | f
 14 | handyman  | f
  9 | windows (windows and doors)   | f
  7 | carpentry | f
(15 rows)



> 





Re: Extend inner join to fetch not yet connected rows also

2019-09-22 Thread Arup Rakshit
Hi Jan,

I was close and came up with:

SELECT
craftsmanships.id,
craftsmanships.name,
CASE WHEN contractor_skills.user_id IS NULL THEN
FALSE
ELSE
TRUE
END AS has
FROM
"craftsmanships"
LEFT JOIN "contractor_skills" ON "contractor_skills"."craftsmanship_id" 
= "craftsmanships"."id"
LEFT JOIN "users" ON "users"."id" = "contractor_skills"."user_id"
WHERE (contractor_skills.user_id = 3
OR contractor_skills.user_id IS NULL)
ORDER BY
"craftsmanships"."id”;

But after I read yours I found mine is doing a lot of unnecessary joins. Thank
you.


Thanks,

Arup Rakshit
a...@zeit.io



> On 22-Sep-2019, at 5:38 PM, Jan Kohnert  
> wrote:
> 
> Hey,
> 
> Am Sonntag, 22. September 2019, 13:21:46 CEST schrieb Arup Rakshit:
>> I have craftsmanships table which has (id, name) and users table (id, email,
>> ..). When a user has some craftsmanships, they are stored inside the
>> contractor_skills(user_id, craftsmanship_id, id) table.
> 
> [...]
> 
>> But I want to list all craftsmanships and has column should have `t` when
>> user_id #8 has it, else `f`. How can I extend this   query?
> 
> maybe something like
> 
> select
>c.id,
>c.name,
>case when cs.user_id = 8 then true else false end as has
> from craftsmanships c
> left join contractor_skills cs
>on cs.craftsmanship_id = c.craftmanship_id;
> 
> -- 
> MfG Jan
> 
> 
> 
> 





Extend inner join to fetch not yet connected rows also

2019-09-22 Thread Arup Rakshit
I have craftsmanships table which has (id, name) and users table (id, email, 
..). When a user has some craftsmanships, they are stored inside the 
contractor_skills(user_id, craftsmanship_id, id) table.

What I want here is to list all the available craftsmanships with id, name
and a has column. Right now I can get only those craftsmanships that a
specific user has:

SELECT
craftsmanships.id,
craftsmanships.name,
TRUE as has
FROM
"craftsmanships"
INNER JOIN "contractor_skills" ON 
"contractor_skills"."craftsmanship_id" = "craftsmanships"."id"
INNER JOIN "users" ON "users"."id" = "contractor_skills"."user_id"
WHERE (contractor_skills.user_id = 8)
ORDER BY
"craftsmanships".”id"

—— 
 id |  name   | has
+-+-
  1 | paint   | t
  2 | drywall | t
(2 rows)

But I want to list all craftsmanships, and the has column should have `t` when 
user_id #8 has it, else `f`. How can I extend this query?



Thanks,

Arup Rakshit
a...@zeit.io







Re: duplicate key value violates unique constraint "chinese_price_infos_pkey"

2019-05-06 Thread Arup Rakshit
Hi,

Thanks for your reply. It is automatic; my app doesn't create the ID, it
delegates that to the DB. It is a Ruby on Rails app, and we use PostgreSQL.

docking_dev=# \d chinese_price_infos;
 Table "public.chinese_price_infos"
   Column|Type | Collation | Nullable | 
Default 
-+-+---+--+-
 id  | integer |   | not null | 
nextval('chinese_price_infos_id_seq'::regclass)
 created_at  | timestamp without time zone |   |  | 
 updated_at  | timestamp without time zone |   |  | 
 item_code   | character varying(255)  |   |  | 
 description | character varying(255)  |   |  | 
 unit| character varying(255)  |   |  | 
 price_cents | integer |   |  | 
 uuid| uuid|   |  | 
uuid_generate_v4()
 company_id  | uuid|   |  | 
Indexes:
"chinese_price_infos_pkey" PRIMARY KEY, btree (id)
"index_chinese_price_infos_on_company_id" btree (company_id)




Thanks,

Arup Rakshit
a...@zeit.io



> On 06-May-2019, at 4:38 PM, Ray O'Donnell  wrote:
> 
> On 06/05/2019 12:05, Arup Rakshit wrote:
>> Every time I try to insert I get the error:
>> docking_dev=# INSERT INTO "chinese_price_infos" ("item_code",
>> "price_cents", "unit", "description", "company_id", "created_at",
>> "updated_at") VALUES ('01GS10001', 6000, 'Lift', 'Shore Crane
>> Rental', '9ae3f8b8-8f3f-491c-918a-efd8f5100a5e', '2019-05-06
>> 10:49:03.894725', '2019-05-06 10:49:03.894725'); ERROR:  duplicate
>> key value violates unique constraint "chinese_price_infos_pkey" DETAIL:  Key 
>> (id)=(71165) already exists. docking_dev=# INSERT INTO
>> "chinese_price_infos" ("item_code", "price_cents", "unit",
>> "description", "company_id", "created_at", "updated_at") VALUES
>> ('01GS10001', 6000, 'Lift', 'Shore Crane Rental',
>> '9ae3f8b8-8f3f-491c-918a-efd8f5100a5e', '2019-05-06 10:49:03.894725',
>> '2019-05-06 10:49:03.894725'); ERROR:  duplicate key value violates
>> unique constraint "chinese_price_infos_pkey" DETAIL:  Key
>> (id)=(71166) already exists.
>> Then I found:
>> docking_dev=# SELECT MAX(id) FROM chinese_price_infos; max  128520 
>> (1 row)
>> docking_dev=# SELECT nextval('chinese_price_infos_id_seq'); nextval 
>> - 71164 (1 row)
>> Not sure how it is out of sync. How can I fix this permanently. I ran
>> vacuum analyze verbose; still same error.
> 
> 
> You can fix it by using setval() to set the sequence manually to something 
> higher than the highest current id value in the table. However, it sounds as 
> if something in the application code may be causing problems For example, 
> is something generating id values without reference to the sequence?
> 
> Ray.
> 
> 
> 
> -- 
> Raymond O'Donnell // Galway // Ireland
> r...@rodonnell.ie
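
A sketch of the setval() fix Ray mentions (run once, after confirming nothing inserts ids without using the sequence):

SELECT setval('chinese_price_infos_id_seq',
              (SELECT max(id) FROM chinese_price_infos));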





duplicate key value violates unique constraint "chinese_price_infos_pkey"

2019-05-06 Thread Arup Rakshit
Every time I try to insert I get the error:

docking_dev=# INSERT INTO "chinese_price_infos" ("item_code", "price_cents", 
"unit", "description", "company_id", "created_at", "updated_at") VALUES 
('01GS10001', 6000, 'Lift', 'Shore Crane Rental', 
'9ae3f8b8-8f3f-491c-918a-efd8f5100a5e', '2019-05-06 10:49:03.894725', 
'2019-05-06 10:49:03.894725');
ERROR:  duplicate key value violates unique constraint 
"chinese_price_infos_pkey"
DETAIL:  Key (id)=(71165) already exists.
docking_dev=# INSERT INTO "chinese_price_infos" ("item_code", "price_cents", 
"unit", "description", "company_id", "created_at", "updated_at") VALUES 
('01GS10001', 6000, 'Lift', 'Shore Crane Rental', 
'9ae3f8b8-8f3f-491c-918a-efd8f5100a5e', '2019-05-06 10:49:03.894725', 
'2019-05-06 10:49:03.894725');
ERROR:  duplicate key value violates unique constraint 
"chinese_price_infos_pkey"
DETAIL:  Key (id)=(71166) already exists.

Then I found:

docking_dev=# SELECT MAX(id) FROM chinese_price_infos;
  max   
--------
 128520
(1 row)

docking_dev=# SELECT nextval('chinese_price_infos_id_seq');
 nextval 
---------
   71164
(1 row)

Not sure how it got out of sync. How can I fix this permanently? I ran vacuum 
analyze verbose; still the same error.

Thanks,

Arup Rakshit
a...@zeit.io







Re: SQL help to build a result with custom aliased bool column

2019-04-08 Thread Arup Rakshit
Hi,

Thanks for showing different ways to achieve the goal. So what would be the
optimal way to solve this? I have a composite index on the company_id and
feature_id columns of the project_features table.

I do Ruby on Rails development, where table names are always plural by
convention. The tables I created above are in a different schema, just to ask
the question with sample data and test the query output. So they are a little
inconsistent, yes, you are right.

Thanks,

Arup Rakshit
a...@zeit.io



> On 08-Apr-2019, at 4:36 PM, mariusz  wrote:
> 
> bear in mind that with a large number of companies and proper index on
> company_features the optimizer could limit company_features as
> necessary, while your examples read everything anyway and mangle output
> to get proper result (with dups and bugs, but also not optimal)





Re: SQL help to build a result with custom aliased bool column

2019-04-08 Thread Arup Rakshit
I am still having some bugs: I am getting duplicates in the result set.

psql (11.0, server 10.5)
Type "help" for help.

aruprakshit=# select * from features;
 id | name 
+--
  1 | f1
  2 | f2
  3 | f3
  4 | f4
(4 rows)

aruprakshit=# select * from company;
 id | name 
+--
  1 | c1
  2 | c2
(2 rows)

aruprakshit=# select * from company_features;
 id | company_id | feature_id 
++
  1 |  1 |  1
  2 |  1 |  2
  3 |  2 |  3
  4 |  1 |  3
(4 rows)

aruprakshit=# SELECT
aruprakshit-# features.id,
aruprakshit-# features.name,
aruprakshit-# coalesce(company_features.company_id = 1, false) AS active
aruprakshit-# FROM
aruprakshit-# features
aruprakshit-# LEFT JOIN company_features ON features.id = 
company_features.feature_id;
 id | name | active 
+--+
  1 | f1   | t
  2 | f2   | t
  3 | f3   | f
  3 | f3   | t
  4 | f4   | f
(5 rows)


I should get only 4 rows in the result, as the total number of features is 4.
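
A sketch that avoids both the duplicate row and any NULLs (assuming each (company_id, feature_id) pair appears at most once in company_features) is to move the company filter into the join condition:

SELECT
    features.id,
    features.name,
    (company_features.feature_id IS NOT NULL) AS active
FROM
    features
    LEFT JOIN company_features ON company_features.feature_id = features.id
        AND company_features.company_id = 1
ORDER BY
    features.id;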

Thanks,

Arup Rakshit
a...@zeit.io



> On 08-Apr-2019, at 3:28 PM, Szymon Lipiński  wrote:
> 
> Hey,
> you could just use
> 
> SELECT
> features.id,
> features.name,
> company_features.company_id = 1 as active
> 
> regards,
> Szymon
> 
> On Mon, 8 Apr 2019 at 09:55, Arup Rakshit  wrote:
> I have 2 tables Company and Feature. They are connected via a join table 
> called CompanyFeature. I want to build a result set where it will have id, 
> name and a custom boolean column. This boolean column is there to say if the 
> feature is present for the company or not.
> 
> Company table:
> 
> | id | name | 
> ||--| 
> | 1  | c1   | 
> | 2  | c2   | 
> | 3  | c3   | 
> 
> Feature table:
> 
> | id | name | 
> ||--| 
> | 1  | f1   | 
> | 2  | f2   | 
> | 3  | f3   | 
> 
> Company Feature table:
> 
> | id | feature_id | company_id | 
> |||| 
> | 1  | 2  | 1  | 
> | 2  | 1  | 1  | 
> | 3  | 3  | 2  | 
> 
> The result should look like for company `c1`:
> 
> | id | name | active | 
> ||--|| 
> | 1  | f1   | t  | 
> | 2  | f2   | t  | 
> | 3  | f3   | f  | 
> 
> I tried something like:
> 
> SELECT
> features.id,
> features.name,
> CASE WHEN company_features.company_id = 1 THEN
> TRUE
> ELSE
> FALSE
> END AS active
> FROM
> features
> LEFT JOIN company_features ON company_features.feature_id = 
> features.id
> 
> It works. But is there any better way to achieve this?
> 
> 
> Thanks,
> 
> Arup Rakshit
> a...@zeit.io
> 
> 
> 
> 
> 





Re: SQL help to build a result with custom aliased bool column

2019-04-08 Thread Arup Rakshit
I knew there would be a more compact way. Thanks for showing it. One thing I
would still like to handle is making sure the column contains only true/false.
Right now it sometimes shows NULL. How can I fix this?

id|name|active|
--||--|
 1|f1  |true  |
 2|f2  |true  |
 3|f3  |false |
 4|f4  |  |


Thanks,

Arup Rakshit
a...@zeit.io



> On 08-Apr-2019, at 3:28 PM, Szymon Lipiński  wrote:
> 
> Hey,
> you could just use
> 
> SELECT
> features.id,
> features.name,
> company_features.company_id = 1 as active
> 
> regards,
> Szymon
> 
> On Mon, 8 Apr 2019 at 09:55, Arup Rakshit  wrote:
> I have 2 tables Company and Feature. They are connected via a join table 
> called CompanyFeature. I want to build a result set where it will have id, 
> name and a custom boolean column. This boolean column is there to say if the 
> feature is present for the company or not.
> 
> Company table:
> 
> | id | name | 
> ||--| 
> | 1  | c1   | 
> | 2  | c2   | 
> | 3  | c3   | 
> 
> Feature table:
> 
> | id | name | 
> ||--| 
> | 1  | f1   | 
> | 2  | f2   | 
> | 3  | f3   | 
> 
> Company Feature table:
> 
> | id | feature_id | company_id | 
> |||| 
> | 1  | 2  | 1  | 
> | 2  | 1  | 1  | 
> | 3  | 3  | 2  | 
> 
> The result should look like for company `c1`:
> 
> | id | name | active | 
> ||--|| 
> | 1  | f1   | t  | 
> | 2  | f2   | t  | 
> | 3  | f3   | f  | 
> 
> I tried something like:
> 
> SELECT
> features.id,
> features.name,
> CASE WHEN company_features.company_id = 1 THEN
> TRUE
> ELSE
> FALSE
> END AS active
> FROM
>     features
> LEFT JOIN company_features ON company_features.feature_id = 
> features.id
> 
> It works. But is there any better way to achieve this?
> 
> 
> Thanks,
> 
> Arup Rakshit
> a...@zeit.io
> 
> 
> 
> 
> 





SQL help to build a result with custom aliased bool column

2019-04-08 Thread Arup Rakshit
I have 2 tables Company and Feature. They are connected via a join table called 
CompanyFeature. I want to build a result set where it will have id, name and a 
custom boolean column. This boolean column is there to say if the feature is 
present for the company or not.

Company table:

| id | name | 
||--| 
| 1  | c1   | 
| 2  | c2   | 
| 3  | c3   | 

Feature table:

| id | name | 
||--| 
| 1  | f1   | 
| 2  | f2   | 
| 3  | f3   | 

Company Feature table:

| id | feature_id | company_id | 
|||| 
| 1  | 2  | 1  | 
| 2  | 1  | 1  | 
| 3  | 3  | 2  | 

The result should look like for company `c1`:

| id | name | active | 
||--|| 
| 1  | f1   | t  | 
| 2  | f2   | t  | 
| 3  | f3   | f  | 

I tried something like:

SELECT
features.id,
features.name,
CASE WHEN company_features.company_id = 1 THEN
TRUE
ELSE
FALSE
END AS active
FROM
features
LEFT JOIN company_features ON company_features.feature_id = features.id

It works. But is there any better way to achieve this?


Thanks,

Arup Rakshit
a...@zeit.io







ERROR: type "myrowtype" does not exist

2018-10-26 Thread Arup Rakshit
I was reading about the json_populate_record function in the official doc
https://www.postgresql.org/docs/10/static/functions-json.html#FUNCTIONS-JSON-PROCESSING-TABLE.
The examples mentioned in the doc are giving me errors:

psql (11.0, server 10.5)
Type "help" for help.

aruprakshit=# select * from json_populate_record(null::extype, '{"x":"foo", 
"y":43, "q": true}');
ERROR:  type "extype" does not exist
LINE 1: select * from json_populate_record(null::extype, '{"x":"foo"...
 ^
aruprakshit=# select * from json_populate_record(null::myrowtype, '{"a": 1, 
"b": ["2", "a b"], "c": {"d": 4, "e": "a b c"}}');
ERROR:  type "myrowtype" does not exist
LINE 1: select * from json_populate_record(null::myrowtype, '{"a": 1…

What do null::extype and null::myrowtype mean here?
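
They are just NULL values cast to a composite (row) type; the doc examples assume such a type already exists in the database. A sketch (the column list below is an assumption chosen to match the JSON keys, not the exact type used in the docs):

CREATE TYPE myrowtype AS (a int, b text[], c jsonb);

SELECT *
FROM json_populate_record(null::myrowtype,
                          '{"a": 1, "b": ["2", "a b"], "c": {"d": 4, "e": "a b c"}}');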


Thanks,

Arup Rakshit
a...@zeit.io





Re: Why my query not using index to sort?

2018-09-30 Thread Arup Rakshit
   

    'c4f55044-3f4e-439e-a586-8b6978c7b406',
    '36825da8-2f58-4a62-bdb4-2b91cbe18299',
    'cca256cf-b415-472c-8b9f-a8432d02c580');
Time: 14.377 ms
docking_dev=# 


Thanks,

Arup Rakshit
a...@zeit.io



> On 29-Sep-2018, at 2:50 AM, Adrian Klaver  wrote:
> 
> On 9/28/18 12:14 PM, Arup Rakshit wrote:
>> Forgot to mention in my previous email, it was a quick send click. Sorry for 
>> that.
>> In the gist you need to see all the line with Vessel Load(.. . I load the 
>> page multiple times to catch the different times, so you will the line 
>> multiple times there in the log file.
> 
> Do you know what Vessel Load () is actually measuring?
> 
> To me it looks like it is covering both the query(including ROR overhead) and 
> the HTTP request/response cycle.
> 
> Also have you looked at:
> 
> https://guides.rubyonrails.org/debugging_rails_applications.html#impact-of-logs-on-performance
> 
>> Thanks,
>> Arup Rakshit
>> a...@zeit.io <mailto:a...@zeit.io>
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com



Re: How to improve sql query to achieve the better plan

2018-09-30 Thread Arup Rakshit
I just added it as you said, but I am getting the same plan.


Sort  (cost=62842.16..62846.91 rows=1897 width=35) (actual 
time=1845.831..1845.950 rows=1229 loops=1)
  Sort Key: projects.id
  Sort Method: quicksort  Memory: 145kB
  ->  HashAggregate  (cost=62710.42..62738.88 rows=1897 width=35) (actual 
time=1844.178..1845.060 rows=1229 loops=1)
Group Key: projects.id
->  Hash Right Join  (cost=159.68..45382.09 rows=364807 width=35) 
(actual time=1.534..618.717 rows=365784 loops=1)
  Hash Cond: (workitems.project_id = projects.id)
  Filter: (workitems.deleted_at IS NULL)
  Rows Removed by Filter: 257457
  ->  Seq Scan on workitems  (cost=0.00..36653.75 rows=623175 
width=43) (actual time=0.047..213.842 rows=623175 loops=1)
  ->  Hash  (cost=135.97..135.97 rows=1897 width=16) (actual 
time=1.478..1.478 rows=1897 loops=1)
Buckets: 2048  Batches: 1  Memory Usage: 105kB
->  Seq Scan on projects  (cost=0.00..135.97 rows=1897 
width=16) (actual time=0.006..0.914 rows=1897 loops=1)
Planning time: 0.498 ms
Execution time: 1846.100 ms

——

Indexes:
"workitems_pkey" PRIMARY KEY, btree (id)
"index_workitems_on_company_id" btree (company_id)
"index_workitems_on_deleted_at" btree (deleted_at)
"index_workitems_on_parent_workitem_id" btree (parent_workitem_id)
"index_workitems_on_project_id" btree (project_id)
"index_workitems_on_standard_workitem_id" btree (standard_workitem_id)
"index_workitems_on_workitem_category_id" btree (workitem_category_id)
"patrial_index_workitems_200_1" btree (project_id) WHERE deleted_at IS NULL


Thanks,

Arup Rakshit
a...@zeit.io



> On 30-Sep-2018, at 10:15 PM, Pavel Stehule  wrote:
> 
> CREATE INDEX ON workitems(project_id) WHERE deleted_at is null



How to improve sql query to achieve the better plan

2018-09-30 Thread Arup Rakshit
I have the below query which is taking 1873 ms. How can I improve this?

explain analyze select
sum(coalesce(price_cents, 0)::bigint * coalesce(quantity, 0) * (1 - 
coalesce(workitems.discount, 0)/ 100)) as total_budget_cents,
sum(coalesce(price_cents, 0)::bigint * coalesce(quantity, 0) * (1 - 
coalesce(workitems.discount, 0)/ 100) + coalesce(additional_cost_cents, 0) - 
coalesce(cost_reduction_cents, 0)) as final_budget_cents,
projects.id as project_id
from
projects
left join workitems on
workitems.project_id = projects.id
where
workitems.deleted_at is null
group by
projects.id
order by
project_id asc


And explain output is:

Sort  (cost=62851.33..62856.07 rows=1897 width=35) (actual 
time=1872.867..1873.003 rows=1229 loops=1)
  Sort Key: projects.id
  Sort Method: quicksort  Memory: 145kB
  ->  HashAggregate  (cost=62719.59..62748.04 rows=1897 width=35) (actual 
time=1871.281..1872.104 rows=1229 loops=1)
Group Key: projects.id
->  Hash Right Join  (cost=159.68..45386.32 rows=364911 width=35) 
(actual time=2.226..637.936 rows=365784 loops=1)
  Hash Cond: (workitems.project_id = projects.id)
  Filter: (workitems.deleted_at IS NULL)
  Rows Removed by Filter: 257457
  ->  Seq Scan on workitems  (cost=0.00..36655.53 rows=623353 
width=43) (actual time=0.020..220.215 rows=623175 loops=1)
  ->  Hash  (cost=135.97..135.97 rows=1897 width=16) (actual 
time=2.177..2.177 rows=1897 loops=1)
Buckets: 2048  Batches: 1  Memory Usage: 105kB
->  Seq Scan on projects  (cost=0.00..135.97 rows=1897 
width=16) (actual time=0.013..1.451 rows=1897 loops=1)
Planning time: 2.775 ms
Execution time: 1873.308 ms


Projects table has the index:

Indexes:
"projects_pkey" PRIMARY KEY, btree (id)
"index_projects_on_company_id" btree (company_id)
"index_projects_on_deleted_at" btree (deleted_at)
"index_projects_on_inspector_id" btree (inspector_id)
"index_projects_on_managed_offline_by_user_id" btree 
(managed_offline_by_user_id)
"index_projects_on_project_status_id" btree (project_status_id)
"index_projects_on_shipyard_id" btree (shipyard_id)
"index_projects_on_vessel_id" btree (vessel_id)

Workitems table has the index:

Indexes:
"workitems_pkey" PRIMARY KEY, btree (id)
"index_workitems_on_company_id" btree (company_id)
"index_workitems_on_deleted_at" btree (deleted_at)
"index_workitems_on_parent_workitem_id" btree (parent_workitem_id)
"index_workitems_on_project_id" btree (project_id)
"index_workitems_on_standard_workitem_id" btree (standard_workitem_id)
"index_workitems_on_workitem_category_id" btree (workitem_category_id)


Thanks,

Arup Rakshit
a...@zeit.io





Re: Why my query not using index to sort?

2018-09-29 Thread Arup Rakshit
When I keep the sort off, I see it executing much faster.

set enable_sort = off;  
explain analyze select
 *
from
"standard_workitems"
where
"standard_workitems"."deleted_at" is null
and "standard_workitems"."company_id" = 
'6fed40b7-fdd7-4efb-a163-c2b42e6486ae'
order by
item_code asc;

Index Scan using standard_workitems_partial_index_idx_1_1 on standard_workitems 
 (cost=0.42..5802.04 rows=1697 width=763) (actual time=0.018..1.076 rows=2071 
loops=1)
  Index Cond: (company_id = '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'::uuid)
Planning time: 0.191 ms
Execution time: 1.210 ms


I have no idea why PG thinks the in-memory sort will be cheaper.
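
One experiment sometimes tried here (a sketch only; whether it is appropriate depends on the storage) is to lower random_page_cost so that index access looks cheaper to the planner, e.g. for SSD-backed or mostly cached data:

SET random_page_cost = 1.1;  -- assumption: data on SSD or mostly in cache

explain analyze select
 *
from
    "standard_workitems"
where
    "standard_workitems"."deleted_at" is null
    and "standard_workitems"."company_id" = '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'
order by
    item_code asc;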


Thanks,

Arup Rakshit
a...@zeit.io



> On 29-Sep-2018, at 9:40 PM, Arup Rakshit  wrote:
> 
> Hello Tom,
> 
> I have another query, where I am expecting the sort from index, but it is in 
> memory and takes lot of time.
> 
> Query:
> 
> explain analyze select
>*
> from
>   "standard_workitems"
> where
>   "standard_workitems"."deleted_at" is null
>   and "standard_workitems"."company_id" = 
> '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'
> order by
>   standard_workitems.item_code asc;
> 
> Explain plan:
> 
> Sort  (cost=3454.03..3458.18 rows=1660 width=810) (actual time=20.302..20.502 
> rows=2071 loops=1)
>   Sort Key: item_code
>   Sort Method: quicksort  Memory: 800kB
>   ->  Bitmap Heap Scan on standard_workitems  (cost=57.29..3365.25 rows=1660 
> width=810) (actual time=0.297..0.781 rows=2071 loops=1)
> Recheck Cond: ((company_id = 
> '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'::uuid) AND (deleted_at IS NULL))
> Heap Blocks: exact=139
> ->  Bitmap Index Scan on standard_workitems_partial_index_idx_1_1  
> (cost=0.00..56.87 rows=1660 width=0) (actual time=0.272..0.272 rows=2071 
> loops=1)
>   Index Cond: (company_id = 
> '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'::uuid)
> Planning time: 0.199 ms
> Execution time: 20.688 ms
> 
> Indexes I have:
> 
> Indexes:
> "standard_workitems_pkey" PRIMARY KEY, btree (id)
> "index_standard_workitems_on_company_id" btree (company_id)
> "index_standard_workitems_on_deleted_at" btree (deleted_at)
> "index_standard_workitems_on_item_code" btree (item_code)
> "index_standard_workitems_on_workitem_category_id" btree 
> (workitem_category_id)
> "standard_workitems_partial_index_idx_1_1" btree (company_id, item_code) 
> WHERE deleted_at IS NULL
> 
> 
> 
> Thanks,
> 
> Arup Rakshit
> a...@zeit.io <mailto:a...@zeit.io>
> 
> 
> 
>> On 28-Sep-2018, at 7:07 PM, Tom Lane > <mailto:t...@sss.pgh.pa.us>> wrote:
>> 
>> Arup Rakshit mailto:a...@zeit.io>> writes:
>>> My query is not using name index to sort the result.
>> 
>> Given the rowcounts here, I think the planner is making the right choice.
>> Sorting 70-some rows with a Sort node is probably cheaper than doing
>> random disk I/O to get them in sorted order.  With more rows involved,
>> it might make the other choice.
>> 
>> As a testing measure (don't do it in production!), you could set
>> enable_sort = off, which will force the planner to pick a non-Sort
>> plan if possible.  Then you could see whether that's actually faster
>> or slower, and by how much.
>> 
>>  regards, tom lane
> 



Re: Why my query not using index to sort?

2018-09-29 Thread Arup Rakshit
Hello Tom,

I have another query where I am expecting the sort to come from an index, but
it is done in memory and takes a lot of time.

Query:

explain analyze select
 *
from
"standard_workitems"
where
"standard_workitems"."deleted_at" is null
and "standard_workitems"."company_id" = 
'6fed40b7-fdd7-4efb-a163-c2b42e6486ae'
order by
standard_workitems.item_code asc;

Explain plan:

Sort  (cost=3454.03..3458.18 rows=1660 width=810) (actual time=20.302..20.502 
rows=2071 loops=1)
  Sort Key: item_code
  Sort Method: quicksort  Memory: 800kB
  ->  Bitmap Heap Scan on standard_workitems  (cost=57.29..3365.25 rows=1660 
width=810) (actual time=0.297..0.781 rows=2071 loops=1)
Recheck Cond: ((company_id = 
'6fed40b7-fdd7-4efb-a163-c2b42e6486ae'::uuid) AND (deleted_at IS NULL))
Heap Blocks: exact=139
->  Bitmap Index Scan on standard_workitems_partial_index_idx_1_1  
(cost=0.00..56.87 rows=1660 width=0) (actual time=0.272..0.272 rows=2071 
loops=1)
  Index Cond: (company_id = 
'6fed40b7-fdd7-4efb-a163-c2b42e6486ae'::uuid)
Planning time: 0.199 ms
Execution time: 20.688 ms

Indexes I have:

Indexes:
"standard_workitems_pkey" PRIMARY KEY, btree (id)
"index_standard_workitems_on_company_id" btree (company_id)
"index_standard_workitems_on_deleted_at" btree (deleted_at)
"index_standard_workitems_on_item_code" btree (item_code)
"index_standard_workitems_on_workitem_category_id" btree 
(workitem_category_id)
"standard_workitems_partial_index_idx_1_1" btree (company_id, item_code) 
WHERE deleted_at IS NULL



Thanks,

Arup Rakshit
a...@zeit.io



> On 28-Sep-2018, at 7:07 PM, Tom Lane  wrote:
> 
> Arup Rakshit  writes:
>> My query is not using name index to sort the result.
> 
> Given the rowcounts here, I think the planner is making the right choice.
> Sorting 70-some rows with a Sort node is probably cheaper than doing
> random disk I/O to get them in sorted order.  With more rows involved,
> it might make the other choice.
> 
> As a testing measure (don't do it in production!), you could set
> enable_sort = off, which will force the planner to pick a non-Sort
> plan if possible.  Then you could see whether that's actually faster
> or slower, and by how much.
> 
>   regards, tom lane



Re: Why my query not using index to sort?

2018-09-29 Thread Arup Rakshit
Hello Adrian,


I used to consider this time as the sql execution time approximately. I’ll do 
some research and get back to you. Thanks for mentioning it.


Thanks,

Arup Rakshit
a...@zeit.io



> On 29-Sep-2018, at 2:50 AM, Adrian Klaver  wrote:
> 
> On 9/28/18 12:14 PM, Arup Rakshit wrote:
>> Forgot to mention in my previous email, it was a quick send click. Sorry for 
>> that.
>> In the gist you need to see all the line with Vessel Load(.. . I load the 
>> page multiple times to catch the different times, so you will the line 
>> multiple times there in the log file.
> 
> Do you know what Vessel Load () is actually measuring?
> 
> To me it looks like it is covering both the query(including ROR overhead) and 
> the HTTP request/response cycle.
> 
> Also have you looked at:
> 
> https://guides.rubyonrails.org/debugging_rails_applications.html#impact-of-logs-on-performance
> 
>> Thanks,
>> Arup Rakshit
>> a...@zeit.io <mailto:a...@zeit.io>
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com



Re: Why my query not using index to sort?

2018-09-28 Thread Arup Rakshit
Forgot to mention this in my previous email; it was a too-quick click on send.
Sorry for that.

In the gist you need to look at all the lines with Vessel Load(... . I loaded
the page multiple times to catch the different timings, so you will see that
line multiple times in the log file.

Thanks,

Arup Rakshit
a...@zeit.io



> On 29-Sep-2018, at 12:40 AM, Arup Rakshit  wrote:
> 
> Hi Adrian,
> 
> > The application is?
> 
> It is Ruby On Rails application
> 
> > The log settings are?
> > Where is it being printed?
> 
> The default application log, the production.log file.
> 
> > What is being printed?
> 
> Check the gist: 
> https://gist.github.com/aruprakshit/a6bd7ca221c9a13cd583e0827aa24ad6 
> <https://gist.github.com/aruprakshit/a6bd7ca221c9a13cd583e0827aa24ad6>
> 
> 
> 
> Thanks,
> 
> Arup Rakshit
> a...@zeit.io <mailto:a...@zeit.io>
> 
> 
> 
>> On 29-Sep-2018, at 12:28 AM, Adrian Klaver > <mailto:adrian.kla...@aklaver.com>> wrote:
>> 
>> On 9/28/18 11:51 AM, Arup Rakshit wrote:
>>> Yes, I have shown the explain plan output. But in my application log the 
>>> sql query prints 1.7 to 1.9 ms.
>> 
>> So you have added another layer to the process.
>> 
>> The application is?
>> 
>> The log settings are?
>> 
>> What is being printed?
>> 
>> Where is it being printed?
>> 
>>> Thanks,
>>> Arup Rakshit
>>> a...@zeit.io <mailto:a...@zeit.io> <mailto:a...@zeit.io 
>>> <mailto:a...@zeit.io>>
>> 
>> 
>> 
>> -- 
>> Adrian Klaver
>> adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>
> 



Re: Why my query not using index to sort?

2018-09-28 Thread Arup Rakshit
Hi Adrian,

> The application is?

It is Ruby On Rails application

> The log settings are?
> Where is it being printed?

The default application log, the production.log file.

> What is being printed?

Check the gist:
https://gist.github.com/aruprakshit/a6bd7ca221c9a13cd583e0827aa24ad6



Thanks,

Arup Rakshit
a...@zeit.io



> On 29-Sep-2018, at 12:28 AM, Adrian Klaver  wrote:
> 
> On 9/28/18 11:51 AM, Arup Rakshit wrote:
>> Yes, I have shown the explain plan output. But in my application log the sql 
>> query prints 1.7 to 1.9 ms.
> 
> So you have added another layer to the process.
> 
> The application is?
> 
> The log settings are?
> 
> What is being printed?
> 
> Where is it being printed?
> 
>> Thanks,
>> Arup Rakshit
>> a...@zeit.io <mailto:a...@zeit.io>
> 
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com



Re: Why my query not using index to sort?

2018-09-28 Thread Arup Rakshit

Yes, I have shown the explain plan output. But in my application log the sql 
query prints 1.7 to 1.9 ms.


Thanks,

Arup Rakshit
a...@zeit.io



> On 29-Sep-2018, at 12:17 AM, Adrian Klaver  wrote:
> 
> On 9/28/18 11:32 AM, Arup Rakshit wrote:
>> Yes. But I thought I could improve it, so that it can be finished in 0.xx 
>> ms. It takes now between 1.7 to 1.9 ms in production.
> 
> That is not what you showed in your first post:
> 
> Planning time: 0.178 ms
> Execution time: 0.527 ms
> 
> 
> 
>> Thanks,
>> Arup Rakshit
>> a...@zeit.io <mailto:a...@zeit.io>
>>> On 28-Sep-2018, at 9:46 PM, Andreas Kretschmer >> <mailto:andr...@a-kretschmer.de>> wrote:
>>> 
>>> 
>>> 
>>> Am 28.09.2018 um 18:03 schrieb Arup Rakshit:
>>>> Also I meatn the execution time is less than I had before with enable_sort 
>>>> = off.
>>> 
>>> yeah, but not that much. different plan.
>>> 
>>> 
>>> Regards, Andreas
>>> 
>>> -- 
>>> 2ndQuadrant - The PostgreSQL Support Company.
>>> www.2ndQuadrant.com <http://www.2ndQuadrant.com>
>>> 
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com



Re: Why my query not using index to sort?

2018-09-28 Thread Arup Rakshit
Yes. But I thought I could improve it so that it finishes in 0.xx ms. It now
takes between 1.7 and 1.9 ms in production.


Thanks,

Arup Rakshit
a...@zeit.io



> On 28-Sep-2018, at 9:46 PM, Andreas Kretschmer  
> wrote:
> 
> 
> 
> Am 28.09.2018 um 18:03 schrieb Arup Rakshit:
>> Also I meatn the execution time is less than I had before with enable_sort = 
>> off. 
> 
> yeah, but not that much. different plan.
> 
> 
> Regards, Andreas
> 
> -- 
> 2ndQuadrant - The PostgreSQL Support Company.
> www.2ndQuadrant.com
> 



Re: Why my query not using index to sort?

2018-09-28 Thread Arup Rakshit
Also, I meant that the execution time is less than what I had before with
enable_sort = off. Why 6 rows was a side question. :)

Thanks,

Arup Rakshit
a...@zeit.io



> On 28-Sep-2018, at 9:29 PM, Andreas Kretschmer  
> wrote:
> 
> 
> 
> Am 28.09.2018 um 16:49 schrieb Arup Rakshit:
>>  Planning time: 1.867 ms
>>  Execution time: 0.252 ms
>> (6 rows)
>> 
>> Why it is showing *6 rows*? Also it seems less than what I had before:
>> 
> 
> the explain-output are 6 rows ;-)
> 
> 
> Regards, Andreas
> 
> -- 
> 2ndQuadrant - The PostgreSQL Support Company.
> www.2ndQuadrant.com
> 
> 



Re: Why my query not using index to sort?

2018-09-28 Thread Arup Rakshit
Oh I see. That makes sense, I was reading too much into that line.. :)

Thanks,

Arup Rakshit
a...@zeit.io



> On 28-Sep-2018, at 9:29 PM, Andreas Kretschmer  
> wrote:
> 
> 
> 
> Am 28.09.2018 um 16:49 schrieb Arup Rakshit:
>>  Planning time: 1.867 ms
>>  Execution time: 0.252 ms
>> (6 rows)
>> 
>> Why it is showing *6 rows*? Also it seems less than what I had before:
>> 
> 
> the explain-output are 6 rows ;-)
> 
> 
> Regards, Andreas
> 
> -- 
> 2ndQuadrant - The PostgreSQL Support Company.
> www.2ndQuadrant.com
> 
> 



Re: Why my query not using index to sort?

2018-09-28 Thread Arup Rakshit
Hello Tom,

Here is the plan with `enable_sort = off`.

# set enable_sort = off;

   SET
# explain analyze select "vessels" .* from "vessels" where 
"vessels"."deleted_at" is null and "vessels"."company_id" = 
'86529964-6e9b-4bfa-ba9e-62bd24eaa954' order by "vessels"."name" ASC;
 QUERY PLAN
-
 Index Scan using inspector_tool_idx20_1 on vessels  (cost=0.27..40.76 rows=76 
width=107) (actual time=0.047..0.120 rows=77 loops=1)
   Index Cond: (company_id = '86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
   Filter: (deleted_at IS NULL)
   Rows Removed by Filter: 4
 Planning time: 1.867 ms
 Execution time: 0.252 ms
(6 rows)

Why is it showing *6 rows*? Also the time seems less than what I had before:

# explain analyze select "vessels" .* from "vessels" where 
"vessels"."deleted_at" is null and "vessels"."company_id" = 
'86529964-6e9b-4bfa-ba9e-62bd24eaa954' order by "vessels"."name" ASC;
  QUERY PLAN
---
 Sort  (cost=17.29..17.48 rows=76 width=107) (actual time=0.789..0.796 rows=77 
loops=1)
   Sort Key: name
   Sort Method: quicksort  Memory: 38kB
   ->  Bitmap Heap Scan on vessels  (cost=4.90..14.91 rows=76 width=107) 
(actual time=0.090..0.122 rows=77 loops=1)
 Recheck Cond: (company_id = 
'86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
 Filter: (deleted_at IS NULL)
 Rows Removed by Filter: 4
 Heap Blocks: exact=3
 ->  Bitmap Index Scan on inspector_tool_idx20_1  (cost=0.00..4.88 
rows=81 width=0) (actual time=0.059..0.059 rows=81 loops=1)
   Index Cond: (company_id = 
'86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
 Planning time: 1.743 ms
 Execution time: 0.954 ms
(12 rows)



Thanks,

Arup Rakshit
a...@zeit.io



> On 28-Sep-2018, at 7:07 PM, Tom Lane  wrote:
> 
> Arup Rakshit  writes:
>> My query is not using name index to sort the result.
> 
> Given the rowcounts here, I think the planner is making the right choice.
> Sorting 70-some rows with a Sort node is probably cheaper than doing
> random disk I/O to get them in sorted order.  With more rows involved,
> it might make the other choice.
> 
> As a testing measure (don't do it in production!), you could set
> enable_sort = off, which will force the planner to pick a non-Sort
> plan if possible.  Then you could see whether that's actually faster
> or slower, and by how much.
> 
>   regards, tom lane



Why my query not using index to sort?

2018-09-28 Thread Arup Rakshit
My query is not using the name index to sort the result.

explain analyze select
"vessels" .*
from
"vessels"
where
"vessels"."deleted_at" is null
and "vessels"."company_id" = '86529964-6e9b-4bfa-ba9e-62bd24eaa954'
order by
"vessels"."name" ASC;

I have below index in my vessels table:

Indexes:
"vessels_pkey" PRIMARY KEY, btree (id)
"inspector_tool_idx20_1" btree (company_id)
"inspector_tool_idx20_2" btree (name)


and following is my plan output.

Sort  (cost=17.29..17.48 rows=76 width=107) (actual time=0.468..0.473 rows=77 
loops=1)
  Sort Key: name
  Sort Method: quicksort  Memory: 38kB
  ->  Bitmap Heap Scan on vessels  (cost=4.90..14.91 rows=76 width=107) (actual 
time=0.036..0.059 rows=77 loops=1)
Recheck Cond: (company_id = 
'86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 4
Heap Blocks: exact=3
->  Bitmap Index Scan on inspector_tool_idx20_1  (cost=0.00..4.88 
rows=81 width=0) (actual time=0.022..0.023 rows=81 loops=1)
  Index Cond: (company_id = 
'86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
Planning time: 0.178 ms
Execution time: 0.527 ms



After that I changed my index as follows, and got almost the same plan output.


Indexes:
"vessels_pkey" PRIMARY KEY, btree (id)
"inspector_tool_idx20_1" btree (company_id, name)

Sort  (cost=17.29..17.48 rows=76 width=107) (actual time=0.475..0.480 rows=77 
loops=1)
  Sort Key: name
  Sort Method: quicksort  Memory: 38kB
  ->  Bitmap Heap Scan on vessels  (cost=4.90..14.91 rows=76 width=107) (actual 
time=0.038..0.058 rows=77 loops=1)
Recheck Cond: (company_id = 
'86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 4
Heap Blocks: exact=3
->  Bitmap Index Scan on inspector_tool_idx20_1  (cost=0.00..4.88 
rows=81 width=0) (actual time=0.025..0.025 rows=81 loops=1)
  Index Cond: (company_id = 
'86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
Planning time: 0.168 ms
Execution time: 0.533 ms

Is there any way I can improve the sorting so that it can use the index?
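
A sketch of an index shaped to match both the filter and the sort (the index name here is made up; and as noted in the replies above, the planner may still prefer an in-memory quicksort for only ~77 rows):

CREATE INDEX vessels_company_id_name_partial_idx
    ON vessels (company_id, name)
    WHERE deleted_at IS NULL;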


Thanks,

Arup Rakshit
a...@zeit.io





Re: Help to understand Actual Rows vs Plan Rows from the query planner output

2018-09-28 Thread Arup Rakshit
Hi,

> This is a broad topic, and I can only give you some hints.

Yes, while reading
https://www.postgresql.org/docs/10/static/using-explain.html I found the doc
saying: “...Plan-reading is an art that requires some experience to master,
but this section attempts to cover the basics...”

Let's see how I can get there. :) I probably need to ask a lot of questions.


Thanks for sharing those points. I made a note about them.


Thanks,

Arup Rakshit
a...@zeit.io



> On 25-Sep-2018, at 11:45 AM, Laurenz Albe  wrote:
> 
> Arup Rakshit wrote:
>> Thanks for the link. I read the documentation you linked, and part of it
>> I understood and rest went above my head. Probably I need to read it multiple
>> times to understand what is going on. I am learning how indexing works in 
>> DBMS.
>> Mostly I understood Btree so far. I am an application developer.
>> Being an application developer I think I need to know which column should be
>> indexed and what kind of index to apply in which case.
> 
> That is true; a truth that many developers unfortunately ignore.
> 
>> Most of the time, when
>> I see slow query, I saw people ask to run the explain to see the plan.
>> And explain statement shows lot of data about the query. So my questions is 
>> that:
>> Which part I should read from the plan output to figure out reason of 
>> slowness
>> or what need to be used to improve it. What are basic things I should know 
>> about it.
>> I think, I don’t need to internal math for this, am I right? The query in
>> this post is not a slow query, it is something I ran to see how index merge 
>> happens.
>> I am asking generally. Can you give me some directions on this, so that I can
>> build up some bases on this subject.
> 
> This is a broad topic, and I can only give you some hints.
> In order to understand EXPLAIN output and to improve your query, you need
> to know some of how the database is implemented.
> 
> You have to understand index scans, index only scans and bitmap index scans.
> You have to understand nested loop, hash and merge joins.
> You have to understand table statistics, dead tuples and table bloat.
> 
> The first things to look for in EXPLAIN (ANALYZE, BUFFERS) output is in which
> nodes the time is spent, and where the estimated number of rows diverges
> significantly from the actual number of rows.
> The former are the spots where there is room for improvement, and the latter
> is often the root cause of a bad plan choice.
> Also, watch out for the nodes that touch a lot of blocks.
> They can cause intermittent slow-down if the blocks are not in cache.
> 
>> Also what are the best resources to learn GIST, GIN indexes — something which
>> teaches it from the ground up? 
> 
> The documentation, and for what goes beyond that, the source.
> 
> Yours,
> Laurenz Albe
> -- 
> Cybertec | https://www.cybertec-postgresql.com
> 
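
A sketch of the invocation Laurenz refers to; the first things to look at are the estimated vs actual row counts and the buffer counts per node:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM users
WHERE lower(city) = 'melanyfort' AND lower(state) = 'ohio';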



Re: Help to understand Actual Rows vs Plan Rows from the query planner output

2018-09-24 Thread Arup Rakshit
Hello Thomas,

Thanks for the link. I read the documentation you linked; part of it I
understood and the rest went over my head. I probably need to read it multiple
times to understand what is going on. I am learning how indexing works in a
DBMS; so far I have mostly understood B-tree. I am an application developer,
and as such I think I need to know which columns should be indexed and what
kind of index to apply in which case. Most of the time, when I see a slow
query, people ask to run EXPLAIN to see the plan, and the explain statement
shows a lot of data about the query. So my question is: which parts of the
plan output should I read to figure out the reason for slowness, or what needs
to be done to improve it? What are the basic things I should know about it? I
think I don't need the internal math for this, am I right? The query in this
post is not a slow query; it is something I ran to see how an index merge
happens. I am asking generally. Can you give me some direction on this, so
that I can build up a base in this subject?


Also, what are the best resources to learn GiST and GIN indexes — something
which teaches them from the ground up?


Thanks,

Arup Rakshit
a...@zeit.io



> On 24-Sep-2018, at 4:11 AM, Tomas Vondra  wrote:
> 
> 
> 
> On 09/23/2018 10:21 PM, Arup Rakshit wrote:
>> Hello I have some questions related to the query plan output about the
>> planned and actual rows. In the following example:
>> 
>> # explain (analyze true, costs true, format yaml) select * from users
>> where lower(city) = 'melanyfort' and lower(state) = 'ohio';
>>  QUERY PLAN 
>>
>> 
>>  - Plan:
>>   +
>>  Node Type: "Bitmap Heap Scan"  
>>   +
>>  Parallel Aware: false  
>>   +
>>  Relation Name: "users" 
>>   +
>>  Alias: "users" 
>>   +
>>  Startup Cost: 10.78
>>   +
>>  Total Cost: 14.80  
>>   +
>>  Plan Rows: 1   
>>   +
>>  Plan Width: 73 
>>   +
>>  Actual Startup Time: 0.155 
>>   +
>>  Actual Total Time: 0.155   
>>   +
>>  Actual Rows: 0 
>>   +
>>  Actual Loops: 1
>>   +
>>  Recheck Cond: "((lower((city)::text) = 'melanyfort'::text) AND
>> (lower((state)::text) = 'ohio'::text))"+
>>  Rows Removed by Index Recheck: 0   
>>   +
>>  Exact Heap Blocks: 0   
>>   +
>>  Lossy Heap Blocks: 0   
>>   +
>>  Plans: 
>>   +
>>- Node Type: "BitmapAnd" 
>>   +
>>  Parent Relationship: "Outer"   
>>   +
>>  Parallel Aware: false  
>>   +
>>  Startup Cost: 10.78
>>   +
>>  Total Cost: 10.78  
>>   +
>>  Plan Rows: 1   
>&

Help to understand Actual Rows vs Plan Rows from the query planner output

2018-09-23 Thread Arup Rakshit
 
   +
 Actual Rows: 1 
   +
 Actual Loops: 1
   +
 Index Cond: "(lower((city)::text) = 'melanyfort'::text)"   
   +
   - Node Type: "Bitmap Index Scan" 
   +
 Parent Relationship: "Member"  
   +
 Parallel Aware: false  
   +
 Index Name: "lower_state_users_idx"
   +
 Startup Cost: 0.00 
   +
 Total Cost: 5.87   
   +
 Plan Rows: 211 
   +
 Plan Width: 0  
   +
 Actual Startup Time: 0.102 
   +
 Actual Total Time: 0.102   
   +
 Actual Rows: 211   
   +
 Actual Loops: 1
   +
 Index Cond: "(lower((state)::text) = 'ohio'::text)"
   +
   Planning Time: 0.260 
   +
   Triggers:
   +
   Execution Time: 0.249
(1 row)

aruprakshit=# 


--

In the first "Bitmap Index Scan" node, on "users_lower_idx", I see the plan
rows are 50, but the actual rows it got are 1. In the second "Bitmap Index
Scan" node, on "lower_state_users_idx", I see the plan rows are 211 and the
actual rows are 211; both are the same. Based on what conditions does the
planner estimate the planned row count versus the actual row count?


In the "BitmapAnd" node I again see actual rows 1, so why does the final plan
node, i.e. the "Bitmap Heap Scan", again estimate rows 1? How does it count
these? What does the Loops count tell us about the query?


Thanks,

Arup Rakshit
a...@zeit.io





Bitmap Heap Scan and Bitmap Index Scan

2018-09-15 Thread Arup Rakshit
Here is an explain plan of a very simple query:

aruprakshit=# explain analyze select first_name, last_name from users where 
lower(state) = 'colorado';
QUERY PLAN
--
 Bitmap Heap Scan on users  (cost=5.86..161.40 rows=203 width=13) (actual 
time=0.134..0.444 rows=203 loops=1)
   Recheck Cond: (lower((state)::text) = 'colorado'::text)
   Heap Blocks: exact=106
   ->  Bitmap Index Scan on lower_state_users_idx  (cost=0.00..5.81 rows=203 
width=0) (actual time=0.098..0.098 rows=203 loops=1)
 Index Cond: (lower((state)::text) = 'colorado'::text)
 Planning time: 0.263 ms
 Execution time: 0.517 ms
(7 rows)

I read
https://www.postgresql.org/message-id/12553.1135634231%40sss.pgh.pa.us and
https://www.postgresql.org/message-id/464F3C5D.2000700%40enterprisedb.com to
understand what this bitmap heap scan and bitmap index scan are. But there are
still some questions in my mind which I have not been able to figure out yet.

Does a bitmap index scan apply when a normal index scan is costly?
Does the bitmap always store the page numbers of matching tuples instead of 
just the tuples?
What is Heap Blocks: exact=106?
Why is the cost higher in the heap scan than in the index scan?

Thanks,

Arup Rakshit
a...@zeit.io






Re: Can I add Index to make a query faster which involves joins on unnest ?

2018-09-14 Thread Arup Rakshit
Hello Rob,


So how do I improve this query speed?


Thanks,

Arup Rakshit
a...@zeit.io



> On 14-Sep-2018, at 12:27 AM, Rob Sargent  wrote:
> 
> 
> 
>> On Sep 13, 2018, at 12:17 PM, Arup Rakshit > <mailto:a...@zeit.io>> wrote:
>> 
>> The below query basically gives the result by maintaining the order of the 
>> sizes in the list.
>> 
>> explain analyze select
>> "price_levels"."name",
>> "price_levels"."size"
>> from
>> "price_levels"
>> join unnest(array['M',
>> 'L',
>> 'XL',
>> '2XL',
>> '3XL',
>> '4XL',
>> '5XL',
>> '6XL',
>> 'S']) with ordinality t(size,
>> ord)
>> using (size)
>> order by
>> t.size
>> 
>> 
>> I have a Btree index on the size column.
>> 
>> Explain output is:
>> 
>> Merge Join  (cost=4.61..5165.38 rows=6 width=46) (actual 
>> time=0.157..57.872 rows=6 loops=1)
>>   Merge Cond: ((price_levels.size)::text = t.size)
>>   ->  Index Scan using price_levels_size_idx on price_levels  
>> (cost=0.29..4111.05 rows=6 width=14) (actual time=0.044..25.941 
>> rows=6 loops=1)
>>   ->  Sort  (cost=4.32..4.57 rows=100 width=32) (actual time=0.108..3.946 
>> rows=53289 loops=1)
>> Sort Key: t.size
>> Sort Method: quicksort  Memory: 25kB
>> ->  Function Scan on unnest t  (cost=0.00..1.00 rows=100 width=32) 
>> (actual time=0.030..0.033 rows=9 loops=1)
>> Planning time: 0.667 ms
>> Execution time: 62.846 ms
>> 
>> 
>> 
>> Thanks,
>> 
>> Arup Rakshit
>> a...@zeit.io <mailto:a...@zeit.io>
>> 
>> 
> There are not value of size fit it to be a worthwhile key.
>> 
> 



Can I add Index to make a query faster which involves joins on unnest ?

2018-09-13 Thread Arup Rakshit
The below query basically gives the result by maintaining the order of the 
sizes in the list.

explain analyze select
"price_levels"."name",
"price_levels"."size"
from
"price_levels"
join unnest(array['M',
'L',
'XL',
'2XL',
'3XL',
'4XL',
'5XL',
'6XL',
'S']) with ordinality t(size,
ord)
using (size)
order by
t.size


I have a Btree index on the size column.

Explain output is:

Merge Join  (cost=4.61..5165.38 rows=6 width=46) (actual time=0.157..57.872 
rows=6 loops=1)
  Merge Cond: ((price_levels.size)::text = t.size)
  ->  Index Scan using price_levels_size_idx on price_levels  
(cost=0.29..4111.05 rows=6 width=14) (actual time=0.044..25.941 rows=6 
loops=1)
  ->  Sort  (cost=4.32..4.57 rows=100 width=32) (actual time=0.108..3.946 
rows=53289 loops=1)
Sort Key: t.size
Sort Method: quicksort  Memory: 25kB
->  Function Scan on unnest t  (cost=0.00..1.00 rows=100 width=32) 
(actual time=0.030..0.033 rows=9 loops=1)
Planning time: 0.667 ms
Execution time: 62.846 ms
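
Incidentally, if the intent is to keep the rows in the order the sizes appear in the array, the ORDER BY probably wants the ordinality column rather than the size text, which sorts alphabetically. A sketch:

select
    "price_levels"."name",
    "price_levels"."size"
from
    "price_levels"
join unnest(array['M','L','XL','2XL','3XL','4XL','5XL','6XL','S'])
    with ordinality t(size, ord)
    using (size)
order by
    t.ord;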



Thanks,

Arup Rakshit
a...@zeit.io





Re: Select rows when all ids of its children records match

2018-09-12 Thread Arup Rakshit
Hi Rob,

I figured it out, thanks. It is giving correct data.

> Aren't you looking for cte.tags = array[1,2]?

posts_tags_cte has tags column, so I am using it.

Thanks,

Arup Rakshit
a...@zeit.io



> On 12-Sep-2018, at 9:47 PM, Rob Sargent  wrote:
> 
> 
> 
> On 09/12/2018 10:08 AM, Arup Rakshit wrote:
>> I tried :
>> 
>> WITH posts_tags_cte AS (
>> SELECT post_id, array_agg(tag_id) as tags
>> FROM posts_tags
>> WHERE tag_id in (1, 2)
>> GROUP BY post_id
>> )
>> SELECT posts.id <http://posts.id/> FROM posts_tags_cte JOIN posts ON 
>> posts.id <http://posts.id/> = posts_tags_cte.post_id
>> WHERE posts_tags_cte.tags @> array[1, 2]::int8[]
>> 
>> But it gives me all the posts.
>> 
>> 
> Aren't you looking for cte.tags = array[1,2]?
> 



Re: Select rows when all ids of its children records match

2018-09-12 Thread Arup Rakshit
I tried :

WITH posts_tags_cte AS (
SELECT post_id, array_agg(tag_id) as tags
FROM posts_tags
WHERE tag_id in (1, 2)
GROUP BY post_id
)
SELECT posts.id FROM posts_tags_cte JOIN posts ON posts.id = 
posts_tags_cte.post_id
WHERE posts_tags_cte.tags @> array[1, 2]::int8[]

But it gives me all the posts.
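
A sketch of a variant that returns only the posts having all of the requested tags (assuming tag_id is a bigint, matching the ::int8[] cast in the attempt above):

SELECT posts.id
FROM posts
JOIN (
    SELECT post_id
    FROM posts_tags
    GROUP BY post_id
    HAVING array_agg(tag_id) @> array[1, 2]::int8[]
) matched ON matched.post_id = posts.id;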


Thanks,

Arup Rakshit
a...@zeit.io



> On 12-Sep-2018, at 9:14 PM, Arup Rakshit  wrote:
> 
> Can you show me the SQL construction? Do I need to use `WITH`?
> 
> 
> Thanks,
> 
> Arup Rakshit
> a...@zeit.io <mailto:a...@zeit.io>
> 
> 
> 
>> On 12-Sep-2018, at 9:13 PM, David G. Johnston > <mailto:david.g.johns...@gmail.com>> wrote:
>> 
>> On Wednesday, September 12, 2018, Arup Rakshit > <mailto:a...@zeit.io>> wrote:
>> IN is OR, I want the AND logic. Select posts which has tag 1, 2 and 3 ( tag 
>> ids )
>> 
>> Build arrays and then use the “contains” operator.
>> 
>> David J. 
> 



Re: Select rows when all ids of its children records match

2018-09-12 Thread Arup Rakshit
Can you show me the SQL construction? Do I need to use `WITH`?


Thanks,

Arup Rakshit
a...@zeit.io



> On 12-Sep-2018, at 9:13 PM, David G. Johnston  
> wrote:
> 
> On Wednesday, September 12, 2018, Arup Rakshit  <mailto:a...@zeit.io>> wrote:
> IN is OR, I want the AND logic. Select posts which has tag 1, 2 and 3 ( tag 
> ids )
> 
> Build arrays and then use the “contains” operator.
> 
> David J. 



Re: Select rows when all ids of its children records match

2018-09-12 Thread Arup Rakshit
IN is OR; I want the AND logic: select posts which have tags 1, 2 and 3 (tag
ids).


Thanks,

Arup Rakshit
a...@zeit.io



> On 12-Sep-2018, at 8:58 PM, Ron  wrote:
> 
> Maybe this:
> select p.id, p.name
> from posts p,
>  posts_tags pt,
>  tags t
> where t.id in (1, 2, 3)
>   and t.id = pt.tag_id
>   and pt.post_id = p.id;
> 
> 
> On 09/12/2018 10:23 AM, Arup Rakshit wrote:
>> I have a table posts(id, name), posts_tags(post_id, tag_id) and tags (id, 
>> name) ... I want to get all posts which has tag id 1, 2 and 3 for example. 
>> How should I do this? I tried ALL, but it didn’t work.
>> 
>> Those tag ids comes from UI by users, so I am looking for generic approach.
>> 
>> 
>> Thanks,
>> 
>> Arup Rakshit
>> a...@zeit.io <mailto:a...@zeit.io>
>> 
>> 
>> 
> 
> -- 
> Angular momentum makes the world go 'round.



Why order by column not using index with distinct keyword in select clause?

2018-09-11 Thread Arup Rakshit
I have defined a simple B-tree index on the *country* column of the users
table. I don't understand why the order by column is not using an index scan
when the *distinct* keyword is used in the select clause. Can anyone explain
what is happening here?

aruprakshit=# \d users;
 Table "public.users"
   Column   | Type  | Collation | Nullable |  
Default
+---+---+--+---
 city   | character varying |   |  |
 last_name  | character varying(50) |   |  |
 country| character varying(50) |   |  |
 sequence   | integer   |   |  |
 first_name | character varying(50) |   |  |
 state  | character varying(50) |   |  |
 email  | character varying |   |  |
 id | smallint  |   | not null | 
nextval('users_id_seq'::regclass)
Indexes:
"users_pk" PRIMARY KEY, btree (id)

aruprakshit=# explain analyze select distinct country from users order by 
country asc;
 QUERY PLAN

 Sort  (cost=283.19..283.85 rows=263 width=11) (actual time=10.525..10.555 
rows=263 loops=1)
   Sort Key: country
   Sort Method: quicksort  Memory: 38kB
   ->  HashAggregate  (cost=269.99..272.62 rows=263 width=11) (actual 
time=8.469..8.521 rows=263 loops=1)
 Group Key: country
 ->  Seq Scan on users  (cost=0.00..244.99 rows= width=11) (actual 
time=0.022..3.428 rows= loops=1)
 Planning time: 0.358 ms
 Execution time: 10.634 ms
(8 rows)

aruprakshit=# explain analyze select country from users order by country asc;
 QUERY PLAN
-
 Index Only Scan using country on users  (cost=0.29..886.27 rows= width=11) 
(actual time=0.083..7.581 rows= loops=1)
   Heap Fetches: 
 Planning time: 0.118 ms
 Execution time: 8.332 ms
(4 rows)

aruprakshit=# explain analyze select * from users order by country asc;
   QUERY PLAN

 Index Scan using country on users  (cost=0.29..886.27 rows= width=73) 
(actual time=0.015..8.432 rows= loops=1)
 Planning time: 0.213 ms
 Execution time: 9.086 ms
(3 rows)

aruprakshit=#


Thanks,

Arup Rakshit
a...@zeit.io
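As a hedged experiment (not from the thread), the two strategies can be compared by
disabling hash aggregation for the session: DISTINCT implemented via HashAggregate
does not produce sorted output, so the planner has to add a Sort on top, and here it
simply estimated that combination as cheaper than an ordered index scan.

-- Session-local planner switch, for comparing plans only.
SET enable_hashagg = off;
EXPLAIN ANALYZE SELECT DISTINCT country FROM users ORDER BY country ASC;
RESET enable_hashagg;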





Re: Why my query not doing index only scan

2018-09-10 Thread Arup Rakshit
Hello Stephen,

I would like to ask one more question related to this topic. When I take a dump
from production and restore it to the development DB, what are the commands I
generally need to run so that the dev DB behaves as close to production as possible?

Thanks,

Arup Rakshit
a...@zeit.io



> On 10-Sep-2018, at 4:58 PM, Stephen Frost  wrote:
> 
> Greetings,
> 
> * Arup Rakshit (a...@zeit.io) wrote:
>> I have an index defined "inspector_tool_idx4_1" UNIQUE, btree (company_id,
>> item_code, deleted_at). Now I am using the *company_id* column in the where
>> clause, and selecting just the *item_code* field for all matching rows.
>> I expected the planner to do an index-only scan here, but it is doing a
>> bitmap index scan. Any idea why it is not doing what I expected?
> 
> One possibility is that the visibility map isn't current.
> 
> Indexes don't include visibility information.  The way an index-only
> scan works is that we track pages which are 'all visible' (meaning that
> every tuple on that page is visible to all running transactions) in a
> seperate file called the 'visibility map' (aka the VM).  The VM is
> updated by the VACUUM process- but we only automatically run a VACUUM
> (with the autovacuum process) when thresholds have been reached for the
> number of UPDATE'd or DELETE'd tuples.
> 
> What this means is that if you are playing around in development and
> just COPY a bunch of rows into a table, but don't ever UPDATE or DELETE
> the rows in that table, then you'll almost never get an index-only scan
> because the VM won't be current (and PG knows this).
> 
> Make sure to do a VACUUM after loading data (and letting any ongoing
> transactions finish) and then re-test.  That should make it sure that
> the VM is current and make it more likely that PG will do an index-only
> scan.  Not a guarantee still, but that's the first thing I'd try, based
> on what you've shared here.
> 
> Thanks!
> 
> Stephen
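On the dump-and-restore question above, a hedged sketch of the usual post-restore
steps (not from Stephen's reply): ANALYZE refreshes the planner statistics and VACUUM
rebuilds the visibility map that index-only scans depend on, neither of which is
carried over by pg_dump.

-- Inside the restored database, per table:
VACUUM ANALYZE inspector_tool_components;

-- Or from the shell, for the whole database (database name taken from
-- the psql prompts earlier in the thread):
-- vacuumdb --analyze --dbname=inspection_development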



Re: Why my query not doing index only scan

2018-09-10 Thread Arup Rakshit
Thanks Stephen. After running VACUUM on the table, it worked.

inspection_development=# explain analyze select item_code from 
inspector_tool_components where company_id = 
'7881ff2e-0557-4734-9da8-2d33072ff1ef';
 QUERY 
PLAN
-
 Index Only Scan using inspector_tool_idx4_1 on inspector_tool_components  
(cost=0.41..57.87 rows=1226 width=8) (actual time=0.044..0.626 rows=1232 
loops=1)
   Index Cond: (company_id = '7881ff2e-0557-4734-9da8-2d33072ff1ef'::uuid)
   Heap Fetches: 0
 Planning time: 0.190 ms
 Execution time: 0.778 ms
(5 rows)


Thanks,

Arup Rakshit
a...@zeit.io



> On 10-Sep-2018, at 4:58 PM, Stephen Frost  wrote:
> 
> VACUUM
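A small hedged follow-up (not part of the thread): how much of the table the
visibility map already covers can be read from pg_class, which is why "Heap
Fetches: 0" now shows up in the plan.

-- relallvisible close to relpages means index-only scans rarely
-- need to visit the heap.
SELECT relname, relpages, relallvisible
FROM pg_class
WHERE relname = 'inspector_tool_components';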



Why my query not doing index only scan

2018-09-10 Thread Arup Rakshit
Hello All,

I am learning at this point how indexes work in a DBMS, so I am looking for a
broad explanation to clear up my basics.

I have an index defined "inspector_tool_idx4_1" UNIQUE, btree (company_id,
item_code, deleted_at). Now I am using the *company_id* column in the where
clause, and selecting just the *item_code* field for all matching rows. I
expected the planner to do an index-only scan here, but it is doing a bitmap
index scan. Any idea why it is not doing what I expected?


inspection_development=# explain analyze select item_code from 
inspector_tool_components where company_id = 
'7881ff2e-0557-4734-9da8-2d33072ff1ef';
 QUERY PLAN
-
 Bitmap Heap Scan on inspector_tool_components  (cost=45.92..1360.48 rows=1226 
width=8) (actual time=0.382..1.202 rows=1232 loops=1)
   Recheck Cond: (company_id = '7881ff2e-0557-4734-9da8-2d33072ff1ef'::uuid)
   Heap Blocks: exact=81
   ->  Bitmap Index Scan on inspector_tool_idx4_1  (cost=0.00..45.61 rows=1226 
width=0) (actual time=0.347..0.347 rows=1232 loops=1)
 Index Cond: (company_id = '7881ff2e-0557-4734-9da8-2d33072ff1ef'::uuid)
 Planning time: 0.207 ms
 Execution time: 1.358 ms
(7 rows)



Thanks,

Arup Rakshit
a...@zeit.io





Re: help to query json column

2018-03-07 Thread Arup Rakshit

Thanks Charles, that worked.


> On Mar 7, 2018, at 12:27 PM, Charles Clavadetscher 
> <clavadetsc...@swisspug.org> wrote:
> 
> Hello
> 
>> -Original Message-
>> From: Arup Rakshit [mailto:aruprakshit1...@outlook.com]
>> Sent: Mittwoch, 7. März 2018 05:41
>> To: pgsql-general@lists.postgresql.org
>> Subject: help to query json column
>> 
>> Hi,
>> 
>> I would like to select only rows where signature has a non null value. My 
>> json looks like :
>> 
>> {
>>  "carInspection": {
>>"damages": [
>>  {
>>"x": 68.670309653916,
>>"y": 44.08014571949,
>>"errors": [
>>  {
>>"code": "BR",
>>"description": "Gebrochen"
>>  },
>>  {
>>"code": "F",
>>"description": "Reifen platt"
>>  }
>>]
>>  },
>>  {
>>"x": 40.8014571949,
>>"y": 50.273224043716,
>>"errors": [
>>  {
>>"code": "BR",
>>"description": "Gebrochen"
>>  },
>>  {
>>"code": "F",
>>"description": "Reifen platt"
>>  }
>>]
>>  },
>>  {
>>"x": 48.269581056466,
>>"y": 37.340619307832,
>>"errors": [
>>  {
>>"code": "F",
>>"description": "Reifen platt"
>>  }
>>]
>>  },
>>  {
>>"x": 49.180327868852,
>>"y": 15.482695810565,
>>"errors": [
>>  {
>>"code": "F",
>>"description": "Reifen platt"
>>  }
>>]
>>  }
>>],
>>"layoutURL": "default",
>>"signature1": "> 0 1000 200\">\n
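For context, a minimal hedged sketch of one way to keep only rows whose signature is
non-null, assuming a jsonb column named data on a table named inspections (both names
are placeholders, not from the thread):

SELECT *
FROM inspections
WHERE data -> 'carInspection' ->> 'signature1' IS NOT NULL;

-- NULLIF additionally treats an empty string as "no signature":
-- WHERE NULLIF(data -> 'carInspection' ->> 'signature1', '') IS NOT NULL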

help to query json column

2018-03-06 Thread Arup Rakshit
Hi,

I would like to select only rows where signature has a non null value. My json 
looks like :

{
  "carInspection": {
"damages": [
  {
"x": 68.670309653916,
"y": 44.08014571949,
"errors": [
  {
"code": "BR",
"description": "Gebrochen"
  },
  {
"code": "F",
"description": "Reifen platt"
  }
]
  },
  {
"x": 40.8014571949,
"y": 50.273224043716,
"errors": [
  {
"code": "BR",
"description": "Gebrochen"
  },
  {
"code": "F",
"description": "Reifen platt"
  }
]
  },
  {
"x": 48.269581056466,
"y": 37.340619307832,
"errors": [
  {
"code": "F",
"description": "Reifen platt"
  }
]
  },
  {
"x": 49.180327868852,
"y": 15.482695810565,
"errors": [
  {
"code": "F",
"description": "Reifen platt"
  }
]
  }
],
"layoutURL": "default",
"signature1":