Create array of data from JSONB in PG 9.5
Hi, I wrote the SQL below, which works and gives me what I want, but it was written on PG 10 and does not work on 9.5, which is what production runs. I want to change it so that it also runs on 9.5. I need help with this.

SELECT DISTINCT ARRAY[
    jsonb_array_elements(data) ->> 'group',
    jsonb_array_elements(jsonb_array_elements(data) -> 'properties') ->> 'name'
]
FROM "vessels"
WHERE "vessels"."deleted_at" IS NULL
  AND "vessels"."company_id" = '7d105acd-be5a-4225-a2db-b549105e4172';

When I run the same query on production I get:

ERROR: set-valued function called in context that cannot accept a set

Thanks, Arup Rakshit a...@zeit.io
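For reference, one rewrite that should run on 9.5 (a sketch only, not tested against the original schema) moves the set-returning jsonb_array_elements() calls out of the select list and into the FROM clause with LATERAL, which 9.5 already supports:

```sql
-- Sketch: expand the JSONB arrays in FROM with LATERAL instead of
-- calling set-returning functions in the SELECT list (which older
-- versions reject when the calls are nested).
SELECT DISTINCT
       ARRAY[e.elem ->> 'group', p.prop ->> 'name']
FROM "vessels",
     LATERAL jsonb_array_elements("vessels".data) AS e(elem),
     LATERAL jsonb_array_elements(e.elem -> 'properties') AS p(prop)
WHERE "vessels"."deleted_at" IS NULL
  AND "vessels"."company_id" = '7d105acd-be5a-4225-a2db-b549105e4172';
```

Each vessel row is expanded once per element of `data`, and each element once per entry in its `properties` array, so the DISTINCT pairs should match what the PG 10 query produces.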
Re: Extend inner join to fetch not yet connected rows also
> On 22-Sep-2019, at 6:55 PM, Jan Kohnert wrote:
>
> Hi Arup,
>
> Am Sonntag, 22. September 2019, 14:48:20 CEST schrieb Arup Rakshit:
>> Hi Jan,
>>
>>> On 22-Sep-2019, at 5:38 PM, Jan Kohnert wrote:
>>> maybe something like
>>>
>>> select
>>>    c.id,
>>>    c.name,
>>>    case when cs.user_id = 8 then true else false end as has
>>> from craftsmanships c
>>> left join contractor_skills cs
>>>    on cs.craftsmanship_id = c.craftmanship_id;
>>
>> But this query fetched duplicate data:
>
> yeah, that's possible, since I don't exactly know your data model. If only the
> values above are required, you could simply use distinct:

When someone adds a craftsmanship to their skill set, the contractor_skills table holds that relationship. I don't think distinct is the correct tool here, as it would eliminate correct data. users and craftsmanships have an m:n relationship via the join table contractor_skills.

SELECT
    craftsmanships.id,
    craftsmanships.name,
    CASE WHEN contractor_skills.user_id IS NULL THEN FALSE ELSE TRUE END AS has
FROM "craftsmanships"
LEFT JOIN "contractor_skills"
    ON "contractor_skills"."craftsmanship_id" = "craftsmanships"."id"
LEFT JOIN "users"
    ON "users"."id" = "contractor_skills"."user_id"
WHERE (contractor_skills.user_id = 8 OR contractor_skills.user_id IS NULL)
ORDER BY "craftsmanships"."id";

This gives the correct result. Not sure if this query still has a bug in it.

 id |                 name                  | has
----+---------------------------------------+-----
  1 | paint                                 | t
  2 | drywall                               | t
  3 | bathrooms                             | f
  4 | kitchens                              | f
  5 | flooring                              | f
  6 | basements                             | f
  7 | carpentry                             | f
  8 | decks (displayed as decks and patios) | f
  9 | windows (windows and doors)           | f
 10 | countertops                           | f
 11 | landscaping                           | f
 12 | electrical                            | f
 13 | plumbing                              | f
 14 | handyman                              | f
(14 rows)

> select distinct
>    c.id,
>    c.name,
>    case when cs.user_id = 8 then true else false end as has
> from craftsmanships c
> left join contractor_skills cs
>    on cs.craftsmanship_id = c.id
> order by
>    c.id;
>
> --
> MfG Jan
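One way to avoid both the duplicates and the `OR ... IS NULL` filter (a sketch, assuming each user holds a given craftsmanship at most once in contractor_skills) is to move the user filter into the join condition itself; the LEFT JOIN then keeps every craftsmanship, and the users table is not needed at all:

```sql
-- The filter lives in the ON clause, so unmatched craftsmanships
-- survive the LEFT JOIN with cs.user_id = NULL instead of being
-- filtered out by a WHERE clause.
SELECT c.id,
       c.name,
       (cs.user_id IS NOT NULL) AS has
FROM craftsmanships c
LEFT JOIN contractor_skills cs
       ON cs.craftsmanship_id = c.id
      AND cs.user_id = 8
ORDER BY c.id;
```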
Re: Extend inner join to fetch not yet connected rows also
Hi Jan,

> On 22-Sep-2019, at 5:38 PM, Jan Kohnert wrote:
>
> Hey,
>
> Am Sonntag, 22. September 2019, 13:21:46 CEST schrieb Arup Rakshit:
>> I have craftsmanships table which has (id, name) and users table (id, email,
>> ..). When a user has some craftsmanships, they are stored inside the
>> contractor_skills(user_id, craftsmanship_id, id) table.
>
> [...]
>
>> But I want to list all craftsmanships and has column should have `t` when
>> user_id #8 has it, else `f`. How can I extend this query?
>
> maybe something like
>
> select
>    c.id,
>    c.name,
>    case when cs.user_id = 8 then true else false end as has
> from craftsmanships c
> left join contractor_skills cs
>    on cs.craftsmanship_id = c.craftmanship_id;
>
> --
> MfG Jan

But this query fetched duplicate data:

 id |                 name                  | has
----+---------------------------------------+-----
  2 | drywall                               | t
  1 | paint                                 | t
  1 | paint                                 | f
 11 | landscaping                           | f
 12 | electrical                            | f
 10 | countertops                           | f
 13 | plumbing                              | f
  5 | flooring                              | f
  8 | decks (displayed as decks and patios) | f
  6 | basements                             | f
  4 | kitchens                              | f
  3 | bathrooms                             | f
 14 | handyman                              | f
  9 | windows (windows and doors)           | f
  7 | carpentry                             | f
(15 rows)
Re: Extend inner join to fetch not yet connected rows also
Hi Jan, I was close and came up with:

SELECT
    craftsmanships.id,
    craftsmanships.name,
    CASE WHEN contractor_skills.user_id IS NULL THEN FALSE ELSE TRUE END AS has
FROM "craftsmanships"
LEFT JOIN "contractor_skills"
    ON "contractor_skills"."craftsmanship_id" = "craftsmanships"."id"
LEFT JOIN "users"
    ON "users"."id" = "contractor_skills"."user_id"
WHERE (contractor_skills.user_id = 3 OR contractor_skills.user_id IS NULL)
ORDER BY "craftsmanships"."id";

But after I read yours I found mine was doing a lot of unnecessary joins. Thank you.

Thanks, Arup Rakshit a...@zeit.io

> On 22-Sep-2019, at 5:38 PM, Jan Kohnert wrote:
>
> Hey,
>
> Am Sonntag, 22. September 2019, 13:21:46 CEST schrieb Arup Rakshit:
>> I have craftsmanships table which has (id, name) and users table (id, email,
>> ..). When a user has some craftsmanships, they are stored inside the
>> contractor_skills(user_id, craftsmanship_id, id) table.
>
> [...]
>
>> But I want to list all craftsmanships and has column should have `t` when
>> user_id #8 has it, else `f`. How can I extend this query?
>
> maybe something like
>
> select
>    c.id,
>    c.name,
>    case when cs.user_id = 8 then true else false end as has
> from craftsmanships c
> left join contractor_skills cs
>    on cs.craftsmanship_id = c.craftmanship_id;
>
> --
> MfG Jan
Extend inner join to fetch not yet connected rows also
I have a craftsmanships table with (id, name) and a users table with (id, email, ..). When a user has some craftsmanships, they are stored inside the contractor_skills(user_id, craftsmanship_id, id) table. What I want is to list all the available craftsmanships with id, name and a has column. Right now I can get only those craftsmanships that a specific user has:

SELECT
    craftsmanships.id,
    craftsmanships.name,
    TRUE AS has
FROM "craftsmanships"
INNER JOIN "contractor_skills"
    ON "contractor_skills"."craftsmanship_id" = "craftsmanships"."id"
INNER JOIN "users"
    ON "users"."id" = "contractor_skills"."user_id"
WHERE (contractor_skills.user_id = 8)
ORDER BY "craftsmanships"."id";

 id |  name   | has
----+---------+-----
  1 | paint   | t
  2 | drywall | t
(2 rows)

But I want to list all craftsmanships, and the has column should be `t` when user_id #8 has it, else `f`. How can I extend this query?

Thanks, Arup Rakshit a...@zeit.io
Re: duplicate key value violates unique constraint "chinese_price_infos_pkey"
Hi,

Thanks for your reply. It is automatic; my app does not create the ID, it delegates that to the DB. It is a Ruby on Rails app using PostgreSQL.

docking_dev=# \d chinese_price_infos
                                 Table "public.chinese_price_infos"
   Column    |            Type             | Collation | Nullable |                     Default
-------------+-----------------------------+-----------+----------+-------------------------------------------------
 id          | integer                     |           | not null | nextval('chinese_price_infos_id_seq'::regclass)
 created_at  | timestamp without time zone |           |          |
 updated_at  | timestamp without time zone |           |          |
 item_code   | character varying(255)      |           |          |
 description | character varying(255)      |           |          |
 unit        | character varying(255)      |           |          |
 price_cents | integer                     |           |          |
 uuid        | uuid                        |           |          | uuid_generate_v4()
 company_id  | uuid                        |           |          |
Indexes:
    "chinese_price_infos_pkey" PRIMARY KEY, btree (id)
    "index_chinese_price_infos_on_company_id" btree (company_id)

Thanks, Arup Rakshit a...@zeit.io

> On 06-May-2019, at 4:38 PM, Ray O'Donnell wrote:
>
> On 06/05/2019 12:05, Arup Rakshit wrote:
>> Every time I try to insert I get the error:
>> docking_dev=# INSERT INTO "chinese_price_infos" ("item_code",
>> "price_cents", "unit", "description", "company_id", "created_at",
>> "updated_at") VALUES ('01GS10001', 6000, 'Lift', 'Shore Crane
>> Rental', '9ae3f8b8-8f3f-491c-918a-efd8f5100a5e', '2019-05-06
>> 10:49:03.894725', '2019-05-06 10:49:03.894725');
>> ERROR: duplicate key value violates unique constraint "chinese_price_infos_pkey"
>> DETAIL: Key (id)=(71165) already exists.
>> docking_dev=# INSERT INTO "chinese_price_infos" ("item_code",
>> "price_cents", "unit", "description", "company_id", "created_at",
>> "updated_at") VALUES ('01GS10001', 6000, 'Lift', 'Shore Crane Rental',
>> '9ae3f8b8-8f3f-491c-918a-efd8f5100a5e', '2019-05-06 10:49:03.894725',
>> '2019-05-06 10:49:03.894725');
>> ERROR: duplicate key value violates unique constraint "chinese_price_infos_pkey"
>> DETAIL: Key (id)=(71166) already exists.
>> Then I found:
>> docking_dev=# SELECT MAX(id) FROM chinese_price_infos;
>>    max
>> --------
>>  128520
>> (1 row)
>> docking_dev=# SELECT nextval('chinese_price_infos_id_seq');
>>  nextval
>> ---------
>>    71164
>> (1 row)
>> Not sure how it got out of sync. How can I fix this permanently? I ran
>> vacuum analyze verbose; still the same error.
>
> You can fix it by using setval() to set the sequence manually to something
> higher than the highest current id value in the table. However, it sounds as
> if something in the application code may be causing problems. For example,
> is something generating id values without reference to the sequence?
>
> Ray.
>
> --
> Raymond O'Donnell // Galway // Ireland
> r...@rodonnell.ie
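Ray's setval() suggestion could look like this (a sketch; run it once, ideally while no concurrent inserts are happening):

```sql
-- Move the sequence past the highest id already in the table,
-- so the next nextval() returns MAX(id) + 1.
SELECT setval('chinese_price_infos_id_seq',
              (SELECT max(id) FROM chinese_price_infos));
```

This fixes the symptom; if something in the application keeps inserting explicit id values, the sequence will drift out of sync again.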
duplicate key value violates unique constraint "chinese_price_infos_pkey"
Every time I try to insert, I get this error:

docking_dev=# INSERT INTO "chinese_price_infos" ("item_code", "price_cents", "unit", "description", "company_id", "created_at", "updated_at")
              VALUES ('01GS10001', 6000, 'Lift', 'Shore Crane Rental', '9ae3f8b8-8f3f-491c-918a-efd8f5100a5e', '2019-05-06 10:49:03.894725', '2019-05-06 10:49:03.894725');
ERROR:  duplicate key value violates unique constraint "chinese_price_infos_pkey"
DETAIL:  Key (id)=(71165) already exists.
docking_dev=# INSERT INTO "chinese_price_infos" ("item_code", "price_cents", "unit", "description", "company_id", "created_at", "updated_at")
              VALUES ('01GS10001', 6000, 'Lift', 'Shore Crane Rental', '9ae3f8b8-8f3f-491c-918a-efd8f5100a5e', '2019-05-06 10:49:03.894725', '2019-05-06 10:49:03.894725');
ERROR:  duplicate key value violates unique constraint "chinese_price_infos_pkey"
DETAIL:  Key (id)=(71166) already exists.

Then I found:

docking_dev=# SELECT MAX(id) FROM chinese_price_infos;
   max
--------
 128520
(1 row)

docking_dev=# SELECT nextval('chinese_price_infos_id_seq');
 nextval
---------
   71164
(1 row)

I am not sure how the sequence got out of sync. How can I fix this permanently? I ran vacuum analyze verbose; still the same error.

Thanks, Arup Rakshit a...@zeit.io
Re: SQl help to build a result with custom aliased bool column
Hi,

Thanks for showing different ways to achieve the goal. So what would be the optimal way to solve this? I have a composite index on the company_id and feature_id columns of the project_features table. I do Ruby on Rails development, where table names are always plural by convention. The tables I created above are in a different schema, just to ask the question with sample data and test the query output, so yes, you are right that they are a little inconsistent.

Thanks, Arup Rakshit a...@zeit.io

> On 08-Apr-2019, at 4:36 PM, mariusz wrote:
>
> bear in mind that with a large number of companies and proper index on
> company_features the optimizer could limit company_features as
> necessary, while your examples read everything anyway and mangle output
> to get proper result (with dups and bugs, but also not optimal)
Re: SQl help to build a result with custom aliased bool column
I am still having some bugs: I am getting a duplicate in the result set.

psql (11.0, server 10.5)
Type "help" for help.

aruprakshit=# select * from features;
 id | name
----+------
  1 | f1
  2 | f2
  3 | f3
  4 | f4
(4 rows)

aruprakshit=# select * from company;
 id | name
----+------
  1 | c1
  2 | c2
(2 rows)

aruprakshit=# select * from company_features;
 id | company_id | feature_id
----+------------+------------
  1 |          1 |          1
  2 |          1 |          2
  3 |          2 |          3
  4 |          1 |          3
(4 rows)

aruprakshit=# SELECT
aruprakshit-#     features.id,
aruprakshit-#     features.name,
aruprakshit-#     coalesce(company_features.company_id = 1, false) AS active
aruprakshit-# FROM
aruprakshit-#     features
aruprakshit-# LEFT JOIN company_features ON features.id = company_features.feature_id;
 id | name | active
----+------+--------
  1 | f1   | t
  2 | f2   | t
  3 | f3   | f
  3 | f3   | t
  4 | f4   | f
(5 rows)

I should get only 4 rows in the result, as there are 4 features in total.

Thanks, Arup Rakshit a...@zeit.io

> On 08-Apr-2019, at 3:28 PM, Szymon Lipiński wrote:
>
> Hey,
> you could just use
>
> SELECT
>     features.id,
>     features.name,
>     company_features.company_id = 1 as active
>
> regards,
> Szymon
>
> On Mon, 8 Apr 2019 at 09:55, Arup Rakshit wrote:
> I have 2 tables Company and Feature. They are connected via a join table
> called CompanyFeature. I want to build a result set where it will have id,
> name and a custom boolean column. This boolean column is there to say if the
> feature is present for the company or not.
>
> Company table:
>
> | id | name |
> |----|------|
> | 1  | c1   |
> | 2  | c2   |
> | 3  | c3   |
>
> Feature table:
>
> | id | name |
> |----|------|
> | 1  | f1   |
> | 2  | f2   |
> | 3  | f3   |
>
> Company Feature table:
>
> | id | feature_id | company_id |
> |----|------------|------------|
> | 1  | 2          | 1          |
> | 2  | 1          | 1          |
> | 3  | 3          | 2          |
>
> The result should look like for company `c1`:
>
> | id | name | active |
> |----|------|--------|
> | 1  | f1   | t      |
> | 2  | f2   | t      |
> | 3  | f3   | f      |
>
> I tried something like:
>
> SELECT
>     features.id,
>     features.name,
>     CASE WHEN company_features.company_id = 1 THEN
>         TRUE
>     ELSE
>         FALSE
>     END AS active
> FROM
>     features
>     LEFT JOIN company_features ON company_features.feature_id = features.id
>
> It works. But is there any better way to achieve this?
>
> Thanks,
> Arup Rakshit
> a...@zeit.io
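A related sketch (assuming the features and company_features tables shown above): moving the company filter into the join condition keeps exactly one row per feature even when a feature is linked to several companies, which a plain LEFT JOIN on feature_id alone does not guarantee:

```sql
-- Filtering on company_id inside ON (not WHERE) means rows for other
-- companies never match, so each feature appears once, with
-- active = false when company 1 lacks it.
SELECT f.id,
       f.name,
       (cf.company_id IS NOT NULL) AS active
FROM features f
LEFT JOIN company_features cf
       ON cf.feature_id = f.id
      AND cf.company_id = 1
ORDER BY f.id;
```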
Re: SQl help to build a result with custom aliased bool column
I knew there would be a more compact way; thanks for showing it. One thing I would still like to handle is making sure the column contains only true/false, but right now it sometimes shows NULL. How can I fix this?

 id | name | active
----+------+--------
  1 | f1   | true
  2 | f2   | true
  3 | f3   | false
  4 | f4   |

Thanks, Arup Rakshit a...@zeit.io

> On 08-Apr-2019, at 3:28 PM, Szymon Lipiński wrote:
>
> Hey,
> you could just use
>
> SELECT
>     features.id,
>     features.name,
>     company_features.company_id = 1 as active
>
> regards,
> Szymon
>
> On Mon, 8 Apr 2019 at 09:55, Arup Rakshit wrote:
> I have 2 tables Company and Feature. They are connected via a join table
> called CompanyFeature. I want to build a result set where it will have id,
> name and a custom boolean column. This boolean column is there to say if the
> feature is present for the company or not.
>
> Company table:
>
> | id | name |
> |----|------|
> | 1  | c1   |
> | 2  | c2   |
> | 3  | c3   |
>
> Feature table:
>
> | id | name |
> |----|------|
> | 1  | f1   |
> | 2  | f2   |
> | 3  | f3   |
>
> Company Feature table:
>
> | id | feature_id | company_id |
> |----|------------|------------|
> | 1  | 2          | 1          |
> | 2  | 1          | 1          |
> | 3  | 3          | 2          |
>
> The result should look like for company `c1`:
>
> | id | name | active |
> |----|------|--------|
> | 1  | f1   | t      |
> | 2  | f2   | t      |
> | 3  | f3   | f      |
>
> I tried something like:
>
> SELECT
>     features.id,
>     features.name,
>     CASE WHEN company_features.company_id = 1 THEN
>         TRUE
>     ELSE
>         FALSE
>     END AS active
> FROM
>     features
>     LEFT JOIN company_features ON company_features.feature_id = features.id
>
> It works. But is there any better way to achieve this?
>
> Thanks,
> Arup Rakshit
> a...@zeit.io
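For what it is worth, a NULL shows up because comparing NULL with = yields NULL rather than false; wrapping the comparison in coalesce() supplies the default. A sketch matching the column names above:

```sql
-- coalesce() turns the NULL produced for unmatched LEFT JOIN rows
-- into false, so the active column is always true or false.
SELECT features.id,
       features.name,
       coalesce(company_features.company_id = 1, false) AS active
FROM features
LEFT JOIN company_features
       ON company_features.feature_id = features.id;
```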
SQl help to build a result with custom aliased bool column
I have 2 tables, Company and Feature. They are connected via a join table called CompanyFeature. I want to build a result set with id, name and a custom boolean column. This boolean column is there to say if the feature is present for the company or not.

Company table:

| id | name |
|----|------|
| 1  | c1   |
| 2  | c2   |
| 3  | c3   |

Feature table:

| id | name |
|----|------|
| 1  | f1   |
| 2  | f2   |
| 3  | f3   |

Company Feature table:

| id | feature_id | company_id |
|----|------------|------------|
| 1  | 2          | 1          |
| 2  | 1          | 1          |
| 3  | 3          | 2          |

The result should look like this for company `c1`:

| id | name | active |
|----|------|--------|
| 1  | f1   | t      |
| 2  | f2   | t      |
| 3  | f3   | f      |

I tried something like:

SELECT
    features.id,
    features.name,
    CASE WHEN company_features.company_id = 1 THEN
        TRUE
    ELSE
        FALSE
    END AS active
FROM
    features
    LEFT JOIN company_features ON company_features.feature_id = features.id

It works. But is there any better way to achieve this?

Thanks, Arup Rakshit a...@zeit.io
ERROR: type "myrowtype" does not exist
I was reading about the json_populate_record function in the official doc https://www.postgresql.org/docs/10/static/functions-json.html#FUNCTIONS-JSON-PROCESSING-TABLE. The examples mentioned in the doc give me errors:

psql (11.0, server 10.5)
Type "help" for help.

aruprakshit=# select * from json_populate_record(null::extype, '{"x":"foo", "y":43, "q": true}');
ERROR:  type "extype" does not exist
LINE 1: select * from json_populate_record(null::extype, '{"x":"foo"...
aruprakshit=# select * from json_populate_record(null::myrowtype, '{"a": 1, "b": ["2", "a b"], "c": {"d": 4, "e": "a b c"}}');
ERROR:  type "myrowtype" does not exist
LINE 1: select * from json_populate_record(null::myrowtype, '{"a": 1...

What do null::extype and null::myrowtype mean here?

Thanks, Arup Rakshit a...@zeit.io
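For context, null::extype and null::myrowtype are NULL values cast to composite types named extype and myrowtype; the doc examples assume such types already exist in the database, which is why the queries fail with "type ... does not exist". A minimal sketch (the field names and types here are assumptions chosen to match the example JSON, not definitions from the doc):

```sql
-- Hypothetical composite types whose field names match the JSON keys.
CREATE TYPE extype AS (x text, y int, q boolean);
CREATE TYPE myrowtype AS (a int, b text[], c jsonb);

-- With the types in place, the doc examples run: each JSON key is
-- matched to the same-named field of the composite type.
SELECT * FROM json_populate_record(null::extype,
       '{"x": "foo", "y": 43, "q": true}');
SELECT * FROM json_populate_record(null::myrowtype,
       '{"a": 1, "b": ["2", "a b"], "c": {"d": 4, "e": "a b c"}}');
```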
Re: Why my query not using index to sort?
'c4f55044-3f4e-439e-a586-8b6978c7b406', '36825da8-2f58-4a62-bdb4-2b91cbe18299', 'cca256cf-b415-472c-8b9f-a8432d02c580'); Time: 14.377 ms docking_dev=# Thanks, Arup Rakshit a...@zeit.io > On 29-Sep-2018, at 2:50 AM, Adrian Klaver wrote: > > On 9/28/18 12:14 PM, Arup Rakshit wrote: >> Forgot to mention in my previous email, it was a quick send click. Sorry for >> that. >> In the gist you need to see all the line with Vessel Load(.. . I load the >> page multiple times to catch the different times, so you will the line >> multiple times there in the log file. > > Do you know what Vessel Load () is actually measuring? > > To me it looks like it is covering both the query(including ROR overhead) and > the HTTP request/response cycle. > > Also have you looked at: > > https://guides.rubyonrails.org/debugging_rails_applications.html#impact-of-logs-on-performance > >> Thanks, >> Arup Rakshit >> a...@zeit.io <mailto:a...@zeit.io> > > > -- > Adrian Klaver > adrian.kla...@aklaver.com
Re: How to improve sql query to achieve the better plan
I just added it as you said, but I am getting the same plan.

Sort  (cost=62842.16..62846.91 rows=1897 width=35) (actual time=1845.831..1845.950 rows=1229 loops=1)
  Sort Key: projects.id
  Sort Method: quicksort  Memory: 145kB
  ->  HashAggregate  (cost=62710.42..62738.88 rows=1897 width=35) (actual time=1844.178..1845.060 rows=1229 loops=1)
        Group Key: projects.id
        ->  Hash Right Join  (cost=159.68..45382.09 rows=364807 width=35) (actual time=1.534..618.717 rows=365784 loops=1)
              Hash Cond: (workitems.project_id = projects.id)
              Filter: (workitems.deleted_at IS NULL)
              Rows Removed by Filter: 257457
              ->  Seq Scan on workitems  (cost=0.00..36653.75 rows=623175 width=43) (actual time=0.047..213.842 rows=623175 loops=1)
              ->  Hash  (cost=135.97..135.97 rows=1897 width=16) (actual time=1.478..1.478 rows=1897 loops=1)
                    Buckets: 2048  Batches: 1  Memory Usage: 105kB
                    ->  Seq Scan on projects  (cost=0.00..135.97 rows=1897 width=16) (actual time=0.006..0.914 rows=1897 loops=1)
Planning time: 0.498 ms
Execution time: 1846.100 ms

Indexes:
    "workitems_pkey" PRIMARY KEY, btree (id)
    "index_workitems_on_company_id" btree (company_id)
    "index_workitems_on_deleted_at" btree (deleted_at)
    "index_workitems_on_parent_workitem_id" btree (parent_workitem_id)
    "index_workitems_on_project_id" btree (project_id)
    "index_workitems_on_standard_workitem_id" btree (standard_workitem_id)
    "index_workitems_on_workitem_category_id" btree (workitem_category_id)
    "patrial_index_workitems_200_1" btree (project_id) WHERE deleted_at IS NULL

Thanks, Arup Rakshit a...@zeit.io

> On 30-Sep-2018, at 10:15 PM, Pavel Stehule wrote:
>
> CREATE INDEX ON workitems(project_id) WHERE deleted_at is null
How to improve sql query to achieve the better plan
I have the query below, which is taking 1873 ms. How can I improve it?

explain analyze
select
    sum(coalesce(price_cents, 0)::bigint * coalesce(quantity, 0)
        * (1 - coalesce(workitems.discount, 0) / 100)) as total_budget_cents,
    sum(coalesce(price_cents, 0)::bigint * coalesce(quantity, 0)
        * (1 - coalesce(workitems.discount, 0) / 100)
        + coalesce(additional_cost_cents, 0)
        - coalesce(cost_reduction_cents, 0)) as final_budget_cents,
    projects.id as project_id
from projects
left join workitems on workitems.project_id = projects.id
where workitems.deleted_at is null
group by projects.id
order by project_id asc

And the explain output is:

Sort  (cost=62851.33..62856.07 rows=1897 width=35) (actual time=1872.867..1873.003 rows=1229 loops=1)
  Sort Key: projects.id
  Sort Method: quicksort  Memory: 145kB
  ->  HashAggregate  (cost=62719.59..62748.04 rows=1897 width=35) (actual time=1871.281..1872.104 rows=1229 loops=1)
        Group Key: projects.id
        ->  Hash Right Join  (cost=159.68..45386.32 rows=364911 width=35) (actual time=2.226..637.936 rows=365784 loops=1)
              Hash Cond: (workitems.project_id = projects.id)
              Filter: (workitems.deleted_at IS NULL)
              Rows Removed by Filter: 257457
              ->  Seq Scan on workitems  (cost=0.00..36655.53 rows=623353 width=43) (actual time=0.020..220.215 rows=623175 loops=1)
              ->  Hash  (cost=135.97..135.97 rows=1897 width=16) (actual time=2.177..2.177 rows=1897 loops=1)
                    Buckets: 2048  Batches: 1  Memory Usage: 105kB
                    ->  Seq Scan on projects  (cost=0.00..135.97 rows=1897 width=16) (actual time=0.013..1.451 rows=1897 loops=1)
Planning time: 2.775 ms
Execution time: 1873.308 ms

Projects table has these indexes:

Indexes:
    "projects_pkey" PRIMARY KEY, btree (id)
    "index_projects_on_company_id" btree (company_id)
    "index_projects_on_deleted_at" btree (deleted_at)
    "index_projects_on_inspector_id" btree (inspector_id)
    "index_projects_on_managed_offline_by_user_id" btree (managed_offline_by_user_id)
    "index_projects_on_project_status_id" btree (project_status_id)
    "index_projects_on_shipyard_id" btree (shipyard_id)
    "index_projects_on_vessel_id" btree (vessel_id)

Workitems table has these indexes:

Indexes:
    "workitems_pkey" PRIMARY KEY, btree (id)
    "index_workitems_on_company_id" btree (company_id)
    "index_workitems_on_deleted_at" btree (deleted_at)
    "index_workitems_on_parent_workitem_id" btree (parent_workitem_id)
    "index_workitems_on_project_id" btree (project_id)
    "index_workitems_on_standard_workitem_id" btree (standard_workitem_id)
    "index_workitems_on_workitem_category_id" btree (workitem_category_id)

Thanks, Arup Rakshit a...@zeit.io
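Since every live workitem row has to be scanned anyway, one rewrite worth trying (a sketch, not benchmarked against this data) is to aggregate workitems per project first and join the much smaller aggregate back to projects:

```sql
-- Aggregate ~1229 project totals first, then join; the hash join and
-- GROUP BY no longer have to carry ~365k joined rows.
SELECT p.id AS project_id,
       w.total_budget_cents,
       w.final_budget_cents
FROM projects p
LEFT JOIN (
    SELECT project_id,
           sum(coalesce(price_cents, 0)::bigint * coalesce(quantity, 0)
               * (1 - coalesce(discount, 0) / 100)) AS total_budget_cents,
           sum(coalesce(price_cents, 0)::bigint * coalesce(quantity, 0)
               * (1 - coalesce(discount, 0) / 100)
               + coalesce(additional_cost_cents, 0)
               - coalesce(cost_reduction_cents, 0)) AS final_budget_cents
    FROM workitems
    WHERE deleted_at IS NULL
    GROUP BY project_id
) w ON w.project_id = p.id
ORDER BY p.id;
```

One semantic difference: projects with no live workitems come back with NULL totals here rather than 0; wrap the outer columns in coalesce(..., 0) if 0 is required.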
Re: Why my query not using index to sort?
When I keep the sort off, it executes much faster.

set enable_sort = off;

explain analyze select
    *
from
    "standard_workitems"
where
    "standard_workitems"."deleted_at" is null
    and "standard_workitems"."company_id" = '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'
order by
    item_code asc;

Index Scan using standard_workitems_partial_index_idx_1_1 on standard_workitems  (cost=0.42..5802.04 rows=1697 width=763) (actual time=0.018..1.076 rows=2071 loops=1)
  Index Cond: (company_id = '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'::uuid)
Planning time: 0.191 ms
Execution time: 1.210 ms

I have no idea why PG thinks the in-memory sort will be cheaper.

Thanks, Arup Rakshit a...@zeit.io

> On 29-Sep-2018, at 9:40 PM, Arup Rakshit wrote:
>
> Hello Tom,
>
> I have another query, where I am expecting the sort to come from the index, but it is
> done in memory and takes a lot of time.
>
> Query:
>
> explain analyze select
>    *
> from
>    "standard_workitems"
> where
>    "standard_workitems"."deleted_at" is null
>    and "standard_workitems"."company_id" = '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'
> order by
>    standard_workitems.item_code asc;
>
> Explain plan:
>
> Sort  (cost=3454.03..3458.18 rows=1660 width=810) (actual time=20.302..20.502 rows=2071 loops=1)
>   Sort Key: item_code
>   Sort Method: quicksort  Memory: 800kB
>   ->  Bitmap Heap Scan on standard_workitems  (cost=57.29..3365.25 rows=1660 width=810) (actual time=0.297..0.781 rows=2071 loops=1)
>         Recheck Cond: ((company_id = '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'::uuid) AND (deleted_at IS NULL))
>         Heap Blocks: exact=139
>         ->  Bitmap Index Scan on standard_workitems_partial_index_idx_1_1  (cost=0.00..56.87 rows=1660 width=0) (actual time=0.272..0.272 rows=2071 loops=1)
>               Index Cond: (company_id = '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'::uuid)
> Planning time: 0.199 ms
> Execution time: 20.688 ms
>
> Indexes I have:
>
> Indexes:
>     "standard_workitems_pkey" PRIMARY KEY, btree (id)
>     "index_standard_workitems_on_company_id" btree (company_id)
>     "index_standard_workitems_on_deleted_at" btree (deleted_at)
>     "index_standard_workitems_on_item_code" btree (item_code)
>     "index_standard_workitems_on_workitem_category_id" btree (workitem_category_id)
>     "standard_workitems_partial_index_idx_1_1" btree (company_id, item_code) WHERE deleted_at IS NULL
>
> Thanks,
> Arup Rakshit
> a...@zeit.io
>
>> On 28-Sep-2018, at 7:07 PM, Tom Lane wrote:
>>
>> Arup Rakshit writes:
>>> My query is not using name index to sort the result.
>>
>> Given the rowcounts here, I think the planner is making the right choice.
>> Sorting 70-some rows with a Sort node is probably cheaper than doing
>> random disk I/O to get them in sorted order. With more rows involved,
>> it might make the other choice.
>>
>> As a testing measure (don't do it in production!), you could set
>> enable_sort = off, which will force the planner to pick a non-Sort
>> plan if possible. Then you could see whether that's actually faster
>> or slower, and by how much.
>>
>> regards, tom lane
Re: Why my query not using index to sort?
Hello Tom,

I have another query, where I am expecting the sort to come from the index, but it is done in memory and takes a lot of time.

Query:

explain analyze select
    *
from
    "standard_workitems"
where
    "standard_workitems"."deleted_at" is null
    and "standard_workitems"."company_id" = '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'
order by
    standard_workitems.item_code asc;

Explain plan:

Sort  (cost=3454.03..3458.18 rows=1660 width=810) (actual time=20.302..20.502 rows=2071 loops=1)
  Sort Key: item_code
  Sort Method: quicksort  Memory: 800kB
  ->  Bitmap Heap Scan on standard_workitems  (cost=57.29..3365.25 rows=1660 width=810) (actual time=0.297..0.781 rows=2071 loops=1)
        Recheck Cond: ((company_id = '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'::uuid) AND (deleted_at IS NULL))
        Heap Blocks: exact=139
        ->  Bitmap Index Scan on standard_workitems_partial_index_idx_1_1  (cost=0.00..56.87 rows=1660 width=0) (actual time=0.272..0.272 rows=2071 loops=1)
              Index Cond: (company_id = '6fed40b7-fdd7-4efb-a163-c2b42e6486ae'::uuid)
Planning time: 0.199 ms
Execution time: 20.688 ms

Indexes I have:

Indexes:
    "standard_workitems_pkey" PRIMARY KEY, btree (id)
    "index_standard_workitems_on_company_id" btree (company_id)
    "index_standard_workitems_on_deleted_at" btree (deleted_at)
    "index_standard_workitems_on_item_code" btree (item_code)
    "index_standard_workitems_on_workitem_category_id" btree (workitem_category_id)
    "standard_workitems_partial_index_idx_1_1" btree (company_id, item_code) WHERE deleted_at IS NULL

Thanks, Arup Rakshit a...@zeit.io

> On 28-Sep-2018, at 7:07 PM, Tom Lane wrote:
>
> Arup Rakshit writes:
>> My query is not using name index to sort the result.
>
> Given the rowcounts here, I think the planner is making the right choice.
> Sorting 70-some rows with a Sort node is probably cheaper than doing
> random disk I/O to get them in sorted order. With more rows involved,
> it might make the other choice.
>
> As a testing measure (don't do it in production!), you could set
> enable_sort = off, which will force the planner to pick a non-Sort
> plan if possible. Then you could see whether that's actually faster
> or slower, and by how much.
>
> regards, tom lane
Re: Why my query not using index to sort?
Hello Adrian, I used to consider this time as the sql execution time approximately. I’ll do some research and get back to you. Thanks for mentioning it. Thanks, Arup Rakshit a...@zeit.io > On 29-Sep-2018, at 2:50 AM, Adrian Klaver wrote: > > On 9/28/18 12:14 PM, Arup Rakshit wrote: >> Forgot to mention in my previous email, it was a quick send click. Sorry for >> that. >> In the gist you need to see all the line with Vessel Load(.. . I load the >> page multiple times to catch the different times, so you will the line >> multiple times there in the log file. > > Do you know what Vessel Load () is actually measuring? > > To me it looks like it is covering both the query(including ROR overhead) and > the HTTP request/response cycle. > > Also have you looked at: > > https://guides.rubyonrails.org/debugging_rails_applications.html#impact-of-logs-on-performance > >> Thanks, >> Arup Rakshit >> a...@zeit.io <mailto:a...@zeit.io> > > > -- > Adrian Klaver > adrian.kla...@aklaver.com
Re: Why my query not using index to sort?
Forgot to mention this in my previous email; it was a quick click on send, sorry for that. In the gist you need to look at all the lines with Vessel Load(.. . I loaded the page multiple times to catch the different timings, so you will see the line multiple times there in the log file.

Thanks, Arup Rakshit a...@zeit.io

> On 29-Sep-2018, at 12:40 AM, Arup Rakshit wrote:
>
> Hi Adrian,
>
>> The application is?
>
> It is Ruby On Rails application
>
>> The log settings are?
>> Where is it being printed?
>
> The default application log, the production.log file.
>
>> What is being printed?
>
> Check the gist:
> https://gist.github.com/aruprakshit/a6bd7ca221c9a13cd583e0827aa24ad6
>
> Thanks,
> Arup Rakshit
> a...@zeit.io
>
>> On 29-Sep-2018, at 12:28 AM, Adrian Klaver wrote:
>>
>> On 9/28/18 11:51 AM, Arup Rakshit wrote:
>>> Yes, I have shown the explain plan output. But in my application log the
>>> sql query prints 1.7 to 1.9 ms.
>>
>> So you have added another layer to the process.
>>
>> The application is?
>>
>> The log settings are?
>>
>> What is being printed?
>>
>> Where is it being printed?
>>
>>> Thanks,
>>> Arup Rakshit
>>> a...@zeit.io
>>
>> --
>> Adrian Klaver
>> adrian.kla...@aklaver.com
Re: Why my query not using index to sort?
Hi Adrian,

> The application is?

It is a Ruby On Rails application.

> The log settings are?
> Where is it being printed?

The default application log, the production.log file.

> What is being printed?

Check the gist: https://gist.github.com/aruprakshit/a6bd7ca221c9a13cd583e0827aa24ad6

Thanks, Arup Rakshit a...@zeit.io

> On 29-Sep-2018, at 12:28 AM, Adrian Klaver wrote:
>
> On 9/28/18 11:51 AM, Arup Rakshit wrote:
>> Yes, I have shown the explain plan output. But in my application log the sql
>> query prints 1.7 to 1.9 ms.
>
> So you have added another layer to the process.
>
> The application is?
>
> The log settings are?
>
> What is being printed?
>
> Where is it being printed?
>
>> Thanks,
>> Arup Rakshit
>> a...@zeit.io
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
Re: Why my query not using index to sort?
Yes, I have shown the explain plan output. But in my application log the sql query prints 1.7 to 1.9 ms. Thanks, Arup Rakshit a...@zeit.io > On 29-Sep-2018, at 12:17 AM, Adrian Klaver wrote: > > On 9/28/18 11:32 AM, Arup Rakshit wrote: >> Yes. But I thought I could improve it, so that it can be finished in 0.xx >> ms. It takes now between 1.7 to 1.9 ms in production. > > That is not what you showed in your first post: > > Planning time: 0.178 ms > Execution time: 0.527 ms > > > >> Thanks, >> Arup Rakshit >> a...@zeit.io <mailto:a...@zeit.io> >>> On 28-Sep-2018, at 9:46 PM, Andreas Kretschmer >> <mailto:andr...@a-kretschmer.de>> wrote: >>> >>> >>> >>> Am 28.09.2018 um 18:03 schrieb Arup Rakshit: >>>> Also I meatn the execution time is less than I had before with enable_sort >>>> = off. >>> >>> yeah, but not that much. different plan. >>> >>> >>> Regards, Andreas >>> >>> -- >>> 2ndQuadrant - The PostgreSQL Support Company. >>> www.2ndQuadrant.com <http://www.2ndQuadrant.com> >>> > > > -- > Adrian Klaver > adrian.kla...@aklaver.com
Re: Why my query not using index to sort?
Yes. But I thought I could improve it, so that it can be finished in 0.xx ms. It takes now between 1.7 to 1.9 ms in production. Thanks, Arup Rakshit a...@zeit.io > On 28-Sep-2018, at 9:46 PM, Andreas Kretschmer > wrote: > > > > Am 28.09.2018 um 18:03 schrieb Arup Rakshit: >> Also I meatn the execution time is less than I had before with enable_sort = >> off. > > yeah, but not that much. different plan. > > > Regards, Andreas > > -- > 2ndQuadrant - The PostgreSQL Support Company. > www.2ndQuadrant.com >
Re: Why my query not using index to sort?
Also I meant the execution time is less than I had before with enable_sort = off. Why 6 rows was a side question. :)

Thanks,
Arup Rakshit
a...@zeit.io

> On 28-Sep-2018, at 9:29 PM, Andreas Kretschmer wrote:
>
> Am 28.09.2018 um 16:49 schrieb Arup Rakshit:
>> Planning time: 1.867 ms
>> Execution time: 0.252 ms
>> (6 rows)
>>
>> Why it is showing *6 rows*? Also it seems less than what I had before:
>
> the explain-output are 6 rows ;-)
>
> Regards, Andreas
> --
> 2ndQuadrant - The PostgreSQL Support Company.
> www.2ndQuadrant.com
Re: Why my query not using index to sort?
Oh I see. That makes sense, I was reading too much into that line.. :) Thanks, Arup Rakshit a...@zeit.io > On 28-Sep-2018, at 9:29 PM, Andreas Kretschmer > wrote: > > > > Am 28.09.2018 um 16:49 schrieb Arup Rakshit: >> Planning time: 1.867 ms >> Execution time: 0.252 ms >> (6 rows) >> >> Why it is showing *6 rows*? Also it seems less than what I had before: >> > > the explain-output are 6 rows ;-) > > > Regards, Andreas > > -- > 2ndQuadrant - The PostgreSQL Support Company. > www.2ndQuadrant.com > >
Re: Why my query not using index to sort?
Hello Tom,

Here is the plan with `enable_sort = off`:

# set enable_sort = off;
SET
# explain analyze select "vessels".* from "vessels" where "vessels"."deleted_at" is null and "vessels"."company_id" = '86529964-6e9b-4bfa-ba9e-62bd24eaa954' order by "vessels"."name" ASC;
 Index Scan using inspector_tool_idx20_1 on vessels  (cost=0.27..40.76 rows=76 width=107) (actual time=0.047..0.120 rows=77 loops=1)
   Index Cond: (company_id = '86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
   Filter: (deleted_at IS NULL)
   Rows Removed by Filter: 4
 Planning time: 1.867 ms
 Execution time: 0.252 ms
(6 rows)

Why is it showing *6 rows*? Also it seems less than what I had before:

# explain analyze select "vessels".* from "vessels" where "vessels"."deleted_at" is null and "vessels"."company_id" = '86529964-6e9b-4bfa-ba9e-62bd24eaa954' order by "vessels"."name" ASC;
 Sort  (cost=17.29..17.48 rows=76 width=107) (actual time=0.789..0.796 rows=77 loops=1)
   Sort Key: name
   Sort Method: quicksort  Memory: 38kB
   ->  Bitmap Heap Scan on vessels  (cost=4.90..14.91 rows=76 width=107) (actual time=0.090..0.122 rows=77 loops=1)
         Recheck Cond: (company_id = '86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
         Filter: (deleted_at IS NULL)
         Rows Removed by Filter: 4
         Heap Blocks: exact=3
         ->  Bitmap Index Scan on inspector_tool_idx20_1  (cost=0.00..4.88 rows=81 width=0) (actual time=0.059..0.059 rows=81 loops=1)
               Index Cond: (company_id = '86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
 Planning time: 1.743 ms
 Execution time: 0.954 ms
(12 rows)

Thanks,
Arup Rakshit
a...@zeit.io

> On 28-Sep-2018, at 7:07 PM, Tom Lane wrote:
>
> Arup Rakshit writes:
>> My query is not using name index to sort the result.
>
> Given the rowcounts here, I think the planner is making the right choice.
> Sorting 70-some rows with a Sort node is probably cheaper than doing
> random disk I/O to get them in sorted order. With more rows involved,
> it might make the other choice.
>
> As a testing measure (don't do it in production!), you could set
> enable_sort = off, which will force the planner to pick a non-Sort
> plan if possible. Then you could see whether that's actually faster
> or slower, and by how much.
>
> regards, tom lane
Why my query not using index to sort?
My query is not using the name index to sort the result.

explain analyze select "vessels".* from "vessels" where "vessels"."deleted_at" is null and "vessels"."company_id" = '86529964-6e9b-4bfa-ba9e-62bd24eaa954' order by "vessels"."name" ASC;

I have the below indexes in my vessels table:

Indexes:
    "vessels_pkey" PRIMARY KEY, btree (id)
    "inspector_tool_idx20_1" btree (company_id)
    "inspector_tool_idx20_2" btree (name)

and the following is my plan output:

 Sort  (cost=17.29..17.48 rows=76 width=107) (actual time=0.468..0.473 rows=77 loops=1)
   Sort Key: name
   Sort Method: quicksort  Memory: 38kB
   ->  Bitmap Heap Scan on vessels  (cost=4.90..14.91 rows=76 width=107) (actual time=0.036..0.059 rows=77 loops=1)
         Recheck Cond: (company_id = '86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
         Filter: (deleted_at IS NULL)
         Rows Removed by Filter: 4
         Heap Blocks: exact=3
         ->  Bitmap Index Scan on inspector_tool_idx20_1  (cost=0.00..4.88 rows=81 width=0) (actual time=0.022..0.023 rows=81 loops=1)
               Index Cond: (company_id = '86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
 Planning time: 0.178 ms
 Execution time: 0.527 ms

After that I changed my index as follows, and I got almost the same plan output:

Indexes:
    "vessels_pkey" PRIMARY KEY, btree (id)
    "inspector_tool_idx20_1" btree (company_id, name)

 Sort  (cost=17.29..17.48 rows=76 width=107) (actual time=0.475..0.480 rows=77 loops=1)
   Sort Key: name
   Sort Method: quicksort  Memory: 38kB
   ->  Bitmap Heap Scan on vessels  (cost=4.90..14.91 rows=76 width=107) (actual time=0.038..0.058 rows=77 loops=1)
         Recheck Cond: (company_id = '86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
         Filter: (deleted_at IS NULL)
         Rows Removed by Filter: 4
         Heap Blocks: exact=3
         ->  Bitmap Index Scan on inspector_tool_idx20_1  (cost=0.00..4.88 rows=81 width=0) (actual time=0.025..0.025 rows=81 loops=1)
               Index Cond: (company_id = '86529964-6e9b-4bfa-ba9e-62bd24eaa954'::uuid)
 Planning time: 0.168 ms
 Execution time: 0.533 ms

Is there any way I can improve the sorting so that it uses the index?

Thanks,
Arup Rakshit
a...@zeit.io
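One way to let the query above read rows already in name order is a partial composite index matching the WHERE and ORDER BY clauses exactly. This is a sketch, not from the thread; the index name is illustrative, and with only ~77 matching rows the planner may still prefer the bitmap-scan-plus-quicksort plan because it estimates it as cheaper:

```sql
-- Equality column first, then the sort column, with the constant
-- predicate folded into a partial index (hypothetical index name).
CREATE INDEX vessels_company_name_live_idx
    ON vessels (company_id, name)
    WHERE deleted_at IS NULL;

-- With this index the planner *can* satisfy the ORDER BY via an index
-- scan with no separate Sort node, when costs favor it:
EXPLAIN ANALYZE
SELECT *
FROM vessels
WHERE deleted_at IS NULL
  AND company_id = '86529964-6e9b-4bfa-ba9e-62bd24eaa954'
ORDER BY name ASC;
```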
Re: Help to understand Actual Rows vs Plan Rows from the query planner output
Hi, > This is a broad topic, and I can only give you some hints. Yes when I am reading this https://www.postgresql.org/docs/10/static/using-explain.html <https://www.postgresql.org/docs/10/static/using-explain.html> I found the doc said like “...Plan-reading is an art that requires some experience to master, but this section attempts to cover the basics..” Lets see how can I get there. :) I probably need to ask a lot of questions. Thanks for sharing those points. I made a note about them. Thanks, Arup Rakshit a...@zeit.io > On 25-Sep-2018, at 11:45 AM, Laurenz Albe wrote: > > Arup Rakshit wrote: >> Thanks for the link. I read the documentation you linked, and part of it >> I understood and rest went above my head. Probably I need to read it multiple >> times to understand what is going on. I am learning how indexing works in >> DBMS. >> Mostly I understood Btree so far. I am an application developer. >> Being an application developer I think I need to know which column should be >> indexed and what kind of index to apply in which case. > > That is true; a truth that many developers unfortunately ignore. > >> Most of the time, when >> I see slow query, I saw people ask to run the explain to see the plan. >> And explain statement shows lot of data about the query. So my questions is >> that: >> Which part I should read from the plan output to figure out reason of >> slowness >> or what need to be used to improve it. What are basic things I should know >> about it. >> I think, I don’t need to internal math for this, am I right? The query in >> this post is not a slow query, it is something I ran to see how index merge >> happens. >> I am asking generally. Can you give me some directions on this, so that I can >> build up some bases on this subject. > > This is a broad topic, and I can only give you some hints. > In order to understand EXPLAIN output and to improve your query, you need > to know some of how the database is implemented. 
> > You have to understand index scans, index only scans and bitmap index scans. > You have to understand nested loop, hash and merge joins. > You have to understand table statistics, dead tuples and table bloat. > > The first things to look for in EXPLAIN (ANALYZE, BUFFERS) output is in which > nodes the time is spent, and where the estimated number of rows diverges > significantly from the actual number of rows. > The former are the spots where there is room for improvement, and the latter > is often the root cause of a bad plan choice. > Also, watch out for the nodes that touch a lot of blocks. > They can cause intermittent slow-down if the blocks are not in cache. > >> Also what are the best resources to learn GIST, GIN indexes — something which >> teaches it from the ground up? > > The documentation, and for what goes beyond that, the source. > > Yours, > Laurenz Albe > -- > Cybertec | https://www.cybertec-postgresql.com >
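The checklist above can be tried directly in psql. A minimal, hypothetical session (table and predicate are placeholders) looks like this:

```sql
-- ANALYZE executes the query and reports actual row counts and timings
-- per plan node; BUFFERS adds shared-block hit/read counts per node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM users
WHERE lower(state) = 'ohio';

-- Reading the output: compare each node's estimated "rows=" against its
-- "actual ... rows=", and check "Buffers: shared hit=... read=..." to
-- see which nodes touch many blocks (the "read" part missed the cache).
```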
Re: Help to understand Actual Rows vs Plan Rows from the query planner output
Hello Tomas,

Thanks for the link. I read the documentation you linked; part of it I understood and the rest went above my head. Probably I need to read it multiple times to understand what is going on. I am learning how indexing works in a DBMS, and mostly I have understood B-tree so far. I am an application developer, and as such I think I need to know which column should be indexed and what kind of index to apply in which case. Most of the time when I see a slow query, I see people ask to run explain to see the plan, and the explain statement shows a lot of data about the query. So my questions are: Which part of the plan output should I read to figure out the reason for slowness, or what needs to be changed to improve it? What are the basic things I should know about it? I think I don't need the internal math for this, am I right? The query in this post is not a slow query; it is something I ran to see how index merge happens. I am asking generally. Can you give me some directions on this, so that I can build up some basis on this subject? Also, what are the best resources to learn GiST and GIN indexes, something which teaches them from the ground up?

Thanks,
Arup Rakshit
a...@zeit.io

> On 24-Sep-2018, at 4:11 AM, Tomas Vondra wrote:
>
> On 09/23/2018 10:21 PM, Arup Rakshit wrote:
>> Hello I have some questions related to the query plan output about the planned and actual rows.
In the following example: >> >> # explain (analyze true, costs true, format yaml) select * from users >> where lower(city) = 'melanyfort' and lower(state) = 'ohio'; >> QUERY PLAN >> >> >> - Plan: >> + >> Node Type: "Bitmap Heap Scan" >> + >> Parallel Aware: false >> + >> Relation Name: "users" >> + >> Alias: "users" >> + >> Startup Cost: 10.78 >> + >> Total Cost: 14.80 >> + >> Plan Rows: 1 >> + >> Plan Width: 73 >> + >> Actual Startup Time: 0.155 >> + >> Actual Total Time: 0.155 >> + >> Actual Rows: 0 >> + >> Actual Loops: 1 >> + >> Recheck Cond: "((lower((city)::text) = 'melanyfort'::text) AND >> (lower((state)::text) = 'ohio'::text))"+ >> Rows Removed by Index Recheck: 0 >> + >> Exact Heap Blocks: 0 >> + >> Lossy Heap Blocks: 0 >> + >> Plans: >> + >>- Node Type: "BitmapAnd" >> + >> Parent Relationship: "Outer" >> + >> Parallel Aware: false >> + >> Startup Cost: 10.78 >> + >> Total Cost: 10.78 >> + >> Plan Rows: 1 >&
Help to understand Actual Rows vs Plan Rows from the query planner output
        Actual Rows: 1
        Actual Loops: 1
        Index Cond: "(lower((city)::text) = 'melanyfort'::text)"
      - Node Type: "Bitmap Index Scan"
        Parent Relationship: "Member"
        Parallel Aware: false
        Index Name: "lower_state_users_idx"
        Startup Cost: 0.00
        Total Cost: 5.87
        Plan Rows: 211
        Plan Width: 0
        Actual Startup Time: 0.102
        Actual Total Time: 0.102
        Actual Rows: 211
        Actual Loops: 1
        Index Cond: "(lower((state)::text) = 'ohio'::text)"
  Planning Time: 0.260
  Triggers:
  Execution Time: 0.249
(1 row)

In the first "Bitmap Index Scan" node, on "users_lower_idx", I see the plan rows are 50 but the actual rows are 1. In the second "Bitmap Index Scan" node, on "lower_state_users_idx", the plan rows are 211 and the actual rows are 211, both the same. Based on what conditions does the planner estimate the planned and actual row counts? In the "BitmapAnd" node I again see actual rows 1, so why does the final node, the "Bitmap Heap Scan", again estimate 1 row? How does it count these? What does the Loops count tell us about the query?

Thanks,
Arup Rakshit
a...@zeit.io
Bitmap Heap Scan and Bitmap Index Scan
Here is an explain plan of a very simple query:

aruprakshit=# explain analyze select first_name, last_name from users where lower(state) = 'colorado';
 Bitmap Heap Scan on users  (cost=5.86..161.40 rows=203 width=13) (actual time=0.134..0.444 rows=203 loops=1)
   Recheck Cond: (lower((state)::text) = 'colorado'::text)
   Heap Blocks: exact=106
   ->  Bitmap Index Scan on lower_state_users_idx  (cost=0.00..5.81 rows=203 width=0) (actual time=0.098..0.098 rows=203 loops=1)
         Index Cond: (lower((state)::text) = 'colorado'::text)
 Planning time: 0.263 ms
 Execution time: 0.517 ms
(7 rows)

I read https://www.postgresql.org/message-id/12553.1135634231%40sss.pgh.pa.us and https://www.postgresql.org/message-id/464F3C5D.2000700%40enterprisedb.com to understand what bitmap heap scans and bitmap index scans are, but some questions remain that I am not able to figure out yet. Does a bitmap index scan apply when a normal index scan is costly? Does the bitmap always store the page numbers of matching tuples instead of the tuples themselves? What is "Heap Blocks: exact=106"? Why is the cost higher in the heap scan than in the index scan?

Thanks,
Arup Rakshit
a...@zeit.io
Re: Can I add Index to make a query faster which involves joins on unnest ?
Hello Rob, So how do I improve this query speed? Thanks, Arup Rakshit a...@zeit.io > On 14-Sep-2018, at 12:27 AM, Rob Sargent wrote: > > > >> On Sep 13, 2018, at 12:17 PM, Arup Rakshit > <mailto:a...@zeit.io>> wrote: >> >> The below query basically gives the result by maintaining the order of the >> sizes in the list. >> >> explain analyze select >> "price_levels"."name", >> "price_levels"."size" >> from >> "price_levels" >> join unnest(array['M', >> 'L', >> 'XL', >> '2XL', >> '3XL', >> '4XL', >> '5XL', >> '6XL', >> 'S']) with ordinality t(size, >> ord) >> using (size) >> order by >> t.size >> >> >> I have a Btree index on the size column. >> >> Explain output is: >> >> Merge Join (cost=4.61..5165.38 rows=6 width=46) (actual >> time=0.157..57.872 rows=6 loops=1) >> Merge Cond: ((price_levels.size)::text = t.size) >> -> Index Scan using price_levels_size_idx on price_levels >> (cost=0.29..4111.05 rows=6 width=14) (actual time=0.044..25.941 >> rows=6 loops=1) >> -> Sort (cost=4.32..4.57 rows=100 width=32) (actual time=0.108..3.946 >> rows=53289 loops=1) >> Sort Key: t.size >> Sort Method: quicksort Memory: 25kB >> -> Function Scan on unnest t (cost=0.00..1.00 rows=100 width=32) >> (actual time=0.030..0.033 rows=9 loops=1) >> Planning time: 0.667 ms >> Execution time: 62.846 ms >> >> >> >> Thanks, >> >> Arup Rakshit >> a...@zeit.io <mailto:a...@zeit.io> >> >> > There are not value of size fit it to be a worthwhile key. >> >
Can I add Index to make a query faster which involves joins on unnest ?
The below query basically gives the result by maintaining the order of the sizes in the list.

explain analyze select
    "price_levels"."name",
    "price_levels"."size"
from
    "price_levels"
join unnest(array['M', 'L', 'XL', '2XL', '3XL', '4XL', '5XL', '6XL', 'S']) with ordinality t(size, ord)
    using (size)
order by
    t.size

I have a Btree index on the size column.

Explain output is:

 Merge Join  (cost=4.61..5165.38 rows=6 width=46) (actual time=0.157..57.872 rows=6 loops=1)
   Merge Cond: ((price_levels.size)::text = t.size)
   ->  Index Scan using price_levels_size_idx on price_levels  (cost=0.29..4111.05 rows=6 width=14) (actual time=0.044..25.941 rows=6 loops=1)
   ->  Sort  (cost=4.32..4.57 rows=100 width=32) (actual time=0.108..3.946 rows=53289 loops=1)
         Sort Key: t.size
         Sort Method: quicksort  Memory: 25kB
         ->  Function Scan on unnest t  (cost=0.00..1.00 rows=100 width=32) (actual time=0.030..0.033 rows=9 loops=1)
 Planning time: 0.667 ms
 Execution time: 62.846 ms

Thanks,
Arup Rakshit
a...@zeit.io
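An alternative sketch (not from the thread) avoids joining against unnest altogether: filter with = ANY and sort with array_position (available since 9.5), so no function-scan row estimate is involved. Table and column names are taken from the query above, and size is assumed to be a text-like column. Note also that the posted query sorts by t.size, which is alphabetical; to keep the order of the list it would need to sort by t.ord, which is what array_position reproduces here:

```sql
-- Keep only the listed sizes and sort them in list order.
SELECT name, size
FROM price_levels
WHERE size = ANY (ARRAY['M','L','XL','2XL','3XL','4XL','5XL','6XL','S'])
ORDER BY array_position(
    ARRAY['M','L','XL','2XL','3XL','4XL','5XL','6XL','S'],
    size::text);
```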
Re: Select rows when all ids of its children records matches
Hi Rob, I figured it out. thanks. It is giving correct data. > Aren't you looking for cte.tags = array[1,2]? posts_tags_cte has tags column, so I am using it. Thanks, Arup Rakshit a...@zeit.io > On 12-Sep-2018, at 9:47 PM, Rob Sargent wrote: > > > > On 09/12/2018 10:08 AM, Arup Rakshit wrote: >> I tried : >> >> WITH posts_tags_cte AS ( >> SELECT post_id, array_agg(tag_id) as tags >> FROM posts_tags >> WHERE tag_id in (1, 2) >> GROUP BY post_id >> ) >> SELECT posts.id <http://posts.id/> FROM posts_tags_cte JOIN posts ON >> posts.id <http://posts.id/> = posts_tags_cte.post_id >> WHERE posts_tags_cte.tags @> array[1, 2]::int8[] >> >> But it gives me all the posts. >> >> > Aren't you looking for cte.tags = array[1,2]? >
Re: Select rows when all ids of its children records matches
I tried : WITH posts_tags_cte AS ( SELECT post_id, array_agg(tag_id) as tags FROM posts_tags WHERE tag_id in (1, 2) GROUP BY post_id ) SELECT posts.id FROM posts_tags_cte JOIN posts ON posts.id = posts_tags_cte.post_id WHERE posts_tags_cte.tags @> array[1, 2]::int8[] But it gives me all the posts. Thanks, Arup Rakshit a...@zeit.io > On 12-Sep-2018, at 9:14 PM, Arup Rakshit wrote: > > Can you show me the SQL construction? Do I need to use `WITH`? > > > Thanks, > > Arup Rakshit > a...@zeit.io <mailto:a...@zeit.io> > > > >> On 12-Sep-2018, at 9:13 PM, David G. Johnston > <mailto:david.g.johns...@gmail.com>> wrote: >> >> On Wednesday, September 12, 2018, Arup Rakshit > <mailto:a...@zeit.io>> wrote: >> IN is OR, I want the AND logic. Select posts which has tag 1, 2 and 3 ( tag >> ids ) >> >> Build arrays and then use the “contains” operator. >> >> David J. >
Re: Select rows when all ids of its children records matches
Can you show me the SQL construction? Do I need to use `WITH`? Thanks, Arup Rakshit a...@zeit.io > On 12-Sep-2018, at 9:13 PM, David G. Johnston > wrote: > > On Wednesday, September 12, 2018, Arup Rakshit <mailto:a...@zeit.io>> wrote: > IN is OR, I want the AND logic. Select posts which has tag 1, 2 and 3 ( tag > ids ) > > Build arrays and then use the “contains” operator. > > David J.
Re: Select rows when all ids of its children records matches
IN is OR, I want the AND logic. Select posts which has tag 1, 2 and 3 ( tag ids ) Thanks, Arup Rakshit a...@zeit.io > On 12-Sep-2018, at 8:58 PM, Ron wrote: > > Maybe this: > select p.id, p.name > from posts p, > posts_tags pt, > tags t > where t.id in (1, 2, 3) > and t.id = pt.tag_id > and pt.post_id = p.id; > > > On 09/12/2018 10:23 AM, Arup Rakshit wrote: >> I have a table posts(id, name), posts_tags(post_id, tag_id) and tags (id, >> name) ... I want to get all posts which has tag id 1, 2 and 3 for example. >> How should I do this? I tried ALL, but it didn’t work. >> >> Those tag ids comes from UI by users, so I am looking for generic approach. >> >> >> Thanks, >> >> Arup Rakshit >> a...@zeit.io <mailto:a...@zeit.io> >> >> >> > > -- > Angular momentum makes the world go 'round.
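A common alternative to the array-contains approach for "posts having all of these tags" is relational division with GROUP BY and HAVING, which needs no array casts. This is a sketch using the table names from the thread:

```sql
-- Keep only posts that match every requested tag.
SELECT p.id, p.name
FROM posts p
JOIN posts_tags pt ON pt.post_id = p.id
WHERE pt.tag_id IN (1, 2, 3)
GROUP BY p.id, p.name
HAVING count(DISTINCT pt.tag_id) = 3;  -- 3 = number of requested tags
```

The HAVING count must equal the number of tags in the IN list; DISTINCT guards against duplicate (post_id, tag_id) rows.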
Why order by column not using index with distinct keyword in select clause?
I have defined a simple B-tree index on the *country* column of the users table. I don't understand why the order-by column is not using an index scan when the *distinct* keyword is used in the select clause. Can anyone explain what is happening here?

aruprakshit=# \d users;
                                      Table "public.users"
   Column   |         Type          | Collation | Nullable |              Default
------------+-----------------------+-----------+----------+-----------------------------------
 city       | character varying     |           |          |
 last_name  | character varying(50) |           |          |
 country    | character varying(50) |           |          |
 sequence   | integer               |           |          |
 first_name | character varying(50) |           |          |
 state      | character varying(50) |           |          |
 email      | character varying     |           |          |
 id         | smallint              |           | not null | nextval('users_id_seq'::regclass)
Indexes:
    "users_pk" PRIMARY KEY, btree (id)

aruprakshit=# explain analyze select distinct country from users order by country asc;
 Sort  (cost=283.19..283.85 rows=263 width=11) (actual time=10.525..10.555 rows=263 loops=1)
   Sort Key: country
   Sort Method: quicksort  Memory: 38kB
   ->  HashAggregate  (cost=269.99..272.62 rows=263 width=11) (actual time=8.469..8.521 rows=263 loops=1)
         Group Key: country
         ->  Seq Scan on users  (cost=0.00..244.99 rows= width=11) (actual time=0.022..3.428 rows= loops=1)
 Planning time: 0.358 ms
 Execution time: 10.634 ms
(8 rows)

aruprakshit=# explain analyze select country from users order by country asc;
 Index Only Scan using country on users  (cost=0.29..886.27 rows= width=11) (actual time=0.083..7.581 rows= loops=1)
   Heap Fetches:
 Planning time: 0.118 ms
 Execution time: 8.332 ms
(4 rows)

aruprakshit=# explain analyze select * from users order by country asc;
 Index Scan using country on users  (cost=0.29..886.27 rows= width=73) (actual time=0.015..8.432 rows= loops=1)
 Planning time: 0.213 ms
 Execution time: 9.086 ms
(3 rows)

Thanks,
Arup Rakshit
a...@zeit.io
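One hedged workaround, not from the thread: the same result can be written with GROUP BY, which the planner may implement as a sorted Group node fed by an index scan rather than HashAggregate plus Sort. Whether it actually does depends on costing and work_mem, so this is something to test, not a guarantee:

```sql
-- Equivalent to: SELECT DISTINCT country FROM users ORDER BY country ASC;
-- Expressed as grouping, assuming a btree index on country exists, the
-- planner has the option of reading the index in order and de-duplicating
-- on the fly instead of hashing and then sorting.
SELECT country
FROM users
GROUP BY country
ORDER BY country ASC;
```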
Re: Why my query not doing index only scan
Hello Stephen,

I would like to ask one more question related to this topic. When I take a dump from production and restore it to the development DB, what are the commands I generally need to run to make the dev DB quack close to production?

Thanks,
Arup Rakshit
a...@zeit.io

> On 10-Sep-2018, at 4:58 PM, Stephen Frost wrote:
>
> Greetings,
>
> * Arup Rakshit (a...@zeit.io) wrote:
>> I have an index defined "inspector_tool_idx4_1" UNIQUE, btree (company_id, item_code, deleted_at). Now I am using the *company_id* column in the where clause, and the selecting just the *item_code* field for all matching rows. I expected here the planner will do a index only scans. But it is doing bitmap index scan. Any idea what it is not doing what I expected it to do.
>
> One possibility is that the visibility map isn't current.
>
> Indexes don't include visibility information. The way an index-only
> scan works is that we track pages which are 'all visible' (meaning that
> every tuple on that page is visible to all running transactions) in a
> seperate file called the 'visibility map' (aka the VM). The VM is
> updated by the VACUUM process- but we only automatically run a VACUUM
> (with the autovacuum process) when thresholds have been reached for the
> number of UPDATE'd or DELETE'd tuples.
>
> What this means is that if you are playing around in development and
> just COPY a bunch of rows into a table, but don't ever UPDATE or DELETE
> the rows in that table, then you'll almost never get an index-only scan
> because the VM won't be current (and PG knows this).
>
> Make sure to do a VACUUM after loading data (and letting any ongoing
> transactions finish) and then re-test. That should make it sure that
> the VM is current and make it more likely that PG will do an index-only
> scan. Not a guarantee still, but that's the first thing I'd try, based
> on what you've shared here.
>
> Thanks!
>
> Stephen
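Following the VACUUM advice quoted above, a minimal post-restore sequence looks like this. It is a sketch; what else is needed (extensions, roles, settings) depends on how the dump was taken:

```sql
-- VACUUM rebuilds the visibility map, which index-only scans rely on;
-- ANALYZE refreshes the statistics the planner uses for row estimates.
-- Run it database-wide after the restore completes.
VACUUM ANALYZE;
```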
Re: Why my query not doing index only scan
Thanks Stephen. After running the vacuum on the table it worked. inspection_development=# explain analyze select item_code from inspector_tool_components where company_id = '7881ff2e-0557-4734-9da8-2d33072ff1ef'; QUERY PLAN - Index Only Scan using inspector_tool_idx4_1 on inspector_tool_components (cost=0.41..57.87 rows=1226 width=8) (actual time=0.044..0.626 rows=1232 loops=1) Index Cond: (company_id = '7881ff2e-0557-4734-9da8-2d33072ff1ef'::uuid) Heap Fetches: 0 Planning time: 0.190 ms Execution time: 0.778 ms (5 rows) Thanks, Arup Rakshit a...@zeit.io > On 10-Sep-2018, at 4:58 PM, Stephen Frost wrote: > > VACUUM
Why my query not doing index only scan
Hello All,

I am learning at this point how indexes work in a DBMS, so I am looking for a very broad explanation to clear my basics. I have an index defined "inspector_tool_idx4_1" UNIQUE, btree (company_id, item_code, deleted_at). Now I am using the *company_id* column in the where clause, and selecting just the *item_code* field for all matching rows. I expected the planner to do an index-only scan here, but it is doing a bitmap index scan. Any idea why it is not doing what I expected it to do?

inspection_development=# explain analyze select item_code from inspector_tool_components where company_id = '7881ff2e-0557-4734-9da8-2d33072ff1ef';
 Bitmap Heap Scan on inspector_tool_components  (cost=45.92..1360.48 rows=1226 width=8) (actual time=0.382..1.202 rows=1232 loops=1)
   Recheck Cond: (company_id = '7881ff2e-0557-4734-9da8-2d33072ff1ef'::uuid)
   Heap Blocks: exact=81
   ->  Bitmap Index Scan on inspector_tool_idx4_1  (cost=0.00..45.61 rows=1226 width=0) (actual time=0.347..0.347 rows=1232 loops=1)
         Index Cond: (company_id = '7881ff2e-0557-4734-9da8-2d33072ff1ef'::uuid)
 Planning time: 0.207 ms
 Execution time: 1.358 ms
(7 rows)

Thanks,
Arup Rakshit
a...@zeit.io
Re: help to query json column
Thanks Charles. that worked. > On Mar 7, 2018, at 12:27 PM, Charles Clavadetscher > <clavadetsc...@swisspug.org> wrote: > > Hello > >> -Original Message- >> From: Arup Rakshit [mailto:aruprakshit1...@outlook.com] >> Sent: Mittwoch, 7. März 2018 05:41 >> To: pgsql-general@lists.postgresql.org >> Subject: help to query json column >> >> Hi, >> >> I would like to select only rows where signature has a non null value. My >> json looks like : >> >> { >> "carInspection": { >>"damages": [ >> { >>"x": 68.670309653916, >>"y": 44.08014571949, >>"errors": [ >> { >>"code": "BR", >>"description": "Gebrochen" >> }, >> { >>"code": "F", >>"description": "Reifen platt" >> } >>] >> }, >> { >>"x": 40.8014571949, >>"y": 50.273224043716, >>"errors": [ >> { >>"code": "BR", >>"description": "Gebrochen" >> }, >> { >>"code": "F", >>"description": "Reifen platt" >> } >>] >> }, >> { >>"x": 48.269581056466, >>"y": 37.340619307832, >>"errors": [ >> { >>"code": "F", >>"description": "Reifen platt" >> } >>] >> }, >> { >>"x": 49.180327868852, >>"y": 15.482695810565, >>"errors": [ >> { >>"code": "F", >>"description": "Reifen platt" >> } >>] >> } >>], >>"layoutURL": "default", >>"signature1": "> 0 1000 200\">\n
help to query json column
Hi, I would like to select only rows where signature has a non null value. My json looks like : { "carInspection": { "damages": [ { "x": 68.670309653916, "y": 44.08014571949, "errors": [ { "code": "BR", "description": "Gebrochen" }, { "code": "F", "description": "Reifen platt" } ] }, { "x": 40.8014571949, "y": 50.273224043716, "errors": [ { "code": "BR", "description": "Gebrochen" }, { "code": "F", "description": "Reifen platt" } ] }, { "x": 48.269581056466, "y": 37.340619307832, "errors": [ { "code": "F", "description": "Reifen platt" } ] }, { "x": 49.180327868852, "y": 15.482695810565, "errors": [ { "code": "F", "description": "Reifen platt" } ] } ], "layoutURL": "default", "signature1":
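For the JSON shape shown above, a query along these lines selects only rows with a non-null signature. The table name (inspections) and column name (data) are assumptions, since they do not appear in the message; the path operator works for both json and jsonb columns:

```sql
-- #>> extracts the value at the given path as text; it yields NULL both
-- when the key is missing and when the stored value is JSON null, so a
-- plain IS NOT NULL check covers both cases.
SELECT *
FROM inspections
WHERE data #>> '{carInspection,signature1}' IS NOT NULL;
```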