Re: Collecting statistics about contents of JSONB columns

2022-05-17 Thread Tomas Vondra
On 5/17/22 13:44, Mahendra Singh Thalor wrote:
> ...
>
> Hi Nikita,
> I and Tomas discussed the design for disabling all-paths
> collection(collect stats for only some paths). Below are some
> thoughts/doubts/questions.
>
> *Point 1)* Please can you elaborate more that how are you going to
>

Re: Collecting statistics about contents of JSONB columns

2022-05-17 Thread Mahendra Singh Thalor
On Fri, 11 Mar 2022 at 04:29, Nikita Glukhov wrote: > > > On 04.02.2022 05:47, Tomas Vondra wrote: > > On 1/25/22 17:56, Mahendra Singh Thalor wrote: > > > > ... > > For the last few days, I was trying to understand these patches, and based on Tomas's suggestion, I was doing some performance

Re: Collecting statistics about contents of JSONB columns

2022-04-07 Thread Justin Pryzby
I noticed some typos.

diff --git a/src/backend/utils/adt/jsonb_selfuncs.c b/src/backend/utils/adt/jsonb_selfuncs.c
index f5520f88a1d..d98cd7020a1 100644
--- a/src/backend/utils/adt/jsonb_selfuncs.c
+++ b/src/backend/utils/adt/jsonb_selfuncs.c
@@ -1342,7 +1342,7 @@

Re: Collecting statistics about contents of JSONB columns

2022-04-01 Thread Greg Stark
This patch has bitrotted, presumably after the other JSON patchset was applied. It looks like it's failing in the json header file so it may be as simple as additional functions added on nearby lines. Please rebase. Reminder, it's the last week of the commitfest so time is of the essence

Re: Collecting statistics about contents of JSONB columns

2022-03-11 Thread Mahendra Singh Thalor
On Fri, 4 Feb 2022 at 08:30, Tomas Vondra wrote:
>
> On 2/4/22 03:47, Tomas Vondra wrote:
> > ./json-generate.py 3 2 8 1000 6 1000
>
> Sorry, this should be (different order of parameters):
>
> ./json-generate.py 3 2 1000 8 6 1000
>

Thanks, Tomas for this test case.

Hi Hackers,

For

Re: Collecting statistics about contents of JSONB columns

2022-02-03 Thread Tomas Vondra
On 2/4/22 03:47, Tomas Vondra wrote:
> ./json-generate.py 3 2 8 1000 6 1000

Sorry, this should be (different order of parameters):

./json-generate.py 3 2 1000 8 6 1000

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Collecting statistics about contents of JSONB columns

2022-02-03 Thread Tomas Vondra
On 1/25/22 17:56, Mahendra Singh Thalor wrote: > ... For the last few days, I was trying to understand these patches, and based on Tomas's suggestion, I was doing some performance tests. With the attached .SQL file, I can see that analyze is taking more time with these patches. *Setup:

Re: Collecting statistics about contents of JSONB columns

2022-01-25 Thread Greg Stark
On Thu, 6 Jan 2022 at 14:56, Tomas Vondra wrote:
>
> Not sure I understand. I wasn't suggesting any user-defined filtering,
> but something done by default, similarly to what we do for regular MCV
> lists, based on frequency. We'd include frequent paths while excluding
> rare ones.
>
> So no
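The frequency-based filtering described in the quoted text (keep frequent paths, drop rare ones, with no user-defined rules) could look roughly like the sketch below. This is only an illustration: the function name, the `min_freq` cutoff, and the `max_paths` cap are assumptions made up for the example and are not taken from the patch.

```python
def filter_frequent_paths(path_freqs, min_freq=0.1, max_paths=100):
    """Keep the most frequent paths, dropping rare ones.

    path_freqs maps a path string to the fraction of sampled documents
    that contain it. min_freq and max_paths are hypothetical knobs.
    """
    # Drop paths that occur in too few sampled documents.
    frequent = [(p, f) for p, f in path_freqs.items() if f >= min_freq]
    # Most frequent first; ties broken by path name for a stable result.
    frequent.sort(key=lambda item: (-item[1], item[0]))
    # Cap the number of paths that get their own statistics.
    return dict(frequent[:max_paths])


path_freqs = {"$": 1.0, "$.a": 1.0, "$.b": 0.5, "$.rare": 0.001, "$.tags": 0.25}
kept = filter_frequent_paths(path_freqs, min_freq=0.1, max_paths=3)
print(kept)
```

With these made-up inputs, `$.rare` falls below the cutoff and `$.tags` is cut by the cap, so only three paths would get per-path statistics.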

Re: Collecting statistics about contents of JSONB columns

2022-01-25 Thread Mahendra Singh Thalor
On Tue, 25 Jan 2022 at 03:50, Tomas Vondra wrote: > > On 1/23/22 01:24, Nikita Glukhov wrote: > > Hi! > > > > I am glad that you found my very old patch interesting and started to > > work on it. We failed to post it in 2016 mostly because we were not > > satisfied with JSONB storage. Also we

Re: Collecting statistics about contents of JSONB columns

2022-01-24 Thread Tomas Vondra
On 1/23/22 01:24, Nikita Glukhov wrote: Hi! I am glad that you found my very old patch interesting and started to work on it. We failed to post it in 2016 mostly because we were not satisfied with JSONB storage. Also we decided to wait for completion of work on extended statistics as we

Re: Collecting statistics about contents of JSONB columns

2022-01-06 Thread Tomas Vondra
On 1/5/22 21:22, Simon Riggs wrote: On Fri, 31 Dec 2021 at 22:07, Tomas Vondra wrote: The patch does something far more elegant - it simply uses stavalues to store an array of JSONB documents, each describing stats for one path extracted from the sampled documents. Sounds good. I'm
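To make "an array of JSONB documents, each describing stats for one path" more concrete, here is a small sketch that walks a sample of JSON documents and emits one stats document per path. The layout (`path`, `freq`), the `$.key` / `[*]` path notation, and all function names are assumptions for illustration only; the actual format stored in stavalues by the patch is not shown in this thread excerpt.

```python
import json
from collections import defaultdict


def iter_paths(value, prefix="$"):
    """Yield every path present in a JSON value, starting at the root."""
    yield prefix
    if isinstance(value, dict):
        for key, child in value.items():
            yield from iter_paths(child, f"{prefix}.{key}")
    elif isinstance(value, list):
        # All array elements share one path (hypothetical notation).
        for child in value:
            yield from iter_paths(child, f"{prefix}[*]")


def collect_path_stats(sample):
    """Return one stats document per path seen in the sampled documents."""
    docs_with_path = defaultdict(int)
    for doc in sample:
        # Count each path at most once per document.
        for path in set(iter_paths(doc)):
            docs_with_path[path] += 1
    total = len(sample)
    return [
        {"path": path, "freq": count / total}
        for path, count in sorted(docs_with_path.items())
    ]


sample = [
    {"a": 1, "b": {"c": "x"}},
    {"a": 2},
    {"a": 3, "b": {"c": "y"}, "tags": [1, 2]},
    {"a": 4},
]
path_stats = collect_path_stats(sample)
for entry in path_stats:
    print(json.dumps(entry))
```

A real implementation would presumably attach more per-path detail (for example null fractions or value distributions) to each document; the sketch only tracks how often each path appears.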

Re: Collecting statistics about contents of JSONB columns

2022-01-06 Thread Tomas Vondra
On 1/1/22 22:16, Zhihong Yu wrote:
Hi,

+static JsonPathStats
+jsonStatsFindPathStats(JsonStats jsdata, char *path, int pathlen)

Stats appears twice in the method name. I think findJsonPathStats() should suffice.

It should check `if (jsdata->nullfrac >= 1.0)` as jsonStatsGetPathStatsStr does.

Re: Collecting statistics about contents of JSONB columns

2022-01-06 Thread Tomas Vondra
On 1/1/22 16:33, Zhihong Yu wrote:
Hi,
For patch 1:

+   List       *statisticsName = NIL;   /* optional stats estimat. procedure */

I think if the variable is named estimatorName (or something similar), it would be easier for people to grasp its purpose.

I agree "statisticsName" might

Re: Collecting statistics about contents of JSONB columns

2022-01-05 Thread Simon Riggs
On Fri, 31 Dec 2021 at 22:07, Tomas Vondra wrote:

> The patch does something far more
> elegant - it simply uses stavalues to store an array of JSONB documents,
> each describing stats for one path extracted from the sampled documents.

Sounds good.

> I'm sure there's plenty open questions -

Re: Collecting statistics about contents of JSONB columns

2022-01-05 Thread Thomas Munro
On Sat, Jan 1, 2022 at 11:07 AM Tomas Vondra wrote: > 0006-Add-jsonb-statistics-20211230.patch Hi Tomas, -CREATE OR REPLACE FUNCTION explain_jsonb(sql_query text) +CREATE OR REPLACE FUNCTION explain_jsonb(sql_query text) https://cirrus-ci.com/task/6405547984420864 It looks like there is a

Re: Collecting statistics about contents of JSONB columns

2022-01-01 Thread Zhihong Yu
On Sat, Jan 1, 2022 at 7:33 AM Zhihong Yu wrote: > > > On Fri, Dec 31, 2021 at 2:07 PM Tomas Vondra < > tomas.von...@enterprisedb.com> wrote: > >> Hi, >> >> One of the complaints I sometimes hear from users and customers using >> Postgres to store JSON documents (as JSONB type, of course) is

Re: Collecting statistics about contents of JSONB columns

2022-01-01 Thread Zhihong Yu
On Fri, Dec 31, 2021 at 2:07 PM Tomas Vondra wrote: > Hi, > > One of the complaints I sometimes hear from users and customers using > Postgres to store JSON documents (as JSONB type, of course) is that the > selectivity estimates are often pretty poor. > > Currently we only really have MCV and