Where are the 50 "t* columns? On Mon, Dec 23, 2024 at 1:26 PM Divyansh Gupta JNsThMAudy < ag1567...@gmail.com> wrote:
> Ron here is the entire table schema FYI, userid is the mandate column on > which filter is always applies: > > CREATE TABLE dbo.googledocs_tbl ( > > gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE 1 > MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL, > > userid int8 NOT NULL, > > useremail varchar(600) NOT NULL, > > title public.citext NULL, > > authorname varchar(600) NULL, > > authoremail varchar(600) NULL, > > updated varchar(300) NOT NULL, > > entryid varchar(2000) NOT NULL, > > lastmodifiedby varchar(600) NULL, > > lastmodifiedbyemail varchar(600) NULL, > > "size" varchar(300) NULL, > > contenttype varchar(250) NULL, > > fileextension varchar(50) NULL, > > docfoldername public.citext NULL, > > folderresourceid public.citext NULL, > > filesize int8 DEFAULT 0 NOT NULL, > > retentionstatus int2 DEFAULT 0 NOT NULL, > > docfileref int8 NULL, > > usid int4 NULL, > > archivepath varchar(500) NULL, > > createddate timestamp(6) DEFAULT NULL::timestamp without time zone NULL, > > zipfilename varchar(100) NULL, > > oncreatedat timestamp(6) DEFAULT clock_timestamp() NOT NULL, > > onupdateat timestamp(6) DEFAULT clock_timestamp() NOT NULL, > > startsnapshot int4 DEFAULT 0 NOT NULL, > > currentsnapshot int4 DEFAULT 0 NOT NULL, > > dismiss int2 DEFAULT 0 NOT NULL, > > checksum varchar NULL, > > typeoffile int2 GENERATED ALWAYS AS ( > > CASE > > WHEN authoremail::text = useremail::text THEN 0::smallint > > ELSE 1::smallint > > END) STORED NOT NULL, > > parquetfilename varchar(100) NULL, > > metadata_vector tsvector GENERATED ALWAYS AS (to_tsvector('english':: > regconfig, (((((COALESCE(title::character varying, ''::text::character > varying)::text || ' '::text) || (COALESCE(docfoldername::text, ''::text) > || ' '::text)) || (COALESCE(authorname, ''::text::character varying)::text > || ' '::text)) || (COALESCE(fileextension, ''::text::character varying):: > text || ' '::text)) || (COALESCE(lastmodifiedby, ''::text::character > varying)::text || ' '::text)) || COALESCE(contenttype, ''::character > varying::text::character varying)::text)) STORED NULL, > > isfileencrypted int4 DEFAULT 0 NULL, > > addons_json jsonb DEFAULT '{}'::jsonb NULL, > > CONSTRAINT googledocs_tbl_clone_pkey PRIMARY KEY (gdid, userid), > > CONSTRAINT fk_googledocs_tbl_clone_users_tbl FOREIGN KEY (userid) > REFERENCES dbo.users_tbl(uid) ON DELETE CASCADE > > ) > > PARTITION BY HASH (userid); > > On Mon, Dec 23, 2024 at 11:53 PM Divyansh Gupta JNsThMAudy < > ag1567...@gmail.com> wrote: > >> Adrian, Please check this out; >> >> PARTITION BY HASH (userid); CREATE TABLE dbo.googledocs_tbl_clone_part_0 >> PARTITION OF dbo.googledocs_tbl_clone FOR VALUES WITH (modulus 84, >> remainder 0); ... CREATE TABLE dbo.googledocs_tbl_clone_part_83 PARTITION >> OF dbo.googledocs_tbl_clone FOR VALUES WITH (modulus 84, remainder 83); >> >> On Mon, Dec 23, 2024 at 11:48 PM Divyansh Gupta JNsThMAudy < >> ag1567...@gmail.com> wrote: >> >>> Adrian, the partition is on userid using hash partition with 84 >>> partitions >>> >>> Ron, there could be more than 20 Million records possible for a single >>> userid in that case if I create index on userid only not on other column >>> the query is taking more than 30 seconds to return the results. >>> >>> On Mon, 23 Dec 2024, 11:40 pm Ron Johnson, <ronljohnso...@gmail.com> >>> wrote: >>> >>>> If your queries all reference userid, then you only need indices on >>>> gdid and userid. >>>> >>>> On Mon, Dec 23, 2024 at 12:49 PM Divyansh Gupta JNsThMAudy < >>>> ag1567...@gmail.com> wrote: >>>> >>>>> I have one confusion with this design if I opt to create 50 columns I >>>>> need to create 50 index which will work with userid index in Bitmap on the >>>>> other hand if I create a JSONB column I need to create a single index ? >>>>> >>>>> On Mon, 23 Dec 2024, 11:10 pm Ron Johnson, <ronljohnso...@gmail.com> >>>>> wrote: >>>>> >>>>>> Given what you just wrote, I'd stick with 50 separate t* columns. >>>>>> Simplifies queries, simplifies updates, and eliminates JSONB conversions. >>>>>> >>>>>> On Mon, Dec 23, 2024 at 12:29 PM Divyansh Gupta JNsThMAudy < >>>>>> ag1567...@gmail.com> wrote: >>>>>> >>>>>>> Values can be updated based on customer actions >>>>>>> >>>>>>> All rows won't have all 50 key value pairs always if I make those >>>>>>> keys into columns the rows might have null value on the other hand if >>>>>>> it is >>>>>>> JSONB then the key value pair will not be there >>>>>>> >>>>>>> Yes in UI customers can search for the key value pairs >>>>>>> >>>>>>> During data population the key value pair will be empty array in >>>>>>> case of JSONB column or NULL in case of table columns, later when >>>>>>> customer >>>>>>> performs some actions that time the key value pairs will populate and >>>>>>> update, based on what action customer performs. >>>>>>> >>>>>>> On Mon, 23 Dec 2024, 10:51 pm Divyansh Gupta JNsThMAudy, < >>>>>>> ag1567...@gmail.com> wrote: >>>>>>> >>>>>>>> Let's make it more understandable, here is the table schema with 50 >>>>>>>> columns in it >>>>>>>> >>>>>>>> CREATE TABLE dbo.googledocs_tbl ( >>>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE >>>>>>>> 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL, >>>>>>>> userid int8 NOT NULL, >>>>>>>> t1 int4 NULL, >>>>>>>> t2 int4 NULL, >>>>>>>> >>>>>>>> [snip] > t46 int4 NULL, >>>>>>>> t47 int4 NULL, >>>>>>>> t48 int4 NULL, >>>>>>>> t49 int4 NULL, >>>>>>>> t50 int4 NULL, >>>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid), >>>>>>>> ); >>>>>>>> >>>>>>>> Every time when i query I will query it along with userid >>>>>>>> Ex : where userid = 12345678 and t1 in (1,2,3) and t2 in (0,1,2) >>>>>>>> more key filters if customer applies >>>>>>>> >>>>>>>> On the other hand if I create a single jsonb column the schema will >>>>>>>> look like : >>>>>>>> >>>>>>>> CREATE TABLE dbo.googledocs_tbl ( >>>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE >>>>>>>> 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL, >>>>>>>> userid int8 NOT NULL, >>>>>>>> addons_json jsonb default '{}'::jsonb >>>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid), >>>>>>>> ); >>>>>>>> >>>>>>>> and the query would be like >>>>>>>> where userid = 12345678 and ((addons_json @> {t1:1}) or >>>>>>>> (addons_json @> {t1:2}) or (addons_json @> {t1:3}) >>>>>>>> more key filters if customer applies >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Dec 23, 2024 at 10:38 PM David G. Johnston < >>>>>>>> david.g.johns...@gmail.com> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, Dec 23, 2024, 10:01 Divyansh Gupta JNsThMAudy < >>>>>>>>> ag1567...@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> So here my question is considering one JSONB column is perfect or >>>>>>>>>> considering 50 columns will be more optimised. >>>>>>>>>> >>>>>>>>> The relational database engine is designed around the column-based >>>>>>>>> approach. Especially if the columns are generally unchanging, >>>>>>>>> combined >>>>>>>>> with using fixed-width data types. >>>>>>>>> >>>>>>>>> David J. >>>>>>>>> >>>>>>>>> -- Death to <Redacted>, and butter sauce. Don't boil me, I'm still alive. <Redacted> lobster!