Re: Indexing fragments of a column's value ?
You can index on expressions, and these will be recognised by the query generator. drop schema tmp CASCADE; create schema tmp; set search_path = tmp; CREATE TABLE bagger ( eight CHAR(8) NOT NULL PRIMARY KEY , more text ); CREATE INDEX bagger_idx_12 ON bagger (substr(eight,1,2)); CREATE INDEX bagger_idx_34 ON bagger (substr(eight,3,2)); CREATE INDEX bagger_idx_58 ON bagger (substr(eight,5,4)); INSERT INTO bagger(eight, more) SELECT translate(to_hex( gs), ' ' , '0') , gs::text FROM generate_series(0,40, 64999) gs ; VACUUM ANALYZE bagger; EXPLAIN ANALYZE SELECT * FROM bagger WHERE 1=1 -- AND eight >= '00' AND eight < '05' AND substr(eight, 1,2) >= '30' AND substr(eight,1,2) < '05' AND substr(eight, 3,2) >= '90' AND substr(eight,3,2) < 'A5' AND substr(eight, 5,4) >= '' AND substr(eight,5,4) < '' ; /*** The optimiser is smart enough to ignore one of the indexes. */ QUERY PLAN - Bitmap Heap Scan on bagger (cost=41.46..59.62 rows=1 width=19) (actual time=0.049..0.056 rows=0 loops=1) Recheck Cond: ((substr((eight)::text, 1, 2) >= '30'::text) AND (substr((eight)::text, 1, 2) < '05'::text) AND (substr((eight)::text, 3, 2) >= '90'::text) AND (substr((eight)::text, 3, 2) < 'A5'::text)) Filter: ((substr((eight)::text, 5, 4) >= ''::text) AND (substr((eight)::text, 5, 4) < ''::text)) -> BitmapAnd (cost=41.46..41.46 rows=16 width=0) (actual time=0.042..0.047 rows=0 loops=1) -> Bitmap Index Scan on bagger_idx_12 (cost=0.00..4.47 rows=308 width=0) (actual time=0.039..0.039 rows=0 loops=1) Index Cond: ((substr((eight)::text, 1, 2) >= '30'::text) AND (substr((eight)::text, 1, 2) < '05'::text)) -> Bitmap Index Scan on bagger_idx_34 (cost=0.00..36.74 rows=3205 width=0) (never executed) Index Cond: ((substr((eight)::text, 3, 2) >= '90'::text) AND (substr((eight)::text, 3, 2) < 'A5'::text)) Planning Time: 5.487 ms Execution Time: 0.310 ms (10 rows) HTH, AvK
Re: Indexing fragments of a column's value ?
On Fri, Nov 3, 2023 at 3:34 PM David Gauthier wrote: > I'm asking about the possibility of indexing portions of a column's value > where the column has a static field format. Example, a char(8) which > contains all hex values (basically a hex number that's always 8 chars wide, > leading zeros if needed). Someone might want to select all recs where the > first 2 digits are 'ff' or maybe the last 4 hex digits match regexp_match > '00[cdef]{2}' or maybe a match of the entire string... "0dd63a87". > > If I know the placement and width of the fields that need to be indexed, > can indices be defined to facilitate queries ? Example... > - match all 8 chars > - match the 3rd and 4th chars > - match the last 4 chars > > I suppose I could fragment the thing into multiple columns for the > purposes of a search. So add a column called "last4" as a char(4) that > matches the last 4 chars of that column. Then index that. etc... But > inquiring to see if there is something more elegant. > > Just curious why not store it as a HEX string (functional index, even). And allow searching against the stringified hex, then your regex works? Kirk
Re: Indexing fragments of a column's value ?
Le ven. 3 nov. 2023 à 21:01, Tom Lane a écrit : > David Gauthier writes: > > I'm asking about the possibility of indexing portions of a column's value > > where the column has a static field format. > > GIN indexes are meant for exactly that. You might have to write your > own opclass to break up the input values in the way you want though. > > A less difficult answer would be to write a function that breaks up > the input into (say) an array of text and then use the existing > GIN array support. But you'd pay for that by needing to write more > complicated queries to use the index. > > regards, tom lane > > > Hello David, Reading your mail, it seems that your data column contains 3 different kinds of atomic information: * Characters 1-2 * Characters 3-4 * Characters 5-8 Does it make sense to split this data into 3 separate columns? Each one could be indexed, and you can rebuild the original thanks to a generated column: https://www.postgresql.org/docs/current/ddl-generated-columns.html HTH, Thomas
Re: Indexing fragments of a column's value ?
David Gauthier writes: > I'm asking about the possibility of indexing portions of a column's value > where the column has a static field format. GIN indexes are meant for exactly that. You might have to write your own opclass to break up the input values in the way you want though. A less difficult answer would be to write a function that breaks up the input into (say) an array of text and then use the existing GIN array support. But you'd pay for that by needing to write more complicated queries to use the index. regards, tom lane
Indexing fragments of a column's value ?
I'm asking about the possibility of indexing portions of a column's value where the column has a static field format. Example, a char(8) which contains all hex values (basically a hex number that's always 8 chars wide, leading zeros if needed). Someone might want to select all recs where the first 2 digits are 'ff' or maybe the last 4 hex digits match regexp_match '00[cdef]{2}' or maybe a match of the entire string... "0dd63a87". If I know the placement and width of the fields that need to be indexed, can indices be defined to facilitate queries ? Example... - match all 8 chars - match the 3rd and 4th chars - match the last 4 chars I suppose I could fragment the thing into multiple columns for the purposes of a search. So add a column called "last4" as a char(4) that matches the last 4 chars of that column. Then index that. etc... But inquiring to see if there is something more elegant. Thanks !