I apologize beforehand for replying again to my own reply. I know it is 
frowned upon. My inline comments are below.

> On Jun 1, 2017, at 2:05 PM, armand pirvu <armand.pi...@gmail.com> wrote:
> 
> Thank you Karl and David
> 
> Ideally, as far as I can tell, the index would need to be show_id, file_id, 
> lower(…)
> 
> 
> The question is whether this is possible.
> 
> 
> Thanks
> Armand
> 
> 
>> On Jun 1, 2017, at 12:24 PM, Karl Czajkowski <kar...@isi.edu> wrote:
>> 
>> On May 31, armand pirvu modulated:
>> 
>>> The idea is that I would like to avoid having an index for each key
>>> possibly wanted to search, but rather say have the whole record then
>>> search by the key and get the key value, thus having one index serving
>>> multiple purposes so to speak
>>> 
>> 
>> First, benchmarking would be important to figure out if any proposed
>> indexing actually speeds up the kinds of queries you want to perform.
>> With the recently added parallel query features, a simpler indexing
>> scheme with some brute-force search might be adequate?
>> 

I'm not sure what you mean by benchmarking, but I think the comparative 
times (2 seconds vs. a couple of milliseconds) are quite a difference.
A table scan is okay in certain cases, but when there is heavy usage on the 
same part of the data it becomes a problem.
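
For reference, this is roughly how I compare the plans; the table and column 
names are placeholders for my actual schema:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT *
    FROM item_files
    WHERE show_id = 10
      AND file_id = 20
      AND lower(file_info::text) ~ 'somepattern';

The 2 seconds vs. a couple of milliseconds difference comes from whether a 
plan like this uses an index or falls back to a sequential scan.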


>> But, you could use a search idiom like this:
>> 
>>    (lower(json_column::text)::jsonb) -> lower('key') = '"value"'::jsonb
>> 
>> This will down-convert the case on all values and keys.  The left-hand
>> parenthetic expression could be precomputed in an expression index to
>> avoid repeated case conversion. But, typical searches will still have
>> to scan the whole index to perform the projection and match the final
>> value tests on the right-hand side.
>> 
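
A sketch of what such a precomputed expression index might look like, using 
placeholder table/column names on my side; if I read it right, a GIN index 
would additionally allow containment (@>) lookups instead of a full index 
scan:

    CREATE INDEX idx_file_info_lower
        ON item_files
        USING gin (((lower(file_info::text))::jsonb));

    -- containment search, with keys and values lowercased:
    SELECT *
    FROM item_files
    WHERE (lower(file_info::text))::jsonb @> '{"key": "value"}'::jsonb;
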
>> If you want to do things like substring matching on field values, you
>> might stick with text and using regexp matches:
>> 
>>    (lower(json_column::text)) ~ 'valuepattern'

In this case a regular index gets ignored, even though IMO the planner 
should scan the index and get the needed information. The criteria I am 
after return at most 9 rows out of 100k+ records, so I'd say the restriction 
is darn good. Wouldn't that make the optimizer pick the path of least 
resistance, aka the best restriction? Granted, it uses a lower() function, 
and the search is on the text column, which is third in the index, so it 
does not start from the left. But the index starts with show_id, file_id, 
and those are always part of the key. I can see that once show_id, file_id 
stops being a good restriction, the last column will make the difference. 
Either way, shouldn't that translate into an index scan? Or, for the index 
to be considered, does even the last-column search have to follow 
left-to-right matching, i.e. no in-between search?
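
To make the earlier question concrete, the composite index I have in mind 
would be declared like this (placeholder names again):

    -- note: btree entries are limited to roughly 1/3 of a page, so very
    -- large documents in the third column may not fit
    CREATE INDEX idx_show_file_lower
        ON item_files (show_id, file_id, lower(file_info::text));

If I understand correctly, a btree can only use anchored comparisons on the 
text expression, so an unanchored regexp can serve only as a filter on top 
of the show_id, file_id index conditions, never as an index condition 
itself.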


>> 
>> or more structural searches:
>> 
>>    (lower(json_column::text)) ~ '"key": "[^"]*substring[^"]*"'
>> 
>> Here, the left-hand expression could be trigram indexed to help with
>> sparse, substring matching without a full index scan.  We've had good
>> luck using trigram indexing with regexp matching, though I've honestly
>> never used it for the purpose sketched above...
>> 
>> Karl
> 


It seems to me trigram could be the answer, since I got some decent results 
once I applied it; more to dig into.
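
Roughly what I applied, again with placeholder names:

    CREATE EXTENSION IF NOT EXISTS pg_trgm;

    CREATE INDEX idx_file_info_trgm
        ON item_files
        USING gin (lower(file_info::text) gin_trgm_ops);

    -- the structural regexp search can then use the trigram index:
    SELECT *
    FROM item_files
    WHERE lower(file_info::text) ~ '"key": "[^"]*substring[^"]*"';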

Overall, could it be that the optimizer blatantly ignores an index scan that 
is cheaper than a table scan? Or does the jsonb implementation still have a 
long way to go? Or is the way I am using it in my case not what it was 
designed for?


thanks
Armand





