Druid does use ordinary unicode collation, and it isn't currently configurable. 
For that reason, we can, and do, use range scans for e.g. "LIKE 'Druid%'".

Although your comment makes me think. Currently I believe we scan the full 
index to satisfy a filter like "UPPER(name) = 'Druid'". In principle we don't 
need to do that, we could instead scan all the entries that start with 'd' or 
'D'. (Or perhaps 'dr', 'Dr', 'dR', and 'DR'? I'm not sure where the benefit 
ends of doing something like this. Possibly when the cardinality of possible 
matches approaches or exceeds the cardinality of the index.)

Could be an interesting enhancement.

Gian

On 2025/04/11 16:33:13 Julian Hyde wrote:
> Gian,
> 
> Does the index use ordinary unicode collation (so that ‘d’  < ‘e’ … ‘z’ < ‘A’ 
> < ‘D’, and therefore ‘druid’ < ‘dRuid’ < ‘fluid’ < ‘Druid’ < ‘DRUID’)?
> 
> There’s another collation where ‘d’ < ‘D’ < ‘e’ < ‘E’ etc. This puts ‘druid’, 
> ‘dRuid’, ‘Druid’, ‘DRUID’ next to each other, so that the query "UPPER(name) 
> = ‘DRUID’” can be answered with a very small range scan.
> 
> Julian
> 
> 
> > On Apr 11, 2025, at 9:24 AM, Gian Merlino <g...@apache.org> wrote:
> > 
> > String indexes in Druid are case-sensitive. UPPER (and LOWER) can use the 
> > index, but they must do it through index scans, rather than point index 
> > lookups. So index usage will be less efficient.
> > 
> > Gian
> > 
> > On 2025/04/09 18:50:45 vikas thakur wrote:
> >> On Wed, Apr 9, 2025 at 2:49 PM vikas thakur <vikas.thaku...@gmail.com>
> >> wrote:
> >> 
> >>> Can someone help with logic of upper sql function of Druid ? What is
> >>> underlying implementation, does it use case insensitive indexes ?
> >>> Currently, benchmarking performance with upper function and standardizing
> >>> the casing for APIs.
> >>> 
> >>> Regards,
> >>> VT
> >>> 
> >> 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > For additional commands, e-mail: dev-h...@druid.apache.org
> > 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> For additional commands, e-mail: dev-h...@druid.apache.org
> 
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org

Reply via email to