[ 
https://issues.apache.org/jira/browse/HUDI-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar closed HUDI-8578.
--------------------------------
    Resolution: Fixed

> Standardize SI/FI terminology and syntax 
> -----------------------------------------
>
>                 Key: HUDI-8578
>                 URL: https://issues.apache.org/jira/browse/HUDI-8578
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: metadata
>            Reporter: Vinoth Chandar
>            Assignee: Lokesh Jain
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.0.0
>
>   Original Estimate: 2m
>  Remaining Estimate: 2m
>
> We have to ensure users intuitively understand what we are shipping. Let's 
> align with the existing terminology of a popular database rather than 
> inventing our own "names". 
> Postgres uses "expression indexes", and we can change our terminology to be 
> the same. 
> * References to "Functional Index" in docs/code/RFC change to "Expression 
> Index".
> * "func" becomes "expr" (no need to change the syntax to use an "expression" 
> clause like pg yet).
> * Change anything in storage, like the prefix "func_index_", the 
> hoodiemetadata.avsc fields, or the index defs files (need to be really 
> thorough).
> Here's how we need the experience to be. 
> {code:sql}
> -- Should effectively create RLI. Alternatively, `WHERE primaryKeyColumn = | IN`
> -- has to work when RLI is enabled via Streamer/Datasource config.
> -- Need to test/confirm this.
> CREATE INDEX name ON table (primaryKeyColumn);
>
> -- Should create secondary_index backed by RLI. Remove secondary_index as an
> -- option in the USING clause; if no index type is specified, SI is the default.
> CREATE INDEX name ON table (someOtherColumn);
>
> -- Build a bloom filter from the lower-case version of the column.
> CREATE INDEX name ON table USING BLOOM_FILTERS(column) options(expr='lower');
>
> -- Build date-based indexing of the ts column.
> CREATE INDEX name ON table USING COL_STATS(column)
>   options(expr='from_unixtime', format='yyyy-MM-dd');
>
> -- Should this just add to the column stats, or should we disallow it for now?
> CREATE INDEX name ON table USING COL_STATS(column);
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)