bhasudha commented on code in PR #10087: URL: https://github.com/apache/hudi/pull/10087#discussion_r1393181396
########## website/docs/sql_ddl.md: ########## @@ -178,6 +178,145 @@ TBLPROPERTIES ( AS SELECT * FROM parquet_table; ``` +### Create Index + +Hudi supports creating and dropping indexes, including functional indexes, on a table. + +:::note +Creating indexes through SQL is in preview in version 1.0.0-beta only. It will be generally available in version 1.0.0. +Please report any issues you find either via [GitHub issues](https://github.com/apache/hudi/issues) or creating a [JIRA](https://issues.apache.org/jira/projects/HUDI/issues). +::: + +```sql +-- Create Index +CREATE INDEX [IF NOT EXISTS] index_name ON [TABLE] table_name +[USING index_type] +(column_name1 [OPTIONS(key1=value1, key2=value2, ...)], column_name2 [OPTIONS(key1=value1, key2=value2, ...)], ...) +[OPTIONS (key1=value1, key2=value2, ...)] + +-- Drop Index +DROP INDEX [IF EXISTS] index_name ON [TABLE] table_name +``` + +- `index_name` is the name of the index to be created or dropped. +- `table_name` is the name of the table on which the index is created or dropped. +- `index_type` is the type of the index to be created. Currently, only `files`, `column_stats` and `bloom_filters` is supported. +- `column_name` is the name of the column on which the index is created. +- Both index and column on which the index is created can be qualified with some options in the form of key-value pairs. + We will see this with an example of functional index below. + +#### Create Functional Index + +A [functional index](https://github.com/apache/hudi/blob/00ece7bce0a4a8d0019721a28049723821e01842/rfc/rfc-63/rfc-63.md) +is an index on a function of a column. It is a new addition to Hudi's [multi-modal indexing](https://hudi.apache.org/blog/2022/05/17/Introducing-Multi-Modal-Index-for-the-Lakehouse-in-Apache-Hudi) +subsystem which provides faster access method and also absorb partitioning as part of the indexing system. Let us see +some examples of creating a functional index. + +```sql +-- Create a functional index on the column `ts` (unix epoch) of the table `hudi_table` using the function `from_unixtime` with the format `yyyy-MM-dd` +CREATE INDEX IF NOT EXISTS ts_datestr ON hudi_table USING column_stats(ts) OPTIONS(func='from_unixtime', format='yyyy-MM-dd'); +-- Create a functional index on the column `ts` (timestamp in yyyy-MM-dd HH:mm:ss) of the table `hudi_table` using the function `hour` +CREATE INDEX ts_hour ON hudi_table_func_index USING column_stats(ts) options(func='hour'); Review Comment: `ON hudi_table` instead of `ON hudi_table_func_index` ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
