[ 
https://issues.apache.org/jira/browse/HUDI-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-3177:
------------------------------
    Description: 
Users should be able to trigger index creation for one or more partitions 
using a CREATE INDEX statement.
 
{code:java}
CREATE [BLOOM | COL_STATS | SOME_INDEX_TYPE] INDEX ON TABLE [table_name]
FOR COLUMNS (col1, col2, col3)
WITH OPTION (<file_group_count>, <some_other_option>);{code}
 
This maps to the following Hudi configs:
{code:java}
METADATA_PREFIX + ".index.bloom.filter.file.group.count"
METADATA_PREFIX + ".index.column.stats.file.group.count"
METADATA_PREFIX + ".index.bloom.filter.for.columns" -> comma-separated column names
METADATA_PREFIX + ".index.column.stats.for.columns" -> comma-separated column names{code}
The CLI indexer tool will also map user inputs to the above configs.
By default, the bloom filter index covers only the record key, and the column 
stats index covers all columns.
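
As a rough illustration of the mapping described above, the sketch below turns 
the parts of a CREATE INDEX statement (index type, column list, file group 
count) into the config keys listed in this ticket. It is not Hudi code: the 
class name, the IndexType enum, and the toConfigs helper are hypothetical, and 
the value "hoodie.metadata" for METADATA_PREFIX is an assumption, since the 
ticket refers to the constant only by name. When no columns are given, the 
column key is left unset so that the defaults described above would apply.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CreateIndexConfigSketch {
    // Assumed value; the ticket only refers to this constant by name.
    static final String METADATA_PREFIX = "hoodie.metadata";

    // Hypothetical enum for the two concrete index types named in the ticket.
    enum IndexType {
        BLOOM(".index.bloom.filter"),
        COL_STATS(".index.column.stats");

        final String keyInfix;

        IndexType(String keyInfix) {
            this.keyInfix = keyInfix;
        }
    }

    // Builds the config entries for one CREATE INDEX statement.
    static Map<String, String> toConfigs(IndexType type, List<String> columns, int fileGroupCount) {
        Map<String, String> configs = new LinkedHashMap<>();
        configs.put(METADATA_PREFIX + type.keyInfix + ".file.group.count",
                String.valueOf(fileGroupCount));
        // An empty column list leaves the key unset so the defaults apply.
        if (!columns.isEmpty()) {
            configs.put(METADATA_PREFIX + type.keyInfix + ".for.columns",
                    String.join(",", columns));
        }
        return configs;
    }

    public static void main(String[] args) {
        // Corresponds to: CREATE COL_STATS INDEX ... FOR COLUMNS (col1, col2, col3) WITH OPTION (4)
        Map<String, String> configs =
                toConfigs(IndexType.COL_STATS, Arrays.asList("col1", "col2", "col3"), 4);
        configs.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The same helper could back both the SQL statement and the CLI indexer tool, 
since the ticket says both map user inputs to the same configs.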

  was:Users should be able to trigger index creation using CREATE INDEX 
statement for one or more partitions.


> Support CREATE INDEX statement
> ------------------------------
>
>                 Key: HUDI-3177
>                 URL: https://issues.apache.org/jira/browse/HUDI-3177
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: index, metadata
>            Reporter: Sagar Sumit
>            Assignee: Sagar Sumit
>            Priority: Blocker
>             Fix For: 0.11.0
>
>
> Users should be able to trigger index creation for one or more partitions 
> using a CREATE INDEX statement.
>  
> {code:java}
> CREATE [BLOOM | COL_STATS | SOME_INDEX_TYPE] INDEX ON TABLE [table_name]
> FOR COLUMNS (col1, col2, col3)
> WITH OPTION (<file_group_count>, <some_other_option>);{code}
>  
> This maps to the following Hudi configs:
> {code:java}
> METADATA_PREFIX + ".index.bloom.filter.file.group.count"
> METADATA_PREFIX + ".index.column.stats.file.group.count"
> METADATA_PREFIX + ".index.bloom.filter.for.columns" -> comma-separated column names
> METADATA_PREFIX + ".index.column.stats.for.columns" -> comma-separated column names{code}
> The CLI indexer tool will also map user inputs to the above configs.
> By default, the bloom filter index covers only the record key, and the column 
> stats index covers all columns.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
