[
https://issues.apache.org/jira/browse/HUDI-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinoth Chandar updated HUDI-8533:
---------------------------------
Description: Is this expected behavior? We should be able to create a
bloom_filters index on any column without having to specify any options,
including a conversion function. Also, let's allow users to configure the
bloom filter: number of bits, false-positive ratio, etc. (was: {code:java}
spark-sql (default)> create index idx_lucene on hudi_table using vinoth(state);
{code}
succeeds, but does nothing.
{code:java}
% ls /tmp/hudi_test_table/.hoodie/metadata
column_stats partition_stats secondary_index_idx_driver
files record_index secondary_index_idx_rider
{code}
We should only allow indexes that are actually working and supported, and
throw errors for everything else.
{code:java}
spark-sql (default)> create index idx_vinoth on hudi_table using vinoth(state);
24/11/15 10:07:39 ERROR SparkSQLDriver: Failed in [create index idx_vinoth on hudi_table using vinoth(state) ]
org.apache.hudi.exception.HoodieIndexException: Unknown hoodie index type:vinoth
	at org.apache.hudi.index.secondary.SecondaryIndexType.lambda$of$3(SecondaryIndexType.java:55)
	at java.util.Optional.orElseThrow(Optional.java:290)
	at org.apache.hudi.index.secondary.SecondaryIndexType.of(SecondaryIndexType.java:54)
	at org.apache.hudi.index.secondary.HoodieSecondaryIndex$Builder.setIndexType(HoodieSecondaryIndex.java:112)
	at org.apache.hudi.index.secondary.SecondaryIndexManager.create(SecondaryIndexManager.java:110)
	at org.apache.spark.sql.hudi.command.CreateIndexCommand.run(IndexCommands.scala:60)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(command
{code}
We should clean up `org.apache.hudi.index.secondary.SecondaryIndexType` and its
usages.)
> Bloom Index creation without function fails
> -------------------------------------------
>
> Key: HUDI-8533
> URL: https://issues.apache.org/jira/browse/HUDI-8533
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Sagar Sumit
> Assignee: Lokesh Jain
> Priority: Blocker
> Fix For: 1.0.0
>
>
> Is this expected behavior? We should be able to create a bloom_filters index
> on any column without having to specify any options, including a conversion
> function. Also, let's allow users to configure the bloom filter: number of
> bits, false-positive ratio, etc.
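
For context on the two knobs mentioned above, here is a minimal sketch of how
the number of bits and the false-positive ratio of a standard Bloom filter
relate. It uses the textbook sizing formulas, not Hudi's implementation; the
function name `bloom_filter_params` and its parameters are hypothetical and do
not correspond to any Hudi config key.

```python
import math
from typing import Tuple


def bloom_filter_params(expected_entries: int, fp_ratio: float) -> Tuple[int, int]:
    """Return (num_bits, num_hashes) for a standard Bloom filter.

    Textbook formulas (an illustration, not Hudi's code):
      num_bits   m = ceil(-n * ln(p) / (ln 2)^2)
      num_hashes k = round((m / n) * ln 2), at least 1
    where n is the expected number of entries and p the target
    false-positive ratio.
    """
    if expected_entries <= 0:
        raise ValueError("expected_entries must be positive")
    if not 0.0 < fp_ratio < 1.0:
        raise ValueError("fp_ratio must be strictly between 0 and 1")

    num_bits = math.ceil(-expected_entries * math.log(fp_ratio) / (math.log(2) ** 2))
    num_hashes = max(1, round(num_bits / expected_entries * math.log(2)))
    return num_bits, num_hashes


if __name__ == "__main__":
    # Hypothetical workload: 1000 keys per file, 1% false positives.
    print(bloom_filter_params(1000, 0.01))
```

Under these formulas, lowering the false-positive ratio increases the bit
count logarithmically, which is why exposing both settings (or deriving one
from the other) is a reasonable thing to ask of an index option.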