Re: Limitations of non unique keys

2021-11-05 Thread Sivabalan
Got you. Thanks for the clarification.
On Fri, Nov 5, 2021 at 3:53 PM Vinoth Chandar wrote:
> Hi Siva,
>
> I think this is more about bloom filters and record level index, which is
> different from RFC-27.
>
> RFC-08 talks about record level indexing. Bloom filter indexes have a
> discuss

Re: Limitations of non unique keys

2021-11-05 Thread Vinoth Chandar
Hi Siva, I think this is more about bloom filters and the record level index, which is different from RFC-27. RFC-08 talks about record level indexing. Bloom filter indexes have a discuss thread that just kicked off. The main thing we are trying to solidify in 0.10.0 is the foundational metadata table and

Re: Limitations of non unique keys

2021-11-05 Thread Sivabalan
Thanks for bringing this up. We have RFC-27 on data skipping, which is the secondary indexing being discussed here. We are fleshing out a few more details on this end and will put up patches

Re: Limitations of non unique keys

2021-11-03 Thread Nicolas Paris
> In other words, we are generalizing this so Hudi feels more like
> MySQL and not HBase/Cassandra (key value store). That's the direction
> we are approaching.
Wow, this is amazing. I haven't yet found an RFC about this, nor a PR ready to test. This answers my initial question: with the secondary

Re: Limitations of non unique keys

2021-11-03 Thread Vinoth Chandar
Hi. With the indexing approach we are taking, you should be able to add secondary indexes on any column, not just the key. In other words, we are generalizing this so Hudi feels more like MySQL and not HBase/Cassandra (key value store). That's the direction we are approaching. Love to hear more

Re: Limitations of non unique keys

2021-11-02 Thread Nicolas Paris
For example, does the move of bloom filters into HFiles (a 0.10.0 feature) make unique bloom keys mandatory?
On Thu Oct 28, 2021 at 7:00 PM CEST, Nicolas Paris wrote:
>
> > Are you asking if there are advantages to allowing duplicates or not having
> > keys in your table?
> it's all about allowing

Re: Limitations of non unique keys

2021-10-28 Thread Vinoth Chandar
Hi, Are you asking if there are advantages to allowing duplicates or not having keys in your table? Having keys helps with other practical scenarios, in addition to what you called out. E.g., oftentimes you would want to backfill an insert-only table and you don't want to introduce duplicates

Limitations of non unique keys

2021-10-26 Thread Nicolas Paris
Hi devs,
AFAIK, Hudi has been designed to use primary keys as the Hudi key. However, it is also possible to choose a non-unique field. I have listed several troubles with such a design. A non-unique key leads to:
- cannot delete / update a unique record
- cannot apply primary key for new sql
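
The first point (cannot delete / update a unique record) can be illustrated with a toy model. This is a minimal sketch in plain Python, not Hudi code and not Hudi's API: `delete_by_key`, the `rows` data, and the field names are all hypothetical, and the only assumption is that a key-based delete removes every record whose key matches.

```python
# Toy model only: plain Python, not Hudi's API. All names are hypothetical.

def delete_by_key(records, key_field, key):
    """Drop every record whose key field equals `key`, as a key-based delete would."""
    return [r for r in records if r[key_field] != key]

rows = [
    {"id": 1, "city": "Paris", "amount": 10},
    {"id": 2, "city": "Paris", "amount": 20},
    {"id": 3, "city": "Lyon", "amount": 30},
]

# Unique key ("id"): the delete targets exactly one record.
after_unique = delete_by_key(rows, "id", 1)

# Non-unique key ("city"): the same operation removes every record sharing
# the key value, so a single record cannot be addressed on its own.
after_non_unique = delete_by_key(rows, "city", "Paris")

print(len(after_unique))      # 2
print(len(after_non_unique))  # 1
```

With the non-unique `city` key, both "Paris" rows disappear together; there is no way to express "delete only the row with amount 10" through the key alone.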