+100
- Reduces index lookup time, and hence improves job runtime
- Paves the way for streaming style ingestion
- Eliminates the dependency on HBase (the alternative for "global index"
support at the moment)
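
For illustration only, here is a minimal sketch of the recordKey -> (partition,
fileId) mapping described in the proposal quoted below. It is a toy in-memory
version under stated assumptions: the names RecordLocation, RecordLevelIndex,
update, lookup and tag are hypothetical and are not Hudi's HoodieIndex API, and
a real index would need durable storage rather than a dictionary.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class RecordLocation(NamedTuple):
    """Hypothetical stand-in for the (partition, fileId) pair in the proposal."""
    partition_path: str
    file_id: str


class RecordLevelIndex:
    """Toy in-memory record-level index; NOT Hudi's HoodieIndex interface."""

    def __init__(self) -> None:
        # One entry per record key: recordKey -> (partition, fileId).
        self._locations: Dict[str, RecordLocation] = {}

    def update(self, record_key: str, partition_path: str, file_id: str) -> None:
        """Record (or overwrite) where a key currently lives."""
        self._locations[record_key] = RecordLocation(partition_path, file_id)

    def lookup(self, record_key: str) -> Optional[RecordLocation]:
        """Return the known location of a key, or None if it is unseen."""
        return self._locations.get(record_key)

    def tag(
        self, record_keys: Iterable[str]
    ) -> Tuple[List[Tuple[str, RecordLocation]], List[str]]:
        """Split incoming keys into updates (already indexed) and inserts (new)."""
        updates: List[Tuple[str, RecordLocation]] = []
        inserts: List[str] = []
        for key in record_keys:
            location = self.lookup(key)
            if location is None:
                inserts.append(key)
            else:
                updates.append((key, location))
        return updates, inserts


index = RecordLevelIndex()
index.update("key-1", "2020/02/23", "file-a")
index.update("key-2", "2020/02/24", "file-b")
updates, inserts = index.tag(["key-1", "key-3"])
```

Because the key alone resolves to a partition and file here, the lookup is
"global" in the sense that the caller does not need to know the partition up
front.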

-Nishith

On Mon, Feb 24, 2020 at 10:56 AM Vinoth Chandar <vin...@apache.org> wrote:

> +1 from me as well. This will be a product-defining feature, if we can do
> it.
>
> On Sun, Feb 23, 2020 at 6:27 PM vino yang <yanghua1...@gmail.com> wrote:
>
> > Hi Sivabalan,
> >
> > Thanks for your proposal.
> >
> > Big +1 from my side. Indexing at record granularity is really good for
> > performance. It is also a step towards stream processing.
> >
> > Best,
> > Vino
> >
> > On Sun, Feb 23, 2020 at 12:52 AM Sivabalan <n.siv...@gmail.com> wrote:
> >
> > > As Apache Hudi is getting widely adopted, performance has become the need
> > > of the hour. This RFC focuses on improving performance of the Hudi index
> > > by introducing a record level index. The proposal is to implement a new
> > > index format that is a mapping of (recordKey <-> partition, fileId) or
> > > ((recordKey, partitionPath) → fileId). This mapping will be stored and
> > > maintained by Hudi as another implementation of HoodieIndex. This record
> > > level indexing will definitely give a boost to both read and write
> > > performance.
> > >
> > > Here is the link to the RFC:
> > > <https://cwiki.apache.org/confluence/display/HUDI/RFC+-+08+%3A+Record+level+indexing+mechanisms+for+Hudi+datasets>
> > >
> > > Appreciate your review and thoughts.
> > >
> > > --
> > > Regards,
> > > -Sivabalan
> > >
> >
>
