Hi Anirban,

I don't really like the dependency on the external database for the index.
Every reader should be able to access the database, and given a big table
with several readers, it could become a bottleneck.

I can imagine something similar as part of a REST catalog where the catalog
is used for planning:
- The Catalog could decide to read and cache the metadata from the files
(the cache could be stored in a db, or rocksdb, or whatever)
- During planning, the Catalog could get the relevant row groups and
combine them back into a smaller number of splits (contiguous row
groups could be merged)
- The users don't need to do anything else, just call the Catalog planning
API.

In this way, we don't have to change the metadata to get the same gains.
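
To make the planning step concrete, here is a minimal Python sketch. It is an illustration only, not an existing Iceberg or REST catalog API: the names RowGroupStats, Split, and plan_splits are hypothetical, and it assumes the Catalog has cached per-row-group min/max values for a single integer column.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RowGroupStats:
    """Hypothetical cache entry: min/max of one column for one row group."""
    file_path: str
    row_group: int  # ordinal of the row group inside the file
    min_value: int
    max_value: int


@dataclass(frozen=True)
class Split:
    """A run of adjacent row groups (inclusive bounds) in one file."""
    file_path: str
    first_row_group: int
    last_row_group: int


def plan_splits(cache: List[RowGroupStats], lower: int, upper: int) -> List[Split]:
    """Keep row groups whose [min, max] overlaps [lower, upper], then merge neighbours."""
    matching = sorted(
        (s.file_path, s.row_group)
        for s in cache
        if s.max_value >= lower and s.min_value <= upper
    )
    splits: List[Split] = []
    for file_path, row_group in matching:
        last = splits[-1] if splits else None
        if (
            last is not None
            and last.file_path == file_path
            and last.last_row_group + 1 == row_group
        ):
            # Contiguous with the previous row group: extend the current split.
            splits[-1] = Split(file_path, last.first_row_group, row_group)
        else:
            splits.append(Split(file_path, row_group, row_group))
    return splits
```

With row groups 1 and 2 of one file matching the filter, the reader would get a single split covering both instead of two separate ones; the reader itself only calls the planning function and never touches the cache.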

WDYT?

Anirban Goswami <[email protected]> wrote (on Tue, Nov 18, 2025, 19:48):

> Thanks Peter.
>
> I was also doing some analysis on how to get a secondary index in Iceberg, as
> we are dealing with several use cases where the table is pretty big and
> the partitions are on different keys. When we query by other keys,
> it is always difficult to get good response times, or response times similar
> to what Snowflake or similar systems provide through acceleration or
> search optimization methods.
>
> We already have a huge metadata load, and if we add the index as files
> on the file system, it will be too much to process and maintain. I have
> created a doc with some thoughts and would like to understand how you look at it.
>
> OLTP Database-Backed Index Architecture for Apache Iceberg
> <https://docs.google.com/document/d/15230FAEF3_8EEEniZ2c-S6I46dECDWAjDoInpNNHdiQ/edit?sharingaction=ownershiptransfer&pli=1&tab=t.0#heading=h.zcwgk1s56yiy>
>
> Regards,
> Ani
>
>
> On 2025/11/18 11:32:24 Péter Váry wrote:
> > Hi Team,
> >
> > Do we have any progress on this topic? I’d really like to see this move
> > forward.
> >
> > Following Sreeram’s suggestion, we should start collecting the key use
> > cases we want to support with indexes. Here’s what I’ve heard so far:
> >
> >    - *Primary key index*
> >       - Find a single or few rows by a given primary key
> >       - Build the Flink “primary key → file_name, position” state by
> >       bulk reading the primary key index
> >    - *Secondary index*
> >       - Range or min/max filtering on columns that are not part of the
> >       primary key (primary sort order)
> >    - *Full-text index*
> >       - Term search in text columns
> >    - *Vector index*
> >       - Nearest or approximate nearest neighbor search
> >    - *Geospatial index*
> >       - Finding points within a polygon or nearest location
> >
> > We should identify a few critical use cases and keep the others in mind
> > when designing how we store, retrieve, and use these indexes. Personally,
> > I’d love to see *vector indexes in Iceberg*, enabling fast AI searches on
> > Iceberg tables.
> >
> > For reference, I asked Copilot to collect the currently available index
> > types in MSSQL, Oracle, Postgres, MySQL, and LanceDB. Here’s the list:
> >
> > https://docs.google.com/spreadsheets/d/14cBdwsOw89ivolHtAw342YNoGmb1-Kri1E80hwWymL0
> >
> > Thanks,
> >
> > Peter
> >
> >
> > Aihua Xu <[email protected]> wrote (on Sun, Nov 2, 2025, 4:11):
> >
> > > Thanks Steven for raising this topic and giving a summary on the
> > > proposals. I would like to get involved in this area.
> > >
> > > On Fri, Oct 31, 2025 at 4:49 PM huaxin gao <[email protected]> wrote:
> > >
> > >> Thanks, Steven, for taking the initiative. I have previously collaborated
> > >> with Miao from Adobe on secondary index and would like to continue that
> > >> work.
> > >>
> > >> Huaxin
> > >>
> > >> On Fri, Oct 31, 2025 at 1:07 PM Xinli shang <[email protected]>
> > >> wrote:
> > >>
> > >>> Thanks Steven for proposing this! This is the right direction to go.
> > >>> We definitely see challenges in some cases without indexing support,
> > >>> especially around equality deletes and point lookups. I would like to
> > >>> contribute as well. One thing we need to be careful about is the
> > >>> overhead of the index itself, such as memory usage and index updates.
> > >>>
> > >>> Namratha, for Parquet column index, we had one for Presto
> > >>> https://www.youtube.com/watch?v=fr_HdhMEa3s.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Fri, Oct 31, 2025 at 11:48 AM namratha mk <[email protected]> wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> I see this point in the doc:
> > >>>>
> > >>>> *The primary key index can also be useful for point lookup.*
> > >>>> But to achieve that, we would need to store native file format
> > >>>> metadata, like the Parquet page index
> > >>>> <https://parquet.apache.org/docs/file-format/pageindex/>, in the
> > >>>> primary index to help fetch rows for the lookup use case. Has there
> > >>>> been any discussion in the community about this? I would like to get
> > >>>> more opinions on this.
> > >>>>
> > >>>> Thanks,
> > >>>> Namratha
> > >>>>
> > >>>> On Sat, Jul 19, 2025 at 2:39 AM Manish Malhotra <
> > >>>> [email protected]> wrote:
> > >>>>
> > >>>>> Thanks Steven,
> > >>>>> +1 on this initiative. I am also interested in contributing in this
> > >>>>> area.
> > >>>>> As you mentioned, it has quite a breadth. My thought is that we can
> > >>>>> start a document to discuss the different layers separately, like
> > >>>>> types of indexes, sync vs. async, spec changes, and the priority of
> > >>>>> the indexes to be supported (instead of targeting all in one go).
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Manish
> > >>>>>
> > >>>>> On Fri, Jul 18, 2025 at 10:41 PM Steven Wu <[email protected]>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Vignesh, that is yet to be discussed. We haven't got to that kind
> > >>>>>> of detail yet.
> > >>>>>>
> > >>>>>> In some cases, the index files are expected to be added along with
> > >>>>>> the data files in the same commit. Some cases (like secondary
> > >>>>>> indexes) may prefer an async process.
> > >>>>>>
> > >>>>>> On Fri, Jul 18, 2025 at 4:11 PM Vignesh <[email protected]>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Are the index files for all kinds expected to be written and added
> > >>>>>>> along with data files or would it be an optional async step?
> > >>>>>>>
> > >>>>>>> On Fri, Jul 18, 2025, 5:09 AM Péter Váry <
> > >>>>>>> [email protected]> wrote:
> > >>>>>>>
> > >>>>>>>> > *Primary Index*: Conventionally Primary Index - just means what
> > >>>>>>>> the Table's Primary storage layout/organization was. Given that
> > >>>>>>>> Iceberg supports Sort-order - if the Spec adds constraints to
> > >>>>>>>> derive/influence Sort order based on the Identifier columns - it
> > >>>>>>>> satisfies the Primary Index criteria.
> > >>>>>>>>
> > >>>>>>>> Here is my mental model:
> > >>>>>>>> - Primary Key - the unique identifier for the rows
> > >>>>>>>> - Primary Key index - database index constructed on the Primary
> > >>>>>>>> Key column
> > >>>>>>>> - Iceberg sort order - performance optimization used to speed up
> > >>>>>>>> frequent or costly queries.
> > >>>>>>>>
> > >>>>>>>> The Iceberg sort order is often defined on different columns
> > >>>>>>>> than the Primary Key, so I would try to avoid mixing the two
> > >>>>>>>> concepts.
> > >>>>>>>>
> > >>>>>>>> > we found that an Iceberg Table based Store Secondary Index -
> > >>>>>>>> provides the right balance between the ability to skip over and load
> > >>>>>>>> needed sections and yet provide the right performance benefits.
> > >>>>>>>>
> > >>>>>>>> Could you please elaborate on what "Iceberg Table based Store
> > >>>>>>>> Secondary Index" means?
> > >>>>>>>> Is this another Iceberg table with different columns and
> > >>>>>>>> different sort order?
> > >>>>>>>>
> > >>>>>>>> > they want it to be in an open format, so that it can be shared
> > >>>>>>>> with other engines!
> > >>>>>>>>
> > >>>>>>>> Wholeheartedly agreed!
> > >>>>>>>>
> > >>>>>>>> Thanks Steven for starting, and others for participating in the
> > >>>>>>>> discussion!
> > >>>>>>>> Peter
> > >>>>>>>>
> > >>>>>>>> Sreeram Garlapati <[email protected]> wrote (on Tue,
> > >>>>>>>> Jul 15, 2025, 22:12):
> > >>>>>>>>
> > >>>>>>>>> Thanks Steven for starting this.
> > >>>>>>>>>
> > >>>>>>>>> I am interested in the indexing-related conversations.
> > >>>>>>>>>
> > >>>>>>>>> Here are some preliminary thoughts:
> > >>>>>>>>>
> > >>>>>>>>>    1. *Primary Index*: Conventionally, Primary Index just means
> > >>>>>>>>>    what the Table's primary storage layout/organization is.
> > >>>>>>>>>    Given that Iceberg supports sort order, if the Spec adds
> > >>>>>>>>>    constraints to derive/influence the sort order based on the
> > >>>>>>>>>    Identifier columns, it satisfies the Primary Index criteria.
> > >>>>>>>>>    2. *Secondary Index*: Secondary Index storage calls for an
> > >>>>>>>>>    efficient organization which can hold Secondary Keys along
> > >>>>>>>>>    with the Location of the Row and any included columns. The
> > >>>>>>>>>    index can be of many types, based on the Data. Iceberg tables
> > >>>>>>>>>    are typically very, very large. Hence, these Indexes also
> > >>>>>>>>>    tend to be very large. Based on our past 1-2 years of work in
> > >>>>>>>>>    this space, we found that an Iceberg Table based Store
> > >>>>>>>>>    Secondary Index provides the right balance between the
> > >>>>>>>>>    ability to skip over and load only the needed sections and
> > >>>>>>>>>    the performance benefits. This decision was also shaped by
> > >>>>>>>>>    popular opinion from many of our partners & customers: as
> > >>>>>>>>>    building the Index involves a lot of computation, they want
> > >>>>>>>>>    it to be in an open format, so that it can be shared with
> > >>>>>>>>>    other engines!
> > >>>>>>>>>    3. *Others: Full Text Search Indexes and Vector Indexes*: It
> > >>>>>>>>>    is critical that we allow years of innovation in the space of
> > >>>>>>>>>    Full Text Search and Vector indexes, especially with the
> > >>>>>>>>>    current acceleration in AI adoption & the need it is driving
> > >>>>>>>>>    in the Keyword and Similarity Search space. Given that
> > >>>>>>>>>    Iceberg tables are extremely large, it is critical for us to
> > >>>>>>>>>    provide a good story for Indexes that can be incrementally
> > >>>>>>>>>    updated / partially loaded into memory.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Looking forward to the discussions.
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>> Sreeram
> > >>>>>>>>>
> > >>>>>>>>> On Tue, Jul 15, 2025 at 9:33 AM Anurag Mantripragada
> > >>>>>>>>> <[email protected]> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Thanks for starting this thread, Steven!
> > >>>>>>>>>>
> > >>>>>>>>>> I have been interested in secondary indexing in Iceberg. There
> > >>>>>>>>>> was an old proposal for secondary indexing [1]; we may need to
> > >>>>>>>>>> revisit/redesign these structures. I agree this is a very broad
> > >>>>>>>>>> topic, and having indexing structures general enough to support
> > >>>>>>>>>> a wide range of use cases will be a key challenge.
> > >>>>>>>>>>
> > >>>>>>>>>> I would like to get involved in any discussions related to
> > >>>>>>>>>> indexing.
> > >>>>>>>>>>
> > >>>>>>>>>> [1] -
> > >>>>>>>>>> https://docs.google.com/document/d/1E1ofBQoKRnX04bWT3utgyHQGaHZoelgXosk_UNsTUuQ/edit?tab=t.0
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>> Anurag Mantripragada
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On Jul 15, 2025, at 2:37 AM, Maximilian Michels <[email protected]>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks Steven for the summary. It would be great to extend the
> > >>>>>>>>>> Iceberg spec with index files, such that they can be used for the
> > >>>>>>>>>> different use cases.
> > >>>>>>>>>>
> > >>>>>>>>>> For my understanding, let me further outline the different types
> > >>>>>>>>>> of use cases for index files:
> > >>>>>>>>>>
> > >>>>>>>>>> ---
> > >>>>>>>>>> Topic 1: Accelerating the resolution of equality deletes
> > >>>>>>>>>> ---
> > >>>>>>>>>>
> > >>>>>>>>>> In its current form, equality deletes make it impossible to
> > >>>>>>>>>> achieve proper merge-on-read performance in streaming reads, and
> > >>>>>>>>>> they also add a significant performance overhead in batch
> > >>>>>>>>>> pipelines.
> > >>>>>>>>>>
> > >>>>>>>>>> Approach (a):
> > >>>>>>>>>> https://docs.google.com/document/d/1Jz4Fjt-6jRmwqbgHX_u0ohuyTB9ytDzfslS7lYraIjk/
> > >>>>>>>>>> Converting equality deletes to positional deletes would be a
> > >>>>>>>>>> great achievement. I'm wondering, though, whether all engines
> > >>>>>>>>>> will be able to achieve this, as it involves considerable runtime
> > >>>>>>>>>> complexity. If I understand correctly, the index can be
> > >>>>>>>>>> bootstrapped via table maintenance tasks, and then has to be
> > >>>>>>>>>> maintained by the streaming writer.
> > >>>>>>>>>>
> > >>>>>>>>>> Approach (b):
> > >>>>>>>>>> https://lists.apache.org/thread/gjjr30txq318qp6pff3x5fx1jmdnr6fv
> > >>>>>>>>>> This would boost the resolution of equality deletes during reads
> > >>>>>>>>>> via indices. The indices can be built via maintenance tasks, or
> > >>>>>>>>>> directly by the writer as in (a). But how do we keep the index
> > >>>>>>>>>> fresh if we don't write the index at the writers? Readers won't
> > >>>>>>>>>> always be able to use an up-to-date index, making this less
> > >>>>>>>>>> suitable for streaming reads.
> > >>>>>>>>>>
> > >>>>>>>>>> ---
> > >>>>>>>>>> Topic 2: Full text search in table scans
> > >>>>>>>>>> ---
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> https://docs.google.com/document/d/1bMACRCJBB8ycSXCFbP_BdCbFCAegRoxr2O2NXZirOmY/edit
> > >>>>>>>>>> Adding full-text search would broaden Iceberg’s applicability,
> > >>>>>>>>>> enabling new search use cases and making table scans far more
> > >>>>>>>>>> powerful.
> > >>>>>>>>>>
> > >>>>>>>>>> Cheers,
> > >>>>>>>>>> Max
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Jul 9, 2025 at 11:35 PM Steven Wu <[email protected]>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> Similar to other V4 threads, I am starting a thread to gauge
> > >>>>>>>>>>> interest in adding index support in Iceberg V4 and gather a focus
> > >>>>>>>>>>> group in this area.
> > >>>>>>>>>>>
> > >>>>>>>>>>> There have been a few discussions related to indexing recently.
> > >>>>>>>>>>>
> > >>>>>>>>>>>    - Peter Vary and I are working on a proposal (WIP) to
> > >>>>>>>>>>>    only write position deletes in the Flink streaming writer.
> > >>>>>>>>>>>    It would need a primary key index to work reasonably
> > >>>>>>>>>>>    efficiently. [1]
> > >>>>>>>>>>>    - Xiaoxuan Li has a proposal to leverage index files to
> > >>>>>>>>>>>    improve merge-on-read performance with equality deletes. [2]
> > >>>>>>>>>>>    - pengzhiwei has a proposal to support full-text index and
> > >>>>>>>>>>>    vector index. [3]
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> *Idea: index files*
> > >>>>>>>>>>>
> > >>>>>>>>>>> To support those use cases, Iceberg can add support for index
> > >>>>>>>>>>> files (in addition to data files and delete files). It should be
> > >>>>>>>>>>> general enough to support different forms of indexing.
> > >>>>>>>>>>>
> > >>>>>>>>>>>    - Primary key index
> > >>>>>>>>>>>    - Secondary index
> > >>>>>>>>>>>    - Full text index
> > >>>>>>>>>>>    - Vector index
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> This email is a starting point. It is a large topic. A lot of
> > >>>>>>>>>>> discussions and maturation of the ideas are needed before a formal
> > >>>>>>>>>>> proposal.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks,
> > >>>>>>>>>>> Steven
> > >>>>>>>>>>>
> > >>>>>>>>>>> [1]
> > >>>>>>>>>>> https://docs.google.com/document/d/1Jz4Fjt-6jRmwqbgHX_u0ohuyTB9ytDzfslS7lYraIjk/
> > >>>>>>>>>>> (WIP)
> > >>>>>>>>>>> [2]
> > >>>>>>>>>>> https://lists.apache.org/thread/j4zl44g6dllzzyg9ln45pvgoosfhxqrq
> > >>>>>>>>>>> [3] https://github.com/apache/iceberg/issues/12636
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>
> > >>> --
> > >>> Xinli Shang
> > >>>
> > >>
> >
>
