[DISCUSS] PIP-17: Introduce secondary column index

yejunhao Thu, 14 Mar 2024 02:53:03 -0700

Hi, Paimon Devs, I’d like to start a discussion about PIP-17[1].

Until now, Paimon has relied on sort compaction (order, zorder, and hilbert) 
to speed up queries. After sort compaction, files are sorted by the specified 
columns. In some situations, however, a table has tens of columns that may 
appear in filters: sometimes all of them are used together, sometimes only a 
few. Order or zorder compaction can't handle this case well, because sorting 
on too many columns weakens the effect of the sort. If the cardinality of 
these columns is small, we can instead use a bloom filter or other indexes to 
speed up queries. That is the motivation for this PIP: I want to introduce an 
index framework that gives Paimon a flexible index system.
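
To make the idea concrete, here is a minimal sketch of per-file bloom filter 
pruning. It is only an illustration, not the design in the PIP and not 
Paimon's API: the names BloomFilter, build_file_index and files_to_read, the 
filter size, and the hash scheme are all assumptions made up for this example.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter; sizes are illustrative, not tuned."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with a seed byte.
        data = repr(value).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )


def build_file_index(rows, columns):
    """Build one bloom filter per indexed column for a single data file."""
    index = {column: BloomFilter() for column in columns}
    for row in rows:
        for column in columns:
            index[column].add(row[column])
    return index


def files_to_read(file_indexes, predicates):
    """Keep only files that may match every equality predicate.

    file_indexes: {file_name: {column: BloomFilter}}
    predicates:   {column: value}; columns without an index are not pruned.
    """
    return [
        name
        for name, index in file_indexes.items()
        if all(
            index[column].might_contain(value)
            for column, value in predicates.items()
            if column in index
        )
    ]
```

Because each column has its own filter, a query can use any subset of the 
indexed columns, which is the case that sorting on many columns handles 
poorly. A bloom filter never gives a false negative, so a file that contains 
the value is always read; it can give false positives, so some files may be 
read unnecessarily.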


Looking forward to your questions and suggestions.

Best, junhao

[1] 
https://cwiki.apache.org/confluence/display/PAIMON/PIP-17%3A+Introduce+secondary+column+index
