You can't make documents more likely to be in the same segment. However, I'm
thinking you could use index sorting to place documents from the same cluster
next to each other within each segment?
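
A minimal sketch of what index sorting could look like here, assuming each
document carries its cluster id in a numeric doc-values field. The field names
`cluster` and `body`, the sample values, and the temp-directory location are
hypothetical, chosen only for illustration. The index sort is set through
`IndexWriterConfig.setIndexSort`, and the sort field has to be indexed with doc
values:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class ClusterSortedIndex {
  public static void main(String[] args) throws IOException {
    // Hypothetical location; any Directory implementation would do.
    Path path = Files.createTempDirectory("cluster-sorted-index");

    IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
    // Ask Lucene to order the documents of each segment by the "cluster" field.
    config.setIndexSort(new Sort(new SortField("cluster", SortField.Type.LONG)));

    try (Directory dir = FSDirectory.open(path);
         IndexWriter writer = new IndexWriter(dir, config)) {
      long[] clusters = {2L, 0L, 1L, 0L, 2L}; // made-up cluster ids
      for (int i = 0; i < clusters.length; i++) {
        Document doc = new Document();
        // The index sort reads this doc-values field.
        doc.add(new NumericDocValuesField("cluster", clusters[i]));
        doc.add(new TextField("body", "short text " + i, Field.Store.NO));
        writer.addDocument(doc);
      }
      writer.commit();
    }
  }
}
```

This does not change which segment a document ends up in. It only orders
documents inside each segment, so docs of one cluster (and their doc values)
sit in a contiguous doc ID range. An index created with a sort has to be
reopened with the same sort in later `IndexWriter` sessions.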

On Thu, May 18, 2017 at 11:04, Tommaso Teofili <[email protected]>
wrote:

> Hi all,
>
> I am working on a use case where my Lucene index stores documents composed
> of (relatively short) text and binary values. At retrieval time I need to
> retrieve documents that belong to a set of cluster values (e.g. facets).
> In that context I was wondering if and how it would be possible to make it
> more probable that documents (and associated docValues) that belong to the
> same cluster fall into the same segment.
> That would give better storage locality [1] and presumably better
> performance (given that docs belonging to the same cluster get retrieved
> together most of the time in my use case).
> At first I looked into extending the DV format, but that's segment
> agnostic, so I am thinking of coming up with a merge policy that
> produces segments whose docs belong to the same cluster with high
> probability.
> Any other ideas or suggestions?
>
> Regards,
> Tommaso
>
> [1] : https://en.wikipedia.org/wiki/Locality_of_reference
>
