On Tue, Nov 5, 2024 at 5:17 PM Adrien Grand <jpou...@gmail.com> wrote:

> Why is it important to break down per field as opposed to scaling based on
> the total volume of vector data?

It's really for internal planning purposes / service telemetry ... on the Amazon product search team (where I also work w/ Tanmay -- hi Tanmay!) we have a number of teams using our Lucene search service to experiment with KNN search, varying the number of dimensions, whether quantization is in use, which ML model, etc. These fields come and go, sometimes without our (low-level infrastructure) service knowing ahead of time how they are changing. So we would ideally like an efficient way to break out per-field KNN disk usage and "ideal hot RAM" online (in our production service), instead of offline / inefficiently, e.g. by rewriting the whole index into separate files (Robert's cool DiskUsage tool).

It's tricky with KNN and features like scalar quantization (https://www.elastic.co/search-labs/blog/scalar-quantization-in-lucene), and soon RaBitQ (https://github.com/apache/lucene/pull/13651), because the on-disk form (which retains the full float32-precision vectors) is different from what searching really uses (the quantized byte-per-dimension form). So the disk consumed by each field is larger than the amount of effective "hot RAM" you might need.

Mike McCandless

http://blog.mikemccandless.com
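PS: a rough back-of-envelope sketch of that float32-vs-quantized gap, assuming int8 scalar quantization and made-up counts (the real per-field files also hold the HNSW graph and quantization metadata, which are ignored here):

// Sizing sketch for one hypothetical KNN field; numbers are illustrative only.
public class KnnFieldSizingSketch {
  public static void main(String[] args) {
    long numVectors = 10_000_000L;  // vectors indexed in this field (hypothetical)
    int dims = 768;                 // vector dimension (hypothetical)

    long float32Bytes   = numVectors * (long) dims * Float.BYTES; // full-precision copy kept on disk
    long quantizedBytes = numVectors * (long) dims;               // ~1 byte per dimension after quantization

    System.out.printf("per-field disk (float32 + int8): ~%.1f GB%n",
        (float32Bytes + quantizedBytes) / 1e9);
    System.out.printf("per-field \"hot RAM\" (int8 only): ~%.1f GB%n",
        quantizedBytes / 1e9);
  }
}

With these made-up numbers the field occupies ~38 GB on disk but only ~7.7 GB of the quantized form needs to be "hot" for searching, which is why we'd like both numbers broken out per field.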