Hi Viliam,

Your logic is mostly correct. Here is a version that should be a bit simpler and correct (but beware, untested):

IndexReader reader; // your multi-reader
int docID; // top-level doc ID
int readerID = ReaderUtil.subIndex(docID, reader.leaves());
LeafReaderContext leafContext = reader.leaves().get(readerID);
int leafDocID = docID - leafContext.docBase;
FloatVectorValues values = leafContext.reader().getFloatVectorValues("my_vector_field");
DocIndexIterator iterator = values.iterator();
float[] vector;
if (iterator.advance(leafDocID) == leafDocID) { // this doc ID has a vector
  vector = values.vectorValue(iterator.index());
} else {
  vector = null;
}

On Mon, Feb 10, 2025 at 5:01 PM Viliam Ďurina <viliam.dur...@gmail.com> wrote:

> Dear all,
>
> when indexing vector fields, Lucene doesn't allow specifying the vector
> field as stored (it throws `IllegalStateException: Cannot store value of
> type class [F`). When trying to retrieve the value using
> `IndexReader.storedFields()`, the vector field isn't stored.
>
> However, Lucene 10 stores the vectors in `.vec` files. I was able to
> retrieve them using this complicated code, for which I had to make the
> `readerIndex` and `readerBase` methods in `BaseCompositeReader` public
> (they are protected):
>
> int docId = ...; // the docId to retrieve, e.g. coming out of a search
> IndexReader node = reader.getContext().reader();
> while (node instanceof BaseCompositeReader) {
>   int index = ((BaseCompositeReader) node).readerIndex(docId);
>   int base = ((BaseCompositeReader) node).readerBase(index);
>   docId -= base;
>   node = ((BaseCompositeReader) node).getContext().children().get(index).reader();
> }
> assert node instanceof LeafReader;
> assert node.leaves().size() == 1;
> FloatVectorValues vectorValues =
>     node.leaves().getFirst().reader().getFloatVectorValues("myVectorField");
> float[] vector = vectorValues.vectorValue(docId);
>
> My reader is a `MultiReader`, composed of multiple `DirectoryReader`s.
>
> Is there any public API to retrieve the vector values? If not, is there any
> particular reason to not make the vectors available, if Lucene stores them
> anyway? Even if the vectors are quantized, original raw vectors are stored,
> though they are never used.
>
> Thanks,
> Viliam
>

--
Adrien
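
For readers who want to see what the top-level to leaf doc ID mapping in the reply amounts to, here is a minimal plain-Java sketch. It does not use Lucene at all: the class `LeafLookup`, its `subIndex` method and the `docBases` array are hypothetical stand-ins chosen for illustration, assuming each leaf is described only by its `docBase` (the first top-level doc ID it covers) and that the bases are sorted ascending starting at 0.

```java
import java.util.Arrays;

public class LeafLookup {

    // Returns the index of the leaf whose doc ID range contains docID.
    // Assumption: docBases is sorted ascending and docBases[0] == 0.
    static int subIndex(int docID, int[] docBases) {
        int pos = Arrays.binarySearch(docBases, docID);
        if (pos >= 0) {
            // Exact hit on a base: if several (empty) leaves share this base,
            // move to the last one, which is the one that actually holds docID.
            while (pos + 1 < docBases.length && docBases[pos + 1] == docID) {
                pos++;
            }
            return pos;
        }
        // Not found: binarySearch returned -(insertionPoint) - 1,
        // and the containing leaf is the one just before the insertion point.
        return -pos - 2;
    }

    public static void main(String[] args) {
        int[] docBases = {0, 5, 12}; // three leaves starting at doc 0, 5 and 12
        int docID = 7;               // top-level doc ID
        int leaf = subIndex(docID, docBases);
        int leafDocID = docID - docBases[leaf]; // doc ID relative to that leaf
        System.out.println("doc " + docID + " -> leaf " + leaf + ", leafDocID " + leafDocID);
        // prints: doc 7 -> leaf 1, leafDocID 2
    }
}
```

The leaf-relative ID computed this way is the value to compare against the result of `advance`, since a leaf's iterator only knows leaf-relative doc IDs.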