Pulkitg64 opened a new pull request, #15630: URL: https://github.com/apache/lucene/pull/15630
### Description

Issue: #13158

With the help of Claude Code, I have added support for writing empty `vec` and `vemf` files. This is the first step toward read-only index support. Once empty raw vector-data files are generated alongside the full-precision files, the user can choose which of the two to use (support for making this choice is yet to be added). Choosing the empty raw-data files at search time reduces disk usage by at least 80%, depending on the quantization used.

The follow-up to this PR would be to differentiate between files used by writers and files used by searchers. For example, writers require the full-precision vector files because they are used to compute the quantized vectors; searchers do not need them, so they can be dropped at search time.

Implementation: During segment flush, we write empty vector files that contain no vector data, only headers and footers compatible with Lucene99FlatVectorsFormat. This enables the FlatVectorsReader classes to read the empty files without throwing an exception. This PR also adds support in the QuantizedVectorsReader class for reading quantized vectors directly when no full-precision vectors are present in the index, so reading empty files should not be an issue.
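To make the "headers and footers only" idea concrete, here is a minimal, standalone sketch of a file with a header, an optional payload, and a checksummed footer. It does not use Lucene and does not reproduce Lucene's on-disk format: the class name, magic numbers, codec name, and field layout are all hypothetical, chosen only to show why a reader that validates just the header and footer accepts a zero-length payload.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative only: a simplified header/payload/footer layout, NOT Lucene's real format.
public class EmptyVectorFileSketch {
    // Hypothetical magic values for this sketch.
    static final int HEADER_MAGIC = 0x48454144; // ASCII "HEAD"
    static final int FOOTER_MAGIC = 0x464F4F54; // ASCII "FOOT"
    static final int FOOTER_LENGTH = 4 + 8;     // footer magic + CRC32 stored as a long

    // Layout: [magic][nameLen:short][name][version] [payload] [footerMagic][crc32]
    static byte[] writeFile(String codecName, int version, byte[] payload) {
        byte[] name = codecName.getBytes(StandardCharsets.UTF_8);
        int headerLength = 4 + 2 + name.length + 4;
        ByteBuffer buf = ByteBuffer.allocate(headerLength + payload.length + FOOTER_LENGTH);
        buf.putInt(HEADER_MAGIC);
        buf.putShort((short) name.length);
        buf.put(name);
        buf.putInt(version);
        buf.put(payload);                 // zero bytes for an "empty" vector file
        buf.putInt(FOOTER_MAGIC);
        CRC32 crc = new CRC32();
        crc.update(buf.array(), 0, buf.position()); // checksum covers everything written so far
        buf.putLong(crc.getValue());
        return buf.array();
    }

    // Validates header and footer, then returns the number of payload bytes in between.
    static int payloadLength(byte[] file) {
        ByteBuffer buf = ByteBuffer.wrap(file);
        if (file.length < 4 + 2 + 4 + FOOTER_LENGTH || buf.getInt(0) != HEADER_MAGIC) {
            throw new IllegalStateException("bad or truncated header");
        }
        int nameLength = buf.getShort(4) & 0xFFFF;
        int headerLength = 4 + 2 + nameLength + 4;
        int footerStart = file.length - FOOTER_LENGTH;
        if (footerStart < headerLength || buf.getInt(footerStart) != FOOTER_MAGIC) {
            throw new IllegalStateException("bad or truncated footer");
        }
        CRC32 crc = new CRC32();
        crc.update(file, 0, footerStart + 4);
        if (crc.getValue() != buf.getLong(footerStart + 4)) {
            throw new IllegalStateException("checksum mismatch");
        }
        return footerStart - headerLength;
    }

    public static void main(String[] args) {
        byte[] empty = writeFile("SketchFlatVectors", 0, new byte[0]);
        System.out.println("payload bytes in empty file: " + payloadLength(empty));
    }
}
```

In this sketch the reader never assumes any payload is present, so an empty file passes the same validation as a full one. Whatever consumes the payload afterwards still has to tolerate finding zero vectors, which is the role the description assigns to the quantized reader.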

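The "at least 80%" figure in the description can be sanity-checked with simple arithmetic. The sketch below assumes 32-bit float raw vectors and counts only the per-dimension vector bytes; it ignores per-vector correction terms, graph files, and metadata, so it is one plausible reading of the claim rather than a measurement. The class and method names are hypothetical.

```java
// Back-of-the-envelope estimate of vector-data savings when raw float32 vectors are dropped.
// Assumption: each dimension costs 32 bits raw plus bitsPerQuantizedDim bits quantized.
public class RawVectorSavingsSketch {
    static double savedFraction(int bitsPerQuantizedDim) {
        double rawBits = 32.0; // float32 per dimension (assumed)
        return rawBits / (rawBits + bitsPerQuantizedDim);
    }

    public static void main(String[] args) {
        for (int bits : new int[] {8, 4, 1}) {
            System.out.printf("%d-bit quantization: %.1f%% of vector data saved%n",
                bits, 100.0 * savedFraction(bits));
        }
    }
}
```

Under these assumptions, 8-bit quantization gives exactly 32 / 40 = 80%, and narrower quantization saves more, which matches the "at least 80%, depending on the quantization used" wording.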