Pulkitg64 opened a new pull request, #15630:
URL: https://github.com/apache/lucene/pull/15630

   ### Description
   
   Issue: #13158
   
   With the help of Claude Code, I have tried to add support for writing empty 
`vec` and `vemf` files. This is the first step toward read-only index 
support. Once empty raw vector-data files are generated alongside the 
full-precision files, the user can choose which of the two to use (support for 
this choice is yet to be added). Choosing the empty raw-data files at search 
time lets the user reduce disk usage by at least 80%, depending on the 
quantization used.
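
A rough way to see where a figure like 80% could come from is to compare per-vector bytes with and without the raw float32 copy. The sketch below is only a back-of-the-envelope estimate under stated assumptions: the class and method names are hypothetical, 768 dimensions is an arbitrary example, and it ignores per-vector correction terms, metadata, the HNSW graph, and all other index files, so real savings will differ somewhat.

```java
import java.util.Locale;

/** Back-of-the-envelope estimate of vector storage saved by dropping raw float32 data. */
final class VectorDiskSavings {

  /** Bytes for one full-precision vector: 4 bytes per float32 dimension. */
  static long rawBytes(int dims) {
    return 4L * dims;
  }

  /** Bytes for one quantized vector at the given bits per dimension, rounded up to whole bytes. */
  static long quantizedBytes(int dims, int bitsPerDim) {
    return ((long) dims * bitsPerDim + 7) / 8;
  }

  /** Fraction of (raw + quantized) vector bytes that disappears when the raw copy is dropped. */
  static double savedFraction(int dims, int bitsPerDim) {
    long raw = rawBytes(dims);
    long quantized = quantizedBytes(dims, bitsPerDim);
    return (double) raw / (raw + quantized);
  }

  public static void main(String[] args) {
    int dims = 768; // hypothetical dimension count, chosen only for illustration
    for (int bits : new int[] {8, 4, 1}) {
      System.out.printf(
          Locale.ROOT, "%d-bit: %.1f%% saved%n", bits, 100.0 * savedFraction(dims, bits));
    }
  }
}
```

Under these simplifications, 8-bit quantization gives 80% and lower bit widths give more, which is consistent with "at least 80%, depending on the quantization used".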
   
   The follow-up for this PR would be to add support for differentiating 
between files used by writers and files used by searchers. For example, 
full-precision vector files are required by writers since they are used for 
computing quantized vectors; however, they are not needed during search time 
and hence can be dropped.
   
   Implementation:
   
   During the segment flush process, we write empty vector files that contain 
no vector data, only headers and footers compatible with 
Lucene99FlatVectorsFormat. This lets the FlatVectorsReader classes open the 
empty files without throwing an exception. As part of this PR, we also added 
support to the QuantizedVectorsReader class for reading quantized vectors 
directly when no full-precision vectors are present in the index, so reading 
empty files should not be an issue.
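
To illustrate the idea of a file that holds only a header and a footer, here is a minimal, self-contained sketch. It does not use Lucene's APIs and is not byte-compatible with Lucene99FlatVectorsFormat: the layout, the marker constants, and every name in it are assumptions made for illustration. It only shows that a reader which validates the header and a checksummed footer can accept a file with zero payload bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Illustrative header-plus-footer-only file; the layout here is hypothetical, not Lucene's. */
final class EmptyVectorFileSketch {
  static final int HEADER_MAGIC = 0x3fd76c17; // marker constant chosen for this sketch
  static final int FOOTER_MAGIC = ~HEADER_MAGIC;
  static final int FOOTER_LENGTH = Integer.BYTES + Long.BYTES; // footer magic + CRC32 value

  /** Header layout: magic, name length, UTF-8 name bytes, version (all big-endian). */
  static byte[] header(String formatName, int version) {
    byte[] name = formatName.getBytes(StandardCharsets.UTF_8);
    return ByteBuffer.allocate(3 * Integer.BYTES + name.length)
        .putInt(HEADER_MAGIC)
        .putInt(name.length)
        .put(name)
        .putInt(version)
        .array();
  }

  /** Writes a header, no vector payload at all, and a footer carrying a CRC32 checksum. */
  static byte[] writeEmpty(String formatName, int version) {
    byte[] header = header(formatName, version);
    ByteBuffer file = ByteBuffer.allocate(header.length + FOOTER_LENGTH);
    file.put(header);
    // Intentionally nothing is written between the header and the footer.
    file.putInt(FOOTER_MAGIC);
    CRC32 crc = new CRC32();
    crc.update(file.array(), 0, file.position()); // checksum covers header and footer magic
    file.putLong(crc.getValue());
    return file.array();
  }

  /** Validates header and footer, then returns how many payload bytes lie between them. */
  static int payloadLength(byte[] file, String formatName, int version) {
    byte[] header = header(formatName, version);
    int footerStart = file.length - FOOTER_LENGTH;
    if (footerStart < header.length
        || !Arrays.equals(header, Arrays.copyOfRange(file, 0, header.length))) {
      throw new IllegalStateException("header mismatch or truncated file");
    }
    ByteBuffer footer = ByteBuffer.wrap(file, footerStart, FOOTER_LENGTH);
    CRC32 crc = new CRC32();
    crc.update(file, 0, footerStart + Integer.BYTES);
    if (footer.getInt() != FOOTER_MAGIC || footer.getLong() != crc.getValue()) {
      throw new IllegalStateException("footer mismatch or corrupted file");
    }
    return footerStart - header.length;
  }

  public static void main(String[] args) {
    byte[] file = writeEmpty("SketchFlatVectors", 0); // hypothetical format name
    System.out.println("payload bytes: " + payloadLength(file, "SketchFlatVectors", 0));
  }
}
```

In the PR itself the headers and footers are the ones Lucene99FlatVectorsFormat expects; the sketch only mirrors the general shape of "valid envelope, zero payload".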


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

