mccullocht commented on issue #15734: URL: https://github.com/apache/lucene/issues/15734#issuecomment-3930097851
This is a very interesting idea, although I agree it may not be something you want to do in Lucene itself. If it were going to happen inside Lucene, you would want the query object to participate and perform the expansion once before search, to avoid repeating it per segment.

One aspect that might be interesting to play with is the asymmetric quantization of the query and doc vectors that happens in 1- and 2-bit quantization. Would 6-bit quantization perform better as 6144 dims x 1 bit or 3072 dims x 2 bits than as 1536 dims x 4 bits? Storing the original vector makes this cost-prohibitive, but there might be ways to work around that.

Re: compression factor. Another way of expressing this might be "I would like each vector to be N bytes long", making it purely about storage/memory costs.
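To make the storage arithmetic concrete, here is a rough sketch showing that the three layouts mentioned above occupy the same number of bytes, and how a "N bytes per vector" target could be turned back into a dimension count. The helper names `packed_bytes` and `dims_for_budget` are hypothetical and not part of Lucene, and the calculation assumes tightly bit-packed components while ignoring any per-vector metadata a real format may store.

```python
def packed_bytes(dims: int, bits: int) -> int:
    """Bytes needed for `dims` components at `bits` bits each, bit-packed.

    Hypothetical helper: counts only the quantized components and rounds
    up to a whole byte; per-vector metadata is deliberately ignored.
    """
    return (dims * bits + 7) // 8


def dims_for_budget(n_bytes: int, bits: int) -> int:
    """Largest number of `bits`-bit components that fit in `n_bytes` bytes."""
    return (n_bytes * 8) // bits


# The three layouts from the comment: (dimensions, bits per dimension).
configs = [(6144, 1), (3072, 2), (1536, 4)]
sizes = {cfg: packed_bytes(*cfg) for cfg in configs}

for (dims, bits), size in sizes.items():
    print(f"{dims} dims x {bits} bit(s) -> {size} bytes")

# Going the other way: fix a byte budget and derive the dimension count.
budget = 768
for bits in (1, 2, 4):
    print(f"{budget} bytes at {bits} bit(s) -> {dims_for_budget(budget, bits)} dims")
```

All three layouts come to 6144 bits, so under these assumptions the choice between them is about recall and scoring cost rather than storage, which is what makes the byte-budget framing convenient.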
