benwtrent commented on code in PR #16092:
URL: https://github.com/apache/lucene/pull/16092#discussion_r3374465950
##########
lucene/core/src/java/org/apache/lucene/search/KnnFloatVectorQuery.java:
##########
@@ -95,6 +98,26 @@ public KnnFloatVectorQuery(
       String field, float[] target, int k, Query filter, KnnSearchStrategy searchStrategy) {
     super(field, k, filter, searchStrategy);
     this.target = VectorUtil.checkFinite(Objects.requireNonNull(target, "target"));
+    this.targetPreRotated = false;
+  }
+
+  private boolean targetPreRotated;
+
+  @Override
+  public Query rewrite(IndexSearcher indexSearcher) throws IOException {
+    if (targetPreRotated == false) {
+      FieldInfo fi =
+          FieldInfos.getMergedFieldInfos(indexSearcher.getIndexReader()).fieldInfo(field);
+      if (fi != null
+          && fi.getVectorDimension() == target.length
+          && "true".equals(fi.getAttribute(RotationAwareKnnVectorsFormat.ROTATION_ENABLED_KEY))) {
+        float[] rotated = new float[target.length];
+        HadamardRotation.forDimension(fi.getVectorDimension()).rotate(target, rotated);
Review Comment:
> It amortizes to 0% as the index gets bigger and bigger?
It does amortize, but the constant overhead is significant for single-bit
quantized vectors, which I expect to be even faster than native int8 (which
has almost 400x better throughput than the rotation?!?).
I agree, we should aim for opt-in and simplicity. I am simply saying we
shouldn't incur significant constant overheads when they are avoidable. I am
fine with iterating on the opt-in API as it stands now. I have just been
pushing a little bit to see if we can land on an API that doesn't have a
significantly adverse effect on segments with fewer than 1M vectors.
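To make the amortization argument concrete, here is a rough back-of-the-envelope
sketch. It is not Lucene code: the class `RotationOverhead` and its method are
hypothetical, and it assumes the "almost 400x" figure can be read as "one
rotation costs about as much as 400 int8 vector comparisons". Under that
assumption, the share of query time spent on the rotation is
rotationCost / (rotationCost + vectorsScored).

```java
/** Hypothetical cost model for illustration only; not part of Lucene. */
final class RotationOverhead {
  private RotationOverhead() {}

  /**
   * Returns the fraction of per-query work spent on the one-off rotation.
   *
   * @param rotationCost cost of one rotation, in units of one vector comparison (assumed)
   * @param vectorsScored number of vector comparisons the query performs
   */
  static double overheadFraction(double rotationCost, long vectorsScored) {
    if (rotationCost < 0 || vectorsScored < 0) {
      throw new IllegalArgumentException("costs must be non-negative");
    }
    double total = rotationCost + vectorsScored;
    // With no work at all there is no overhead to report.
    return total == 0 ? 0.0 : rotationCost / total;
  }

  public static void main(String[] args) {
    // Assumption: one rotation is roughly 400 int8 comparisons.
    double rotationCost = 400.0;
    for (long scored : new long[] {100, 1_000, 10_000, 100_000}) {
      System.out.printf(
          "scored=%d overhead=%.1f%%%n", scored, 100 * overheadFraction(rotationCost, scored));
    }
  }
}
```

The fraction only approaches zero as the number of comparisons per query grows
well past the rotation cost. If single-bit comparisons are cheaper than int8
ones, the same rotation is worth even more comparisons, so the overhead share
would be larger than this model suggests.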
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]