atris commented on PR #15613:
URL: https://github.com/apache/lucene/pull/15613#issuecomment-3899240459

   I finished the 16.41M benchmark on 32GB RAM. SPANN R=1 is the lowest-latency 
config observed; its recall tops out at ~0.847 even with higher NProbe. R=2 fails 
at full scale due to disk exhaustion. HNSW reaches ~0.99 recall but at ~2s 
latency, so it is not viable for interactive use on this hardware. The recall 
ceiling is hardware-bound, not a SPANN limit; higher recall is achievable when 
replication is feasible (R=2 reaches 0.933 recall at 10M).
   
    Luceneutil summary loglines:
   
     SPANN R=1, 16.41M (NProbe 12/24/48) 
     SUMMARY: 0.833  46.500  … 16410000 … 64319.01 … SPANN
     SUMMARY: 0.844  12.320  … 16410000 … 64319.02 … SPANN
     SUMMARY: 0.847  13.400  … 16410000 … 64318.87 … SPANN
   
     SPANN R=2 failure, 16.41M (no SUMMARY line: the run exhausted disk)
   
     HNSW, 16.41M
     SUMMARY: 0.991  1849.740 … 16410000 … HNSW
     SUMMARY: 0.993  2024.140 … 16410000 … HNSW
     SUMMARY: 0.996  2076.020 … 16410000 … HNSW
   
     SPANN R=2 success, 10M
     SUMMARY: 0.933  60.550 … 10000000 … SPANN
   
     Current work: a disk-efficient SPANN build (partitionId+docId only) plus 
centroid HNSW assignment, to reduce scratch space and indexing time, aiming to 
make R=2 feasible at 16.4M.
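
To make the scratch-space argument concrete, here is a rough back-of-envelope sketch. It is not code from the PR. It assumes that the alternative to a (partitionId, docId) record is spilling a full float32 copy of the vector per assignment, that both ids are 4-byte ints, and it uses a vector dimension of 768 purely as a placeholder, since the benchmark's dimension is not stated here. All function names are hypothetical.

```python
# Hypothetical back-of-envelope model; none of these names come from the PR.
INT_BYTES = 4  # assumed width of partitionId and of docId


def scratch_bytes_with_vectors(num_docs, replication, dim, component_bytes=4):
    """Scratch size if every assignment spills a full vector copy plus its docId."""
    return num_docs * replication * (dim * component_bytes + INT_BYTES)


def scratch_bytes_ids_only(num_docs, replication):
    """Scratch size if every assignment spills only a (partitionId, docId) pair."""
    return num_docs * replication * 2 * INT_BYTES


if __name__ == "__main__":
    num_docs = 16_410_000
    replication = 2
    dim = 768  # placeholder dimension, NOT taken from the benchmark

    gib = 1024 ** 3
    full = scratch_bytes_with_vectors(num_docs, replication, dim)
    ids = scratch_bytes_ids_only(num_docs, replication)
    print(f"with vectors: {full / gib:.1f} GiB")
    print(f"ids only:     {ids / gib:.2f} GiB")
```

Under these assumptions the id-only scratch size grows with R but not with the vector dimension, which is why it could help make R=2 fit on disk at 16.4M.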


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

