Pulkitg64 commented on PR #15549:
URL: https://github.com/apache/lucene/pull/15549#issuecomment-3842988189

   I think I found the problem. I was running these benchmarks on m5.12xlarge
   machines, and that instance type doesn't support float16 intrinsic operations.
   So I switched to m7g.8xlarge machines, and here are the results:
   
   I am seeing much better performance with float16 encoding now. Latency with
   float16 is still roughly 50% higher than with float32 (3.229 ms vs. 2.111 ms).
   I think this is expected because of the extra conversion between float16 and
   float32. I also haven't implemented bulk scoring yet, so that may recover some
   of the latency. The indexing rate improved by a little over 10% (5879.93 vs.
   5214.04 docs/s); this may be because the smaller vectors are faster to fetch.
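
To illustrate where the conversion cost mentioned above can come from, here is a minimal scalar sketch. It is not the code in this PR (the profile shows the real path going through `jdk.incubator.vector.Float16Vector`). It assumes float16 values are stored as raw `short` bits, and the class and method names are hypothetical. It needs JDK 20 or newer for `Float.float16ToFloat` and `Float.floatToFloat16`.

```java
public class Float16DotSketch {
    /** Narrows each float to its IEEE 754 binary16 bit pattern (hypothetical helper). */
    static short[] encode(float[] values) {
        short[] out = new short[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = Float.floatToFloat16(values[i]);
        }
        return out;
    }

    /** Dot product over float16-encoded vectors, accumulating in float32. */
    static float dotProduct(short[] a, short[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector dimensions differ");
        }
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            // Each element is widened to float32 before the multiply;
            // this widening is the extra work a float32 index does not pay.
            sum += Float.float16ToFloat(a[i]) * Float.float16ToFloat(b[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        short[] a = encode(new float[] {1f, 2f, 3f});
        short[] b = encode(new float[] {0.5f, 2f, 1f});
        System.out.println(dotProduct(a, b));
    }
}
```

On hardware with float16 support the JIT may turn these conversions into single instructions, which is consistent with the difference seen between the two instance types, but that depends on the JDK build and CPU.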
   
   | Encoding | recall | latency(ms) | netCPU | avgCpuCount | visited | index(s) | index_docs/s | force_merge(s) | index_size(MB) |
   |----------|--------|-------------|--------|-------------|---------|----------|--------------|----------------|----------------|
   | float16  | 0.992  | 3.229       | 3.154  | 0.977       | 6820    | 17.01    | 5879.93      | 0.01           | 207.65         |
   | float32  | 0.990  | 2.111       | 2.066  | 0.978       | 6858    | 19.18    | 5214.04      | 22.81          | 403.03         |
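
The relative differences quoted above can be recomputed from the table. The sketch below uses float32 as the baseline; the class and method names are hypothetical and the numbers are copied from the table rows.

```java
public class BenchmarkDeltas {
    /** Signed percentage change of value relative to baseline. */
    static double percentChange(double baseline, double value) {
        return (value - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // float32 is the baseline (first argument), float32 vs. float16 per column.
        System.out.printf("latency(ms):    %+.1f%%%n", percentChange(2.111, 3.229));
        System.out.printf("index_docs/s:   %+.1f%%%n", percentChange(5214.04, 5879.93));
        System.out.printf("index(s):       %+.1f%%%n", percentChange(19.18, 17.01));
        System.out.printf("index_size(MB): %+.1f%%%n", percentChange(403.03, 207.65));
    }
}
```

A positive value means float16 is higher than float32 for that column, so positive is worse for latency and better for index_docs/s.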
   
   
   * Profiler for float16:
   
   ```
   40.69%        82592         jdk.incubator.vector.Float16Vector#reduceLanesTemplate() [Inlined code]
   20.50%        41612         org.apache.lucene.internal.vectorization.PanamaVectorUtilSupport#dotProductBody() [JIT compiled code]
   5.41%         10983         jdk.incubator.vector.Float16Vector#fromArray0Template() [Inlined code]
   5.00%         10158         org.apache.lucene.index.Float16VectorValues$1#vectorValue() [Inlined code]
   3.92%         7964          jdk.internal.vm.vector.VectorSupport#maybeRebox() [Inlined code]
   2.21%         4488          jdk.internal.vm.vector.VectorSupport$VectorPayload#getPayload() [Inlined code]
   1.71%         3467          org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel() [JIT compiled code]
   1.43%         2909          org.apache.lucene.util.hnsw.OnHeapHnswGraph#getNeighbors() [Inlined code]
   1.19%         2408          org.apache.lucene.util.TernaryLongHeap#downHeap() [Inlined code]
   1.18%         2386          org.apache.lucene.codecs.lucene90.compressing.Lucene90CompressingStoredFieldsReader$BlockState#doReset() [JIT compiled code]
   0.87%         1763          java.lang.invoke.VarHandleSegmentAsInts#get() [Inlined code]
   0.85%         1722          org.apache.lucene.util.FixedBitSet#getAndSet() [Inlined code]
   0.75%         1514          org.apache.lucene.util.hnsw.OnHeapHnswGraph#nextNeighbor() [Inlined code]
   0.63%         1278          org.apache.lucene.util.hnsw.HnswConcurrentMergeBuilder$MergeSearcher#graphSeek() [JIT compiled code]
   0.62%         1251          jdk.incubator.vector.Float16Vector#lanewiseTemplate() [Inlined code]
   0.61%         1247          org.apache.lucene.util.hnsw.HnswGraphBuilder#diversityCheck() [JIT compiled code]
   0.47%         961           java.util.ArrayList#elementData() [Inlined code]
   0.47%         951           org.apache.lucene.util.hnsw.NeighborArray#size() [Inlined code]
   0.45%         904           sun.nio.ch.UnixFileDispatcherImpl#write0() [Native code]
   0.44%         894           org.apache.lucene.util.FixedBitSet#getAndSet() [JIT compiled code]
   0.40%         813           org.apache.lucene.util.hnsw.NeighborArray#nodes() [Inlined code]
   0.36%         730           sun.nio.ch.UnixFileDispatcherImpl#read0() [Native code]
   0.35%         710           org.apache.lucene.codecs.hnsw.DefaultFlatVectorScorer$Float16ScoringSupplier$1#setScoringOrdinal() [Inlined code]
   0.34%         699           org.apache.lucene.util.TernaryLongHeap#upHeap() [Inlined code]
   0.34%         689           org.apache.lucene.util.hnsw.HnswGraphSearcher#graphNextNeighbor() [Inlined code]
   0.33%         677           jdk.internal.misc.ScopedMemoryAccess#getByteInternal() [Inlined code]
   0.31%         623           sun.nio.ch.UnixFileDispatcherImpl#force0() [Native code]
   0.30%         616           org.apache.lucene.internal.vectorization.PanamaVectorUtilSupport#dotProduct() [Inlined code]
   0.26%         521           jdk.incubator.vector.Float16Vector#rOpTemplate() [Inlined code]
   0.24%         495           org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel() [Interpreted code]
   ```
   
   * Profiler for float32:
   
   ```
   63.09%        125971        jdk.incubator.vector.FloatVector#reduceLanesTemplate() [Inlined code]
   5.72%         11426         jdk.internal.misc.ScopedMemoryAccess#loadFromMemorySegmentScopedInternal() [Inlined code]
   3.86%         7714          org.apache.lucene.index.FloatVectorValues$1#vectorValue() [Inlined code]
   2.97%         5930          org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel() [JIT compiled code]
   1.58%         3155          org.apache.lucene.util.FixedBitSet#getAndSet() [Inlined code]
   1.35%         2691          org.apache.lucene.util.TernaryLongHeap#downHeap() [Inlined code]
   1.28%         2565          org.apache.lucene.internal.vectorization.PanamaVectorUtilSupport#dotProductBody() [JIT compiled code]
   1.26%         2515          jdk.incubator.vector.FloatVector#fromArray0Template() [Inlined code]
   1.25%         2500          org.apache.lucene.util.hnsw.HnswConcurrentMergeBuilder$MergeSearcher#graphSeek() [JIT compiled code]
   1.16%         2326          org.apache.lucene.codecs.lucene90.compressing.Lucene90CompressingStoredFieldsReader$BlockState#doReset() [JIT compiled code]
   1.11%         2212          org.apache.lucene.util.hnsw.HnswGraphBuilder#diversityCheck() [Inlined code]
   1.08%         2164          jdk.incubator.vector.FloatVector#lanewiseTemplate() [Inlined code]
   1.02%         2029          org.apache.lucene.util.hnsw.OnHeapHnswGraph#getNeighbors() [Inlined code]
   0.69%         1381          jdk.internal.misc.ScopedMemoryAccess#getByteInternal() [Inlined code]
   0.58%         1165          org.apache.lucene.util.hnsw.OnHeapHnswGraph#nextNeighbor() [Inlined code]
   0.54%         1075          sun.nio.ch.UnixFileDispatcherImpl#write0() [Native code]
   0.53%         1067          sun.nio.ch.UnixFileDispatcherImpl#force0() [Native code]
   0.49%         985           org.apache.lucene.util.VectorUtil#normalizeToUnitInterval() [Inlined code]
   0.45%         902           org.apache.lucene.util.TernaryLongHeap#upHeap() [Inlined code]
   0.43%         858           org.apache.lucene.util.hnsw.NeighborArray#nodes() [Inlined code]
   0.41%         811           org.apache.lucene.codecs.hnsw.DefaultFlatVectorScorer$FloatScoringSupplier$1#setScoringOrdinal() [Inlined code]
   0.39%         775           org.apache.lucene.util.GroupVIntUtil#readGroupVInt() [Inlined code]
   0.38%         749           org.apache.lucene.util.packed.DirectReader$DirectPackedReader20#get() [JIT compiled code]
   0.37%         739           sun.nio.ch.UnixFileDispatcherImpl#read0() [Native code]
   0.36%         716           org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel() [Interpreted code]
   0.28%         556           jdk.incubator.vector.FloatVector#fromMemorySegment() [Inlined code]
   0.24%         479           org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsReader$OffHeapHnswGraph#seek() [Inlined code]
   0.24%         475           java.util.concurrent.locks.ReentrantReadWriteLock#readLock() [Inlined code]
   0.21%         427           java.util.ArrayList#elementData() [Inlined code]
   0.20%         402           java.util.concurrent.locks.AbstractQueuedLongSynchronizer#compareAndSetState() [Inlined code]
   ```
   
   #### Next Steps:
   
   Understand the flame chart and try to further improve the float16 encoding 
benchmark runs.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

