siddharthteotia commented on pull request #6262:
URL: https://github.com/apache/incubator-pinot/pull/6262#issuecomment-730664669
I think the JMH numbers are probably a reasonable indicator of the benefit for
the single-read APIs. However, I don't think we have added exhaustive unit
tests covering every numBits value and different docId ranges to ensure these
are all functionally correct and won't segfault.
```
@Override
public int read(int index) {
  int offset = index >>> 3;
  int bitOffsetInByte = index & 0x7;
  return (_dataBuffer.getByte(offset) >>> (7 - bitOffsetInByte)) & 0x1;
}

@Override
public int readUnchecked(int index) {
  int offset = index >>> 3;
  int bitOffsetInByte = index & 0x7;
  return (_dataBuffer.getByte(offset) >>> (7 - bitOffsetInByte)) & 0x1;
}
```
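To illustrate the kind of coverage meant above, here is a minimal sketch of an exhaustive check for the 1-bit case. Everything in it is an assumption made for illustration: `OneBitReaderSketch`, `pack`, and `countMismatches` are hypothetical names, and a plain `byte[]` stands in for `_dataBuffer`. With an on-heap array an invalid index throws instead of crashing, so this only checks functional correctness, not the segfault concern.

```java
import java.util.Random;

/** Hypothetical stand-in for a 1-bit reader; a byte[] replaces _dataBuffer. */
final class OneBitReaderSketch {
  private final byte[] _data;
  private final int _numValues;

  OneBitReaderSketch(byte[] data, int numValues) {
    _data = data;
    _numValues = numValues;
  }

  /** Packs 0/1 values MSB-first, matching the (7 - bitOffsetInByte) shift in the quoted code. */
  static byte[] pack(int[] values) {
    byte[] out = new byte[(values.length + 7) >>> 3];
    for (int i = 0; i < values.length; i++) {
      if (values[i] != 0) {
        out[i >>> 3] |= (byte) (1 << (7 - (i & 0x7)));
      }
    }
    return out;
  }

  /** Checked read (an assumption of this sketch): rejects out-of-range indexes first. */
  int read(int index) {
    if (index < 0 || index >= _numValues) {
      throw new IndexOutOfBoundsException("index " + index + " not in [0, " + _numValues + ")");
    }
    return readUnchecked(index);
  }

  /** Same bit arithmetic as the quoted snippet; the caller guarantees a valid index. */
  int readUnchecked(int index) {
    int offset = index >>> 3;
    int bitOffsetInByte = index & 0x7;
    return (_data[offset] >>> (7 - bitOffsetInByte)) & 0x1;
  }

  /** Counts mismatches over every index, for lengths chosen around byte boundaries. */
  static int countMismatches(long seed) {
    Random random = new Random(seed);
    int mismatches = 0;
    for (int length : new int[] {1, 7, 8, 9, 63, 64, 65, 1000}) {
      int[] expected = new int[length];
      for (int i = 0; i < length; i++) {
        expected[i] = random.nextInt(2);
      }
      OneBitReaderSketch reader = new OneBitReaderSketch(pack(expected), length);
      for (int i = 0; i < length; i++) {
        if (reader.read(i) != expected[i] || reader.readUnchecked(i) != expected[i]) {
          mismatches++;
        }
      }
    }
    return mismatches;
  }
}
```

A real test would presumably repeat this for each supported numBits value and run against the actual buffer implementation, including indexes that fall in the last byte of the buffer.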
For the readDictIds function:
```
@Override
public void readDictIds(int[] docIds, int length, int[] dictIdBuffer,
    ForwardIndexReaderContext context)
```
- Is it correct to say that the new bulk read32() will be used here only if the
docIds array holds sequential (consecutive) ids, and that otherwise we will use
the new single-read APIs?
- Does readUnchecked() have the potential to segfault / core dump?
- Performance-wise, there is a chance of a negative impact from the memcpy
(especially for high-throughput use cases). So for this API at least, I don't
think JMH is always a reliable indicator of better performance. It is worth
validating on a prod-like workload.
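
The first question can be made concrete with a sketch of the dispatch being asked about. This is not taken from the PR: `ReaderSketch`, `bulkRead`, `ArrayReaderSketch`, and `DictIdDispatchSketch` are hypothetical names, `bulkRead` only stands in for whatever the bulk path actually is, and the contiguity check assumes the docIds are strictly ascending.

```java
/** Hypothetical reader interface; the method names are illustrative, not Pinot's API. */
interface ReaderSketch {
  int read(int index);

  /** Copies values for indexes [startIndex, startIndex + length) into out[0..length). */
  void bulkRead(int startIndex, int length, int[] out);
}

/** Trivial array-backed implementation so the dispatch can be exercised. */
final class ArrayReaderSketch implements ReaderSketch {
  private final int[] _values;

  ArrayReaderSketch(int[] values) {
    _values = values;
  }

  @Override
  public int read(int index) {
    return _values[index];
  }

  @Override
  public void bulkRead(int startIndex, int length, int[] out) {
    System.arraycopy(_values, startIndex, out, 0, length);
  }
}

final class DictIdDispatchSketch {
  private DictIdDispatchSketch() {
  }

  /**
   * Assumes docIds[0..length) is strictly ascending; under that assumption the
   * ids are consecutive exactly when last - first == length - 1.
   */
  static boolean isContiguous(int[] docIds, int length) {
    return length > 0 && docIds[length - 1] - docIds[0] == length - 1;
  }

  /** Fills dictIdBuffer and returns true if the bulk path was taken. */
  static boolean readDictIds(int[] docIds, int length, int[] dictIdBuffer, ReaderSketch reader) {
    if (isContiguous(docIds, length)) {
      reader.bulkRead(docIds[0], length, dictIdBuffer);
      return true;
    }
    // Non-consecutive ids: fall back to one single read per docId.
    for (int i = 0; i < length; i++) {
      dictIdBuffer[i] = reader.read(docIds[i]);
    }
    return false;
  }
}
```

If the actual dispatch looks like this, the share of calls that take the bulk path depends on how often filtered docId blocks are consecutive, which is one reason a prod-like workload may behave differently from a JMH run.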