iverase commented on a change in pull request #556: LUCENE-8673: Use radix
sorting when merging dimensional points
URL: https://github.com/apache/lucene-solr/pull/556#discussion_r252680898
##########
File path: lucene/core/src/java/org/apache/lucene/util/bkd/PointReader.java
##########
@@ -35,47 +33,11 @@
/** Returns the packed byte[] value */
public abstract byte[] packedValue();
- /** Point ordinal */
- public abstract long ord();
-
/** DocID for this point */
public abstract int docID();
- /** Iterates through the next {@code count} ords, marking them in the provided {@code ordBitSet}. */
- public void markOrds(long count, LongBitSet ordBitSet) throws IOException {
- for(int i=0;i<count;i++) {
- boolean result = next();
- if (result == false) {
- throw new IllegalStateException("did not see enough points from reader=" + this);
- }
- assert ordBitSet.get(ord()) == false: "ord=" + ord() + " was seen twice from " + this;
- ordBitSet.set(ord());
- }
- }
-
- /** Splits this reader into left and right partitions */
- public long split(long count, LongBitSet rightTree, PointWriter left, PointWriter right, boolean doClearBits) throws IOException {
-
- // Partition this source according to how the splitDim split the values:
- long rightCount = 0;
- for (long i=0;i<count;i++) {
- boolean result = next();
- assert result;
- byte[] packedValue = packedValue();
- long ord = ord();
- int docID = docID();
- if (rightTree.get(ord)) {
- right.append(packedValue, ord, docID);
- rightCount++;
- if (doClearBits) {
- rightTree.clear(ord);
- }
- } else {
- left.append(packedValue, ord, docID);
- }
- }
+ /** Build histogram of the document at the provided byte position */
+ public abstract void buildHistogram(int bytePosition, int[] histogram) throws IOException;
Review comment:
It actually does quite a bit (you don't need to copy every packedValue to a
byte array). Note that I am using the same approach that the `markOrds` method
uses in the current implementation.
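
To illustrate the point about avoiding copies, here is a minimal sketch of a histogram pass that reads one byte per point in place. It is not the code from this pull request: the class `HistogramSketch`, the flat `packed` array layout, the `bytesPerPoint` and `count` parameters, and the 256-bucket histogram size are all assumptions made for the example.

```java
/** Hypothetical sketch; this is not the PointReader API from the pull request. */
final class HistogramSketch {

  /**
   * Counts how often each byte value occurs at {@code bytePosition} across
   * {@code count} points stored back to back in {@code packed}, where each
   * point is {@code bytesPerPoint} bytes long. Counts are added to whatever
   * {@code histogram} already holds.
   */
  static void buildHistogram(byte[] packed, int bytesPerPoint, int count,
                             int bytePosition, int[] histogram) {
    if (histogram.length != 256) {
      throw new IllegalArgumentException("histogram must have 256 buckets");
    }
    if (bytePosition < 0 || bytePosition >= bytesPerPoint) {
      throw new IllegalArgumentException("bytePosition out of range: " + bytePosition);
    }
    for (int i = 0; i < count; i++) {
      // Read the single byte of interest in place; no per-point byte[] is copied.
      int b = packed[i * bytesPerPoint + bytePosition] & 0xff;
      histogram[b]++;
    }
  }

  public static void main(String[] args) {
    // Four 2-byte points; the bytes at position 1 are 1, 2, 2, 2.
    byte[] packed = {0, 1, 0, 2, 1, 2, (byte) 0xff, 2};
    int[] histogram = new int[256];
    buildHistogram(packed, 2, 4, 1, histogram);
    System.out.println(histogram[1] + " " + histogram[2]); // prints "1 3"
  }
}
```

A radix-style partitioning step can then use the bucket counts to decide where the split falls, without first materializing each packed value as a separate array.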