steFaiz commented on code in PR #6738:
URL: https://github.com/apache/paimon/pull/6738#discussion_r2588659837
##########
paimon-common/src/main/java/org/apache/paimon/lookup/sort/BlockIterator.java:
##########
@@ -70,32 +62,58 @@ public void remove() {
throw new UnsupportedOperationException();
}
+ /**
+ * Seeks to the first entry with a key that is either equal to or greater
than the specified
+ * key. After this call, the next invocation of {@link
BlockIterator#next()} will return that
+ * entry (if it exists).
+ *
+ * <p>Note that the comparing value must be monotonically increasing
across current block e.g.
+ * key and some special values such as the {@code lastRecordPosition} of
an {@code
+ * IndexBlockEntry}.
+ *
+ * @param targetKey target key
+ * @return true if found an equal record
+ */
public boolean seekTo(MemorySlice targetKey) {
int left = 0;
int right = recordCount - 1;
+ int mid = left + (right - left) / 2;
while (left <= right) {
- int mid = left + (right - left) / 2;
+ mid = left + (right - left) / 2;
seekTo(mid);
BlockEntry midEntry = readEntry();
int compare = comparator.compare(midEntry.getKey(), targetKey);
if (compare == 0) {
- polled = midEntry;
- return true;
+ break;
} else if (compare > 0) {
- polled = midEntry;
right = mid - 1;
} else {
left = mid + 1;
Review Comment:
## **Quick Answer**
It can't pass the Test QAQ
<img width="642" height="158" alt="image"
src="https://github.com/user-attachments/assets/e1fc0a1b-94fa-4b1a-b353-e0a7f7aa4797"
/>
## **Reason**
if we just let `polled` = null here, it means that if we are seeking to some
value exactly not present in current block, the `polled` will always be null.
At that case, the next iterator position after seek will always be the position
after the last `MID` position (because we will seek to mid and call readNext
for each loop).
Let's recall the example before:
First case:
* At that case, the target element is skipped!
```
Iter Pos
∆
|
|
values: ... 10, 20, 30, 40, 50 ...
| |
V V
HIGH MID
LOW
```
* the test result also prove that: the actual value is greater than our
expectation
<img width="592" height="214" alt="image"
src="https://github.com/user-attachments/assets/29895ef9-96ab-4e79-a658-cca4a2c5d419"
/>
Second case:
* At that case, the iteration process is expected.
```
Iter Pos
∆
|
values: 10, 20, 30, 40, 50
| |
V V
MID LOW
HIGH
```
## **Thinking**
I think if we really want to keep the `polled` field, which is always the
target element (if has some), we need to seek to the next position of polled
before returning. I think it will be equilvalent to current implementation. And
removing `polled` field makes the code more clean and readable.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]