hanahmily commented on code in PR #861:
URL:
https://github.com/apache/skywalking-banyandb/pull/861#discussion_r2576125821
##########
banyand/liaison/grpc/property.go:
##########
@@ -762,3 +784,38 @@ type repairInProcessKey struct {
entity string
modTime int64
}
+
+// sortPropertiesWithSortedValues sorts properties using pre-extracted sortedValue bytes.
+func sortPropertiesWithSortedValues(
Review Comment:
Inputs: Streams[1..N], Limit K
Output: Top-K Properties
1. Initialize Min-Heap: Create a Priority Queue (Min-Heap) of size $N$.
The Heap orders elements based on the SortTag value.
Pull the first item from each Stream and push to the Heap.
2. Global Deduplication State:
SeenIDs: A Hash Map <ID, Revision>.
ResultBuffer: A list that accumulates candidate results.
3. The Merge Loop:
While Heap is not empty:
- Pop: Remove the top element P (smallest tag value) from the Heap.
- Refill: Fetch the next element, if any, from the stream P came from and
push it onto the Heap.
- Conflict Check:
- Check SeenIDs for P.ID.
- Scenario 1 (New ID): P.ID is not in SeenIDs.
- Add P.ID -> P.Revision to SeenIDs.
- Append P to ResultBuffer.
- Scenario 2 (Duplicate - Older): SeenIDs has P.ID with Revision
$T_{old} \ge P.Revision$.
- P is an obsolete revision from a lagging shard.
- Action: Discard P.
- Scenario 3 (Duplicate - Newer): SeenIDs has P.ID with Revision
$T_{old} < P.Revision$.
- Update SeenIDs so that P.ID maps to P.Revision.
- Replace the old entry in the ResultBuffer with P, or remove it
if P is a deleted value.
- Re-sort the ResultBuffer based on the new tag values.
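
A minimal Go sketch of the merge described above, to make the three scenarios concrete. It is not the PR's implementation: `Property`, `entry`, `minHeap`, and `mergeTopK` are hypothetical names, `SortTag` is an `int64` rather than the sortedValue bytes, and streams are in-memory slices. Two assumptions go beyond the comment: a deleted value is recorded in SeenIDs but never buffered, and the loop drains every stream before truncating to K.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// Property is a hypothetical, simplified row used only for this sketch.
type Property struct {
	ID       string
	SortTag  int64
	Revision int64
	Deleted  bool
}

// entry pairs a property with the index of the stream it came from.
type entry struct {
	p      Property
	stream int
}

// minHeap orders entries by ascending SortTag.
type minHeap []entry

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].p.SortTag < h[j].p.SortTag }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(entry)) }
func (h *minHeap) Pop() any {
	old := *h
	e := old[len(old)-1]
	*h = old[:len(old)-1]
	return e
}

// mergeTopK merges streams that are each sorted by SortTag and keeps only the
// highest revision per ID. Assumption: it drains all streams, then keeps k.
func mergeTopK(streams [][]Property, k int) []Property {
	h := &minHeap{}
	pos := make([]int, len(streams)) // next unread index per stream
	for i, s := range streams {
		if len(s) > 0 {
			heap.Push(h, entry{p: s[0], stream: i})
			pos[i] = 1
		}
	}
	seen := map[string]int64{} // SeenIDs: ID -> highest revision so far
	var buf []Property         // ResultBuffer
	for h.Len() > 0 {
		e := heap.Pop(h).(entry)
		if s := streams[e.stream]; pos[e.stream] < len(s) { // Refill
			heap.Push(h, entry{p: s[pos[e.stream]], stream: e.stream})
			pos[e.stream]++
		}
		p := e.p
		old, dup := seen[p.ID]
		if dup && old >= p.Revision {
			continue // Scenario 2: obsolete revision, discard
		}
		seen[p.ID] = p.Revision // Scenario 1 or 3
		if dup {                // Scenario 3: drop the superseded entry if buffered
			for i := range buf {
				if buf[i].ID == p.ID {
					buf = append(buf[:i], buf[i+1:]...)
					break
				}
			}
		}
		if !p.Deleted { // assumption: deleted values are never buffered
			buf = append(buf, p)
		}
		if dup { // re-sort by tag, as the review comment describes
			sort.SliceStable(buf, func(i, j int) bool { return buf[i].SortTag < buf[j].SortTag })
		}
	}
	if len(buf) > k {
		buf = buf[:k]
	}
	return buf
}

func main() {
	shardA := []Property{{ID: "a", SortTag: 1, Revision: 1}, {ID: "b", SortTag: 5, Revision: 2}}
	shardB := []Property{{ID: "c", SortTag: 2, Revision: 1}, {ID: "a", SortTag: 9, Revision: 2}}
	for _, p := range mergeTopK([][]Property{shardA, shardB}, 2) {
		fmt.Println(p.ID, p.SortTag, p.Revision)
	}
}
```

In this example the stale revision of `a` (tag 1) enters the buffer first and is later replaced by revision 2 (tag 9), so the top 2 become `c` and `b`. Emitting results as soon as K candidates are buffered would have returned the stale `a`, which is why the sketch drains all streams before truncating.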
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.