lxy-9602 commented on PR #6807:
URL: https://github.com/apache/paimon/pull/6807#issuecomment-3673382646
> @jerry-024 Vector search and future Full-text search should inherit from
> TopN, furthermore, I think the results of the predicate index should be
> integrated into the TopN like this:
>
> RangeBitmapFileIndex.java
>
> ```java
> //...
> public FileIndexResult visitTopN(TopN topN) {
>     FileIndexResult result = topN.getPredicateIndexResult();
>     RoaringBitmap32 foundSet =
>             result instanceof BitmapIndexResult ? ((BitmapIndexResult) result).get() : null;
>
>     int limit = topN.limit();
>     List<SortValue> orders = topN.orders();
>     SortValue sort = orders.get(0);
>     SortValue.NullOrdering nullOrdering = sort.nullOrdering();
>     boolean strict = orders.size() == 1;
>     if (ASCENDING.equals(sort.direction())) {
>         return new BitmapIndexResult(
>                 () -> bitmap.bottomK(limit, nullOrdering, foundSet, strict));
>     } else {
>         return new BitmapIndexResult(
>                 () -> bitmap.topK(limit, nullOrdering, foundSet, strict));
>     }
> }
> ```
>
> FileIndexReader.java
>
> ```java
> //...
> public FileIndexResult visitTopN(TopN topN) {
>     return REMAIN;
> }
> public FileIndexResult visitVectorSearch(VectorSearch search) {
>     return visitTopN(search);
> }
> public FileIndexResult visitFullTextSearch(FullTextSearch search) {
>     return visitTopN(search);
> }
> ```
>
> cc @Tan-JiaLiang @lxy-9602
The TopN interface appears to accept a pre-filtered bitmap plus a limit N, which
closely resembles the current vector search semantics. Could these two be
unified under a generic TopK abstraction?
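
To make the idea concrete, here is a rough, self-contained sketch of what such a shared TopK primitive might look like. Everything in it is hypothetical and not taken from the Paimon codebase: `TopKSketch` and its `topK` method are made-up names, `java.util.BitSet` stands in for `RoaringBitmap32`, and the per-row scoring function is an assumption about how a sort column and a vector distance could share one entry point. Ties are broken arbitrarily, and null ordering and multi-column sorts (the `strict` flag above) are not handled.

```java
import java.util.BitSet;
import java.util.PriorityQueue;
import java.util.function.IntToDoubleFunction;

/** Hypothetical sketch only; BitSet stands in for RoaringBitmap32. */
final class TopKSketch {

    /**
     * Returns the row ids of the k highest-scoring rows among the candidates.
     * A null candidate set means "no pre-filter": every row in [0, rowCount) is eligible.
     */
    static BitSet topK(BitSet candidates, int rowCount, int k, IntToDoubleFunction score) {
        BitSet result = new BitSet(rowCount);
        if (k <= 0) {
            return result;
        }
        // Min-heap on score: the head is always the weakest row kept so far.
        PriorityQueue<Integer> heap = new PriorityQueue<>(
                (a, b) -> Double.compare(score.applyAsDouble(a), score.applyAsDouble(b)));
        for (int row = 0; row < rowCount; row++) {
            if (candidates != null && !candidates.get(row)) {
                continue; // row was excluded by the predicate index
            }
            heap.add(row);
            if (heap.size() > k) {
                heap.poll(); // evict the current weakest row
            }
        }
        for (int row : heap) {
            result.set(row);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] column = {50, 10, 40, 30, 20};
        float[] distance = {0.9f, 0.1f, 0.5f, 0.3f, 0.7f};

        BitSet preFiltered = new BitSet();
        preFiltered.set(1, 5); // rows 1..4 passed the predicate index

        // ORDER BY column DESC LIMIT 2 over the pre-filtered rows.
        System.out.println(topK(preFiltered, 5, 2, row -> (double) column[row])); // prints {2, 3}

        // Nearest-2 vector search over the same rows: a smaller distance is a higher score.
        System.out.println(topK(preFiltered, 5, 2, row -> -distance[row])); // prints {1, 3}
    }
}
```

In this shape, an ascending TopN and a nearest-neighbour search differ only in the scoring function they pass, which is the overlap the question above is pointing at.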
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]