hanahmily commented on code in PR #898:
URL:
https://github.com/apache/skywalking-banyandb/pull/898#discussion_r2625272253
##########
pkg/filter/dictionary_filter.go:
##########
@@ -18,61 +18,74 @@
package filter
import (
- "github.com/apache/skywalking-banyandb/pkg/convert"
+ "bytes"
+
+ "github.com/apache/skywalking-banyandb/pkg/encoding"
pbv1 "github.com/apache/skywalking-banyandb/pkg/pb/v1"
)
// DictionaryFilter is a filter implementation backed by a dictionary.
-// It uses a map-based lookup for O(1) performance instead of O(n) linear search.
+// For non-array types: uses linear iteration through values.
+// For array types: uses iterative approach, extracting and sorting on-the-fly.
type DictionaryFilter struct {
- valueSet map[string]struct{}
+ // Original serialized values
values [][]byte
valueType pbv1.ValueType
}
-// NewDictionaryFilter creates a new dictionary filter with the given values.
-func NewDictionaryFilter(values [][]byte) *DictionaryFilter {
- df := &DictionaryFilter{
- values: values,
+// MightContain checks if an item is in the dictionary.
+// For non-array types: linear iteration through values.
+// For array types: checks if the single item exists as an element in any stored array.
+func (df *DictionaryFilter) MightContain(item []byte) bool {
+ if df.valueType == pbv1.ValueTypeStrArr || df.valueType == pbv1.ValueTypeInt64Arr {
+ return false
+ }
+
+ for _, v := range df.values {
+ if bytes.Equal(v, item) {
+ return true
+ }
}
- df.buildValueSet()
- return df
+ return false
Review Comment:
From the benchmark, the map is much slower.
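
For anyone who wants to reproduce this kind of comparison locally, here is a minimal sketch, not the project's actual benchmark. It assumes the map's build cost is paid on every filter construction, and `linearContains`, `buildSet`, `setContains`, and `sampleValues` are hypothetical helpers with made-up data; the dictionary size of 16 is also an assumption. Which approach wins will depend on dictionary size and on how often the set is rebuilt.

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// sink keeps the compiler from discarding the lookup results.
var sink bool

// linearContains mirrors the loop in the diff: compare item against every value.
func linearContains(values [][]byte, item []byte) bool {
	for _, v := range values {
		if bytes.Equal(v, item) {
			return true
		}
	}
	return false
}

// buildSet is a hypothetical stand-in for the removed buildValueSet.
// Each string(v) conversion allocates, which is part of the map's cost.
func buildSet(values [][]byte) map[string]struct{} {
	set := make(map[string]struct{}, len(values))
	for _, v := range values {
		set[string(v)] = struct{}{}
	}
	return set
}

// setContains looks the item up in a previously built set.
func setContains(set map[string]struct{}, item []byte) bool {
	_, ok := set[string(item)]
	return ok
}

// sampleValues generates n distinct values (made-up data, not real dictionary content).
func sampleValues(n int) [][]byte {
	values := make([][]byte, n)
	for i := range values {
		values[i] = []byte(fmt.Sprintf("value-%04d", i))
	}
	return values
}

func main() {
	values := sampleValues(16)
	item := values[len(values)-1] // last element: worst case for the linear scan

	linear := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = linearContains(values, item)
		}
	})
	mapped := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			// Build cost is included on purpose (assumption stated above).
			sink = setContains(buildSet(values), item)
		}
	})
	fmt.Println("linear scan:", linear)
	fmt.Println("map + build:", mapped)
}
```

Moving `buildSet` outside the timed loop measures lookup cost alone, which is the other half of the trade-off.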