eye-gu commented on issue #13811: URL: https://github.com/apache/skywalking/issues/13811#issuecomment-4231841193
> > 1. Shard tag and entity tag can differ, scattering the same entity across nodes.
> >
> > Entity and ShardingKey are independent fields in Measure. When ShardingKey is set, it overrides the shard routing from Entity. For example, with entity.tag_names=["service_id"] and sharding_key.tag_names=["instance_id"], data for the same service_id lands on different shards/nodes under different instance_id values. Each node only sees a partial view of that entity.
>
> No, ShardingKey and Entity are not independent. ShardingKey aims to enhance topn streaming performance and must adhere to the rule that the same entity always maps to the same node. Refer to the example I mentioned at [#12526](https://github.com/apache/skywalking/issues/12526). The OAP server follows the rule to set up the ShardingKey.
>
> Your insight inspired me to add a validation step to enforce this implicit rule. If the end user sets them as your example, it will cause an unexpected result.
>
> > 2. Even on a single node, agg=UNSPECIFIED still truncates incorrectly.
> >
> > The coordinator sends agg=AGGREGATION_FUNCTION_UNSPECIFIED to data nodes, which prevents proper aggregation. For a COUNT TopN with TopN=2, a node holding entity-A(5 points), entity-B(3 points), entity-C(1 point) cannot compute COUNT(entity-A)=5. It simply truncates raw results by the TopN limit, returning incorrect partial data.
>
> ## TopN Query Distribution and Sharding Logic
> In BanyanDB, the current TopN query implementation pushes the aggregation functions directly to the data nodes rather than pruning them.
>
> ### 1. Ad-hoc TopN Queries
> During distributed analysis, the system determines whether to push down the logic based on the presence of an aggregate function:
>
> ```go
> // DistributedAnalyze converts logical expressions into an executable
> // operation tree represented by a Plan.
> func DistributedAnalyze(criteria *measurev1.QueryRequest, ss []logical.Schema) (logical.Plan, error) {
>     // ...
>     pushDownAgg := criteria.GetAgg() != nil
>     plan := newUnresolvedDistributed(criteria, pushDownAgg)
>     // ...
> }
> ```
>
> If `criteria.GetAgg()` is not nil, the aggregation function is pushed down to the data nodes for execution.
>
> ### 2. Pre-calculated TopN Streaming
> If you are referring to pre-calculated TopN streaming rather than ad-hoc queries, the behavior relies on the `ShardingKey`. To maintain high performance, BanyanDB ensures that all data for a specific entity resides on the same node.
>
> #### Comparison: Sharding Scenarios
> Suppose we want to calculate Top 2 by Count for the entity set `Service + Instance`.
>
> | Scenario | Configuration | Data Distribution & Merging |
> | --- | --- | --- |
> | A | No ShardingKey | Node A returns ServiceA(Inst1:5, Inst3:3). Node B returns ServiceA(Inst2:6, Inst4:1). The Liaison node must merge these results to output: ServiceA(Inst2:6, Inst1:5). |
> | B | ShardingKey = Service | Node A contains all data for ServiceA and returns ServiceA(Inst2:6, Inst1:5) directly. Node B contains no data for ServiceA. |
>
> ### Design Principle
> A core design principle of BanyanDB is to avoid distributing the same aggregation entity across different data nodes. By ensuring an entity's data is localized to a single node via the `ShardingKey`, we eliminate unnecessary network overhead and coordinator-side merging, significantly improving performance.

1. Sorry for the noise! I wasn’t aware of this shard key limitation for measures, so this issue indeed doesn’t exist.
2. For testing the distributed TopN query, my entry point was banyand/dquery/topn.go. However, at this entry point, there was no pushdown.
   https://github.com/apache/skywalking-banyandb/blob/8105dfe1bd9787d8acdb5e9d9b780d85eb4db9a7/banyand/dquery/topn.go#L106-L108
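
For reference, the Scenario A merge from the quoted comparison (no ShardingKey, so the Liaison node has to combine per-node partial results) can be sketched roughly as below. This is only an illustration under stated assumptions, not BanyanDB code: `Item` and `mergeTopN` are hypothetical names, and the sketch assumes each instance is reported by exactly one node, so no counts need to be summed across nodes.

```go
package main

import (
	"fmt"
	"sort"
)

// Item is a hypothetical (instance, count) pair; it is not a BanyanDB type.
type Item struct {
	Name  string
	Count int64
}

// mergeTopN concatenates the partial lists returned by each node, sorts them
// by Count in descending order (ties broken by Name), and keeps the first n.
// Assumption: every instance appears in the result of exactly one node.
func mergeTopN(n int, partials ...[]Item) []Item {
	var all []Item
	for _, p := range partials {
		all = append(all, p...)
	}
	sort.Slice(all, func(i, j int) bool {
		if all[i].Count != all[j].Count {
			return all[i].Count > all[j].Count
		}
		return all[i].Name < all[j].Name
	})
	if len(all) > n {
		all = all[:n]
	}
	return all
}

func main() {
	// Per-node results for ServiceA, taken from Scenario A in the quoted table.
	nodeA := []Item{{"Inst1", 5}, {"Inst3", 3}}
	nodeB := []Item{{"Inst2", 6}, {"Inst4", 1}}
	fmt.Println(mergeTopN(2, nodeA, nodeB)) // prints [{Inst2 6} {Inst1 5}]
}
```

In Scenario B this merge step would have nothing to do, since a single node already holds all data for ServiceA and returns the final Top 2 itself.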
