hanahmily commented on issue #13811:
URL: https://github.com/apache/skywalking/issues/13811#issuecomment-4231777757

   
   > 1. Shard tag and entity tag can differ, scattering the same entity across 
nodes.
   > 
   > Entity and ShardingKey are independent fields in Measure. When ShardingKey 
is set, it overrides the shard routing from Entity. For example, with 
entity.tag_names=["service_id"] and sharding_key.tag_names=["instance_id"], 
data for the same service_id lands on different shards/nodes under different 
instance_id values. Each node only sees a partial view of that entity.
   > 
   
   No, ShardingKey and Entity are not independent. ShardingKey exists to
   improve TopN streaming performance and must obey the rule that the same
   entity always maps to the same node. See the example I mentioned at
   https://github.com/apache/skywalking/issues/12526. The OAP server follows
   this rule when it sets up the ShardingKey.
   
   Your insight inspired me to add a validation step to enforce this implicit
   rule. If an end user configures them as in your example, the result will be
   unexpected.
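
A minimal sketch of what such a validation could look like, assuming the rule is expressed as "every ShardingKey tag must also be an entity tag", which is one sufficient way to guarantee that two points of the same entity never route to different shards. The function name `validateShardingKey` and the plain string-slice inputs are hypothetical and not BanyanDB's actual API.

```go
package main

import "fmt"

// validateShardingKey is a hypothetical check, not BanyanDB code.
// Assumption: a ShardingKey is safe when each of its tags is also an
// entity tag, so equal entity values imply equal sharding-key values.
func validateShardingKey(entityTags, shardingTags []string) error {
	if len(shardingTags) == 0 {
		return nil // no ShardingKey set: routing falls back to the entity
	}
	entity := make(map[string]struct{}, len(entityTags))
	for _, t := range entityTags {
		entity[t] = struct{}{}
	}
	for _, t := range shardingTags {
		if _, ok := entity[t]; !ok {
			return fmt.Errorf("sharding key tag %q is not part of entity %v", t, entityTags)
		}
	}
	return nil
}

func main() {
	// The configuration from the quoted example is rejected.
	fmt.Println(validateShardingKey([]string{"service_id"}, []string{"instance_id"}))
	// Entity = service + instance, ShardingKey = service is accepted.
	fmt.Println(validateShardingKey([]string{"service_id", "instance_id"}, []string{"service_id"}))
}
```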
   
   
   > 2. Even on a single node, agg=UNSPECIFIED still truncates incorrectly.
   > 
   > The coordinator sends agg=AGGREGATION_FUNCTION_UNSPECIFIED to data nodes, 
which prevents proper aggregation. For a COUNT TopN with TopN=2, a node holding 
entity-A(5 points), entity-B(3 points), entity-C(1 point) cannot compute 
COUNT(entity-A)=5. It simply truncates raw results by the TopN limit, returning 
incorrect partial data.
   
   ## TopN Query Distribution and Sharding Logic
   
   In BanyanDB, the current **TopN query** implementation pushes the
   aggregation functions directly to the data nodes rather than pruning them.
   
   ### 1. Ad-hoc TopN Queries
   
   During distributed analysis, the system determines whether to push down the
   logic based on the presence of an aggregate function:
   
   ```go
   // DistributedAnalyze converts logical expressions into an executable
   // operation tree represented by a Plan.
   func DistributedAnalyze(criteria *measurev1.QueryRequest, ss []logical.Schema) (logical.Plan, error) {
       // ...
       pushDownAgg := criteria.GetAgg() != nil
       plan := newUnresolvedDistributed(criteria, pushDownAgg)
       // ...
   }
   ```
   
   If `criteria.GetAgg()` is not nil, the aggregation function is pushed down
   to the data nodes for execution.
   
   ### 2. Pre-calculated TopN Streaming
   
   If you are referring to **pre-calculated TopN streaming** rather than ad-hoc
   queries, the behavior relies on the `ShardingKey`. To maintain high
   performance, BanyanDB ensures that all data for a specific entity resides on
   the same node.
   
   #### Comparison: Sharding Scenarios
   
   Suppose we want to calculate **Top 2** by **Count** for the entity set
   `Service + Instance`.
   
   Scenario | Configuration | Data Distribution & Merging
   -- | -- | --
   A | No ShardingKey | Node A returns ServiceA(Inst1:5, Inst3:3). Node B returns ServiceA(Inst2:6, Inst4:1). The Liaison node must merge these results to output: ServiceA(Inst2:6, Inst1:5).
   B | ShardingKey = Service | Node A contains all data for ServiceA and returns ServiceA(Inst2:6, Inst1:5) directly. Node B contains no data for ServiceA.
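
The merge step in Scenario A can be sketched as below, using the counts from the table. The `item` type and the `mergeTopN` helper are hypothetical names for illustration only, not BanyanDB code. The merge is valid here only because each `Service + Instance` entity lives on exactly one node, so every per-node count is already complete.

```go
package main

import (
	"fmt"
	"sort"
)

// item is a hypothetical (instance, count) pair returned by a data node.
type item struct {
	Instance string
	Count    int
}

// mergeTopN concatenates the per-node partial lists and keeps the n
// entries with the largest counts. It assumes no instance appears in
// more than one partial list.
func mergeTopN(n int, partials ...[]item) []item {
	var all []item
	for _, p := range partials {
		all = append(all, p...)
	}
	sort.SliceStable(all, func(i, j int) bool { return all[i].Count > all[j].Count })
	if len(all) > n {
		all = all[:n]
	}
	return all
}

func main() {
	nodeA := []item{{"Inst1", 5}, {"Inst3", 3}}
	nodeB := []item{{"Inst2", 6}, {"Inst4", 1}}

	// Scenario A: the liaison merges two partial lists.
	fmt.Println(mergeTopN(2, nodeA, nodeB)) // [{Inst2 6} {Inst1 5}]

	// Scenario B: one node holds everything, so its list is already final.
	single := []item{{"Inst2", 6}, {"Inst1", 5}}
	fmt.Println(mergeTopN(2, single)) // [{Inst2 6} {Inst1 5}]
}
```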
   
   ### Design Principle
   
   A core design principle of BanyanDB is to **avoid distributing the same
   aggregation entity across different data nodes.** By ensuring an entity's
   data is localized to a single node via the `ShardingKey`, we eliminate
   unnecessary network overhead and coordinator-side merging, significantly
   improving performance.
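
As a rough illustration of this principle, the sketch below routes a write by hashing only the sharding-key value, so every instance of a service lands on the same shard. The `shardFor` function, the FNV-1a hash, and the modulo placement are assumptions made for this example; BanyanDB's real routing may differ.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor is a hypothetical router: it hashes the sharding-key value
// and maps it onto one of shardNum shards. Not BanyanDB's actual code.
func shardFor(shardingKey string, shardNum uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(shardingKey)) // hash.Hash.Write never returns an error
	return h.Sum32() % shardNum
}

func main() {
	for _, inst := range []string{"Inst1", "Inst2", "Inst3", "Inst4"} {
		// Routing ignores the instance, so all rows share one shard.
		fmt.Printf("ServiceA/%s -> shard %d\n", inst, shardFor("ServiceA", 4))
	}
}
```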
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
