[ 
https://issues.apache.org/jira/browse/IGNITE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-12935:
---------------------------------------
    Description: 
# Mention in the log only partitions for which no node is suitable as a 
historical supplier
 For these partitions, print the minimal counter (starting from which we 
should perform historical rebalancing) with its node, and the maximum reserved 
counter (starting from which the cluster can perform historical rebalancing) 
with its node.
 This will let us know:
 ## Whether history was reserved at all
 ## How much reserved history we lack to perform a historical rebalancing
 ## I expect the resulting output to look like this:
{noformat}
 Historical rebalancing wasn't scheduled for some partitions:
 History wasn't reserved for: [list of partitions and groups]
 History was reserved, but minimum present counter is less than maximum 
reserved: [[grp=GRP, part=ID, minCntr=cntr, minNodeId=ID, maxReserved=cntr, 
maxReservedNodeId=ID], ...]{noformat}

 ## We can also aggregate the previous message by (minNodeId) to easily find 
the exact node (or nodes) that caused the full rebalance.
 # Log the results of {{reserveHistoryForExchange()}}. They can be compactly 
represented as mappings: {{(grpId -> checkpoint (id, timestamp))}}. For every 
group, also log a message explaining why the previous checkpoint wasn't 
successfully reserved.
 There can be three reasons:
 ## The previous checkpoint simply isn't present in the history (the oldest 
one is reserved)
 ## WAL reservation failure (the call below returned false)

{code:java}
chpEntry = entry(cpTs);
// If checkpoint WAL history can't be reserved, stop searching.
boolean reserved = cctx.wal().reserve(chpEntry.checkpointMark());
if (!reserved)
  break;
{code}
 ## Checkpoint was marked as inapplicable for historical rebalancing
{code:java}
for (Integer grpId : new HashSet<>(groupsAndPartitions.keySet()))
  if (!isCheckpointApplicableForGroup(grpId, chpEntry))
    groupsAndPartitions.remove(grpId);
{code}
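
As a rough illustration of the proposal above (not the actual Ignite implementation), the sketch below formats the suggested message and groups the affected partitions by minNodeId. All names in it ({{HistRebalanceLogSketch}}, {{Gap}}, {{Reason}}, {{message}}, {{byMinNode}}, {{reservationFailure}}) are hypothetical, and the sample values in {{main}} are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: none of these names come from the Ignite code base. */
class HistRebalanceLogSketch {
    /** Hypothetical labels for the three reservation failure reasons listed above. */
    enum Reason { NOT_IN_HISTORY, WAL_RESERVATION_FAILED, NOT_APPLICABLE_FOR_GROUP }

    /** A partition whose reserved history does not reach back far enough. */
    static final class Gap {
        final String grp;
        final int part;
        final long minCntr;
        final String minNodeId;
        final long maxReserved;
        final String maxReservedNodeId;

        Gap(String grp, int part, long minCntr, String minNodeId,
            long maxReserved, String maxReservedNodeId) {
            this.grp = grp;
            this.part = part;
            this.minCntr = minCntr;
            this.minNodeId = minNodeId;
            this.maxReserved = maxReserved;
            this.maxReservedNodeId = maxReservedNodeId;
        }

        @Override public String toString() {
            return "[grp=" + grp + ", part=" + part + ", minCntr=" + minCntr +
                ", minNodeId=" + minNodeId + ", maxReserved=" + maxReserved +
                ", maxReservedNodeId=" + maxReservedNodeId + ']';
        }
    }

    /** Builds the message in the shape proposed in this ticket. */
    static String message(List<String> notReserved, List<Gap> gaps) {
        return "Historical rebalancing wasn't scheduled for some partitions:\n" +
            " History wasn't reserved for: " + notReserved + "\n" +
            " History was reserved, but minimum present counter is less than" +
            " maximum reserved: " + gaps;
    }

    /** Groups "grp:part" labels by minNodeId to show which nodes forced a full rebalance. */
    static Map<String, List<String>> byMinNode(List<Gap> gaps) {
        Map<String, List<String>> res = new TreeMap<>();

        for (Gap g : gaps)
            res.computeIfAbsent(g.minNodeId, k -> new ArrayList<>()).add(g.grp + ':' + g.part);

        return res;
    }

    /** One per-group line explaining why the previous checkpoint was not reserved. */
    static String reservationFailure(int grpId, Reason reason) {
        return "grpId=" + grpId + ", reason=" + reason;
    }

    public static void main(String[] args) {
        List<Gap> gaps = Arrays.asList(
            new Gap("cacheA", 3, 100L, "node-1", 250L, "node-2"),
            new Gap("cacheA", 7, 40L, "node-1", 90L, "node-3"));

        System.out.println(message(Collections.singletonList("cacheB:0"), gaps));
        System.out.println(byMinNode(gaps));
        System.out.println(reservationFailure(42, Reason.WAL_RESERVATION_FAILED));
    }
}
```

Grouping by minNodeId keeps the per-partition detail in the main message while giving a short per-node summary. Whether the real change should build it this way is left open.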


> Disadvantages in log of historical rebalance
> --------------------------------------------
>
>                 Key: IGNITE-12935
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12935
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Vladislav Pyatkov
>            Priority: Major
>
> # Mention in the log only partitions for which no node is suitable as a 
> historical supplier
>  For these partitions, print the minimal counter (starting from which we 
> should perform historical rebalancing) with its node, and the maximum 
> reserved counter (starting from which the cluster can perform historical 
> rebalancing) with its node.
>  This will let us know:
>  ## Whether history was reserved at all
>  ## How much reserved history we lack to perform a historical rebalancing
>  ## I expect the resulting output to look like this:
> {noformat}
>  Historical rebalancing wasn't scheduled for some partitions:
>  History wasn't reserved for: [list of partitions and groups]
>  History was reserved, but minimum present counter is less than maximum 
> reserved: [[grp=GRP, part=ID, minCntr=cntr, minNodeId=ID, maxReserved=cntr, 
> maxReservedNodeId=ID], ...]{noformat}
>  ## We can also aggregate the previous message by (minNodeId) to easily find 
> the exact node (or nodes) that caused the full rebalance.
>  # Log the results of {{reserveHistoryForExchange()}}. They can be compactly 
> represented as mappings: {{(grpId -> checkpoint (id, timestamp))}}. For every 
> group, also log a message explaining why the previous checkpoint wasn't 
> successfully reserved.
>  There can be three reasons:
>  ## The previous checkpoint simply isn't present in the history (the oldest 
> one is reserved)
>  ## WAL reservation failure (the call below returned false)
> {code:java}
> chpEntry = entry(cpTs);
> // If checkpoint WAL history can't be reserved, stop searching.
> boolean reserved = cctx.wal().reserve(chpEntry.checkpointMark());
> if (!reserved)
>   break;
> {code}
>  ## Checkpoint was marked as inapplicable for historical rebalancing
> {code:java}
> for (Integer grpId : new HashSet<>(groupsAndPartitions.keySet()))
>   if (!isCheckpointApplicableForGroup(grpId, chpEntry))
>     groupsAndPartitions.remove(grpId);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
