alievmirza commented on code in PR #881:
URL: https://github.com/apache/ignite-3/pull/881#discussion_r904799797
##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/core/NodeImpl.java:
##########
@@ -406,8 +406,10 @@ void onCaughtUp(final long version, final PeerId peer,
final boolean success) {
}
Requires.requireTrue(this.stage == Stage.STAGE_CATCHING_UP, "Stage
is not in STAGE_CATCHING_UP");
if (success) {
+ LOG.info("Catch up for peer={} was finished", peer);
this.addingPeers.remove(peer);
if (this.addingPeers.isEmpty()) {
+ LOG.info("Catch up phase to change peers from={} to={} was
successfully finished", oldPeers, newPeers);
Review Comment:
I would log this inside
`org.apache.ignite.raft.jraft.core.NodeImpl.ConfigurationCtx#nextStage`
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/raft/RebalanceRaftGroupEventsListener.java:
##########
@@ -140,6 +140,10 @@ public void onLeaderElected(long term) {
if (!pendingEntry.empty()) {
List<ClusterNode> pendingNodes = (List<ClusterNode>)
ByteUtils.fromBytes(pendingEntry.value());
+ LOG.info("New leader elected for the raft group={} "
Review Comment:
```suggestion
LOG.info("New leader was elected for the raft
group={} "
```
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/raft/RebalanceRaftGroupEventsListener.java:
##########
@@ -140,6 +140,10 @@ public void onLeaderElected(long term) {
if (!pendingEntry.empty()) {
List<ClusterNode> pendingNodes = (List<ClusterNode>)
ByteUtils.fromBytes(pendingEntry.value());
+ LOG.info("New leader elected for the raft group={} "
+ + "of partition={}, table={} and
pending reconfiguration to peers={} discovered",
+ partId, partNum,
tblConfiguration.name().value(), pendingNodes);
+
Review Comment:
Lets add logging to
`org.apache.ignite.internal.table.distributed.TableManager#movePartition` so we
could track that we have started a process of moving a partition.
##########
modules/table/src/main/java/org/apache/ignite/internal/utils/RebalanceUtil.java:
##########
@@ -42,9 +42,33 @@
*/
public class RebalanceUtil {
+ /** Logger. */
+ private static IgniteLogger LOG =
IgniteLogger.forClass(RebalanceUtil.class);
Review Comment:
Maybe final
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/raft/RebalanceRaftGroupEventsListener.java:
##########
@@ -140,6 +140,10 @@ public void onLeaderElected(long term) {
if (!pendingEntry.empty()) {
List<ClusterNode> pendingNodes = (List<ClusterNode>)
ByteUtils.fromBytes(pendingEntry.value());
+ LOG.info("New leader elected for the raft group={} "
Review Comment:
```suggestion
LOG.info("New leader was elected for the raft
group={} "
```
##########
modules/table/src/main/java/org/apache/ignite/internal/utils/RebalanceUtil.java:
##########
@@ -42,9 +42,33 @@
*/
public class RebalanceUtil {
+ /** Logger. */
+ private static IgniteLogger LOG =
IgniteLogger.forClass(RebalanceUtil.class);
+
+ /** Return code of metastore multi-invoke which identify,
Review Comment:
`identifies` here and below
##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/rpc/impl/RaftGroupServiceImpl.java:
##########
@@ -567,6 +570,9 @@ private <R> void sendWithRetry(Peer peer, Object req, long
stopTime, Completable
if (err != null) {
if (recoverable(err)) {
executor.schedule(() -> {
+ LOG.warn("Recoverable error during the request
type={} occurred: ",
Review Comment:
Lets add here that we retrying intent on a randomly selected node
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/raft/RebalanceRaftGroupEventsListener.java:
##########
@@ -269,16 +273,30 @@ private void
doOnNewPeersConfigurationApplied(List<PeerId> peers) {
remove(plannedPartAssignmentsKey(partId)))
.yield(true),
ops().yield(false))).get().getAsBoolean()) {
+ LOG.info("Finished rebalance of partition={}, table={} to
peers={}, "
Review Comment:
As far as we try again, we cannot say that rebalance was finished so let's
remove this ''Finished rebalance of partition={},..." line and add it only when
CAS was successful.
Please update the same logs further in the code
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/raft/RebalanceRaftGroupEventsListener.java:
##########
@@ -140,6 +140,10 @@ public void onLeaderElected(long term) {
if (!pendingEntry.empty()) {
List<ClusterNode> pendingNodes = (List<ClusterNode>)
ByteUtils.fromBytes(pendingEntry.value());
+ LOG.info("New leader elected for the raft group={} "
+ + "of partition={}, table={} and
pending reconfiguration to peers={} discovered",
Review Comment:
```suggestion
+ "of partition={}, table={} and
pending reconfiguration to peers={} was discovered",
```
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/TableManager.java:
##########
@@ -366,6 +366,9 @@ private CompletableFuture<?>
onUpdateReplicas(ConfigurationNotificationEvent<Int
if (replicasCtx.oldValue() != null && replicasCtx.oldValue() > 0) {
TableConfiguration tblCfg =
replicasCtx.config(TableConfiguration.class);
+ LOG.info("Received update for replicas number for table={}
from replicas={} to replicas={}",
Review Comment:
If we run sone rebalance test, we can see in the logs the following:
```
2022-06-23 13:18:27:798 +0300 [INFO][Thread-242][TableManager] Received
update for replicas number for table=PUBLIC.TBL1 from replicas=1 to replicas=2
2022-06-23 13:18:27:799 +0300 [INFO][Thread-244][TableManager] Received
update for replicas number for table=PUBLIC.TBL1 from replicas=1 to replicas=2
2022-06-23 13:18:27:802 +0300 [INFO][Thread-241][TableManager] Received
update for replicas number for table=PUBLIC.TBL1 from replicas=1 to replicas=2
```
Lines looks the same and could be more informative, if we add info about the
node where such update has happened
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/raft/RebalanceRaftGroupEventsListener.java:
##########
@@ -269,16 +273,30 @@ private void
doOnNewPeersConfigurationApplied(List<PeerId> peers) {
remove(plannedPartAssignmentsKey(partId)))
.yield(true),
ops().yield(false))).get().getAsBoolean()) {
+ LOG.info("Finished rebalance of partition={}, table={} to
peers={}, "
+ + "but planned key={} was changed during
results CAS write to metastore, will try again",
Review Comment:
"during results CAS"
Could you please rephrase this, seems that idea is not clear
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/TableManager.java:
##########
@@ -366,6 +366,9 @@ private CompletableFuture<?>
onUpdateReplicas(ConfigurationNotificationEvent<Int
if (replicasCtx.oldValue() != null && replicasCtx.oldValue() > 0) {
TableConfiguration tblCfg =
replicasCtx.config(TableConfiguration.class);
+ LOG.info("Received update for replicas number for table={}
from replicas={} to replicas={}",
Review Comment:
Maybe we should think about a general solution, where all lines of logs will
contain info about the node which logged that line
##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/rpc/impl/RaftGroupServiceImpl.java:
##########
@@ -372,6 +372,9 @@ public static CompletableFuture<RaftGroupService> start(
CompletableFuture<ChangePeersAsyncResponse> fut = new
CompletableFuture<>();
+ LOG.info("Sending request to changePeers of group={} to peers={} with
leader term={}",
Review Comment:
changePeersAsync
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]