[
https://issues.apache.org/jira/browse/IGNITE-10995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754197#comment-16754197
]
Alexey Goncharuk commented on IGNITE-10995:
-------------------------------------------
[~mstepachev], a few comments:
1) Please change the error message passed to the failure handler so that it is
the same as the one being logged ("Failed to continue supplying [" +
supplyRoutineInfo(topicId, nodeId, demandMsg) + "]").
2) There is no need to override the {{getIgniteInstanceName}} method; instead,
call it in the test to determine the node you want to work with.
3) Is there any reason why you used the {{FSYNC}} WAL mode for this test? This
mode is quite slow, and unless it is specifically needed, we stick with the
default one.
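
To illustrate point 1, here is a minimal sketch of building the message once and using it for both the log entry and the failure handler. It does not use the real Ignite classes: {{SupplyFailureSketch}}, {{onSupplyFailure}}, the two lists, and the body of {{supplyRoutineInfo}} are hypothetical stand-ins, and the exception type is only an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical stand-in for the supplier; not the actual Ignite code. */
final class SupplyFailureSketch {
    /** Stand-in for the logger: collects logged messages. */
    static final List<String> logged = new ArrayList<>();

    /** Stand-in for the failure handler: collects the errors passed to it. */
    static final List<Throwable> failures = new ArrayList<>();

    /** Assumed shape of the routine description; the real format may differ. */
    static String supplyRoutineInfo(int topicId, UUID nodeId, String demandMsg) {
        return "topicId=" + topicId + ", nodeId=" + nodeId + ", demandMsg=" + demandMsg;
    }

    /** Builds the message once so the log and the failure handler cannot diverge. */
    static void onSupplyFailure(int topicId, UUID nodeId, String demandMsg, Throwable cause) {
        String msg = "Failed to continue supplying ["
            + supplyRoutineInfo(topicId, nodeId, demandMsg) + "]";

        // Same text goes to the log...
        logged.add(msg);

        // ...and into the error handed to the failure handler, with the cause kept.
        failures.add(new IllegalStateException(msg, cause));
    }
}
```

Keeping the message in a single local variable makes it easy to match the log line against the failure handler output when investigating a node stop.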
> GridDhtPartitionSupplier::handleDemandMessage suppress errors
> -------------------------------------------------------------
>
> Key: IGNITE-10995
> URL: https://issues.apache.org/jira/browse/IGNITE-10995
> Project: Ignite
> Issue Type: Bug
> Reporter: Dmitry Sherstobitov
> Assignee: Stepachev Maksim
> Priority: Major
> Attachments: Screenshot 2019-01-20 at 23.19.08.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Scenario:
> # Cluster with data
> # Triggered historical rebalance
> In this case, if an OOM occurs on the supplier, the failHandler is not
> triggered and the cluster stays alive with inconsistent data (the target node
> has MOVING partitions, the supplier does nothing).
> Target rebalance node log:
> {code:java}
> [15:00:31,418][WARNING][sys-#86][GridDhtPartitionDemander] Rebalancing from
> node cancelled [grp=cache_group_4, topVer=AffinityTopologyVersion [topVer=17,
> minorTopVer=0], supplier=4cbc66d3-9d2c-4396-8366-2839a8d0cdb6, topic=5]].
> Supplier has failed with error: java.lang.OutOfMemoryError: Java heap
> space{code}
> Supplier stack trace:
> !Screenshot 2019-01-20 at 23.19.08.png!
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)