[
https://issues.apache.org/jira/browse/RATIS-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896854#comment-17896854
]
Ivan Andika commented on RATIS-2186:
------------------------------------
This seems to be the root cause for RATIS-2056.
> Raft log purge preservation might purge log index that does not exist
> ---------------------------------------------------------------------
>
> Key: RATIS-2186
> URL: https://issues.apache.org/jira/browse/RATIS-2186
> Project: Ratis
> Issue Type: Bug
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We encountered the following "Unexpected gap in segments" error when manually
> synchronizing the OM DB on an OM follower that had been stopped for a few hours.
> {code:java}
> 2024-11-07 21:49:32,940 [om4@group-13A745F1EB59-StateMachineUpdater] ERROR
> org.apache.ratis.server.impl.StateMachineUpdater:
> om4@group-13A745F1EB59-StateMachineUpdater caught a Throwable.
> java.lang.IllegalStateException: Unexpected gap in segments:
> binarySearch(88354999707) returns -1, segments=[log-88363996241_88364000257,
> log-88364000258_88364004199, log-88364004200_88364008231,
> log-88364008232_88364012246, log-88364012247_88364016452,
> log-88364016453_88364020483, log-88364020484_88364024600,
> log-88364024601_88364028704, log-88364028705_88364032801,
> log-88364032802_88364036811, log-88364036812_88364040811,
> log-88364040812_88364044806, log-88364044807_88364048845,
> log-88364048846_88364053013, log-88364053014_88364057206,
> log-88364057207_88364061416, log-88364061417_88364065583,
> log-88364065584_88364069652, log-88364069653_88364073908,
> log-88364073909_88364078037, log-88364078038_88364082338,
> log-88364082339_88364086503, log-88364086504_88364090669,
> log-88364090670_88364094827, log-88364094828_88364099047,
> log-88364099048_88364103228, log-88364103229_88364107373,
> log-88364107374_88364111564, log-88364111565_88364115651,
> log-88364115652_88364119684, log-88364119685_88364123867,
> log-88364123868_88364124644, log-88364124645_88364128703,
> log-88364128704_88364132765, log-88364132766_88364136825,
> log-88364136826_88364140811, log-88364140812_88364144887,
> log-88364144888_88364149042, log-88364149043_88364153379,
> log-88364153380_88364157732, log-88364157733_88364161937,
> log-88364161938_88364166039, log-88364166040_88364170087,
> log-88364170088_88364174135, log-88364174136_88364178144,
> log-88364178145_88364182260, log-88364182261_88364186208,
> log-88364186209_88364190136, log-88364190137_88364194445,
> log-88364194446_88364198500, log-88364198501_88364202507,
> log-88364202508_88364206398, log-88364206399_88364210433,
> log-88364210434_88364214441, log-88364214442_88364218538,
> log-88364218539_88364222548, log-88364222549_88364226618,
> log-88364226619_88364230699, log-88364230700_88364234762,
> log-88364234763_88364238784, log-88364238785_88364242687,
> log-88364242688_88364246625, log-88364246626_88364250581,
> log-88364250582_88364254520, log-88364254521_88364258544,
> log-88364258545_88364262687, log-88364262688_88364266687,
> log-88364266688_88364270677, log-88364270678_88364274675,
> log-88364274676_88364278687, log-88364278688_88364282796,
> log-88364282797_88364287134, log-88364287135_88364291229,
> log-88364291230_88364295199, log-88364295200_88364299138,
> log-88364299139_88364303033, log-88364303034_88364307192,
> log-88364307193_88364311099, log-88364311100_88364315135,
> log-88364315136_88364319072, log-88364319073_88364322884,
> log-88364322885_88364326897, log-88364326898_88364330876,
> log-88364330877_88364334809, log-88364334810_88364338728,
> log-88364338729_88364342864, log-88364342865_88364346842,
> log-88364346843_88364350811, log-88364350812_88364354727,
> log-88364354728_88364358758, log-88364358759_88364359500,
> log-88364359501_88364363662, log-88364363663_88364367743,
> log-88364367744_88364371709, log-88364371710_88364375763,
> log-88364375764_88364379715, log-88364379716_88364383734,
> log-88364383735_88364387563, log-88364387564_88364391573,
> log-88364391574_88364395627, log-88364395628_88364399634,
> log-88364399635_88364403770, log-88364403771_88364408068,
> log-88364408069_88364412129, log-88364412130_88364416145,
> log-88364416146_88364420177, log-88364420178_88364424190,
> log-88364424191_88364428162, log-88364428163_88364432284,
> log-88364432285_88364436218, log-88364436219_88364440288,
> log-88364440289_88364444352, log-88364444353_88364448196,
> log-88364448197_88364452189, log-88364452190_88364456120,
> log-88364456121_88364460132, log-88364460133_88364463990,
> log-88364463991_88364468111, log-88364468112_88364472158,
> log-88364472159_88364476323, log-88364476324_88364480303,
> log-88364480304_88364484414, log-88364484415_88364488460,
> log-88364488461_88364492577, log-88364492578_88364496658,
> log-88364496659_88364500681, log-88364500682_88364504681,
> log-88364504682_88364508692, log-88364508693_88364512735,
> log-88364512736_88364516709, log-88364516710_88364520628,
> log-88364520629_88364524444, log-88364524445_88364528459,
> log-88364528460_88364532564, log-88364532565_88364536546,
> log-88364536547_88364540655, log-88364540656_88364544713,
> log-88364544714_88364548738, log-88364548739_88364552734,
> log-88364552735_88364556745, log-88364556746_88364560570,
> log-88364560571_88364564711, log-88364564712_88364568778,
> log-88364568779_88364572855, log-88364572856_88364577025,
> log-88364577026_88364580991, log-88364580992_88364585005,
> log-88364585006_88364589177, log-88364589178_88364593117,
> log-88364593118_88364596544, log-88364596545_88364600628,
> log-88364600629_88364604666, log-88364604667_88364608788,
> log-88364608789_88364612623, log-88364612624_88364616469,
> log-88364616470_88364620418, log-88364620419_88364624447,
> log-88364624448_88364628364, log-88364628365_88364632583,
> log-88364632584_88364636690, log-88364636691_88364640840,
> log-88364640841_88364645154, log-88364645155_88364649391,
> log-88364649392_88364653616, log-88364653617_88364657719,
> log-88364657720_88364662007, log-88364662008_88364666323,
> log-88364666324_88364670449, log-88364670450_88364674849,
> log-88364674850_88364679290, log-88364679291_88364683748,
> log-88364683749_88364688166, log-88364688167_88364692147,
> log-88364692148_88364696480, log-88364696481_88364700948,
> log-88364700949_88364705067, log-88364705068_88364709420,
> log-88364709421_88364713675, log-88364713676_88364718120,
> log-88364718121_88364722375, log-88364722376_88364726870,
> log-88364726871_88364731208, log-88364731209_88364735403,
> log-88364735404_88364739660, log-88364739661_88364744079,
> log-88364744080_88364748313, log-88364748314_88364752767,
> log-88364752768_88364756923, log-88364756924_88364761130,
> log-88364761131_88364765458, log-88364765459_88364769659,
> log-88364769660_88364773864, log-88364773865_88364778029,
> log-88364778030_88364782373, log-88364782374_88364786843,
> log-88364786844_88364791187, log-88364791188_88364795576,
> log-88364795577_88364799757, log-88364799758_88364804091,
> log-88364804092_88364808438, log-88364808439_88364812735,
> log-88364812736_88364817053, log-88364817054_88364821337,
> log-88364821338_88364825482, log-88364825483_88364829678,
> log-88364829679_88364833850, log-88364833851_88364838114,
> log-88364838115_88364842299, log-88364842300_88364846583,
> log-88364846584_88364849925, log-88364849926_88364854127,
> log-88364854128_88364858268, log-88364858269_88364862345,
> log-88364862346_88364866641, log-88364866642_88364870877,
> log-88364870878_88364875147, log-88364875148_88364879433,
> log-88364879434_88364883886, log-88364883887_88364888223,
> log-88364888224_88364892556, log-88364892557_88364896921,
> log-88364896922_88364901295, log-88364901296_88364905640,
> log-88364905641_88364909861, log-88364909862_88364914097,
> log-88364914098_88364918297, log-88364918298_88364922609,
> log-88364922610_88364926902, log-88364926903_88364931383,
> log-88364931384_88364935609, log-88364935610_88364940046,
> log-88364940047_88364944407, log-88364944408_88364948542,
> log-88364948543_88364952764, log-88364952765_88364956959,
> log-88364956960_88364961303, log-88364961304_88364965492,
> log-88364965493_88364969682, log-88364969683_88364973850,
> log-88364973851_88364978007, log-88364978008_88364982280,
> log-88364982281_88364986516, log-88364986517_88364990776,
> log-88364990777_88364995029, log-88364995030_88364999288] {code}
> When synchronizing the OM follower with the OM leader, we cleaned the OM
> ratis and ratis-snapshot directories and used rsync to sync the OM DB (which
> contains the last applied index). Afterwards, we restarted the slow OM follower,
> which then received AppendEntries from the leader instead of
> notifyInstallSnapshot because of the leader's purge preservation configuration.
> However, since the follower did not have some of the earlier log segments,
> the first purge triggered the "Unexpected gap in segments" error, because the
> purge index was earlier than the first Raft log index in the Ratis log directory.
> I suspect that this might also happen in the general case for a new Raft server
> where raft.server.snapshot.auto.trigger.threshold and
> raft.server.log.purge.gap are too small while
> raft.server.log.purge.preservation.log.num is very large, provided
> raft.server.log.purge.upto.snapshot.index is true.
> A possible solution is to skip the purge when the suggested index is lower than
> the first segmented log index, instead of throwing an exception.
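
The proposed guard can be sketched in Java roughly as follows. This is a minimal, self-contained illustration and not the actual Ratis code: `Segment`, `PurgeGuardSketch`, and `purgeUpTo` are hypothetical names, and the segment cache is modeled as a plain sorted `List`. The only point it shows is that a suggested purge index below the first segment's start index becomes a no-op instead of an `IllegalStateException`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a closed log segment covering [startIndex, endIndex]. */
final class Segment {
  final long startIndex;
  final long endIndex;

  Segment(long startIndex, long endIndex) {
    this.startIndex = startIndex;
    this.endIndex = endIndex;
  }
}

/** Sketch of the suggested fix; not the Ratis implementation. */
final class PurgeGuardSketch {
  /**
   * Removes leading segments whose endIndex is at or before suggestedIndex.
   * Assumes segments are sorted by startIndex and contiguous.
   *
   * @return the number of segments purged
   */
  static int purgeUpTo(List<Segment> segments, long suggestedIndex) {
    // Guard: the suggested index is older than anything in the log directory,
    // so there is nothing to purge. Skip instead of throwing.
    if (segments.isEmpty() || suggestedIndex < segments.get(0).startIndex) {
      return 0;
    }
    int purged = 0;
    while (!segments.isEmpty() && segments.get(0).endIndex <= suggestedIndex) {
      segments.remove(0);
      purged++;
    }
    return purged;
  }

  public static void main(String[] args) {
    List<Segment> segments = new ArrayList<>();
    // The first two segments from the log above.
    segments.add(new Segment(88363996241L, 88364000257L));
    segments.add(new Segment(88364000258L, 88364004199L));

    // The purge index from the error message precedes the first segment.
    System.out.println(purgeUpTo(segments, 88354999707L)); // prints 0
    // A purge index at the end of the first segment removes only that segment.
    System.out.println(purgeUpTo(segments, 88364000257L)); // prints 1
  }
}
```

With the index from the error above (88354999707) and a first segment starting at 88363996241, the guard returns immediately and the segment list is left unchanged.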