Re: [Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units => plrhand.c maybe_make_contact()
http://bugs.freeciv.org/Ticket/Display.html?id=40020 > On 27/01/2008, William Allen Simpson wrote: > So, let's consider this a temporary patch until bounce_unit() -- or its > caller -- is fixed. OTOH. I have spent hundreds of hours during last three years hunting bugs that have turned out to be caused by unit dying inside unit_iterate(). Units can die quite unexpectedly, so I'd rather use _safe() in any high level iteration instead of wasting developer time in unnecessary bughunts. - ML ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
Re: [Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units => plrhand.c maybe_make_contact()
http://bugs.freeciv.org/Ticket/Display.html?id=40020 > Jason Short wrote: > make_contact kills the unit because of the broken treaty and bouncing > (strange in itself but whatever). Then since maybe_make_contact doesn't > use a proper iterator it breaks the loop. Attached patch should fix it > for 2.1 and most likely 2.2/trunk. > That looks like it! Of course, it *was* a proper iterator right up until somebody added code to wipe an unrelated unit of another player that happens to be in a stack with a player that canceled a treaty In this case, the Indian player's last settler. Causing the Indian player to die! Good thing this is probably a rare event. So, let's consider this a temporary patch until bounce_unit() -- or its caller -- is fixed. What a terrible fragile logic mess! ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
[Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units => plrhand.c maybe_make_contact()
http://bugs.freeciv.org/Ticket/Display.html?id=40020 > I just ran it under S2_1 as: $ ./ser -f ~/-0200x.sav -p 5557 ... > set timeout 1 > start ... > civserver: player.c:246: player_index: Assertion `pplayer' failed. Aborted (core dumped) Then the same under valgrind: ==5035== Invalid read of size 4 ==5035==at 0x808A7B4: maybe_make_contact (plrhand.c:1206) ==5035==by 0x806D1E9: move_unit (unittools.c:2853) ==5035==by 0x80B11C4: handle_unit_move_request (unithand.c:1201) ==5035==by 0x813D2E8: ai_unit_move (aitools.c:1022) ==5035==by 0x813B69C: ai_unit_execute_path (aitools.c:194) ==5035==by 0x813BDFF: ai_follow_path (aitools.c:385) ==5035==by 0x813C10B: ai_unit_goto_constrained (aitools.c:459) ==5035==by 0x813C9B3: ai_unit_goto (aitools.c:790) ==5035==by 0x813BBA9: ai_gothere (aitools.c:317) ==5035==by 0x8141B85: ai_military_attack (aiunit.c:1755) ==5035==by 0x81423AF: ai_manage_military (aiunit.c:2078) ==5035==by 0x8142B09: ai_manage_unit (aiunit.c:2245) ==5035==by 0x8142F7F: ai_manage_units (aiunit.c:2351) ==5035==by 0x81343BE: ai_do_first_activities (aihand.c:426) ==5035==by 0x8054D25: ai_start_phase (srv_main.c:594) ==5035==by 0x8055124: begin_phase (srv_main.c:726) ==5035==by 0x80571D9: srv_running (srv_main.c:1816) ==5035==by 0x8057B87: srv_main (srv_main.c:2194) ==5035==by 0x804B195: main (civserver.c:258) ==5035== Address 0x4dd2e30 is 8 bytes inside a block of size 12 free'd ==5035==at 0x402365C: free (vg_replace_malloc.c:323) ==5035==by 0x8050488: genlist_unlink (genlist.c:141) ==5035==by 0x80C7B3F: unit_list_unlink (speclist.h:105) ==5035==by 0x80C7AD0: game_remove_unit (game.c:148) ==5035==by 0x8068E13: server_remove_unit (unittools.c:1374) ==5035==by 0x8069072: wipe_unit (unittools.c:1436) ==5035==by 0x8068014: bounce_unit (unittools.c:1044) ==5035==by 0x8068393: resolve_stack_conflicts (unittools.c:1086) ==5035==by 0x8068403: resolve_unit_stacks (unittools.c:1110) ==5035==by 0x8088937: update_players_after_alliance_breakup (plrhand.c:457) ==5035==by 0x8088BB2: handle_diplomacy_cancel_pact (plrhand.c:546) ==5035==by 0x808A5C9: make_contact (plrhand.c:1180) ==5035==by 0x808A7F3: maybe_make_contact (plrhand.c:1207) ==5035==by 0x806D1E9: move_unit (unittools.c:2853) ==5035==by 0x80B11C4: handle_unit_move_request (unithand.c:1201) ==5035==by 0x813D2E8: ai_unit_move (aitools.c:1022) ==5035==by 0x813B69C: ai_unit_execute_path (aitools.c:194) ==5035==by 0x813BDFF: ai_follow_path (aitools.c:385) ==5035==by 0x813C10B: ai_unit_goto_constrained (aitools.c:459) ==5035==by 0x813C9B3: ai_unit_goto (aitools.c:790) ==5035==by 0x813BBA9: ai_gothere (aitools.c:317) ==5035==by 0x8141B85: ai_military_attack (aiunit.c:1755) ==5035==by 0x81423AF: ai_manage_military (aiunit.c:2078) ==5035==by 0x8142B09: ai_manage_unit (aiunit.c:2245) ==5035==by 0x8142F7F: ai_manage_units (aiunit.c:2351) ==5035==by 0x81343BE: ai_do_first_activities (aihand.c:426) ==5035==by 0x8054D25: ai_start_phase (srv_main.c:594) ==5035==by 0x8055124: begin_phase (srv_main.c:726) ==5035==by 0x80571D9: srv_running (srv_main.c:1816) ==5035==by 0x8057B87: srv_main (srv_main.c:2194) ==5035==by 0x804B195: main (civserver.c:258) make_contact kills the unit because of the broken treaty and bouncing (strange in itself but whatever). Then since maybe_make_contact doesn't use a proper iterator it breaks the loop. Attached patch should fix it for 2.1 and most likely 2.2/trunk. -jason Index: server/plrhand.c === --- server/plrhand.c (revision 14329) +++ server/plrhand.c (working copy) @@ -1203,9 +1203,9 @@ if (pcity) { make_contact(pplayer, city_owner(pcity), ptile); } -unit_list_iterate(tile1->units, punit) { +unit_list_iterate_safe(tile1->units, punit) { make_contact(pplayer, unit_owner(punit), ptile); -} unit_list_iterate_end; +} unit_list_iterate_safe_end; } square_iterate_end; } ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
Re: [Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units => plrhand.c maybe_make_contact()
http://bugs.freeciv.org/Ticket/Display.html?id=40020 > After 12 hours of staring at this problem, and debugging both S2_1 and S2_2, I've guessed that there is some unit somewhere that is killed, but not unlinked from its tile. Another unit is meeting it. In 2.1, all goes well until the make_contact() routine tries to print the nation_plural. The nation is NULL in the player. I'd forgotten that nation_of_player() is also used for nation_plural_for_player(). It's probably the Indian unit(s), as that player died this turn. In 2.2, it doesn't even get that far. The player and name are both NULL in the unit. But that's all I've been able to learn After the 2.1.3 release, I'll check in my debugging, and try again. ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
Re: [Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units => plrhand.c maybe_make_contact()
http://bugs.freeciv.org/Ticket/Display.html?id=40020 > For posterity, several more hours of solid debugging, and I've determined the unit itself has lost its name. A dead unit wasn't taken off the tile? temporary logging: 2: maybe_make_contact() unit And a dead city wasn't taken off the tile owner list before begin_phase(). 1: Failed sanitycheck.c:174 (srv_main.c:725) at Grassland (50,28): map_get_player_site(ptile, tile_owner(ptile)) != NULL Still guessing ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
Re: [Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units => plrhand.c maybe_make_contact()
http://bugs.freeciv.org/Ticket/Display.html?id=40020 > Thank you for the helpful savegame! Here's where it currently dies in S2_1: 0: nation_of_player() has NULL nation 0: Detected fatal error in nation.c line 259: 0: bad nation This is a crash trying to display a logging message -- nation_of_player() is only used in freelog() messages. === Here's where it currently dies in S2_2 (has a lot more parameter checking): player.c:293: failed assertion `pplayer' 0 civserver 0x0006a1b4 player_number + 0x18 (player.c:295) 1 civserver 0x00032014 make_contact + 0x34 (plrhand.c:1167) 2 civserver 0x00032568 maybe_make_contact + 0x22c (plrhand.c:1242) 3 civserver 0x0004e0dc move_unit + 0x738 (unittools.c:2837) 4 civserver 0x0005c328 unit_move_handling + 0x57c (unithand.c:1331) 5 civserver 0x000d5dcc ai_unit_move + 0x2a4 (aitools.c:1047) 6 civserver 0x000d40b8 ai_unit_execute_path + 0xb0 (aitools.c:221) 7 civserver 0x000d48f4 ai_follow_path + 0x48 (aitools.c:411) 8 civserver 0x000d4b1c ai_unit_goto_constrained + 0x14c (aitools.c:484) 9 civserver 0x000d52c8 ai_unit_goto + 0x60 (aitools.c:814) 10 civserver 0x000d461c ai_gothere + 0x1e8 (aitools.c:342) 11 civserver 0x000dac20 ai_military_attack + 0x1ec (aiunit.c:1755) 12 civserver 0x000dba78 ai_manage_military + 0x2e0 (aiunit.c:2079) 13 civserver 0x000dc2e4 ai_manage_unit + 0x46c (aiunit.c:2261) 14 civserver 0x000dc6e8 ai_manage_units + 0x17c (aiunit.c:2362) Merely adding some more logging in maybe_make_contact() gave it enough time for the server freelog() messages to be flushed without a crash: 1: Failed sanity check at Grassland (50, 28): map_get_player_site(ptile, tile_owner(ptile)) != NULL (sanitycheck.c:173) And the client Message panel now says: The Indians are no more! But the client Nations (F3) tab still shows Indians === My WAG is there are still unsupported Indian units in the field, or an Indian city hasn't been properly removed, and another player made first contact with them. The nation death isn't being properly propagated (as seen on the Nation screen). This is a serious bug, but will take a *lot* of effort to track. ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
[Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units => plrhand.c maybe_make_contact()
http://bugs.freeciv.org/Ticket/Display.html?id=40020 > Thank you for the savegame. I'll try to reproduce. ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev