[Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units = plrhand.c maybe_make_contact()

2008-01-27 Thread Jason Short

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40020 

I just ran it under S2_1 as:

$ ./ser -f ~/-0200x.sav -p 5557
...
 set timeout 1
 start
...
 civserver: player.c:246: player_index: Assertion `pplayer' failed.
Aborted (core dumped)


Then the same under valgrind:
==5035== Invalid read of size 4
==5035==at 0x808A7B4: maybe_make_contact (plrhand.c:1206)
==5035==by 0x806D1E9: move_unit (unittools.c:2853)
==5035==by 0x80B11C4: handle_unit_move_request (unithand.c:1201)
==5035==by 0x813D2E8: ai_unit_move (aitools.c:1022)
==5035==by 0x813B69C: ai_unit_execute_path (aitools.c:194)
==5035==by 0x813BDFF: ai_follow_path (aitools.c:385)
==5035==by 0x813C10B: ai_unit_goto_constrained (aitools.c:459)
==5035==by 0x813C9B3: ai_unit_goto (aitools.c:790)
==5035==by 0x813BBA9: ai_gothere (aitools.c:317)
==5035==by 0x8141B85: ai_military_attack (aiunit.c:1755)
==5035==by 0x81423AF: ai_manage_military (aiunit.c:2078)
==5035==by 0x8142B09: ai_manage_unit (aiunit.c:2245)
==5035==by 0x8142F7F: ai_manage_units (aiunit.c:2351)
==5035==by 0x81343BE: ai_do_first_activities (aihand.c:426)
==5035==by 0x8054D25: ai_start_phase (srv_main.c:594)
==5035==by 0x8055124: begin_phase (srv_main.c:726)
==5035==by 0x80571D9: srv_running (srv_main.c:1816)
==5035==by 0x8057B87: srv_main (srv_main.c:2194)
==5035==by 0x804B195: main (civserver.c:258)
==5035==  Address 0x4dd2e30 is 8 bytes inside a block of size 12 free'd
==5035==at 0x402365C: free (vg_replace_malloc.c:323)
==5035==by 0x8050488: genlist_unlink (genlist.c:141)
==5035==by 0x80C7B3F: unit_list_unlink (speclist.h:105)
==5035==by 0x80C7AD0: game_remove_unit (game.c:148)
==5035==by 0x8068E13: server_remove_unit (unittools.c:1374)
==5035==by 0x8069072: wipe_unit (unittools.c:1436)
==5035==by 0x8068014: bounce_unit (unittools.c:1044)
==5035==by 0x8068393: resolve_stack_conflicts (unittools.c:1086)
==5035==by 0x8068403: resolve_unit_stacks (unittools.c:1110)
==5035==by 0x8088937: update_players_after_alliance_breakup
(plrhand.c:457)
==5035==by 0x8088BB2: handle_diplomacy_cancel_pact (plrhand.c:546)
==5035==by 0x808A5C9: make_contact (plrhand.c:1180)
==5035==by 0x808A7F3: maybe_make_contact (plrhand.c:1207)
==5035==by 0x806D1E9: move_unit (unittools.c:2853)
==5035==by 0x80B11C4: handle_unit_move_request (unithand.c:1201)
==5035==by 0x813D2E8: ai_unit_move (aitools.c:1022)
==5035==by 0x813B69C: ai_unit_execute_path (aitools.c:194)
==5035==by 0x813BDFF: ai_follow_path (aitools.c:385)
==5035==by 0x813C10B: ai_unit_goto_constrained (aitools.c:459)
==5035==by 0x813C9B3: ai_unit_goto (aitools.c:790)
==5035==by 0x813BBA9: ai_gothere (aitools.c:317)
==5035==by 0x8141B85: ai_military_attack (aiunit.c:1755)
==5035==by 0x81423AF: ai_manage_military (aiunit.c:2078)
==5035==by 0x8142B09: ai_manage_unit (aiunit.c:2245)
==5035==by 0x8142F7F: ai_manage_units (aiunit.c:2351)
==5035==by 0x81343BE: ai_do_first_activities (aihand.c:426)
==5035==by 0x8054D25: ai_start_phase (srv_main.c:594)
==5035==by 0x8055124: begin_phase (srv_main.c:726)
==5035==by 0x80571D9: srv_running (srv_main.c:1816)
==5035==by 0x8057B87: srv_main (srv_main.c:2194)
==5035==by 0x804B195: main (civserver.c:258)

make_contact kills the unit because of the broken treaty and bouncing
(strange in itself but whatever).  Then since maybe_make_contact doesn't
use a proper iterator it breaks the loop.  Attached patch should fix it
for 2.1 and most likely 2.2/trunk.

-jason

Index: server/plrhand.c
===
--- server/plrhand.c	(revision 14329)
+++ server/plrhand.c	(working copy)
@@ -1203,9 +1203,9 @@
 if (pcity) {
   make_contact(pplayer, city_owner(pcity), ptile);
 }
-unit_list_iterate(tile1-units, punit) {
+unit_list_iterate_safe(tile1-units, punit) {
   make_contact(pplayer, unit_owner(punit), ptile);
-} unit_list_iterate_end;
+} unit_list_iterate_safe_end;
   } square_iterate_end;
 }
 
___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


Re: [Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units = plrhand.c maybe_make_contact()

2008-01-27 Thread William Allen Simpson

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40020 

Jason Short wrote:
 make_contact kills the unit because of the broken treaty and bouncing
 (strange in itself but whatever).  Then since maybe_make_contact doesn't
 use a proper iterator it breaks the loop.  Attached patch should fix it
 for 2.1 and most likely 2.2/trunk.
 
That looks like it!  Of course, it *was* a proper iterator right up until
somebody added code to wipe an unrelated unit of another player that
happens to be in a stack with a player that canceled a treaty

In this case, the Indian player's last settler.  Causing the Indian player
to die!  Good thing this is probably a rare event.

So, let's consider this a temporary patch until bounce_unit() -- or its
caller -- is fixed.  What a terrible fragile logic mess!



___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


Re: [Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units = plrhand.c maybe_make_contact()

2008-01-27 Thread Marko Lindqvist

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40020 

On 27/01/2008, William Allen Simpson  wrote:
 So, let's consider this a temporary patch until bounce_unit() -- or its
 caller -- is fixed.

 OTOH. I have spent hundreds of hours during last three years hunting
bugs that have turned out to be caused by unit dying inside
unit_iterate(). Units can die quite unexpectedly, so I'd rather use
_safe() in any high level iteration instead of wasting developer time
in unnecessary bughunts.


 - ML



___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


Re: [Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units = plrhand.c maybe_make_contact()

2008-01-20 Thread William Allen Simpson

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40020 

Thank you for the helpful savegame!

Here's where it currently dies in S2_1:

0: nation_of_player() has NULL nation
0: Detected fatal error in nation.c line 259:
0: bad nation

This is a crash trying to display a logging message -- nation_of_player()
is only used in freelog() messages.

===

Here's where it currently dies in S2_2 (has a lot more parameter checking):

player.c:293: failed assertion `pplayer'

0   civserver   0x0006a1b4 player_number + 0x18 (player.c:295)
1   civserver   0x00032014 make_contact + 0x34 (plrhand.c:1167)
2   civserver   0x00032568 maybe_make_contact + 0x22c (plrhand.c:1242)
3   civserver   0x0004e0dc move_unit + 0x738 (unittools.c:2837)
4   civserver   0x0005c328 unit_move_handling + 0x57c (unithand.c:1331)
5   civserver   0x000d5dcc ai_unit_move + 0x2a4 (aitools.c:1047)
6   civserver   0x000d40b8 ai_unit_execute_path + 0xb0 (aitools.c:221)
7   civserver   0x000d48f4 ai_follow_path + 0x48 (aitools.c:411)
8   civserver   0x000d4b1c ai_unit_goto_constrained + 0x14c (aitools.c:484)
9   civserver   0x000d52c8 ai_unit_goto + 0x60 (aitools.c:814)
10  civserver   0x000d461c ai_gothere + 0x1e8 (aitools.c:342)
11  civserver   0x000dac20 ai_military_attack + 0x1ec (aiunit.c:1755)
12  civserver   0x000dba78 ai_manage_military + 0x2e0 (aiunit.c:2079)
13  civserver   0x000dc2e4 ai_manage_unit + 0x46c (aiunit.c:2261)
14  civserver   0x000dc6e8 ai_manage_units + 0x17c (aiunit.c:2362)


Merely adding some more logging in maybe_make_contact() gave it enough time
for the server freelog() messages to be flushed without a crash:

1: Failed sanity check at Grassland (50, 28): map_get_player_site(ptile, 
tile_owner(ptile)) != NULL (sanitycheck.c:173)

And the client Message panel now says:
   The Indians are no more!

But the client Nations (F3) tab still shows Indians

===

My WAG is there are still unsupported Indian units in the field, or an
Indian city hasn't been properly removed, and another player made first
contact with them.  The nation death isn't being properly propagated (as
seen on the Nation screen).

This is a serious bug, but will take a *lot* of effort to track.



___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


Re: [Freeciv-Dev] (PR#40020) Segfault in server aiunit.c ai_manage_units = plrhand.c maybe_make_contact()

2008-01-20 Thread William Allen Simpson

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40020 

For posterity, several more hours of solid debugging, and I've determined the
unit itself has lost its name.  A dead unit wasn't taken off the tile?

temporary logging:
2: maybe_make_contact() unit

And a dead city wasn't taken off the tile owner list before begin_phase().

1: Failed sanitycheck.c:174 (srv_main.c:725) at Grassland (50,28):
  map_get_player_site(ptile, tile_owner(ptile)) != NULL

Still guessing



___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev