Hello everybody!
On IRC ndim and I have been investigating a segfault (backtrace here:
http://flightgear.lauft.net/yet-another-segfault-in-simgear.bt)
The culprit seems to be the path cache in simgear/props/props.cxx.
Here is a little overview of what we thought:
Jester: could it be that _linkedNodes contains an invalid entry?
Jester: meaning a node that got destroyed but somehow didn't get
removed from this array?
...
ndim: There are two SGPropertyNode::hash_table::erase() methods.
ndim: One calls remove_linked_node(this);, the other does not.
...
Jester: okay, so the apparent fix to the segfault would be to call
node->remove_linked_node(this) before returning true
...
ndim: Seems to work.
ndim: At least I can't reproduce the segfault any more.
Having stared at it for the last 3 hours or so, I don't think the
above fix is OK. Note that the callback modifies the vector the loop
is iterating over (which is a no-no!), causing every second entry to
be skipped. My guess is that's why it (seemingly) fixed the segfault.
Also, the mentioned erase() overload is only called from
SGPropertyNode::remove_from_path_caches where the _linkedNodes vector
is cleared immediately afterwards. So in this regard it doesn't matter
whether erase() calls back or not (but it really should - maybe later
other code paths will call erase and then bad things will happen).
I now think that the problem lies in the destructor for the hash
table. That's the one that should call the remove_linked_node for all
items still in the cache. Consider the following scenario: let's have
2 nodes X and Y where Y has a cache entry for X. Thus X._linkedNodes
contains Y, and Y._linkedNodes is empty. Now suppose Y gets destroyed
(*). X will however still think that it is alive, since nobody tells
him otherwise. So when somebody removes X then it will try to notify Y
which is already gone.
(*) Interesting things here as well. Seemingly nobody cares to free
property nodes, everybody is happy with removeChild.
Preliminary proof-of-concept patch is attached (the other from
http://flightgear.lauft.net/jester-fixes-another-segfault.patch should
be undone)
Any thoughts?
Greets,
Jester
Index: props.cxx
===================================================================
RCS file: /var/cvs/SimGear-0.3/source/simgear/props/props.cxx,v
retrieving revision 1.27
diff -u -r1.27 props.cxx
--- props.cxx 11 Feb 2007 11:05:23 -0000 1.27
+++ props.cxx 16 Feb 2007 04:11:24 -0000
@@ -2258,6 +2258,15 @@
return false;
}
+void
+SGPropertyNode::hash_table::bucket::clear (SGPropertyNode::hash_table * owner)
+{
+ for (int i = 0; i < _length; i++) {
+ SGPropertyNode * node = _entries[i]->get_value();
+ if (node) node->remove_linked_node(owner);
+ }
+}
+
SGPropertyNode::hash_table::hash_table ()
: _data_length(0),
_data(0)
@@ -2266,8 +2275,10 @@
SGPropertyNode::hash_table::~hash_table ()
{
- for (unsigned int i = 0; i < _data_length; i++)
+ for (unsigned int i = 0; i < _data_length; i++) {
+ if (_data[i]) _data[i]->clear(this);
delete _data[i];
+ }
delete [] _data;
}
Index: props.hxx
===================================================================
RCS file: /var/cvs/SimGear-0.3/source/simgear/props/props.hxx,v
retrieving revision 1.19
diff -u -r1.19 props.hxx
--- props.hxx 11 Feb 2007 11:05:23 -0000 1.19
+++ props.hxx 16 Feb 2007 04:11:31 -0000
@@ -1271,6 +1271,7 @@
entry * get_entry (const char * key, bool create = false);
void erase (const char * key);
bool erase (SGPropertyNode * node);
+ void clear (hash_table * owner);
private:
int _length;
entry ** _entries;
-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
Flightgear-devel mailing list
Flightgear-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flightgear-devel