Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
On Wed, 23 Jan 2008, Torsten Dreyer wrote:

> Hi Tibor,
>
> I am running SuSE Linux, currently 10.3. But there is good news: I can now reproduce the client misbehaviour where the aircraft sits at -ft after a client crash and restart. I am currently looking into the issue, but it will take some time.
>
> Regards, Torsten

Great, thank you for debugging this one. I think the crash is not necessary to reproduce the - problem; starting up the two FlightGear instances in the wrong order should do it.

Best regards,
Tibor

-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
Flightgear-devel mailing list
Flightgear-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flightgear-devel
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
> > Currently I cannot reproduce any other misbehaviour than the segfault that you describe as gone now.
>
> Do you have a .deb-based system? If so, I could send you the relevant deb packages we used. Alternatively, we could try reproducing the issue on vservers and send you a compressed vserver.

Hi Tibor,

I am running SuSE Linux, currently 10.3. But there is good news: I can now reproduce the client misbehaviour where the aircraft sits at -ft after a client crash and restart. I am currently looking into the issue, but it will take some time.

Regards, Torsten
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
> That sounds like you're using TCP, since if you were using UDP, the master would not know if the slave(s) received the message -- UDP is an unreliable protocol and the master does not know if it is transmitting into oblivion or reaching an actual slave instance of FlightGear. Provided the native protocol doesn't have a mechanism to provide feedback of received messages to the master, that is.

This was running round in my head all day, and I did some investigation with a debugger, Wireshark and an internet search engine at hand... Here is what I learned today:

When you send a UDP datagram to a machine and the port is not open on that machine, you get an ICMP "destination unreachable / port unreachable" message back from the targeted machine. It looks like this is interpreted by the socket implementation here on my Linux box and finally handed over as an error from the send() system call, which produces the warning message.

So, nothing to worry about. It's just a warning message on the console, and as soon as the client is up again, data flow continues as it is supposed to with UDP.

Torsten
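The behaviour described in this message can be observed with a few lines of POSIX socket code. The following is only a sketch under assumptions: it targets Linux, `probe_udp_port` is a hypothetical helper (not part of FlightGear or plib), and the socket is put into the connected state with connect(). On Linux, a pending ICMP port-unreachable is typically reported as ECONNREFUSED by a later send() on such a socket, but whether and when that happens depends on the operating system and on timing.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: sends up to `attempts` datagrams to 127.0.0.1:port
// over a connected UDP socket. Returns the errno of the first failing call,
// or 0 if every call succeeded.
int probe_udp_port(unsigned short port, int attempts) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return errno;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    // connect() on a UDP socket only fixes the peer address; nothing is sent.
    // It is what allows the kernel to report ICMP errors back on this socket.
    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) < 0) {
        int e = errno;
        close(fd);
        return e;
    }

    int result = 0;
    const char msg[] = "ping";
    for (int i = 0; i < attempts; ++i) {
        if (send(fd, msg, sizeof msg, 0) < 0) {
            result = errno;   // often ECONNREFUSED if nobody listens on the port
            break;
        }
        usleep(10000);        // give a possible ICMP reply time to arrive
    }
    close(fd);
    return result;
}
```

Calling `probe_udp_port(5556, 5)` with no listener on that port would be expected to return ECONNREFUSED on a typical Linux box, and 0 once a receiver is bound to the port. The sender can simply log the error and keep sending, which matches the observation that data flow resumes when the client comes back.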
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
> We were using the current version of gcc in Debian testing (4.1.2, I believe?), but the same problem occurred with the Debian-packaged 1.0.0, the Ubuntu packages, and older versions as well. As said in another mail, the stability issues improved with the small patch posted on this list. There's no segfault any more. There's still the double-free/corruption problem (which tends to appear at shutdown), and master/slave startup order still matters. If the master starts before the slave, the slave plane starts stuck in the ground at - feet, and doesn't move. A reset on the slave side restores correct functionality.

- What are the configurations of the two machines?
- Are they equal: same OS, same architecture (32/64 bit)?
- What gcc/g++ version was used to compile?
- What plib version?
- Do you share the same binary for the two machines, or were they built independently?

The native protocol is *very* native: it just copies the internal data structure to the stream without caring about byte order, byte/word alignment or the kind of data representation in a struct/class.

Currently I cannot reproduce any other misbehaviour than the segfault that you describe as gone now.

Torsten
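To illustrate what "caring about byte order and alignment" would mean in practice, here is a minimal sketch of the alternative to copying a raw structure: each field is written at a fixed offset, with a fixed width, in network byte order. `WireState`, its two fields and `kWireSize` are hypothetical and much smaller than any real FlightGear packet; this is not the layout the native protocol uses.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, heavily reduced state record (not a FlightGear structure).
struct WireState {
    uint32_t version;
    uint32_t altitude_cm;
};

// Size of the encoded record: two 32-bit fields, no padding.
constexpr std::size_t kWireSize = 8;

// Write each field at a fixed offset in network (big-endian) byte order,
// so the result does not depend on the sender's struct layout or CPU.
void encode(const WireState& s, unsigned char* out) {
    const uint32_t v = htonl(s.version);
    const uint32_t a = htonl(s.altitude_cm);
    std::memcpy(out, &v, 4);
    std::memcpy(out + 4, &a, 4);
}

// Read the fields back from the same offsets and convert to host order.
WireState decode(const unsigned char* in) {
    uint32_t v = 0;
    uint32_t a = 0;
    std::memcpy(&v, in, 4);
    std::memcpy(&a, in + 4, 4);
    return WireState{ntohl(v), ntohl(a)};
}
```

With this approach, two builds that differ in architecture or compiler padding still agree on the bytes exchanged, at the cost of writing the encode/decode pair by hand for every field.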
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
On Tue, 22 Jan 2008, Torsten Dreyer wrote:

snip

> - What are the configurations of the two machines?

Both machines are x86 running Debian testing.
Acer notebook: Intel Pentium M (1.60GHz), 1 GB RAM, ATI Radeon Mobility X700 (PCIE)
Desktop: AMD Athlon(tm) XP 3200+, 1 GB RAM, some nvidia

> - Are they equal: same OS, same architecture (32/64 bit)?

Both are 32 bit with the same byte order.

> - What gcc/g++ version was used to compile?

gcc: gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
libstdc++6: 4.1.1-21
libc6: 2.3.6.ds1-13etch4

> - What plib version?

Compiled with 1.8.4-6, but both machines have 1.8.4-8.

> - Do you share the same binary for the two machines, or were they built independently?

Same binary; I've compiled everything, built .deb packages and installed those on both machines.

> The native protocol is *very* native: it just copies the internal data structure to the stream without caring about byte order, byte/word alignment or the kind of data representation in a struct/class.
>
> Currently I cannot reproduce any other misbehaviour than the segfault that you describe as gone now.

Do you have a .deb-based system? If so, I could send you the relevant deb packages we used. Alternatively, we could try reproducing the issue on vservers and send you a compressed vserver.

TIA
Tibor Palinkas
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
On Fri, 18 Jan 2008, Torsten Dreyer wrote:

> Hi all,
>
> I think there is a bug when using the native protocol to link two instances of FlightGear via network, or when recording and playing back flights using
>
>     fgfs --native=file,out,20,fgfs.out
>
> and
>
>     fgfs --native=file,in,20,fgfs.out --fdm=external
>
> The external FlightGear crashes, and after some investigation I think the problem is a missing operator =() method in FDM/flight.hxx.
>
> The problem is: in Network/native.cxx a buffer is read either from the network or from a file containing the previously written fdm state. The content of the buffer is then assigned to the current fdm state by doing
>
>     *cur_fdm_state = buf;
>
> Both variables are of type FGInterface, which currently lacks an operator =() method, so the compiler uses a simple memcpy to copy one object to the other. This is almost OK, except for the ground_cache object. This is a complex object containing a std::vector<FGGroundCache::Triangle>. This vector seems to store memory pointers to the Triangle vertices. This is a very bad thing, because these pointers are invalid for any other FlightGear session and dereferencing them causes a segmentation fault.
>
> A very ugly - if not disgusting - workaround is adding the following to the public methods of FGInterface in FDM/flight.hxx:
>
>     virtual const FGInterface & operator = ( FGInterface & src ) {
>         char * start = (char*)&inited;
>         char * end = (char*)&ground_cache;
>         memcpy( &inited, &src.inited, end-start );
>         prepare_ground_cache_m( 0, geodetic_position_v, 100.0 );
>     }
>
> This gets called instead of a memcpy when assigning one FGInterface to another, and it does the memcpy for all member variables except the ground_cache. The ground_cache itself is initialized for the recovered position with a fixed reference time of 0 and a radius of 100 m.
>
> At least this change fixes the segfault when replaying with the native protocol, but I don't think this is the kind of code we want to see in FlightGear, for two reasons:
> a) The pointer arithmetic assumes simple datatypes between the inited and ground_cache variables.
> b) A constant is used for the reference time and the radius.
>
> While a) may be circumvented by using explicit assignments for all variables, I have no good idea for b). The radius might be saved when doing the output, but I do not understand the idea of the reference time...
>
> And there is one thing that is going round in my head: Curt reported that he does not have this problem at all, and no one else (except tpalinkas) reported this crash. Maybe this is a compiler/library problem?
>
> Thanks for reading all that - any comment or help is appreciated.
>
> Torsten

We applied your patch and it fixed the initial segfault in the slave. (However, we experience double-free/corruption when the slave quits.)

Another strange bug is that if we start up in the wrong order (master first; without the patch, this order caused an immediate segfault), the initial states of the slave are messed up (altitude is - ft; plane permanently stuck in the ground). Doing a reset on the slave fixes the problem, even if we've taken off with the master.

TIA
Tibor Palinkas
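The idea behind the workaround in this message, copying the plain state while leaving the ground cache to be rebuilt locally, can also be written with explicit member assignments, which is the option mentioned under a). What follows is a minimal sketch and not FlightGear code: `FlightState`, `GroundCache` and all their members are hypothetical stand-ins for FGInterface and FGGroundCache. One detail worth keeping in mind is that a compiler-generated assignment operator copies member by member rather than with memcpy, so for a std::vector member it runs the vector's own assignment on whatever bytes the source object holds.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for FGGroundCache: owns heap memory via a vector.
struct GroundCache {
    std::vector<double> triangles;
    bool valid = false;

    // Drop all cached data so it gets rebuilt for the new position.
    void invalidate() {
        triangles.clear();
        valid = false;
    }
};

// Hypothetical stand-in for FGInterface; the real class has many more members.
struct FlightState {
    bool inited = false;
    double lon_rad = 0.0;
    double lat_rad = 0.0;
    double alt_ft = 0.0;
    GroundCache ground_cache;   // session-local, never taken from a peer

    FlightState() = default;
    FlightState(const FlightState&) = default;

    // Copy the plain state field by field and never read src.ground_cache,
    // whose contents are only meaningful inside the process that built them.
    FlightState& operator=(const FlightState& src) {
        if (this != &src) {
            inited = src.inited;
            lon_rad = src.lon_rad;
            lat_rad = src.lat_rad;
            alt_ft = src.alt_ft;
            ground_cache.invalidate();
        }
        return *this;
    }
};
```

After `a = b;` the position fields of `a` match `b`, while `a.ground_cache` is empty and waiting to be refilled. This avoids the pointer arithmetic between members, at the price of having to update the operator whenever a member is added.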
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
I am testing the udp version of doing master/slave copies of FlightGear here this morning. I'm doing this with a stock v1.0 version. So far everything seems to be behaving well. I'm not seeing any rapid memory leak, and so far no crash. Are you seeing this only with file I/O? Are you seeing this with network I/O? How long do you need to have the system running before you see memory thrashing or a crash?

Thanks,
Curt.

--
Curtis Olson: http://baron.flightgear.org/~curt/
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
On Monday, 21 January 2008 18:21, Curtis Olson wrote:

> I am testing the udp version of doing master/slave copies of FlightGear here this morning. I'm doing this with a stock v1.0 version. So far everything seems to be behaving well. I'm not seeing any rapid memory leak, and so far no crash. Are you seeing this only with file I/O? Are you seeing this with network I/O? How long do you need to have the system running before you see memory thrashing or a crash?

I get the segfault with file I/O and with a UDP link. This is my environment:

- SuSE Linux 10.3 running x86_64 on an Intel(R) Core(TM)2 CPU.
- Two instances of FlightGear, frame-rate-throttled to 25 fps each.
- FlightGear stock 1.0.0 built from source and current plib cvs, built with gcc 4.2.1.

Command line for the slave (launched first):

    fgfs --native=socket,in,20,localhost,5556,udp --aircraft=c172p --geometry=640x480 --timeofday=noon --fdm=null

Command line for the master (launched after slave startup):

    fgfs --native=socket,out,20,localhost,5556,udp --aircraft=c172p --geometry=640x480 --timeofday=noon

This segfaults the slave anywhere between immediately and after a couple of minutes. Sometimes, terminating the master and starting it at other locations in the world, like --airport=KJFK or --airport=LOWI, immediately kills the slave.

Doing some more tests with the operator =() method added, I never got a crash. BTW, the operator =() could be reduced to

    virtual const FGInterface & operator = ( FGInterface & src ) {
        char * start = (char*)&inited;
        char * end = (char*)&ground_cache;
        memcpy( &inited, &src.inited, end-start );
    }

since prepare_ground_cache will be called later automatically by FGInterface::get_groundlevel.

And while we are at the native protocols: I am sorry to say that native-ctrls is broken, too. The encoding swaps bytes for little-endian machines when encoding to the net, but does not when decoding from the net. This part is commented out in version 1.32:

http://cvs.flightgear.org/cgi-bin/viewvc/viewvc.cgi/source/src/Network/native_ctrls.cxx?r1=1.31&r2=1.32

(check lines 296 and 353). Is this intentional?

Torsten
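Why a one-sided byte swap breaks the link on little-endian machines can be shown with a single 32-bit field. The sketch below is illustrative only: the function names are hypothetical, and `from_wire_unswapped` merely models what a decoder does when its conversion block is disabled. It also shows that htonl() and ntohl() perform the same transformation on a given host, so the direction named in the function does not matter as long as both sides apply one of them.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>

// Sender side: convert a host-order value to network (big-endian) order.
uint32_t to_wire(uint32_t host_value) { return htonl(host_value); }

// Receiver with the conversion disabled: the wire value is used as it is.
uint32_t from_wire_unswapped(uint32_t wire_value) { return wire_value; }

// Receiver with the conversion enabled: back to host order.
uint32_t from_wire(uint32_t wire_value) { return ntohl(wire_value); }

// True on hosts where network order differs from host order.
bool host_is_little_endian() { return htonl(1u) != 1u; }
```

On a little-endian host, `from_wire_unswapped(to_wire(24))` yields a byte-reversed number instead of 24, while `from_wire(to_wire(24))` gives 24 back on any host. On a big-endian host both conversions are no-ops, which is one way such a bug can go unnoticed for a long time.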
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
On Jan 21, 2008 12:51 PM, Torsten Dreyer wrote:

> On Monday, 21 January 2008 18:21, Curtis Olson wrote:
> > I am testing the udp version of doing master/slave copies of FlightGear here this morning. I'm doing this with a stock v1.0 version. So far everything seems to be behaving well. I'm not seeing any rapid memory leak, and so far no crash. Are you seeing this only with file I/O? Are you seeing this with network I/O? How long do you need to have the system running before you see memory thrashing or a crash?
>
> I get the segfault with file I/O and with a UDP link. This is my environment:
> - SuSE Linux 10.3 running x86_64 on an Intel(R) Core(TM)2 CPU.
> - Two instances of FlightGear, frame-rate-throttled to 25 fps each.
> - FlightGear stock 1.0.0 built from source and current plib cvs, built with gcc 4.2.1.

Here's one possible difference: I'm running with plib-1.8.4 ... any chance you could try that and see if it makes a difference? I realize a complete ground-up recompile is not trivial, but FlightGear does leverage plib's low-level network code, so is it possible that a change in plib since v1.8.4 is causing us grief?

The native_ctrls issue you point out is a surprise to me, but as I look in the cvs web browser pages, I see that Erik Hofman's name is attached to these, and the specific commit was adding some new fields to the structure. I think it makes sense to remove the comments that disable network byte order. That appears to have been a mistake that slipped through the cracks.

Regards,
Curt.

--
Curtis Olson: http://baron.flightgear.org/~curt/
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
On Jan 21, 2008 1:45 PM, Torsten Dreyer wrote:

> > Here's one possible difference: I'm running with plib-1.8.4 ... any chance you could try that and see if it makes a difference? I realize a complete ground-up recompile is not trivial, but FlightGear does leverage plib's low-level network code, so is it possible that a change in plib since v1.8.4 is causing us grief?
>
> I am running 1.8.4, too (checked plib/ul.h):
>
>     #define PLIB_MAJOR_VERSION 1
>     #define PLIB_MINOR_VERSION 8
>     #define PLIB_TINY_VERSION 4
>
> But I am still convinced that the ground_cache object causes the crash. The gdb backtrace, the invalid pointers - everything makes sense to me. Conversely, this means that the native protocol has been broken since Nov 22nd 2004, when the ground_cache object made its way into flight.hxx... But native_ctrls tells me that it sometimes takes years for bugs to show up or be detected ;-)

I'm not able to replicate the problem here. :-( I don't use --native-ctrls very often, but I use --native-fdm frequently for a variety of projects, and it has always been working well for me and seems to continue to work well.

I'm running gcc-4.1.2 here:

    g++ (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)

So the problem could also be related to your newer version of gcc (perhaps being more picky or more standards compliant). Or there could be a libc difference too.

The problem could very well be in the ground cache code, with something about that code blowing up on your system. Often, newer versions of the gcc compiler, or compilers on other platforms, expose weaknesses or problems in code that worked fine before.

It would be really great if Mathias could comment, since he is the author of the ground cache code. I have not looked at that code myself in any detail.

Regards,
Curt.

--
Curtis Olson: http://baron.flightgear.org/~curt/
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
> g++ (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
>
> So the problem could also be related to your newer version of gcc (perhaps being more picky or more standards compliant). Or there could be a libc difference too.

Yeah, that's what I think, too. I assume a different implementation in the STL, to be more precise: of std::vector. I don't much like the idea of building gcc 4.1.2 and rebuilding FlightGear from source with that compiler, but I would not be surprised if the crash disappeared that way. Which compiler is tpalinkas using?

Torsten
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
> The native_ctrls issue you point out is a surprise to me, but as I look in the cvs web browser pages, I see that Erik Hofman's name is attached to these, and the specific commit was adding some new fields to the structure. I think it makes sense to remove the comments that disable network byte order. That appears to have been a mistake that slipped through the cracks.

I have just recompiled FlightGear with the comments that disable the byte order code removed, and the native-ctrls protocol works like magic. Can you change the source in CVS? I just removed the two comments:

Index: native_ctrls.cxx
===
RCS file: /var/cvs/FlightGear-0.9/source/src/Network/native_ctrls.cxx,v
retrieving revision 1.33
diff -u -p -r1.33 native_ctrls.cxx
--- native_ctrls.cxx    21 Feb 2006 01:19:47 - 1.33
+++ native_ctrls.cxx    21 Jan 2008 20:53:27 -
@@ -293,7 +293,6 @@ void FGNetCtrls2Props( FGNetCtrls *net,
     int i;
     SGPropertyNode * node;
 
-/***
     if ( net_byte_order ) {
         // convert from network byte order
         net->version = htonl(net->version);
@@ -350,7 +349,6 @@ void FGNetCtrls2Props( FGNetCtrls *net,
         net->speedup = htonl(net->speedup);
         net->freeze = htonl(net->freeze);
     }
-*/
     if ( net->version != FG_NET_CTRLS_VERSION ) {
         SG_LOG( SG_IO, SG_ALERT,
                 "Version mismatch with raw controls packet format." );
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
Yes, this should already be done in both plib and osg branches.

Regards,
Curt.

--
Curtis Olson: http://baron.flightgear.org/~curt/
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
On Jan 21, 2008 2:38 PM, Torsten Dreyer wrote:

> > g++ (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
> >
> > So the problem could also be related to your newer version of gcc (perhaps being more picky or more standards compliant). Or there could be a libc difference too.
>
> Yeah, that's what I think, too. I assume a different implementation in the STL, to be more precise: of std::vector. I don't much like the idea of building gcc 4.1.2 and rebuilding FlightGear from source with that compiler, but I would not be surprised if the crash disappeared that way. Which compiler is tpalinkas using?

The other thing that concerns me is that even with your patch, tpalinkas is reporting order dependencies in the master/slave startup sequence, and doing it wrong results in a crash. Also some new double-free errors when exiting. The communication between master and slaves is UDP, so there should be absolutely no order dependencies in startup. Any machine should be able to start (or restart) in any order without affecting the stability of the system. Somehow the master must be sending garbage during startup and crashing the slaves.

Regards,
Curt.

--
Curtis Olson: http://baron.flightgear.org/~curt/
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
On Monday, 21 January 2008 22:00, Curtis Olson wrote:

> The other thing that concerns me is that even with your patch, tpalinkas is reporting order dependencies in the master/slave startup sequence, and doing it wrong results in a crash. Also some new double-free errors when exiting. The communication between master and slaves is UDP, so there should be absolutely no order dependencies in startup. Any machine should be able to start (or restart) in any order without affecting the stability of the system. Somehow the master must be sending garbage during startup and crashing the slaves.

Neither of these is occurring here. When I run the master without a slave, there are many "Error writing data." messages on the master's console, which stop again when the slave is up. But I can start up and shut down the master or the slave independently without a crash or double-free error messages.

Sorry - no idea for that one.

Torsten
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
Torsten Dreyer wrote:

> Neither of these is occurring here. When I run the master without a slave, there are many "Error writing data." messages on the master's console, which stop again when the slave is up. But I can start up and shut down the master or the slave independently without a crash or double-free error messages.
>
> Sorry - no idea for that one.
>
> Torsten

That sounds like you're using TCP, since if you were using UDP, the master would not know if the slave(s) received the message -- UDP is an unreliable protocol and the master does not know if it is transmitting into oblivion or reaching an actual slave instance of FlightGear. Provided the native protocol doesn't have a mechanism to provide feedback of received messages to the master, that is.
Re: [Flightgear-devel] Bug in native protocol was: simgear 1.0.0 crash -- and yet another bug
> That sounds like you're using TCP, since if you were using UDP, the master would not know if the slave(s) received the message -- UDP is an unreliable protocol and the master does not know if it is transmitting into oblivion or reaching an actual slave instance of FlightGear. Provided the native protocol doesn't have a mechanism to provide feedback of received messages to the master, that is.

It's UDP, I swear by the life of my dog (mine is too precious). The command line, pasted fresh from the clipboard:

    fgfs --native=socket,out,20,localhost,5556,udp --aircraft=c172p --geometry=640x480 --timeofday=noon --native-ctrls=socket,out,20,localhost,5557,udp

And the output of netstat -un after the above command (sorry - German output; s/VERBUNDEN/CONNECTED/g):

    Proto Recv-Q Send-Q Local Address      Foreign Address    State
    udp        0      0 127.0.0.1:32817    127.0.0.1:5556     VERBUNDEN
    udp        0      0 127.0.0.1:32818    127.0.0.1:5557     VERBUNDEN

Torsten