Hi Gary,

Thanks for the reply.

One end of the connection is eCos; the other is Java running on a Linux JVM. The connection is dropped by the Java side. After repeated packets go unacknowledged by the eCos host, and the eCos netstack then transmits an out-of-order segment, the Java protocol stack appears to wind back to the last packet with a sequence number the two hosts agreed on and resends it. After a couple more attempts the Java netstack gives up and drops the connection.

Looking at the packet capture from eCos with cyg_io_eth_net_debug set, there is a complete lack of xmit activity: I see the retransmits from the Java host, but the eCos netstack fails to ack them. It is almost as if the protocol stack has stalled. From what I understand, the protocol stack is interrupt driven.

Here's the final twist. My main worker thread in eCos is very busy (writing to flash) for a period of around 16 seconds, during the time I observe the netstack xmit starvation. I've made sure my main thread's priority is lower (higher in integer terms) than the priority of the internal netstack delivery threads. Is there any way a user thread can cause netstack starvation? BTW, I'm not locking out interrupts during this time.

Sorry for the length of the post, and again TIA.

Andrew

Bell, Andrew [Allen & Heath UK] wrote:
> Hello All,
>
> I'm having FreeBSD netstack issues with an eCos port for a Motorola 852T
> board based on an A&M Adder.
>
> Our eCos application keeps dropping socket connections with an EPIPE
> (broken pipe) after a period of high tx activity. The ethereal capture
> of the stream shows that, shortly after the burst of tx activity, the
> eCos netstack stops sending acks to the front end, ignores retransmits
> from the front end, then eventually emits an out-of-order segment for
> which ethereal calculates an RTT of 1158229289 seconds!
>
> I've run the bsd tests, enabled stack checking and enabled assertions.
> I've turned on MBUF warnings, enabled cyg_io_eth_net_debug and
> increased CYGPKG_NET_USAGE to (1008 * 1024) + (MAXSOCK * 1024), all of
> which show no clues.
>
> If anyone can point me in the right direction I'd be grateful.

AFAIK, EPIPE is only returned if the receiving end of a TCP connection breaks off while the Tx end is still trying to send.

Are both "ends" of your connections eCos applications? On the same or different machines?

Is this failure something that can be tested/demonstrated separately? In other words, can you send a test case that duplicates the problem?

Finally, do you have any idea if it's hardware/platform specific?

--
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------
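
A side note on the RTT figure quoted in the original message: 1158229289 seconds is roughly 36.7 years, which is also the magnitude of a Unix timestamp. One possible reading, and it is only a guess, is that the capture tool computed the RTT against a zero or uninitialised timestamp instead of measuring a real delay. The short Python sketch below simply decodes the number as seconds since the Unix epoch. The names RTT_SECONDS and as_utc are made up for illustration and do not come from the thread.

```python
from datetime import datetime, timezone

# RTT value reported by ethereal in the original post, in seconds.
RTT_SECONDS = 1158229289

# Average length of a year in seconds (365.25 days), used for a rough scale.
SECONDS_PER_YEAR = 365.25 * 86400


def as_utc(seconds: int) -> datetime:
    """Interpret a count of seconds as an offset from the Unix epoch (UTC)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


rtt_years = RTT_SECONDS / SECONDS_PER_YEAR
rtt_as_date = as_utc(RTT_SECONDS)

print(f"RTT as a duration: about {rtt_years:.1f} years")
print(f"RTT read as an epoch timestamp: {rtt_as_date.isoformat()}")
```

If the decoded date is close to when the capture was taken, that would point towards a timestamp artefact in the analysis rather than anything the eCos stack actually did. It would not explain the missing acks themselves.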
