Hello all,


some time ago, we switched to Linux and installed the Linux AX.25 
subsystem, the AX.25 utilities and the FBB BBS.

It seemed to run smoothly after the usual configuration trouble. But 
now we are running into some problems that seem to occur more 
and more often.

The first one has already been mentioned on our mailing lists. It is the 
appearance of error messages like

socket error: write on socket: broken pipe

when using FBB with the kernel-based AX.25. This seems to be 
associated with another mechanism that sometimes causes 
trouble. The BBS writes its data to the kernel AX.25 socket and 
fills its buffers. Once all data is written, FBB seems to start the 
timeout countdown for that user. But on bad or slow links, 
such as 1200 bps, it may take some time until all the data has 
actually been sent out to the user. 
As a result, FBB can time out before the machine has really 
sent all the data to the user! 
The effect shows up as a sudden disconnect (DISC+) without 
any reason visible to the user. It almost never occurs when the 
downlink is of good quality, i.e. all packets are sent out 
without errors or retries.
But if conditions are bad and the BBS machine has to 
resend some frames, it is very likely that the user will be 
disconnected suddenly without a chance to finish his 
download.
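
To make the idea concrete, here is a minimal Python sketch of a timeout 
that does not start counting while data is still queued for sending. 
It only illustrates our theory and is not FBB's actual code: the class 
name, the pending_bytes callable and the clock parameter are all made 
up for this example, and how FBB would obtain the number of unsent 
bytes from the kernel is left open here.

```python
import time


class DrainAwareIdleTimer:
    """Idle timer that only counts down while nothing is queued for sending.

    Hypothetical illustration: pending_bytes is any callable returning the
    number of bytes still waiting to be transmitted; clock returns seconds.
    """

    def __init__(self, timeout_s, pending_bytes, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.pending_bytes = pending_bytes
        self.clock = clock
        self.idle_since = clock()

    def user_activity(self):
        # Call whenever the user sends something: restart the countdown.
        self.idle_since = self.clock()

    def expired(self):
        now = self.clock()
        if self.pending_bytes() > 0:
            # Data is still on its way out (slow link, retries), so the
            # user is not idle yet: push the start of the countdown forward.
            self.idle_since = now
            return False
        return now - self.idle_since >= self.timeout_s
```

With a timer like this, a user on a bad 1200 bps link would only be 
timed out after the last queued frame has left the machine and the 
full timeout has passed without input.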

Please note that this theory is only an attempt to explain that weird 
behaviour. Using the call program included in the AX.25 utilities, 
we never experienced such problems, no matter how bad the link 
was. This suggests that the "handshaking" between 
FBB and the AX.25 subsystem may not be optimal.
By the way: the same effect can be observed when using the 
FBB BBS together with the G8BPQ Packet Switch under DOS. 
There, too, users sometimes get timeouts if they are downloading 
a lot of data over a bad link.

Yesterday, things got even worse. Sometimes users were 
disconnected very shortly after logging in. Then, suddenly, on one 
port no more connects were possible. The BBS simply did not 
answer.
Trying another port, the user got the usual message prompting for 
the login. Immediately after this message was displayed, however, 
a disconnect followed. There was no chance to log in.

To test the other services on the machine we tried to connect to 
our node (AWZnode). The node came up as usual but showed no 
reaction to any input.

What we found were some strange things on the Linux 
machine. The first was that the system clock did not have 
the correct time and date. Obviously, there was something wrong 
with the clock. Today the machine had a system date of April 8th. 
Could it be that running Linux for a couple of weeks without 
restarting affects the system clock?

The second thing was more interesting: the mheard data file 
/var/ax25/mheard/mheard.bin had grown to 56 MBytes!
After deleting this file, AWZnode resumed working.
Question here: how can this be avoided? Is there a way to 
configure mheardd to set a maximum size for its data file?
At the moment, we have set up a crontab entry so that the file 
is deleted once a month. Is this really the way to go, or are there 
other options?
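
One alternative to deleting the file on a fixed schedule would be to 
delete it only once it exceeds a size limit. The following Python 
sketch shows the idea. The function name and the limit are our own 
invention, not an mheardd feature, and we have not checked whether 
mheardd keeps the file open or needs a restart afterwards.

```python
import os


def remove_if_oversized(path, max_bytes):
    """Delete path if it is larger than max_bytes.

    Returns True if the file was deleted, False otherwise (including
    the case where the file does not exist).
    """
    try:
        size = os.path.getsize(path)
    except FileNotFoundError:
        return False
    if size <= max_bytes:
        return False
    os.remove(path)
    return True
```

Run frequently from cron, for example as 
remove_if_oversized("/var/ax25/mheard/mheard.bin", 1024 * 1024), 
this would keep the file from growing anywhere near 56 MBytes. The 
1 MByte limit is an arbitrary guess.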

In the hope that someone has a hint or two for 
solving these problems, 

Best regards, 73

Gerd
