> On 03.06.2016 at 02:01, David Ranch <[email protected]> wrote:
> 
> 
> Hey Basil,
> 
> Good to hear from you.. hope all is well.
> 
> Yes.. it's been reported and Thomas verified it but I haven't heard of any 
> fixes yet ( I did send out a prod last month but no response)

In another thread (Message-Id: <[email protected]>) you answered my 
question and confirmed that those machines are running an SMP kernel.
Have you tried disabling SMP (in GRUB, boot the kernel with the cmdline option 
nosmp)?
Did the problem still occur then?
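
To verify that such a test boot really ran without SMP, a small check like the 
following may help. It is only a sketch: the function names are made up for 
illustration, and it assumes a Linux system that exposes /proc/cmdline. It 
treats "maxcpus=0" the same as "nosmp", since the kernel parameter 
documentation describes nosmp as the legacy form of maxcpus=0.

```python
import os


def smp_disabled(cmdline: str) -> bool:
    """Return True if the kernel command line requests a non-SMP boot.

    Hypothetical helper: it only looks for the exact options
    "nosmp" and "maxcpus=0" among the whitespace-separated words.
    """
    options = cmdline.split()
    return "nosmp" in options or "maxcpus=0" in options


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    try:
        cmdline = read_cmdline()
    except OSError:
        # Not on Linux, or /proc is not mounted: nothing to inspect.
        cmdline = ""
    print("non-SMP boot requested:", smp_disabled(cmdline))
    print("CPUs reported by os.cpu_count():", os.cpu_count())
```

If the option was picked up, the first line should report True and the CPU 
count should drop to 1.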

Those bugs are very hard to trace, because you cannot really provoke them; they 
occur suddenly.

With kernel ax25 on SMP machines I have discovered other severe bugs (ax25 data 
corruption) that also need to be fixed.

IMHO, our greatest problem is that there are too few kernel ax25 developers 
around in our ham community.

In the meantime, I encourage you to disable SMP to minimize the problems with 
kernel ax25.


Also look at my posting Message-Id: 
<[email protected]> from 2016-02-15. There was no 
response on the list, and nothing got into the kernel
(as far as I can see - my approach is to look at 
https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable/+/master/drivers/net/hamradio
 ; perhaps I'm wrong about that).

And on 2016-02-17 I asked for my patch to mkiss.c to be submitted. It fixes a 
race condition that leads to a kernel panic (!!!!) when the kernel ax25 stack 
sends data to the interface right after you have unplugged your usb-serial 
adapter.
It took me hours to discover, test and submit that, but nothing has happened.
(David, it was in my mail to you with Message-Id: 
<[email protected]> )


Thus, these problems are discussed here year after year, again and again, and 
periodically people spend time developing fixes that others have already 
written (but that never made it into the mainline kernel).

I'm very frustrated by that, and after reviewing my past efforts I simply have 
to say now: "sorry, I cannot help".


vy 73,
        - Thomas  dl9sau


> --David
> KI6ZHD
> 
> 
> 
> -------- Forwarded Message --------
> Subject:      Re: AX.25 / ax25d socket close issue on Ubuntu 14.04 but not on 
> 12.04
> Date:         Tue, 29 Mar 2016 09:00:37 +0200
> From:         Thomas Osterried <thomas@de>
> To:   David Ranch <dranch@net>
> CC:   Ralf Bächle DL5RB <ralf@org>, Bernard, f6bvp <f6bvp@fr>
> 
> 
> 
>> On 28.03.2016 at 22:21, David Ranch <dranch@net> wrote:
>> 
>> Hey Ralf, Thomas, Bernard,
>> 
>> I've been helping a user here who is running the LinuxRMS gateway on his 
>> Ubuntu 14.04 machine and when the remote station terminates the session, it 
>> leaves an AX.25 session on his computer *forever*.. never times out:
>> 
>> Active AX.25 sockets
>> Dest       Source     Device  State        Vr/Vs    Send-Q  Recv-Q
>> WA7FPV-0   WA7FPV-10  ax0     LISTENING    001/003  0       0
>> 
>> He built up an Ubuntu 12.04 machine with the same LinuxRMS/ax25d service and 
>> this does NOT happen.  He then sent me the below strace.  Any thoughts on 
>> where this issue is coming from?
> 
> Hello David,
> 
> just a quick answer (I'm travelling): it's coming from a kernel bug in 
> the ax25 part.
> You have already Cc'ed Ralf <dl5rb>.
> If I remember correctly, he also spoke about this issue some weeks ago.
> I also know of those problems, which are very rare.
> 
> My question is: does it happen on an SMP (multiprocessor) machine?
> 
> vy 73,
>       - Thomas  dl9sau
> 
>> 
>> --David
>> 
>> 
>> 
>> -------- Forwarded Message --------
>> Subject:     Re: AX.25 Help...
>> Date:        Mon, 28 Mar 2016 12:52:25 -0700
>> From:        Josh Gibbs <gibbsjj@com>
>> To:  David Ranch <dranch@net>
>> 
>> Confirmed that starting Direwolf on the Ubuntu 14 box with your script made 
>> no difference. Socket still hangs up. I connected to the rmsgw process with 
>> strace, and then sent the bye command:
>> 
>> select(5, [0 4], NULL, NULL, NULL)      = 1 (in [0])
>> read(0, "b\r", 8192)                    = 2
>> write(4, "b\r", 2)                      = 2
>> read(0, 0x8058180, 8192)                = -1 EAGAIN (Resource temporarily unavailable)
>> select(5, [0 4], NULL, NULL, NULL)      = 1 (in [4])
>> recv(4, "D", 1, MSG_PEEK|MSG_DONTWAIT)  = 1
>> recv(4, "Disconnecting...\r", 8192, 0)  = 17
>> write(1, "Disconnecting...\r", 17)      = 17
>> recv(4, 0x8058180, 8192, 0)             = -1 EAGAIN (Resource temporarily unavailable)
>> select(5, [0 4], NULL, NULL, NULL)      = 1 (in [4])
>> recv(4, "", 1, MSG_PEEK|MSG_DONTWAIT)   = 0
>> time(NULL)                              = 1459193715
>> send(3, "<134>Mar 28 12:35:15 rmsgw[1417]"..., 85, MSG_NOSIGNAL) = 85
>> write(1, "; INFO: Connection closed by CMS"..., 51) = 51
>> rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
>> rt_sigaction(SIGCHLD, NULL, {SIG_IGN, [], 0}, 8) = 0
>> nanosleep({1, 0}, 0xbfad3bac)           = 0
>> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
>> close(4)                                = 0
>> time(NULL)                              = 1459193716
>> write(1, "; Sent: 81 Bytes / Received: 2 B"..., 61) = 61
>> write(1, "; W7AUX de WA7FPV-10 SK\n", 24) = 24
>> time(NULL)                              = 1459193716
>> time(NULL)                              = 1459193716
>> send(3, "<133>Mar 28 12:35:16 rmsgw[1417]"..., 84, MSG_NOSIGNAL) = 84
>> close(4)                                = -1 EBADF (Bad file descriptor)
>> exit_group(0)                           = ?
>> +++ exited with 0 +++
>> 
>> I'm thinking that close(4) near the end is supposed to close the socket, but 
>> is resulting in -1 EBADF (Bad file descriptor).
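
A note on the trace above: the first close(4) returns 0, and only the second 
close(4), just before exit_group, returns EBADF. That is what a second close on 
an already closed descriptor looks like. The sketch below reproduces this 
pattern in isolation. It uses a pipe as a stand-in for the AX.25 socket (an 
assumption made only so that it runs anywhere), and close_twice is a 
hypothetical name. It says nothing about whether the EBADF is related to the 
socket that stays around in the kernel.

```python
import errno
import os


def close_twice():
    """Close the same descriptor twice and report what the second close does.

    Returns (raised, errno_value). A pipe stands in for the AX.25 socket.
    """
    read_fd, write_fd = os.pipe()
    os.close(write_fd)
    os.close(read_fd)          # first close succeeds, like "close(4) = 0"
    try:
        os.close(read_fd)      # same number again, now no longer open
    except OSError as exc:
        return True, exc.errno
    return False, None


if __name__ == "__main__":
    raised, err = close_twice()
    if raised:
        print("second close failed:", errno.errorcode.get(err, err))
    else:
        print("second close unexpectedly succeeded")
```

In a single-threaded process the second close fails with EBADF, because 
nothing has reused the descriptor number in between.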
>> 
>> I'm going to have a look in the code when I have more time to poke at this, 
>> but for now I at least have a working RMS Gateway on the Ubuntu 12 box! 
>> Appreciate all your help with this. I will let you know when I get to the 
>> root of it all, if you are interested!
>> 
>> -Josh
>> 
>> 
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-hams" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
