Re: RTEMS classic networking: fixed FXP driver works under QEMU on Linux

2016-09-20 Thread Sebastian Huber

Hello Pavel,

the interrupt server, with the simple interrupt vector disable/enable via 
the interrupt controller, is an easy solution that has worked on all boards I 
have used so far with libbsd. However, it certainly doesn't work on all 
systems. It made it easy to use the FreeBSD device drivers, which use 
mutexes in their interrupt service routines.


--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.



Re: RTEMS classic networking: fixed FXP driver works under QEMU on Linux

2016-09-20 Thread Chris Johns

On 21/09/2016 09:47, Pavel Pisa wrote:

Hello all,

the driver works after hours of hunting for the problem
in the wrong directions.

I have debugged, examined and patched both RTEMS and QEMU.
The main problem is quite simple. The updated RTEMS interrupt
processing disables level-type interrupts when they arrive,
and the driver/daemon has to re-enable the interrupt source
at the interrupt controller level.

Generally, the idea of disabling the interrupt source at hard interrupt
time and doing all processing outside of the interrupt is
the best solution for an RT system. But I consider the current
behavior seriously broken. That needs a longer description.


I am not so sure it is seriously broken. There looks to be an issue that 
needs to be handled, but that is all.


If the interrupt is level and directly using the PIC then you would see 
the issue you saw. The solution may be as simple as changing the code in 
BSP_dispatch_isr to be:


 if (irq_trigger[vector] == INTR_TRIGGER_LEVEL && SHARED)
   ...

where SHARED is a fast means of finding the type of interrupt. I think 
this change would have broken the interrupt server at the time I fixed 
this code because the interrupt server was using unique. That has changed.



But if you consider shared interrupts (all PCI ones on PC
for example) then the correct behavior is to use the hard IRQ
to disable/gate the interrupt at the given device level
(not all sharing devices at the controller level), release
a worker thread for each device from the corresponding
hard IRQ, and leave the scheduler to select between these
according to priority.


This is not the model that is used on this arch or others, eg ARM. There 
is a single thread to receive the interrupt and handlers are called from 
that thread's context to check and process the specific interrupt. If 
you do not mask the interrupt at the controller level the interrupt 
server thread can never run. We just sit in a loop handling the interrupt.


I think what we have now with the interrupt server is fine. It means we 
are using threads as soon as we can and with SMP I think that helps.



When a more important/critical
device finishes its IRQ processing, it enables the IRQ
at the given device level, and timely processing of its
interrupts is possible.

The current state pushes device drivers to attempt to re-enable
the IRQ at the controller level.


If you use the shared interrupt server this is managed for you. For a 
unique interrupt attaching directly to the PIC this has changed so yes 
it is broken. I did not consider this case.



But if the IRQ is re-enabled by a high
priority device worker thread while there exists
a lower priority device sharing the IRQ which is still signalling
it, then without gating at the device level this has to lead
to livelock. The controller fires IRQ dispatch, which releases
the high priority device driver, which finds nothing to do and re-enables
the IRQ at the controller level.


The current code in 4.12 has been tested on a real PC that has the 
spurious interrupt issue with 4 NICs shared across PCI interrupts using 
LibBSD without an issue. I tested this under load. Before the changes 
the platform was broken for real-time, the spurious handler did a printk 
and fired all the time once the load went up.




There is an option to make things work even with shared IRQs
if gating at the device level is not available. It is
to have a counted disable at the controller level, which ensures
that the IRQ is re-enabled only after all device worker
threads finish processing. But priority rules are broken in that
case as well.

I know that there is the interrupt server option, but base
interrupts should work and not be broken.



I have ticks, uart etc on edge and PCI on shared all working.


Processing of shared edge-triggered interrupts is a complex
problem as well. I think that I provided an analysis
for RTEMS years ago.


Shared edge? I do not think that can happen on a PC given edge relates 
back to the ISA bus.






found it

https://lists.rtems.org/pipermail/users/2008-May/018775.html

If I could find a spare month somewhere, I would try to provide
changes which could go under critique and testing.



The changes I did fix the spurious interrupt issue that has been in 
RTEMS on this platform for years. I tracked the FreeBSD kernel's 
handling of the AT pic and what it does. You need to step carefully with 
this code and touching the PIC. What we have now is aligned to FreeBSD 
which partially explains the issue you have had. I considered it a good 
base given its wide PC support.


Finally this code should be migrated to support APIC and that changes 
things again.


Chris



Re: RTEMS classic networking: fixed FXP driver works under QEMU on Linux

2016-09-20 Thread Pavel Pisa

Hello all,

the driver works after hours of hunting for the problem
in the wrong directions.

I have debugged, examined and patched both RTEMS and QEMU.
The main problem is quite simple. The updated RTEMS interrupt
processing disables level-type interrupts when they arrive,
and the driver/daemon has to re-enable the interrupt source
at the interrupt controller level.

Generally, the idea of disabling the interrupt source at hard interrupt
time and doing all processing outside of the interrupt is
the best solution for an RT system. But I consider the current
behavior seriously broken. That needs a longer description.

But if you consider shared interrupts (all PCI ones on PC
for example) then the correct behavior is to use the hard IRQ
to disable/gate the interrupt at the given device level
(not all sharing devices at the controller level), release
a worker thread for each device from the corresponding
hard IRQ, and leave the scheduler to select between these
according to priority. When a more important/critical
device finishes its IRQ processing, it enables the IRQ
at the given device level, and timely processing of its
interrupts is possible.

The current state pushes device drivers to attempt to re-enable
the IRQ at the controller level. But if the IRQ is re-enabled by a high
priority device worker thread while there exists
a lower priority device sharing the IRQ which is still signalling
it, then without gating at the device level this has to lead
to livelock. The controller fires IRQ dispatch, which releases
the high priority device driver, which finds nothing to do and re-enables
the IRQ at the controller level.

There is an option to make things work even with shared IRQs
if gating at the device level is not available. It is
to have a counted disable at the controller level, which ensures
that the IRQ is re-enabled only after all device worker
threads finish processing. But priority rules are broken in that
case as well.
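
The counted disable idea can be illustrated with a minimal C sketch.
This is only a model under assumptions: struct irq_line and both
functions are hypothetical names invented for the example, not an RTEMS
API, and a real implementation would have to protect the counter
against concurrent access (interrupt disable or a lock) and write the
mask bit to the actual controller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one shared IRQ line at the controller level. */
struct irq_line {
    unsigned disable_depth; /* workers released but not yet finished */
    bool     masked;        /* modelled mask state of the line */
};

/*
 * Called from the hard IRQ dispatch, once for every device worker
 * thread that is released for this line.
 */
static void irq_line_disable(struct irq_line *line)
{
    line->disable_depth++;
    line->masked = true;
}

/*
 * Called by each worker thread when it has finished processing its
 * device. Only the last worker to finish unmasks the line.
 */
static void irq_line_enable(struct irq_line *line)
{
    assert(line->disable_depth > 0);
    if (--line->disable_depth == 0)
        line->masked = false;
}
```

With two devices sharing the line, the line stays masked after the
first worker calls irq_line_enable() and is unmasked only by the second
call. The priority problem remains: the high priority device cannot get
a new interrupt until the low priority worker has finished too.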

I know that there is the interrupt server option, but base
interrupts should work and not be broken.

Processing of shared edge-triggered interrupts is a complex
problem as well. I think that I provided an analysis
for RTEMS years ago.



found it

https://lists.rtems.org/pipermail/users/2008-May/018775.html

If I could find a spare month somewhere, I would try to provide
changes which could go under critique and testing.

Anyway, back to solving Saeed Ehteshamifar's problem with
the lack of an environment with network support for his dynamic
loading task. I have tested it on Linux, Debian.
I have done the setup without helper scripts and tools from
QEMU or another system. I decided to use a separate
"network segment" for testing. The wire for the L2/ethernet
layer is created by
 
  ip tuntap add tap1 mode tap user pi
  ip link set tap1 up

You need root access.
I have connected this stub to Linux TCP/IP networking
subsystem by

  ip addr add 192.168.3.1/24 dev tap1
  ip link set tap1 up

You need to select an address from a private address range.
Check that the whole range, in the above case 192.168.3.0
to 192.168.3.255, does not overlap with networks
visible from your computer. It should be a truly
isolated island. The kernel sets up a default
route to the range

  ip route show

  192.168.3.0/24 dev tap1  proto kernel  scope link  src 192.168.3.1
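
The range arithmetic and the overlap check can be sketched in C as
follows. This is only an illustration: the helper names (ipv4,
prefix_mask, range_first, range_last, ranges_overlap) are made up for
the example and addresses are kept in host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build an IPv4 address in host byte order from its four octets. */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* Netmask for a prefix length of 0 to 32 bits. */
static uint32_t prefix_mask(unsigned prefix)
{
    assert(prefix <= 32);
    return prefix == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - prefix));
}

/* Lowest address of the range containing addr. */
static uint32_t range_first(uint32_t addr, unsigned prefix)
{
    return addr & prefix_mask(prefix);
}

/* Highest address of the range containing addr. */
static uint32_t range_last(uint32_t addr, unsigned prefix)
{
    return addr | ~prefix_mask(prefix);
}

/* True when the two ranges share at least one address. */
static bool ranges_overlap(uint32_t a, unsigned pa, uint32_t b, unsigned pb)
{
    return range_first(a, pa) <= range_last(b, pb) &&
           range_first(b, pb) <= range_last(a, pa);
}
```

For 192.168.3.1/24 this gives 192.168.3.0 as the first and
192.168.3.255 as the last address, matching the route above. The range
would overlap with a visible 192.168.0.0/16 network but not with
192.168.1.0/24.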

To keep the range separate you should not have forwarding enabled

  cat /proc/sys/net/ipv4/ip_forward

should show 0, or you need to set up iptables or check and set
routing rules to keep the network an island.

I start RTEMS application by command

qemu-system-x86_64 -enable-kvm -kernel $APP_BINARY \
  -vga cirrus \
  -append "--console=/dev/com1" \
  -serial stdio \
  -net nic,vlan=1,macaddr=be:be:be:10:00:01,model=i82557b \
  -net tap,ifname=tap1,vlan=1,script=no,downscript=no

You can add "-s -S" for debugging with GDB ("target remote localhost:1234").
Be careful: if you want to set breakpoints, setting them while
in real mode at address 0x0 is not a good idea. You need
to be in the virtual space of the RTEMS application after it is loaded
to set software breakpoints. But there are two or three HW
breakpoints emulated. Pick some function and use

  hbreak rtems_fxp_attach
  c

When the function is reached you can insert regular SW
breakpoints.

When the application runs, you can access it from the Linux system.
The configured address has to be within the island network range,
e.g.

  ping 192.168.3.66

The network card should be found by an ARP broadcast coming
to QEMU, to which RTEMS responds that the address is its own.
You can use

  arping -i tap1 192.168.3.66

or 

  arping -i tap1 be:be:be:10:00:01

to check lowest level connection.

The OMK template includes application "appnet" which uses
RTEMS integrated telnet to access RTEMS shell remotely.

  telnet 192.168.3.66

See

  http://rtime.felk.cvut.cz/gitweb/rtems-devel.git/tree/HEAD:/rtems-omk-template/appnet

The config

static struct rtems_bsdnet_ifconfig netdriver_config = {
.name = "fxp1" /*RTEMS_BSP_NETWORK_DRIVER_NAME*/,
.attach = rtems_fxp_attach /*RTEMS_BSP_NETWORK_DRIVER_ATTACH*/,
.next = NULL,