Hi Greg and all,

good and bad news from me: the issue below has been solved, many thanks for the 
support.
It was caused by a bad (open) solder joint on a small pull-up resistor network, 
which was generating spurious interrupts.

Now the board is much more stable (it was crashing within 1 hour), so I left the 
shell (serial port) open, but after 1 day of uptime I got this strange lockup:

~ # ls
bin    etc    lib    mnt    root   usr
dev    home   media  proc   tmp    var
~ # ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒

- no keys are accepted from the console
- it is not possible to connect with telnet
- the web server was still responding to some extent, but slower than normal; 
I could load the homepage twice, but then it seemed to lock up completely
- the ftp server (inetd) doesn't respond either, same as telnet

Debug symbols are already enabled inside the kernel, but there isn't any useful 
information.

Any help is appreciated,

regards,
angelo

On 12/08/2011 10:55, angelo wrote:

> Hi Greg, many thanks for your reply,
> 
> have you used mcf5307 boards and found them reliable? I am asking only
> because of the errata for this chip.
> 
> 
> the issue described in the errata was happening in u-boot: I was getting
> the trap below just after the "rte" execution inside the interrupt handler.
> There, too, the PC shown was different from 0xffffffff, but setting the
> "C/I" bit as they suggest solved the problem.
> 
> 
> *** Unexpected exception ***
> Vector Number: 3  Format: 04  Fault Status: 4
> 
> PC: 00fe910a    SR: 00002000    SP: 00ed8af0
> D0: 00002c1b    D1: 0000001b    D2: 00400000    D3: 00ee8b76
> D4: ffc12d08    D5: ffffffff    D6: 00ffad57    D7: 00ee8b76
> A0: 00ee8b76    A1: 00fe9604    A2: 00ee8bc6    A3: 00ffd400
> A4: 00ff8167    A5: 00ffbb00    A6: 00ed8b48
> 
> *** Please Reset Board! ***
> 
> Looking more closely now, the uClinux trap is different: it is a Vector
> 4 (illegal instruction) and not 3, but it always happens inside the
> interrupt, and this is probably not a coincidence.
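
To make the vector comparison above easier to follow, here is a minimal sketch
of how the first longword of a ColdFire exception stack frame splits into the
fields that the u-boot and kernel dumps print (format, vector, fault status,
SR). The struct and function names are hypothetical, chosen for this example
only, and the sample word in the note below is built by hand from the dump
values rather than read from the board.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields of the first longword
 * of a ColdFire exception stack frame. */
struct cf_frame_info {
    unsigned format;        /* bits 31..28 */
    unsigned vector;        /* bits 25..18 */
    unsigned fault_status;  /* FS[3:2] in bits 27..26, FS[1:0] in bits 17..16 */
    unsigned sr;            /* bits 15..0, status register at exception time */
};

/* Split the format/vector word into its fields. */
static struct cf_frame_info cf_decode_frame(uint32_t w)
{
    struct cf_frame_info f;

    f.format = (w >> 28) & 0xFu;
    f.vector = (w >> 18) & 0xFFu;
    f.fault_status = (((w >> 26) & 0x3u) << 2) | ((w >> 16) & 0x3u);
    f.sr = w & 0xFFFFu;
    return f;
}

/* Names for the vectors discussed in this thread; everything else is
 * reported as "other" in this sketch. */
static const char *cf_vector_name(unsigned vector)
{
    switch (vector) {
    case 2:  return "access error";
    case 3:  return "address error";
    case 4:  return "illegal instruction";
    default: return "other";
    }
}
```

Under these assumptions, a frame word of 0x440C2000 decodes to format 4,
vector 3, fault status 4 and SR 0x2000, which are the values shown in the
u-boot dump above; the uClinux trap would carry vector 4 instead.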
> 
> 
> 
> About power, I recently changed the power supply circuit on my
> custom board, using a more reliable switching 3.3 V regulator, 2 A max
> (total board consumption is about 400 mA). I will check for noise and
> see if there are any stability issues.
> 
> About SDRAM, do you know a method to be sure it's working correctly? Is
> there a specific test inside Linux?
> I am thinking of running a continuous loop test from the u-boot RAM test
> section, to rule out uClinux.
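
As an illustration of the kind of looped test meant above, here is a minimal
sketch in C of a three-stage RAM check: walking ones on the data bus,
power-of-two offsets on the address bus, then a pattern fill and verify over
every word. It is not the u-boot test itself; all names (ram_test, the result
codes, the seed constant) are hypothetical and chosen for this example. It
assumes a 32-bit wide region that holds no code, stack or live data, since
the test destroys the contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result codes are hypothetical names chosen for this sketch. */
enum ram_test_result {
    RAM_OK = 0,
    RAM_DATA_BUS_FAULT,
    RAM_ADDR_BUS_FAULT,
    RAM_CELL_FAULT
};

/* Walk a single 1 bit across one word to catch stuck or shorted data lines. */
static int data_bus_ok(volatile uint32_t *word)
{
    for (uint32_t bit = 1; bit != 0; bit <<= 1) {
        *word = bit;
        if (*word != bit)
            return 0;
    }
    return 1;
}

/* Put a marker at every power-of-two offset, then change one offset at a
 * time and check that no other location follows it (aliased address lines). */
static int addr_bus_ok(volatile uint32_t *base, size_t words)
{
    const uint32_t marker = 0xAAAAAAAAu, probe = 0x55555555u;

    base[0] = marker;
    for (size_t off = 1; off < words; off <<= 1)
        base[off] = marker;

    base[0] = probe;
    for (size_t off = 1; off < words; off <<= 1)
        if (base[off] != marker)
            return 0;
    base[0] = marker;

    for (size_t test = 1; test < words; test <<= 1) {
        base[test] = probe;
        if (base[0] != marker)
            return 0;
        for (size_t off = 1; off < words; off <<= 1)
            if (off != test && base[off] != marker)
                return 0;
        base[test] = marker;
    }
    return 1;
}

/* Fill every word with an index-derived value, verify it, then repeat
 * with the inverted value so each bit is tested as both 0 and 1. */
static int cells_ok(volatile uint32_t *base, size_t words, uint32_t seed)
{
    for (size_t i = 0; i < words; i++)
        base[i] = (uint32_t)i ^ seed;

    for (size_t i = 0; i < words; i++) {
        uint32_t v = (uint32_t)i ^ seed;
        uint32_t inv = ~v;
        if (base[i] != v)
            return 0;
        base[i] = inv;
    }

    for (size_t i = 0; i < words; i++) {
        uint32_t inv = ~((uint32_t)i ^ seed);
        if (base[i] != inv)
            return 0;
    }
    return 1;
}

/* Run all three stages 'passes' times, stopping at the first failure. */
enum ram_test_result ram_test(volatile uint32_t *base, size_t words,
                              unsigned passes)
{
    assert(base != NULL && words > 0);

    for (unsigned p = 0; p < passes; p++) {
        uint32_t seed = 0x9E3779B9u * (p + 1u);

        if (!data_bus_ok(base))
            return RAM_DATA_BUS_FAULT;
        if (!addr_bus_ok(base, words))
            return RAM_ADDR_BUS_FAULT;
        if (!cells_ok(base, words, seed))
            return RAM_CELL_FAULT;
    }
    return RAM_OK;
}
```

On a host machine this only exercises the logic against an ordinary buffer.
On the board it would have to run from flash or from a region outside the
range under test, and marginal SDRAM timing may need many passes (and some
warm-up time) before a fault shows, so a single clean pass proves little.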
> 
> Here is some information about the board:
> 
> /proc # cat cpuinfo
> CPU:            COLDFIRE(m5307)
> MMU:            none
> FPU:            none
> Clocking:       88.4MHz
> BogoMips:       58.98
> Calibration:    29491200 loops
> /proc #
> 
> ~ # cat /proc/version
> uClinux version 2.6.36.2 (angelo@angel7) (gcc version 4.2.4) #134 Wed
> Aug 10 16:01:21 CEST 2011
> 
> ~ # cat /proc/meminfo
> MemTotal:          13864 kB
> MemFree:            7164 kB
> Buffers:              16 kB
> Cached:              124 kB
> SwapCached:            0 kB
> Active:               68 kB
> Inactive:             72 kB
> Active(anon):          0 kB
> Inactive(anon):        0 kB
> Active(file):         68 kB
> Inactive(file):       72 kB
> Unevictable:           0 kB
> Mlocked:               0 kB
> MmapCopy:            552 kB
> SwapTotal:             0 kB
> SwapFree:              0 kB
> Dirty:                 0 kB
> Writeback:             0 kB
> AnonPages:             0 kB
> Mapped:                0 kB
> Shmem:                 0 kB
> Slab:               1208 kB
> SReclaimable:         60 kB
> SUnreclaim:         1148 kB
> KernelStack:         100 kB
> PageTables:            0 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:        6932 kB
> Committed_AS:          0 kB
> VmallocTotal:          0 kB
> VmallocUsed:           0 kB
> VmallocChunk:          0 kB
> ~ #
> 
> 
> regards,
> angelo
> 
> 
> On 12/08/2011 05:28, Greg Ungerer wrote:
>> Hi Angelo,
>>
>> On 11/08/11 19:52, angelo wrote:
>>> working on a port of u-boot for mcf5307 i have found a major issue
>>> due to an "errata" of this chip:
>>>
>>> from MCF5307ER pdf:
>>>
>>> 35 Corrupted Return PC in Exception Stack Frame
>>>
>>> 35.1 Description
>>> When processing an autovectored interrupt an error can occur that
>>> causes 0xFFFFFFFF to be written as
>>> the return PC value in the exception stack frame. The problem is
>>> caused by a conflict between an internal
>>> autovector access and a chip select mapped to the IACK address space
>>> (0xFFFFXXXX).
>>>
>>> 35.2 Workaround
>>> • Set the C/I bit in the chip select mask register (CSMR) for the
>>> chip select that is mapped to
>>> 0xFFFFXXXX. This will prevent the chip select from asserting for IACK
>>> accesses.
>>> • Remap the chip select to a different address range.
>>> • Use external logic to provide external vectors for all interrupts
>>> instead of autovectoring.
>>> MASKS: 0H55J, 1H55J, 1J20C, 2J20C01/22/04
>>>
>>>
>>>
>>> from time to time, on my mcf5307 board (uClinux + mainline kernel),
>>> I get the following trap exception, and since the call trace always
>>> passes through an interrupt, and I am getting the same trap I was
>>> getting in u-boot, I suspect that the issue is the same.
>>>
>>>
>>> ~ # *** ILLEGAL INSTRUCTION *** FORMAT=4
>>> Current process id is 0
>>> BAD KERNEL TRAP: 00000000
>>> PC: [<0002e500>]
>>
>> But your trap PC is not 0xFFFFFFFF as per the errata above?
>> I would not think you are seeing this problem.
>>
>> Over the years there are probably 2 main culprits I have seen for
>> sporadic, hard to explain, traps on ColdFire boards. Since you say
>> you have problems with both uboot and uClinux it may be time to
>> check these:
>>
>> 1. bad power. Boards being run from wall wart type power supplies
>>    that just don't deliver good clean power
>> 2. bad DRAM timing. Can be very subtle, and hard to diagnose.
>>
>>
>>> SR: 2714 SP: 001bdf40 a2: 0016148c
>>> d0: 00000000 d1: 00e80000 d2: 0000001e d3: 00000000
>>> d4: 00000000 d5: 00ffd440 a0: 001bd000 a1: 001ab0e8
>>> Process swapper (pid: 0, stackpage=001ac0e8)
>>> Stack from 001bdf74:
>>> 000207a0 00ffdb94 00000000 00022ec6 0000001e 001bdf8c 00e80000 00ffdb94
>>> 00000000 00000000 00ffd440 001bd008 001ab0e8 0016148c 00000000 ffffffff
>>> 00000000 40782000 000208d8 00020a08 00020a0e 001608e4 001ab0e8 001ca5f8
>>> 001ca708 001be944 00000d6d 00000d6d 001cc000 00ffdb08 00ed8abc 00ffbf00
>>> 001ca86c 00ed8708 000200d8
>>> Call Trace with CONFIG_FRAME_POINTER disabled:
>>> [000207a0] do_IRQ+0x1e/0x5a
>>> [00022ec6] inthandler+0x6a/0x74
>>> [0016148c] schedule+0x0/0x30e
>>> [000208d8] default_idle+0x22/0x40
>>> [00020a08] cpu_idle+0x1a/0x20
>>> [00020a0e] kernel_thread+0x0/0x3c
>>> [001608e4] rest_init+0x6c/0x72
>>> [001ca5f8] _einittext+0x0/0x0
>>> [001be944] start_kernel+0x274/0x280
>>> [000200d8] _exit+0x0/0x8
>>>
>>> Disabling lock debugging due to kernel taint
>>> Kernel panic - not syncing: Attempted to kill the idle task!
>>> Stack from 001bdddc:
>>> 001bdf34 00029ae0 001961e6 001cc297 001cc297 00000400 001963cd 001bde24
>>> 00000001 001ab0e8 0000000b 00000000 001bd000 001bdf40 000d1d10 001bde24
>>> 0002ca90 001963cd 00000004 00000000 00000000 00ffd440 00000000 00ee8b76
>>> 0002a0a2 001bdf40 000d1d10 0002a0a2 001bdf34 00196170 00000004 00022656
>>> 0000000b 00000007 00000000 001bdf74 0019546c 001ab2ac 00000000 001ac0e8
>>> 001bdf40 0002a0a2 000226fe 001954f3 001bdf40 00000000 001954d6 00000000
>>> Call Trace with CONFIG_FRAME_POINTER disabled:
>>> [00029ae0] panic+0x60/0x1be
>>> [001961e6] __func__.34039+0x201d2/0x34c70
>>> [001963cd] __func__.34039+0x203b9/0x34c70
>>> [000d1d10] strlen+0x0/0x1a
>>> [0002ca90] do_exit+0x648/0x6cc
>>> [001963cd] __func__.34039+0x203b9/0x3
>>> [0002a0a2] printk+0x0/0x1c
>>> [000d1d10] strlen+0x0/0x1a
>>> [0002a0a2] printk+0x0/0x1c
>>> [00196170] __func__.34039+0x2015c/0x34c70
>>> [00022656] die_if_kernel+0xd4/0xda
>>> [0019546c] __func__.34039+0x1f458/0x34c70
>>> [0002a0a2] printk+0x0/0x1c
>>> [000226fe] bad_super_trap+0xa2/0xb0
>>> [001954f3] __func__.34039+0x1f4df/0x34c70
>>> [001954d6] __func__.34039+0x1f4c2/0x34c70
>>> [0016148c] schedule+0x0/0x30e
>>> [0002278a] trap_c+0x30/0x3da
>>> [00026c88] wake_up_process+0x0/0x16
>>> [0002a0a2] printk+0x0/0x1c
>>> [0002a0a2] printk+0x0/0x1c
>>> [00026c88] wake_up_process+0x0/0x16
>>> [00032d4a] update_process_times+0x40/0x4a
>>> [0003277c] run_timer_softirq+0x14/0x234
>>> [0004b00a] rcu_bh_qs+0x0/0x18
>>> [00020598] trap+0x5c/0x64
>>> [0016148c] schedule+0x0/0x30e
>>> [0002e500] irq_enter+0x2e/0x3a
>>> [000207a0] do_IRQ+0x1e/0x5a
>>> [00022ec6] inthandler+0x6a/0x74
>>> [0016148c] schedule+0x0/0x30e
>>> [000208d8] default_idle+0x22/0x40
>>> [00020a08] cpu_idle+0x1a/0x20
>>> [00020a0e] kernel_thread+0x0/0x3c
>>> [001608e4] rest_init+0x6c/0x72
>>> [001ca5f8] _einittext+0x0/0x0
>>> [001be944] start_kernel+0x274/0x280
>>> [000200d8] _exit+0x0/0x8
>>>
>>> Has anyone experienced something like this?
>>> Any help is appreciated; in the meantime I am continuing to investigate
>>> inside the kernel and will let you know.
>>
>> I don't use 5307 parts too much these days, but a few years back
>> used them a lot. And in general uClinux is very reliable on them.
>> What version kernel are you running?
>>
>> Regards
>> Greg
>>
>>
>> ------------------------------------------------------------------------
>> Greg Ungerer  --  Principal Engineer        EMAIL:     g...@snapgear.com
>> SnapGear Group, McAfee                      PHONE:       +61 7 3435 2888
>> 8 Gardner Close                             FAX:         +61 7 3217 5323
>> Milton, QLD, 4064, Australia                WEB: http://www.SnapGear.com
> 
> 
> -- 
> 
>  .:.:.SYSAM.:.:.
> 
>   di Angelo Dureghello
>     via San Nazario 149
>       34151, Trieste, Italy
>     ++39 340 7631990
>   www.sysam.it <http://www.sysam.it>
> 
> 
