Hi Glenn,

We are running many bof files with that change and are doing plenty of
register and bram reads and writes and have not experienced any issues with
these bus accesses. What version of TCPBorphServer are you running?

Wes


On Fri, Mar 15, 2013 at 3:28 PM, G Jones <glenn.calt...@gmail.com> wrote:

> Hi,
> It should have occurred to me sooner, but I checked through the commit
> logs for mlib_devel and remembered I had updated from ska-sa a couple of
> weeks ago to get the bugfix for the rcs block. In doing so, I had also
> pulled down this commit:
>
>
> https://github.com/ska-sa/mlib_devel/commit/bad95b18fe79146d288607e5fe3c0360c071c2ad
> "Simplified the EPB to OPB 32bit bus cycle and now supports legacy byte
> enable support for ROACH 1 modules on ROACH 2."
>
> which sounds suspicious since the problem seemed to be related to reading
> and writing brams/software registers.
>
> Indeed, when I switched over to the commit right before that one and
> compiled the same test design, I ended up with a boffile that has not yet
> crashed (the bad bof would have certainly crashed by now).
>
> The design is simply two ADC5Gs connected to snapshot blocks. The ADCs
> are clocked at 2880 MHz, so the FPGA is running at 180 MHz.  I'm not sure
> if the problem is some interaction between the ADC5Gs and this commit, or
> the clock rate or what.
>
> Henno, can you double check the code in this commit and see if you can
> ascertain where the bug might be?
>
> Glenn
>
> On Thu, Mar 14, 2013 at 12:00 PM, G Jones <glenn.calt...@gmail.com> wrote:
>
>> Hi,
>> For some unknown reason, boffiles I generate with my toolflow cause
>> ROACH 2s to freeze up after a few minutes (I think related to I/O to
>> software registers and shared BRAMs rather than any specific amount of
>> time). I don't know of any changes I made to my toolflow since the
>> last time I compiled working boffiles. Previously working boffiles
>> still work, but recompiled designs do not work. The symptom is that
>> the python katcp client stops responding. SSHing to the ROACH and
>> running ps shows that tcpborphserver3 is no longer running. It finally
>> occurred to me to check dmesg, and on all crashed ROACHs, I see this
>> in the dmesg output:
>>
>> ...
>> About to toggle cpu_rdy pin<7>r2case_event(): Got type 11, code 8, value 1
>> attempting led toggle
>> About to toggle cpu_rdy pin<7>r2case_event(): Got type 11, code 8, value 0
>> attempting led toggle
>> About to toggle cpu_rdy pinMachine check in kernel mode.
>> Data Read PLB Error
>> Oops: Machine check, sig: 7 [#1]
>> PowerPC 44x Platform
>> Modules linked in:
>> NIP: 0fea4048 LR: 0fea3f88 CTR: 00000004
>> REGS: ef00bf10 TRAP: 0214   Not tainted  (3.7.0-rc2+)
>> MSR: 0002d000 <CE,EE,PR,ME>  CR: 20000224  XER: 00000000
>> TASK = efb54060[516] 'tcpborphserver3' THREAD: ef00a000
>> GPR00: 00000000 bfcb7290 48031e20 10628bf9 4802c010 00000004 00000018 7f7f7f7f
>> GPR08: 00000000 10628bf0 10628ba0 0fea3f80 20000222 1006ba18 00000000 00000000
>> GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> GPR24: 00000000 00000000 00000000 00000004 10628bf9 10628bf9 0ff91ff4 4802c011
>> NIP [0fea4048] 0xfea4048
>> LR [0fea3f88] 0xfea3f88
>> Call Trace:
>> ---[ end trace 59d28c137ef7dde2 ]---
>>
>> roach VMA close
>> roach release mem called
>>
>> -----
>>
>> If I then try to reboot the ROACH with shutdown -r now, it hard-freezes
>> and requires a power cycle to get it running again.
>>
>> Any ideas where to look for this problem?
>>
>> Thanks,
>> Glenn
>>
>
>
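
For anyone trying to reproduce the crash described above, here is a minimal
sketch of a register write/read-back stress loop. It is only an illustration
under stated assumptions, not the test anyone in this thread actually ran. It
assumes a client object with write_int(name, value) and read_uint(name)
methods (modeled on what the corr package's FpgaClient appears to provide);
the register name "scratch" and the FakeDevice stand-in are hypothetical and
exist only so the sketch runs without hardware.

```python
import time


class FakeDevice:
    """Hypothetical stand-in for a real katcp client (no hardware needed)."""

    def __init__(self):
        self._regs = {}

    def write_int(self, name, value):
        # Store the value truncated to 32 bits, like a software register.
        self._regs[name] = value & 0xFFFFFFFF

    def read_uint(self, name):
        return self._regs.get(name, 0)


def stress_register(dev, name, n_iters, delay_s=0.0):
    """Write a changing 32-bit pattern to one register and read it back.

    Returns (iterations_completed, error). error is None if every access
    succeeded, otherwise a string describing the first mismatch or exception.
    """
    for i in range(n_iters):
        # Vary all four bytes so byte-enable problems are more likely to show.
        value = (i * 0x01010101) & 0xFFFFFFFF
        try:
            dev.write_int(name, value)
            got = dev.read_uint(name)
        except Exception as exc:
            # On real hardware, a dead server would surface here.
            return i, "bus access failed: %r" % (exc,)
        if got != value:
            return i, "mismatch: wrote 0x%08x, read 0x%08x" % (value, got)
        if delay_s:
            time.sleep(delay_s)
    return n_iters, None


if __name__ == "__main__":
    # "scratch" is a hypothetical register name; substitute one from the design.
    done, err = stress_register(FakeDevice(), "scratch", 1000)
    print(done, err)
```

On real hardware, one would pass a connected client in place of FakeDevice
and run the same loop against both the bad and the known-good boffile; the
iteration count at which the loop fails gives a rough measure of how many
bus accesses it takes to trigger the problem.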
