Re: [casper] Roach-2 crashing fix

John Ford Tue, 28 Jul 2015 05:27:40 -0700

> So I confess to relying on third parties for this information, but isn't
> the board populated with 1Gb RAM after all? Would the crash be triggered by
> a kernel memory layout of 3Gb+1Gb rather than 2Gb+2Gb?


Certainly, if the kernel thinks the layout of memory isn't what it really
is, it could crash.  We'll look into this a bit more.

>
> Have you tried the kernel from 9 months ago at github
> ska-sa/roach2_nfs_uboot ?

No, we haven't, as far as I know.

John

>
> regards
>
> marc
>
>
> On Wed, Jun 24, 2015 at 11:49 PM, John Ford <jf...@nrao.edu> wrote:
>
>> Hi all.
>>
>> We were having problems with multiple sequential progdev calls failing
>> on our ROACH-2 systems.  We were testing multiple bof files in a loop,
>> and the roach would fall over and crash completely, and after the
>> kernel panic, it would reboot itself.
>>
>> After a great deal of concentrated debugging effort this afternoon by
>> Jack, David, Justin, Ryan, Arindam, Randy, and me, the cause of the
>> crashing upon multiple progdev calls was found.  It turned out to have
>> nothing to do with programming the chip; rather, it was a problem with
>> memory allocation by the operating system.  Jack found that the problem
>> could also be caused by allocating a huge array in Python, using lots
>> of memory.
>>
>> The problem was caused by the kernel thinking that the ROACH has 768 MB
>> of memory on board, when in fact it has only 512 MB.  The fix is to
>> pass the real amount of memory to the kernel in the bootargs.  The
>> systems have been mostly working for a long time (years!), so you may
>> want to check that your systems in fact know how much memory they have.
>> If you start up top you can see what the kernel thinks, or look in
>> /proc/meminfo.
>>
>> John
>>
>>
>>
>>
>>
>>
>
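
The /proc/meminfo check suggested in the quoted message could be scripted
roughly as follows. This is only a sketch and not something from the thread:
the function names and the EXPECTED_MB constant are hypothetical, and the
512 MB figure is the one quoted above, so adjust it for your board. It relies
on MemTotal normally being a little below the physical RAM size, so a value
above the expected size suggests the kernel has been told about memory that
is not there.

```python
import os
import re

# Assumption: physical RAM on the board, in MB (512 MB per the message above).
EXPECTED_MB = 512


def mem_total_mb(meminfo_text):
    """Return the MemTotal value from /proc/meminfo text, converted to MB."""
    # /proc/meminfo reports MemTotal in kB, e.g. "MemTotal:   516096 kB".
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal not found in meminfo text")
    return int(match.group(1)) / 1024.0


def kernel_overestimates(meminfo_text, expected_mb=EXPECTED_MB):
    """True if the kernel reports more memory than the board should have."""
    return mem_total_mb(meminfo_text) > expected_mb


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        text = f.read()
    print("kernel sees %.0f MB, board expected to have %d MB"
          % (mem_total_mb(text), EXPECTED_MB))
    if kernel_overestimates(text):
        print("WARNING: kernel believes it has more memory than expected")
```

Run on the ROACH itself, a warning here would be a hint to look at the
memory size passed in the bootargs.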
