Seeing a similar problem:

Assertion "tcp_input: pcb->next != pcb (before cache)" failed at line 182
in <...>/core/tcp_in.c

I have two machines, one ARM and one i386, running the same code. I can
reproduce it consistently on the ARM, but I don't see it on i386.

The lwIP task is running with NO_SYS=1 (as one task in a multitasking
environment).

Will investigate over the next few days. Any hints welcome.
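
For anyone else trying to narrow this down: the assertion above only
catches a pcb whose next pointer is itself. A minimal sketch of a broader
sanity check, which also catches longer loops, is below. It uses Floyd's
tortoise-and-hare walk. The names `struct node` and `list_is_terminated`
are hypothetical stand-ins, not lwIP identifiers; a real check would walk
the actual pcb list type instead.

```c
#include <stddef.h>

/* Hypothetical stand-in for a singly linked pcb list.
 * Only the `next` pointer matters for this check. */
struct node {
    struct node *next;
};

/* Returns 1 if the list starting at `head` ends in NULL,
 * 0 if it contains a loop (including an element pointing at itself).
 * Floyd's algorithm: `fast` advances two steps per iteration, `slow` one;
 * they can only meet again if the list loops back on itself. */
static int list_is_terminated(const struct node *head)
{
    const struct node *slow = head;
    const struct node *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast) {
            return 0; /* loop detected */
        }
    }
    return 1; /* reached the NULL terminator */
}
```

Calling something like this before and after each list modification should
show which operation leaves the list corrupted, rather than only noticing
it later in tcp_input.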



On Wed, Oct 14, 2015 at 11:03 PM, Sylvain Rochet <[email protected]>
wrote:

> Hi Stephen,
>
> On Wed, Oct 14, 2015 at 09:13:59AM -0500, Stephen Cowell wrote:
> > Hey Enrico,
> > I'm using GNU toolchain/compiler, supplied with Atmel Studio 6.1.
> > Since I've added the code I've had no other problems; I really don't
> > have much time to research this, what with other pressures at work.
> >
> > It seems the issue is not unknown... sometimes the pcb ends up pointing
> > to itself.  These times appear to be correlated with high-stress I/O.
> >
> > Obviously the last pcb should point to null... and it should never point
> > to itself.  It is easy enough to catch it pointing to itself and make
> > that null.  I verified that this was the first pcb, and that we weren't
> > going to have a memory leak when we just terminated the list.  I did not
> > have the resources to chase down when the pointer to self happened...
> > I only know that it does, and that the pcb that this happens to is
> > at the first allocated pcb address.  The obvious thing to do was to
> > correct the pointer to break the endless loop... seems to work.
> >
> > As Sylvain wrote, the Atmel port has some serious differences from
> > what he's used to seeing... I'm assuming this has something to do
> > with it.  As I get more time (the product ships soon) I'll be able to
> > spend some more time on this issue.  I'm just glad to get it out there
> > and let others know it's happening.
>
> A linked list corruption is a very serious problem; you really must not
> ship your product with such a known bug. Your workaround only mitigates
> one common corruption pattern of a linked list, and there are others. It
> will break sooner or later with another pattern.
>
> If a linked list is corrupted, it's because there is a reentrancy problem
> in the functions modifying the linked list, which really limits the scope
> where reentrancy can occur. We have critical sections for !NO_SYS
> systems; you could use the critical section hooks to check whether
> reentrancy constraints are respected:
> SYS_ARCH_DECL_PROTECT/SYS_ARCH_PROTECT/SYS_ARCH_UNPROTECT.
>
> At least, if you want to ship your product very quickly, just define
> those hooks to something appropriate (those are recursive locks, so
> you'll have to take care of that) and you should be safe, for now.
>
> Sylvain
>
> _______________________________________________
> lwip-users mailing list
> [email protected]
> https://lists.nongnu.org/mailman/listinfo/lwip-users
>