Re: Upcoming 19.07.7 release

2021-02-07 Thread Jaap Buurman
On Sun, Feb 7, 2021 at 11:23 AM Baptiste Jonglez wrote:
>
> On 05-02-21, Jaap Buurman wrote:
> > > Hi,
> > >
> > > We are planning a new 19.07 release in about a week (probably next 
> > > week-end).
> > >
> > > If you are aware of changes that need to be integrated, now is the time to
> > > do it or mention it here!
> > >
> > > I plan to test & integrate a workaround for this ramips stability issue:
> > > https://bugs.openwrt.org/index.php?do=details&task_id=2628
> > >
> > > Baptiste
> > >
> > > PS: please don't ask about 21.XX
> >
> > Dear Baptiste,
> >
> > Out of interest, what workaround for the RAMIPS stability issue are
> > you planning on using? Is it the disabling of TSO as discussed in the
> > bug report you have linked to?
>
> Yes, that's the idea, if it turns out to be not too intrusive.
>
> It's clearly a workaround for some other issue: as stated in the bug
> report, we use another driver in 5.4 and that one has no issue with TSO.
>
> Baptiste

Are we sure disabling TSO is the actual fix, though? There are a few
reasons why I doubt that assessment:

1. One user reports that he has always been running with TSO disabled,
yet he still experiences the bug:
https://forum.openwrt.org/t/mtk-soc-eth-watchdog-timeout-after-r11573/5/389?u=mushoz
2. TSO seems fine with the master branch according to user reports.
3. The user "mrakotiq" suggested a patch to disable TSO in the bug
report you linked to, but that patch also disables
NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX. The reason given was
that he was seeing packets getting tagged that shouldn't have been (at
least that is how I understand his post on the bug report). So there's
obviously also something wrong with this functionality, and it would
not surprise me if this change is what actually fixes the issue.
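
To tell the two changes apart, the offloads could be switched off one at a time at runtime instead of patching the driver. The following is only a sketch: it assumes ethtool is installed, that the interface is called eth0 (a placeholder), and that the 19.07 driver honours these toggles, none of which I have verified. The helpers only print the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Print (do not execute) the ethtool command that disables one offload
# feature. "tso", "txvlan" and "rxvlan" are ethtool's short feature names.
offload_off_cmd() {
    iface="$1"     # interface name, e.g. eth0 (placeholder)
    feature="$2"   # one of: tso, txvlan, rxvlan
    printf 'ethtool -K %s %s off\n' "$iface" "$feature"
}

# Print one command per feature touched by the suggested patch.
print_all_offload_cmds() {
    for f in tso txvlan rxvlan; do
        offload_off_cmd "$1" "$f"
    done
}
```

Running `print_all_offload_cmds eth0` lists three commands. Applying only the TSO one, or only the two VLAN ones, would show which change (if any) makes the timeouts go away.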

Having said that, this bug is age-old and is affecting a lot of users,
me included, so I'd really like to get it fixed. If there are no
regressions with this approach, the best way forward might be to simply
adopt the patch he suggested as a workaround until we're on 21.xx with
the DSA driver, especially since this user is reporting no more issues
with 75 (!) mt7621 routers in his production network, which is a
rather large sample size. Thoughts?

Jaap

___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


RE: Upcoming 19.07.7 release

2021-02-05 Thread Jaap Buurman
> Hi,
>
> We are planning a new 19.07 release in about a week (probably next week-end).
>
> If you are aware of changes that need to be integrated, now is the time to
> do it or mention it here!
>
> I plan to test & integrate a workaround for this ramips stability issue:
> https://bugs.openwrt.org/index.php?do=details&task_id=2628
>
> Baptiste
>
> PS: please don't ask about 21.XX

Dear Baptiste,

Out of interest, what workaround for the RAMIPS stability issue are
you planning on using? Is it the disabling of TSO as discussed in the
bug report you have linked to?

Jaap



mt7621 / mt7530 programming: Disabling Flow Control on All Ports

2020-10-06 Thread Jaap Buurman
Dear all,

I am trying to disable flow control on all ports of my mt7621 device
with the mt7530 switch, but I am having difficulty getting this to
work. According to Mediatek's documentation, setting the 15th bit
(FORCE_MODE_PU) of a MAC's register to 1 allows you to force certain
settings. Setting the 4th and 5th bits (FORCE_TX_FC_PU and
FORCE_RX_FC_PU respectively) to 0 should then force flow control off
for both the TX and RX streams. However, when I write these values to
the correct registers with the existing mt7530_mdio_w32 function, flow
control still seems to be enabled and advertised on the router's side.
Running ethtool on my desktop gives me the following:

Link partner advertised pause frame use: Symmetric

Is there anyone that knows how to force flow control off on ALL ports?
In the following topic you can read about what I've already tried:
https://forum.openwrt.org/t/mt7621-mt7530-programming-disabling-flow-control-on-all-ports/76006/4
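
For reference, the bit manipulation I am attempting can be written out as shell arithmetic. This is a sketch under the assumption that the bits are counted from zero, so FORCE_MODE_PU is bit 15 and the two flow control bits are bits 4 and 5. The helper name is my own and nothing here talks to the hardware.

```shell
#!/bin/sh
# Given a current register value, print the value with the force-mode bit
# set and both flow control bits cleared.
# Assumption: bits are numbered from zero (bit 15 = FORCE_MODE_PU,
# bits 4 and 5 = FORCE_TX_FC_PU / FORCE_RX_FC_PU).
force_fc_off() {
    value="$1"
    force_mode=$((1 << 15))
    fc_mask=$(( (1 << 4) | (1 << 5) ))
    printf '0x%x\n' $(( ($value | $force_mode) & ~$fc_mask ))
}
```

The result would then be the value to pass to mt7530_mdio_w32 for each port's MAC register. If the bit numbering in the documentation is one-based instead, the shifts need to be adjusted accordingly.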

Thank you very much in advance!

Yours sincerely,

Jaap Buurman



Re: ramips: ethernet: fix to interrupt handling

2020-09-05 Thread Jaap Buurman
Dear Rosen,

On Fri, Sep 4, 2020 at 10:50 PM Rosen Penev wrote:
>
> On Fri, Sep 4, 2020 at 7:01 AM Jaap Buurman wrote:
> >
> > Dear all,
> >
> > Is there a reason the "ramips: ethernet: fix to interrupt handling"
> > patch never was included in the master branch or 19.07 branch?:
> >
> > https://patchwork.ozlabs.org/project/openwrt/patch/20191029172328.85861-2-ros...@gmail.com/
> I doubt anyone cares. That patch was never merged. The issue was
> "avoided" by switching drivers.

That's very unfortunate to hear, since the 19.07 branch (which doesn't
use the DSA drivers) is still the latest stable branch and is widely
used, especially as the patch is already available and only needs to
be merged.

> >
> > There are many interrupt errors on my mt7621 device, and there have
> > been reports that this patch brings those errors massively down or
> > even eliminates them all together.
> https://github.com/openwrt/openwrt/commit/b5d425af237dc03327078d6b9be178a38b5f8723
> is very interesting in that regard. Some of those patches can be
> backported I think.

I have seen these as well. I haven't had time to test them yet, but I
am hoping they fix the transmit queue timeout that has plagued mt7621
for a very long time now. Fingers crossed!

> >
> > Yours sincerely,
> >
> > Jaap
> >

Kind regards,

Jaap



ramips: ethernet: fix to interrupt handling

2020-09-04 Thread Jaap Buurman
Dear all,

Is there a reason the "ramips: ethernet: fix to interrupt handling"
patch was never included in the master branch or the 19.07 branch?

https://patchwork.ozlabs.org/project/openwrt/patch/20191029172328.85861-2-ros...@gmail.com/

There are many interrupt errors on my mt7621 device, and there have
been reports that this patch reduces those errors massively or even
eliminates them altogether.

Yours sincerely,

Jaap



Re: MT7621 Flow Control

2020-08-17 Thread Jaap Buurman
Dear Kristian,

Your watchdog script gave me the idea to try something similar. I have
now made a script with the following line:
logread -f | awk '/transmit timed out/ {system("/etc/init.d/network restart")}'

It runs continuously in the background. Is this similar to how your
script operates? The network restart command takes a while to
complete, so I was wondering whether there is a command that also
restores connectivity but results in a shorter break than the current
one. Thank you!
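
One weakness of the one-liner is that several timeout messages in quick succession would each trigger a restart. A slightly longer sketch that enforces a minimum gap between restarts could look like this. The 60-second gap and the helper names are my own assumptions, not anything taken from Kristian's script.

```shell
#!/bin/sh
# Succeeds when at least min_gap seconds have passed since the last restart.
should_restart() {
    last="$1"      # epoch seconds of the previous restart (0 = never)
    now="$2"       # current epoch seconds
    min_gap="$3"   # minimum number of seconds between two restarts
    [ $((now - last)) -ge "$min_gap" ]
}

# Follow the system log and restart networking on a transmit timeout,
# but never more often than once per minute (assumed value).
watch_timeouts() {
    last_restart=0
    logread -f | while read -r line; do
        case "$line" in
            *"transmit timed out"*)
                now=$(date +%s)
                if should_restart "$last_restart" "$now" 60; then
                    /etc/init.d/network restart
                    last_restart=$now
                fi
                ;;
        esac
    done
}
```

Calling `watch_timeouts &` from a startup script would keep it running in the background, in the same way as the one-liner.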

Best regards,

Jaap

On Fri, Aug 7, 2020 at 10:09 AM Kristian Evensen wrote:
>
> Hello,
>
> On Thu, Aug 6, 2020 at 1:44 PM Jaap Buurman wrote:
> > However, on this mailing list a user by the name of Kristian claims
> > that disabling flow control helps fix this problem, as can be read
> > here: 
> > https://lists.openwrt.org/pipermail/openwrt-devel/2017-November/009882.html
>
> My patch unfortunately does not solve the problem, as I can still see
> the timeout error. However, by disabling flow control, the frequency
> of the error is decreased to the point where it almost never happens
> (even on devices where I would frequently see the error). In order to
> deal with the remaining timeout cases, I wrote a small watchdog script
> that checks syslog for the timeout message and restarts networking if
> the error occurs. A restart has always been able to recover networking
> (at the cost of a small interruption).
>
> Kristian



Re: MT7621 Flow Control

2020-08-07 Thread Jaap Buurman
On Fri, Aug 7, 2020 at 10:09 AM Kristian Evensen wrote:
>
> Hello,
>
> On Thu, Aug 6, 2020 at 1:44 PM Jaap Buurman wrote:
> > However, on this mailing list a user by the name of Kristian claims
> > that disabling flow control helps fix this problem, as can be read
> > here: 
> > https://lists.openwrt.org/pipermail/openwrt-devel/2017-November/009882.html
>
> My patch unfortunately does not solve the problem, as I can still see
> the timeout error. However, by disabling flow control, the frequency
> of the error is decreased to the point where it almost never happens
> (even on devices where I would frequently see the error). In order to
> deal with the remaining timeout cases, I wrote a small watchdog script
> that checks syslog for the timeout message and restarts networking if
> the error occurs. A restart has always been able to recover networking
> (at the cost of a small interruption).
>
> Kristian

Dear Kristian,

Thank you very much for your input. It's unfortunate to hear you were
unable to fix the issue completely, but good to hear the frequency of
it happening has gone way down. I have two questions for you if you
wouldn't mind:

1) Would you perhaps be willing to share the watchdog script? It would
be very useful to me, and probably to many others with the same
issue.
2) Am I correct in asserting that the above-mentioned patch, which is
supposed to disable flow control, never changed anything? I am not sure
I saw the issue crop up any less often after that patch landed, which
could be explained if the patch never disabled flow control as it was
intended to do.

Thank you!

Jaap



Re: MT7621 Flow Control

2020-08-07 Thread Jaap Buurman
On Thu, Aug 6, 2020 at 2:35 PM John Crispin wrote:
>
>
> On 06.08.20 14:31, Andre Valentin wrote:
> > Hi Jaap,
> >
> >
> > Am 06.08.20 um 13:43 schrieb Jaap Buurman:
> >> Dear all,
> >>
> >> I have noticed the flow control work for mt7621 in the following
> >> Openwrt patch: 
> >> https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=c8f8e59816eca49d776562d2d302bf990a87faf0
> >>
> >> However, the problem that the patch is supposed to fix is still
> >> occurring, even in combination with other experimental patches
> >> submitted. These experiences can be read about here:
> >> https://forum.openwrt.org/t/mtk-soc-eth-watchdog-timeout-after-r11573/5/
> >>
> >> However, on this mailing list a user by the name of Kristian claims
> >> that disabling flow control helps fix this problem, as can be read
> >> here: 
> >> https://lists.openwrt.org/pipermail/openwrt-devel/2017-November/009882.html
> >>
> >> From what I understood, he was running many mt7621 devices
> >> commercially, with many of them experiencing the issue, which were all
> >> fixed with his own flow control patch. My question is why the decision
> >> was made to only disable flow control on port 5 in the above mentioned
> >> Openwrt patch? AFAIK, Kristian's own patch disables flow control on
> >> all of the ports and he claims the issue is fixed for him. Perhaps the
> >> current patch should be extended to disable flow control on all ports?
> >> What are people's thoughts on this?
> > I'm facing the same issue more often than before, now that I have
> > upgraded to the 5.4 kernel.
> > Every second reboot with 5.4 fails with this timeout error.
> >
> >> Yours sincerely,
> >>
> >> Jaap
> > André
> >
> >
> from previous discussions with MTK and looking at the SDK code, the flow
> control should always be disabled.
>
>  John
>

Dear all,

I had a look at the actual patch to see if it would be possible for me
to extend it to disable flow control on all ports. However, I am
unsure if the patch in its current form even does anything, even for
port 5. Here is the diff of the commit in question:
https://git.openwrt.org/?p=openwrt/openwrt.git;a=blobdiff;f=target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c;h=232bcd8cf4ea5edbd34d815032ce72b1f1f85661;hp=89be23900738095a8180532d5dd7e585f01bb7c4;hb=c8f8e59816eca49d776562d2d302bf990a87faf0;hpb=3e11ddaf2ede4f105bc9ac91229623526371a7a2

In the old code, the IF statement was used to check silicon revision.
As this code only runs on mt7621 devices, this apparently always
evaluates to TRUE, which means the ELSE statement is never run.
Therefore, the IF condition is removed. However, let's see what was in
the original IF block:

/* (GE1, Force 1000M/FD, FC ON, MAX_RX_LENGTH 1536) */
mtk_switch_w32(gsw, 0x2305e30b, GSW_REG_MAC_P0_MCR);
mt7530_mdio_w32(gsw, 0x3600, 0x5e30b);

As we can see, some values are written to the switch's registers. The
included comment states "FC ON" implying that flow control is enabled,
as expected (since this was the old code).

Now if we look at the new code, we see this:

/* (GE1, Force 1000M/FD, FC OFF, MAX_RX_LENGTH 1536) */
mtk_switch_w32(gsw, 0x2305e30b, GSW_REG_MAC_P0_MCR);
mt7530_mdio_w32(gsw, 0x3600, 0x5e30b);

Here the included comment states "FC OFF" implying that flow control
is disabled. HOWEVER, if we look at the values written and the
location where they are written to, they are identical in the old and
new code. Hence this commit doesn't seem to actually change anything.
Am I reading this correctly, or am I missing something?
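
The comparison can be double-checked mechanically. The sketch below XORs two register values to see whether they differ at all, and pulls out individual bits. It assumes, without confirmation, that bits 4 and 5 of this register are the TX/RX flow control bits and that bit 15 is the force-mode bit, all counted from zero. The helper names are my own.

```shell
#!/bin/sh
# Succeeds when two register values are bit-for-bit identical.
same_value() {
    [ $(( $1 ^ $2 )) -eq 0 ]
}

# Print the two flow control bits (assumed to be bits 4 and 5) as 0..3.
fc_bits() {
    echo $(( ($1 >> 4) & 3 ))
}

# Print the force-mode bit (assumed to be bit 15) as 0 or 1.
force_bit() {
    echo $(( ($1 >> 15) & 1 ))
}
```

`same_value 0x5e30b 0x5e30b` succeeds for the pair of values quoted from the old and new code, which is the point made above. The output of fc_bits and force_bit is only meaningful if the assumed bit layout is right.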

Jaap



Re: MT7621 Flow Control

2020-08-06 Thread Jaap Buurman
On Thu, Aug 6, 2020 at 2:35 PM John Crispin wrote:
>
>
> On 06.08.20 14:31, Andre Valentin wrote:
> > Hi Jaap,
> >
> >
> > Am 06.08.20 um 13:43 schrieb Jaap Buurman:
> >> Dear all,
> >>
> >> I have noticed the flow control work for mt7621 in the following
> >> Openwrt patch: 
> >> https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=c8f8e59816eca49d776562d2d302bf990a87faf0
> >>
> >> However, the problem that the patch is supposed to fix is still
> >> occurring, even in combination with other experimental patches
> >> submitted. These experiences can be read about here:
> >> https://forum.openwrt.org/t/mtk-soc-eth-watchdog-timeout-after-r11573/5/
> >>
> >> However, on this mailing list a user by the name of Kristian claims
> >> that disabling flow control helps fix this problem, as can be read
> >> here: 
> >> https://lists.openwrt.org/pipermail/openwrt-devel/2017-November/009882.html
> >>
> >> From what I understood, he was running many mt7621 devices
> >> commercially, with many of them experiencing the issue, which were all
> >> fixed with his own flow control patch. My question is why the decision
> >> was made to only disable flow control on port 5 in the above mentioned
> >> Openwrt patch? AFAIK, Kristian's own patch disables flow control on
> >> all of the ports and he claims the issue is fixed for him. Perhaps the
> >> current patch should be extended to disable flow control on all ports?
> >> What are people's thoughts on this?
> > I'm facing the same issue more often than before, now that I have
> > upgraded to the 5.4 kernel.
> > Every second reboot with 5.4 fails with this timeout error.
> >
> >> Yours sincerely,
> >>
> >> Jaap
> > André
> >
> >
> from previous discussions with MTK and looking at the SDK code, the flow
> control should always be disabled.
>
>  John
>

Dear John,

Thank you for the information! Does that mean the current patch might
be insufficient in the sense that it only disables flow control on
port 5, rather than on all ports? Evidently, people are still facing
the issue, which is supposedly fixed for Kristian, who used his own
patch to disable flow control on all ports. What is your opinion on
this?

Jaap



MT7621 Flow Control

2020-08-06 Thread Jaap Buurman
Dear all,

I have noticed the flow control work for mt7621 in the following
OpenWrt patch:
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=c8f8e59816eca49d776562d2d302bf990a87faf0

However, the problem that the patch is supposed to fix is still
occurring, even in combination with other experimental patches
submitted. These experiences can be read about here:
https://forum.openwrt.org/t/mtk-soc-eth-watchdog-timeout-after-r11573/5/

However, on this mailing list a user by the name of Kristian claims
that disabling flow control helps fix this problem, as can be read
here: 
https://lists.openwrt.org/pipermail/openwrt-devel/2017-November/009882.html

From what I understood, he was running many mt7621 devices
commercially, with many of them experiencing the issue, which were all
fixed with his own flow control patch. My question is why the decision
was made to only disable flow control on port 5 in the above-mentioned
OpenWrt patch? AFAIK, Kristian's own patch disables flow control on
all of the ports and he claims the issue is fixed for him. Perhaps the
current patch should be extended to disable flow control on all ports?
What are people's thoughts on this?

Yours sincerely,

Jaap



[OpenWrt-Devel] ramips: gsw_mt7621: disable PORT 5 MAC RX/TX flow control by default

2020-05-26 Thread Jaap Buurman
Dear all,

The above patch has been committed for a long while in the master
branch 
(https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=c8f8e59816eca49d776562d2d302bf990a87faf0).
Is there any chance this could be backported to the 19.07 branch as
well, since it's a bug-fix and not a new feature? Thanks!

Yours sincerely,

Jaap



Re: [OpenWrt-Devel] 18.06: Cherry pick mtu fix

2018-06-21 Thread Jaap Buurman
Dear Kevin,

You make some very good points. Is there anyone able and willing
to cherry-pick these 3 commits to the 18.06 branch before the RC1
release is tagged tomorrow?

Yours sincerely,

Jaap Buurman

On Thu, Jun 21, 2018 at 10:27 AM Kevin Darbyshire-Bryant wrote:
>
>
>
> > On 21 Jun 2018, at 08:13, Jaap Buurman wrote:
> >
> > Dear all,
> >
> > The move to kernel 4.14 broke mtu settings larger than 1500 by
> > default, unless the correct mtu was explicitly specified. The
> > following commit fixes this for the mt7621 target:
> > https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=5da2c68d001ee44b15a58639ed03a0ebb6f68020
> >
> > 1) Would it be possible for someone to cherry-pick this to the 18.06
> > branch, so that it can receive widespread testing in the upcoming
> > 18.06 RC1 release?
>
> Personally I’d also pick the preceding mtk driver commit “ec502cd3fe ramips:
> rename ethernet driver folder to the same one that upstream uses” for clarity
> & easier cherry-picks going forward for 18.06. And I’d be very strongly
> tempted to have “9a4253b81f ramips: improve ethernet driver performance with
> GRO/TSO” as well… which gets 18.06 into the same state as master for mtk eth
> drivers.
>
> > 2) From the commit's message I get the impression this isn't an issue
> > with just mt7621, but with all targets that are able to handle a mtu >
> > 1500. Is my impression correct in this regard, and is this something
> > that should be fixed before a 18.06 release? Changing mtu settings >
> > 1500 does sound like basic functionality for routers nowadays.
>
> It is dependent upon the driver setting a suitable max_mtu in the netdev 
> structure *IF* it supports values greater than 1500.
>
> +       if (IS_ENABLED(CONFIG_SOC_MT7621))
> +               netdev->max_mtu = 2048;
>
> The version of driver included with kernel 4.14 at present is too dumb to 
> understand >1500 mtu, which is why openwrt replaces it with a version that 
> (now) does.
>
> That’s my opinion anyway :-)
>
> Kevin
>



[OpenWrt-Devel] 18.06: Cherry pick mtu fix

2018-06-21 Thread Jaap Buurman
Dear all,

The move to kernel 4.14 broke MTU settings larger than 1500 by
default, unless the correct MTU was explicitly specified. The
following commit fixes this for the mt7621 target:
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=5da2c68d001ee44b15a58639ed03a0ebb6f68020

1) Would it be possible for someone to cherry-pick this to the 18.06
branch, so that it can receive widespread testing in the upcoming
18.06 RC1 release?
2) From the commit message I get the impression this isn't an issue
with just mt7621, but with all targets that are able to handle an MTU >
1500. Is my impression correct in this regard, and is this something
that should be fixed before an 18.06 release? Changing MTU settings to
more than 1500 does sound like basic functionality for routers nowadays.
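
To make point 2 concrete: as far as I understand, kernels since 4.10 reject an MTU outside the range a driver advertises, and the generic Ethernet setup defaults that range to 68..1500 unless the driver raises the upper bound. A rough sketch of that check follows, with the bounds passed in by hand and a helper name of my own. It models the behaviour and does not query the kernel.

```shell
#!/bin/sh
# Succeeds when the requested MTU lies within [min_mtu, max_mtu].
mtu_accepted() {
    mtu="$1"
    min_mtu="$2"
    max_mtu="$3"
    [ "$mtu" -ge "$min_mtu" ] && [ "$mtu" -le "$max_mtu" ]
}
```

With the assumed default bounds, `mtu_accepted 1508 68 1500` fails, while a driver that advertises a higher upper bound (2048 is used here purely as an illustration) would let the same request through.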

Yours sincerely,

Jaap Buurman



[OpenWrt-Devel] 18.06 bug: Flow Offload Active Connections

2018-05-29 Thread Jaap Buurman
Dear all,

Whenever flow offload is enabled (either software or hardware), I see
a very large number of active connections on the LuCI overview page,
sometimes thousands. The "connections" part of the "realtime graphs" in
LuCI shows that even connections with devices that were turned off
hours ago are still active. So for some reason old connections are not
leaving the conntrack table. Turning off flow offload fixes this right
away. I am currently running the latest 18.06 snapshot on a dir-860l.
Hopefully this is useful information for debugging.
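
In case it helps anyone reproduce this, the stale entries can also be counted outside of LuCI. The sketch below tallies conntrack entries per originating address. It assumes the table is readable as text (for example from /proc/net/nf_conntrack, if the build exposes it) and that the first src= field on each line is the original source address. The helper name is my own.

```shell
#!/bin/sh
# Read conntrack entries on stdin and print "count address" pairs,
# busiest source first. Only the first src= field of each line is counted.
count_by_source() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^src=/) {
                seen[substr($i, 5)]++
                break
            }
        }
    }
    END {
        for (ip in seen) print seen[ip], ip
    }' | sort -rn
}
```

Something like `count_by_source < /proc/net/nf_conntrack` should then show whether a device that was switched off hours ago still owns a large share of the table.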

Yours sincerely,

Jaap Buurman



Re: [OpenWrt-Devel] Wireguard & hw flow offload incompatibility

2018-05-29 Thread Jaap Buurman
Dear Jason,

The initial technical explanation unfortunately went over my head,
since I am not that technical myself, but I will do my best to provide
the required information. First of all, sorry for any confusion I may
have caused: this happens with both the hardware and the software
version of the flow offload implementation, so it doesn't seem to be
caused by any vendor-specific (hardware) logic. It is therefore
probably easier to focus on the software version for now. Also, in case
that wasn't fully clear, the flow offloading feature and the wireguard
interface are both running on the router itself.

What exactly do you mean by the kernel source of these boxes? As far
as I understand, Lede/OpenWRT uses the upstream 4.14 kernel in this
build (4.14.43 in the one I am running atm) with Lede/OpenWRT specific
patches. The patches that enable flow offloading support for the 4.14
kernel can be found in these 2 folders:
https://github.com/openwrt/openwrt/tree/openwrt-18.06/target/linux/generic/backport-4.14
https://github.com/openwrt/openwrt/tree/openwrt-18.06/target/linux/generic/pending-4.14

One important thing is that the upstream flow offloading code uses
nftables, while Lede/OpenWRT uses iptables. Hence flow offload support
has been backported to iptables, which might also contribute to this
bug.

I'm not even sure what dst entries are exactly, but I found one patch
that is supposed to fix dst entries. Perhaps it is incomplete or
contains a bug?
https://github.com/openwrt/openwrt/commit/c89e338fe68fd5af61b80ef37c55a657721c6542

I will try to cross-compile wireguard with your suggested patch
tomorrow or the day after tomorrow depending on my time schedule. I
will report back whether it solves this issue. Thank you very much.

Yours sincerely,

Jaap Buurman

On Tue, May 29, 2018 at 2:38 PM, Jason A. Donenfeld wrote:
> Hey Felix,
>
> Per the below thread, I've been digging around trying to see what's
> going on. Apparently packets are hitting a virtual network interface's
> ndo_start_xmit with no dst when hardware offloading enabled. I assume
> that the path is something along the lines of a packet coming in on
> one of these hardware accelerated NICs and then being forwarded to the
> wireguard interface, which expects the dst. I found your
> ndo_flow_offload patchset, and I suspect that might have something to
> do with this. Any insights on dsts disappearing in skbs?
>
> Thanks,
> Jason
>
> On Tue, May 29, 2018 at 2:14 PM, Jason A. Donenfeld wrote:
>> Hi Jaap,
>>
>> Thanks for the clarification. I downloaded the binary for that
>> hardware and triaged where the bug occurs [1]. This patch [2] should
>> probably fix it, but I'm rather surprised to see situations in which a
>> skb is missing a dst entry in ndo_start_xmit; this might point to
>> deeper kernel bugs in this hardware offloading feature, or some
>> alternative mechanism for routing being used when hardware offloading
>> is on. So I'm hesitant to merge this just yet, because perhaps this is
>> better handled in the compat layer, if it is in fact vendor silliness.
>> Do you have a link to the kernel source of these boxes? I'd like to
>> see what exactly the vendor is doing. And if you could try [2] and see
>> if that still crashes, this would be most appreciated.
>>
>> Thanks,
>> Jason
>>
>> [1] https://data.zx2c4.com/openwrt-mips-offloading-bug.png
>> [2] https://א.cc/Am4tZ0n8
>>
>> On Tue, May 29, 2018 at 1:59 PM, Jaap Buurman wrote:
>>> Dear Jason,
>>>
>>> This isn't a regression. This is simply the first time this has been
>>> observed. (hw) flow offload is a new feature, and hence this
>>> interaction with wireguard is also new.
>>>
>>> Yours sincerely,
>>>
>>> Jaap
>>>
>>> On Tue, May 29, 2018 at 1:54 PM, Jason A. Donenfeld wrote:
>>>> Hi Jaap,
>>>>
>>>> Thanks for the report. Is this a _new_ bug in a _new_ version of
>>>> WireGuard that wasn't there before, or is this the first time you've
>>>> observed this?
>>>>
>>>> Thanks,
>>>> Jason
>>
>> == Original Mail ==
>>
>>> Dear all,
>>>
>>> When running a wireguard interface on the latest Lede master branch,
>>> the router will crash as soon as traffic hits the wireguard interface
>>> while (hw) flow offloading is enabled. I am not sure whether this is a
>>> bug with wireguard, hw flow offload, both or neither, so I am
>>> reporting the bug to both mailinglists. A more detailed description
>>> plus a properly formatted stack trace can be found on Lede's bug
>>> tracker: https://bugs.openwrt.org/index.php?do=details&task_id=1539
>>>
>>> If you require any additional information, please do not hesitate to
>>> contact me. Thank you very much in advance.
>>>
>>> Yours sincerely,
>>>
>>> Jaap Buurman



Re: [OpenWrt-Devel] FS#1567 reported: making openrwrt unusable (BT Home Hub 5) since between r6080 and r7050

2018-05-28 Thread Jaap Buurman
On Mon, May 28, 2018 at 3:27 PM, Mauro Mozzarelli <ma...@ezplanet.org> wrote:
> I will be happy to help. How do I do it?
>
> I mean, I have a backup of the master folder for r6080, since I keep a full
> copy of every successful build that I then run on my routers. How do I pull
> just one release at a time, like r6081, r6082, etc.? That is the best I
> could do, build the releases one by one until I find the one that fails.
>
>
> Mauro
>
>
>
> On 28/05/18 14:17, Jaap Buurman wrote:
>>
>> On Mon, May 28, 2018 at 3:12 PM, Mauro Mozzarelli <ma...@ezplanet.org>
>> wrote:
>>>
>>> This does not make sense, is the alternative to write off openwrt?
>>> Because
>>> if ADSL + PPPoA do not work, then it is useless.
>>>
>>> As a minimum I would expect a developer to look at the commits between
>>> r6080
>>> and r7050 to see what has changed and roll-back.
>>>
>>> Also if you need any further information, just ask and provide guidance
>>> on
>>> how to get it.
>>> If you would like the hardware I can arrange for it to be shipped to you.
>>>
>>> Mauro
>>>
>>>
>>>
>>> On 28/05/18 14:02, Jo-Philipp Wich wrote:
>>>>
>>>> Hi,
>>>>
>>>>> Is anyone looking into it?
>>>>
>>>> I doubt it, unfortunately the info in the ticket is too vague to work
>>>> with. Personally I don't have any hardware to debug this.
>>>>
>>>> ~ Jo
>>>>
>>
>> Dear Mauro,
>>
>> Going through nearly 1000 commits without knowing what to look for is
>> very difficult and not very time efficient. The easiest way for you to
>> help track down this issue would be to do a Git bisect from r6080 to
>> r7050 until you find the specific commit that breaks PPPoA. It would
>> be much easier to debug/fix if a developer knows which particular
>> commit breaks stuff :)
>>
>> Yours sincerely,
>>
>> Jaap Buurman
>
>

I don't know the commands by heart, so you will have to look them up.
But basically:

1) Use git to clone the repository.
2) Mark the last known good commit.
3) Mark the first known bad commit.
4) Git will then pick a commit in the middle, so in this case
r6500ish. You compile that image:
https://openwrt.org/docs/guide-developer/quickstart-build-images
5) You test the image and mark it as good or bad. Git will then pick a
new commit, r6250ish or r6750ish, depending on whether the previous one
was bad or good.

Since you halve the search space with each attempt, it shouldn't take
that many attempts to track down the specific commit that broke it
within 1000 commits.
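
To put a number on "not that many attempts", and to show roughly what the session looks like, here is a small sketch. The function names are my own, and the two arguments of start_bisect are placeholders for real commit hashes. As far as I know the r-numbers have to be translated to hashes first.

```shell
#!/bin/sh
# Estimate how many test builds a bisect needs by halving the range
# until a single commit is left.
bisect_steps() {
    remaining="$1"
    steps=0
    while [ "$remaining" -gt 1 ]; do
        remaining=$(( (remaining + 1) / 2 ))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Start a bisect session inside a cloned repository.
# $1 = last known good commit, $2 = first known bad commit (placeholders).
start_bisect() {
    good="$1"
    bad="$2"
    git bisect start &&
    git bisect bad "$bad" &&
    git bisect good "$good"
}
```

For the roughly 970 revisions between r6080 and r7050, `bisect_steps 970` prints 10. After each test build, `git bisect good` or `git bisect bad` records the result and checks out the next candidate, and `git bisect reset` ends the session.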

Good luck :)



Re: [OpenWrt-Devel] FS#1567 reported: making openrwrt unusable (BT Home Hub 5) since between r6080 and r7050

2018-05-28 Thread Jaap Buurman
On Mon, May 28, 2018 at 3:12 PM, Mauro Mozzarelli <ma...@ezplanet.org> wrote:
> This does not make sense, is the alternative to write off openwrt? Because
> if ADSL + PPPoA do not work, then it is useless.
>
> As a minimum I would expect a developer to look at the commits between r6080
> and r7050 to see what has changed and roll-back.
>
> Also if you need any further information, just ask and provide guidance on
> how to get it.
> If you would like the hardware I can arrange for it to be shipped to you.
>
> Mauro
>
>
>
> On 28/05/18 14:02, Jo-Philipp Wich wrote:
>>
>> Hi,
>>
>>> Is anyone looking into it?
>>
>> I doubt it, unfortunately the info in the ticket is too vague to work
>> with. Personally I don't have any hardware to debug this.
>>
>> ~ Jo
>>

Dear Mauro,

Going through nearly 1000 commits without knowing what to look for is
very difficult and not very time efficient. The easiest way for you to
help track down this issue would be to do a Git bisect from r6080 to
r7050 until you find the specific commit that breaks PPPoA. It would
be much easier to debug/fix if a developer knows which particular
commit breaks stuff :)

Yours sincerely,

Jaap Buurman



Re: [OpenWrt-Devel] 18.06 Bug: Baby Jumbo Frames on mt7621

2018-05-28 Thread Jaap Buurman
Dear Mathias,

I can confirm your patch is working fine. I am able to set an MTU of
1508 on the switch, giving me an MTU of 1500 on the pppoe-wan
connection. I am now able to ping with a 1472-byte payload and the DF
flag set. The patch in question:
https://git.openwrt.org/?p=openwrt/staging/mkresin.git;a=commitdiff;h=cc5f1fe7aa02943f3b39ffbd9dc3b8fcad569c8f
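
The numbers above fit together as follows. This is a small worked check,
assuming the usual header sizes (8 bytes for PPPoE plus PPP, 20 bytes for an
IPv4 header without options, 8 bytes for the ICMP echo header); none of
these sizes are stated in the thread, and the function names are made up.

```python
PPPOE_OVERHEAD = 8   # assumed: 6-byte PPPoE header + 2-byte PPP protocol field
IPV4_HEADER = 20     # assumed: IPv4 header without options
ICMP_HEADER = 8      # assumed: ICMP echo request header

def pppoe_mtu(ethernet_mtu: int) -> int:
    """MTU left for IP once PPPoE encapsulation is subtracted."""
    return ethernet_mtu - PPPOE_OVERHEAD

def max_ping_payload(link_mtu: int) -> int:
    """Largest ICMP echo payload that fits without fragmentation."""
    return link_mtu - IPV4_HEADER - ICMP_HEADER

for eth_mtu in (1500, 1508):
    wan_mtu = pppoe_mtu(eth_mtu)
    print(eth_mtu, wan_mtu, max_ping_payload(wan_mtu))
# prints:
# 1500 1492 1464
# 1508 1500 1472
```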

Thank you very much for your work :)

@the rest

I was able to flash simply by disabling all my WiFi interfaces. It's a
dirty workaround, and the underlying problem should be fixed before the
18.06 release IMO, but at least we managed to track down what's causing
the issue :)

Yours sincerely,

Jaap Buurman

On Sat, May 26, 2018 at 9:16 AM, Kristian Evensen
<kristian.even...@gmail.com> wrote:
> Hi,
>
> (Accidentally hit send)
>
> On Fri, May 25, 2018 at 7:06 PM, Kristian Evensen
> <kristian.even...@gmail.com> wrote:
>>> I know how to fix the issue via recovery. However, from the responses
>>> in the topic on the LEDE forum, it seems more people are running into
>>> this issue. This definitely needs to be fixed before the 18.06 release.
>>> Is there someone with an mt7621 device and serial access who can
>>> reproduce the problem? We might be able to figure out what is
>>> going wrong.
>
> I kept looking into this and instrumented /lib/upgrade/stage2. I added
> some output showing which processes were left for each iteration of
> the loop, as well as when "Failed to kill ..." hits. It seems that
> hostapd, for some reason, takes unexpectedly long to die:
>
> Sending TERM to remaining processes ... loop limit 10
> logd
> rpcd
> netifd
> odhcpd
> crond
> ntpd
> nginx
> nginx
> ubusd
> dnsmasq
> sh
> sh
> sh
> sshd
> sleep
> sh
> hostapd
> hostapd
> rsync
> ssh
> sleep
>
> [  115.583843] device wlan0 left promiscuous mode
> [  115.588436] br-lan: port 3(wlan0) entered disabled state
> [  115.594261] device wlan1 left promiscuous mode
> [  115.598798] br-lan: port 2(wlan1) entered disabled state
> Sending KILL to remaining processes ... loop limit 10
> hostapd
> loop limit 9
> hostapd
> loop limit 8
> hostapd
> loop limit 7
> hostapd
> loop limit 6
> hostapd
> loop limit 5
> hostapd
> loop limit 4
> hostapd
> loop limit 3
> hostapd
> loop limit 2
> hostapd
> loop limit 1
>
> Failed to kill all processes.
>   PID USER       VSZ STAT COMMAND
>     1 root       992 S    /sbin/upgraded /tmp/firmware.bin . /lib/functions.sh
>     2 root         0 SW   [kthreadd]
>     3 root         0 IW   [kworker/0:0]
>     4 root         0 IW<  [kworker/0:0H]
>     5 root         0 IW   [kworker/u8:0]
>     6 root         0 IW<  [mm_percpu_wq]
>     7 root         0 SW   [ksoftirqd/0]
>     8 root         0 IW   [rcu_sched]
>     9 root         0 IW   [rcu_bh]
>    10 root         0 SW   [migration/0]
>    11 root         0 SW   [cpuhp/0]
>    12 root         0 SW   [cpuhp/1]
>    13 root         0 SW   [migration/1]
>    14 root         0 SW   [ksoftirqd/1]
>    15 root         0 IW   [kworker/1:0]
>    16 root         0 IW<  [kworker/1:0H]
>    17 root         0 SW   [cpuhp/2]
>    18 root         0 SW   [migration/2]
>    19 root         0 SW   [ksoftirqd/2]
>    20 root         0 IW   [kworker/2:0]
>    21 root         0 IW<  [kworker/2:0H]
>    22 root         0 SW   [cpuhp/3]
>    23 root         0 SW   [migration/3]
>    24 root         0 SW   [ksoftirqd/3]
>    25 root         0 IW   [kworker/3:0]
>    26 root         0 IW<  [kworker/3:0H]
>    27 root         0 IW   [kworker/u8:1]
>    34 root         0 IW   [kworker/u8:2]
>    65 root         0 IW   [kworker/0:1]
>    66 root         0 IW   [kworker/3:1]
>    67 root         0 IW   [kworker/2:1]
>   136 root         0 IW   [kworker/1:1]
>   137 root         0 SW   [oom_reaper]
>   138 root         0 IW<  [writeback]
>   140 root         0 IW<  [crypto]
>   142 root         0 IW<  [kblockd]
>   157 root         0 IW   [kworker/u8:3]
>   177 root         0 IW<  [watchdogd]
>   201 root         0 SW   [kswapd0]
>   233 root         0 IW<  [pencrypt]
>   262 root         0 IW<  [pdecrypt]
>   295 root         0 SW   [spi0]
>   353 root         0 IW<  [ipv6_addrconf]
>   362 root         0 IW<  [kworker/1:1H]
>   363 root         0 IW<  [kworker/0:1H]
>   365 root         0 IW<  [kworker/3:1H]
>   366 root         0 IW<  [kworker/2:1H]
>   416 root         0 IW   [kworker/1:2]
>   417 root         0 IW   [kworker/0:2]
>   457 root         0 SWN  [jffs2_gcd_mtd6]
>   575 root         0 IW   [kworker/2:2]
>   869 root         0 IW<  [cfg80211]
>  1842 root         0 IW   [kworker/3:2]
>  7535

Re: [OpenWrt-Devel] 18.06 Bug: Baby Jumbo Frames on mt7621

2018-05-25 Thread Jaap Buurman
On Fri, May 25, 2018 at 1:35 PM, Levente <leventel...@gmail.com> wrote:
> Try upgrading the sysupgrade image using your bootloader.
>
> Lev
>
> On Fri, May 25, 2018 at 1:25 PM, Jaap Buurman <jaapbuur...@gmail.com> wrote:
>> On Fri, May 25, 2018 at 1:14 PM, Jaap Buurman <jaapbuur...@gmail.com> wrote:
>>> On Fri, May 25, 2018 at 12:43 PM, Mathias Kresin <d...@kresin.me> wrote:
>>>> 2018-05-25 12:48 GMT+03:00 Jaap Buurman <jaapbuur...@gmail.com>:
>>>>> Dear Martin, Mathias and the rest,
>>>>>
>>>>> Please scratch my previous message. It seems like the flash was not
>>>>> successful, and hence I was still running the old firmware. However, I
>>>>> have tried flashing 3 different times now, without any luck. The
>>>>> router ends up rebooting and boots right into the old firmware. This
>>>>> seems to be a major bug. Is there anything I can do to help debug this
>>>>> particular issue?
>>>>
>>>> First of all, Martin is right. The commit in my staging tree should
>>>> fix the MTU issue but I don't have the hardware to test it on my own.
>>>>
>>>> So far you never mentioned which board you have. Hence it's quite
>>>> difficult to have a look at the code about what could be wrong. It
>>>> would be helpful if you can name the last working revision to limit
>>>> the number of commits to look at.
>>>>
>>>>> Seems like a dealbreaker for 18.06 (which I am
>>>>> running now) to me. I could simply use recovery and flash a firmware
>>>>> like that, but I would prefer to get to the bottom of this issue so
>>>>> that end users won't end up stuck on a particular firmware. Any ideas
>>>>> what I could do to debug this?
>>>>
>>>> Your best bet is to attach the/a serial console and check the console
>>>> for errors.
>>>>
>>>> Mathias
>>>
>>> My apologies for leaving out important details. I am using a DIR-860L
>>> B1. I was running LEDE 17.01.4 until last Tuesday. On that day
>>> I upgraded to OpenWrt 18.06-SNAPSHOT r6917-8948a78 via LuCI. Flashing
>>> any other firmware seems to be broken now: I have tried flashing a
>>> build compiled from your staging tree, I've tried reverting back to
>>> 17.01.4 and I've tried reflashing 18.06. All end up in the exact same
>>> spot: still on OpenWrt 18.06-SNAPSHOT r6917-8948a78 with all
>>> manually installed packages still present. I've tried flashing via
>>> LuCI and via the sysupgrade command (with the -v switch for more
>>> verbosity), but got no useful information there. The last line of
>>> output is simply:
>>>
>>> Commencing upgrade. All shell sessions will be closed now.
>>>
>>> One particularly odd thing that I do remember about this build is that
>>> I tried to update all upgradable packages via opkg (I know
>>> this is discouraged). One of those packages was "base-files". The
>>> upgrade failed with a weird error (I can't remember what exactly), but
>>> nothing seemed wrong at that time, so I didn't think much about
>>> it. Does anyone more knowledgeable than me know whether this
>>> could affect the sysupgrade functionality?
>>>
>>> Lastly, I unfortunately do not have a serial cable, so I think
>>> debugging will be difficult for me. I could use recovery to flash a
>>> fresh 18.06 build and see whether the upgrade functionality is still
>>> broken in that case. I will report back with my findings.
>>>
>>> Yours sincerely,
>>>
>>> Jaap Buurman
>>
>> Dear all,
>>
>> This just popped up on the Lede forum:
>> https://forum.lede-project.org/t/xiaomi-wifi-router-3g/5377/879
>>
>> So this might simply be a (mt7621-specific?) bug that prevents
>> sysupgrade from working properly. I am still awaiting his answer to
>> verify that he is indeed running into the same issue, where the
>> firmware won't upgrade.
>>
>> Yours sincerely,
>>
>> Jaap Buurman
>>

I know how to fix the issue via recovery. However, from the responses
in the topic on the LEDE forum, it seems more people are running into
this issue. This definitely needs to be fixed before the 18.06 release.
Is there someone with an mt7621 device and serial access who can
reproduce the problem? We might be able to figure out what is
going wrong.



Re: [OpenWrt-Devel] 18.06 Bug: Baby Jumbo Frames on mt7621

2018-05-25 Thread Jaap Buurman
On Fri, May 25, 2018 at 1:14 PM, Jaap Buurman <jaapbuur...@gmail.com> wrote:
> On Fri, May 25, 2018 at 12:43 PM, Mathias Kresin <d...@kresin.me> wrote:
>> 2018-05-25 12:48 GMT+03:00 Jaap Buurman <jaapbuur...@gmail.com>:
>>> Dear Martin, Mathias and the rest,
>>>
>>> Please scratch my previous message. It seems like the flash was not
>>> successful, and hence I was still running the old firmware. However, I
>>> have tried flashing 3 different times now, without any luck. The
>>> router ends up rebooting and boots right into the old firmware. This
>>> seems to be a major bug. Is there anything I can do to help debug this
>>> particular issue?
>>
>> First of all, Martin is right. The commit in my staging tree should
>> fix the MTU issue but I don't have the hardware to test it on my own.
>>
>> So far you never mentioned which board you have. Hence it's quite
>> difficult to have a look at the code about what could be wrong. It
>> would be helpful if you can name the last working revision to limit
>> the number of commits to look at.
>>
>>> Seems like a dealbreaker for 18.06 (which I am
>>> running now) to me. I could simply use recovery and flash a firmware
>>> like that, but I would prefer to get to the bottom of this issue so
>>> that end users won't end up stuck on a particular firmware. Any ideas
>>> what I could do to debug this?
>>
>> Your best bet is to attach the/a serial console and check the console
>> for errors.
>>
>> Mathias
>
> My apologies for leaving out important details. I am using a DIR-860L
> B1. I was running LEDE 17.01.4 until last Tuesday. On that day
> I upgraded to OpenWrt 18.06-SNAPSHOT r6917-8948a78 via LuCI. Flashing
> any other firmware seems to be broken now: I have tried flashing a
> build compiled from your staging tree, I've tried reverting back to
> 17.01.4 and I've tried reflashing 18.06. All end up in the exact same
> spot: still on OpenWrt 18.06-SNAPSHOT r6917-8948a78 with all
> manually installed packages still present. I've tried flashing via
> LuCI and via the sysupgrade command (with the -v switch for more
> verbosity), but got no useful information there. The last line of
> output is simply:
>
> Commencing upgrade. All shell sessions will be closed now.
>
> One particularly odd thing that I do remember about this build is that
> I tried to update all upgradable packages via opkg (I know
> this is discouraged). One of those packages was "base-files". The
> upgrade failed with a weird error (I can't remember what exactly), but
> nothing seemed wrong at that time, so I didn't think much about
> it. Does anyone more knowledgeable than me know whether this
> could affect the sysupgrade functionality?
>
> Lastly, I unfortunately do not have a serial cable, so I think
> debugging will be difficult for me. I could use recovery to flash a
> fresh 18.06 build and see whether the upgrade functionality is still
> broken in that case. I will report back with my findings.
>
> Yours sincerely,
>
> Jaap Buurman

Dear all,

This just popped up on the Lede forum:
https://forum.lede-project.org/t/xiaomi-wifi-router-3g/5377/879

So this might simply be a (mt7621-specific?) bug that prevents
sysupgrade from working properly. I am still awaiting his answer to
verify that he is indeed running into the same issue, where the
firmware won't upgrade.

Yours sincerely,

Jaap Buurman



Re: [OpenWrt-Devel] 18.06 Bug: Baby Jumbo Frames on mt7621

2018-05-25 Thread Jaap Buurman
On Fri, May 25, 2018 at 12:43 PM, Mathias Kresin <d...@kresin.me> wrote:
> 2018-05-25 12:48 GMT+03:00 Jaap Buurman <jaapbuur...@gmail.com>:
>> Dear Martin, Mathias and the rest,
>>
>> Please scratch my previous message. It seems like the flash was not
>> successful, and hence I was still running the old firmware. However, I
>> have tried flashing 3 different times now, without any luck. The
>> router ends up rebooting and boots right into the old firmware. This
>> seems to be a major bug. Is there anything I can do to help debug this
>> particular issue?
>
> First of all, Martin is right. The commit in my staging tree should
> fix the MTU issue but I don't have the hardware to test it on my own.
>
> So far you never mentioned which board you have. Hence it's quite
> difficult to have a look at the code about what could be wrong. It
> would be helpful if you can name the last working revision to limit
> the number of commits to look at.
>
>> Seems like a dealbreaker for 18.06 (which I am
>> running now) to me. I could simply use recovery and flash a firmware
>> like that, but I would prefer to get to the bottom of this issue so
>> that end users won't end up stuck on a particular firmware. Any ideas
>> what I could do to debug this?
>
> Your best bet is to attach the/a serial console and check the console
> for errors.
>
> Mathias

My apologies for leaving out important details. I am using a DIR-860L
B1. I was running LEDE 17.01.4 until last Tuesday. On that day
I upgraded to OpenWrt 18.06-SNAPSHOT r6917-8948a78 via LuCI. Flashing
any other firmware seems to be broken now: I have tried flashing a
build compiled from your staging tree, I've tried reverting back to
17.01.4 and I've tried reflashing 18.06. All end up in the exact same
spot: still on OpenWrt 18.06-SNAPSHOT r6917-8948a78 with all
manually installed packages still present. I've tried flashing via
LuCI and via the sysupgrade command (with the -v switch for more
verbosity), but got no useful information there. The last line of
output is simply:

Commencing upgrade. All shell sessions will be closed now.

One particularly odd thing that I do remember about this build is that
I tried to update all upgradable packages via opkg (I know
this is discouraged). One of those packages was "base-files". The
upgrade failed with a weird error (I can't remember what exactly), but
nothing seemed wrong at that time, so I didn't think much about
it. Does anyone more knowledgeable than me know whether this
could affect the sysupgrade functionality?

Lastly, I unfortunately do not have a serial cable, so I think
debugging will be difficult for me. I could use recovery to flash a
fresh 18.06 build and see whether the upgrade functionality is still
broken in that case. I will report back with my findings.

Yours sincerely,

Jaap Buurman



Re: [OpenWrt-Devel] 18.06 Bug: Baby Jumbo Frames on mt7621

2018-05-25 Thread Jaap Buurman
On Fri, May 25, 2018 at 11:30 AM, Jaap Buurman <jaapbuur...@gmail.com> wrote:
> On Thu, May 24, 2018 at 8:00 PM, Martin Blumenstingl
> <martin.blumensti...@googlemail.com> wrote:
>> Hello Jaap,
>>
>> On Thu, May 24, 2018 at 12:00 PM, Jaap Buurman <jaapbuur...@gmail.com> wrote:
>>> Dear all,
>>>
>>> I found some additional information in the system log:
>>>
>>> Thu May 24 11:38:39 2018 kern.err kernel: [83864.729458] eth0: Invalid MTU 1508 requested, hw max 1500
>>>
>>> Digging deeper, this seems to be a message emitted by a
>>> function in net/core/dev.c of the Linux kernel:
>>>
>>>     if (dev->max_mtu > 0 && new_mtu > dev->max_mtu) {
>>>         net_err_ratelimited("%s: Invalid MTU %d requested, hw max %d\n",
>>>                             dev->name, new_mtu, dev->max_mtu);
>>>         return -EINVAL;
>>>     }
>>>
>>> Does anybody happen to know where exactly this max_mtu value
>>> is set to 1500? For mt7621 devices this should be 2048 (baby jumbo
>>> frames).
>>>
>>> Thank you very much in advance.
>>>
>>> Yours sincerely,
>>>
>>> Jaap Buurman
>>>
>>> On Tue, May 22, 2018 at 3:05 PM, Jaap Buurman <jaapbuur...@gmail.com> wrote:
>>>> Dear all,
>>>>
>>>> The switch to the 4.14 kernel apparently broke the baby jumbo frame
>>>> support of 2048 bytes that the switch is capable of. I found out that
>>>> changing the MTU above 1500 via LuCI no longer applies properly.
>>>> Trying to manually change the MTU via SSH also fails:
>>>>
>>>> root@LEDE:~# ifconfig eth0 mtu 1508 up
>>>> ifconfig: SIOCSIFMTU: Invalid argument
>>>>
>>>> If there is any additional information that I can supply, please let
>>>> me know. I am also more than willing to help test potential fixes :)
>> I *believe* Mathias has a fix for this in his tree (but I'm not sure
>> if he has the hardware to test it): [0]
>> maybe you can give it a go and report back?
>>
>>
>> Regards
>> Martin
>>
>>
>> [0] 
>> https://git.openwrt.org/?p=openwrt/staging/mkresin.git;a=commitdiff;h=cc5f1fe7aa02943f3b39ffbd9dc3b8fcad569c8f
>
> Dear Martin and Mathias,
>
> I have compiled and flashed an image from Mathias' tree checked out at
> the commit linked in Martin's previous message. Unfortunately, I am
> still seeing the following message in dmesg:
> [  243.845159] eth0: Invalid MTU 1508 requested, hw max 1500
>
> If there are additional tests you would like me to run, please do not
> hesitate to contact me :)
>
> Yours sincerely,
>
> Jaap Buurman

Dear Martin, Mathias and the rest,

Please scratch my previous message. It seems like the flash was not
successful, and hence I was still running the old firmware. However, I
have tried flashing 3 different times now, without any luck. The
router ends up rebooting and boots right into the old firmware. This
seems to be a major bug. Is there anything I can do to help debug this
particular issue? Seems like a dealbreaker for 18.06 (which I am
running now) to me. I could simply use recovery and flash a firmware
like that, but I would prefer to get to the bottom of this issue so
that end users won't end up stuck on a particular firmware. Any ideas
what I could do to debug this?

Yours sincerely,

Jaap Buurman



Re: [OpenWrt-Devel] 18.06 Bug: Baby Jumbo Frames on mt7621

2018-05-25 Thread Jaap Buurman
On Thu, May 24, 2018 at 8:00 PM, Martin Blumenstingl
<martin.blumensti...@googlemail.com> wrote:
> Hello Jaap,
>
> On Thu, May 24, 2018 at 12:00 PM, Jaap Buurman <jaapbuur...@gmail.com> wrote:
>> Dear all,
>>
>> I found some additional information in the system log:
>>
>> Thu May 24 11:38:39 2018 kern.err kernel: [83864.729458] eth0: Invalid MTU 1508 requested, hw max 1500
>>
>> Digging deeper, this seems to be a message emitted by a
>> function in net/core/dev.c of the Linux kernel:
>>
>>     if (dev->max_mtu > 0 && new_mtu > dev->max_mtu) {
>>         net_err_ratelimited("%s: Invalid MTU %d requested, hw max %d\n",
>>                             dev->name, new_mtu, dev->max_mtu);
>>         return -EINVAL;
>>     }
>>
>> Does anybody happen to know where exactly this max_mtu value
>> is set to 1500? For mt7621 devices this should be 2048 (baby jumbo
>> frames).
>>
>> Thank you very much in advance.
>>
>> Yours sincerely,
>>
>> Jaap Buurman
>>
>> On Tue, May 22, 2018 at 3:05 PM, Jaap Buurman <jaapbuur...@gmail.com> wrote:
>>> Dear all,
>>>
>>> The switch to the 4.14 kernel apparently broke the baby jumbo frame
>>> support of 2048 bytes that the switch is capable of. I found out that
>>> changing the MTU above 1500 via LuCI no longer applies properly.
>>> Trying to manually change the MTU via SSH also fails:
>>>
>>> root@LEDE:~# ifconfig eth0 mtu 1508 up
>>> ifconfig: SIOCSIFMTU: Invalid argument
>>>
>>> If there is any additional information that I can supply, please let
>>> me know. I am also more than willing to help test potential fixes :)
> I *believe* Mathias has a fix for this in his tree (but I'm not sure
> if he has the hardware to test it): [0]
> maybe you can give it a go and report back?
>
>
> Regards
> Martin
>
>
> [0] 
> https://git.openwrt.org/?p=openwrt/staging/mkresin.git;a=commitdiff;h=cc5f1fe7aa02943f3b39ffbd9dc3b8fcad569c8f

Dear Martin and Mathias,

I have compiled and flashed an image from Mathias' tree checked out at
the commit linked in Martin's previous message. Unfortunately, I am
still seeing the following message in dmesg:
[  243.845159] eth0: Invalid MTU 1508 requested, hw max 1500

If there are additional tests you would like me to run, please do not
hesitate to contact me :)

Yours sincerely,

Jaap Buurman



Re: [OpenWrt-Devel] 18.06 Bug: Baby Jumbo Frames on mt7621

2018-05-24 Thread Jaap Buurman
Dear all,

I found some additional information in the system log:

Thu May 24 11:38:39 2018 kern.err kernel: [83864.729458] eth0: Invalid MTU 1508 requested, hw max 1500

Digging deeper, this seems to be a message emitted by a
function in net/core/dev.c of the Linux kernel:

    if (dev->max_mtu > 0 && new_mtu > dev->max_mtu) {
        net_err_ratelimited("%s: Invalid MTU %d requested, hw max %d\n",
                            dev->name, new_mtu, dev->max_mtu);
        return -EINVAL;
    }

Does anybody happen to know where exactly this max_mtu value
is set to 1500? For mt7621 devices this should be 2048 (baby jumbo
frames).
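
To make the failure mode concrete, here is a toy Python model of that
bounds check. It is not kernel code: NetDev and set_mtu are hypothetical
names, the 1500 default is an assumption matching the "hw max 1500" in the
log, and 2048 is simply the value I would expect for mt7621.

```python
import errno
from dataclasses import dataclass

ETH_DATA_LEN = 1500  # assumed default maximum, matching "hw max 1500"

@dataclass
class NetDev:
    """Hypothetical stand-in for the kernel's struct net_device."""
    name: str
    mtu: int = ETH_DATA_LEN
    max_mtu: int = ETH_DATA_LEN

def set_mtu(dev: NetDev, new_mtu: int) -> int:
    """Mirror the check above: 0 on success, -EINVAL if above max_mtu."""
    if dev.max_mtu > 0 and new_mtu > dev.max_mtu:
        print(f"{dev.name}: Invalid MTU {new_mtu} requested, "
              f"hw max {dev.max_mtu}")
        return -errno.EINVAL
    dev.mtu = new_mtu
    return 0

# With the default limit the request is rejected; with a raised limit
# (which a driver would have to set) the same request succeeds.
print(set_mtu(NetDev("eth0"), 1508))
print(set_mtu(NetDev("eth0", max_mtu=2048), 1508))
```

The -EINVAL return would also explain the "Invalid argument" that ifconfig
reports for SIOCSIFMTU.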

Thank you very much in advance.

Yours sincerely,

Jaap Buurman

On Tue, May 22, 2018 at 3:05 PM, Jaap Buurman <jaapbuur...@gmail.com> wrote:
> Dear all,
>
> The switch to the 4.14 kernel apparently broke the baby jumbo frame
> support of 2048 bytes that the switch is capable of. I found out that
> changing the MTU above 1500 via LuCI no longer applies properly.
> Trying to manually change the MTU via SSH also fails:
>
> root@LEDE:~# ifconfig eth0 mtu 1508 up
> ifconfig: SIOCSIFMTU: Invalid argument
>
> If there is any additional information that I can supply, please let
> me know. I am also more than willing to help test potential fixes :)
>
> Yours sincerely,
>
> Jaap Buurman



[OpenWrt-Devel] 18.06 Bug: HW Flow Offload + Wireguard Bug

2018-05-22 Thread Jaap Buurman
Dear Felix & others,

I am currently running an 18.06 snapshot image to start testing the
stability of the firmware and its new features, including the lovely
hardware flow offload. While it is working extremely well (I am
finally able to max out my connection with hardly any CPU load!),
pushing data through WireGuard instantly crashes my router whenever
hardware flow offload is enabled. There is another report of this issue
on the forum, including a call trace:
https://forum.lede-project.org/t/netfilter-flow-offload-hw-nat/10237/44?u=mushoz

If you need any additional information or require my help testing,
please do not hesitate to contact me.

Yours sincerely,

Jaap Buurman



[OpenWrt-Devel] Fwd: 18.06 Bug: Baby Jumbo Frames on mt7621

2018-05-22 Thread Jaap Buurman
Dear all,

The switch to the 4.14 kernel apparently broke the baby jumbo frame
support of 2048 bytes that the switch is capable of. I found out that
changing the MTU above 1500 via LuCI no longer applies properly.
Trying to manually change the MTU via SSH also fails:

root@LEDE:~# ifconfig eth0 mtu 1508 up
ifconfig: SIOCSIFMTU: Invalid argument

If there is any additional information that I can supply, please let
me know. I am also more than willing to help test potential fixes :)

Yours sincerely,

Jaap Buurman



[OpenWrt-Devel] 18.06 Bug: Baby Jumbo Frames on mt7621

2018-05-22 Thread Jaap Buurman
Dear all,

The switch to the 4.14 kernel apparently broke the baby jumbo frame
support of 2048 bytes that the switch is capable of. I found out that
changing the MTU above 1500 via LuCI no longer applies properly.
Trying to manually change the MTU via SSH also fails:

root@LEDE:~# ifconfig eth0 mtu 1508 up
ifconfig: SIOCSIFMTU: Invalid argument

If there is any additional information that I can supply, please let
me know. I am also more than willing to help test potential fixes :)

Yours sincerely,

Jaap Buurman



[OpenWrt-Devel] Bug: HW Flow Offload + Wireguard Bug

2018-05-22 Thread Jaap Buurman
Dear Felix & others,

I am currently running an 18.06 snapshot image to start testing the
stability of the firmware and its new features, including the lovely
hardware flow offload. While it is working extremely well (I am
finally able to max out my connection with hardly any CPU load!),
pushing data through WireGuard instantly crashes my router whenever
hardware flow offload is enabled. There is another report of this issue
on the forum, including a call trace:
https://forum.lede-project.org/t/netfilter-flow-offload-hw-nat/10237/44?u=mushoz

If you need any additional information or require my help testing,
please do not hesitate to contact me.

Yours sincerely,

Jaap Buurman
