/filesystems/epoll/epoll_wakeup_test.c
(it just adds a second write) shows the different behavior.
The testlet passes with pipe() but fails with socketpair() on 5.10.
Both fail on 4.19.
Is it fair to assume that 5.10 pipe's behavior is the correct one?
Thanks,
Francesco Ruggeri
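The kind of comparison described above can be sketched as follows. This is not the original testlet (which is not shown here). It assumes an edge-triggered (EPOLLET) registration, and the helper name two_write_probe is hypothetical. It only records what epoll_wait() reports after each of two writes, so the same probe can be run against a pipe() pair and a socketpair() pair on different kernels.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: register rfd edge-triggered, write one byte to wfd
 * twice, and store in counts[i] what a zero-timeout epoll_wait() returns
 * after write i. Returns 0 on success, -1 on error.
 */
int two_write_probe(int rfd, int wfd, int counts[2])
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
    struct epoll_event got;
    int epfd, i, ret = -1;

    assert(counts != NULL);

    epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    ev.data.fd = rfd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, rfd, &ev) < 0)
        goto done;

    for (i = 0; i < 2; i++) {
        if (write(wfd, "x", 1) != 1)
            goto done;
        /* Timeout 0: report only what is already pending. */
        counts[i] = epoll_wait(epfd, &got, 1, 0);
    }
    ret = 0;
done:
    close(epfd);
    return ret;
}
```

Calling it once with the two ends of a pipe() and once with the two ends of a socketpair(AF_UNIX, SOCK_STREAM, 0, ...) and comparing counts[1] shows whether the second write produces a new edge on a given kernel. Which result is the correct one is exactly the open question in the message.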
/*
* t0
On Wed, Oct 14, 2020 at 1:23 AM Florian Westphal wrote:
>
> Pablo Neira Ayuso wrote:
> > Legacy would still be flawed though.
>
> It's fine too, new rule blob gets handled (and match/target checkentry
> called) before old one is dismantled.
>
> We only have a 0 refcount + hook unregister when
On Fri, Oct 9, 2020 at 12:49 PM Jozsef Kadlecsik wrote:
> What is the rationale behind "remove the conntrack hooks when there are no
> rule left referring to conntrack"? Performance optimization?
That seems to be the case. See commit 4d3a57f23dec ("netfilter: conntrack:
do not enable connection
On Wed, Oct 7, 2020 at 12:32 PM Francesco Ruggeri wrote:
>
> If the first packet conntrack sees after a re-register is an outgoing
> keepalive packet with no data (SEG.SEQ = SND.NXT-1), td_end is set to
> SND.NXT-1.
> When the peer correctly acknowledges SND.NXT, tcp_in_window fa
r than when sending out
the keepalive packet.
Fixes: f94e63801ab2 ("netfilter: conntrack: reset tcp maxwin on re-register")
Signed-off-by: Francesco Ruggeri
---
net/netfilter/nf_conntrack_proto_tcp.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
, but I am not sure it is the
correct
Thanks,
Francesco Ruggeri
Fixes: f94e63801ab2 ("netfilter: conntrack: reset tcp maxwin on re-register")
Signed-off-by: Francesco Ruggeri
---
net/netfilter/nf_conntrack_proto_tcp.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/netfilter/nf_c
> description in 'WAIT_REFS_MIN_MSECS'
>
> Fixes: 0e4be9e57e8c ("net: use exponential backoff in netdev_wait_allrefs")
> Signed-off-by: Mauro Carvalho Chehab
Reviewed-by: Francesco Ruggeri
Thanks for fixing this!
On Tue, Sep 22, 2020 at 4:22 AM Mauro Carvalho Chehab wrote:
>
> kernel-doc expects the function prototype to be just after
> the kernel-doc markup, as otherwise it will get it all wrong:
>
> ./net/core/dev.c:10036: warning: Excess function parameter 'dev'
> description in
refs.
v3: preserve reverse christmas tree ordering of local variables
v4: try an extra rcu_barrier before the backoff, plus some
cosmetic changes.
Signed-off-by: Francesco Ruggeri
---
net/core/dev.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/net/core/de
On Thu, Sep 17, 2020 at 5:02 PM Stephen Hemminger wrote:
> Is there any way to make RCU trigger faster?
This is a case of the networking code requiring multiple cascading grace periods
(functions executing at the end of a period scheduling more functions for the
end of the next period), so it's a
rve reverse christmas tree ordering of local variables
Signed-off-by: Francesco Ruggeri
---
net/core/dev.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 4086d335978c..e5fa60cb8832 100644
--- a/net/core/dev.c
+++ b/net/core
xed msleep(250)
to get out of the loop faster.
Time with this patch on a 5.4 kernel:
real 0m8.199s
user 0m0.402s
sys 0m1.213s
Time without this patch:
real 0m31.522s
user 0m0.438s
sys 0m1.156s
v2: use exponential backoff instead of trying to wake up
netdev_wait_allrefs.
Signed
On Wed, Sep 16, 2020 at 11:51 PM Eric Dumazet wrote:
>
> Honestly I would not touch dev_put() at all.
>
> Simply change the msleep(250) to something better, with maybe
> exponential backoff.
OK, I will try that.
Francesco
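The suggestion above (replace the fixed msleep(250) with an exponential backoff) can be sketched in user space as follows. This is an illustration, not the kernel patch: the bounds WAIT_MIN_MSECS and WAIT_MAX_MSECS and all function names are assumptions made for the example, and nanosleep() stands in for msleep().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Assumed bounds for the sketch: start at 1 ms, never sleep more than 250 ms. */
#define WAIT_MIN_MSECS 1
#define WAIT_MAX_MSECS 250

/* Next sleep interval: the minimum on the first call, then double up to the cap. */
int next_wait_msecs(int wait)
{
    if (wait == 0)
        return WAIT_MIN_MSECS;
    wait *= 2;
    return wait > WAIT_MAX_MSECS ? WAIT_MAX_MSECS : wait;
}

/* Example condition: the flag pointed to by arg has been set. */
bool flag_is_set(void *arg)
{
    return *(bool *)arg;
}

/*
 * Poll done(arg), sleeping with exponentially growing intervals between
 * checks. Returns the total milliseconds slept, or -1 once limit_msecs
 * has been reached without the condition becoming true.
 */
int wait_with_backoff(bool (*done)(void *), void *arg, int limit_msecs)
{
    int wait = 0, slept = 0;

    assert(done != NULL);
    while (!done(arg)) {
        struct timespec ts;

        if (slept >= limit_msecs)
            return -1;
        wait = next_wait_msecs(wait);
        ts.tv_sec = wait / 1000;
        ts.tv_nsec = (wait % 1000) * 1000000L;
        nanosleep(&ts, NULL);
        slept += wait;
    }
    return slept;
}
```

With these bounds the sleeps run 1, 2, 4, ... 128, 250, 250 ms, so a condition that becomes true quickly is noticed within a few milliseconds, while a slow one is still polled no more often than the original fixed interval.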
> static inline void dev_put(struct net_device *dev)
> {
> + struct task_struct *destroy_task = dev->destroy_task;
> +
> this_cpu_dec(*dev->pcpu_refcnt);
> + if (destroy_task)
> + wake_up_process(destroy_task);
> }
I just realized that this introduces a race,
llowing dev_put to wake up netdev_wait_allrefs.
Time with this patch on a 5.4 kernel:
real 0m7.494s
user 0m0.403s
sys 0m1.197s
Time without this patch:
real 0m31.522s
user 0m0.438s
sys 0m1.156s
Signed-off-by: Francesco Ruggeri
---
include/linux/netdevice.h | 6 ++
net
Thanks for the replies and workaround suggestions!
Francesco
ing together a user buffer worth of events?
Thanks,
Francesco Ruggeri
[5752801.578813] watchdog: BUG: soft lockup - CPU#15 stuck for 22s! [fstrace:23105]
[5752801.586804] Modules linked in: ...
[5752801.586871] CPU: 15 PID: 23105 Comm: fstrace Tainted: GW O L 4.19.112-16802951.AroraKernel
driver"),
commit 8f4c5c9fb87a ("ixgbe: reinit_locked() should be called with
rtnl_lock") and commit 88adce4ea8f9 ("ixgbe: fix possible race in
reset subtask").
v2: add fix for second race condition above.
Signed-off-by: Francesco Ruggeri
---
drivers/net/ethernet/
>
> So will you be sending a v2 of your patch to include the second fix?
Yes, I am working on it. Just to confirm, v2 should include both fixes, right?
Thanks,
Francesco
> Do not worry about the other Intel drivers, I have our developers looking at
> each of our drivers for the locking issue.
>
> @David Miller - I am picking up this patch
There seems to be a second race, independent from the
original one, that results in a divide error:
kworker reboot
> Would you mind adding a fixes tag here? Probably:
>
> Fixes: 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver")
That seems to be the commit that introduced the driver in 2.6.25.
I am not familiar enough with the history of the driver to tell if this was a
day 1 problem or if it became an
_state
This commit applies to igb the same changes that were applied to ixgbe
in commit 8f4c5c9fb87a ("ixgbe: reinit_locked() should be called with
rtnl_lock") and commit 88adce4ea8f9 ("ixgbe: fix possible race in
reset subtask").
Signed-off-by: Francesco Ruggeri
---
driv
I see the following kernel panic in 4.19.47 as soon as I hit a kprobe.
In this case it happened right after
# cd /sys/kernel/debug/tracing/
# echo >trace
# echo 'p rollback_registered_many' >kprobe_events
# echo 1 >events/kprobes/enable
# ip netns add dummy
# ip netns del dummy
but I have also
$MNT; done
Signed-off-by: Francesco Ruggeri
---
fs/buffer.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index 48318fb74938..447e8db2ff5f 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -190,7 +190,8 @@ EXPORT_SYMBOL(end_buffer_wr
By default, an IPv6 socket with the IPV6_ROUTER_ALERT socket option set
receives all IPv6 RA packets from all namespaces.
The IPV6_ROUTER_ALERT_ISOLATE socket option restricts the packets received by
the socket to those from the socket's own namespace.
Signed-off-by: Maxim Martynov
Signed-off-by: Francesco
On Thu, Feb 28, 2019 at 3:31 PM David Ahern wrote:
>
> On 2/28/19 2:02 PM, David Miller wrote:
> > From: frugg...@arista.com (Francesco Ruggeri)
> > Date: Thu, 28 Feb 2019 11:01:46 -0800
> >
> >> ip6_call_ra_chain is called when IPv6 packet with Rout
it is in.
Suggested-by: Maxim Martynov
Signed-off-by: Maxim Martynov
Signed-off-by: Francesco Ruggeri
---
net/ipv6/ip6_output.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 5f9fa0302b5a..3ed25e17dff3 100644
--- a/net/ipv6
On Thu, Feb 21, 2019 at 11:27 AM Francesco Ruggeri wrote:
>
> __find_get_block_slow() and grow_buffers() use different methods to compute
> a page index for a given block: __find_get_block_slow() computes it from
> bd_inode->i_blkbits, while grow_buffers() computes it from
s=1MiB
losetup $LOOP $FILE
mkfs -t ext4 $LOOP
while true; do losetup -D $LOOP; losetup $LOOP $FILE;done 2>&1 >/dev/null &
for ((i=0; i<100; i++)); do echo == $i; \
mount $LOOP $MNT; umount $MNT; done
Signed-off-by: Francesco Ruggeri
---
fs/
Does anybody have any opinions about this?
Thanks,
Francesco Ruggeri
On Wed, Jan 31, 2018 at 9:16 AM, Francesco Ruggeri wrote:
> I had a few cases of mount getting stuck in an infinite loop.
> This happens when bdev->bd_inode->i_blkbits gets modified (for
> example by bd_se
((i=0; i<100; i++)); do echo == $i; \
mount $LOOP $MNT; umount $MNT; done
The issue is that __find_get_block_slow() and grow_buffers() compute
the page index in different ways.
I am not sure what the correct solution should be here.
Thanks,
Francesco Ruggeri
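The mismatch described above can be illustrated with a small user-space sketch. It is not the kernel code: the function names are hypothetical, a 4 KiB page is assumed, and the second method (deriving the shift from a block size passed by the caller) is an assumption about what the truncated description refers to.

```c
#include <assert.h>

#define PAGE_SHIFT 12                 /* assumption: 4 KiB pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Page index derived from the block-size bits stored in the inode. */
unsigned long index_from_blkbits(unsigned long long block, unsigned int blkbits)
{
    assert(blkbits <= PAGE_SHIFT);
    return (unsigned long)(block >> (PAGE_SHIFT - blkbits));
}

/* Page index derived from a block size in bytes supplied by the caller. */
unsigned long index_from_size(unsigned long long block, unsigned long size)
{
    unsigned int sizebits = 0;

    assert(size > 0 && size <= PAGE_SIZE);
    /* Count how many blocks of this size fit in a page, as a shift. */
    while ((size << sizebits) < PAGE_SIZE)
        sizebits++;
    return (unsigned long)(block >> sizebits);
}
```

While the two inputs agree (blkbits 10 and size 1024), both functions map block 8 to page index 2. If the inode's block-size bits change to 12 underneath a caller still using size 1024, the first function returns 8 and the second still returns 2, so a lookup and a subsequent grow never meet on the same page. That is one way a retry loop like the one in the report could spin indefinitely.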
On Fri, Aug 11, 2017 at 11:29 AM, Francesco Ruggeri wrote:
> I have run into this panic while some devices were being hotunplugged,
> and I am able to easily reproduce it with the attached module, which
> creates a dummy uio device (in my case /sys/class/uio/uio74).
> The panic i
ULL, and name_show() which dereferences it.
Seen in 4.9, 3.18 and 3.4.
Thanks,
Francesco Ruggeri
-bash-4.3# insmod dummydev.ko
-bash-4.3# cat /sys/class/uio/uio74/name
uio_dummydev
-bash-4.3# rmmod dummydev
-bash-4.3#
Then on different bash shells on different cpus I run
while true ;do ins
s /dev/tap*
for ((ns=1; ns<3; ns++))
do
ip netns del ns${ns}
done
Signed-off-by: Francesco Ruggeri
---
drivers/net/macvlan.c | 11 ++++++++---
drivers/net/macvtap.c | 2 ++
include/linux/if_macvlan.h | 3 +++
3 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/n
From ce9a4f202723f6ba1b18bc7c4a258c130c1f4148 Mon Sep 17 00:00:00 2001
From: Francesco Ruggeri
Date: Mon, 9 Mar 2015 11:51:04 -0700
Subject: [PATCH 1/1] net: delete stale packet_mclist entries
When an interface is deleted from a net namespace the ifindex in the
corresponding entr
VMWare's e1000 implementation does not seem to support unicast filtering.
This can be observed by configuring a macvlan interface on eth0 in a VM in
VMWare Fusion 5.0.5, and trying to use that interface instead of eth0.
Tested on 3.16.
Signed-off-by: Francesco Ruggeri
---
drivers/net/ethernet
The string from the slave is missed several times.
This patch takes the same approach as the fix for read and special cases
this condition for poll.
Tested on 3.16.
Signed-off-by: Francesco Ruggeri
---
drivers/tty/n_tty.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/dr
Hi Peter,
thanks for your reply.
>> poll_wait(file, &tty->read_wait, wait);
>> poll_wait(file, &tty->write_wait, wait);
>> - if (input_available_p(tty, 1))
>> - mask |= POLLIN | POLLRDNORM;
>> if (tty->packet && tty->link->ctrl_status)
>> mask |= POLLPRI | POLLIN |
From b99b1ec156ab351668eb77a00a57a4ae095eec28 Mon Sep 17 00:00:00 2001
From: Francesco Ruggeri
Date: Wed, 24 Sep 2014 10:12:41 -0700
Subject: [PATCH 1/1] pci: move PCI_VENDOR_ID_VMWARE to pci_ids.h
Moving PCI_VENDOR_ID_VMWARE from device specific files to pci_ids.h.
It is useful to always h
>>
> Look for callers of bus_find_device. Unless I am missing something, only pci
> and scsi code call it with non-NULL 'start' argument, and the scsi use is
> limited to a walk through scsi devices for a proc file.
>
> Makes me wonder if the start argument should go away, and if pci and scsi
> should use
In-Reply-To: <20140523023141.gc13...@kroah.com>
Hi Guenter,
I got back to looking into this crash.
Just as an example, the attached diffs also fix my bus_find_device problem for
traversals that start from the head of the list and traverse it completely.
They are very specific to the case of
at 12:22 AM, Guenter Roeck wrote:
> On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote:
>>
>> On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote:
>>>
>>> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote:
>>>>
>>>> I have been using an x86 platform.
>>>> When I started working
Hi Guenter,
thank you for your reply. I will check out the changes that you pointed to.
The problem we are seeing is a race condition between for_each_pci_dev
(or similar) and device_unregisters. I am not sure if use of the new
lock should be extended to all code using for_each_pci_dev as well.
.
Has anybody run into this before?
Thanks,
Francesco Ruggeri
[ cut here ]
WARNING: at /bld/EosKernel/Artools-rpmbuild/linux-3.4/include/linux/kref.h:41
klist_iter_init_node+0x30/0x38()
Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO)
macvlan
resuming a scan the caller should
be holding a
reference to the klist_node, but instead it relies on holding a
reference to the device.
I played with a couple of narrow fixes, but a clean solution would
affect quite a bit of code.
Has anybody run into this before?
Thanks,
Francesco Ruggeri
to sysctl_follow_link() fails.
This patch fixes this leak by making sure the reference is always dropped
on return.
See also 076c3eed2c31773200b082568957fd8852ae93d7 which reorganized this
code in 3.4.
Tested in Linux 3.4.4.
Signed-off-by: Francesco Ruggeri
Index: linux-3.4.x86_64/fs/proc/proc_sysctl.c
in that
d_set_d_op(dentry, &proc_sys_dentry_operations);
d_add(dentry, inode);
will now execute before the ctl_table_header is freed.
This should not be a problem, and it makes the code cleaner than just
adding an extra call to sysctl_head_finish(h);
Tested in Linux 3.4.4.
Signed-off-by: Francesco Ruggeri
Index: linux-3.4.x86_64/fs/proc/proc_sysctl.c