Re: [PATCH] sunrpc: fix crash when cache_head become valid before update

2019-10-11 Thread J . Bruce Fields
On Wed, Oct 09, 2019 at 09:51:23AM +1100, NeilBrown wrote:
> On Tue, Oct 08 2019,  J . Bruce Fields  wrote:
> 
> > On Tue, Oct 08, 2019 at 10:02:53AM +, Pavel Tikhomirov wrote:
> >> Add Neil to CC, sorry, had lost it somehow...
> >
> > Always happy when we can fix a bug by deleting code, and your
> > explanation makes sense to me, but I'll give Neil a chance to look it
> > over if he wants.
> 
> Yes, it makes sense to me.  But I'm not sure that is worth much.  The
> original fix got a Reviewed-by from me but was wrong.
>  Acked-by: NeilBrown 
> 
> 'Acked' is weaker than 'reviewed' - isn't it? :-)

Hah--"Self-deprecatingly-reviewed-by:"?

Anyway, applied thanks.

--b.


Re: [PATCH] sunrpc: fix crash when cache_head become valid before update

2019-10-08 Thread NeilBrown
On Tue, Oct 08 2019,  J . Bruce Fields  wrote:

> On Tue, Oct 08, 2019 at 10:02:53AM +, Pavel Tikhomirov wrote:
>> Add Neil to CC, sorry, had lost it somehow...
>
> Always happy when we can fix a bug by deleting code, and your
> explanation makes sense to me, but I'll give Neil a chance to look it
> over if he wants.

Yes, it makes sense to me.  But I'm not sure that is worth much.  The
original fix got a Reviewed-by from me but was wrong.
 Acked-by: NeilBrown 

'Acked' is weaker than 'reviewed' - isn't it? :-)

NeilBrown


signature.asc
Description: PGP signature


Re: [PATCH] sunrpc: fix crash when cache_head become valid before update

2019-10-08 Thread J . Bruce Fields
On Tue, Oct 08, 2019 at 10:02:53AM +, Pavel Tikhomirov wrote:
> Add Neil to CC, sorry, had lost it somehow...

Always happy when we can fix a bug by deleting code, and your
explanation makes sense to me, but I'll give Neil a chance to look it
over if he wants.

--b.

> 
> On 10/1/19 11:03 AM, Pavel Tikhomirov wrote:
> > I was investigating a crash in our Virtuozzo7 kernel which happened in
> > in svcauth_unix_set_client. I found out that we access m_client field
> > in ip_map structure, which was received from sunrpc_cache_lookup (we
> > have a bit older kernel, now the code is in sunrpc_cache_add_entry), and
> > these field looks uninitialized (m_client == 0x74 don't look like a
> > pointer) but in the cache_head in flags we see 0x1 which is CACHE_VALID.
> > 
> > It looks like the problem appeared from our previous fix to sunrpc (1):
> > commit 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued
> > request")
> > 
> > And we've also found a patch already fixing our patch (2):
> > commit d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.")
> > 
> > Though the crash is eliminated, I think the core of the problem is not
> > completely fixed:
> > 
> > Neil in the patch (2) makes cache_head CACHE_NEGATIVE, before
> > cache_fresh_locked which was added in (1) to fix crash. These way
> > cache_is_valid won't say the cache is valid anymore and in
> > svcauth_unix_set_client the function cache_check will return error
> > instead of 0, and we don't count entry as initialized.
> > 
> > But it looks like we need to remove cache_fresh_locked completely in
> > sunrpc_cache_lookup:
> > 
> > In (1) we've only wanted to make cache_fresh_unlocked->cache_dequeue so
> > that cache_requests with no readers also release corresponding
> > cache_head, to fix their leak.  We with Vasily were not sure if
> > cache_fresh_locked and cache_fresh_unlocked should be used in pair or
> > not, so we've guessed to use them in pair.
> > 
> > Now we see that we don't want the CACHE_VALID bit set here by
> > cache_fresh_locked, as "valid" means "initialized" and there is no
> > initialization in sunrpc_cache_add_entry. Both expiry_time and
> > last_refresh are not used in cache_fresh_unlocked code-path and also not
> > required for the initial fix.
> > 
> > So to conclude cache_fresh_locked was called by mistake, and we can just
> > safely remove it instead of crutching it with CACHE_NEGATIVE. It looks
> > ideologically better for me. Hope I don't miss something here.
> > 
> > Here is our crash backtrace:
> > [13108726.326291] BUG: unable to handle kernel NULL pointer dereference at 
> > 0074
> > [13108726.326365] IP: [] 
> > svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
> > [13108726.326448] PGD 0
> > [13108726.326468] Oops: 0002 [#1] SMP
> > [13108726.326497] Modules linked in: nbd isofs xfs loop 
> > kpatch_cumulative_81_0_r1(O) xt_physdev nfnetlink_queue bluetooth rfkill 
> > ip6table_nat nf_nat_ipv6 ip_vs_wrr ip_vs_wlc ip_vs_sh nf_conntrack_netlink 
> > ip_vs_sed ip_vs_pe_sip nf_conntrack_sip ip_vs_nq ip_vs_lc ip_vs_lblcr 
> > ip_vs_lblc ip_vs_ftp ip_vs_dh nf_nat_ftp nf_conntrack_ftp iptable_raw 
> > xt_recent nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG 
> > xt_limit xt_TCPMSS xt_tcpmss vxlan ip6_udp_tunnel udp_tunnel xt_statistic 
> > xt_NFLOG nfnetlink_log dummy xt_mark xt_REDIRECT nf_nat_redirect raw_diag 
> > udp_diag tcp_diag inet_diag netlink_diag af_packet_diag unix_diag 
> > rpcsec_gss_krb5 xt_addrtype ip6t_rpfilter ipt_REJECT nf_reject_ipv4 
> > ip6t_REJECT nf_reject_ipv6 ebtable_nat ebtable_broute nf_conntrack_ipv6 
> > nf_defrag_ipv6 ip6table_mangle ip6table_raw nfsv4
> > [13108726.327173]  dns_resolver cls_u32 binfmt_misc arptable_filter 
> > arp_tables ip6table_filter ip6_tables devlink fuse_kio_pcs ipt_MASQUERADE 
> > nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 xt_comment 
> > nf_conntrack_ipv4 nf_defrag_ipv4 xt_wdog_tmo xt_multiport bonding xt_set 
> > xt_conntrack iptable_filter iptable_mangle kpatch(O) ebtable_filter 
> > ebt_among ebtables ip_set_hash_ip ip_set nfnetlink vfat fat skx_edac 
> > intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass fuse 
> > pcspkr ses enclosure joydev sg mei_me hpwdt hpilo lpc_ich mei ipmi_si 
> > shpchp ipmi_devintf ipmi_msghandler xt_ipvs acpi_power_meter ip_vs_rr nfsv3 
> > nfsd auth_rpcgss nfs_acl nfs lockd grace fscache nf_nat cls_fw sch_htb 
> > sch_cbq sch_sfq ip_vs em_u32 nf_conntrack tun br_netfilter veth overlay 
> > ip6_vzprivnet ip6_vznetstat ip_vznetstat
> > [13108726.327817]  ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat 
> > vznetdev vzmon vzdev bridge pio_kaio pio_nfs pio_direct pfmt_raw 
> > pfmt_ploop1 ploop ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif 
> > crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper scsi_transport_iscsi 
> > 8021q syscopyarea sysfillrect garp sysimgblt fb_sys_fops mrp stp ttm llc 
> > bnx2x crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel drm 
> > 

Re: [PATCH] sunrpc: fix crash when cache_head become valid before update

2019-10-08 Thread Pavel Tikhomirov
Add Neil to CC, sorry, had lost it somehow...

On 10/1/19 11:03 AM, Pavel Tikhomirov wrote:
> I was investigating a crash in our Virtuozzo7 kernel which happened in
> in svcauth_unix_set_client. I found out that we access m_client field
> in ip_map structure, which was received from sunrpc_cache_lookup (we
> have a bit older kernel, now the code is in sunrpc_cache_add_entry), and
> these field looks uninitialized (m_client == 0x74 don't look like a
> pointer) but in the cache_head in flags we see 0x1 which is CACHE_VALID.
> 
> It looks like the problem appeared from our previous fix to sunrpc (1):
> commit 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued
> request")
> 
> And we've also found a patch already fixing our patch (2):
> commit d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.")
> 
> Though the crash is eliminated, I think the core of the problem is not
> completely fixed:
> 
> Neil in the patch (2) makes cache_head CACHE_NEGATIVE, before
> cache_fresh_locked which was added in (1) to fix crash. These way
> cache_is_valid won't say the cache is valid anymore and in
> svcauth_unix_set_client the function cache_check will return error
> instead of 0, and we don't count entry as initialized.
> 
> But it looks like we need to remove cache_fresh_locked completely in
> sunrpc_cache_lookup:
> 
> In (1) we've only wanted to make cache_fresh_unlocked->cache_dequeue so
> that cache_requests with no readers also release corresponding
> cache_head, to fix their leak.  We with Vasily were not sure if
> cache_fresh_locked and cache_fresh_unlocked should be used in pair or
> not, so we've guessed to use them in pair.
> 
> Now we see that we don't want the CACHE_VALID bit set here by
> cache_fresh_locked, as "valid" means "initialized" and there is no
> initialization in sunrpc_cache_add_entry. Both expiry_time and
> last_refresh are not used in cache_fresh_unlocked code-path and also not
> required for the initial fix.
> 
> So to conclude cache_fresh_locked was called by mistake, and we can just
> safely remove it instead of crutching it with CACHE_NEGATIVE. It looks
> ideologically better for me. Hope I don't miss something here.
> 
> Here is our crash backtrace:
> [13108726.326291] BUG: unable to handle kernel NULL pointer dereference at 
> 0074
> [13108726.326365] IP: [] 
> svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
> [13108726.326448] PGD 0
> [13108726.326468] Oops: 0002 [#1] SMP
> [13108726.326497] Modules linked in: nbd isofs xfs loop 
> kpatch_cumulative_81_0_r1(O) xt_physdev nfnetlink_queue bluetooth rfkill 
> ip6table_nat nf_nat_ipv6 ip_vs_wrr ip_vs_wlc ip_vs_sh nf_conntrack_netlink 
> ip_vs_sed ip_vs_pe_sip nf_conntrack_sip ip_vs_nq ip_vs_lc ip_vs_lblcr 
> ip_vs_lblc ip_vs_ftp ip_vs_dh nf_nat_ftp nf_conntrack_ftp iptable_raw 
> xt_recent nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG xt_limit 
> xt_TCPMSS xt_tcpmss vxlan ip6_udp_tunnel udp_tunnel xt_statistic xt_NFLOG 
> nfnetlink_log dummy xt_mark xt_REDIRECT nf_nat_redirect raw_diag udp_diag 
> tcp_diag inet_diag netlink_diag af_packet_diag unix_diag rpcsec_gss_krb5 
> xt_addrtype ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT 
> nf_reject_ipv6 ebtable_nat ebtable_broute nf_conntrack_ipv6 nf_defrag_ipv6 
> ip6table_mangle ip6table_raw nfsv4
> [13108726.327173]  dns_resolver cls_u32 binfmt_misc arptable_filter 
> arp_tables ip6table_filter ip6_tables devlink fuse_kio_pcs ipt_MASQUERADE 
> nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 xt_comment 
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_wdog_tmo xt_multiport bonding xt_set 
> xt_conntrack iptable_filter iptable_mangle kpatch(O) ebtable_filter ebt_among 
> ebtables ip_set_hash_ip ip_set nfnetlink vfat fat skx_edac intel_powerclamp 
> coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass fuse pcspkr ses 
> enclosure joydev sg mei_me hpwdt hpilo lpc_ich mei ipmi_si shpchp 
> ipmi_devintf ipmi_msghandler xt_ipvs acpi_power_meter ip_vs_rr nfsv3 nfsd 
> auth_rpcgss nfs_acl nfs lockd grace fscache nf_nat cls_fw sch_htb sch_cbq 
> sch_sfq ip_vs em_u32 nf_conntrack tun br_netfilter veth overlay ip6_vzprivnet 
> ip6_vznetstat ip_vznetstat
> [13108726.327817]  ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat 
> vznetdev vzmon vzdev bridge pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 
> ploop ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 
> i2c_algo_bit drm_kms_helper scsi_transport_iscsi 8021q syscopyarea 
> sysfillrect garp sysimgblt fb_sys_fops mrp stp ttm llc bnx2x crct10dif_pclmul 
> crct10dif_common crc32_pclmul crc32c_intel drm dm_multipath 
> ghash_clmulni_intel uas aesni_intel lrw gf128mul glue_helper ablk_helper 
> cryptd tg3 smartpqi scsi_transport_sas mdio libcrc32c i2c_core usb_storage 
> ptp pps_core wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last 
> unloaded: kpatch_cumulative_82_0_r1]
> [13108726.328403] CPU: 35 PID: 63742 Comm: nfsd ve: 51332 Kdump: loaded 
> Tainted: GW  O    

[PATCH] sunrpc: fix crash when cache_head become valid before update

2019-10-01 Thread Pavel Tikhomirov
I was investigating a crash in our Virtuozzo7 kernel which happened in
in svcauth_unix_set_client. I found out that we access m_client field
in ip_map structure, which was received from sunrpc_cache_lookup (we
have a bit older kernel, now the code is in sunrpc_cache_add_entry), and
these field looks uninitialized (m_client == 0x74 don't look like a
pointer) but in the cache_head in flags we see 0x1 which is CACHE_VALID.

It looks like the problem appeared from our previous fix to sunrpc (1):
commit 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued
request")

And we've also found a patch already fixing our patch (2):
commit d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.")

Though the crash is eliminated, I think the core of the problem is not
completely fixed:

Neil in the patch (2) makes cache_head CACHE_NEGATIVE, before
cache_fresh_locked which was added in (1) to fix crash. These way
cache_is_valid won't say the cache is valid anymore and in
svcauth_unix_set_client the function cache_check will return error
instead of 0, and we don't count entry as initialized.

But it looks like we need to remove cache_fresh_locked completely in
sunrpc_cache_lookup:

In (1) we've only wanted to make cache_fresh_unlocked->cache_dequeue so
that cache_requests with no readers also release corresponding
cache_head, to fix their leak.  We with Vasily were not sure if
cache_fresh_locked and cache_fresh_unlocked should be used in pair or
not, so we've guessed to use them in pair.

Now we see that we don't want the CACHE_VALID bit set here by
cache_fresh_locked, as "valid" means "initialized" and there is no
initialization in sunrpc_cache_add_entry. Both expiry_time and
last_refresh are not used in cache_fresh_unlocked code-path and also not
required for the initial fix.

So to conclude cache_fresh_locked was called by mistake, and we can just
safely remove it instead of crutching it with CACHE_NEGATIVE. It looks
ideologically better for me. Hope I don't miss something here.

Here is our crash backtrace:
[13108726.326291] BUG: unable to handle kernel NULL pointer dereference at 
0074
[13108726.326365] IP: [] svcauth_unix_set_client+0x2ab/0x520 
[sunrpc]
[13108726.326448] PGD 0
[13108726.326468] Oops: 0002 [#1] SMP
[13108726.326497] Modules linked in: nbd isofs xfs loop 
kpatch_cumulative_81_0_r1(O) xt_physdev nfnetlink_queue bluetooth rfkill 
ip6table_nat nf_nat_ipv6 ip_vs_wrr ip_vs_wlc ip_vs_sh nf_conntrack_netlink 
ip_vs_sed ip_vs_pe_sip nf_conntrack_sip ip_vs_nq ip_vs_lc ip_vs_lblcr 
ip_vs_lblc ip_vs_ftp ip_vs_dh nf_nat_ftp nf_conntrack_ftp iptable_raw xt_recent 
nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_TCPMSS 
xt_tcpmss vxlan ip6_udp_tunnel udp_tunnel xt_statistic xt_NFLOG nfnetlink_log 
dummy xt_mark xt_REDIRECT nf_nat_redirect raw_diag udp_diag tcp_diag inet_diag 
netlink_diag af_packet_diag unix_diag rpcsec_gss_krb5 xt_addrtype ip6t_rpfilter 
ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 ebtable_nat ebtable_broute 
nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle ip6table_raw nfsv4
[13108726.327173]  dns_resolver cls_u32 binfmt_misc arptable_filter arp_tables 
ip6table_filter ip6_tables devlink fuse_kio_pcs ipt_MASQUERADE 
nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 xt_comment 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_wdog_tmo xt_multiport bonding xt_set 
xt_conntrack iptable_filter iptable_mangle kpatch(O) ebtable_filter ebt_among 
ebtables ip_set_hash_ip ip_set nfnetlink vfat fat skx_edac intel_powerclamp 
coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass fuse pcspkr ses enclosure 
joydev sg mei_me hpwdt hpilo lpc_ich mei ipmi_si shpchp ipmi_devintf 
ipmi_msghandler xt_ipvs acpi_power_meter ip_vs_rr nfsv3 nfsd auth_rpcgss 
nfs_acl nfs lockd grace fscache nf_nat cls_fw sch_htb sch_cbq sch_sfq ip_vs 
em_u32 nf_conntrack tun br_netfilter veth overlay ip6_vzprivnet ip6_vznetstat 
ip_vznetstat
[13108726.327817]  ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat 
vznetdev vzmon vzdev bridge pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 
ploop ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 
i2c_algo_bit drm_kms_helper scsi_transport_iscsi 8021q syscopyarea sysfillrect 
garp sysimgblt fb_sys_fops mrp stp ttm llc bnx2x crct10dif_pclmul 
crct10dif_common crc32_pclmul crc32c_intel drm dm_multipath ghash_clmulni_intel 
uas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd tg3 smartpqi 
scsi_transport_sas mdio libcrc32c i2c_core usb_storage ptp pps_core wmi sunrpc 
dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 
kpatch_cumulative_82_0_r1]
[13108726.328403] CPU: 35 PID: 63742 Comm: nfsd ve: 51332 Kdump: loaded 
Tainted: GW  O      3.10.0-862.20.2.vz7.73.29 #1 73.29
[13108726.328491] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, 
BIOS U32 10/02/2018
[13108726.328554] task: a0a6a41b1160 ti: a0c2a74bc000 task.ti: 
a0c2a74bc000
[13108726.328610] RIP: 0010:[]  []