The following commit has been merged in the master branch:
commit 8f7aa3d3c7323f4ca2768a9e74ebbe359c4f8f88
Merge: 015e7b0b0e8e51f7321ec2aafc1d7fc0a8a5536f 4de44542991ed4cb8c9fb2ccd766d6e6015101b0
Author: Linus Torvalds <[email protected]>
Date: Wed Dec 3 17:24:33 2025 -0800
Merge tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Replace busylock at the Tx queuing layer with a lockless list,
resulting in a 300% (4x) improvement on heavy Tx workloads: twice
the number of packets per second, for half the CPU cycles.
- Allow constantly busy flows to migrate to a more suitable CPU/NIC
queue.
Normally we perform queue re-selection when a flow comes out of idle,
but under extreme circumstances the flows may be constantly busy.
Add sysctl to allow periodic rehashing even if it'd risk packet
reordering.
- Optimize the NAPI skb cache, make it larger, use it in more paths.
- Attempt returning Tx skbs to the originating CPU (like we already
did for Rx skbs).
- Various data structure layout and prefetch optimizations from Eric.
- Remove ktime_get() from the recvmsg() fast path, ktime_get() is
sadly quite expensive on recent AMD machines.
- Extend threaded NAPI polling to allow the kthread busy poll for
packets.
- Make MPTCP use Rx backlog processing. This lowers the lock
pressure, improving the Rx performance.
- Support memcg accounting of MPTCP socket memory.
- Allow admin to opt sockets out of global protocol memory accounting
(using a sysctl or BPF-based policy). The global limits are a poor
fit for modern container workloads, where limits are imposed using
cgroups.
- Improve heuristics for when to kick off AF_UNIX garbage collection.
- Allow users to control TCP SACK compression, and default to 33% of
RTT.
- Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid
unnecessarily aggressive rcvbuf growth and overshoot when the
connection RTT is low.
- Preserve skb metadata space across skb_push / skb_pull operations.
- Support for IPIP encapsulation in the nftables flowtable offload.
- Support appending IP interface information to ICMP messages (RFC
5837).
- Support setting max record size in TLS (RFC 8449).
- Stop taking rtnl_lock for RTM_GETNEIGHTBL and RTM_SETNEIGHTBL.
- Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock.
- Let users configure the number of write buffers in SMC.
- Add new struct sockaddr_unsized for sockaddr of unknown length,
from Kees.
- Some conversions away from the crypto_ahash API, from Eric Biggers.
- Some preparations for slimming down struct page.
- YAML Netlink protocol spec for WireGuard.
- Add a tool on top of YAML Netlink specs/lib for reporting commonly
computed derived statistics and summarized system state.
Driver API:
- Add CAN XL support to the CAN Netlink interface.
- Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as
defined by the OPEN Alliance's "Advanced diagnostic features for
100BASE-T1 automotive Ethernet PHYs" specification.
- Add DPLL phase-adjust-gran pin attribute (and implement it in
zl3073x).
- Refactor xfrm_input lock to reduce contention when NIC offloads
IPsec and performs RSS.
- Add info to devlink params whether the current setting is the
default or a user override. Allow resetting back to default.
- Add standard device stats for PSP crypto offload.
- Leverage DSA frame broadcast to implement simple HSR frame
duplication for a lot of switches without dedicated HSR offload.
- Add uAPI defines for 1.6Tbps link modes.
Device drivers:
- Add Motorcomm YT921x gigabit Ethernet switch support.
- Add MUCSE driver for N500/N210 1GbE NIC series.
- Convert drivers to dedicated ops for timestamping control, away
from direct IOCTL handling. While at it, support GET operations
for PHY timestamping.
- Add (and convert most drivers to) a dedicated ethtool callback for
reading the Rx ring count.
- Significant refactoring efforts in the STMMAC driver, which
supports Synopsys turn-key MAC IP integrated into a ton of SoCs.
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support PPS in/out on all pins
- Intel (100G, ice, idpf):
- ice: implement standard ethtool and timestamping stats
- i40e: support setting the max number of MAC addresses per VF
- iavf: support RSS of GTP tunnels for 5G and LTE deployments
- nVidia/Mellanox (mlx5):
- reduce downtime on interface reconfiguration
- disable being an XDP redirect target by default (same as
other drivers) to avoid wasting resources if the feature is
unused
- Meta (fbnic):
- add support for Linux-managed PCS on 25G, 50G, and 100G links
- Wangxun:
- support Rx descriptor merge, and Tx head writeback
- support Rx coalescing offload
- support 25G SFP and 40G QSFP modules
- Ethernet virtual:
- Google (gve):
- allow ethtool to configure rx_buf_len
- implement XDP HW RX Timestamping support for DQ descriptor
format
- Microsoft vNIC (mana):
- support HW link state events
- handle hardware recovery events when probing the device
- Ethernet NICs consumer, and embedded:
- usbnet: add support for Byte Queue Limits (BQL)
- AMD (amd-xgbe):
- add device selftests
- NXP (enetc):
- add i.MX94 support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- bcmasp: add support for PHY-based Wake-on-LAN
- Broadcom switches (b53):
- support port isolation
- support BCM5389/97/98 and BCM63XX ARL formats
- Lantiq/MaxLinear switches:
- support bridge FDB entries on the CPU port
- use regmap for register access
- allow user to enable/disable learning
- support Energy Efficient Ethernet
- support configuring RMII clock delays
- add tagging driver for MaxLinear GSW1xx switches
- Synopsys (stmmac):
- support using the HW clock in free running mode
- add Eswin EIC7700 support
- add Rockchip RK3506 support
- add Altera Agilex5 support
- Cadence (macb):
- cleanup and consolidate descriptor and DMA address handling
- add EyeQ5 support
- TI:
- icssg-prueth: support AF_XDP
- Airoha access points:
- add missing Ethernet stats and link state callback
- add AN7583 support
- support out-of-order Tx completion processing
- Power over Ethernet:
- pd692x0: preserve PSE configuration across reboots
- add support for TPS23881B devices
- Ethernet PHYs:
- Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support
- Support 50G SerDes and 100G interfaces in Linux-managed PHYs
- micrel:
- support for non-PTP SKUs of lan8814
- enable in-band auto-negotiation on lan8814
- realtek:
- cable testing support on RTL8224
- interrupt support on RTL8221B
- motorcomm: support for PHY LEDs on YT853
- microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag
- mscc: support for PHY LED control
- CAN drivers:
- m_can: add support for optional reset and system wake up
- remove can_change_mtu() obsoleted by core handling
- mcp251xfd: support GPIO controller functionality
- Bluetooth:
- add initial support for PASTa
- WiFi:
- split ieee80211.h file, it's way too big
- improvements in VHT radiotap reporting, S1G, Channel Switch
Announcement handling, rate tracking in mesh networks
- improve multi-radio monitor mode support, and add a cfg80211
debugfs interface for it
- HT action frame handling on 6 GHz
- initial chanctx work towards NAN
- MU-MIMO sniffer improvements
- WiFi drivers:
- RealTek (rtw89):
- support USB devices RTL8852AU and RTL8852CU
- initial work for RTL8922DE
- improved injection support
- Intel:
- iwlwifi: new sniffer API support
- MediaTek (mt76):
- WED support for >32-bit DMA
- airoha NPU support
- regdomain improvements
- continued WiFi7/MLO work
- Qualcomm/Atheros:
- ath10k: factory test support
- ath11k: TX power insertion support
- ath12k: BSS color change support
- ath12k: statistics improvements
- brcmfmac: Acer A1 840 tablet quirk
- rtl8xxxu: 40 MHz connection fixes/support"
* tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1381 commits)
net: page_pool: sanitise allocation order
net: page pool: xa init with destroy on pp init
net/mlx5e: Support XDP target xmit with dummy program
net/mlx5e: Update XDP features in switch channels
selftests/tc-testing: Test CAKE scheduler when enqueue drops packets
net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
wireguard: netlink: generate netlink code
wireguard: uapi: generate header with ynl-gen
wireguard: uapi: move flag enums
wireguard: uapi: move enum wg_cmd
wireguard: netlink: add YNL specification
selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py
selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py
selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py
selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py
selftests: drv-net: introduce Iperf3Runner for measurement use cases
selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS
net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive()
Documentation: net: dsa: mention simple HSR offload helpers
Documentation: net: dsa: mention availability of RedBox
...
diff --combined Documentation/devicetree/bindings/vendor-prefixes.yaml
index 647746e6f75f6,424aa7b911a77..beab81e462d23
--- a/Documentation/devicetree/bindings/vendor-prefixes.yaml
+++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml
@@@ -20,7 -20,7 +20,7 @@@ patternProperties
"^(keypad|m25p|max8952|max8997|max8998|mpmc),.*": true
"^(pciclass|pinctrl-single|#pinctrl-single|PowerPC),.*": true
"^(pl022|pxa-mmc|rcar_sound|rotary-encoder|s5m8767|sdhci),.*": true
- "^(simple-audio-card|st-plgpio|st-spics|ts),.*": true
+ "^(simple-audio-card|st-plgpio|st-spics|ts|vsc8531),.*": true
"^pool[0-3],.*": true
# Keep list in alphabetical order.
@@@ -1705,8 -1705,6 +1705,8 @@@
description: Universal Scientific Industrial Co., Ltd.
"^usr,.*":
description: U.S. Robotics Corporation
+ "^ultrarisc,.*":
+ description: UltraRISC Technology Co., Ltd.
"^ultratronik,.*":
description: Ultratronik GmbH
"^utoo,.*":
diff --combined MAINTAINERS
index 3b1d3af83f135,9c0d30217516a..e36689cd7cc7b
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@@ -3301,6 -3301,7 +3301,7 @@@ F: drivers/*/*/*rockchip
F: drivers/*/*rockchip*
F: drivers/clk/rockchip/
F: drivers/i2c/busses/i2c-rk3x.c
+ F: drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
F: sound/soc/rockchip/
N: rockchip
@@@ -4654,7 -4655,6 +4655,7 @@@ F: Documentation/userspace-api/ebpf
F: arch/*/net/*
F: include/linux/bpf*
F: include/linux/btf*
+F: include/linux/buildid.h
F: include/linux/filter.h
F: include/trace/events/xdp.h
F: include/uapi/linux/bpf*
@@@ -5132,7 -5132,6 +5133,6 @@@ F: Documentation/devicetree/bindings/ne
F: drivers/net/ethernet/broadcom/genet/
F: drivers/net/ethernet/broadcom/unimac.h
F: drivers/net/mdio/mdio-bcm-unimac.c
- F: include/linux/platform_data/bcmgenet.h
F: include/linux/platform_data/mdio-bcm-unimac.h
BROADCOM IPROC ARM ARCHITECTURE
@@@ -6614,6 -6613,7 +6614,6 @@@ CRYPTOGRAPHIC RANDOM NUMBER GENERATO
M: Neil Horman <[email protected]>
L: [email protected]
S: Maintained
-F: crypto/ansi_cprng.c
F: crypto/rng.c
CS3308 MEDIA DRIVER
@@@ -7419,10 -7419,16 +7419,10 @@@ S: Maintaine
P: Documentation/doc-guide/maintainer-profile.rst
T: git git://git.lwn.net/linux.git docs-next
F: Documentation/
-F: scripts/check-variable-fonts.sh
-F: scripts/checktransupdate.py
-F: scripts/documentation-file-ref-check
-F: scripts/get_abi.py
F: scripts/kernel-doc*
-F: scripts/lib/abi/*
-F: scripts/lib/kdoc/*
-F: tools/docs/*
+F: tools/lib/python/*
+F: tools/docs/
F: tools/net/ynl/pyynl/lib/doc_generator.py
-F: scripts/sphinx-pre-install
X: Documentation/ABI/
X: Documentation/admin-guide/media/
X: Documentation/devicetree/
@@@ -7455,10 -7461,9 +7455,10 @@@ DOCUMENTATION SCRIPT
M: Mauro Carvalho Chehab <[email protected]>
L: [email protected]
S: Maintained
-F: Documentation/sphinx/parse-headers.pl
-F: scripts/documentation-file-ref-check
-F: scripts/sphinx-pre-install
+F: Documentation/sphinx/
+F: scripts/kernel-doc*
+F: tools/lib/python/*
+F: tools/docs/
DOCUMENTATION/ITALIAN
M: Federico Vaga <[email protected]>
@@@ -9183,9 -9188,6 +9183,9 @@@ S: Maintaine
F: kernel/power/energy_model.c
F: include/linux/energy_model.h
F: Documentation/power/energy-model.rst
+F: Documentation/netlink/specs/em.yaml
+F: include/uapi/linux/energy_model.h
+F: kernel/power/em_netlink*.*
EPAPR HYPERVISOR BYTE CHANNEL DEVICE DRIVER
M: Laurentiu Tudor <[email protected]>
@@@ -12568,7 -12570,6 +12568,7 @@@ F: drivers/dma/ioat
INTEL IAA CRYPTO DRIVER
M: Kristen Accardi <[email protected]>
M: Vinicius Costa Gomes <[email protected]>
+M: Kanchana P Sridhar <[email protected]>
L: [email protected]
S: Supported
F: Documentation/driver-api/crypto/iaa/iaa-crypto.rst
@@@ -13499,7 -13500,7 +13499,7 @@@ F: fs/autofs
KERNEL BUILD + files below scripts/ (unless maintained elsewhere)
M: Nathan Chancellor <[email protected]>
-M: Nicolas Schier <[email protected]>
+M: Nicolas Schier <[email protected]>
L: [email protected]
S: Odd Fixes
Q: https://patchwork.kernel.org/project/linux-kbuild/list/
@@@ -14054,7 -14055,7 +14054,7 @@@ F: tools/testing/selftests/landlock
K: landlock
K: LANDLOCK
- LANTIQ / INTEL Ethernet drivers
+ LANTIQ / MAXLINEAR / INTEL Ethernet DSA drivers
M: Hauke Mehrtens <[email protected]>
L: [email protected]
S: Maintained
@@@ -14062,6 -14063,7 +14062,7 @@@ F: Documentation/devicetree/bindings/ne
F: drivers/net/dsa/lantiq/*
F: drivers/net/ethernet/lantiq_xrx200.c
F: net/dsa/tag_gswip.c
+ F: net/dsa/tag_mxl-gsw1xx.c
LANTIQ MIPS ARCHITECTURE
M: John Crispin <[email protected]>
T: git git://git.kernel.org/pub/scm/lin
F: Documentation/ABI/testing/sysfs-kernel-livepatch
F: Documentation/livepatch/
F: arch/powerpc/include/asm/livepatch.h
-F: include/linux/livepatch.h
+F: include/linux/livepatch*.h
F: kernel/livepatch/
F: kernel/module/livepatch.c
F: samples/livepatch/
+F: scripts/livepatch/
F: tools/testing/selftests/livepatch/
LLC (802.2)
@@@ -14536,7 -14537,6 +14537,7 @@@ S: Maintaine
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/core
F: Documentation/locking/
F: arch/*/include/asm/spinlock*.h
+F: include/linux/local_lock*.h
F: include/linux/lockdep*.h
F: include/linux/mutex*.h
F: include/linux/rwlock*.h
@@@ -15413,14 -15413,12 +15414,12 @@@ S: Supporte
F: drivers/net/phy/mxl-86110.c
F: drivers/net/phy/mxl-gpy.c
- MCAN MMIO DEVICE DRIVER
- M: Chandrasekar Ramakrishnan <[email protected]>
+ MCAN DEVICE DRIVER
+ M: Markus Schneider-Pargmann <[email protected]>
L: [email protected]
S: Maintained
F: Documentation/devicetree/bindings/net/can/bosch,m_can.yaml
- F: drivers/net/can/m_can/m_can.c
- F: drivers/net/can/m_can/m_can.h
- F: drivers/net/can/m_can/m_can_platform.c
+ F: drivers/net/can/m_can/
MCBA MICROCHIP CAN BUS ANALYZER TOOL DRIVER
R: Yasushi SHOJI <[email protected]>
@@@ -17456,6 -17454,14 +17455,14 @@@ S: Maintaine
F: Documentation/devicetree/bindings/net/motorcomm,yt8xxx.yaml
F: drivers/net/phy/motorcomm.c
+ MOTORCOMM YT921X ETHERNET SWITCH DRIVER
+ M: David Yang <[email protected]>
+ L: [email protected]
+ S: Maintained
+ F: Documentation/devicetree/bindings/net/dsa/motorcomm,yt921x.yaml
+ F: drivers/net/dsa/yt921x.*
+ F: net/dsa/tag_yt921x.c
+
MOXA SMARTIO/INDUSTIO/INTELLIO SERIAL CARD
M: Jiri Slaby <[email protected]>
S: Maintained
@@@ -17469,16 -17475,6 +17476,16 @@@ S: Maintaine
F: Documentation/devicetree/bindings/leds/backlight/mps,mp3309c.yaml
F: drivers/video/backlight/mp3309c.c
+MPAM DRIVER
+M: James Morse <[email protected]>
+M: Ben Horgan <[email protected]>
+R: Reinette Chatre <[email protected]>
+R: Fenghua Yu <[email protected]>
+S: Maintained
+F: drivers/resctrl/mpam_*
+F: drivers/resctrl/test_mpam_*
+F: include/linux/arm_mpam.h
+
MPS MP2869 DRIVER
M: Wensheng Wang <[email protected]>
L: [email protected]
@@@ -17621,6 -17617,14 +17628,14 @@@ T: git git://linuxtv.org/media.gi
F: Documentation/devicetree/bindings/media/i2c/aptina,mt9v111.yaml
F: drivers/media/i2c/mt9v111.c
+ MUCSE ETHERNET DRIVER
+ M: Yibo Dong <[email protected]>
+ L: [email protected]
+ S: Maintained
+ W: https://www.mucse.com/en/
+ F: Documentation/networking/device_drivers/ethernet/mucse/
+ F: drivers/net/ethernet/mucse/
+
MULTIFUNCTION DEVICES (MFD)
M: Lee Jones <[email protected]>
S: Maintained
@@@ -18062,6 -18066,7 +18077,7 @@@ L: [email protected]
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next.git
+ F: Documentation/networking/xfrm/
F: include/net/xfrm.h
F: include/uapi/linux/xfrm.h
F: net/ipv4/ah4.c
@@@ -20609,7 -20614,6 +20625,7 @@@ R: John Ogness <john.ogness@linutronix.
R: Sergey Senozhatsky <[email protected]>
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git
+F: Documentation/core-api/printk-basics.rst
F: include/linux/printk.h
F: kernel/printk/
@@@ -21067,6 -21071,7 +21083,7 @@@ F: Documentation/devicetree/bindings/ne
F: drivers/net/wwan/qcom_bam_dmux.c
QUALCOMM BLUETOOTH DRIVER
+ M: Bartosz Golaszewski <[email protected]>
L: [email protected]
S: Maintained
F: drivers/bluetooth/btqca.[ch]
@@@ -21691,11 -21696,6 +21708,11 @@@ S: Maintaine
F: Documentation/devicetree/bindings/spi/realtek,rtl9301-snand.yaml
F: drivers/spi/spi-realtek-rtl-snand.c
+REALTEK SYSTIMER DRIVER
+M: Hao-Wen Ting <[email protected]>
+S: Maintained
+F: drivers/clocksource/timer-realtek.c
+
REALTEK WIRELESS DRIVER (rtlwifi family)
M: Ping-Ke Shih <[email protected]>
L: [email protected]
@@@ -22525,6 -22525,7 +22542,6 @@@ F: tools/verification
RUST
M: Miguel Ojeda <[email protected]>
-M: Alex Gaynor <[email protected]>
R: Boqun Feng <[email protected]>
R: Gary Guo <[email protected]>
R: Björn Roy Baron <[email protected]>
@@@ -22561,14 -22562,6 +22578,14 @@@ T: git https://github.com/Rust-for-Linu
F: rust/kernel/alloc.rs
F: rust/kernel/alloc/
+RUST [NUM]
+M: Alexandre Courbot <[email protected]>
+R: Yury Norov <[email protected]>
+L: [email protected]
+S: Maintained
+F: rust/kernel/num.rs
+F: rust/kernel/num/
+
RUST [PIN-INIT]
M: Benno Lossin <[email protected]>
L: [email protected]
@@@ -26080,8 -26073,6 +26097,8 @@@ S: Supporte
W: https://www.tq-group.com/en/products/tq-embedded/
F: arch/arm/boot/dts/nxp/imx/*mba*.dts*
F: arch/arm/boot/dts/nxp/imx/*tqma*.dts*
+F: arch/arm/boot/dts/ti/omap/*mba*.dts*
+F: arch/arm/boot/dts/ti/omap/*tqma*.dts*
F: arch/arm64/boot/dts/freescale/fsl-*tqml*.dts*
F: arch/arm64/boot/dts/freescale/imx*mba*.dts*
F: arch/arm64/boot/dts/freescale/imx*tqma*.dts*
@@@ -27190,7 -27181,6 +27207,7 @@@ F: arch/s390/include/uapi/asm/virtio-cc
F: drivers/s390/virtio/
VIRTIO FILE SYSTEM
+M: German Maglione <[email protected]>
M: Vivek Goyal <[email protected]>
M: Stefan Hajnoczi <[email protected]>
M: Miklos Szeredi <[email protected]>
@@@ -27684,6 -27674,7 +27701,7 @@@ M: Jason A. Donenfeld <[email protected]
L: [email protected]
L: [email protected]
S: Maintained
+ F: Documentation/netlink/specs/wireguard.yaml
F: drivers/net/wireguard/
F: tools/testing/selftests/wireguard/
diff --combined crypto/af_alg.c
index 6c271e55f44d9,5e760ab626183..e468714f539df
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@@ -145,7 -145,7 +145,7 @@@ void af_alg_release_parent(struct sock
}
EXPORT_SYMBOL_GPL(af_alg_release_parent);
- static int alg_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
+ static int alg_bind(struct socket *sock, struct sockaddr_unsized *uaddr, int addr_len)
{
const u32 allowed = CRYPTO_ALG_KERN_DRIVER_ONLY;
struct sock *sk = sock->sk;
@@@ -1212,14 -1212,15 +1212,14 @@@ struct af_alg_async_req *af_alg_alloc_a
if (unlikely(!areq))
return ERR_PTR(-ENOMEM);
+ memset(areq, 0, areqlen);
+
ctx->inflight = true;
areq->areqlen = areqlen;
areq->sk = sk;
areq->first_rsgl.sgl.sgt.sgl = areq->first_rsgl.sgl.sgl;
- areq->last_rsgl = NULL;
INIT_LIST_HEAD(&areq->rsgl_list);
- areq->tsgl = NULL;
- areq->tsgl_entries = 0;
return areq;
}
diff --combined drivers/ptp/ptp_ocp.c
index 258169df0ff87,bf8187c963aab..65fe05cac8c42
--- a/drivers/ptp/ptp_ocp.c
+++ b/drivers/ptp/ptp_ocp.c
@@@ -25,8 -25,7 +25,7 @@@
#include <linux/crc16.h>
#include <linux/dpll.h>
- #define PCI_VENDOR_ID_FACEBOOK 0x1d9b
- #define PCI_DEVICE_ID_FACEBOOK_TIMECARD 0x0400
+ #define PCI_DEVICE_ID_META_TIMECARD 0x0400
#define PCI_VENDOR_ID_CELESTICA 0x18d4
#define PCI_DEVICE_ID_CELESTICA_TIMECARD 0x1008
@@@ -1030,7 -1029,7 +1029,7 @@@ static struct ocp_resource ocp_adva_res
};
static const struct pci_device_id ptp_ocp_pcidev_id[] = {
- { PCI_DEVICE_DATA(FACEBOOK, TIMECARD, &ocp_fb_resource) },
+ { PCI_DEVICE_DATA(META, TIMECARD, &ocp_fb_resource) },
{ PCI_DEVICE_DATA(CELESTICA, TIMECARD, &ocp_fb_resource) },
{ PCI_DEVICE_DATA(OROLIA, ARTCARD, &ocp_art_resource) },
{ PCI_DEVICE_DATA(ADVA, TIMECARD, &ocp_adva_resource) },
@@@ -2225,6 -2224,9 +2224,9 @@@ ptp_ocp_ts_enable(void *priv, u32 req,
static void
ptp_ocp_unregister_ext(struct ptp_ocp_ext_src *ext)
{
+ if (!ext)
+ return;
+
ext->info->enable(ext, ~0, false);
pci_free_irq(ext->bp->pdev, ext->irq_vec, ext);
kfree(ext);
@@@ -3250,20 -3252,16 +3252,16 @@@ signal_show(struct device *dev, struct
struct dev_ext_attribute *ea = to_ext_attr(attr);
struct ptp_ocp *bp = dev_get_drvdata(dev);
struct ptp_ocp_signal *signal;
+ int gen = (uintptr_t)ea->var;
struct timespec64 ts;
- ssize_t count;
- int i;
-
- i = (uintptr_t)ea->var;
- signal = &bp->signal[i];
- count = sysfs_emit(buf, "%llu %d %llu %d", signal->period,
- signal->duty, signal->phase, signal->polarity);
+ signal = &bp->signal[gen];
ts = ktime_to_timespec64(signal->start);
- count += sysfs_emit_at(buf, count, " %ptT TAI\n", &ts);
- return count;
+ return sysfs_emit(buf, "%llu %d %llu %d %ptT TAI\n",
+ signal->period, signal->duty, signal->phase, signal->polarity,
+ &ts.tv_sec);
}
static EXT_ATTR_RW(signal, signal, 0);
static EXT_ATTR_RW(signal, signal, 1);
@@@ -3430,6 -3428,12 +3428,12 @@@ ptp_ocp_tty_show(struct device *dev, st
struct dev_ext_attribute *ea = to_ext_attr(attr);
struct ptp_ocp *bp = dev_get_drvdata(dev);
+ /*
+ * NOTE: This output does not include a trailing newline for backward
+ * compatibility. Existing userspace software uses this value directly
+ * as a device path (e.g., "/dev/ttyS4"), and adding a newline would
+ * break those applications. Do not add a newline to this output.
+ */
return sysfs_emit(buf, "ttyS%d", bp->port[(uintptr_t)ea->var].line);
}
@@@ -4287,9 -4291,11 +4291,9 @@@ ptp_ocp_summary_show(struct seq_file *s
ns += (s64)bp->utc_tai_offset * NSEC_PER_SEC;
sys_ts = ns_to_timespec64(ns);
- seq_printf(s, "%7s: %lld.%ld == %ptT TAI\n", "PHC",
- ts.tv_sec, ts.tv_nsec, &ts);
- seq_printf(s, "%7s: %lld.%ld == %ptT UTC offset %d\n", "SYS",
- sys_ts.tv_sec, sys_ts.tv_nsec, &sys_ts,
- bp->utc_tai_offset);
+ seq_printf(s, "%7s: %ptSp == %ptS TAI\n", "PHC", &ts, &ts);
+ seq_printf(s, "%7s: %ptSp == %ptS UTC offset %d\n", "SYS",
+ &sys_ts, &sys_ts, bp->utc_tai_offset);
seq_printf(s, "%7s: PHC:SYS offset: %lld window: %lld\n", "",
timespec64_to_ns(&ts) - ns,
post_ns - pre_ns);
@@@ -4497,8 -4503,9 +4501,8 @@@ ptp_ocp_phc_info(struct ptp_ocp *bp
ptp_clock_index(bp->ptp));
if (!ptp_ocp_gettimex(&bp->ptp_info, &ts, NULL))
- dev_info(&bp->pdev->dev, "Time: %lld.%ld, %s\n",
- ts.tv_sec, ts.tv_nsec,
- bp->sync ? "in-sync" : "UNSYNCED");
+ dev_info(&bp->pdev->dev, "Time: %ptSp, %s\n",
+ &ts, bp->sync ? "in-sync" : "UNSYNCED");
}
static void
@@@ -4553,21 -4560,14 +4557,14 @@@ ptp_ocp_detach(struct ptp_ocp *bp
ptp_ocp_detach_sysfs(bp);
ptp_ocp_attr_group_del(bp);
timer_delete_sync(&bp->watchdog);
- if (bp->ts0)
- ptp_ocp_unregister_ext(bp->ts0);
- if (bp->ts1)
- ptp_ocp_unregister_ext(bp->ts1);
- if (bp->ts2)
- ptp_ocp_unregister_ext(bp->ts2);
- if (bp->ts3)
- ptp_ocp_unregister_ext(bp->ts3);
- if (bp->ts4)
- ptp_ocp_unregister_ext(bp->ts4);
- if (bp->pps)
- ptp_ocp_unregister_ext(bp->pps);
+ ptp_ocp_unregister_ext(bp->ts0);
+ ptp_ocp_unregister_ext(bp->ts1);
+ ptp_ocp_unregister_ext(bp->ts2);
+ ptp_ocp_unregister_ext(bp->ts3);
+ ptp_ocp_unregister_ext(bp->ts4);
+ ptp_ocp_unregister_ext(bp->pps);
for (i = 0; i < 4; i++)
- if (bp->signal_out[i])
- ptp_ocp_unregister_ext(bp->signal_out[i]);
+ ptp_ocp_unregister_ext(bp->signal_out[i]);
for (i = 0; i < __PORT_COUNT; i++)
if (bp->port[i].line != -1)
serial8250_unregister_port(bp->port[i].line);
@@@ -4820,8 -4820,7 +4817,7 @@@ ptp_ocp_probe(struct pci_dev *pdev, con
return 0;
out_dpll:
- while (i) {
- --i;
+ while (i--) {
dpll_pin_unregister(bp->dpll, bp->sma[i].dpll_pin,
&dpll_pins_ops, &bp->sma[i]);
dpll_pin_put(bp->sma[i].dpll_pin);
}
diff --combined drivers/s390/net/ctcm_fsms.c
index e221687a9858a,1a48258b63b28..bf917f4264532
--- a/drivers/s390/net/ctcm_fsms.c
+++ b/drivers/s390/net/ctcm_fsms.c
@@@ -12,7 -12,8 +12,7 @@@
#undef DEBUGDATA
#undef DEBUGCCW
-#define KMSG_COMPONENT "ctcm"
-#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
+#define pr_fmt(fmt) "ctcm: " fmt
#include <linux/module.h>
#include <linux/init.h>
@@@ -881,6 -882,13 +881,13 @@@ static void ctcm_chx_rxiniterr(fsm_inst
fsm_newstate(fi, CTC_STATE_RXERR);
fsm_event(priv->fsm, DEV_EVENT_RXDOWN, dev);
}
+ } else if (event == CTC_EVENT_UC_RCRESET) {
+ CTCM_DBF_TEXT_(TRACE, CTC_DBF_NOTICE,
+ "%s(%s): %s in %s", CTCM_FUNTAIL, ch->id,
+ ctc_ch_event_names[event], fsm_getstate_str(fi));
+
+ dev_info(&dev->dev,
+ "Init handshake not received, peer not ready yet\n");
} else {
CTCM_DBF_TEXT_(ERROR, CTC_DBF_ERROR,
"%s(%s): %s in %s", CTCM_FUNTAIL, ch->id,
@@@ -966,6 -974,13 +973,13 @@@ static void ctcm_chx_txiniterr(fsm_inst
fsm_newstate(fi, CTC_STATE_TXERR);
fsm_event(priv->fsm, DEV_EVENT_TXDOWN, dev);
}
+ } else if (event == CTC_EVENT_UC_RCRESET) {
+ CTCM_DBF_TEXT_(TRACE, CTC_DBF_NOTICE,
+ "%s(%s): %s in %s", CTCM_FUNTAIL, ch->id,
+ ctc_ch_event_names[event], fsm_getstate_str(fi));
+
+ dev_info(&dev->dev,
+ "Init handshake not sent, peer not ready yet\n");
} else {
CTCM_DBF_TEXT_(ERROR, CTC_DBF_ERROR,
"%s(%s): %s in %s", CTCM_FUNTAIL, ch->id,
diff --combined drivers/s390/net/qeth_core_main.c
index 64d45285651db,10b53bba373c9..1c80e8ca67b5f
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@@ -7,8 -7,10 +7,8 @@@
* Frank Blaschka <[email protected]>
*/
-#define KMSG_COMPONENT "qeth"
-#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
+#define pr_fmt(fmt) "qeth: " fmt
-#include <linux/compat.h>
#include <linux/export.h>
#include <linux/module.h>
#include <linux/moduleparam.h>
@@@ -759,7 -761,7 +759,7 @@@ static void qeth_issue_ipa_msg(struct q
if (rc)
QETH_DBF_MESSAGE(2, "IPA: %s(%#x) for device %x returned %#x \"%s\"\n",
ipa_name, com, CARD_DEVID(card), rc,
- qeth_get_ipa_msg(rc));
+ qeth_get_ipa_msg(com, rc));
else
QETH_DBF_MESSAGE(5, "IPA: %s(%#x) for device %x succeeded\n",
ipa_name, com, CARD_DEVID(card));
@@@ -4803,7 -4805,8 +4803,7 @@@ static int qeth_query_oat_command(struc
rc = qeth_send_ipa_cmd(card, iob, qeth_setadpparms_query_oat_cb, &priv);
if (!rc) {
- tmp = is_compat_task() ? compat_ptr(oat_data.ptr) :
- u64_to_user_ptr(oat_data.ptr);
+ tmp = u64_to_user_ptr(oat_data.ptr);
oat_data.response_len = priv.response_len;
if (copy_to_user(tmp, priv.buffer, priv.response_len) ||
diff --combined drivers/s390/net/smsgiucv_app.c
index 7041c1dca1e89,768108c90b325..1bd0370460cd4
--- a/drivers/s390/net/smsgiucv_app.c
+++ b/drivers/s390/net/smsgiucv_app.c
@@@ -10,7 -10,8 +10,7 @@@
* Author(s): Hendrik Brueckner <[email protected]>
*
*/
-#define KMSG_COMPONENT "smsgiucv_app"
-#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
+#define pr_fmt(fmt) "smsgiucv_app: " fmt
#include <linux/ctype.h>
#include <linux/err.h>
@@@ -87,9 -88,10 +87,10 @@@ static struct smsg_app_event *smsg_app_
ev->envp[3] = NULL;
/* setting up environment: sender, prefix name, and message text */
- snprintf(ev->envp[0], ENV_SENDER_LEN, ENV_SENDER_STR "%s", from);
- snprintf(ev->envp[1], ENV_PREFIX_LEN, ENV_PREFIX_STR "%s", SMSG_PREFIX);
- snprintf(ev->envp[2], ENV_TEXT_LEN(msg), ENV_TEXT_STR "%s", msg);
+ scnprintf(ev->envp[0], ENV_SENDER_LEN, ENV_SENDER_STR "%s", from);
+ scnprintf(ev->envp[1], ENV_PREFIX_LEN, ENV_PREFIX_STR "%s",
+ SMSG_PREFIX);
+ scnprintf(ev->envp[2], ENV_TEXT_LEN(msg), ENV_TEXT_STR "%s", msg);
return ev;
}
@@@ -160,7 -162,7 +161,7 @@@ static int __init smsgiucv_app_init(voi
if (!smsgiucv_drv)
return -ENODEV;
- smsg_app_dev = iucv_alloc_device(NULL, smsgiucv_drv, NULL, KMSG_COMPONENT);
+ smsg_app_dev = iucv_alloc_device(NULL, smsgiucv_drv, NULL, "smsgiucv_app");
if (!smsg_app_dev)
return -ENOMEM;
diff --combined drivers/slimbus/qcom-ngd-ctrl.c
index cd40ab839c541,fdb94dc4a7307..ba3d80d12605c
--- a/drivers/slimbus/qcom-ngd-ctrl.c
+++ b/drivers/slimbus/qcom-ngd-ctrl.c
@@@ -463,7 -463,7 +463,7 @@@ static int qcom_slim_qmi_init(struct qc
}
rc = kernel_connect(handle->sock,
- (struct sockaddr *)&ctrl->qmi.svc_info,
+ (struct sockaddr_unsized *)&ctrl->qmi.svc_info,
sizeof(ctrl->qmi.svc_info), 0);
if (rc < 0) {
dev_err(ctrl->dev, "Remote Service connect failed: %d\n", rc);
@@@ -1241,7 -1241,6 +1241,7 @@@ static void qcom_slim_ngd_notify_slaves
if (slim_get_logical_addr(sbdev))
dev_err(ctrl->dev, "Failed to get logical address\n");
+ put_device(&sbdev->dev);
}
}
diff --combined fs/coredump.c
index fe4099e0530bf,14837d9e2abbe..8feb9c1cf83db
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@@ -708,7 -708,7 +708,7 @@@ static bool coredump_sock_connect(struc
*/
pidfs_coredump(cprm);
- retval = kernel_connect(socket, (struct sockaddr *)(&addr), addr_len,
+ retval = kernel_connect(socket, (struct sockaddr_unsized *)(&addr), addr_len,
O_NONBLOCK | SOCK_COREDUMP);
if (retval) {
@@@ -1036,7 -1036,7 +1036,7 @@@ static bool coredump_pipe(struct core_n
static bool coredump_write(struct core_name *cn,
struct coredump_params *cprm,
- struct linux_binfmt *binfmt)
+ const struct linux_binfmt *binfmt)
{
if (dump_interrupted())
@@@ -1086,119 -1086,119 +1086,119 @@@ static inline bool coredump_skip(const
return false;
}
-void vfs_coredump(const kernel_siginfo_t *siginfo)
+static void do_coredump(struct core_name *cn, struct coredump_params *cprm,
+ size_t **argv, int *argc, const struct linux_binfmt *binfmt)
{
- struct cred *cred __free(put_cred) = NULL;
- size_t *argv __free(kfree) = NULL;
- struct core_state core_state;
- struct core_name cn;
- struct mm_struct *mm = current->mm;
- struct linux_binfmt *binfmt = mm->binfmt;
- const struct cred *old_cred;
- int argc = 0;
- struct coredump_params cprm = {
- .siginfo = siginfo,
- .limit = rlimit(RLIMIT_CORE),
- /*
- * We must use the same mm->flags while dumping core to avoid
- * inconsistency of bit flags, since this flag is not protected
- * by any locks.
- *
- * Note that we only care about MMF_DUMP* flags.
- */
- .mm_flags = __mm_flags_get_dumpable(mm),
- .vma_meta = NULL,
- .cpu = raw_smp_processor_id(),
- };
-
- audit_core_dumps(siginfo->si_signo);
-
- if (coredump_skip(&cprm, binfmt))
- return;
-
- cred = prepare_creds();
- if (!cred)
- return;
- /*
- * We cannot trust fsuid as being the "true" uid of the process
- * nor do we know its entire history. We only know it was tainted
- * so we dump it as root in mode 2, and only into a controlled
- * environment (pipe handler or fully qualified path).
- */
- if (coredump_force_suid_safe(&cprm))
- cred->fsuid = GLOBAL_ROOT_UID;
-
- if (coredump_wait(siginfo->si_signo, &core_state) < 0)
- return;
-
- old_cred = override_creds(cred);
-
- if (!coredump_parse(&cn, &cprm, &argv, &argc)) {
+ if (!coredump_parse(cn, cprm, argv, argc)) {
coredump_report_failure("format_corename failed, aborting core");
- goto close_fail;
+ return;
}
- switch (cn.core_type) {
+ switch (cn->core_type) {
case COREDUMP_FILE:
- if (!coredump_file(&cn, &cprm, binfmt))
- goto close_fail;
+ if (!coredump_file(cn, cprm, binfmt))
+ return;
break;
case COREDUMP_PIPE:
- if (!coredump_pipe(&cn, &cprm, argv, argc))
- goto close_fail;
+ if (!coredump_pipe(cn, cprm, *argv, *argc))
+ return;
break;
case COREDUMP_SOCK_REQ:
fallthrough;
case COREDUMP_SOCK:
- if (!coredump_socket(&cn, &cprm))
- goto close_fail;
+ if (!coredump_socket(cn, cprm))
+ return;
break;
default:
WARN_ON_ONCE(true);
- goto close_fail;
+ return;
}
/* Don't even generate the coredump. */
- if (cn.mask & COREDUMP_REJECT)
- goto close_fail;
+ if (cn->mask & COREDUMP_REJECT)
+ return;
/* get us an unshared descriptor table; almost always a no-op */
/* The cell spufs coredump code reads the file descriptor tables */
if (unshare_files())
- goto close_fail;
+ return;
- if ((cn.mask & COREDUMP_KERNEL) && !coredump_write(&cn, &cprm, binfmt))
- goto close_fail;
+ if ((cn->mask & COREDUMP_KERNEL) && !coredump_write(cn, cprm, binfmt))
+ return;
- coredump_sock_shutdown(cprm.file);
+ coredump_sock_shutdown(cprm->file);
/* Let the parent know that a coredump was generated. */
- if (cn.mask & COREDUMP_USERSPACE)
- cn.core_dumped = true;
+ if (cn->mask & COREDUMP_USERSPACE)
+ cn->core_dumped = true;
/*
* When core_pipe_limit is set we wait for the coredump server
* or usermodehelper to finish before exiting so it can e.g.,
* inspect /proc/<pid>.
*/
- if (cn.mask & COREDUMP_WAIT) {
- switch (cn.core_type) {
+ if (cn->mask & COREDUMP_WAIT) {
+ switch (cn->core_type) {
case COREDUMP_PIPE:
- wait_for_dump_helpers(cprm.file);
+ wait_for_dump_helpers(cprm->file);
break;
case COREDUMP_SOCK_REQ:
fallthrough;
case COREDUMP_SOCK:
- coredump_sock_wait(cprm.file);
+ coredump_sock_wait(cprm->file);
break;
default:
break;
}
}
+}
+
+void vfs_coredump(const kernel_siginfo_t *siginfo)
+{
+ size_t *argv __free(kfree) = NULL;
+ struct core_state core_state;
+ struct core_name cn;
+ const struct mm_struct *mm = current->mm;
+ const struct linux_binfmt *binfmt = mm->binfmt;
+ int argc = 0;
+ struct coredump_params cprm = {
+ .siginfo = siginfo,
+ .limit = rlimit(RLIMIT_CORE),
+ /*
+ * We must use the same mm->flags while dumping core to avoid
+ * inconsistency of bit flags, since this flag is not protected
+ * by any locks.
+ *
+ * Note that we only care about MMF_DUMP* flags.
+ */
+ .mm_flags = __mm_flags_get_dumpable(mm),
+ .vma_meta = NULL,
+ .cpu = raw_smp_processor_id(),
+ };
+
+ audit_core_dumps(siginfo->si_signo);
+
+ if (coredump_skip(&cprm, binfmt))
+ return;
+
+ CLASS(prepare_creds, cred)();
+ if (!cred)
+ return;
+ /*
+ * We cannot trust fsuid as being the "true" uid of the process
+ * nor do we know its entire history. We only know it was tainted
+ * so we dump it as root in mode 2, and only into a controlled
+ * environment (pipe handler or fully qualified path).
+ */
+ if (coredump_force_suid_safe(&cprm))
+ cred->fsuid = GLOBAL_ROOT_UID;
+
+ if (coredump_wait(siginfo->si_signo, &core_state) < 0)
+ return;
-close_fail:
+ scoped_with_creds(cred)
+ do_coredump(&cn, &cprm, &argv, &argc, binfmt);
coredump_cleanup(&cn, &cprm);
- revert_creds(old_cred);
return;
}
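The refactor above leans on scope-based cleanup (`__free(kfree)`, `CLASS(prepare_creds, ...)`, `scoped_with_creds()`), which is built on the compiler's cleanup attribute. As a hedged illustration only (names like `free_buf` and `leaves_scope` are invented for this sketch, not kernel APIs), the underlying GCC/Clang mechanism looks like this in userspace:

```c
#include <stdlib.h>
#include <string.h>

/* Minimal userspace model of the kernel's __free()/CLASS() helpers:
 * the compiler runs the named function when the variable leaves scope,
 * on every exit path, so early returns cannot leak. */
static int frees;	/* counts cleanup invocations, for demonstration */

static void free_buf(char **p)
{
	if (*p) {
		free(*p);
		frees++;
	}
}

#define __free_buf __attribute__((cleanup(free_buf)))

static int leaves_scope(int fail_early)
{
	char *buf __free_buf = malloc(32);

	if (!buf)
		return -1;
	strcpy(buf, "core");
	if (fail_early)
		return -1;		/* buf is still freed on this path */
	return (int)strlen(buf);	/* ...and on this one */
}
```

This is why the rewritten `vfs_coredump()` no longer needs a `close_fail:` label: the error paths simply `return` and cleanup runs automatically.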
diff --combined include/linux/filter.h
index 569de3b14279a,03e7516c61872..fd54fed8f95f3
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@@ -712,13 -712,11 +712,13 @@@ static __always_inline u32 __bpf_prog_r
ret = dfunc(ctx, prog->insnsi, prog->bpf_func);
duration = sched_clock() - start;
- stats = this_cpu_ptr(prog->stats);
- flags = u64_stats_update_begin_irqsave(&stats->syncp);
- u64_stats_inc(&stats->cnt);
- u64_stats_add(&stats->nsecs, duration);
- u64_stats_update_end_irqrestore(&stats->syncp, flags);
+ if (likely(prog->stats)) {
+ stats = this_cpu_ptr(prog->stats);
+ flags = u64_stats_update_begin_irqsave(&stats->syncp);
+ u64_stats_inc(&stats->cnt);
+ u64_stats_add(&stats->nsecs, duration);
+ u64_stats_update_end_irqrestore(&stats->syncp, flags);
+ }
} else {
ret = dfunc(ctx, prog->insnsi, prog->bpf_func);
}
@@@ -1537,7 -1535,7 +1537,7 @@@ static inline int bpf_tell_extensions(v
struct bpf_sock_addr_kern {
struct sock *sk;
- struct sockaddr *uaddr;
+ struct sockaddr_unsized *uaddr;
/* Temporary "register" to make indirect stores to nested structures
* defined above. We need three registers to make such a store, but
* only two (src and dst) are available at convert_ctx_access time
@@@ -1803,6 -1801,8 +1803,8 @@@ int __bpf_xdp_store_bytes(struct xdp_bu
void *bpf_xdp_pointer(struct xdp_buff *xdp, u32 offset, u32 len);
void bpf_xdp_copy_buf(struct xdp_buff *xdp, unsigned long off,
void *buf, unsigned long len, bool flush);
+ int __bpf_skb_meta_store_bytes(struct sk_buff *skb, u32 offset,
+ const void *from, u32 len, u64 flags);
void *bpf_skb_meta_pointer(struct sk_buff *skb, u32 offset);
#else /* CONFIG_NET */
static inline int __bpf_skb_load_bytes(const struct sk_buff *skb, u32 offset,
@@@ -1839,6 -1839,13 +1841,13 @@@ static inline void bpf_xdp_copy_buf(str
{
}
+ static inline int __bpf_skb_meta_store_bytes(struct sk_buff *skb, u32 offset,
+ const void *from, u32 len,
+ u64 flags)
+ {
+ return -EOPNOTSUPP;
+ }
+
static inline void *bpf_skb_meta_pointer(struct sk_buff *skb, u32 offset)
{
return ERR_PTR(-EOPNOTSUPP);
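The `filter.h` hunk above makes `prog->stats` optional and guards the per-CPU stats update behind a NULL check. A minimal sketch of that pattern (the `run_stats`/`record_run` names are illustrative, not the kernel's, and the per-CPU and seqcount machinery is omitted):

```c
#include <stddef.h>

/* Stats may be absent (NULL pointer); the fast path must tolerate
 * that and skip accounting rather than dereference it. */
struct run_stats {
	unsigned long long cnt;
	unsigned long long nsecs;
};

static void record_run(struct run_stats *stats, unsigned long long duration)
{
	if (stats) {		/* mirrors "if (likely(prog->stats))" */
		stats->cnt++;
		stats->nsecs += duration;
	}
}
```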
diff --combined include/linux/rculist_nulls.h
index d5a656cc4c6af,c26cb83ca0711..a97c3bcb1656e
--- a/include/linux/rculist_nulls.h
+++ b/include/linux/rculist_nulls.h
@@@ -52,6 -52,13 +52,13 @@@ static inline void hlist_nulls_del_init
#define hlist_nulls_next_rcu(node) \
(*((struct hlist_nulls_node __rcu __force **)&(node)->next))
+ /**
+ * hlist_nulls_pprev_rcu - returns the dereferenced pprev of @node.
+ * @node: element of the list.
+ */
+ #define hlist_nulls_pprev_rcu(node) \
+ (*((struct hlist_nulls_node __rcu __force **)(node)->pprev))
+
/**
 * hlist_nulls_del_rcu - deletes entry from hash list without re-initialization
* @n: the element to delete from the hash list.
@@@ -138,7 -145,7 +145,7 @@@ static inline void hlist_nulls_add_tail
if (last) {
WRITE_ONCE(n->next, last->next);
- n->pprev = &last->next;
+ WRITE_ONCE(n->pprev, &last->next);
rcu_assign_pointer(hlist_nulls_next_rcu(last), n);
} else {
hlist_nulls_add_head_rcu(n, h);
@@@ -148,10 -155,62 +155,62 @@@
/* after that hlist_nulls_del will work */
static inline void hlist_nulls_add_fake(struct hlist_nulls_node *n)
{
- n->pprev = &n->next;
- n->next = (struct hlist_nulls_node *)NULLS_MARKER(NULL);
+ WRITE_ONCE(n->pprev, &n->next);
+ WRITE_ONCE(n->next, (struct hlist_nulls_node *)NULLS_MARKER(NULL));
}
+ /**
+ * hlist_nulls_replace_rcu - replace an old entry by a new one
+ * @old: the element to be replaced
+ * @new: the new element to insert
+ *
+ * Description:
+ * Replace the old entry with the new one in a RCU-protected hlist_nulls, while
+ * permitting racing traversals.
+ *
+ * The caller must take whatever precautions are necessary (such as holding
+ * appropriate locks) to avoid racing with another list-mutation primitive, such
+ * as hlist_nulls_add_head_rcu() or hlist_nulls_del_rcu(), running on this same
+ * list. However, it is perfectly legal to run concurrently with the _rcu
+ * list-traversal primitives, such as hlist_nulls_for_each_entry_rcu().
+ */
+ static inline void hlist_nulls_replace_rcu(struct hlist_nulls_node *old,
+ struct hlist_nulls_node *new)
+ {
+ struct hlist_nulls_node *next = old->next;
+
+ WRITE_ONCE(new->next, next);
+ WRITE_ONCE(new->pprev, old->pprev);
+ rcu_assign_pointer(hlist_nulls_pprev_rcu(new), new);
+ if (!is_a_nulls(next))
+ WRITE_ONCE(next->pprev, &new->next);
+ }
+
+ /**
+ * hlist_nulls_replace_init_rcu - replace an old entry by a new one and
+ * initialize the old
+ * @old: the element to be replaced
+ * @new: the new element to insert
+ *
+ * Description:
+ * Replace the old entry with the new one in a RCU-protected hlist_nulls, while
+ * permitting racing traversals, and reinitialize the old entry.
+ *
+ * Note: @old must be hashed.
+ *
+ * The caller must take whatever precautions are necessary (such as holding
+ * appropriate locks) to avoid racing with another list-mutation primitive, such
+ * as hlist_nulls_add_head_rcu() or hlist_nulls_del_rcu(), running on this same
+ * list. However, it is perfectly legal to run concurrently with the _rcu
+ * list-traversal primitives, such as hlist_nulls_for_each_entry_rcu().
+ */
+ static inline void hlist_nulls_replace_init_rcu(struct hlist_nulls_node *old,
+ struct hlist_nulls_node *new)
+ {
+ hlist_nulls_replace_rcu(old, new);
+ WRITE_ONCE(old->pprev, NULL);
+ }
+
/**
* hlist_nulls_for_each_entry_rcu - iterate over rcu list of given type
* @tpos: the type * to use as a loop cursor.
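The pointer surgery in the new `hlist_nulls_replace_rcu()` above can be modelled in plain userspace C. This sketch keeps the `pprev` back-pointer discipline but drops RCU ordering and nulls markers (the `node`, `add_head`, and `replace` names are invented for illustration):

```c
#include <stddef.h>

/* pprev points at whatever pointer references this node: either the
 * list head or the previous node's next field. */
struct node {
	struct node *next;
	struct node **pprev;
};

static void add_head(struct node *n, struct node **head)
{
	n->next = *head;
	n->pprev = head;
	if (*head)
		(*head)->pprev = &n->next;
	*head = n;
}

static void replace(struct node *old, struct node *new)
{
	struct node *next = old->next;

	new->next = next;
	new->pprev = old->pprev;
	*new->pprev = new;		/* rcu_assign_pointer(...pprev_rcu(new), new) */
	if (next)			/* is_a_nulls() check in the kernel */
		next->pprev = &new->next;
	old->pprev = NULL;		/* the _init_rcu variant does this */
}
```

In the kernel the `*new->pprev = new` store is the `rcu_assign_pointer()` publication point, so concurrent readers see either the old or the new node, never a torn list.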
diff --combined include/uapi/linux/bpf.h
index f5713f59ac10a,6eb75ad900b13..f8d8513eda27b
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@@ -1026,7 -1026,6 +1026,7 @@@ enum bpf_map_type
BPF_MAP_TYPE_USER_RINGBUF,
BPF_MAP_TYPE_CGRP_STORAGE,
BPF_MAP_TYPE_ARENA,
+ BPF_MAP_TYPE_INSN_ARRAY,
__MAX_BPF_MAP_TYPE
};
@@@ -1431,9 -1430,6 +1431,9 @@@ enum
/* Do not translate kernel bpf_arena pointers to user pointers */
BPF_F_NO_USER_CONV = (1U << 18),
+
+/* Enable BPF ringbuf overwrite mode */
+ BPF_F_RB_OVERWRITE = (1U << 19),
};
/* Flags for BPF_PROG_QUERY. */
@@@ -5622,7 -5618,7 +5622,7 @@@ union bpf_attr
* Return
* *sk* if casting is valid, or **NULL** otherwise.
*
- * long bpf_dynptr_from_mem(void *data, u32 size, u64 flags, struct bpf_dynptr *ptr)
+ * long bpf_dynptr_from_mem(void *data, u64 size, u64 flags, struct bpf_dynptr *ptr)
* Description
* Get a dynptr to local memory *data*.
*
@@@ -5665,7 -5661,7 +5665,7 @@@
* Return
* Nothing. Always succeeds.
*
- * long bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr *src, u32 offset, u64 flags)
+ * long bpf_dynptr_read(void *dst, u64 len, const struct bpf_dynptr *src, u64 offset, u64 flags)
* Description
* Read *len* bytes from *src* into *dst*, starting from *offset*
* into *src*.
@@@ -5675,7 -5671,7 +5675,7 @@@
* of *src*'s data, -EINVAL if *src* is an invalid dynptr or if
* *flags* is not 0.
*
- * long bpf_dynptr_write(const struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
+ * long bpf_dynptr_write(const struct bpf_dynptr *dst, u64 offset, void *src, u64 len, u64 flags)
* Description
* Write *len* bytes from *src* into *dst*, starting from *offset*
* into *dst*.
@@@ -5696,7 -5692,7 +5696,7 @@@
 *		is a read-only dynptr or if *flags* is not correct. For skb-type dynptrs,
 *		other errors correspond to errors returned by **bpf_skb_store_bytes**\ ().
*
- * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u32 offset, u32 len)
+ * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u64 offset, u64 len)
* Description
* Get a pointer to the underlying dynptr data.
*
@@@ -6235,7 -6231,6 +6235,7 @@@ enum
BPF_RB_RING_SIZE = 1,
BPF_RB_CONS_POS = 2,
BPF_RB_PROD_POS = 3,
+ BPF_RB_OVERWRITE_POS = 4,
};
/* BPF ring buffer constants */
@@@ -7205,6 -7200,8 +7205,8 @@@ enum
	TCP_BPF_SYN_MAC		= 1007,	/* Copy the MAC, IP[46], and TCP header */
	TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Get or Set TCP sock ops flags */
	SK_BPF_CB_FLAGS		= 1009,	/* Get or set sock ops flags in socket */
+ SK_BPF_BYPASS_PROT_MEM = 1010, /* Get or Set sk->sk_bypass_prot_mem */
+
};
enum {
@@@ -7650,24 -7647,4 +7652,24 @@@ enum bpf_kfunc_flags
BPF_F_PAD_ZEROS = (1ULL << 0),
};
+/*
+ * Values of a BPF_MAP_TYPE_INSN_ARRAY entry must be of this type.
+ *
+ * Before the map is used the orig_off field should point to an
+ * instruction inside the program being loaded. The other fields
+ * must be set to 0.
+ *
+ * After the program is loaded, the xlated_off will be adjusted
+ * by the verifier to point to the index of the original instruction
+ * in the xlated program. If the instruction is deleted, it will
+ * be set to (u32)-1. The jitted_off will be set to the corresponding
+ * offset in the jitted image of the program.
+ */
+struct bpf_insn_array_value {
+ __u32 orig_off;
+ __u32 xlated_off;
+ __u32 jitted_off;
+ __u32 :32;
+};
+
#endif /* _UAPI__LINUX_BPF_H__ */
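The UAPI hunk above adds an overwrite mode for the BPF ring buffer (`BPF_F_RB_OVERWRITE`) together with an overwrite position (`BPF_RB_OVERWRITE_POS`). A toy model of the accounting idea, assuming (this is a sketch, not the kernel's layout) that the producer advances an overwrite cursor past the oldest record instead of rejecting new data when full:

```c
#include <stddef.h>

#define RING_SZ 4

struct ring {
	int data[RING_SZ];
	unsigned long prod;	/* total records produced */
	unsigned long over;	/* oldest record still readable */
};

static void produce(struct ring *r, int v)
{
	r->data[r->prod % RING_SZ] = v;
	r->prod++;
	if (r->prod - r->over > RING_SZ)
		r->over = r->prod - RING_SZ;	/* oldest entry overwritten */
}
```

A consumer then reads from `over` up to `prod`, i.e. always the most recent `RING_SZ` records.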
diff --combined kernel/bpf/helpers.c
index 07bfa72f96493,0785eb739eb8e..db72b96f9c8c8
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@@ -28,7 -28,6 +28,7 @@@
#include <linux/verification.h>
#include <linux/task_work.h>
#include <linux/irq_work.h>
+#include <linux/buildid.h>
#include "../../lib/kstrtox.h"
@@@ -43,7 -42,8 +43,7 @@@
*/
BPF_CALL_2(bpf_map_lookup_elem, struct bpf_map *, map, void *, key)
{
- WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() &&
- !rcu_read_lock_bh_held());
+ WARN_ON_ONCE(!bpf_rcu_lock_held());
return (unsigned long) map->ops->map_lookup_elem(map, key);
}
@@@ -59,7 -59,8 +59,7 @@@ const struct bpf_func_proto bpf_map_loo
BPF_CALL_4(bpf_map_update_elem, struct bpf_map *, map, void *, key,
void *, value, u64, flags)
{
- WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() &&
- !rcu_read_lock_bh_held());
+ WARN_ON_ONCE(!bpf_rcu_lock_held());
return map->ops->map_update_elem(map, key, value, flags);
}
@@@ -76,7 -77,8 +76,7 @@@ const struct bpf_func_proto bpf_map_upd
BPF_CALL_2(bpf_map_delete_elem, struct bpf_map *, map, void *, key)
{
- WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() &&
- !rcu_read_lock_bh_held());
+ WARN_ON_ONCE(!bpf_rcu_lock_held());
return map->ops->map_delete_elem(map, key);
}
@@@ -132,7 -134,8 +132,7 @@@ const struct bpf_func_proto bpf_map_pee
BPF_CALL_3(bpf_map_lookup_percpu_elem, struct bpf_map *, map, void *, key,
u32, cpu)
{
- WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() &&
- !rcu_read_lock_bh_held());
+ WARN_ON_ONCE(!bpf_rcu_lock_held());
return (unsigned long) map->ops->map_lookup_percpu_elem(map, key, cpu);
}
@@@ -774,11 -777,9 +774,11 @@@ int bpf_try_get_buffers(struct bpf_bpri
{
int nest_level;
+ preempt_disable();
nest_level = this_cpu_inc_return(bpf_bprintf_nest_level);
if (WARN_ON_ONCE(nest_level > MAX_BPRINTF_NEST_LEVEL)) {
this_cpu_dec(bpf_bprintf_nest_level);
+ preempt_enable();
return -EBUSY;
}
*bufs = this_cpu_ptr(&bpf_bprintf_bufs[nest_level - 1]);
@@@ -791,7 -792,6 +791,7 @@@ void bpf_put_buffers(void
if (WARN_ON_ONCE(this_cpu_read(bpf_bprintf_nest_level) == 0))
return;
this_cpu_dec(bpf_bprintf_nest_level);
+ preempt_enable();
}
void bpf_bprintf_cleanup(struct bpf_bprintf_data *data)
@@@ -1660,13 -1660,6 +1660,13 @@@ static const struct bpf_func_proto bpf_
.arg2_btf_id = BPF_PTR_POISON,
};
+struct bpf_dynptr_file_impl {
+ struct freader freader;
+ /* 64 bit offset and size overriding 32 bit ones in bpf_dynptr_kern */
+ u64 offset;
+ u64 size;
+};
+
/* Since the upper 8 bits of dynptr->size is reserved, the
* maximum supported size is 2^24 - 1.
*/
@@@ -1695,65 -1688,23 +1695,65 @@@ static enum bpf_dynptr_type bpf_dynptr_
return (ptr->size & ~(DYNPTR_RDONLY_BIT)) >> DYNPTR_TYPE_SHIFT;
}
-u32 __bpf_dynptr_size(const struct bpf_dynptr_kern *ptr)
+u64 __bpf_dynptr_size(const struct bpf_dynptr_kern *ptr)
{
+ if (bpf_dynptr_get_type(ptr) == BPF_DYNPTR_TYPE_FILE) {
+ struct bpf_dynptr_file_impl *df = ptr->data;
+
+ return df->size;
+ }
+
return ptr->size & DYNPTR_SIZE_MASK;
}
-static void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size)
+static void bpf_dynptr_advance_offset(struct bpf_dynptr_kern *ptr, u64 off)
+{
+ if (bpf_dynptr_get_type(ptr) == BPF_DYNPTR_TYPE_FILE) {
+ struct bpf_dynptr_file_impl *df = ptr->data;
+
+ df->offset += off;
+ return;
+ }
+ ptr->offset += off;
+}
+
+static void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u64 new_size)
{
u32 metadata = ptr->size & ~DYNPTR_SIZE_MASK;
- ptr->size = new_size | metadata;
+ if (bpf_dynptr_get_type(ptr) == BPF_DYNPTR_TYPE_FILE) {
+ struct bpf_dynptr_file_impl *df = ptr->data;
+
+ df->size = new_size;
+ return;
+ }
+ ptr->size = (u32)new_size | metadata;
}
-int bpf_dynptr_check_size(u32 size)
+int bpf_dynptr_check_size(u64 size)
{
return size > DYNPTR_MAX_SIZE ? -E2BIG : 0;
}
+static int bpf_file_fetch_bytes(struct bpf_dynptr_file_impl *df, u64 offset, void *buf, u64 len)
+{
+ const void *ptr;
+
+ if (!buf)
+ return -EINVAL;
+
+ df->freader.buf = buf;
+ df->freader.buf_sz = len;
+ ptr = freader_fetch(&df->freader, offset + df->offset, len);
+ if (!ptr)
+ return df->freader.err;
+
+ if (ptr != buf) /* Force copying into the buffer */
+ memcpy(buf, ptr, len);
+
+ return 0;
+}
+
void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
enum bpf_dynptr_type type, u32 offset, u32 size)
{
@@@ -1768,7 -1719,7 +1768,7 @@@ void bpf_dynptr_set_null(struct bpf_dyn
memset(ptr, 0, sizeof(*ptr));
}
-BPF_CALL_4(bpf_dynptr_from_mem, void *, data, u32, size, u64, flags, struct bpf_dynptr_kern *, ptr)
+BPF_CALL_4(bpf_dynptr_from_mem, void *, data, u64, size, u64, flags, struct bpf_dynptr_kern *, ptr)
{
int err;
@@@ -1803,8 -1754,8 +1803,8 @@@ static const struct bpf_func_proto bpf_
.arg4_type = ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL | MEM_UNINIT |
MEM_WRITE,
};
-static int __bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr_kern *src,
-			     u32 offset, u64 flags)
+static int __bpf_dynptr_read(void *dst, u64 len, const struct bpf_dynptr_kern *src,
+			     u64 offset, u64 flags)
{
enum bpf_dynptr_type type;
int err;
@@@ -1834,16 -1785,14 +1834,16 @@@
case BPF_DYNPTR_TYPE_SKB_META:
		memmove(dst, bpf_skb_meta_pointer(src->data, src->offset + offset), len);
return 0;
+ case BPF_DYNPTR_TYPE_FILE:
+ return bpf_file_fetch_bytes(src->data, offset, dst, len);
default:
WARN_ONCE(true, "bpf_dynptr_read: unknown dynptr type %d\n",
type);
return -EFAULT;
}
}
-BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, const struct bpf_dynptr_kern *, src,
-	   u32, offset, u64, flags)
+BPF_CALL_5(bpf_dynptr_read, void *, dst, u64, len, const struct bpf_dynptr_kern *, src,
+	   u64, offset, u64, flags)
{
return __bpf_dynptr_read(dst, len, src, offset, flags);
}
@@@ -1859,8 -1808,8 +1859,8 @@@ static const struct bpf_func_proto bpf_
.arg5_type = ARG_ANYTHING,
};
-int __bpf_dynptr_write(const struct bpf_dynptr_kern *dst, u32 offset, void *src,
-		       u32 len, u64 flags)
+int __bpf_dynptr_write(const struct bpf_dynptr_kern *dst, u64 offset, void *src,
+		       u64 len, u64 flags)
{
enum bpf_dynptr_type type;
int err;
@@@ -1893,18 -1842,16 +1893,16 @@@
return -EINVAL;
return __bpf_xdp_store_bytes(dst->data, dst->offset + offset,
src, len);
case BPF_DYNPTR_TYPE_SKB_META:
- if (flags)
- return -EINVAL;
-		memmove(bpf_skb_meta_pointer(dst->data, dst->offset + offset), src, len);
-		return 0;
+		return __bpf_skb_meta_store_bytes(dst->data, dst->offset + offset, src,
+						  len, flags);
default:
WARN_ONCE(true, "bpf_dynptr_write: unknown dynptr type %d\n",
type);
return -EFAULT;
}
}
-BPF_CALL_5(bpf_dynptr_write, const struct bpf_dynptr_kern *, dst, u32, offset, void *, src,
-	   u32, len, u64, flags)
+BPF_CALL_5(bpf_dynptr_write, const struct bpf_dynptr_kern *, dst, u64, offset, void *, src,
+	   u64, len, u64, flags)
{
return __bpf_dynptr_write(dst, offset, src, len, flags);
}
@@@ -1920,7 -1867,7 +1918,7 @@@ static const struct bpf_func_proto bpf_
.arg5_type = ARG_ANYTHING,
};
-BPF_CALL_3(bpf_dynptr_data, const struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
+BPF_CALL_3(bpf_dynptr_data, const struct bpf_dynptr_kern *, ptr, u64, offset, u64, len)
{
enum bpf_dynptr_type type;
int err;
@@@ -2735,12 -2682,12 +2733,12 @@@ __bpf_kfunc struct task_struct *bpf_tas
* provided buffer, with its contents containing the data, if unable to obtain
* direct pointer)
*/
-__bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr *p, u32 offset,
- void *buffer__opt, u32 buffer__szk)
+__bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr *p, u64 offset,
+ void *buffer__opt, u64 buffer__szk)
{
const struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
enum bpf_dynptr_type type;
- u32 len = buffer__szk;
+ u64 len = buffer__szk;
int err;
if (!ptr->data)
@@@ -2774,9 -2721,6 +2772,9 @@@
}
case BPF_DYNPTR_TYPE_SKB_META:
return bpf_skb_meta_pointer(ptr->data, ptr->offset + offset);
+ case BPF_DYNPTR_TYPE_FILE:
+		err = bpf_file_fetch_bytes(ptr->data, offset, buffer__opt, buffer__szk);
+ return err ? NULL : buffer__opt;
default:
WARN_ONCE(true, "unknown dynptr type %d\n", type);
return NULL;
@@@ -2825,8 -2769,8 +2823,8 @@@
* provided buffer, with its contents containing the data, if unable to obtain
* direct pointer)
*/
-__bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr *p, u32 offset,
-					void *buffer__opt, u32 buffer__szk)
+__bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr *p, u64 offset,
+					void *buffer__opt, u64 buffer__szk)
{
const struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
@@@ -2858,10 -2802,10 +2856,10 @@@
return bpf_dynptr_slice(p, offset, buffer__opt, buffer__szk);
}
-__bpf_kfunc int bpf_dynptr_adjust(const struct bpf_dynptr *p, u32 start, u32 end)
+__bpf_kfunc int bpf_dynptr_adjust(const struct bpf_dynptr *p, u64 start, u64 end)
{
struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
- u32 size;
+ u64 size;
if (!ptr->data || start > end)
return -EINVAL;
@@@ -2871,7 -2815,7 +2869,7 @@@
if (start > size || end > size)
return -ERANGE;
- ptr->offset += start;
+ bpf_dynptr_advance_offset(ptr, start);
bpf_dynptr_set_size(ptr, end - start);
return 0;
@@@ -2894,7 -2838,7 +2892,7 @@@ __bpf_kfunc bool bpf_dynptr_is_rdonly(c
return __bpf_dynptr_is_rdonly(ptr);
}
-__bpf_kfunc __u32 bpf_dynptr_size(const struct bpf_dynptr *p)
+__bpf_kfunc u64 bpf_dynptr_size(const struct bpf_dynptr *p)
{
struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
@@@ -2931,14 -2875,14 +2929,14 @@@ __bpf_kfunc int bpf_dynptr_clone(const
* Copies data from source dynptr to destination dynptr.
* Returns 0 on success; negative error, otherwise.
*/
-__bpf_kfunc int bpf_dynptr_copy(struct bpf_dynptr *dst_ptr, u32 dst_off,
-				struct bpf_dynptr *src_ptr, u32 src_off, u32 size)
+__bpf_kfunc int bpf_dynptr_copy(struct bpf_dynptr *dst_ptr, u64 dst_off,
+				struct bpf_dynptr *src_ptr, u64 src_off, u64 size)
{
struct bpf_dynptr_kern *dst = (struct bpf_dynptr_kern *)dst_ptr;
struct bpf_dynptr_kern *src = (struct bpf_dynptr_kern *)src_ptr;
void *src_slice, *dst_slice;
char buf[256];
- u32 off;
+ u64 off;
src_slice = bpf_dynptr_slice(src_ptr, src_off, NULL, size);
dst_slice = bpf_dynptr_slice_rdwr(dst_ptr, dst_off, NULL, size);
@@@ -2960,7 -2904,7 +2958,7 @@@
off = 0;
while (off < size) {
- u32 chunk_sz = min_t(u32, sizeof(buf), size - off);
+ u64 chunk_sz = min_t(u64, sizeof(buf), size - off);
int err;
err = __bpf_dynptr_read(buf, chunk_sz, src, src_off + off, 0);
@@@ -2986,10 -2930,10 +2984,10 @@@
* at @offset with the constant byte @val.
* Returns 0 on success; negative error, otherwise.
*/
- __bpf_kfunc int bpf_dynptr_memset(struct bpf_dynptr *p, u32 offset, u32 size, u8 val)
- {
+__bpf_kfunc int bpf_dynptr_memset(struct bpf_dynptr *p, u64 offset, u64 size, u8 val)
+{
struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
- u32 chunk_sz, write_off;
+ u64 chunk_sz, write_off;
char buf[256];
void* slice;
int err;
@@@ -3008,11 -2952,11 +3006,11 @@@
return err;
/* Non-linear data under the dynptr, write from a local buffer */
- chunk_sz = min_t(u32, sizeof(buf), size);
+ chunk_sz = min_t(u64, sizeof(buf), size);
memset(buf, val, chunk_sz);
for (write_off = 0; write_off < size; write_off += chunk_sz) {
- chunk_sz = min_t(u32, sizeof(buf), size - write_off);
+ chunk_sz = min_t(u64, sizeof(buf), size - write_off);
err = __bpf_dynptr_write(ptr, offset + write_off, buf,
chunk_sz, 0);
if (err)
return err;
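The `bpf_dynptr_memset()` hunk above handles non-linear data by filling a small stack buffer once and writing it in chunks, trimming the final chunk. A userspace sketch of that chunking pattern (here `dst` is plain memory standing in for a dynptr, and the buffer is shrunk from the kernel's 256 bytes to 8 to exercise the loop):

```c
#include <string.h>

static int memset_chunked(unsigned char *dst, size_t size, unsigned char val)
{
	unsigned char buf[8];	/* the kernel uses a 256-byte buffer */
	size_t chunk = size < sizeof(buf) ? size : sizeof(buf);
	size_t off;

	memset(buf, val, chunk);
	for (off = 0; off < size; off += chunk) {
		chunk = (size - off) < sizeof(buf) ? (size - off) : sizeof(buf);
		memcpy(dst + off, buf, chunk);	/* stands in for the dynptr write */
	}
	return 0;
}
```

The same bounded-buffer loop appears in `bpf_dynptr_copy()` when neither side yields a contiguous slice.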
@@@ -3732,21 -3676,34 +3730,21 @@@ err_out
return -EFAULT;
}
-/**
- * bpf_strnstr - Find the first substring in a length-limited string
- * @s1__ign: The string to be searched
- * @s2__ign: The string to search for
- * @len: the maximum number of characters to search
- *
- * Return:
- * * >=0	 - Index of the first character of the first occurrence of @s2__ign
- * within the first @len characters of @s1__ign
- * * %-ENOENT - @s2__ign not found in the first @len characters of @s1__ign
- * * %-EFAULT - Cannot read one of the strings
- * * %-E2BIG - One of the strings is too large
- * * %-ERANGE - One of the strings is outside of kernel address space
- */
-__bpf_kfunc int bpf_strnstr(const char *s1__ign, const char *s2__ign, size_t len)
+static int __bpf_strnstr(const char *s1, const char *s2, size_t len,
+ bool ignore_case)
{
char c1, c2;
int i, j;
- if (!copy_from_kernel_nofault_allowed(s1__ign, 1) ||
- !copy_from_kernel_nofault_allowed(s2__ign, 1)) {
+ if (!copy_from_kernel_nofault_allowed(s1, 1) ||
+ !copy_from_kernel_nofault_allowed(s2, 1)) {
return -ERANGE;
}
guard(pagefault)();
for (i = 0; i < XATTR_SIZE_MAX; i++) {
for (j = 0; i + j <= len && j < XATTR_SIZE_MAX; j++) {
- __get_kernel_nofault(&c2, s2__ign + j, char, err_out);
+ __get_kernel_nofault(&c2, s2 + j, char, err_out);
if (c2 == '\0')
return i;
/*
@@@ -3756,13 -3713,7 +3754,13 @@@
*/
if (i + j == len)
break;
- __get_kernel_nofault(&c1, s1__ign + j, char, err_out);
+ __get_kernel_nofault(&c1, s1 + j, char, err_out);
+
+ if (ignore_case) {
+ c1 = tolower(c1);
+ c2 = tolower(c2);
+ }
+
if (c1 == '\0')
return -ENOENT;
if (c1 != c2)
@@@ -3772,7 -3723,7 +3770,7 @@@
return -E2BIG;
if (i + j == len)
return -ENOENT;
- s1__ign++;
+ s1++;
}
return -E2BIG;
err_out:
@@@ -3794,69 -3745,8 +3792,69 @@@
*/
__bpf_kfunc int bpf_strstr(const char *s1__ign, const char *s2__ign)
{
- return bpf_strnstr(s1__ign, s2__ign, XATTR_SIZE_MAX);
+ return __bpf_strnstr(s1__ign, s2__ign, XATTR_SIZE_MAX, false);
+}
+
+/**
+ * bpf_strcasestr - Find the first substring in a string, ignoring the case of
+ * the characters
+ * @s1__ign: The string to be searched
+ * @s2__ign: The string to search for
+ *
+ * Return:
+ * * >=0	 - Index of the first character of the first occurrence of @s2__ign
+ * within @s1__ign
+ * * %-ENOENT - @s2__ign is not a substring of @s1__ign
+ * * %-EFAULT - Cannot read one of the strings
+ * * %-E2BIG - One of the strings is too large
+ * * %-ERANGE - One of the strings is outside of kernel address space
+ */
+__bpf_kfunc int bpf_strcasestr(const char *s1__ign, const char *s2__ign)
+{
+ return __bpf_strnstr(s1__ign, s2__ign, XATTR_SIZE_MAX, true);
+}
+
+/**
+ * bpf_strnstr - Find the first substring in a length-limited string
+ * @s1__ign: The string to be searched
+ * @s2__ign: The string to search for
+ * @len: the maximum number of characters to search
+ *
+ * Return:
+ * * >=0	 - Index of the first character of the first occurrence of @s2__ign
+ * within the first @len characters of @s1__ign
+ * * %-ENOENT - @s2__ign not found in the first @len characters of @s1__ign
+ * * %-EFAULT - Cannot read one of the strings
+ * * %-E2BIG - One of the strings is too large
+ * * %-ERANGE - One of the strings is outside of kernel address space
+ */
+__bpf_kfunc int bpf_strnstr(const char *s1__ign, const char *s2__ign,
+ size_t len)
+{
+ return __bpf_strnstr(s1__ign, s2__ign, len, false);
+}
+
+/**
+ * bpf_strncasestr - Find the first substring in a length-limited string,
+ * ignoring the case of the characters
+ * @s1__ign: The string to be searched
+ * @s2__ign: The string to search for
+ * @len: the maximum number of characters to search
+ *
+ * Return:
+ * * >=0	 - Index of the first character of the first occurrence of @s2__ign
+ * within the first @len characters of @s1__ign
+ * * %-ENOENT - @s2__ign not found in the first @len characters of @s1__ign
+ * * %-EFAULT - Cannot read one of the strings
+ * * %-E2BIG - One of the strings is too large
+ * * %-ERANGE - One of the strings is outside of kernel address space
+ */
+__bpf_kfunc int bpf_strncasestr(const char *s1__ign, const char *s2__ign,
+ size_t len)
+{
+ return __bpf_strnstr(s1__ign, s2__ign, len, true);
}
+
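The factored-out `__bpf_strnstr()` above performs a length-limited substring search with optional case folding. A plain userspace sketch of the same search logic, without the kernel's nofault accessors and pagefault guard (the `find_substr` name is invented, and `-1` stands in for `-ENOENT`):

```c
#include <ctype.h>
#include <stddef.h>

static int find_substr(const char *s1, const char *s2, size_t len, int nocase)
{
	for (size_t i = 0; i < len && s1[i]; i++) {
		size_t j = 0;

		while (s2[j]) {
			char c1, c2 = s2[j];

			/* stop if the needle would run past len or past
			 * the end of the haystack */
			if (i + j >= len || !(c1 = s1[i + j]))
				break;
			if (nocase) {
				c1 = (char)tolower((unsigned char)c1);
				c2 = (char)tolower((unsigned char)c2);
			}
			if (c1 != c2)
				break;
			j++;
		}
		if (!s2[j])
			return (int)i;	/* needle exhausted: match at i */
	}
	return s2[0] ? -1 : 0;		/* empty needle matches at 0 */
}
```

`bpf_strstr()`/`bpf_strcasestr()` are then just this search with `len` pinned to the maximum, which is how the kernel wrappers above delegate to `__bpf_strnstr()`.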
#ifdef CONFIG_KEYS
/**
* bpf_lookup_user_key - lookup a key by its serial
@@@ -4314,54 -4204,6 +4312,54 @@@ __bpf_kfunc int bpf_task_work_schedule_
return bpf_task_work_schedule(task, tw, map__map, callback, aux__prog,
TWA_RESUME);
}
+static int make_file_dynptr(struct file *file, u32 flags, bool may_sleep,
+ struct bpf_dynptr_kern *ptr)
+{
+ struct bpf_dynptr_file_impl *state;
+
+ /* flags is currently unsupported */
+ if (flags) {
+ bpf_dynptr_set_null(ptr);
+ return -EINVAL;
+ }
+
+	state = bpf_mem_alloc(&bpf_global_ma, sizeof(struct bpf_dynptr_file_impl));
+ if (!state) {
+ bpf_dynptr_set_null(ptr);
+ return -ENOMEM;
+ }
+ state->offset = 0;
+	state->size = U64_MAX; /* Don't restrict size, as file may change anyways */
+ freader_init_from_file(&state->freader, NULL, 0, file, may_sleep);
+ bpf_dynptr_init(ptr, state, BPF_DYNPTR_TYPE_FILE, 0, 0);
+ bpf_dynptr_set_rdonly(ptr);
+ return 0;
+}
+
+__bpf_kfunc int bpf_dynptr_from_file(struct file *file, u32 flags, struct bpf_dynptr *ptr__uninit)
+{
+	return make_file_dynptr(file, flags, false, (struct bpf_dynptr_kern *)ptr__uninit);
+}
+
+int bpf_dynptr_from_file_sleepable(struct file *file, u32 flags, struct bpf_dynptr *ptr__uninit)
+{
+	return make_file_dynptr(file, flags, true, (struct bpf_dynptr_kern *)ptr__uninit);
+}
+
+__bpf_kfunc int bpf_dynptr_file_discard(struct bpf_dynptr *dynptr)
+{
+ struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)dynptr;
+ struct bpf_dynptr_file_impl *df = ptr->data;
+
+ if (!df)
+ return 0;
+
+ freader_cleanup(&df->freader);
+ bpf_mem_free(&bpf_global_ma, df);
+ bpf_dynptr_set_null(ptr);
+ return 0;
+}
+
__bpf_kfunc_end_defs();
static void bpf_task_work_cancel_scheduled(struct irq_work *irq_work)
@@@ -4532,17 -4374,13 +4530,17 @@@ BTF_ID_FLAGS(func, bpf_strnlen)
BTF_ID_FLAGS(func, bpf_strspn);
BTF_ID_FLAGS(func, bpf_strcspn);
BTF_ID_FLAGS(func, bpf_strstr);
+BTF_ID_FLAGS(func, bpf_strcasestr);
BTF_ID_FLAGS(func, bpf_strnstr);
+BTF_ID_FLAGS(func, bpf_strncasestr);
#if defined(CONFIG_BPF_LSM) && defined(CONFIG_CGROUPS)
BTF_ID_FLAGS(func, bpf_cgroup_read_xattr, KF_RCU)
#endif
BTF_ID_FLAGS(func, bpf_stream_vprintk_impl, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_task_work_schedule_signal_impl, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_task_work_schedule_resume_impl, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_dynptr_from_file, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_dynptr_file_discard)
BTF_KFUNCS_END(common_btf_ids)
static const struct btf_kfunc_id_set common_kfunc_set = {
@@@ -4583,7 -4421,7 +4581,7 @@@ late_initcall(kfunc_init)
/* Get a pointer to dynptr data up to len bytes for read only access. If
* the dynptr doesn't have continuous data up to len bytes, return NULL.
*/
-const void *__bpf_dynptr_data(const struct bpf_dynptr_kern *ptr, u32 len)
+const void *__bpf_dynptr_data(const struct bpf_dynptr_kern *ptr, u64 len)
{
const struct bpf_dynptr *p = (struct bpf_dynptr *)ptr;
@@@ -4594,19 -4432,9 +4592,19 @@@
* the dynptr doesn't have continuous data up to len bytes, or the dynptr
* is read only, return NULL.
*/
-void *__bpf_dynptr_data_rw(const struct bpf_dynptr_kern *ptr, u32 len)
+void *__bpf_dynptr_data_rw(const struct bpf_dynptr_kern *ptr, u64 len)
{
if (__bpf_dynptr_is_rdonly(ptr))
return NULL;
return (void *)__bpf_dynptr_data(ptr, len);
}
+
+void bpf_map_free_internal_structs(struct bpf_map *map, void *val)
+{
+ if (btf_record_has_field(map->record, BPF_TIMER))
+ bpf_obj_free_timer(map->record, val);
+ if (btf_record_has_field(map->record, BPF_WORKQUEUE))
+ bpf_obj_free_workqueue(map->record, val);
+ if (btf_record_has_field(map->record, BPF_TASK_WORK))
+ bpf_obj_free_task_work(map->record, val);
+}
diff --combined kernel/bpf/syscall.c
index 7561f66e3331e,80b86e9d3c39b..6589acc89ef8d
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@@ -158,7 -158,7 +158,7 @@@ static void maybe_wait_bpf_programs(str
*/
if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS ||
map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS)
- synchronize_rcu();
+ synchronize_rcu_expedited();
}
static void unpin_uptr_kaddr(void *kaddr)
@@@ -1234,6 -1234,7 +1234,7 @@@ int bpf_obj_name_cpy(char *dst, const c
return src - orig_src;
}
+ EXPORT_SYMBOL_GPL(bpf_obj_name_cpy);
int map_check_no_btf(const struct bpf_map *map,
const struct btf *btf,
@@@ -1493,7 -1494,6 +1494,7 @@@ static int map_create(union bpf_attr *a
case BPF_MAP_TYPE_STRUCT_OPS:
case BPF_MAP_TYPE_CPUMAP:
case BPF_MAP_TYPE_ARENA:
+ case BPF_MAP_TYPE_INSN_ARRAY:
if (!bpf_token_capable(token, CAP_BPF))
goto put_token;
break;
@@@ -1586,8 -1586,7 +1587,8 @@@
goto free_map;
}
} else if (attr->excl_prog_hash_size) {
- return -EINVAL;
+ err = -EINVAL;
+ goto free_map;
}
err = security_bpf_map_create(map, attr, token, uattr.is_kernel);
@@@ -1726,6 -1725,9 +1727,6 @@@ static int map_lookup_elem(union bpf_at
if (CHECK_ATTR(BPF_MAP_LOOKUP_ELEM))
return -EINVAL;
- if (attr->flags & ~BPF_F_LOCK)
- return -EINVAL;
-
CLASS(fd, f)(attr->map_fd);
map = __bpf_map_get(f);
if (IS_ERR(map))
@@@ -1733,9 -1735,9 +1734,9 @@@
if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ))
return -EPERM;
- if ((attr->flags & BPF_F_LOCK) &&
- !btf_record_has_field(map->record, BPF_SPIN_LOCK))
- return -EINVAL;
+ err = bpf_map_check_op_flags(map, attr->flags, BPF_F_LOCK);
+ if (err)
+ return err;
key = __bpf_copy_key(ukey, map->key_size);
if (IS_ERR(key))
@@@ -1798,9 -1800,11 +1799,9 @@@ static int map_update_elem(union bpf_at
goto err_put;
}
- if ((attr->flags & BPF_F_LOCK) &&
- !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
- err = -EINVAL;
+ err = bpf_map_check_op_flags(map, attr->flags, ~0);
+ if (err)
goto err_put;
- }
key = ___bpf_copy_key(ukey, map->key_size);
if (IS_ERR(key)) {
@@@ -2004,9 -2008,13 +2005,9 @@@ int generic_map_update_batch(struct bpf
void *key, *value;
int err = 0;
- if (attr->batch.elem_flags & ~BPF_F_LOCK)
- return -EINVAL;
-
- if ((attr->batch.elem_flags & BPF_F_LOCK) &&
- !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
- return -EINVAL;
- }
+ err = bpf_map_check_op_flags(map, attr->batch.elem_flags, BPF_F_LOCK);
+ if (err)
+ return err;
value_size = bpf_map_value_size(map);
@@@ -2063,9 -2071,12 +2064,9 @@@ int generic_map_lookup_batch(struct bpf
u32 value_size, cp, max_count;
int err;
- if (attr->batch.elem_flags & ~BPF_F_LOCK)
- return -EINVAL;
-
- if ((attr->batch.elem_flags & BPF_F_LOCK) &&
- !btf_record_has_field(map->record, BPF_SPIN_LOCK))
- return -EINVAL;
+ err = bpf_map_check_op_flags(map, attr->batch.elem_flags, BPF_F_LOCK);
+ if (err)
+ return err;
value_size = bpf_map_value_size(map);
@@@ -2320,7 -2331,7 +2321,7 @@@ static void bpf_audit_prog(const struc
return;
if (audit_enabled == AUDIT_OFF)
return;
- if (!in_irq() && !irqs_disabled())
+ if (!in_hardirq() && !irqs_disabled())
ctx = audit_context();
ab = audit_log_start(ctx, GFP_ATOMIC, AUDIT_BPF);
if (unlikely(!ab))
@@@ -2418,7 -2429,7 +2419,7 @@@ static void __bpf_prog_put(struct bpf_p
struct bpf_prog_aux *aux = prog->aux;
if (atomic64_dec_and_test(&aux->refcnt)) {
- if (in_irq() || irqs_disabled()) {
+ if (in_hardirq() || irqs_disabled()) {
INIT_WORK(&aux->work, bpf_prog_put_deferred);
schedule_work(&aux->work);
} else {
@@@ -2452,9 -2463,6 +2453,9 @@@ void notrace bpf_prog_inc_misses_counte
struct bpf_prog_stats *stats;
unsigned int flags;
+ if (unlikely(!prog->stats))
+ return;
+
stats = this_cpu_ptr(prog->stats);
flags = u64_stats_update_begin_irqsave(&stats->syncp);
u64_stats_inc(&stats->misses);
@@@ -2846,23 -2854,6 +2847,23 @@@ static int bpf_prog_verify_signature(st
return err;
}
+static int bpf_prog_mark_insn_arrays_ready(struct bpf_prog *prog)
+{
+ int err;
+ int i;
+
+ for (i = 0; i < prog->aux->used_map_cnt; i++) {
+		if (prog->aux->used_maps[i]->map_type != BPF_MAP_TYPE_INSN_ARRAY)
+ continue;
+
+ err = bpf_insn_array_ready(prog->aux->used_maps[i]);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
/* last field in 'union bpf_attr' used by this command */
#define BPF_PROG_LOAD_LAST_FIELD keyring_id
@@@ -3092,10 -3083,6 +3093,10 @@@ static int bpf_prog_load(union bpf_att
if (err < 0)
goto free_used_maps;
+ err = bpf_prog_mark_insn_arrays_ready(prog);
+ if (err < 0)
+ goto free_used_maps;
+
err = bpf_prog_alloc_id(prog);
if (err)
goto free_used_maps;
@@@ -5048,19 -5035,19 +5049,19 @@@ static int bpf_prog_get_info_by_fd(stru
struct bpf_insn *insns_sanitized;
bool fault;
- if (prog->blinded && !bpf_dump_raw_ok(file->f_cred)) {
+ if (!prog->blinded || bpf_dump_raw_ok(file->f_cred)) {
+		insns_sanitized = bpf_insn_prepare_dump(prog, file->f_cred);
+ if (!insns_sanitized)
+ return -ENOMEM;
+ uinsns = u64_to_user_ptr(info.xlated_prog_insns);
+ ulen = min_t(u32, info.xlated_prog_len, ulen);
+ fault = copy_to_user(uinsns, insns_sanitized, ulen);
+ kfree(insns_sanitized);
+ if (fault)
+ return -EFAULT;
+ } else {
info.xlated_prog_insns = 0;
- goto done;
}
- insns_sanitized = bpf_insn_prepare_dump(prog, file->f_cred);
- if (!insns_sanitized)
- return -ENOMEM;
- uinsns = u64_to_user_ptr(info.xlated_prog_insns);
- ulen = min_t(u32, info.xlated_prog_len, ulen);
- fault = copy_to_user(uinsns, insns_sanitized, ulen);
- kfree(insns_sanitized);
- if (fault)
- return -EFAULT;
}
if (bpf_prog_is_offloaded(prog->aux)) {
diff --combined net/core/filter.c
index df6ce85e48dcd,4124becf86047..616e0520a0bb8
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@@ -2458,13 -2458,6 +2458,13 @@@ BPF_CALL_3(bpf_clone_redirect, struct s
if (unlikely(flags & (~(BPF_F_INGRESS) | BPF_F_REDIRECT_INTERNAL)))
return -EINVAL;
+ /* BPF test infra's convert___skb_to_skb() can create type-less
+ * GSO packets. gso_features_check() will detect this as a bad
+ * offload. However, let's not leak them out in the first place.
+ */
+ if (unlikely(skb_is_gso(skb) && !skb_shinfo(skb)->gso_type))
+ return -EBADMSG;
+
dev = dev_get_by_index_rcu(dev_net(skb->dev), ifindex);
if (unlikely(!dev))
return -EINVAL;
@@@ -3260,11 -3253,11 +3260,11 @@@ static void bpf_skb_change_protocol(str
static int bpf_skb_generic_push(struct sk_buff *skb, u32 off, u32 len)
{
- /* Caller already did skb_cow() with len as headroom,
+ /* Caller already did skb_cow() with meta_len+len as headroom,
* so no need to do it here.
*/
skb_push(skb, len);
- memmove(skb->data, skb->data + len, off);
+ skb_postpush_data_move(skb, len, off);
memset(skb->data + off, 0, len);
/* No skb_postpush_rcsum(skb, skb->data + off, len)
@@@ -3288,7 -3281,7 +3288,7 @@@ static int bpf_skb_generic_pop(struct s
old_data = skb->data;
__skb_pull(skb, len);
skb_postpull_rcsum(skb, old_data + off, len);
- memmove(skb->data, old_data, off);
+ skb_postpull_data_move(skb, len, off);
return 0;
}
@@@ -3333,10 -3326,11 +3333,11 @@@ static int bpf_skb_net_hdr_pop(struct s
static int bpf_skb_proto_4_to_6(struct sk_buff *skb)
{
const u32 len_diff = sizeof(struct ipv6hdr) - sizeof(struct iphdr);
+ const u8 meta_len = skb_metadata_len(skb);
u32 off = skb_mac_header_len(skb);
int ret;
- ret = skb_cow(skb, len_diff);
+ ret = skb_cow(skb, meta_len + len_diff);
if (unlikely(ret < 0))
return ret;
@@@ -3496,6 -3490,7 +3497,7 @@@ static int bpf_skb_net_grow(struct sk_b
u8 inner_mac_len = flags >> BPF_ADJ_ROOM_ENCAP_L2_SHIFT;
bool encap = flags & BPF_F_ADJ_ROOM_ENCAP_L3_MASK;
u16 mac_len = 0, inner_net = 0, inner_trans = 0;
+ const u8 meta_len = skb_metadata_len(skb);
unsigned int gso_type = SKB_GSO_DODGY;
int ret;
@@@ -3506,7 -3501,7 +3508,7 @@@
return -ENOTSUPP;
}
- ret = skb_cow_head(skb, len_diff);
+ ret = skb_cow_head(skb, meta_len + len_diff);
if (unlikely(ret < 0))
return ret;
@@@ -3880,6 -3875,7 +3882,7 @@@ static const struct bpf_func_proto sk_s
static inline int __bpf_skb_change_head(struct sk_buff *skb, u32 head_room,
u64 flags)
{
+ const u8 meta_len = skb_metadata_len(skb);
u32 max_len = BPF_SKB_MAX_LEN;
u32 new_len = skb->len + head_room;
int ret;
@@@ -3889,7 -3885,7 +3892,7 @@@
new_len < skb->len))
return -EINVAL;
- ret = skb_cow(skb, head_room);
+ ret = skb_cow(skb, meta_len + head_room);
if (likely(!ret)) {
/* Idea for this helper is that we currently only
* allow to expand on mac header. This means that
@@@ -3901,6 -3897,7 +3904,7 @@@
* for redirection into L2 device.
*/
__skb_push(skb, head_room);
+ skb_postpush_data_move(skb, head_room, 0);
memset(skb->data, 0, head_room);
skb_reset_mac_header(skb);
skb_reset_mac_len(skb);
@@@ -5741,6 -5738,77 +5745,77 @@@ static const struct bpf_func_proto bpf_
.arg5_type = ARG_CONST_SIZE,
};
+ static int sk_bpf_set_get_bypass_prot_mem(struct sock *sk,
+ char *optval, int optlen,
+ bool getopt)
+ {
+ int val;
+
+ if (optlen != sizeof(int))
+ return -EINVAL;
+
+ if (!sk_has_account(sk))
+ return -EOPNOTSUPP;
+
+ if (getopt) {
+ *(int *)optval = sk->sk_bypass_prot_mem;
+ return 0;
+ }
+
+ val = *(int *)optval;
+ if (val < 0 || val > 1)
+ return -EINVAL;
+
+ sk->sk_bypass_prot_mem = val;
+ return 0;
+ }
+
+ BPF_CALL_5(bpf_sock_create_setsockopt, struct sock *, sk, int, level,
+ int, optname, char *, optval, int, optlen)
+ {
+ if (level == SOL_SOCKET && optname == SK_BPF_BYPASS_PROT_MEM)
+		return sk_bpf_set_get_bypass_prot_mem(sk, optval, optlen, false);
+
+ return __bpf_setsockopt(sk, level, optname, optval, optlen);
+ }
+
+ static const struct bpf_func_proto bpf_sock_create_setsockopt_proto = {
+ .func = bpf_sock_create_setsockopt,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_CTX,
+ .arg2_type = ARG_ANYTHING,
+ .arg3_type = ARG_ANYTHING,
+ .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY,
+ .arg5_type = ARG_CONST_SIZE,
+ };
+
+ BPF_CALL_5(bpf_sock_create_getsockopt, struct sock *, sk, int, level,
+ int, optname, char *, optval, int, optlen)
+ {
+ if (level == SOL_SOCKET && optname == SK_BPF_BYPASS_PROT_MEM) {
+		int err = sk_bpf_set_get_bypass_prot_mem(sk, optval, optlen, true);
+
+ if (err)
+ memset(optval, 0, optlen);
+
+ return err;
+ }
+
+ return __bpf_getsockopt(sk, level, optname, optval, optlen);
+ }
+
+ static const struct bpf_func_proto bpf_sock_create_getsockopt_proto = {
+ .func = bpf_sock_create_getsockopt,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_CTX,
+ .arg2_type = ARG_ANYTHING,
+ .arg3_type = ARG_ANYTHING,
+ .arg4_type = ARG_PTR_TO_UNINIT_MEM,
+ .arg5_type = ARG_CONST_SIZE,
+ };
+
BPF_CALL_5(bpf_sock_ops_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
int, level, int, optname, char *, optval, int, optlen)
{
@@@ -5915,7 -5983,7 +5990,7 @@@ BPF_CALL_3(bpf_bind, struct bpf_sock_ad
return err;
if (((struct sockaddr_in *)addr)->sin_port == htons(0))
flags |= BIND_FORCE_ADDRESS_NO_PORT;
- return __inet_bind(sk, addr, addr_len, flags);
+		return __inet_bind(sk, (struct sockaddr_unsized *)addr, addr_len, flags);
#if IS_ENABLED(CONFIG_IPV6)
} else if (addr->sa_family == AF_INET6) {
if (addr_len < SIN6_LEN_RFC2133)
@@@ -5925,7 -5993,8 +6000,8 @@@
/* ipv6_bpf_stub cannot be NULL, since it's called from
* bpf_cgroup_inet6_connect hook and ipv6 is already loaded
*/
- return ipv6_bpf_stub->inet6_bind(sk, addr, addr_len, flags);
+		return ipv6_bpf_stub->inet6_bind(sk, (struct sockaddr_unsized *)addr,
+ addr_len, flags);
#endif /* CONFIG_IPV6 */
}
#endif /* CONFIG_INET */
@@@ -6429,12 -6498,9 +6505,12 @@@ BPF_CALL_5(bpf_skb_check_mtu, struct sk
*/
if (skb_is_gso(skb)) {
ret = BPF_MTU_CHK_RET_SUCCESS;
- if (flags & BPF_MTU_CHK_SEGS &&
- !skb_gso_validate_network_len(skb, mtu))
- ret = BPF_MTU_CHK_RET_SEGS_TOOBIG;
+ if (flags & BPF_MTU_CHK_SEGS) {
+ if (!skb_transport_header_was_set(skb))
+ return -EINVAL;
+ if (!skb_gso_validate_network_len(skb, mtu))
+ ret = BPF_MTU_CHK_RET_SEGS_TOOBIG;
+ }
}
out:
*mtu_len = mtu;
@@@ -8073,6 -8139,20 +8149,20 @@@ sock_filter_func_proto(enum bpf_func_i
return &bpf_sk_storage_get_cg_sock_proto;
case BPF_FUNC_ktime_get_coarse_ns:
return &bpf_ktime_get_coarse_ns_proto;
+ case BPF_FUNC_setsockopt:
+ switch (prog->expected_attach_type) {
+ case BPF_CGROUP_INET_SOCK_CREATE:
+ return &bpf_sock_create_setsockopt_proto;
+ default:
+ return NULL;
+ }
+ case BPF_FUNC_getsockopt:
+ switch (prog->expected_attach_type) {
+ case BPF_CGROUP_INET_SOCK_CREATE:
+ return &bpf_sock_create_getsockopt_proto;
+ default:
+ return NULL;
+ }
default:
return bpf_base_func_proto(func_id, prog);
}
@@@ -12026,6 -12106,18 +12116,18 @@@ void *bpf_skb_meta_pointer(struct sk_bu
return skb_metadata_end(skb) - skb_metadata_len(skb) + offset;
}
+ int __bpf_skb_meta_store_bytes(struct sk_buff *skb, u32 offset,
+ const void *from, u32 len, u64 flags)
+ {
+ if (unlikely(flags))
+ return -EINVAL;
+ if (unlikely(bpf_try_make_writable(skb, 0)))
+ return -EFAULT;
+
+ memmove(bpf_skb_meta_pointer(skb, offset), from, len);
+ return 0;
+ }
+
__bpf_kfunc_start_defs();
__bpf_kfunc int bpf_dynptr_from_skb(struct __sk_buff *s, u64 flags,
struct bpf_dynptr *ptr__uninit)
@@@ -12053,9 -12145,6 +12155,6 @@@
* XDP context with bpf_xdp_adjust_meta(). Serves as an alternative to
* &__sk_buff->data_meta.
*
- * If passed @skb_ is a clone which shares the data with the original, the
- * dynptr will be read-only. This limitation may be lifted in the future.
- *
* Return:
* * %0 - dynptr ready to use
* * %-EINVAL - invalid flags, dynptr set to null
@@@ -12073,9 -12162,6 +12172,6 @@@ __bpf_kfunc int bpf_dynptr_from_skb_met
bpf_dynptr_init(ptr, skb, BPF_DYNPTR_TYPE_SKB_META, 0,
skb_metadata_len(skb));
- if (skb_cloned(skb))
- bpf_dynptr_set_rdonly(ptr);
-
return 0;
}
diff --combined net/core/net_namespace.c
index 83cbec4afcb3a,dfad7c03b8094..a6e6a964a2876
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@@ -395,6 -395,7 +395,7 @@@ static __net_init void preinit_net_sysc
net->core.sysctl_optmem_max = 128 * 1024;
net->core.sysctl_txrehash = SOCK_TXREHASH_ENABLED;
net->core.sysctl_tstamp_allow_data = 1;
+ net->core.sysctl_txq_reselection = msecs_to_jiffies(1000);
}
/* init code that must occur even if setup_net() is not called. */
@@@ -439,7 -440,7 +440,7 @@@ static __net_init int setup_net(struct
LIST_HEAD(net_exit_list);
int error = 0;
- net->net_cookie = ns_tree_gen_id(&net->ns);
+ net->net_cookie = ns_tree_gen_id(net);
list_for_each_entry(ops, &pernet_list, list) {
error = ops_init(ops, net);
@@@ -1222,14 -1223,12 +1223,12 @@@ static void __init netns_ipv4_struct_ch
sysctl_tcp_wmem);
CACHELINE_ASSERT_GROUP_MEMBER(struct netns_ipv4, netns_ipv4_read_tx,
sysctl_ip_fwd_use_pmtu);
- CACHELINE_ASSERT_GROUP_SIZE(struct netns_ipv4, netns_ipv4_read_tx, 33);
-
- /* TXRX readonly hotpath cache lines */
- CACHELINE_ASSERT_GROUP_MEMBER(struct netns_ipv4, netns_ipv4_read_txrx,
- sysctl_tcp_moderate_rcvbuf);
- CACHELINE_ASSERT_GROUP_SIZE(struct netns_ipv4, netns_ipv4_read_txrx, 1);
/* RX readonly hotpath cache line */
+ CACHELINE_ASSERT_GROUP_MEMBER(struct netns_ipv4, netns_ipv4_read_rx,
+ sysctl_tcp_moderate_rcvbuf);
+ CACHELINE_ASSERT_GROUP_MEMBER(struct netns_ipv4, netns_ipv4_read_rx,
+ sysctl_tcp_rcvbuf_low_rtt);
CACHELINE_ASSERT_GROUP_MEMBER(struct netns_ipv4, netns_ipv4_read_rx,
sysctl_ip_early_demux);
CACHELINE_ASSERT_GROUP_MEMBER(struct netns_ipv4, netns_ipv4_read_rx,
@@@ -1240,7 -1239,6 +1239,6 @@@
sysctl_tcp_reordering);
CACHELINE_ASSERT_GROUP_MEMBER(struct netns_ipv4, netns_ipv4_read_rx,
sysctl_tcp_rmem);
- CACHELINE_ASSERT_GROUP_SIZE(struct netns_ipv4, netns_ipv4_read_rx, 22);
}
#endif
diff --combined net/socket.c
index e1bf93508f051,101a7ed574e75..809ef372727b3
--- a/net/socket.c
+++ b/net/socket.c
@@@ -503,12 -503,21 +503,12 @@@ EXPORT_SYMBOL(sock_alloc_file)
static int sock_map_fd(struct socket *sock, int flags)
{
- struct file *newfile;
- int fd = get_unused_fd_flags(flags);
- if (unlikely(fd < 0)) {
- sock_release(sock);
- return fd;
- }
-
- newfile = sock_alloc_file(sock, flags, NULL);
- if (!IS_ERR(newfile)) {
- fd_install(fd, newfile);
- return fd;
- }
+ int fd;
- put_unused_fd(fd);
- return PTR_ERR(newfile);
+ fd = FD_ADD(flags, sock_alloc_file(sock, flags, NULL));
+ if (fd < 0)
+ sock_release(sock);
+ return fd;
}
/**
@@@ -1863,7 -1872,7 +1863,7 @@@ int __sys_bind_socket(struct socket *so
addrlen);
if (!err)
err = READ_ONCE(sock->ops)->bind(sock,
- (struct sockaddr *)address,
+					 (struct sockaddr_unsized *)address,
addrlen);
return err;
}
@@@ -2003,6 -2012,8 +2003,6 @@@ static int __sys_accept4_file(struct fi
int __user *upeer_addrlen, int flags)
{
struct proto_accept_arg arg = { };
- struct file *newfile;
- int newfd;
if (flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
return -EINVAL;
@@@ -2010,7 -2021,18 +2010,7 @@@
if (SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK))
flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
- newfd = get_unused_fd_flags(flags);
- if (unlikely(newfd < 0))
- return newfd;
-
- newfile = do_accept(file, &arg, upeer_sockaddr, upeer_addrlen,
- flags);
- if (IS_ERR(newfile)) {
- put_unused_fd(newfd);
- return PTR_ERR(newfile);
- }
- fd_install(newfd, newfile);
- return newfd;
+	return FD_ADD(flags, do_accept(file, &arg, upeer_sockaddr, upeer_addrlen, flags));
}
/*
@@@ -2077,8 -2099,8 +2077,8 @@@ int __sys_connect_file(struct file *fil
if (err)
goto out;
- err = READ_ONCE(sock->ops)->connect(sock, (struct sockaddr *)address,
- addrlen, sock->file->f_flags | file_flags);
+	err = READ_ONCE(sock->ops)->connect(sock, (struct sockaddr_unsized *)address,
+					    addrlen, sock->file->f_flags | file_flags);
out:
return err;
}
@@@ -3561,13 -3583,13 +3561,13 @@@ static long compat_sock_ioctl(struct fi
* Returns 0 or an error.
*/
- int kernel_bind(struct socket *sock, struct sockaddr *addr, int addrlen)
+ int kernel_bind(struct socket *sock, struct sockaddr_unsized *addr, int addrlen)
{
struct sockaddr_storage address;
memcpy(&address, addr, addrlen);
- return READ_ONCE(sock->ops)->bind(sock, (struct sockaddr *)&address,
+	return READ_ONCE(sock->ops)->bind(sock, (struct sockaddr_unsized *)&address,
addrlen);
}
EXPORT_SYMBOL(kernel_bind);
@@@ -3640,14 -3662,14 +3640,14 @@@ EXPORT_SYMBOL(kernel_accept)
* Returns 0 or an error code.
*/
- int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
+ int kernel_connect(struct socket *sock, struct sockaddr_unsized *addr, int addrlen,
int flags)
{
struct sockaddr_storage address;
memcpy(&address, addr, addrlen);
- return READ_ONCE(sock->ops)->connect(sock, (struct sockaddr *)&address,
+	return READ_ONCE(sock->ops)->connect(sock, (struct sockaddr_unsized *)&address,
addrlen, flags);
}
EXPORT_SYMBOL(kernel_connect);
diff --combined net/unix/af_unix.c
index 45a606c013fcf,c627efc3698d3..55cdebfa0da02
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@@ -733,19 -733,7 +733,7 @@@ static void unix_release_sock(struct so
/* ---- Socket is dead now and most probably destroyed ---- */
- /*
- * Fixme: BSD difference: In BSD all sockets connected to us get
- * ECONNRESET and we die on the spot. In Linux we behave
- * like files and pipes do and wait for the last
- * dereference.
- *
- * Can't we simply set sock->err?
- *
- * What the above comment does talk about? --ANK(980817)
- */
-
- if (READ_ONCE(unix_tot_inflight))
- unix_gc(); /* Garbage collect fds */
+ unix_schedule_gc(NULL);
}
struct unix_peercred {
@@@ -854,8 -842,8 +842,8 @@@ out
}
static int unix_release(struct socket *);
- static int unix_bind(struct socket *, struct sockaddr *, int);
- static int unix_stream_connect(struct socket *, struct sockaddr *,
+ static int unix_bind(struct socket *, struct sockaddr_unsized *, int);
+ static int unix_stream_connect(struct socket *, struct sockaddr_unsized *,
int addr_len, int flags);
static int unix_socketpair(struct socket *, struct socket *);
static int unix_accept(struct socket *, struct socket *, struct proto_accept_arg *arg);
@@@ -877,7 -865,7 +865,7 @@@ static int unix_dgram_sendmsg(struct so
static int unix_dgram_recvmsg(struct socket *, struct msghdr *, size_t, int);
static int unix_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
static int unix_stream_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
- static int unix_dgram_connect(struct socket *, struct sockaddr *,
+ static int unix_dgram_connect(struct socket *, struct sockaddr_unsized *,
int, int);
static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *, size_t);
static int unix_seqpacket_recvmsg(struct socket *, struct msghdr *, size_t,
@@@ -1210,16 -1198,25 +1198,16 @@@ static struct sock *unix_find_bsd(struc
unix_mkname_bsd(sunaddr, addr_len);
if (flags & SOCK_COREDUMP) {
- const struct cred *cred;
- struct cred *kcred;
struct path root;
- kcred = prepare_kernel_cred(&init_task);
- if (!kcred) {
- err = -ENOMEM;
- goto fail;
- }
-
task_lock(&init_task);
get_fs_root(init_task.fs, &root);
task_unlock(&init_task);
- cred = override_creds(kcred);
- err = vfs_path_lookup(root.dentry, root.mnt, sunaddr->sun_path,
- LOOKUP_BENEATH | LOOKUP_NO_SYMLINKS |
- LOOKUP_NO_MAGICLINKS, &path);
- put_cred(revert_creds(cred));
+ scoped_with_kernel_creds()
+			err = vfs_path_lookup(root.dentry, root.mnt, sunaddr->sun_path,
+					      LOOKUP_BENEATH | LOOKUP_NO_SYMLINKS |
+					      LOOKUP_NO_MAGICLINKS, &path);
path_put(&root);
if (err)
goto fail;
@@@ -1390,7 -1387,7 +1378,7 @@@ static int unix_bind_bsd(struct sock *s
idmap = mnt_idmap(parent.mnt);
err = security_path_mknod(&parent, dentry, mode, 0);
if (!err)
- err = vfs_mknod(idmap, d_inode(parent.dentry), dentry, mode, 0);
+		err = vfs_mknod(idmap, d_inode(parent.dentry), dentry, mode, 0, NULL);
if (err)
goto out_path;
err = mutex_lock_interruptible(&u->bindlock);
@@@ -1468,7 -1465,7 +1456,7 @@@ out
return err;
}
- static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
+ static int unix_bind(struct socket *sock, struct sockaddr_unsized *uaddr, int addr_len)
{
struct sockaddr_un *sunaddr = (struct sockaddr_un *)uaddr;
struct sock *sk = sock->sk;
@@@ -1514,7 -1511,7 +1502,7 @@@ static void unix_state_double_unlock(st
unix_state_unlock(sk2);
}
- static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
+ static int unix_dgram_connect(struct socket *sock, struct sockaddr_unsized *addr,
int alen, int flags)
{
struct sockaddr_un *sunaddr = (struct sockaddr_un *)addr;
@@@ -1633,7 -1630,7 +1621,7 @@@ static long unix_wait_for_peer(struct s
return timeo;
}
- static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
+ static int unix_stream_connect(struct socket *sock, struct sockaddr_unsized *uaddr,
int addr_len, int flags)
{
struct sockaddr_un *sunaddr = (struct sockaddr_un *)uaddr;
@@@ -2101,8 -2098,6 +2089,6 @@@ static int unix_dgram_sendmsg(struct so
if (err < 0)
return err;
- wait_for_unix_gc(scm.fp);
-
if (msg->msg_flags & MSG_OOB) {
err = -EOPNOTSUPP;
goto out;
@@@ -2396,8 -2391,6 +2382,6 @@@ static int unix_stream_sendmsg(struct s
if (err < 0)
return err;
- wait_for_unix_gc(scm.fp);
-
if (msg->msg_flags & MSG_OOB) {
err = -EOPNOTSUPP;
#if IS_ENABLED(CONFIG_AF_UNIX_OOB)
@@@ -3276,6 -3269,9 +3260,6 @@@ EXPORT_SYMBOL_GPL(unix_outq_len)
static int unix_open_file(struct sock *sk)
{
- struct file *f;
- int fd;
-
if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
return -EPERM;
@@@ -3285,7 -3281,18 +3269,7 @@@
if (!unix_sk(sk)->path.dentry)
return -ENOENT;
- fd = get_unused_fd_flags(O_CLOEXEC);
- if (fd < 0)
- return fd;
-
- f = dentry_open(&unix_sk(sk)->path, O_PATH, current_cred());
- if (IS_ERR(f)) {
- put_unused_fd(fd);
- return PTR_ERR(f);
- }
-
- fd_install(fd, f);
- return fd;
+	return FD_ADD(O_CLOEXEC, dentry_open(&unix_sk(sk)->path, O_PATH, current_cred()));
}
static int unix_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
diff --combined tools/include/uapi/linux/bpf.h
index f5713f59ac10a,9b17d937edf73..be7d8e060e104
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@@ -1026,7 -1026,6 +1026,7 @@@ enum bpf_map_type
BPF_MAP_TYPE_USER_RINGBUF,
BPF_MAP_TYPE_CGRP_STORAGE,
BPF_MAP_TYPE_ARENA,
+ BPF_MAP_TYPE_INSN_ARRAY,
__MAX_BPF_MAP_TYPE
};
@@@ -1431,9 -1430,6 +1431,9 @@@ enum
/* Do not translate kernel bpf_arena pointers to user pointers */
BPF_F_NO_USER_CONV = (1U << 18),
+
+/* Enable BPF ringbuf overwrite mode */
+ BPF_F_RB_OVERWRITE = (1U << 19),
};
/* Flags for BPF_PROG_QUERY. */
@@@ -5622,7 -5618,7 +5622,7 @@@ union bpf_attr
* Return
* *sk* if casting is valid, or **NULL** otherwise.
*
- * long bpf_dynptr_from_mem(void *data, u32 size, u64 flags, struct bpf_dynptr *ptr)
+ * long bpf_dynptr_from_mem(void *data, u64 size, u64 flags, struct bpf_dynptr *ptr)
* Description
* Get a dynptr to local memory *data*.
*
@@@ -5665,7 -5661,7 +5665,7 @@@
* Return
* Nothing. Always succeeds.
*
- * long bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr *src, u32 offset, u64 flags)
+ * long bpf_dynptr_read(void *dst, u64 len, const struct bpf_dynptr *src, u64 offset, u64 flags)
* Description
* Read *len* bytes from *src* into *dst*, starting from *offset*
* into *src*.
@@@ -5675,7 -5671,7 +5675,7 @@@
* of *src*'s data, -EINVAL if *src* is an invalid dynptr or if
* *flags* is not 0.
*
- * long bpf_dynptr_write(const struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
+ * long bpf_dynptr_write(const struct bpf_dynptr *dst, u64 offset, void *src, u64 len, u64 flags)
* Description
* Write *len* bytes from *src* into *dst*, starting from *offset*
* into *dst*.
@@@ -5696,7 -5692,7 +5696,7 @@@
 *		is a read-only dynptr or if *flags* is not correct. For skb-type dynptrs,
 *		other errors correspond to errors returned by **bpf_skb_store_bytes**\ ().
*
- * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u32 offset, u32 len)
+ * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u64 offset, u64 len)
* Description
* Get a pointer to the underlying dynptr data.
*
@@@ -6235,7 -6231,6 +6235,7 @@@ enum
BPF_RB_RING_SIZE = 1,
BPF_RB_CONS_POS = 2,
BPF_RB_PROD_POS = 3,
+ BPF_RB_OVERWRITE_POS = 4,
};
/* BPF ring buffer constants */
@@@ -7205,6 -7200,7 +7205,7 @@@ enum
	TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Get or Set TCP sock ops flags */
	SK_BPF_CB_FLAGS		= 1009, /* Get or set sock ops flags in socket */
+ SK_BPF_BYPASS_PROT_MEM = 1010, /* Get or Set sk->sk_bypass_prot_mem */
};
enum {
@@@ -7650,24 -7646,4 +7651,24 @@@ enum bpf_kfunc_flags
BPF_F_PAD_ZEROS = (1ULL << 0),
};
+/*
+ * Values of a BPF_MAP_TYPE_INSN_ARRAY entry must be of this type.
+ *
+ * Before the map is used, the orig_off field should point to an
+ * instruction inside the program being loaded. The other fields
+ * must be set to 0.
+ *
+ * After the program is loaded, the xlated_off will be adjusted
+ * by the verifier to point to the index of the original instruction
+ * in the xlated program. If the instruction is deleted, it will
+ * be set to (u32)-1. The jitted_off will be set to the corresponding
+ * offset in the jitted image of the program.
+ */
+struct bpf_insn_array_value {
+ __u32 orig_off;
+ __u32 xlated_off;
+ __u32 jitted_off;
+ __u32 :32;
+};
+
#endif /* _UAPI__LINUX_BPF_H__ */
--
LinuxNextTracking