Re: [Intel-wired-lan] [PATCH iwl-next 3/3] ice: switch to Page Pool
On Fri, Aug 08, 2025 at 03:08:41PM +0200, Paul Menzel wrote:
> Dear Michal,
>
>
> Thank you for your reply.
>
> Am 08.08.25 um 15:04 schrieb Michal Kubiak:
> > On Fri, Aug 08, 2025 at 02:03:43PM +0200, Paul Menzel wrote:
>
> > > Am 08.08.25 um 13:40 schrieb Michal Kubiak:
> > > > On Mon, Jul 07, 2025 at 03:58:37PM -0700, Jacob Keller wrote:
> > > > >
> > > > >
> > > > > On 7/4/2025 9:18 AM, Michal Kubiak wrote:
> > > > > > @@ -1075,16 +780,17 @@ void ice_clean_ctrl_rx_irq(struct
> > > > > > ice_rx_ring *rx_ring)
> > > > > >static int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int
> > > > > > budget)
> > > >
> > > > [...]
> > > >
> > > > > > @@ -1144,27 +841,35 @@ static int ice_clean_rx_irq(struct
> > > > > > ice_rx_ring *rx_ring, int budget)
> > > > > > if (ice_is_non_eop(rx_ring, rx_desc))
> > > > > > continue;
> > > > > > - ice_get_pgcnts(rx_ring);
> > > > > > xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog,
> > > > > > xdp_ring, rx_desc);
> > > > > > if (xdp_verdict == ICE_XDP_PASS)
> > > > > > goto construct_skb;
> > > > > > - total_rx_bytes += xdp_get_buff_len(xdp);
> > > > > > - total_rx_pkts++;
> > > > > > - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc,
> > > > > > xdp_verdict);
> > > > > > + if (xdp_verdict & (ICE_XDP_TX | ICE_XDP_REDIR))
> > > > > > + xdp_xmit |= xdp_verdict;
> > > > > > + total_rx_bytes += xdp_get_buff_len(&xdp->base);
> > > > > > + total_rx_pkts++;
> > > > > > + xdp->data = NULL;
> > > > > > + rx_ring->first_desc = ntc;
> > > > > > + rx_ring->nr_frags = 0;
> > > > > > continue;
> > > > > >construct_skb:
> > > > > > - skb = ice_build_skb(rx_ring, xdp);
> > > > > > + skb = xdp_build_skb_from_buff(&xdp->base);
> > > > > > +
> > > > > > /* exit if we failed to retrieve a buffer */
> > > > > > if (!skb) {
> > > > > >
> > > > > > rx_ring->ring_stats->rx_stats.alloc_page_failed++;
> > > > >
> > > > > This is not your fault, but we've been incorrectly incrementing
> > > > > alloc_page_failed here instead of alloc_buf_failed.
> > > > >
> > > >
> > > > Sure. It's a good idea to fix it while we're rewriting the Rx path.
> > > > Will be addressed in v2.
> > >
> > > Should this be a separate patch, that can be easily backported?
> > >
> > > […]
>
> > Do you mean that the patch should be included as part of the series, or
> > would you prefer it to be submitted as a standalone patch targeting the
> > 'net' tree?
> Good question. I do not know the rules. My gut would say a separate patch
> for 'net', but the others know best.
>
>
> Kind regards,
>
> Paul
Thank you for your thoughts, Paul!
In the end, I decided to send that fix as a separate patch to the "net" tree.
It seems to be the cleanest solution.
The link:
https://lore.kernel.org/intel-wired-lan/[email protected]/
Thanks,
Michal
Re: [Intel-wired-lan] [PATCH iwl-next 3/3] ice: switch to Page Pool
On Fri, Aug 08, 2025 at 02:03:43PM +0200, Paul Menzel wrote:
> Dear Michal, dear Jacob,
>
>
> One minor comment:
>
> Am 08.08.25 um 13:40 schrieb Michal Kubiak:
> > On Mon, Jul 07, 2025 at 03:58:37PM -0700, Jacob Keller wrote:
> > >
> > >
> > > On 7/4/2025 9:18 AM, Michal Kubiak wrote:
> > > > @@ -1075,16 +780,17 @@ void ice_clean_ctrl_rx_irq(struct ice_rx_ring
> > > > *rx_ring)
> > > > static int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
> >
> > [...]
> >
> > > > @@ -1144,27 +841,35 @@ static int ice_clean_rx_irq(struct ice_rx_ring
> > > > *rx_ring, int budget)
> > > > if (ice_is_non_eop(rx_ring, rx_desc))
> > > > continue;
> > > > - ice_get_pgcnts(rx_ring);
> > > > xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog,
> > > > xdp_ring, rx_desc);
> > > > if (xdp_verdict == ICE_XDP_PASS)
> > > > goto construct_skb;
> > > > - total_rx_bytes += xdp_get_buff_len(xdp);
> > > > - total_rx_pkts++;
> > > > - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc,
> > > > xdp_verdict);
> > > > + if (xdp_verdict & (ICE_XDP_TX | ICE_XDP_REDIR))
> > > > + xdp_xmit |= xdp_verdict;
> > > > + total_rx_bytes += xdp_get_buff_len(&xdp->base);
> > > > + total_rx_pkts++;
> > > > + xdp->data = NULL;
> > > > + rx_ring->first_desc = ntc;
> > > > + rx_ring->nr_frags = 0;
> > > > continue;
> > > > construct_skb:
> > > > - skb = ice_build_skb(rx_ring, xdp);
> > > > + skb = xdp_build_skb_from_buff(&xdp->base);
> > > > +
> > > > /* exit if we failed to retrieve a buffer */
> > > > if (!skb) {
> > > >
> > > > rx_ring->ring_stats->rx_stats.alloc_page_failed++;
> > >
> > > This is not your fault, but we've been incorrectly incrementing
> > > alloc_page_failed here instead of alloc_buf_failed.
> > >
> >
> > Sure. It's a good idea to fix it while we're rewriting the Rx path.
> > Will be addressed in v2.
>
> Should this be a separate patch, that can be easily backported?
>
> […]
>
>
> Kind regards,
>
> Paul
Hello Paul,
Do you mean that the patch should be included as part of the series, or would
you prefer it to be submitted as a standalone patch targeting the 'net' tree?
Thanks,
Michal
Re: [Intel-wired-lan] [PATCH iwl-next 3/3] ice: switch to Page Pool
On Mon, Jul 07, 2025 at 04:02:30PM -0700, Jacob Keller wrote:
>
>
> On 7/4/2025 9:18 AM, Michal Kubiak wrote:
> > @@ -1144,27 +841,35 @@ static int ice_clean_rx_irq(struct ice_rx_ring
> > *rx_ring, int budget)
> > if (ice_is_non_eop(rx_ring, rx_desc))
> > continue;
> >
> > - ice_get_pgcnts(rx_ring);
> > xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring,
> > rx_desc);
> > if (xdp_verdict == ICE_XDP_PASS)
> > goto construct_skb;
> > - total_rx_bytes += xdp_get_buff_len(xdp);
> > - total_rx_pkts++;
> >
> > - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
> > + if (xdp_verdict & (ICE_XDP_TX | ICE_XDP_REDIR))
> > + xdp_xmit |= xdp_verdict;
> > + total_rx_bytes += xdp_get_buff_len(&xdp->base);
> > + total_rx_pkts++;
> >
> > + xdp->data = NULL;
> > + rx_ring->first_desc = ntc;
>
> I can't seem to find a user for first_desc outside of the
> ice_clean_rx_irq_zc
>
> > + rx_ring->nr_frags = 0;
>
> Similarly, I can't find a user of nr_frags now that we have dropped the
> ice_put_rx_mbuf. We assign it, but don't seem to actually use it.
>

Good catch! Sure - I will remove both structure members in v2.

Thanks,
Michal
Re: [Intel-wired-lan] [PATCH iwl-next 3/3] ice: switch to Page Pool
On Mon, Jul 07, 2025 at 03:58:37PM -0700, Jacob Keller wrote:
>
>
> On 7/4/2025 9:18 AM, Michal Kubiak wrote:
> > @@ -1075,16 +780,17 @@ void ice_clean_ctrl_rx_irq(struct ice_rx_ring
> > *rx_ring)
> > static int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
[...]
> > @@ -1144,27 +841,35 @@ static int ice_clean_rx_irq(struct ice_rx_ring
> > *rx_ring, int budget)
> > if (ice_is_non_eop(rx_ring, rx_desc))
> > continue;
> >
> > - ice_get_pgcnts(rx_ring);
> > xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring,
> > rx_desc);
> > if (xdp_verdict == ICE_XDP_PASS)
> > goto construct_skb;
> > - total_rx_bytes += xdp_get_buff_len(xdp);
> > - total_rx_pkts++;
> >
> > - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
> > + if (xdp_verdict & (ICE_XDP_TX | ICE_XDP_REDIR))
> > + xdp_xmit |= xdp_verdict;
> > + total_rx_bytes += xdp_get_buff_len(&xdp->base);
> > + total_rx_pkts++;
> >
> > + xdp->data = NULL;
> > + rx_ring->first_desc = ntc;
> > + rx_ring->nr_frags = 0;
> > continue;
> > construct_skb:
> > - skb = ice_build_skb(rx_ring, xdp);
> > + skb = xdp_build_skb_from_buff(&xdp->base);
> > +
> > /* exit if we failed to retrieve a buffer */
> > if (!skb) {
> > rx_ring->ring_stats->rx_stats.alloc_page_failed++;
>
> This is not your fault, but we've been incorrectly incrementing
> alloc_page_failed here instead of alloc_buf_failed.
>
Sure. It's a good idea to fix it while we're rewriting the Rx path.
Will be addressed in v2.
> > xdp_verdict = ICE_XDP_CONSUMED;
>
> xdp_verdict is no longer used, so I don't think we need to modify it
> further here. It was previously being used as part of the call to
> ice_put_rx_mbuf.
>
You're right. I'll remove it in v2.
> > + xdp->data = NULL;
> > + rx_ring->first_desc = ntc;
> > + rx_ring->nr_frags = 0;
> > + break;
> > }
> > - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
> >
> > - if (!skb)
> > - break;
> > + xdp->data = NULL;
> > + rx_ring->first_desc = ntc;
> > + rx_ring->nr_frags = 0;
> >
>
> The failure case for !skb does the same as this, so would it make sense
> to move this block up to before !skb and just check the skb pointer
> afterwards?
>
Yeah. Olek and I have reorganized the logic here, so it should be
simplified in v2.
> > stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_RXE_S);
> > if (unlikely(ice_test_staterr(rx_desc->wb.status_error0
Thanks,
Michal
Re: [Intel-wired-lan] [PATCH iwl-next 3/3] ice: switch to Page Pool
On 7/4/2025 9:18 AM, Michal Kubiak wrote:
> @@ -1144,27 +841,35 @@ static int ice_clean_rx_irq(struct ice_rx_ring
> *rx_ring, int budget)
> if (ice_is_non_eop(rx_ring, rx_desc))
> continue;
>
> - ice_get_pgcnts(rx_ring);
> xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring,
> rx_desc);
> if (xdp_verdict == ICE_XDP_PASS)
> goto construct_skb;
> - total_rx_bytes += xdp_get_buff_len(xdp);
> - total_rx_pkts++;
>
> - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
> + if (xdp_verdict & (ICE_XDP_TX | ICE_XDP_REDIR))
> + xdp_xmit |= xdp_verdict;
> + total_rx_bytes += xdp_get_buff_len(&xdp->base);
> + total_rx_pkts++;
>
> + xdp->data = NULL;
> + rx_ring->first_desc = ntc;
I can't seem to find a user for first_desc outside of the
ice_clean_rx_irq_zc
> + rx_ring->nr_frags = 0;
Similarly, I can't find a user of nr_frags now that we have dropped the
ice_put_rx_mbuf. We assign it, but don't seem to actually use it.
> continue;
> construct_skb:
> - skb = ice_build_skb(rx_ring, xdp);
> + skb = xdp_build_skb_from_buff(&xdp->base);
> +
> /* exit if we failed to retrieve a buffer */
> if (!skb) {
> rx_ring->ring_stats->rx_stats.alloc_page_failed++;
> xdp_verdict = ICE_XDP_CONSUMED;
> + xdp->data = NULL;
> + rx_ring->first_desc = ntc;
> + rx_ring->nr_frags = 0;
> + break;
> }
> - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
>
> - if (!skb)
> - break;
> + xdp->data = NULL;
> + rx_ring->first_desc = ntc;
> + rx_ring->nr_frags = 0;
>
> stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_RXE_S);
> if (unlikely(ice_test_staterr(rx_desc->wb.status_error0,
Re: [Intel-wired-lan] [PATCH iwl-next 3/3] ice: switch to Page Pool
On 7/4/2025 9:18 AM, Michal Kubiak wrote:
> @@ -1075,16 +780,17 @@ void ice_clean_ctrl_rx_irq(struct ice_rx_ring *rx_ring)
> static int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
> {
> unsigned int total_rx_bytes = 0, total_rx_pkts = 0;
> - unsigned int offset = rx_ring->rx_offset;
> - struct xdp_buff *xdp = &rx_ring->xdp;
> struct ice_tx_ring *xdp_ring = NULL;
> struct bpf_prog *xdp_prog = NULL;
> u32 ntc = rx_ring->next_to_clean;
> + LIBETH_XDP_ONSTACK_BUFF(xdp);
> u32 cached_ntu, xdp_verdict;
> u32 cnt = rx_ring->count;
> u32 xdp_xmit = 0;
> bool failure;
>
> + libeth_xdp_init_buff(xdp, &rx_ring->xdp, &rx_ring->xdp_rxq);
> +
> xdp_prog = READ_ONCE(rx_ring->xdp_prog);
> if (xdp_prog) {
> xdp_ring = rx_ring->xdp_ring;
> @@ -1094,7 +800,7 @@ static int ice_clean_rx_irq(struct ice_rx_ring *rx_ring,
> int budget)
> /* start the loop to process Rx packets bounded by 'budget' */
> while (likely(total_rx_pkts < (unsigned int)budget)) {
> union ice_32b_rx_flex_desc *rx_desc;
> - struct ice_rx_buf *rx_buf;
> + struct libeth_fqe *rx_buf;
> struct sk_buff *skb;
> unsigned int size;
> u16 stat_err_bits;
> @@ -1124,19 +830,10 @@ static int ice_clean_rx_irq(struct ice_rx_ring
> *rx_ring, int budget)
> ICE_RX_FLX_DESC_PKT_LEN_M;
>
> /* retrieve a buffer from the ring */
> - rx_buf = ice_get_rx_buf(rx_ring, size, ntc);
> -
> - if (!xdp->data) {
> - void *hard_start;
> -
> - hard_start = page_address(rx_buf->page) +
> rx_buf->page_offset -
> - offset;
> - xdp_prepare_buff(xdp, hard_start, offset, size,
> !!offset);
> - xdp_buff_clear_frags_flag(xdp);
> - } else if (ice_add_xdp_frag(rx_ring, xdp, rx_buf, size)) {
> - ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc,
> ICE_XDP_CONSUMED);
> + rx_buf = &rx_ring->rx_fqes[ntc];
> + if (!libeth_xdp_process_buff(xdp, rx_buf, size))
> break;
> - }
> +
> if (++ntc == cnt)
> ntc = 0;
>
> @@ -1144,27 +841,35 @@ static int ice_clean_rx_irq(struct ice_rx_ring
> *rx_ring, int budget)
> if (ice_is_non_eop(rx_ring, rx_desc))
> continue;
>
> - ice_get_pgcnts(rx_ring);
> xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring,
> rx_desc);
> if (xdp_verdict == ICE_XDP_PASS)
> goto construct_skb;
> - total_rx_bytes += xdp_get_buff_len(xdp);
> - total_rx_pkts++;
>
> - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
> + if (xdp_verdict & (ICE_XDP_TX | ICE_XDP_REDIR))
> + xdp_xmit |= xdp_verdict;
> + total_rx_bytes += xdp_get_buff_len(&xdp->base);
> + total_rx_pkts++;
>
> + xdp->data = NULL;
> + rx_ring->first_desc = ntc;
> + rx_ring->nr_frags = 0;
> continue;
> construct_skb:
> - skb = ice_build_skb(rx_ring, xdp);
> + skb = xdp_build_skb_from_buff(&xdp->base);
> +
> /* exit if we failed to retrieve a buffer */
> if (!skb) {
> rx_ring->ring_stats->rx_stats.alloc_page_failed++;
This is not your fault, but we've been incorrectly incrementing
alloc_page_failed here instead of alloc_buf_failed.
> xdp_verdict = ICE_XDP_CONSUMED;
xdp_verdict is no longer used, so I don't think we need to modify it
further here. It was previously being used as part of the call to
ice_put_rx_mbuf.
> + xdp->data = NULL;
> + rx_ring->first_desc = ntc;
> + rx_ring->nr_frags = 0;
> + break;
> }
> - ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
>
> - if (!skb)
> - break;
> + xdp->data = NULL;
> + rx_ring->first_desc = ntc;
> + rx_ring->nr_frags = 0;
>
The failure case for !skb does the same as this, so would it make sense
to move this block up to before !skb and just check the skb pointer
afterwards?
> stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_RXE_S);
> if (unlikely(ice_test_staterr(rx_desc->wb.status_error0
