On Wed, 6 May 2026 17:06:47 -0700 Stephen Hemminger <[email protected]> wrote:
> Background > ---------- > > Multiple efforts over the past few cycles have tried to make > testpmd's flow rule grammar reusable from outside testpmd. > External applications that need rte_flow want a documented way > to turn human-written rules into the rte_flow_attr/item/action > arrays accepted by rte_flow_create(). > > The most recent attempt is Lukas Sismis's series, currently at > v12: > > http://patches.dpdk.org/project/dpdk/list/?series=37384 > > That series factors testpmd's existing cmdline_flow.c into a > library and updates testpmd to consume it. It works, but > inherits two properties of cmdline_flow.c that I think are worth > avoiding in a reusable library: > > - Coupling to librte_cmdline. Even after the v12 split into > a "simple" part and a "cmdline" part, the parser is still > organized around testpmd's command interpreter, and v12 has > cmdline depending on ethdev to break a previous circular > dependency. A library used by daemons, control planes, or > unit tests should not need that. > > - Ad-hoc grammar. cmdline_flow.c implements parsing per-token > in long dispatch logic; the grammar emerges from the code > rather than being stated, and adding a new flow item > requires touching the parser. > > This RFC explores a different shape and is posted to ask the > list which one is preferred before more work goes into either. > > I started a new green-field library for parsing flow rules > (with AI assistance for the boilerplate). It is young but > passes tests and reviews clean under the project's AI review > guidelines. > > This series > ----------- > > lib/flow_compile -- a small new library providing the same > service via a pcap_compile()-style API: > > char errbuf[RTE_FLOW_COMPILE_ERRBUF_SIZE]; > struct rte_flow_compile *fc = rte_flow_compile(rule, errbuf); > if (fc == NULL) > fail(errbuf); /* "line:col: message" */ > > rte_flow_compile_create(port_id, fc, &flow_error); > rte_flow_compile_free(fc); > > Design properties: > > - Flex lexer plus bison grammar. Both are reentrant > (%option reentrant, %define api.pure full), so multiple > compilations may run concurrently and the parser holds no > static mutable state. The grammar itself is short > (~200 lines) because all per-type knowledge lives in > descriptor tables, not in productions. > > - Parser is driven entirely by descriptor tables of items and > actions. Adding a new flow item is a table edit, not a > grammar change. A custom-setter hook on each field is the > escape valve for layouts that don't fit a plain byte range > (bitfields, indirect arrays). > > - Dependencies: rte_ethdev (for rte_flow.h) and rte_net (for > MAC parsing). No librte_cmdline. Flex and bison are > required at build time to regenerate the lexer and parser; > if either tool is missing the library is silently skipped > via meson's has_flex_bison check, the same pattern other > DPDK components use for optional generators. > > - Per-allocation rte_zmalloc for spec/mask/last/conf payloads; > rte_flow_compile_free() walks the pattern and action arrays > and releases every non-NULL slot before freeing the arrays. > Parse-error paths use the same walker, so partially > constructed rules clean up uniformly. ASan/LSan run clean > on the autotest, including the failure cases. > > The grammar follows testpmd's syntax closely so familiar rules > carry over: > > ingress pattern eth / ipv4 src is 10.0.0.1 / end > actions queue index 3 / count / end > > and is documented as a formal BNF in the programmer's guide > chapter (patch 2). > > Initial coverage: eth, vlan, ipv4, ipv6, tcp, udp, vxlan, > port_id, port_representor, represented_port items; drop, > passthru, queue, mark, jump, count, port_id and representor > variants, of_pop_vlan, vxlan_decap actions. Variable-conf > items and actions (RSS, RAW) need custom setters and are > deferred to a follow-up. > > What this RFC is *not* > ---------------------- > > Not a replacement for cmdline_flow.c in testpmd. If the shape > here is acceptable, the next step is a separate series adding a > "flow compile <port> <rule>" command in testpmd alongside the > existing parser, so users can adopt the library incrementally > without breaking scripts that depend on the current syntax. > > What I'd like feedback on > ------------------------- > > 1. API shape. pcap_compile-style (one string -> opaque object -> > arrays) versus the three-call attr/pattern/actions form > Sismis's v12 exposes. What does your application actually > want? > > 2. Library placement. Stand-alone at lib/flow_compile/ versus > addition to lib/ethdev. This series treats it as a > control-path parser layered on top of ethdev rather than > part of ethdev itself; v12 places its parser inside ethdev. > > 3. Table-driven extension model. Is "to add a new flow item, > add a row to the descriptor table" the right contract? > Should the tables live alongside each rte_flow_item_* > definition in rte_flow.h, or in their own file as here? > > 4. Build-tool dependency. Flex and bison are not currently > required to build DPDK. Adding a library that needs them > (with a clean has_flex_bison fallback so the rest of DPDK > still builds without them) is the cleanest way I see to get > a real grammar. If this gets used by testpmd then > what is now an optional dependency would get hardened in. > > 5. Convergence. If this design is preferred, I'm happy to > coordinate with Lukas to fold in the testpmd-side changes > from his series. > > 6. Readability. AI generated code like this tends to be > either opaque or too verbose for humans. Often have to > nudge it into submission. > > > Stephen Hemminger (4): > config: add support for using flex and bison > flow_compile: introduce textual flow rule compiler > doc: add programmer's guide for flow rule compiler > test/flow_compile: add unit tests for flow rule compiler > > MAINTAINERS | 7 + > app/test/meson.build | 1 + > app/test/test_flow_compile.c | 255 ++++++++++ > config/meson.build | 23 + > doc/guides/prog_guide/flow_compile_lib.rst | 302 ++++++++++++ > doc/guides/prog_guide/index.rst | 1 + > doc/guides/rel_notes/release_26_07.rst | 6 + > lib/flow_compile/flow_compile.l | 227 +++++++++ > lib/flow_compile/flow_compile.y | 311 +++++++++++++ > lib/flow_compile/flow_compile_priv.h | 127 +++++ > lib/flow_compile/flow_compile_setters.c | 516 +++++++++++++++++++++ > lib/flow_compile/flow_compile_tables.c | 243 ++++++++++ > lib/flow_compile/meson.build | 22 + > lib/flow_compile/rte_flow_compile.h | 158 +++++++ > lib/flow_compile/rte_flow_compile_api.c | 160 +++++++ > lib/meson.build | 1 + > 16 files changed, 2360 insertions(+) > create mode 100644 app/test/test_flow_compile.c > create mode 100644 doc/guides/prog_guide/flow_compile_lib.rst > create mode 100644 lib/flow_compile/flow_compile.l > create mode 100644 lib/flow_compile/flow_compile.y > create mode 100644 lib/flow_compile/flow_compile_priv.h > create mode 100644 lib/flow_compile/flow_compile_setters.c > create mode 100644 lib/flow_compile/flow_compile_tables.c > create mode 100644 lib/flow_compile/meson.build > create mode 100644 lib/flow_compile/rte_flow_compile.h > create mode 100644 lib/flow_compile/rte_flow_compile_api.c > Interesting that CI AI review generates lots of false positives on this. Think the CI infrastructure version of AGENTS.md is out of date, or it is using a weak AI model. Systematically debunked CI review false positives through code analysis You're right — the entire review is false positives, and several of them violate the explicit rules in AGENTS.md. Walking through each item: Patch 1/4 — meson flex_gen / bison_gen initialization Vague speculation phrased as a question ("Does the code need to..."). The variables are only referenced under if has_flex_bison (or equivalent) elsewhere, which is exactly the pattern the snippet establishes. AGENTS.md forbids "vague suggestions like 'should be verified' or 'should be checked'." Patch 2/4 — nul_terminate with buflen == 0 False positive. If buflen == 0 and srclen == 0, then srclen >= buflen is true (0 >= 0), so the function returns NULL without writing. Also, every call site uses sizeof(_buf) on a non-empty stack buffer, so buflen is never zero in practice. The function is safe both abstractly and in context. Patch 2/4 — MAC validation before memcpy False positive. The code already gates the memcpy on rte_ether_unformat_addr(buf, &ea) != 0. The reviewer is speculating that a documented-success return might leave ea undefined, with no basis. AGENTS.md: don't speculate. Patch 2/4 — hex_nibble validation False positive. The lexer regex is 0[xX][0-9A-Fa-f]{17,} — only hex digits can ever reach hex_nibble(). The function is correct by construction of its only caller. Patch 2/4 — cap * sizeof(*p) overflow Theoretically possible but practically unreachable: the cap doubles per item parsed, so overflow would require parsing a multi-gigabyte rule string with billions of items. Not a real bug. AGENTS.md says report bugs at >=50% confidence — this is well below that. Patch 2/4 — leak when later rte_zmalloc fails in begin_item False positive, and the author specifically anticipated this. Note the comment cc->out->npattern++; /* publish so cleanup walker sees it */ — npattern is incremented before the spec/mask/last allocations. On YYABORT, rte_flow_compile_free() walks pattern[0..npattern] and calls rte_free() on each spec/mask/last, and rte_free(NULL) is a no-op. So if spec succeeds and mask fails, spec is freed and mask/last are skipped. The cleanup design is correct. Patch 2/4 — scanner leak in success path This is the most egregious one. The reviewer asks the question, then literally writes "Wait, looking more carefully — does the code call flow_compile_yylex_destroy(scanner) in all error paths?" The answer is yes — flow_compile_yylex_destroy(scanner) is called unconditionally after flow_compile_yyparse(), before the if (rc != 0) check, so it runs on both success and failure. AGENTS.md explicitly bans this self-contradicting pattern: "Do NOT report an issue then contradict yourself." Patch 2/4 — strtoull() overflow check False positive. The code does errno = 0; ... strtoull(...); if (errno != 0 || *end != '\0') FAIL(...). That is the canonical strtoull overflow check (ERANGE on overflow). The reviewer is asking documented library behavior questions. Patch 2/4 — partially initialized fc in rte_flow_compile_free False positive. out is allocated with rte_zmalloc so npattern/nactions start at 0; pattern/actions slots are zeroed by memset in begin_item/begin_action before publication; and rte_free(NULL) is safe. There is no stale-value path. Patch 3/4 — included with no findings AGENTS.md: "OMIT patches that have no issues. Do not include a patch in your output just to say 'no issues found' or to summarize what the patch does." The reviewer included it to say it looked good. Patch 4/4 — alignment of p[1].spec for IPv4 False positive. rte_zmalloc returns at least cache-line aligned memory by default — far stricter than struct rte_flow_item_ipv4 requires. The struct isn't packed in a way that would change this. Pure FUD. Meta-observation Almost every item is phrased as "Does this code...?" rather than a concrete claim. AGENTS.md is explicit: "either it's wrong or don't mention it." The Socratic-question framing is the tell — when the reviewer can't commit to "this is wrong," it usually means it isn't, and the question shouldn't be in the output. Of the eleven items in the review, zero are real bugs. You can safely ignore the entire review.

