Control: tag -1 patch
On Tue, May 06, 2025 at 03:48:52PM +0200, Vincent Lefevre wrote:
> Package: libhtml-gumbo-perl
> Version: 0.18-4+b1
> Severity: serious
> Tags: security upstream
> Justification: security
> Forwarded: https://github.com/ruz/HTML-Gumbo/issues/6
> X-Debbugs-Cc: Debian Security Team <[email protected]>
>
> I get erratic behavior on the template HTML element, e.g. on
> the HTML file "<template>". For instance:
> ==64955== Command: perl -C -MHTML::Gumbo -e print\
> HTML::Gumbo-\>new-\>parse('\<template\>',\ format\ =\>\ 'string');
> ==64955==
> ==64955== Conditional jump or move depends on uninitialised value(s)
> ==64955== at 0x484DC89: strlen (vg_replace_strmem.c:505)
> ==64955== by 0x2AD7DF: ??? (in /usr/bin/perl)
> ==64955== by 0x486D6CE: tree_to_string (Gumbo.xs:189)
> ==64955== by 0x486E2C4: walk_tree.isra.0 (Gumbo.xs:55)
> ==64955== by 0x486E2C4: walk_tree.isra.0 (Gumbo.xs:55)
> ==64955== by 0x486E2C4: walk_tree.isra.0 (Gumbo.xs:55)
> ==64955== by 0x486E41B: parse_to_string_cb (Gumbo.xs:505)
The attached change does not make HTML::Gumbo support <template>
properly, but it seems to plug this specific hole and hence to
address the known security aspects.
I've checked that this doesn't break the (not very extensive) test
suite, and that the only reverse dependency in trixie, request-tracker5,
still builds with this.
Tentatively tagging 'patch', but eyeballs would be good.
I think full support for <template> should be a separate wishlist bug.
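For reviewers, here is a minimal standalone sketch of the failure mode the patch addresses. It does not use gumbo.h: Node, NodeType, takes_element_branch() and check_branches() are hypothetical stand-ins, and the only assumption carried over from the real GumboNode is that element and text data share storage in a union selected by the node type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the gumbo node types; not the real gumbo.h. */
typedef enum { NODE_DOCUMENT, NODE_ELEMENT, NODE_TEXT, NODE_TEMPLATE } NodeType;

typedef struct Node {
    NodeType type;
    union {
        /* Valid for document, element and template nodes. */
        struct { struct Node **children; size_t n_children; } element;
        /* Valid for text-like nodes only. */
        struct { const char *text; } text;
    } v;
} Node;

/*
 * Mirrors the shape of the type check in walk_tree(): returns 1 when the
 * node is handled as a container, 0 when it falls through to the text
 * branch. With patched == 0 a template node falls through, so the caller
 * would read v.text from a node whose union actually holds element data.
 */
static int takes_element_branch(const Node *node, int patched) {
    if (node->type == NODE_DOCUMENT || node->type == NODE_ELEMENT)
        return 1;
    return patched && node->type == NODE_TEMPLATE;
}

/* Exercises the old and the new check on a template and a text node. */
static void check_branches(void) {
    Node tmpl = { .type = NODE_TEMPLATE, .v.element = { NULL, 0 } };
    Node text = { .type = NODE_TEXT, .v.text = { "hello" } };

    assert(!takes_element_branch(&tmpl, 0)); /* old check: wrong branch */
    assert(takes_element_branch(&tmpl, 1));  /* new check: container */
    assert(!takes_element_branch(&text, 1)); /* text nodes are unaffected */
}
```

Calling check_branches() from a main() exercises both cases. Falling through to the text branch for a template node would be consistent with the strlen() on uninitialised data in the valgrind trace above, but the sketch only models the branch selection, not the read itself.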
--
Niko Tyni [email protected]
>From 549609cd80784012c274c11731e6a31787d3555e Mon Sep 17 00:00:00 2001
From: Niko Tyni <[email protected]>
Date: Sat, 17 May 2025 09:32:06 +0100
Subject: [PATCH] Fix wrong code path with GUMBO_NODE_TEMPLATE
GUMBO_NODE_TEMPLATE was introduced in Gumbo 0.10.0 but HTML-Gumbo has
not been updated to support that.
This makes walk_tree() take the text node branch for templates
and access uninitialized memory.
The gumbo C library seems to treat GUMBO_NODE_TEMPLATE very
similarly to GUMBO_NODE_ELEMENT. From
https://sources.debian.org/src/gumbo-parser/0.13.0%2Bdfsg-2/src/gumbo.h/#L304
/** Template node. This is separate from GUMBO_NODE_ELEMENT because many
* client libraries will want to ignore the contents of template nodes, as
* the spec suggests. Recursing on GUMBO_NODE_ELEMENT will do the right thing
* here, while clients that want to include template contents should also
* check for GUMBO_NODE_TEMPLATE. v will be a GumboElement. */
So we add it to the list of "special" container types in walk_tree()
that attach a GumboElement value rather than a GumboText.
Bug-Debian: https://bugs.debian.org/1104789
Bug: https://github.com/ruz/HTML-Gumbo/issues/6
---
lib/HTML/Gumbo.xs | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/HTML/Gumbo.xs b/lib/HTML/Gumbo.xs
index 97dfc43..32427d7 100644
--- a/lib/HTML/Gumbo.xs
+++ b/lib/HTML/Gumbo.xs
@@ -38,7 +38,7 @@ typedef enum {
STATIC
void
walk_tree(pTHX_ GumboNode* node, int flags, void (*cb)(pTHX_ PerlHtmlGumboType, GumboNode*, void*), void* ctx ) {
- if ( node->type == GUMBO_NODE_DOCUMENT || node->type == GUMBO_NODE_ELEMENT ) {
+ if ( node->type == GUMBO_NODE_DOCUMENT || node->type == GUMBO_NODE_ELEMENT || node->type == GUMBO_NODE_TEMPLATE ) {
GumboVector* children;
int skip = flags&PHG_FLAG_SKIP_ROOT_ELEMENT && node->type == GUMBO_NODE_ELEMENT && node->parent && node->parent->type == GUMBO_NODE_DOCUMENT;
if ( !skip ) {
--
2.49.0