Script 'mail_helper' called by obssrc
Hello community,

here is the log from the commit of package rubygem-loofah for openSUSE:Factory checked in at 2026-03-29 20:01:07
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/rubygem-loofah (Old)
 and      /work/SRC/openSUSE:Factory/.rubygem-loofah.new.8177 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "rubygem-loofah"

Sun Mar 29 20:01:07 2026 rev:28 rq:1343433 version:2.25.0

Changes:
--------
--- /work/SRC/openSUSE:Factory/rubygem-loofah/rubygem-loofah.changes 2024-11-07 16:27:55.522991133 +0100
+++ /work/SRC/openSUSE:Factory/.rubygem-loofah.new.8177/rubygem-loofah.changes 2026-03-29 20:01:36.007676176 +0200
@@ -1,0 +2,28 @@
+Fri Mar 13 11:58:25 UTC 2026 - Marcus Rueckert <[email protected]>
+
+- update to 2.25.0
+  * Extract `Loofah::HTML5::Scrub.allowed_uri?` which operates on
+    a string. Previously this logic was coupled to the parsed tree
+    in `.scrub_uri_attribute`. #300 @flavorjones
+  * Tightened up how entities and control characters are handled
+    when detecting allowed URIs. #301 @flavorjones
+
+-------------------------------------------------------------------
+Fri Jul 18 00:58:02 UTC 2025 - Marcus Rueckert <[email protected]>
+
+- update to 2.24.1
+  * Import only what's needed from `cgi` for support for Ruby 3.5
+    #296 @Earlopain
+
+-------------------------------------------------------------------
+Wed Jan 22 02:06:03 UTC 2025 - Marcus Rueckert <[email protected]>
+
+- update to 2.24.0
+  * Built-in scrubber `:double_breakpoint` which sees `<br><br>`
+    and wraps the surrounding content in `<p>` tags. #279, #284
+    @josecolella @torihuang
+  * Built-in scrubber `:targetblank` now skips `a` tags whose
+    `href` attribute is an anchor link. Previously, all `a` tags
+    were modified to have `target='_blank'`. #291 @fnando
+
+-------------------------------------------------------------------

Old:
----
  loofah-2.23.1.gem

New:
----
  loofah-2.25.0.gem

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ rubygem-loofah.spec ++++++
--- /var/tmp/diff_new_pack.2PHY3d/_old 2026-03-29 20:01:36.503696612 +0200
+++ /var/tmp/diff_new_pack.2PHY3d/_new 2026-03-29 20:01:36.503696612 +0200
@@ -1,7 +1,7 @@
 #
 # spec file for package rubygem-loofah
 #
-# Copyright (c) 2024 SUSE LLC
+# Copyright (c) 2026 SUSE LLC and contributors
 #
 # All modifications and additions to the file contributed by third parties
 # remain the property of their copyright owners, unless otherwise agreed
@@ -24,7 +24,7 @@
 #
 
 Name:           rubygem-loofah
-Version:        2.23.1
+Version:        2.25.0
 Release:        0
 %define mod_name loofah
 %define mod_full_name %{mod_name}-%{version}
@@ -50,6 +50,7 @@
 
 %install
 %gem_install \
+  --no-rdoc --no-ri \
   --doc-files="CHANGELOG.md MIT-LICENSE.txt README.md" \
   -f
 

++++++ loofah-2.23.1.gem -> loofah-2.25.0.gem ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/CHANGELOG.md new/CHANGELOG.md
--- old/CHANGELOG.md 2024-10-25 14:43:25.000000000 +0200
+++ new/CHANGELOG.md 1980-01-02 01:00:00.000000000 +0100
@@ -1,5 +1,29 @@
 # Changelog
 
+## 2.25.0 / 2025-12-15
+
+* Extract `Loofah::HTML5::Scrub.allowed_uri?` which operates on a string. Previously this logic was coupled to the parsed tree in `.scrub_uri_attribute`. #300 @flavorjones
+* Tightened up how entities and control characters are handled when detecting allowed URIs. #301 @flavorjones
+
+
+## 2.24.1 / 2025-05-12
+
+### Ruby support
+
+* Import only what's needed from `cgi` for support for Ruby 3.5 #296 @Earlopain
+
+
+## 2.24.0 / 2024-12-24
+
+### Added
+
+* Built-in scrubber `:double_breakpoint` which sees `<br><br>` and wraps the surrounding content in `<p>` tags. #279, #284 @josecolella @torihuang
+
+### Improved
+
+* Built-in scrubber `:targetblank` now skips `a` tags whose `href` attribute is an anchor link. Previously, all `a` tags were modified to have `target='_blank'`. #291 @fnando
+
+
 ## 2.23.1 / 2024-10-25
 
 ### Added
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/README.md new/README.md
--- old/README.md 2024-10-25 14:43:25.000000000 +0200
+++ new/README.md 1980-01-02 01:00:00.000000000 +0100
@@ -31,6 +31,8 @@
   * Add the _nofollow_ attribute to all hyperlinks.
   * Add the _target=\_blank_ attribute to all hyperlinks.
   * Remove _unprintable_ characters from text nodes.
+* Some specialized HTML transformations are also built-in:
+  * Where `<br><br>` exists inside a `p` tag, close the `p` and open a new one.
 * Format markup as plain text, with (or without) sensible whitespace handling around block elements.
 * Replace Rails's `strip_tags` and `sanitize` view helper methods.
 
@@ -227,14 +229,15 @@
 # and strips all node attributes
 ```
 
-Loofah also comes with some common transformation tasks:
+Loofah also comes with built-in scrubers for some common transformation tasks:
 
 ``` ruby
-doc.scrub!(:nofollow)     # adds rel="nofollow" attribute to links
-doc.scrub!(:noopener)     # adds rel="noopener" attribute to links
-doc.scrub!(:noreferrer)   # adds rel="noreferrer" attribute to links
-doc.scrub!(:unprintable)  # removes unprintable characters from text nodes
-doc.scrub!(:targetblank)  # adds target="_blank" attribute to links
+doc.scrub!(:nofollow)          # adds rel="nofollow" attribute to links
+doc.scrub!(:noopener)          # adds rel="noopener" attribute to links
+doc.scrub!(:noreferrer)        # adds rel="noreferrer" attribute to links
+doc.scrub!(:unprintable)       # removes unprintable characters from text nodes
+doc.scrub!(:targetblank)       # adds target="_blank" attribute to links
+doc.scrub!(:double_breakpoint) # where `<br><br>` appears in a `p` tag, close the `p` and open a new one
 ```
 
 See `Loofah::Scrubbers` for more details and example usage.
Binary files old/checksums.yaml.gz and new/checksums.yaml.gz differ
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/loofah/html5/scrub.rb new/lib/loofah/html5/scrub.rb
--- old/lib/loofah/html5/scrub.rb 2024-10-25 14:43:25.000000000 +0200
+++ new/lib/loofah/html5/scrub.rb 1980-01-02 01:00:00.000000000 +0100
@@ -1,6 +1,7 @@
 # frozen_string_literal: true
 
-require "cgi"
+require "cgi/escape"
+require "cgi/util" if RUBY_VERSION < "3.5"
 require "crass"
 
 module Loofah
@@ -13,6 +14,7 @@
       CSS_WHITESPACE = " "
       CSS_PROPERTY_STRING_WITHOUT_EMBEDDED_QUOTES = /\A(["'])?[^"']+\1\z/
       DATA_ATTRIBUTE_NAME = /\Adata-[\w-]+\z/
+      URI_PROTOCOL_REGEX = /\A[a-z][a-z0-9+\-.]*:/ # RFC 3986
 
       class << self
         def allowed_element?(element_name)
@@ -139,23 +141,33 @@
           attr_node.value = values.join(" ")
         end
 
+        # Returns true if the given URI string is safe, false otherwise.
+        # This method can be used to validate URI attribute values without
+        # requiring a Nokogiri DOM node.
+        def allowed_uri?(uri_string)
+          # this logic lifted nearly verbatim from HTML5 sanitization
+          val_unescaped = CGI.unescapeHTML(uri_string.gsub(CONTROL_CHARACTERS, "")).gsub("&colon;", ":").downcase
+          if URI_PROTOCOL_REGEX.match?(val_unescaped)
+            protocol = val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[0]
+            return false unless SafeList::ALLOWED_PROTOCOLS.include?(protocol)
+
+            if protocol == "data"
+              # permit only allowed data mediatypes
+              mediatype = val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[1]
+              mediatype, _ = mediatype.split(/[;,]/)[0..1] if mediatype
+              return false if mediatype && !SafeList::ALLOWED_URI_DATA_MEDIATYPES.include?(mediatype)
+            end
+          end
+          true
+        end
+
         def scrub_uri_attribute(attr_node)
-          # this block lifted nearly verbatim from HTML5 sanitization
-          val_unescaped = CGI.unescapeHTML(attr_node.value).gsub(CONTROL_CHARACTERS, "").downcase
-          if val_unescaped =~ /^[a-z0-9][-+.a-z0-9]*:/ &&
-              !SafeList::ALLOWED_PROTOCOLS.include?(val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[0])
+          if allowed_uri?(attr_node.value)
+            false
+          else
             attr_node.remove
-            return true
-          elsif val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[0] == "data"
-            # permit only allowed data mediatypes
-            mediatype = val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[1]
-            mediatype, _ = mediatype.split(";")[0..1] if mediatype
-            if mediatype && !SafeList::ALLOWED_URI_DATA_MEDIATYPES.include?(mediatype)
-              attr_node.remove
-              return true
-            end
+            true
           end
-          false
         end
 
         #
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/loofah/scrubbers.rb new/lib/loofah/scrubbers.rb
--- old/lib/loofah/scrubbers.rb 2024-10-25 14:43:25.000000000 +0200
+++ new/lib/loofah/scrubbers.rb 1980-01-02 01:00:00.000000000 +0100
@@ -251,7 +251,9 @@
       def scrub(node)
         return CONTINUE unless (node.type == Nokogiri::XML::Node::ELEMENT_NODE) && (node.name == "a")
 
-        node.set_attribute("target", "_blank")
+        href = node["href"]
+
+        node.set_attribute("target", "_blank") if href && href[0] != "#"
 
         STOP
       end
@@ -349,6 +351,57 @@
     end
 
     #
+    #  === scrub!(:double_breakpoint)
+    #
+    #  +:double_breakpoint+ replaces double-break tags with closing/opening paragraph tags.
+    #
+    #     markup = "<p>Some text here in a logical paragraph.<br><br>Some more text, apparently a second paragraph.</p>"
+    #     Loofah.html5_fragment(markup).scrub!(:double_breakpoint)
+    #     => "<p>Some text here in a logical paragraph.</p><p>Some more text, apparently a second paragraph.</p>"
+    #
+    class DoubleBreakpoint < Scrubber
+      def initialize # rubocop:disable Lint/MissingSuper
+        @direction = :top_down
+      end
+
+      def scrub(node)
+        return CONTINUE unless (node.type == Nokogiri::XML::Node::ELEMENT_NODE) && (node.name == "p")
+
+        paragraph_with_break_point_nodes = node.xpath("//p[br[following-sibling::br]]")
+
+        paragraph_with_break_point_nodes.each do |paragraph_node|
+          new_paragraph = paragraph_node.add_previous_sibling("<p>").first
+
+          paragraph_node.children.each do |child|
+            remove_blank_text_nodes(child)
+          end
+
+          paragraph_node.children.each do |child|
+            # already unlinked
+            next if child.parent.nil?
+
+            if child.name == "br" && child.next_sibling.name == "br"
+              new_paragraph = paragraph_node.add_previous_sibling("<p>").first
+              child.next_sibling.unlink
+              child.unlink
+            else
+              child.parent = new_paragraph
+            end
+          end
+
+          paragraph_node.unlink
+        end
+
+        CONTINUE
+      end
+
+      private
+
+      def remove_blank_text_nodes(node)
+        node.unlink if node.text? && node.blank?
+      end
+    end
+    #
     # A hash that maps a symbol (like +:prune+) to the appropriate Scrubber (Loofah::Scrubbers::Prune).
     #
     MAP = {
@@ -362,6 +415,7 @@
       targetblank: TargetBlank,
       newline_block_elements: NewlineBlockElements,
       unprintable: Unprintable,
+      double_breakpoint: DoubleBreakpoint,
     }
 
     class << self
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/loofah/version.rb new/lib/loofah/version.rb
--- old/lib/loofah/version.rb 2024-10-25 14:43:25.000000000 +0200
+++ new/lib/loofah/version.rb 1980-01-02 01:00:00.000000000 +0100
@@ -2,5 +2,5 @@
 
 module Loofah
   # The version of Loofah you are using
-  VERSION = "2.23.1"
+  VERSION = "2.25.0"
 end
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/metadata new/metadata
--- old/metadata 2024-10-25 14:43:25.000000000 +0200
+++ new/metadata 1980-01-02 01:00:00.000000000 +0100
@@ -1,15 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: loofah
 version: !ruby/object:Gem::Version
-  version: 2.23.1
+  version: 2.25.0
 platform: ruby
 authors:
 - Mike Dalessio
 - Bryan Helmkamp
-autorequire:
 bindir: bin
 cert_chain: []
-date: 2024-10-25 00:00:00.000000000 Z
+date: 1980-01-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: crass
@@ -82,7 +81,7 @@
   bug_tracker_uri: https://github.com/flavorjones/loofah/issues
   changelog_uri: https://github.com/flavorjones/loofah/blob/main/CHANGELOG.md
   documentation_uri: https://www.rubydoc.info/gems/loofah/
-post_install_message:
+  funding_uri: https://github.com/sponsors/flavorjones
 rdoc_options: []
 require_paths:
 - lib
@@ -97,8 +96,7 @@
   - !ruby/object:Gem::Version
     version: '0'
 requirements: []
-rubygems_version: 3.5.22
-signing_key:
+rubygems_version: 3.6.9
 specification_version: 4
 summary: Loofah is a general library for manipulating and transforming HTML/XML documents
   and fragments, built on top of Nokogiri.
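
Note on the extracted URI check: the scrub.rb hunk above moves the allow/deny decision into a method that takes a plain string. The following is a minimal standalone sketch of that decision flow, not Loofah's implementation. The name `allowed_uri_sketch?`, the abbreviated allowlists, the simplified control-character pattern, and the plain ":" separator are all assumptions made for illustration; the real values live in `Loofah::HTML5::SafeList` and `Loofah::HTML5::Scrub`, and the real separator also covers encoded colons.

```ruby
require "cgi/escape"
require "cgi/util" if RUBY_VERSION < "3.5"
require "set"

# Hypothetical, abbreviated allowlists (assumption: Loofah's real lists are much longer).
SKETCH_ALLOWED_PROTOCOLS = Set["http", "https", "mailto", "data"].freeze
SKETCH_ALLOWED_DATA_MEDIATYPES = Set["image/png", "image/gif", "image/jpeg", "text/plain"].freeze

# Simplified control-character pattern (assumption, not Loofah's CONTROL_CHARACTERS).
SKETCH_CONTROL_CHARACTERS = /[\x00-\x20\x7f]/

# Scheme pattern as it appears in the diff (RFC 3986 scheme syntax).
SKETCH_URI_PROTOCOL_REGEX = /\A[a-z][a-z0-9+\-.]*:/

# Returns true when the string has no scheme, or has an allowed scheme
# (and, for data: URIs, an allowed media type).
def allowed_uri_sketch?(uri_string)
  # Strip control characters first, then decode HTML entities, then downcase.
  cleaned = CGI.unescapeHTML(uri_string.gsub(SKETCH_CONTROL_CHARACTERS, "")).downcase

  # No scheme at all (relative path, anchor link): nothing to reject.
  return true unless SKETCH_URI_PROTOCOL_REGEX.match?(cleaned)

  protocol, rest = cleaned.split(":", 2)
  return false unless SKETCH_ALLOWED_PROTOCOLS.include?(protocol)
  return true unless protocol == "data"

  # For data: URIs, the media type is whatever precedes the first ";" or ",".
  mediatype = rest.to_s.split(/[;,]/).first
  mediatype.nil? || SKETCH_ALLOWED_DATA_MEDIATYPES.include?(mediatype)
end
```

With the real gem installed, the corresponding call would be `Loofah::HTML5::Scrub.allowed_uri?(string)`, per the changelog entry for 2.25.0. Stripping control characters before entity decoding means that a scheme hidden behind a numeric entity or split by a newline is normalized before the allowlist lookup.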
