Bug#1019652: ruby-regexp-parser: diff for NMU version 2.1.1-2.1

Adrian Bunk Sat, 19 Nov 2022 06:12:16 -0800

Control: tags 1019652 + patch
Control: tags 1019652 + pending

Dear maintainer,


I've prepared an NMU for ruby-regexp-parser (versioned as 2.1.1-2.1) and 
uploaded it to DELAYED/15. Please feel free to tell me if I should cancel it.
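
For readers unfamiliar with the Ruby 3.1 change that the upstream commit
message in the patch below describes, here is a minimal sketch of the
behaviour. It is not part of the upload, and the variable names are purely
illustrative. It assumes only core Ruby (2.4 or later, for `match?`).

```ruby
# Illustrative only: compare a Regexp literal with one built via Regexp.new.
literal     = /a\cBc/              # control escape written in a literal
constructed = Regexp.new('a\cBc')  # same escape passed in as a String

# Regexp.new keeps the escape exactly as written in the String.
puts constructed.source            # prints a\cBc

# Per the upstream commit message, Ruby >= 3.1 pre-processes the literal's
# control escape to a hex escape, so this line differs between Ruby versions.
puts literal.source

# Both forms still match the same text: "a", Control-B (0x02), "c".
puts literal.match?("a\x02c")      # prints true
puts constructed.match?("a\x02c")  # prints true
```

This is why the patched specs build their inputs with `Regexp.new` (or pass
plain Strings) instead of using Regexp literals.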

cu
Adrian

diff -Nru ruby-regexp-parser-2.1.1/debian/changelog ruby-regexp-parser-2.1.1/debian/changelog
--- ruby-regexp-parser-2.1.1/debian/changelog	2021-12-08 18:30:46.000000000 +0200
+++ ruby-regexp-parser-2.1.1/debian/changelog	2022-11-19 15:54:11.000000000 +0200
@@ -1,3 +1,10 @@
+ruby-regexp-parser (2.1.1-2.1) unstable; urgency=low
+
+  * Non-maintainer upload.
+  * Add upstream fix for FTBFS with Ruby 3.1. (Closes: #1019652)
+
+ -- Adrian Bunk <[email protected]>  Sat, 19 Nov 2022 15:54:11 +0200
+
 ruby-regexp-parser (2.1.1-2) unstable; urgency=medium
 
   * Reupload to unstable
diff -Nru ruby-regexp-parser-2.1.1/debian/patches/0001-Fix-build-on-Ruby-3.1.patch ruby-regexp-parser-2.1.1/debian/patches/0001-Fix-build-on-Ruby-3.1.patch
--- ruby-regexp-parser-2.1.1/debian/patches/0001-Fix-build-on-Ruby-3.1.patch	1970-01-01 02:00:00.000000000 +0200
+++ ruby-regexp-parser-2.1.1/debian/patches/0001-Fix-build-on-Ruby-3.1.patch	2022-11-19 15:53:15.000000000 +0200
@@ -0,0 +1,226 @@
+From 09580ad00f1922401d4e793dd541bbeaf8d9b058 Mon Sep 17 00:00:00 2001
+From: =?UTF-8?q?Janosch=20Mu=CC=88ller?= <[email protected]>
+Date: Sat, 4 Dec 2021 15:34:28 +0100
+Subject: Fix build on Ruby 3.1 ...
+
+As of Ruby 3.1, meta and control sequences are [pre-processed to hex escapes when used in Regexp literals]( https://github.com/ruby/ruby/commit/11ae581a4a7f5d5f5ec6378872eab8f25381b1b9 ), so they will only reach the scanner and will only be emitted if a String or a Regexp that has been built with the `::new` constructor is scanned.
+---
+ README.md                    |  9 +++--
+ spec/parser/escapes_spec.rb  | 74 +++++++++++++++++++++---------------
+ spec/scanner/escapes_spec.rb | 47 ++++++++++++++---------
+ 3 files changed, 77 insertions(+), 53 deletions(-)
+
+diff --git a/README.md b/README.md
+index 5410d36..7b40476 100644
+--- a/README.md
++++ b/README.md
+@@ -360,9 +360,9 @@ _Note that not all of these are available in all versions of Ruby_
+ | &emsp;&nbsp;_**Reluctant** (Lazy)_    | `??`, `*?`, `+?`, `{m,M}?`                              | &#x2713; |
+ | &emsp;&nbsp;_**Possessive**_          | `?+`, `*+`, `++`, `{m,M}+`                              | &#x2713; |
+ | **String Escapes**                    |                                                         | &#x22f1; |
+-| &emsp;&nbsp;_**Control**_             | `\C-C`, `\cD`                                           | &#x2713; |
++| &emsp;&nbsp;_**Control** \[1\]_       | `\C-C`, `\cD`                                           | &#x2713; |
+ | &emsp;&nbsp;_**Hex**_                 | `\x20`, `\x{701230}`                                    | &#x2713; |
+-| &emsp;&nbsp;_**Meta**_                | `\M-c`, `\M-\C-C`, `\M-\cC`, `\C-\M-C`, `\c\M-C`        | &#x2713; |
++| &emsp;&nbsp;_**Meta** \[1\]_          | `\M-c`, `\M-\C-C`, `\M-\cC`, `\C-\M-C`, `\c\M-C`        | &#x2713; |
+ | &emsp;&nbsp;_**Octal**_               | `\0`, `\01`, `\012`                                     | &#x2713; |
+ | &emsp;&nbsp;_**Unicode**_             | `\uHHHH`, `\u{H+ H+}`                                   | &#x2713; |
+ | **Unicode Properties**                | _<sub>([Unicode 11.0.0](http://www.unicode.org/versions/Unicode11.0.0/))</sub>_ | &#x22f1; |
+@@ -374,6 +374,10 @@ _Note that not all of these are available in all versions of Ruby_
+ | &emsp;&nbsp;_**Scripts**_             | `\p{Arabic}`, `\P{Hiragana}`, `\p{^Greek}`              | &#x2713; |
+ | &emsp;&nbsp;_**Simple**_              | `\p{Dash}`, `\p{Extender}`, `\p{^Hyphen}`               | &#x2713; |
+ 
++**\[1\]**: As of Ruby 3.1, meta and control sequences are [pre-processed to hex escapes when used in Regexp literals](
++ https://github.com/ruby/ruby/commit/11ae581a4a7f5d5f5ec6378872eab8f25381b1b9 ), so they will only reach the
++scanner and will only be emitted if a String or a Regexp that has been built with the `::new` constructor is scanned.
++
+ ##### Inapplicable Features
+ 
+ Some modifiers, like `o` and `s`, apply to the **Regexp** object itself and do not
+@@ -387,7 +391,6 @@ expressions library (Onigmo). They are not supported by the scanner.
+   - **Quotes**: `\Q...\E` _[[See]](https://github.com/k-takata/Onigmo/blob/7911409/doc/RE#L499)_
+   - **Capture History**: `(?@...)`, `(?@<name>...)` _[[See]](https://github.com/k-takata/Onigmo/blob/7911409/doc/RE#L550)_
+ 
+-
+ See something missing? Please submit an [issue](https://github.com/ammar/regexp_parser/issues)
+ 
+ _**Note**: Attempting to process expressions with unsupported syntax features can raise an error,
+diff --git a/spec/parser/escapes_spec.rb b/spec/parser/escapes_spec.rb
+index 25cc5eb..53fe6d9 100644
+--- a/spec/parser/escapes_spec.rb
++++ b/spec/parser/escapes_spec.rb
+@@ -56,8 +56,20 @@ RSpec.describe('EscapeSequence parsing') do
+     expect { root[5].codepoint }.to raise_error(/#codepoints/)
+   end
+ 
++  # Meta/control espaces
++  #
++  # After the following fix in Ruby 3.1, a Regexp#source containing meta/control
++  # escapes can only be set with the Regexp::new constructor.
++  # In Regexp literals, these escapes are now pre-processed to hex escapes.
++  #
++  # https://github.com/ruby/ruby/commit/11ae581a4a7f5d5f5ec6378872eab8f25381b1b9
++  def parse_meta_control(regexp_body)
++    regexp = Regexp.new(regexp_body.force_encoding('ascii-8bit'), 'n')
++    RP.parse(regexp)
++  end
++
+   specify('parse escape control sequence lower') do
+-    root = RP.parse(/a\\\c2b/)
++    root = parse_meta_control('a\\\\\c2b')
+ 
+     expect(root[2]).to be_instance_of(EscapeSequence::Control)
+     expect(root[2].text).to eq '\\c2'
+@@ -66,56 +78,56 @@ RSpec.describe('EscapeSequence parsing') do
+   end
+ 
+   specify('parse escape control sequence upper') do
+-    root = RP.parse(/\d\\\C-C\w/)
++    root = parse_meta_control('\d\C-C\w')
+ 
+-    expect(root[2]).to be_instance_of(EscapeSequence::Control)
+-    expect(root[2].text).to eq '\\C-C'
+-    expect(root[2].char).to eq "\x03"
+-    expect(root[2].codepoint).to eq 3
++    expect(root[1]).to be_instance_of(EscapeSequence::Control)
++    expect(root[1].text).to eq '\\C-C'
++    expect(root[1].char).to eq "\x03"
++    expect(root[1].codepoint).to eq 3
+   end
+ 
+   specify('parse escape meta sequence') do
+-    root = RP.parse(/\Z\\\M-Z/n)
++    root = parse_meta_control('\Z\M-Z')
+ 
+-    expect(root[2]).to be_instance_of(EscapeSequence::Meta)
+-    expect(root[2].text).to eq '\\M-Z'
+-    expect(root[2].char).to eq "\u00DA"
+-    expect(root[2].codepoint).to eq 218
++    expect(root[1]).to be_instance_of(EscapeSequence::Meta)
++    expect(root[1].text).to eq '\\M-Z'
++    expect(root[1].char).to eq "\u00DA"
++    expect(root[1].codepoint).to eq 218
+   end
+ 
+   specify('parse escape meta control sequence') do
+-    root = RP.parse(/\A\\\M-\C-X/n)
++    root = parse_meta_control('\A\M-\C-X')
+ 
+-    expect(root[2]).to be_instance_of(EscapeSequence::MetaControl)
+-    expect(root[2].text).to eq '\\M-\\C-X'
+-    expect(root[2].char).to eq "\u0098"
+-    expect(root[2].codepoint).to eq 152
++    expect(root[1]).to be_instance_of(EscapeSequence::MetaControl)
++    expect(root[1].text).to eq '\\M-\\C-X'
++    expect(root[1].char).to eq "\u0098"
++    expect(root[1].codepoint).to eq 152
+   end
+ 
+   specify('parse lower c meta control sequence') do
+-    root = RP.parse(/\A\\\M-\cX/n)
++    root = parse_meta_control('\A\M-\cX')
+ 
+-    expect(root[2]).to be_instance_of(EscapeSequence::MetaControl)
+-    expect(root[2].text).to eq '\\M-\\cX'
+-    expect(root[2].char).to eq "\u0098"
+-    expect(root[2].codepoint).to eq 152
++    expect(root[1]).to be_instance_of(EscapeSequence::MetaControl)
++    expect(root[1].text).to eq '\\M-\\cX'
++    expect(root[1].char).to eq "\u0098"
++    expect(root[1].codepoint).to eq 152
+   end
+ 
+   specify('parse escape reverse meta control sequence') do
+-    root = RP.parse(/\A\\\C-\M-X/n)
++    root = parse_meta_control('\A\C-\M-X')
+ 
+-    expect(root[2]).to be_instance_of(EscapeSequence::MetaControl)
+-    expect(root[2].text).to eq '\\C-\\M-X'
+-    expect(root[2].char).to eq "\u0098"
+-    expect(root[2].codepoint).to eq 152
++    expect(root[1]).to be_instance_of(EscapeSequence::MetaControl)
++    expect(root[1].text).to eq '\\C-\\M-X'
++    expect(root[1].char).to eq "\u0098"
++    expect(root[1].codepoint).to eq 152
+   end
+ 
+   specify('parse escape reverse lower c meta control sequence') do
+-    root = RP.parse(/\A\\\c\M-X/n)
++    root = parse_meta_control('\A\c\M-X')
+ 
+-    expect(root[2]).to be_instance_of(EscapeSequence::MetaControl)
+-    expect(root[2].text).to eq '\\c\\M-X'
+-    expect(root[2].char).to eq "\u0098"
+-    expect(root[2].codepoint).to eq 152
++    expect(root[1]).to be_instance_of(EscapeSequence::MetaControl)
++    expect(root[1].text).to eq '\\c\\M-X'
++    expect(root[1].char).to eq "\u0098"
++    expect(root[1].codepoint).to eq 152
+   end
+ end
+diff --git a/spec/scanner/escapes_spec.rb b/spec/scanner/escapes_spec.rb
+index 411d050..c3a9117 100644
+--- a/spec/scanner/escapes_spec.rb
++++ b/spec/scanner/escapes_spec.rb
+@@ -35,25 +35,6 @@ RSpec.describe('Escape scanning') do
+   include_examples 'scan', 'a\u{640 0641}c',  1 => [:escape,  :codepoint_list,   '\u{640 0641}',   1,  13]
+   include_examples 'scan', 'a\u{10FFFF}c',    1 => [:escape,  :codepoint_list,   '\u{10FFFF}',     1,  11]
+ 
+-  include_examples 'scan', /a\cBc/,           1 => [:escape,  :control,          '\cB',            1,  4]
+-  include_examples 'scan', /a\c^c/,           1 => [:escape,  :control,          '\c^',            1,  4]
+-  include_examples 'scan', /a\c\n/,           1 => [:escape,  :control,          '\c\n',           1,  5]
+-  include_examples 'scan', /a\c\\b/,          1 => [:escape,  :control,          '\c\\\\',         1,  5]
+-  include_examples 'scan', /a\C-bc/,          1 => [:escape,  :control,          '\C-b',           1,  5]
+-  include_examples 'scan', /a\C-^b/,          1 => [:escape,  :control,          '\C-^',           1,  5]
+-  include_examples 'scan', /a\C-\nb/,         1 => [:escape,  :control,          '\C-\n',          1,  6]
+-  include_examples 'scan', /a\C-\\b/,         1 => [:escape,  :control,          '\C-\\\\',        1,  6]
+-  include_examples 'scan', /a\c\M-Bc/n,       1 => [:escape,  :control,          '\c\M-B',         1,  7]
+-  include_examples 'scan', /a\C-\M-Bc/n,      1 => [:escape,  :control,          '\C-\M-B',        1,  8]
+-
+-  include_examples 'scan', /a\M-Bc/n,         1 => [:escape,  :meta_sequence,    '\M-B',           1,  5]
+-  include_examples 'scan', /a\M-\cBc/n,       1 => [:escape,  :meta_sequence,    '\M-\cB',         1,  7]
+-  include_examples 'scan', /a\M-\c^/n,        1 => [:escape,  :meta_sequence,    '\M-\c^',         1,  7]
+-  include_examples 'scan', /a\M-\c\n/n,       1 => [:escape,  :meta_sequence,    '\M-\c\n',        1,  8]
+-  include_examples 'scan', /a\M-\c\\/n,       1 => [:escape,  :meta_sequence,    '\M-\c\\\\',      1,  8]
+-  include_examples 'scan', /a\M-\C-Bc/n,      1 => [:escape,  :meta_sequence,    '\M-\C-B',        1,  8]
+-  include_examples 'scan', /a\M-\C-\\/n,      1 => [:escape,  :meta_sequence,    '\M-\C-\\\\',     1,  9]
+-
+   include_examples 'scan', 'ab\\\xcd',        1 => [:escape,  :backslash,        '\\\\',           2,  4]
+   include_examples 'scan', 'ab\\\0cd',        1 => [:escape,  :backslash,        '\\\\',           2,  4]
+   include_examples 'scan', 'ab\\\Kcd',        1 => [:escape,  :backslash,        '\\\\',           2,  4]
+@@ -61,4 +42,32 @@ RSpec.describe('Escape scanning') do
+   include_examples 'scan', 'ab\^cd',          1 => [:escape,  :bol,              '\^',             2,  4]
+   include_examples 'scan', 'ab\$cd',          1 => [:escape,  :eol,              '\$',             2,  4]
+   include_examples 'scan', 'ab\[cd',          1 => [:escape,  :set_open,         '\[',             2,  4]
++
++  # Meta/control espaces
++  #
++  # After the following fix in Ruby 3.1, a Regexp#source containing meta/control
++  # escapes can only be set with the Regexp::new constructor.
++  # In Regexp literals, these escapes are now pre-processed to hex escapes.
++  #
++  # https://github.com/ruby/ruby/commit/11ae581a4a7f5d5f5ec6378872eab8f25381b1b9
++  n = ->(regexp_body){ Regexp.new(regexp_body.force_encoding('ascii-8bit'), 'n') }
++
++  include_examples 'scan', 'a\cBc',           1 => [:escape,  :control,          '\cB',            1,  4]
++  include_examples 'scan', 'a\c^c',           1 => [:escape,  :control,          '\c^',            1,  4]
++  include_examples 'scan', 'a\c\n',           1 => [:escape,  :control,          '\c\n',           1,  5]
++  include_examples 'scan', 'a\c\\\\b',        1 => [:escape,  :control,          '\c\\\\',         1,  5]
++  include_examples 'scan', 'a\C-bc',          1 => [:escape,  :control,          '\C-b',           1,  5]
++  include_examples 'scan', 'a\C-^b',          1 => [:escape,  :control,          '\C-^',           1,  5]
++  include_examples 'scan', 'a\C-\nb',         1 => [:escape,  :control,          '\C-\n',          1,  6]
++  include_examples 'scan', 'a\C-\\\\b',       1 => [:escape,  :control,          '\C-\\\\',        1,  6]
++  include_examples 'scan', n.('a\c\M-Bc'),    1 => [:escape,  :control,          '\c\M-B',         1,  7]
++  include_examples 'scan', n.('a\C-\M-Bc'),   1 => [:escape,  :control,          '\C-\M-B',        1,  8]
++
++  include_examples 'scan', n.('a\M-Bc'),      1 => [:escape,  :meta_sequence,    '\M-B',           1,  5]
++  include_examples 'scan', n.('a\M-\cBc'),    1 => [:escape,  :meta_sequence,    '\M-\cB',         1,  7]
++  include_examples 'scan', n.('a\M-\c^'),     1 => [:escape,  :meta_sequence,    '\M-\c^',         1,  7]
++  include_examples 'scan', n.('a\M-\c\n'),    1 => [:escape,  :meta_sequence,    '\M-\c\n',        1,  8]
++  include_examples 'scan', n.('a\M-\c\\\\'),  1 => [:escape,  :meta_sequence,    '\M-\c\\\\',      1,  8]
++  include_examples 'scan', n.('a\M-\C-Bc'),   1 => [:escape,  :meta_sequence,    '\M-\C-B',        1,  8]
++  include_examples 'scan', n.('a\M-\C-\\\\'), 1 => [:escape,  :meta_sequence,    '\M-\C-\\\\',     1,  9]
+ end
+-- 
+2.30.2
+
diff -Nru ruby-regexp-parser-2.1.1/debian/patches/series ruby-regexp-parser-2.1.1/debian/patches/series
--- ruby-regexp-parser-2.1.1/debian/patches/series	1970-01-01 02:00:00.000000000 +0200
+++ ruby-regexp-parser-2.1.1/debian/patches/series	2022-11-19 15:54:10.000000000 +0200
@@ -0,0 +1 @@
+0001-Fix-build-on-Ruby-3.1.patch
