Script 'mail_helper' called by obssrc
Hello community,

here is the log from the commit of package rubygem-regexp_parser for openSUSE:Factory checked in at 2022-10-12 18:25:21
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/rubygem-regexp_parser (Old)
 and      /work/SRC/openSUSE:Factory/.rubygem-regexp_parser.new.2275 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "rubygem-regexp_parser"

Wed Oct 12 18:25:21 2022 rev:11 rq:1010081 version:2.6.0

Changes:
--------
--- /work/SRC/openSUSE:Factory/rubygem-regexp_parser/rubygem-regexp_parser.changes 2022-06-15 00:32:53.202578269 +0200
+++ /work/SRC/openSUSE:Factory/.rubygem-regexp_parser.new.2275/rubygem-regexp_parser.changes 2022-10-12 18:27:10.186017187 +0200
@@ -1,0 +2,14 @@
+Mon Oct 10 13:18:09 UTC 2022 - Stephan Kulow <co...@suse.com>
+
+updated to version 2.6.0
+ see installed CHANGELOG.md
+
+ # Changelog
+
+ All notable changes to this project will be documented in this file.
+
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+
+-------------------------------------------------------------------

Old:
----
  regexp_parser-2.5.0.gem

New:
----
  regexp_parser-2.6.0.gem

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ rubygem-regexp_parser.spec ++++++
--- /var/tmp/diff_new_pack.mW4YYe/_old 2022-10-12 18:27:10.634018173 +0200
+++ /var/tmp/diff_new_pack.mW4YYe/_new 2022-10-12 18:27:10.642018191 +0200
@@ -24,7 +24,7 @@
 #

 Name: rubygem-regexp_parser
-Version: 2.5.0
+Version: 2.6.0
 Release: 0
 %define mod_name regexp_parser
 %define mod_full_name %{mod_name}-%{version}

++++++ regexp_parser-2.5.0.gem -> regexp_parser-2.6.0.gem ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/CHANGELOG.md new/CHANGELOG.md
--- old/CHANGELOG.md 2022-05-27 23:32:50.000000000 +0200
+++ new/CHANGELOG.md 2022-09-26 22:15:50.000000000 +0200
@@ -1,37 +1,68 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
 ## [Unreleased]

+## [2.6.0] - 2022-09-26 - [Janosch Müller](mailto:janosc...@gmail.com)
+
+### Fixed
+
+- fixed `#referenced_expression` for `\g<0>` (was `nil`, is now the `Root` exp)
+- fixed `#reference`, `#referenced_expression` for recursion level backrefs
+  * e.g. `(a)(b)\k<-1+1>`
+  * `#referenced_expression` was `nil`, now it is the correct `Group` exp
+- detect and raise for two more syntax errors when parsing String input
+  * quantification of option switches (e.g. `(?i)+`)
+  * invalid references (e.g. `/\k<1>/`)
+  * these are a `SyntaxError` in Ruby, so could only be passed as a String
+
+### Added
+
+- `Regexp::Expression::Base#human_name`
+  * returns a nice, human-readable description of the expression
+- `Regexp::Expression::Base#optional?`
+  * returns `true` if the expression is quantified accordingly (e.g. with `*`, `{,n}`)
+- added a deprecation warning when calling `#to_re` on set members
+
+## [2.5.0] - 2022-05-27 - [Janosch Müller](mailto:janosc...@gmail.com)
+
 ### Added

 - `Regexp::Expression::Base.construct` and `.token_class` methods
+  * see the [wiki](https://github.com/ammar/regexp_parser/wiki) for details

 ## [2.4.0] - 2022-05-09 - [Janosch Müller](mailto:janosc...@gmail.com)

 ### Fixed

 - fixed interpretation of `+` and `?` after interval quantifiers (`{n,n}`)
-  - they used to be treated as reluctant or possessive mode indicators
-  - however, Ruby does not support these modes for interval quantifiers
-  - they are now treated as chained quantifiers instead, as Ruby does it
-  - c.f. [#3](https://github.com/ammar/regexp_parser/issues/3)
+  * they used to be treated as reluctant or possessive mode indicators
+  * however, Ruby does not support these modes for interval quantifiers
+  * they are now treated as chained quantifiers instead, as Ruby does it
+  * c.f. [#3](https://github.com/ammar/regexp_parser/issues/3)
 - fixed `Expression::Base#nesting_level` for some tree rewrite cases
-  - e.g. the alternatives in `/a|[b]/` had an inconsistent nesting_level
+  * e.g. the alternatives in `/a|[b]/` had an inconsistent nesting_level
 - fixed `Scanner` accepting invalid posix classes, e.g. `[[:foo:]]`
-  - they raise a `SyntaxError` when used in a Regexp, so could only be passed as String
-  - they now raise a `Regexp::Scanner::ValidationError` in the `Scanner`
+  * they raise a `SyntaxError` when used in a Regexp, so could only be passed as String
+  * they now raise a `Regexp::Scanner::ValidationError` in the `Scanner`

 ### Added

 - added `Expression::Base#==` for (deep) comparison of expressions
 - added `Expression::Base#parts`
-  - returns the text elements and subexpressions of an expression
-  - e.g. `parse(/(a)/)[0].parts # => ["(", #<Literal @text="a"...>, ")"]`
+  * returns the text elements and subexpressions of an expression
+  * e.g. `parse(/(a)/)[0].parts # => ["(", #<Literal @text="a"...>, ")"]`
 - added `Expression::Base#te` (a.k.a. token end index)
-  - `Expression::Subexpression` always had `#te`, only terminal nodes lacked it so far
+  * `Expression::Subexpression` always had `#te`, only terminal nodes lacked it so far
 - made some `Expression::Base` methods available on `Quantifier` instances, too
-  - `#type`, `#type?`, `#is?`, `#one_of?`, `#options`, `#terminal?`
-  - `#base_length`, `#full_length`, `#starts_at`, `#te`, `#ts`, `#offset`
-  - `#conditional_level`, `#level`, `#nesting_level` , `#set_level`
-  - this allows a more unified handling with `Expression::Base` instances
+  * `#type`, `#type?`, `#is?`, `#one_of?`, `#options`, `#terminal?`
+  * `#base_length`, `#full_length`, `#starts_at`, `#te`, `#ts`, `#offset`
+  * `#conditional_level`, `#level`, `#nesting_level` , `#set_level`
+  * this allows a more unified handling with `Expression::Base` instances
 - allowed `Quantifier#initialize` to take a token and options Hash like other nodes
 - added a deprecation warning for initializing Quantifiers with 4+ arguments:
@@ -54,18 +85,18 @@
 ### Fixed

 - removed five inexistent unicode properties from `Syntax#features`
-  - these were never supported by Ruby or the `Regexp::Scanner`
-  - thanks to [Markus Schirp](https://github.com/mbj) for the report
+  * these were never supported by Ruby or the `Regexp::Scanner`
+  * thanks to [Markus Schirp](https://github.com/mbj) for the report

 ## [2.3.0] - 2022-04-08 - [Janosch Müller](mailto:janosc...@gmail.com)

 ### Added

 - improved parsing performance through `Syntax` refactoring
-  - instead of fresh `Syntax` instances, pre-loaded constants are now re-used
-  - this approximately doubles the parsing speed for simple regexps
+  * instead of fresh `Syntax` instances, pre-loaded constants are now re-used
+  * this approximately doubles the parsing speed for simple regexps
 - added methods to `Syntax` classes to show relative feature sets
-  - e.g. `Regexp::Syntax::V3_2_0.added_features`
+  * e.g. `Regexp::Syntax::V3_2_0.added_features`
 - support for new unicode properties of Ruby 3.2 / Unicode 14.0

 ## [2.2.1] - 2022-02-11 - [Janosch Müller](mailto:janosc...@gmail.com)
@@ -73,14 +104,14 @@
 ### Fixed

 - fixed Syntax version of absence groups (`(?~...)`)
-  - the lexer accepted them for any Ruby version
-  - now they are only recognized for Ruby >= 2.4.1 in which they were introduced
+  * the lexer accepted them for any Ruby version
+  * now they are only recognized for Ruby >= 2.4.1 in which they were introduced
 - reduced gem size by excluding specs from package
 - removed deprecated `test_files` gemspec setting
 - no longer depend on `yaml`/`psych` (except for Ruby <= 2.4)
 - no longer depend on `set`
-  - `set` was removed from the stdlib and made a standalone gem as of Ruby 3
-  - this made it a hidden/undeclared dependency of `regexp_parser`
+  * `set` was removed from the stdlib and made a standalone gem as of Ruby 3
+  * this made it a hidden/undeclared dependency of `regexp_parser`

 ## [2.2.0] - 2021-12-04 - [Janosch Müller](mailto:janosc...@gmail.com)

@@ -318,8 +349,8 @@

 - Fixed missing quantifier in `Conditional::Expression` methods `#to_s`, `#to_re`
 - `Conditional::Condition` no longer lives outside the recursive `#expressions` tree
-  - it used to be the only expression stored in a custom ivar, complicating traversal
-  - its setter and getter (`#condition=`, `#condition`) still work as before
+  * it used to be the only expression stored in a custom ivar, complicating traversal
+  * its setter and getter (`#condition=`, `#condition`) still work as before

 ## [1.1.0] - 2018-09-17 - [Janosch Müller](mailto:janosc...@gmail.com)

@@ -327,8 +358,8 @@

 - Added `Quantifier` methods `#greedy?`, `#possessive?`, `#reluctant?`/`#lazy?`
 - Added `Group::Options#option_changes`
-  - shows the options enabled or disabled by the given options group
-  - as with all other expressions, `#options` shows the overall active options
+  * shows the options enabled or disabled by the given options group
+  * as with all other expressions, `#options` shows the overall active options
 - Added `Conditional#reference` and `Condition#reference`, indicating the determinative group
 - Added `Subexpression#dig`, acts like [`Array#dig`](http://ruby-doc.org/core-2.5.0/Array.html#method-i-dig)

@@ -512,7 +543,6 @@
 * Fixed scanning of zero length comments (PR #12)
 * Fixed missing escape:codepoint_list syntax token (PR #14)
 * Fixed to_s for modified interval quantifiers (PR #17)
-- Added a note about MRI implementation quirks to Scanner section

 ## [0.3.2] - 2016-01-01 - [Ammar Ali](mailto:ammarabu...@gmail.com)

@@ -538,7 +568,6 @@
 - Renamed Lexer's method to lex, added an alias to the old name (scan)
 - Use #map instead of #each to run the block in Lexer.lex.
 - Replaced VERSION.yml file with a constant.
-- Updated README
 - Update tokens and scanner with new additions in Unicode 7.0.

 ## [0.1.6] - 2014-10-06 - [Ammar Ali](mailto:ammarabu...@gmail.com)
@@ -548,20 +577,11 @@
 - Added syntax files for missing ruby 2.x versions. These do not add extra syntax support, they just make the gem work with the newer ruby versions.
-- Added .travis.yml to project root.
-- README:
-  - Removed note purporting runtime support for ruby 1.8.6.
-  - Added a section identifying the main unsupported syntax features.
-  - Added sections for Testing and Building
-  - Added badges for gem version, Travis CI, and code climate.
-- Updated README, fixing broken examples, and converting it from a rdoc file to Github's flavor of Markdown.
 - Fixed a parser bug where an alternation sequence that contained nested expressions was incorrectly being appended to the parent expression when the nesting was exited. e.g. in /a|(b)c/, c was appended to the root.
-
 - Fixed a bug where character types were not being correctly scanned within character sets. e.g. in [\d], two tokens were scanned; one for the backslash '\' and one for the 'd'

 ## [0.1.5] - 2014-01-14 - [Ammar Ali](mailto:ammarabu...@gmail.com)

-- Correct ChangeLog.
 - Added syntax stubs for ruby versions 2.0 and 2.1
 - Added clone methods for deep copying expressions.
 - Added optional format argument for to_s on expressions to return the text of the expression with (:full, the default) or without (:base) its quantifier.
@@ -570,7 +590,6 @@
 - Improved EOF handling in general and especially from sequences like hex and control escapes.
 - Fixed a bug where named groups with an empty name would return a blank token [].
 - Fixed a bug where member of a parent set where being added to its last subset.
-- Various code cleanups in scanner.rl
 - Fixed a few mutable string bugs by calling dup on the originals.
 - Made ruby 1.8.6 the base for all 1.8 syntax, and the 1.8 name a pointer to the latest (1.8.7 at this time)
 - Removed look-behind assertions (positive and negative) from 1.8 syntax
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/README.md new/README.md
--- old/README.md 2022-05-27 23:32:50.000000000 +0200
+++ new/README.md 2022-09-26 22:15:50.000000000 +0200
@@ -9,8 +9,8 @@

 * Multilayered
   * A scanner/tokenizer based on [Ragel](http://www.colm.net/open-source/ragel/)
-  * A lexer that produces a "stream" of token objects.
-  * A parser that produces a "tree" of Expression objects (OO API)
+  * A lexer that produces a "stream" of [Token objects](https://github.com/ammar/regexp_parser/wiki/Token-Objects)
+  * A parser that produces a "tree" of [Expression objects (OO API)](https://github.com/ammar/regexp_parser/wiki/Expression-Objects)
 * Runs on Ruby 2.x, 3.x and JRuby runtimes
 * Recognizes Ruby 1.8, 1.9, 2.x and 3.x regular expressions [See Supported Syntax](#supported-syntax)

@@ -36,14 +36,15 @@

 ```gem 'regexp_parser', '~> X.Y.Z'```

-See rubygems for the the [latest version number](https://rubygems.org/gems/regexp_parser)
+See the badge at the top of this README or [rubygems](https://rubygems.org/gems/regexp_parser)
+for the the latest version number.

 ---

 ## Usage

 The three main modules are **Scanner**, **Lexer**, and **Parser**. Each of them
-provides a single method that takes a regular expression (as a RegExp object or
+provides a single method that takes a regular expression (as a Regexp object or
 a string) and returns its results. The **Lexer** and the **Parser** accept an
 optional second argument that specifies the syntax version, like 'ruby/2.0',
 which defaults to the host Ruby version (using RUBY_VERSION).
@@ -79,7 +80,7 @@
 require 'regexp_parser'

 Regexp::Parser.parse(
-  "a+ # Recognises a and A...",
+  "a+ # Recognizes a and A...",
   options: ::Regexp::EXTENDED | ::Regexp::IGNORECASE
 )
 ```
@@ -101,7 +102,7 @@
 ```ruby
 require 'regexp_parser'

-Regexp::Scanner.scan /(ab?(cd)*[e-h]+)/ do |type, token, text, ts, te|
+Regexp::Scanner.scan(/(ab?(cd)*[e-h]+)/) do |type, token, text, ts, te|
   puts "type: #{type}, token: #{token}, text: '#{text}' [#{ts}..#{te}]"
 end

@@ -124,7 +125,7 @@
 parts of the pattern:

 ```ruby
-Regexp::Scanner.scan( /(cat?([bhm]at)){3,5}/ ).map {|token| token[2]}
+Regexp::Scanner.scan(/(cat?([bhm]at)){3,5}/).map { |token| token[2] }

 #=> ["(", "cat", "?", "(", "[", "b", "h", "m", "]", "at", ")", ")", "{3,5}"]
 ```
@@ -220,7 +221,7 @@
 ```ruby
 require 'regexp_parser'

-Regexp::Lexer.lex /a?(b(c))*[d]+/, 'ruby/1.9' do |token|
+Regexp::Lexer.lex(/a?(b(c))*[d]+/, 'ruby/1.9') do |token|
   puts "#{' ' * token.level}#{token.text}"
 end

@@ -246,7 +247,7 @@
 by a quantifier that only applies to it.
 ```ruby
-Regexp::Lexer.scan( /(cat?([b]at)){3,5}/ ).map {|token| token.text}
+Regexp::Lexer.scan(/(cat?([b]at)){3,5}/).map { |token| token.text }

 #=> ["(", "ca", "t", "?", "(", "[", "b", "]", "at", ")", ")", "{3,5}"]
 ```
@@ -274,7 +275,7 @@

 regex = /a?(b+(c)d)*(?<name>[0-9]+)/

-tree = Regexp::Parser.parse( regex, 'ruby/2.1' )
+tree = Regexp::Parser.parse(regex, 'ruby/2.1')

 tree.traverse do |event, exp|
   puts "#{event}: #{exp.type} `#{exp.to_s}`"
@@ -355,7 +356,7 @@
 |   _Nest Level_ | `\k<n-1>` | ✓ |
 |   _Numbered_ | `\k<1>` | ✓ |
 |   _Relative_ | `\k<-2>` | ✓ |
-|   _Traditional_ | `\1` thru `\9` | ✓ |
+|   _Traditional_ | `\1` through `\9` | ✓ |
 |   _**Capturing**_ | `(abc)` | ✓ |
 |   _**Comments**_ | `(?# comment text)` | ✓ |
 |   _**Named**_ | `(?<name>abc)`, `(?'name'abc)` | ✓ |
@@ -375,7 +376,7 @@
 |   _**Meta** \[2\]_ | `\M-c`, `\M-\C-C`, `\M-\cC`, `\C-\M-C`, `\c\M-C` | ✓ |
 |   _**Octal**_ | `\0`, `\01`, `\012` | ✓ |
 |   _**Unicode**_ | `\uHHHH`, `\u{H+ H+}` | ✓ |
-| **Unicode Properties** | _<sub>([Unicode 13.0.0](https://www.unicode.org/versions/Unicode13.0.0/))</sub>_ | ⋱ |
+| **Unicode Properties** | _<sub>([Unicode 13.0.0])</sub>_ | ⋱ |
 |   _**Age**_ | `\p{Age=5.2}`, `\P{age=7.0}`, `\p{^age=8.0}` | ✓ |
 |   _**Blocks**_ | `\p{InArmenian}`, `\P{InKhmer}`, `\p{^InThai}` | ✓ |
 |   _**Classes**_ | `\p{Alpha}`, `\P{Space}`, `\p{^Alnum}` | ✓ |
@@ -384,13 +385,17 @@
 |   _**Scripts**_ | `\p{Arabic}`, `\P{Hiragana}`, `\p{^Greek}` | ✓ |
 |   _**Simple**_ | `\p{Dash}`, `\p{Extender}`, `\p{^Hyphen}` | ✓ |

-**\[1\]**: Ruby does not support lazy or possessive interval quantifiers. Any `+` or `?` that follows an interval
-quantifier will be treated as another, chained quantifier. See also [#3](https://github.com/ammar/regexp_parser/issue/3),
+[Unicode 13.0.0]: https://www.unicode.org/versions/Unicode13.0.0/
+
+**\[1\]**: Ruby does not support lazy or possessive interval quantifiers.
+Any `+` or `?` that follows an interval quantifier will be treated as another,
+chained quantifier. See also [#3](https://github.com/ammar/regexp_parser/issue/3),
 [#69](https://github.com/ammar/regexp_parser/pull/69).

-**\[2\]**: As of Ruby 3.1, meta and control sequences are [pre-processed to hex escapes when used in Regexp literals](
-  https://github.com/ruby/ruby/commit/11ae581a4a7f5d5f5ec6378872eab8f25381b1b9 ), so they will only reach the
-scanner and will only be emitted if a String or a Regexp that has been built with the `::new` constructor is scanned.
+**\[2\]**: As of Ruby 3.1, meta and control sequences are [pre-processed to hex
+escapes when used in Regexp literals](https://github.com/ruby/ruby/commit/11ae581),
+so they will only reach the scanner and will only be emitted if a String or a Regexp
+that has been built with the `::new` constructor is scanned.

 ##### Inapplicable Features

@@ -407,25 +412,27 @@

 See something missing? Please submit an [issue](https://github.com/ammar/regexp_parser/issues)

-_**Note**: Attempting to process expressions with unsupported syntax features can raise an error,
-or incorrectly return tokens/objects as literals._
+_**Note**: Attempting to process expressions with unsupported syntax features can raise
+an error, or incorrectly return tokens/objects as literals._

 ## Testing
 To run the tests simply run rake from the root directory.

-The default task generates the scanner's code from the Ragel source files and runs all the specs, thus it requires Ragel to be installed.
+The default task generates the scanner's code from the Ragel source files and runs
+all the specs, thus it requires Ragel to be installed.
-Note that changes to Ragel files will not be reflected when running `rspec` on its own, so to run individual tests you might want to run:
+Note that changes to Ragel files will not be reflected when running `rspec` on its own,
+so to run individual tests you might want to run:

 ```
 rake ragel:rb && rspec spec/scanner/properties_spec.rb
 ```

 ## Building
-Building the scanner and the gem requires [Ragel](http://www.colm.net/open-source/ragel/) to be
-installed. The build tasks will automatically invoke the 'ragel:rb' task to generate the
-Ruby scanner code.
+Building the scanner and the gem requires [Ragel](http://www.colm.net/open-source/ragel/)
+to be installed. The build tasks will automatically invoke the 'ragel:rb' task to generate
+the Ruby scanner code.

 The project uses the standard rubygems package tasks, so:

@@ -445,19 +452,26 @@
 ## Example Projects
 Projects using regexp_parser.

-- [capybara](https://github.com/teamcapybara/capybara) is an integration testing tool that uses regexp_parser to convert Regexps to css/xpath selectors.
+- [capybara](https://github.com/teamcapybara/capybara) is an integration testing tool
+that uses regexp_parser to convert Regexps to css/xpath selectors.

-- [js_regex](https://github.com/jaynetics/js_regex) converts Ruby regular expressions to JavaScript-compatible regular expressions.
+- [js_regex](https://github.com/jaynetics/js_regex) converts Ruby regular expressions
+to JavaScript-compatible regular expressions.

-- [meta_re](https://github.com/ammar/meta_re) is a regular expression preprocessor with alias support.
+- [meta_re](https://github.com/ammar/meta_re) is a regular expression preprocessor
+with alias support.

-- [mutant](https://github.com/mbj/mutant) manipulates your regular expressions (amongst others) to see if your tests cover their behavior.
+- [mutant](https://github.com/mbj/mutant) manipulates your regular expressions
+(amongst others) to see if your tests cover their behavior.
-- [repper](https://github.com/jaynetics/repper) is a regular expression pretty-printer for Ruby.
+- [repper](https://github.com/jaynetics/repper) is a regular expression
+pretty-printer and formatter for Ruby.

-- [rubocop](https://github.com/rubocop-hq/rubocop) is a linter for Ruby that uses regexp_parser to lint Regexps.
+- [rubocop](https://github.com/rubocop-hq/rubocop) is a linter for Ruby that
+uses regexp_parser to lint Regexps.

-- [twitter-cldr-rb](https://github.com/twitter/twitter-cldr-rb) is a localization helper that uses regexp_parser to generate examples of postal codes.
+- [twitter-cldr-rb](https://github.com/twitter/twitter-cldr-rb) is a localization helper
+that uses regexp_parser to generate examples of postal codes.

 ## References

Binary files old/checksums.yaml.gz and new/checksums.yaml.gz differ
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/regexp_parser/expression/base.rb new/lib/regexp_parser/expression/base.rb
--- old/lib/regexp_parser/expression/base.rb 2022-05-27 23:32:50.000000000 +0200
+++ new/lib/regexp_parser/expression/base.rb 2022-09-26 22:15:50.000000000 +0200
@@ -14,6 +14,10 @@
     end

     def to_re(format = :full)
+      if set_level > 0
+        warn "Calling #to_re on character set members is deprecated - "\
+             "their behavior might not be equivalent outside of the set."
+      end
       ::Regexp.new(to_s(format))
     end

@@ -32,15 +36,19 @@
     end

     def repetitions
-      return 1..1 unless quantified?
-      min = quantifier.min
-      max = quantifier.max < 0 ? Float::INFINITY : quantifier.max
-      range = min..max
-      # fix Range#minmax on old Rubies - https://bugs.ruby-lang.org/issues/15807
-      if RUBY_VERSION.to_f < 2.7
-        range.define_singleton_method(:minmax) { [min, max] }
-      end
-      range
+      @repetitions ||=
+        if quantified?
+          min = quantifier.min
+          max = quantifier.max < 0 ? Float::INFINITY : quantifier.max
+          range = min..max
+          # fix Range#minmax on old Rubies - https://bugs.ruby-lang.org/issues/15807
+          if RUBY_VERSION.to_f < 2.7
+            range.define_singleton_method(:minmax) { [min, max] }
+          end
+          range
+        else
+          1..1
+        end
     end

     def greedy?
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/regexp_parser/expression/classes/backreference.rb new/lib/regexp_parser/expression/classes/backreference.rb
--- old/lib/regexp_parser/expression/classes/backreference.rb 2022-05-27 23:32:50.000000000 +0200
+++ new/lib/regexp_parser/expression/classes/backreference.rb 2022-09-26 22:15:50.000000000 +0200
@@ -39,7 +39,7 @@
     class NameCall < Backreference::Name; end
     class NumberCallRelative < Backreference::NumberRelative; end

-    class NumberRecursionLevel < Backreference::Number
+    class NumberRecursionLevel < Backreference::NumberRelative
       attr_reader :recursion_level

       def initialize(token, options = {})
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/regexp_parser/expression/classes/escape_sequence.rb new/lib/regexp_parser/expression/classes/escape_sequence.rb
--- old/lib/regexp_parser/expression/classes/escape_sequence.rb 2022-05-27 23:32:50.000000000 +0200
+++ new/lib/regexp_parser/expression/classes/escape_sequence.rb 2022-09-26 22:15:50.000000000 +0200
@@ -1,5 +1,5 @@
 module Regexp::Expression
-  # TODO: unify naming with Token::Escape, on way or the other, in v3.0.0
+  # TODO: unify naming with Token::Escape, one way or the other, in v3.0.0
   module EscapeSequence
     class Base < Regexp::Expression::Base
       def codepoint
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/regexp_parser/expression/classes/group.rb new/lib/regexp_parser/expression/classes/group.rb
--- old/lib/regexp_parser/expression/classes/group.rb 2022-05-27 23:32:50.000000000 +0200
+++ new/lib/regexp_parser/expression/classes/group.rb 2022-09-26 22:15:50.000000000 +0200
@@ -33,6 +33,8 @@
     class Absence < Group::Base; end
     class Atomic < Group::Base; end

+    # TODO: should split off OptionsSwitch in v3.0.0. Maybe even make it no
+    # longer inherit from Group because it is effectively a terminal expression.
     class Options < Group::Base
       attr_accessor :option_changes

@@ -40,6 +42,14 @@
         self.option_changes = orig.option_changes.dup
         super
       end
+
+      def quantify(*args)
+        if token == :options_switch
+          raise Regexp::Parser::Error, 'Can not quantify an option switch'
+        else
+          super
+        end
+      end
     end

     class Capture < Group::Base
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/regexp_parser/expression/classes/unicode_property.rb new/lib/regexp_parser/expression/classes/unicode_property.rb
--- old/lib/regexp_parser/expression/classes/unicode_property.rb 2022-05-27 23:32:50.000000000 +0200
+++ new/lib/regexp_parser/expression/classes/unicode_property.rb 2022-09-26 22:15:50.000000000 +0200
@@ -1,5 +1,5 @@
 module Regexp::Expression
-  # TODO: unify name with token :property, on way or the other, in v3.0.0
+  # TODO: unify name with token :property, one way or the other, in v3.0.0
   module UnicodeProperty
     class Base < Regexp::Expression::Base
       def negative?
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/regexp_parser/expression/methods/human_name.rb new/lib/regexp_parser/expression/methods/human_name.rb
--- old/lib/regexp_parser/expression/methods/human_name.rb 1970-01-01 01:00:00.000000000 +0100
+++ new/lib/regexp_parser/expression/methods/human_name.rb 2022-09-26 22:15:50.000000000 +0200
@@ -0,0 +1,43 @@
+module Regexp::Expression
+  module Shared
+    # default implementation, e.g. "atomic group", "hex escape", "word type", ..
+    def human_name
+      [token, type].compact.join(' ').tr('_', ' ')
+    end
+  end
+
+  Alternation.class_eval { def human_name; 'alternation' end }
+  Alternative.class_eval { def human_name; 'alternative' end }
+  Anchor::BOL.class_eval { def human_name; 'beginning of line' end }
+  Anchor::BOS.class_eval { def human_name; 'beginning of string' end }
+  Anchor::EOL.class_eval { def human_name; 'end of line' end }
+  Anchor::EOS.class_eval { def human_name; 'end of string' end }
+  Anchor::EOSobEOL.class_eval { def human_name; 'newline-ready end of string' end }
+  Anchor::MatchStart.class_eval { def human_name; 'match start' end }
+  Anchor::NonWordBoundary.class_eval { def human_name; 'no word boundary' end }
+  Anchor::WordBoundary.class_eval { def human_name; 'word boundary' end }
+  Assertion::Lookahead.class_eval { def human_name; 'lookahead' end }
+  Assertion::Lookbehind.class_eval { def human_name; 'lookbehind' end }
+  Assertion::NegativeLookahead.class_eval { def human_name; 'negative lookahead' end }
+  Assertion::NegativeLookbehind.class_eval { def human_name; 'negative lookbehind' end }
+  Backreference::Name.class_eval { def human_name; 'backreference by name' end }
+  Backreference::NameCall.class_eval { def human_name; 'subexpression call by name' end }
+  Backreference::Number.class_eval { def human_name; 'backreference' end }
+  Backreference::NumberRelative.class_eval { def human_name; 'relative backreference' end }
+  Backreference::NumberCall.class_eval { def human_name; 'subexpression call' end }
+  Backreference::NumberCallRelative.class_eval { def human_name; 'relative subexpression call' end }
+  CharacterSet::IntersectedSequence.class_eval { def human_name; 'intersected sequence' end }
+  CharacterSet::Intersection.class_eval { def human_name; 'intersection' end }
+  CharacterSet::Range.class_eval { def human_name; 'character range' end }
+  CharacterType::Any.class_eval { def human_name; 'match-all' end }
+  Comment.class_eval { def human_name; 'comment' end }
+  Conditional::Branch.class_eval { def human_name; 'conditional branch' end }
+  Conditional::Condition.class_eval { def human_name; 'condition' end }
+  Conditional::Expression.class_eval { def human_name; 'conditional' end }
+  Group::Capture.class_eval { def human_name; "capture group #{number}" end }
+  Group::Named.class_eval { def human_name; 'named capture group' end }
+  Keep::Mark.class_eval { def human_name; 'keep-mark lookbehind' end }
+  Literal.class_eval { def human_name; 'literal' end }
+  Root.class_eval { def human_name; 'root' end }
+  WhiteSpace.class_eval { def human_name; 'free space' end }
+end
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/regexp_parser/expression/shared.rb new/lib/regexp_parser/expression/shared.rb
--- old/lib/regexp_parser/expression/shared.rb 2022-05-27 23:32:50.000000000 +0200
+++ new/lib/regexp_parser/expression/shared.rb 2022-09-26 22:15:50.000000000 +0200
@@ -8,9 +8,9 @@

         attr_accessor :type, :token, :text, :ts, :te,
                       :level, :set_level, :conditional_level,
-                      :options, :quantifier
+                      :options

-        attr_reader :nesting_level
+        attr_reader :nesting_level, :quantifier
       end
     end

@@ -64,6 +64,10 @@
       !quantifier.nil?
     end

+    def optional?
+      quantified? && quantifier.min == 0
+    end
+
     def offset
       [starts_at, full_length]
     end
@@ -81,5 +85,10 @@
       quantifier && quantifier.nesting_level = lvl
       terminal? || each { |subexp| subexp.nesting_level = lvl + 1 }
     end
+
+    def quantifier=(qtf)
+      @quantifier = qtf
+      @repetitions = nil # clear memoized value
+    end
   end
 end
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/regexp_parser/expression.rb new/lib/regexp_parser/expression.rb
--- old/lib/regexp_parser/expression.rb 2022-05-27 23:32:50.000000000 +0200
+++ new/lib/regexp_parser/expression.rb 2022-09-26 22:15:50.000000000 +0200
@@ -25,6 +25,7 @@
 require 'regexp_parser/expression/classes/unicode_property'

 require 'regexp_parser/expression/methods/construct'
+require 'regexp_parser/expression/methods/human_name'
 require 'regexp_parser/expression/methods/match'
 require 'regexp_parser/expression/methods/match_length'
 require 'regexp_parser/expression/methods/options'
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/regexp_parser/parser.rb new/lib/regexp_parser/parser.rb
--- old/lib/regexp_parser/parser.rb 2022-05-27 23:32:50.000000000 +0200
+++ new/lib/regexp_parser/parser.rb 2022-09-26 22:15:50.000000000 +0200
@@ -235,7 +235,15 @@
     when :number, :number_ref
       node << Backreference::Number.new(token, active_opts)
     when :number_recursion_ref
-      node << Backreference::NumberRecursionLevel.new(token, active_opts)
+      node << Backreference::NumberRecursionLevel.new(token, active_opts).tap do |exp|
+        # TODO: should split off new token number_recursion_rel_ref and new
+        # class NumberRelativeRecursionLevel in v3.0.0 to get rid of this
+        if exp.text =~ /[<'][+-]/
+          assign_effective_number(exp)
+        else
+          exp.effective_number = exp.number
+        end
+      end
     when :number_call
       node << Backreference::NumberCall.new(token, active_opts)
     when :number_rel_ref
@@ -254,6 +262,8 @@
   def assign_effective_number(exp)
     exp.effective_number = exp.number + total_captured_group_count + (exp.number < 0 ? 1 : 0)
+    exp.effective_number > 0 ||
+      raise(ParserError, "Invalid reference: #{exp.reference}")
   end

   def conditional(token)
@@ -569,15 +579,17 @@
   # an instance of Backreference::Number, its #referenced_expression is set to
   # the instance of Group::Capture that it refers to via its number.
   def assign_referenced_expressions
-    targets = {}
     # find all referencable expressions
+    targets = { 0 => root }
     root.each_expression do |exp|
       exp.is_a?(Group::Capture) && targets[exp.identifier] = exp
     end
     # assign them to any refering expressions
     root.each_expression do |exp|
-      exp.respond_to?(:reference) &&
-        exp.referenced_expression = targets[exp.reference]
+      next unless exp.respond_to?(:reference)
+
+      exp.referenced_expression = targets[exp.reference] ||
+        raise(ParserError, "Invalid reference: #{exp.reference}")
     end
   end
 end # module Regexp::Parser
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lib/regexp_parser/version.rb new/lib/regexp_parser/version.rb
--- old/lib/regexp_parser/version.rb 2022-05-27 23:32:50.000000000 +0200
+++ new/lib/regexp_parser/version.rb 2022-09-26 22:15:50.000000000 +0200
@@ -1,5 +1,5 @@
 class Regexp
   class Parser
-    VERSION = '2.5.0'
+    VERSION = '2.6.0'
   end
 end
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/metadata new/metadata
--- old/metadata 2022-05-27 23:32:50.000000000 +0200
+++ new/metadata 2022-09-26 22:15:50.000000000 +0200
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: regexp_parser
 version: !ruby/object:Gem::Version
-  version: 2.5.0
+  version: 2.6.0
 platform: ruby
 authors:
 - Ammar Ali
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2022-05-27 00:00:00.000000000 Z
+date: 2022-09-26 00:00:00.000000000 Z
 dependencies: []
 description: A library for tokenizing, lexing, and parsing Ruby regular expressions.
 email:
@@ -43,6 +43,7 @@
 - lib/regexp_parser/expression/classes/root.rb
 - lib/regexp_parser/expression/classes/unicode_property.rb
 - lib/regexp_parser/expression/methods/construct.rb
+- lib/regexp_parser/expression/methods/human_name.rb
 - lib/regexp_parser/expression/methods/match.rb
 - lib/regexp_parser/expression/methods/match_length.rb
 - lib/regexp_parser/expression/methods/options.rb
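
Reviewer note: the base.rb and shared.rb hunks in this diff memoize `#repetitions` and clear the cached value in the new `quantifier=` setter, and shared.rb adds `#optional?`. The following is a minimal standalone sketch of that pattern, not the gem's code. `SketchQuantifier` and `SketchExpression` are hypothetical stand-ins so the snippet runs without regexp_parser installed, and the `Range#minmax` workaround for old Rubies is left out.

```ruby
# Hypothetical stand-ins for illustration only; these are not regexp_parser classes.
SketchQuantifier = Struct.new(:min, :max)

class SketchExpression
  attr_reader :quantifier

  # Assigning a new quantifier clears the memoized range,
  # so the next call to #repetitions recomputes it.
  def quantifier=(qtf)
    @quantifier = qtf
    @repetitions = nil
  end

  def quantified?
    !quantifier.nil?
  end

  # True when the quantifier allows zero repetitions.
  def optional?
    quantified? && quantifier.min == 0
  end

  # Memoized; a negative max stands for "unbounded", as in the diff.
  def repetitions
    @repetitions ||=
      if quantified?
        max = quantifier.max < 0 ? Float::INFINITY : quantifier.max
        quantifier.min..max
      else
        1..1
      end
  end
end

exp = SketchExpression.new
exp.repetitions                              # => 1..1
exp.quantifier = SketchQuantifier.new(0, -1) # comparable to `*`
exp.optional?                                # => true
exp.repetitions.end                          # => Float::INFINITY
```

Without the reset in the setter, the last call would still see the cached `1..1`, which is presumably why the diff replaces the plain `attr_accessor` for `:quantifier` with a reader plus an explicit writer.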