Script 'mail_helper' called by obssrc
Hello community,
here is the log from the commit of package python-parsimonious for
openSUSE:Factory checked in at 2025-11-24 14:09:20
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-parsimonious (Old)
and /work/SRC/openSUSE:Factory/.python-parsimonious.new.14147 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-parsimonious"
Mon Nov 24 14:09:20 2025 rev:5 rq:1319181 version:0.11.0
Changes:
--------
--- /work/SRC/openSUSE:Factory/python-parsimonious/python-parsimonious.changes
2025-06-03 17:55:44.019064963 +0200
+++
/work/SRC/openSUSE:Factory/.python-parsimonious.new.14147/python-parsimonious.changes
2025-11-24 14:10:46.357263295 +0100
@@ -1,0 +2,9 @@
+Fri Nov 21 21:41:23 UTC 2025 - Dirk Müller <[email protected]>
+
+- update to 0.11.0:
+ * Correctly handle `/` expressions with multiple terms in a row.
+ * Start using pyproject.toml.
+ * Add a ``ParsimoniousError`` exception base class.
+ * Fall back to ``re`` when the ``regex`` lib is not available.
+
+-------------------------------------------------------------------
Old:
----
parsimonious-0.10.0.tar.gz
New:
----
parsimonious-0.11.0.tar.gz
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Other differences:
------------------
++++++ python-parsimonious.spec ++++++
--- /var/tmp/diff_new_pack.u26jbe/_old 2025-11-24 14:10:46.957288538 +0100
+++ /var/tmp/diff_new_pack.u26jbe/_new 2025-11-24 14:10:46.961288707 +0100
@@ -1,7 +1,7 @@
#
# spec file for package python-parsimonious
#
-# Copyright (c) 2025 SUSE LLC
+# Copyright (c) 2025 SUSE LLC and contributors
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
@@ -17,7 +17,7 @@
Name: python-parsimonious
-Version: 0.10.0
+Version: 0.11.0
Release: 0
Summary: Pure-Python PEG parser
License: MIT
++++++ parsimonious-0.10.0.tar.gz -> parsimonious-0.11.0.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/parsimonious-0.10.0/.github/workflows/main.yml
new/parsimonious-0.11.0/.github/workflows/main.yml
--- old/parsimonious-0.10.0/.github/workflows/main.yml 1970-01-01
01:00:00.000000000 +0100
+++ new/parsimonious-0.11.0/.github/workflows/main.yml 2025-11-12
01:57:16.000000000 +0100
@@ -0,0 +1,34 @@
+---
+name: CI
+
+on:
+ push:
+ branches: [ master ]
+ pull_request:
+ branches: [ master ]
+
+jobs:
+ build:
+
+ runs-on: ubuntu-latest
+
+ strategy:
+ matrix:
+ python-version: ['3.8', '3.9', '3.10', '3.11', '3.12', '3.13']
+
+ name: Python ${{ matrix.python-version}}
+ steps:
+ - uses: actions/checkout@v4
+
+ - name: Set up Python
+ uses: actions/setup-python@v5
+ with:
+ python-version: ${{ matrix.python-version }}
+
+ - name: Update pip and install dev requirements
+ run: |
+ python -m pip install --upgrade pip
+ pip install tox tox-gh-actions
+
+ - name: Test
+ run: tox
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/parsimonious-0.10.0/.gitignore
new/parsimonious-0.11.0/.gitignore
--- old/parsimonious-0.10.0/.gitignore 1970-01-01 01:00:00.000000000 +0100
+++ new/parsimonious-0.11.0/.gitignore 2012-12-01 03:04:42.000000000 +0100
@@ -0,0 +1,6 @@
+.tox
+*.egg-info
+*.egg
+*.pyc
+build
+dist
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/parsimonious-0.10.0/PKG-INFO
new/parsimonious-0.11.0/PKG-INFO
--- old/parsimonious-0.10.0/PKG-INFO 2022-09-03 18:58:31.418114200 +0200
+++ new/parsimonious-0.11.0/PKG-INFO 2025-11-12 02:27:36.728791500 +0100
@@ -1,11 +1,10 @@
-Metadata-Version: 2.1
+Metadata-Version: 2.4
Name: parsimonious
-Version: 0.10.0
+Version: 0.11.0
Summary: (Soon to be) the fastest pure-Python PEG parser I could muster
-Home-page: https://github.com/erikrose/parsimonious
-Author: Erik Rose
-Author-email: [email protected]
+Author-email: Erik Rose <[email protected]>
License: MIT
+Project-URL: Homepage, https://github.com/erikrose/parsimonious
Keywords: parse,parser,parsing,peg,packrat,grammar,language
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
@@ -14,15 +13,21 @@
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Text Processing :: General
Description-Content-Type: text/x-rst
License-File: LICENSE
+Requires-Dist: regex>=2022.3.15
+Provides-Extra: testing
+Requires-Dist: pytest; extra == "testing"
+Dynamic: license-file
============
Parsimonious
@@ -117,7 +122,7 @@
equal = ws? "=" ws?
lpar = "["
rpar = "]"
- ws = ~"\s*"
+ ws = ~r"\s*"
emptyline = ws+
"""
)
@@ -253,6 +258,9 @@
You can wrap a rule across multiple lines if you like; the syntax is very
forgiving.
+If you want to save your grammar into a separate file, you should name it
+using the ``.ppeg`` extension.
+
Syntax Reference
----------------
@@ -467,8 +475,12 @@
Version History
===============
-(Next release)
- * ...
+
+0.11.0
+ * Correctly handle `/` expressions with multiple terms in a row. (lucaswiman)
+ * Start using pyproject.toml. (Kolanich)
+ * Add a ``ParsimoniousError`` exception base class. (Kevin Kirsche)
+ * Fall back to ``re`` when the ``regex`` lib is not available. (Pavel Kirienko)
0.10.0
* Fix infinite recursion in __eq__ in some cases. (FelisNivalis)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/parsimonious-0.10.0/README.rst
new/parsimonious-0.11.0/README.rst
--- old/parsimonious-0.10.0/README.rst 2022-09-03 18:57:09.000000000 +0200
+++ new/parsimonious-0.11.0/README.rst 2025-11-12 02:13:20.000000000 +0100
@@ -91,7 +91,7 @@
equal = ws? "=" ws?
lpar = "["
rpar = "]"
- ws = ~"\s*"
+ ws = ~r"\s*"
emptyline = ws+
"""
)
@@ -227,6 +227,9 @@
You can wrap a rule across multiple lines if you like; the syntax is very
forgiving.
+If you want to save your grammar into a separate file, you should name it
+using the ``.ppeg`` extension.
+
Syntax Reference
----------------
@@ -441,8 +444,12 @@
Version History
===============
-(Next release)
- * ...
+
+0.11.0
+ * Correctly handle `/` expressions with multiple terms in a row. (lucaswiman)
+ * Start using pyproject.toml. (Kolanich)
+ * Add a ``ParsimoniousError`` exception base class. (Kevin Kirsche)
+ * Fall back to ``re`` when the ``regex`` lib is not available. (Pavel Kirienko)
0.10.0
* Fix infinite recursion in __eq__ in some cases. (FelisNivalis)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/parsimonious-0.10.0/parsimonious/adhoc_expression.py
new/parsimonious-0.11.0/parsimonious/adhoc_expression.py
--- old/parsimonious-0.10.0/parsimonious/adhoc_expression.py 2022-05-10
07:17:05.000000000 +0200
+++ new/parsimonious-0.11.0/parsimonious/adhoc_expression.py 1970-01-01
01:00:00.000000000 +0100
@@ -1,75 +0,0 @@
-from .expressions import Expression
-def expression(callable, rule_name, grammar):
- """Turn a plain callable into an Expression.
-
- The callable can be of this simple form::
-
- def foo(text, pos):
- '''If this custom expression matches starting at text[pos], return
- the index where it stops matching. Otherwise, return None.'''
- if the expression matched:
- return end_pos
-
- If there are child nodes to return, return a tuple::
-
- return end_pos, children
-
- If the expression doesn't match at the given ``pos`` at all... ::
-
- return None
-
- If your callable needs to make sub-calls to other rules in the grammar or
- do error reporting, it can take this form, gaining additional arguments::
-
- def foo(text, pos, cache, error, grammar):
- # Call out to other rules:
- node = grammar['another_rule'].match_core(text, pos, cache, error)
- ...
- # Return values as above.
-
- The return value of the callable, if an int or a tuple, will be
- automatically transmuted into a :class:`~parsimonious.Node`. If it returns
- a Node-like class directly, it will be passed through unchanged.
-
- :arg rule_name: The rule name to attach to the resulting
- :class:`~parsimonious.Expression`
- :arg grammar: The :class:`~parsimonious.Grammar` this expression will be a
- part of, to make delegating to other rules possible
-
- """
-
- # Resolve unbound methods; allows grammars to use @staticmethod custom rules
- # https://stackoverflow.com/questions/41921255/staticmethod-object-is-not-callable
- if ismethoddescriptor(callable) and hasattr(callable, '__func__'):
- callable = callable.__func__
-
- num_args = len(getfullargspec(callable).args)
- if ismethod(callable):
- # do not count the first argument (typically 'self') for methods
- num_args -= 1
- if num_args == 2:
- is_simple = True
- elif num_args == 5:
- is_simple = False
- else:
- raise RuntimeError("Custom rule functions must take either 2 or 5 "
- "arguments, not %s." % num_args)
-
- class AdHocExpression(Expression):
- def _uncached_match(self, text, pos, cache, error):
- result = (callable(text, pos) if is_simple else
- callable(text, pos, cache, error, grammar))
-
- if isinstance(result, int):
- end, children = result, None
- elif isinstance(result, tuple):
- end, children = result
- else:
- # Node or None
- return result
- return Node(self, text, pos, end, children=children)
-
- def _as_rhs(self):
- return '{custom function "%s"}' % callable.__name__
-
- return AdHocExpression(name=rule_name)
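The removed module's docstring describes the simple two-argument protocol for custom rule callables: return the index where matching stops, or None. A minimal stdlib sketch of such a callable (the ``digits`` rule here is a hypothetical illustration, not part of parsimonious):

```python
def digits(text, pos):
    """Simple-form custom rule: if a run of digits starts at text[pos],
    return the index where it stops matching; otherwise return None."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return end if end > pos else None

assert digits("abc123", 3) == 6    # "123" matches, ends at index 6
assert digits("abc123", 0) is None  # no digit at position 0
```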
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/parsimonious-0.10.0/parsimonious/exceptions.py
new/parsimonious-0.11.0/parsimonious/exceptions.py
--- old/parsimonious-0.10.0/parsimonious/exceptions.py 2022-06-26
20:32:08.000000000 +0200
+++ new/parsimonious-0.11.0/parsimonious/exceptions.py 2025-11-12
01:57:16.000000000 +0100
@@ -3,7 +3,12 @@
from parsimonious.utils import StrAndRepr
-class ParseError(StrAndRepr, Exception):
+class ParsimoniousError(Exception):
+ """A base exception class to allow library users to catch any Parsimonious
error."""
+ pass
+
+
+class ParseError(StrAndRepr, ParsimoniousError):
"""A call to ``Expression.parse()`` or ``match()`` didn't match."""
def __init__(self, text, pos=-1, expr=None):
@@ -71,7 +76,7 @@
self.column())
-class VisitationError(Exception):
+class VisitationError(ParsimoniousError):
"""Something went wrong while traversing a parse tree.
This exception exists to augment an underlying exception with information
@@ -100,7 +105,7 @@
node.prettily(error=node)))
-class BadGrammar(StrAndRepr, Exception):
+class BadGrammar(StrAndRepr, ParsimoniousError):
"""Something was wrong with the definition of a grammar.
Note that a ParseError might be raised instead if the error is in the
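The effect of this change can be sketched with a stripped-down mirror of the new hierarchy (class bodies elided; not the actual parsimonious source): a single ``except`` clause now covers every library error.

```python
class ParsimoniousError(Exception):
    """Base class so users can catch any parsimonious error."""

class ParseError(ParsimoniousError):
    pass

class VisitationError(ParsimoniousError):
    pass

# One except clause catches all library errors:
try:
    raise ParseError("rule did not match")
except ParsimoniousError as exc:
    caught = type(exc).__name__

assert caught == "ParseError"
```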
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/parsimonious-0.10.0/parsimonious/expressions.py
new/parsimonious-0.11.0/parsimonious/expressions.py
--- old/parsimonious-0.10.0/parsimonious/expressions.py 2022-06-29
07:53:12.000000000 +0200
+++ new/parsimonious-0.11.0/parsimonious/expressions.py 2025-11-12
01:57:16.000000000 +0100
@@ -8,7 +8,10 @@
from collections import defaultdict
from inspect import getfullargspec, isfunction, ismethod, ismethoddescriptor
-import regex as re
+try:
+ import regex as re
+except ImportError:
+ import re  # Fallback as per https://github.com/erikrose/parsimonious/issues/231
from parsimonious.exceptions import ParseError, IncompleteParseError, LeftRecursionError
from parsimonious.nodes import Node, RegexNode
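The guarded-import pattern above runs on a stock interpreter: when the third-party ``regex`` package is absent, the stdlib ``re`` module is bound to the same name (at the cost of ``regex``-only features).

```python
try:
    import regex as re  # third-party superset of re, if installed
except ImportError:
    import re  # stdlib fallback

# Either module satisfies the basic API used here:
m = re.match(r"\s*", "   x")
assert m is not None and m.end() == 3
```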
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/parsimonious-0.10.0/parsimonious/grammar.py
new/parsimonious-0.11.0/parsimonious/grammar.py
--- old/parsimonious-0.10.0/parsimonious/grammar.py 2022-06-26
20:33:42.000000000 +0200
+++ new/parsimonious-0.11.0/parsimonious/grammar.py 2025-11-12
01:57:16.000000000 +0100
@@ -197,8 +197,8 @@
term.members = (not_term,) + term.members
sequence = Sequence(term, OneOrMore(term), name='sequence')
- or_term = Sequence(Literal('/'), _, term, name='or_term')
- ored = Sequence(term, OneOrMore(or_term), name='ored')
+ or_term = Sequence(Literal('/'), _, OneOrMore(term), name='or_term')
+ ored = Sequence(OneOrMore(term), OneOrMore(or_term), name='ored')
expression = OneOf(ored, sequence, term, name='expression')
rule = Sequence(label, equals, expression, name='rule')
rules = Sequence(_, OneOrMore(rule), name='rules')
@@ -231,8 +231,8 @@
~"u?r?b?'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'"is
expression = ored / sequence / term
- or_term = "/" _ term
- ored = term or_term+
+ or_term = "/" _ term+
+ ored = term+ or_term+
sequence = term term+
not_term = "!" term _
lookahead_term = "&" term _
@@ -367,6 +367,10 @@
def visit_ored(self, node, ored):
first_term, other_terms = ored
+ if len(first_term) == 1:
+ first_term = first_term[0]
+ else:
+ first_term = Sequence(*first_term)
return OneOf(first_term, *other_terms)
def visit_or_term(self, node, or_term):
@@ -375,8 +379,11 @@
We already know it's going to be ored, from the containing ``ored``.
"""
- slash, _, term = or_term
- return term
+ slash, _, terms = or_term
+ if len(terms) == 1:
+ return terms[0]
+ else:
+ return Sequence(*terms)
def visit_label(self, node, label):
"""Turn a label into a unicode string."""
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' "old/parsimonious-0.10.0/parsimonious/grammar.py
pre-rewriting-resolve_refs.py" "new/parsimonious-0.11.0/parsimonious/grammar.py
pre-rewriting-resolve_refs.py"
--- "old/parsimonious-0.10.0/parsimonious/grammar.py
pre-rewriting-resolve_refs.py" 1970-01-01 01:00:00.000000000 +0100
+++ "new/parsimonious-0.11.0/parsimonious/grammar.py
pre-rewriting-resolve_refs.py" 2018-04-11 03:38:41.000000000 +0200
@@ -0,0 +1,498 @@
+"""A convenience which constructs expression trees from an easy-to-read syntax
+
+Use this unless you have a compelling reason not to; it performs some
+optimizations that would be tedious to do when constructing an expression tree
+by hand.
+
+"""
+from collections import OrderedDict
+from inspect import isfunction, ismethod
+
+from six import (text_type, itervalues, iteritems, python_2_unicode_compatible, PY2)
+
+from parsimonious.exceptions import BadGrammar, UndefinedLabel
+from parsimonious.expressions import (Literal, Regex, Sequence, OneOf,
+ Lookahead, Optional, ZeroOrMore, OneOrMore, Not, TokenMatcher,
+ expression)
+from parsimonious.nodes import NodeVisitor
+from parsimonious.utils import evaluate_string
+
+@python_2_unicode_compatible
+class Grammar(OrderedDict):
+ """A collection of rules that describe a language
+
+ You can start parsing from the default rule by calling ``parse()``
+ directly on the ``Grammar`` object::
+
+ g = Grammar('''
+ polite_greeting = greeting ", my good " title
+ greeting = "Hi" / "Hello"
+ title = "madam" / "sir"
+ ''')
+ g.parse('Hello, my good sir')
+
+ Or start parsing from any of the other rules; you can pull them out of the
+ grammar as if it were a dictionary::
+
+ g['title'].parse('sir')
+
+ You could also just construct a bunch of ``Expression`` objects yourself
+ and stitch them together into a language, but using a ``Grammar`` has some
+ important advantages:
+
+ * Languages are much easier to define in the nice syntax it provides.
+ * Circular references aren't a pain.
+ * It does all kinds of whizzy space- and time-saving optimizations, like
+ factoring up repeated subexpressions into a single object, which should
+ increase cache hit ratio. [Is this implemented yet?]
+
+ """
+ def __init__(self, rules='', **more_rules):
+ """Construct a grammar.
+
+ :arg rules: A string of production rules, one per line.
+ :arg default_rule: The name of the rule invoked when you call
+ :meth:`parse()` or :meth:`match()` on the grammar. Defaults to the
+ first rule. Falls back to None if there are no string-based rules
+ in this grammar.
+ :arg more_rules: Additional kwargs whose names are rule names and
+ values are Expressions or custom-coded callables which accomplish
+ things the built-in rule syntax cannot. These take precedence over
+ ``rules`` in case of naming conflicts.
+
+ """
+
+ decorated_custom_rules = {
+ k: (expression(v, k, self) if isfunction(v) or ismethod(v) else v)
+ for k, v in iteritems(more_rules)}
+
+ exprs, first = self._expressions_from_rules(rules, decorated_custom_rules)
+ super(Grammar, self).__init__(exprs.items())
+ self.default_rule = first # may be None
+
+ def default(self, rule_name):
+ """Return a new Grammar whose :term:`default rule` is ``rule_name``."""
+ new = self._copy()
+ new.default_rule = new[rule_name]
+ return new
+
+ def _copy(self):
+ """Return a shallow copy of myself.
+
+ Deep is unnecessary, since Expression trees are immutable. Subgrammars
+ recreate all the Expressions from scratch, and AbstractGrammars have
+ no Expressions.
+
+ """
+ new = Grammar.__new__(Grammar)
+ super(Grammar, new).__init__(iteritems(self))
+ new.default_rule = self.default_rule
+ return new
+
+ def _expressions_from_rules(self, rules, custom_rules):
+ """Return a 2-tuple: a dict of rule names pointing to their
+ expressions, and then the first rule.
+
+ It's a web of expressions, all referencing each other. Typically,
+ there's a single root to the web of references, and that root is the
+ starting symbol for parsing, but there's nothing saying you can't have
+ multiple roots.
+
+ :arg custom_rules: A map of rule names to custom-coded rules:
+ Expressions
+
+ """
+ tree = rule_grammar.parse(rules)
+ return RuleVisitor(custom_rules).visit(tree)
+
+ def parse(self, text, pos=0):
+ """Parse some text with the :term:`default rule`.
+
+ :arg pos: The index at which to start parsing
+
+ """
+ self._check_default_rule()
+ return self.default_rule.parse(text, pos=pos)
+
+ def match(self, text, pos=0):
+ """Parse some text with the :term:`default rule` but not necessarily
+ all the way to the end.
+
+ :arg pos: The index at which to start parsing
+
+ """
+ self._check_default_rule()
+ return self.default_rule.match(text, pos=pos)
+
+ def _check_default_rule(self):
+ """Raise RuntimeError if there is no default rule defined."""
+ if not self.default_rule:
+ raise RuntimeError("Can't call parse() on a Grammar that has no "
+ "default rule. Choose a specific rule instead, "
+ "like some_grammar['some_rule'].parse(...).")
+
+ def __str__(self):
+ """Return a rule string that, when passed to the constructor, would
+ reconstitute the grammar."""
+ exprs = [self.default_rule] if self.default_rule else []
+ exprs.extend(expr for expr in itervalues(self) if
+ expr is not self.default_rule)
+ return '\n'.join(expr.as_rule() for expr in exprs)
+
+ def __repr__(self):
+ """Return an expression that will reconstitute the grammar."""
+ codec = 'string_escape' if PY2 else 'unicode_escape'
+ return "Grammar('%s')" % str(self).encode(codec)
+
+
+class TokenGrammar(Grammar):
+ """A Grammar which takes a list of pre-lexed tokens instead of text
+
+ This is useful if you want to do the lexing yourself, as a separate pass:
+ for example, to implement indentation-based languages.
+
+ """
+ def _expressions_from_rules(self, rules, custom_rules):
+ tree = rule_grammar.parse(rules)
+ return TokenRuleVisitor(custom_rules).visit(tree)
+
+
+class BootstrappingGrammar(Grammar):
+ """The grammar used to recognize the textual rules that describe other
+ grammars
+
+ This grammar gets its start from some hard-coded Expressions and claws its
+ way from there to an expression tree that describes how to parse the
+ grammar description syntax.
+
+ """
+ def _expressions_from_rules(self, rule_syntax, custom_rules):
+ """Return the rules for parsing the grammar definition syntax.
+
+ Return a 2-tuple: a dict of rule names pointing to their expressions,
+ and then the top-level expression for the first rule.
+
+ """
+ # Hard-code enough of the rules to parse the grammar that describes the
+ # grammar description language, to bootstrap:
+ comment = Regex(r'#[^\r\n]*', name='comment')
+ meaninglessness = OneOf(Regex(r'\s+'), comment, name='meaninglessness')
+ _ = ZeroOrMore(meaninglessness, name='_')
+ equals = Sequence(Literal('='), _, name='equals')
+ label = Sequence(Regex(r'[a-zA-Z_][a-zA-Z_0-9]*'), _, name='label')
+ reference = Sequence(label, Not(equals), name='reference')
+ quantifier = Sequence(Regex(r'[*+?]'), _, name='quantifier')
+ # This pattern supports empty literals. TODO: A problem?
+ spaceless_literal = Regex(r'u?r?"[^"\\]*(?:\\.[^"\\]*)*"',
+ ignore_case=True,
+ dot_all=True,
+ name='spaceless_literal')
+ literal = Sequence(spaceless_literal, _, name='literal')
+ regex = Sequence(Literal('~'),
+ literal,
+ Regex('[ilmsux]*', ignore_case=True),
+ _,
+ name='regex')
+ atom = OneOf(reference, literal, regex, name='atom')
+ quantified = Sequence(atom, quantifier, name='quantified')
+
+ term = OneOf(quantified, atom, name='term')
+ not_term = Sequence(Literal('!'), term, _, name='not_term')
+ term.members = (not_term,) + term.members
+
+ sequence = Sequence(term, OneOrMore(term), name='sequence')
+ or_term = Sequence(Literal('/'), _, term, name='or_term')
+ ored = Sequence(term, OneOrMore(or_term), name='ored')
+ expression = OneOf(ored, sequence, term, name='expression')
+ rule = Sequence(label, equals, expression, name='rule')
+ rules = Sequence(_, OneOrMore(rule), name='rules')
+
+ # Use those hard-coded rules to parse the (more extensive) rule syntax.
+ # (For example, unless I start using parentheses in the rule language
+ # definition itself, I should never have to hard-code expressions for
+ # those above.)
+
+ rule_tree = rules.parse(rule_syntax)
+
+ # Turn the parse tree into a map of expressions:
+ return RuleVisitor().visit(rule_tree)
+
+
+# The grammar for parsing PEG grammar definitions:
+# This is a nice, simple grammar. We may someday add to it, but it's a safe bet
+# that the future will always be a superset of this.
+rule_syntax = (r'''
+ # Ignored things (represented by _) are typically hung off the end of the
+ # leafmost kinds of nodes. Literals like "/" count as leaves.
+
+ rules = _ rule*
+ rule = label equals expression
+ equals = "=" _
+ literal = spaceless_literal _
+
+ # So you can't spell a regex like `~"..." ilm`:
+ spaceless_literal = ~"u?r?\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\""is /
+ ~"u?r?'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'"is
+
+ expression = ored / sequence / term
+ or_term = "/" _ term
+ ored = term or_term+
+ sequence = term term+
+ not_term = "!" term _
+ lookahead_term = "&" term _
+ term = not_term / lookahead_term / quantified / atom
+ quantified = atom quantifier
+ atom = reference / literal / regex / parenthesized
+ regex = "~" spaceless_literal ~"[ilmsux]*"i _
+ parenthesized = "(" _ expression ")" _
+ quantifier = ~"[*+?]" _
+ reference = label !equals
+
+ # A subsequent equal sign is the only thing that distinguishes a label
+ # (which begins a new rule) from a reference (which is just a pointer to a
+ # rule defined somewhere else):
+ label = ~"[a-zA-Z_][a-zA-Z_0-9]*" _
+
+ # _ = ~r"\s*(?:#[^\r\n]*)?\s*"
+ _ = meaninglessness*
+ meaninglessness = ~r"\s+" / comment
+ comment = ~r"#[^\r\n]*"
+ ''')
+
+
+class LazyReference(text_type):
+ """A lazy reference to a rule, which we resolve after grokking all the
+ rules"""
+
+ name = u''
+
+ # Just for debugging:
+ def _as_rhs(self):
+ return u'<LazyReference to %s>' % self
+
+
+class RuleVisitor(NodeVisitor):
+ """Turns a parse tree of a grammar definition into a map of ``Expression``
+ objects
+
+ This is the magic piece that breathes life into a parsed bunch of parse
+ rules, allowing them to go forth and parse other things.
+
+ """
+ quantifier_classes = {'?': Optional, '*': ZeroOrMore, '+': OneOrMore}
+
+ visit_expression = visit_term = visit_atom = NodeVisitor.lift_child
+
+ def __init__(self, custom_rules=None):
+ """Construct.
+
+ :arg custom_rules: A dict of {rule name: expression} holding custom
+ rules which will take precedence over the others
+
+ """
+ self.custom_rules = custom_rules or {}
+
+ def visit_parenthesized(self, node, parenthesized):
+ """Treat a parenthesized subexpression as just its contents.
+
+ Its position in the tree suffices to maintain its grouping semantics.
+
+ """
+ left_paren, _, expression, right_paren, _ = parenthesized
+ return expression
+
+ def visit_quantifier(self, node, quantifier):
+ """Turn a quantifier into just its symbol-matching node."""
+ symbol, _ = quantifier
+ return symbol
+
+ def visit_quantified(self, node, quantified):
+ atom, quantifier = quantified
+ return self.quantifier_classes[quantifier.text](atom)
+
+ def visit_lookahead_term(self, node, lookahead_term):
+ ampersand, term, _ = lookahead_term
+ return Lookahead(term)
+
+ def visit_not_term(self, node, not_term):
+ exclamation, term, _ = not_term
+ return Not(term)
+
+ def visit_rule(self, node, rule):
+ """Assign a name to the Expression and return it."""
+ label, equals, expression = rule
+ expression.name = label # Assign a name to the expr.
+ return expression
+
+ def visit_sequence(self, node, sequence):
+ """A parsed Sequence looks like [term node, OneOrMore node of
+ ``another_term``s]. Flatten it out."""
+ term, other_terms = sequence
+ return Sequence(term, *other_terms)
+
+ def visit_ored(self, node, ored):
+ first_term, other_terms = ored
+ return OneOf(first_term, *other_terms)
+
+ def visit_or_term(self, node, or_term):
+ """Return just the term from an ``or_term``.
+
+ We already know it's going to be ored, from the containing ``ored``.
+
+ """
+ slash, _, term = or_term
+ return term
+
+ def visit_label(self, node, label):
+ """Turn a label into a unicode string."""
+ name, _ = label
+ return name.text
+
+ def visit_reference(self, node, reference):
+ """Stick a :class:`LazyReference` in the tree as a placeholder.
+
+ We resolve them all later.
+
+ """
+ label, not_equals = reference
+ return LazyReference(label)
+
+ def visit_regex(self, node, regex):
+ """Return a ``Regex`` expression."""
+ tilde, literal, flags, _ = regex
+ flags = flags.text.upper()
+ pattern = literal.literal # Pull the string back out of the Literal
+ # object.
+ return Regex(pattern, ignore_case='I' in flags,
+ locale='L' in flags,
+ multiline='M' in flags,
+ dot_all='S' in flags,
+ unicode='U' in flags,
+ verbose='X' in flags)
+
+ def visit_spaceless_literal(self, spaceless_literal, visited_children):
+ """Turn a string literal into a ``Literal`` that recognizes it."""
+ return Literal(evaluate_string(spaceless_literal.text))
+
+ def visit_literal(self, node, literal):
+ """Pick just the literal out of a literal-and-junk combo."""
+ spaceless_literal, _ = literal
+ return spaceless_literal
+
+ def generic_visit(self, node, visited_children):
+ """Replace childbearing nodes with a list of their children; keep
+ others untouched.
+
+ For our case, if a node has children, only the children are important.
+ Otherwise, keep the node around for (for example) the flags of the
+ regex rule. Most of these kept-around nodes are subsequently thrown
+ away by the other visitor methods.
+
+ We can't simply hang the visited children off the original node; that
+ would be disastrous if the node occurred in more than one place in the
+ tree.
+
+ """
+ return visited_children or node # should semantically be a tuple
+
+ def _resolve_refs(self, rule_map, expr, done, level):
+ """Return an expression with all its lazy references recursively
+ resolved.
+
+ Resolve any lazy references in the expression ``expr``, recursing into
+ all subexpressions.
+
+ :arg done: The set of Expressions that have already been or are
+ currently being resolved, to ward off redundant work and prevent
+ infinite recursion for circular refs
+
+ """
+# if expr.name == 'paren':
+# import pdb;pdb.set_trace()
+
+ print(' ' * level, expr)
+ if isinstance(expr, LazyReference):
+ label = text_type(expr)
+ try:
+ reffed_expr = rule_map[label]
+ except KeyError:
+ raise UndefinedLabel(expr)
+ return self._resolve_refs(rule_map, reffed_expr, done, level + 1)
+ else:
+ if getattr(expr, 'members', ()):
+ if expr not in done:
+ # Prevents infinite recursion for circular refs. At worst, one
+ # of `expr.members` can refer back to `expr`, but it can't go
+ # any farther.
+ done.add(expr)
+ expr.members = tuple(self._resolve_refs(rule_map, member, done, level + 1)
+ for member in expr.members)
+ else:
+ print(' ' * level, 'ALREADY DONE^^^')
+ return expr
+
+ def visit_rules(self, node, rules_list):
+ """Collate all the rules into a map. Return (map, default rule).
+
+ The default rule is the first one. Or, if you have more than one rule
+ of that name, it's the last-occurring rule of that name. (This lets you
+ override the default rule when you extend a grammar.) If there are no
+ string-based rules, the default rule is None, because the custom rules,
+ due to being kwarg-based, are unordered.
+
+ """
+ _, rules = rules_list
+
+ # Map each rule's name to its Expression. Later rules of the same name
+ # override earlier ones. This lets us define rules multiple times and
+ # have the last declaration win, so you can extend grammars by
+ # concatenation.
+ rule_map = OrderedDict((expr.name, expr) for expr in rules)
+
+ # And custom rules override string-based rules. This is the least
+ # surprising choice when you compare the dict constructor:
+ # dict({'x': 5}, x=6).
+ rule_map.update(self.custom_rules)
+
+ # Resolve references. This tolerates forward references.
+ done = set()
+ rule_map = OrderedDict((expr.name, self._resolve_refs(rule_map, expr, done, 0))
+ for expr in itervalues(rule_map))
+
+ # isinstance() is a temporary hack around the fact that * rules don't
+ # always get transformed into lists by NodeVisitor. We should fix that;
+ # it's surprising and requires writing lame branches like this.
+ return rule_map, (rule_map[rules[0].name]
+ if isinstance(rules, list) and rules else None)
+
+
+class TokenRuleVisitor(RuleVisitor):
+ """A visitor which builds expression trees meant to work on sequences of
+ pre-lexed tokens rather than strings"""
+
+ def visit_spaceless_literal(self, spaceless_literal, visited_children):
+ """Turn a string literal into a ``TokenMatcher`` that matches
+ ``Token`` objects by their ``type`` attributes."""
+ return TokenMatcher(evaluate_string(spaceless_literal.text))
+
+ def visit_regex(self, node, regex):
+ tilde, literal, flags, _ = regex
+ raise BadGrammar('Regexes do not make sense in TokenGrammars, since '
+ 'TokenGrammars operate on pre-lexed tokens rather '
+ 'than characters.')
+
+
+# Bootstrap to level 1...
+rule_grammar = BootstrappingGrammar(rule_syntax)
+# ...and then to level 2. This establishes that the node tree of our rule
+# syntax is built by the same machinery that will build trees of our users'
+# grammars. And the correctness of that tree is tested, indirectly, in
+# test_grammar.
+rule_grammar = Grammar(rule_syntax)
+
+
+# TODO: Teach Expression trees how to spit out Python representations of
+# themselves. Then we can just paste that in above, and we won't have to
+# bootstrap on import. Though it'll be a little less DRY. [Ah, but this is not
+# so clean, because it would have to output multiple statements to get multiple
+# refs to a single expression hooked up.]
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/parsimonious-0.10.0/parsimonious/tests/test_grammar.py new/parsimonious-0.11.0/parsimonious/tests/test_grammar.py
--- old/parsimonious-0.10.0/parsimonious/tests/test_grammar.py 2022-06-29 07:53:12.000000000 +0200
+++ new/parsimonious-0.11.0/parsimonious/tests/test_grammar.py 2025-11-12 01:57:16.000000000 +0100
@@ -504,6 +504,29 @@
list(grammar.keys()),
['r%s' % i for i in range(100)])
+ def test_sequence_choice_bug(self):
+ """
+ Regression test for https://github.com/erikrose/parsimonious/issues/238
+ """
+ grammar = Grammar(r'''
+ value = "[" "]" / "5"
+ ''')
+ self.assertTrue(grammar.parse('[]') is not None)
+ self.assertTrue(grammar.parse('5') is not None)
+ grammar2 = Grammar(r'''
+ value = "5" / "[" "]"
+ ''')
+ self.assertTrue(grammar2.parse('[]') is not None)
+ self.assertTrue(grammar2.parse('5') is not None)
+ grammar3 = Grammar(r'''
+ value = "4" / "[" "]" / "(" ")" / "{" "}" / "5"
+ ''')
+ self.assertTrue(grammar3.parse('[]') is not None)
+ self.assertTrue(grammar3.parse('5') is not None)
+ self.assertTrue(grammar3.parse('()') is not None)
+ self.assertTrue(grammar3.parse('{}') is not None)
+ self.assertTrue(grammar3.parse('4') is not None)
+
def test_repetitions(self):
grammar = Grammar(r'''
left_missing = "a"{,5}
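The regression test above exercises the 0.11.0 fix for `/` choices whose alternatives are multi-term sequences (issue 238). The PEG semantics being tested can be sketched, outside of parsimonious itself, with a minimal ordered-choice matcher; all names below are illustrative, not taken from the library:

```python
# Minimal sketch of PEG ordered choice over sequences of literal terms.
# An alternative is a tuple of literals; a choice tries alternatives in
# order and commits to the first one that matches the whole input.

def match_sequence(terms, text, pos=0):
    """Match every literal term in order; return the end position or None."""
    for term in terms:
        if text.startswith(term, pos):
            pos += len(term)
        else:
            return None
    return pos

def match_choice(alternatives, text):
    """PEG '/': the first alternative consuming the entire input wins."""
    for alt in alternatives:
        if match_sequence(alt, text) == len(text):
            return alt
    return None

# Mirrors the grammar3 rule: value = "4" / "[" "]" / "(" ")" / "{" "}" / "5"
value = [("4",), ("[", "]"), ("(", ")"), ("{", "}"), ("5",)]
```

With this sketch, `match_choice(value, "[]")` selects the `("[", "]")` alternative and `match_choice(value, "5")` selects `("5",)`, which is exactly the behavior the test pins down for grammars mixing single-term and multi-term alternatives.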
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/parsimonious-0.10.0/parsimonious.egg-info/PKG-INFO new/parsimonious-0.11.0/parsimonious.egg-info/PKG-INFO
--- old/parsimonious-0.10.0/parsimonious.egg-info/PKG-INFO 2022-09-03 18:58:31.000000000 +0200
+++ new/parsimonious-0.11.0/parsimonious.egg-info/PKG-INFO 2025-11-12 02:27:36.000000000 +0100
@@ -1,11 +1,10 @@
-Metadata-Version: 2.1
+Metadata-Version: 2.4
Name: parsimonious
-Version: 0.10.0
+Version: 0.11.0
Summary: (Soon to be) the fastest pure-Python PEG parser I could muster
-Home-page: https://github.com/erikrose/parsimonious
-Author: Erik Rose
-Author-email: [email protected]
+Author-email: Erik Rose <[email protected]>
License: MIT
+Project-URL: Homepage, https://github.com/erikrose/parsimonious
Keywords: parse,parser,parsing,peg,packrat,grammar,language
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
@@ -14,15 +13,21 @@
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Text Processing :: General
Description-Content-Type: text/x-rst
License-File: LICENSE
+Requires-Dist: regex>=2022.3.15
+Provides-Extra: testing
+Requires-Dist: pytest; extra == "testing"
+Dynamic: license-file
============
Parsimonious
@@ -117,7 +122,7 @@
equal = ws? "=" ws?
lpar = "["
rpar = "]"
- ws = ~"\s*"
+ ws = ~r"\s*"
emptyline = ws+
"""
)
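The README hunk above changes the example grammar's `ws = ~"\s*"` to `ws = ~r"\s*"`. The motivation is Python's string-literal rules rather than the regex itself: in a normal string, `\s` is an invalid escape sequence (currently a warning, slated to become an error), while a raw string passes the backslash through verbatim. A small stdlib-only illustration:

```python
import re

# Raw string: the backslash survives as a literal character, so the regex
# engine sees \s (whitespace class) followed by *.
raw = r"\s*"
assert raw == "\\s*"  # same two-character escape, written explicitly

# The pattern behaves as the grammar's ws rule intends:
assert re.fullmatch(raw, " \t\n") is not None  # any run of whitespace
assert re.fullmatch(raw, "") is not None       # including the empty string
```

Since the pattern is embedded in a larger `r'''…'''` grammar string in the README, the raw prefix there keeps the backslash intact all the way down to the regex engine.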
@@ -253,6 +258,9 @@
You can wrap a rule across multiple lines if you like; the syntax is very
forgiving.
+If you want to save your grammar into a separate file, you should name it using
+``.ppeg`` extension.
+
Syntax Reference
----------------
@@ -467,8 +475,12 @@
Version History
===============
-(Next release)
- * ...
+
+0.11.0
+ * Correctly handle `/` expressions with multiple terms in a row. (lucaswiman)
+ * Start using pyproject.toml. (Kolanich)
+ * Add a ``ParsimoniousError`` exception base class. (Kevin Kirsche)
+ * Fall back to ``re`` when the ``regex`` lib is not available. (Pavel Kirienko)
0.10.0
* Fix infinite recursion in __eq__ in some cases. (FelisNivalis)
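The "fall back to ``re``" entry in the 0.11.0 changelog describes the common optional-dependency import pattern. A hedged sketch of that pattern (parsimonious's actual import logic may differ in detail):

```python
# Prefer the third-party `regex` module, which is a superset of the stdlib
# `re` API, but degrade gracefully when it is not installed.
try:
    import regex as re  # richer engine, same core call signatures
except ImportError:
    import re           # stdlib fallback

# Either module supports the core calls a grammar tokenizer needs:
match = re.match(r"[a-z_][a-z0-9_]*", "rule_name = ...")
```

Because both modules expose compatible `match`/`compile`/`fullmatch` entry points for basic patterns, downstream code can stay agnostic about which one was imported.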
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/parsimonious-0.10.0/parsimonious.egg-info/SOURCES.txt new/parsimonious-0.11.0/parsimonious.egg-info/SOURCES.txt
--- old/parsimonious-0.10.0/parsimonious.egg-info/SOURCES.txt 2022-09-03 18:58:31.000000000 +0200
+++ new/parsimonious-0.11.0/parsimonious.egg-info/SOURCES.txt 2025-11-12 02:27:36.000000000 +0100
@@ -1,12 +1,16 @@
+.gitignore
LICENSE
MANIFEST.in
README.rst
+pyproject.toml
setup.py
+tox.ini
+.github/workflows/main.yml
parsimonious/__init__.py
-parsimonious/adhoc_expression.py
parsimonious/exceptions.py
parsimonious/expressions.py
parsimonious/grammar.py
+parsimonious/grammar.py pre-rewriting-resolve_refs.py
parsimonious/nodes.py
parsimonious/utils.py
parsimonious.egg-info/PKG-INFO
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/parsimonious-0.10.0/parsimonious.egg-info/requires.txt new/parsimonious-0.11.0/parsimonious.egg-info/requires.txt
--- old/parsimonious-0.10.0/parsimonious.egg-info/requires.txt 2022-09-03 18:58:31.000000000 +0200
+++ new/parsimonious-0.11.0/parsimonious.egg-info/requires.txt 2025-11-12 02:27:36.000000000 +0100
@@ -1 +1,4 @@
regex>=2022.3.15
+
+[testing]
+pytest
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/parsimonious-0.10.0/pyproject.toml new/parsimonious-0.11.0/pyproject.toml
--- old/parsimonious-0.10.0/pyproject.toml 1970-01-01 01:00:00.000000000 +0100
+++ new/parsimonious-0.11.0/pyproject.toml 2025-11-12 02:04:47.000000000 +0100
@@ -0,0 +1,51 @@
+[build-system]
+requires = ["setuptools>=61.2.0", "setuptools_scm[toml]>=3.4.3"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "parsimonious"
+version = "0.11.0"
+authors = [{name = "Erik Rose", email = "[email protected]"}]
+license = {text = "MIT"}
+description = "(Soon to be) the fastest pure-Python PEG parser I could muster"
+keywords = [
+ "parse",
+ "parser",
+ "parsing",
+ "peg",
+ "packrat",
+ "grammar",
+ "language",
+]
+readme = "README.rst"
+classifiers = [
+ "Intended Audience :: Developers",
+ "Natural Language :: English",
+ "Development Status :: 3 - Alpha",
+ "License :: OSI Approved :: MIT License",
+ "Operating System :: OS Independent",
+ "Programming Language :: Python :: 3 :: Only",
+ "Programming Language :: Python :: 3",
+ "Programming Language :: Python :: 3.8",
+ "Programming Language :: Python :: 3.9",
+ "Programming Language :: Python :: 3.10",
+ "Programming Language :: Python :: 3.11",
+ "Programming Language :: Python :: 3.12",
+ "Programming Language :: Python :: 3.13",
+ "Topic :: Scientific/Engineering :: Information Analysis",
+ "Topic :: Software Development :: Libraries",
+ "Topic :: Text Processing :: General",
+]
+urls = {Homepage = "https://github.com/erikrose/parsimonious"}
+dependencies = ["regex>=2022.3.15"]
+
+[project.optional-dependencies]
+testing = ["pytest"]
+
+[tool.setuptools]
+include-package-data = true
+
+[tool.setuptools.packages]
+find = {namespaces = false}
+
+[tool.setuptools_scm]
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/parsimonious-0.10.0/setup.py
new/parsimonious-0.11.0/setup.py
--- old/parsimonious-0.10.0/setup.py 2022-09-03 18:58:18.000000000 +0200
+++ new/parsimonious-0.11.0/setup.py 2025-11-12 01:57:16.000000000 +0100
@@ -1,38 +1,4 @@
-from sys import version_info
+from setuptools import setup
-from io import open
-from setuptools import setup, find_packages
-
-long_description=open('README.rst', 'r', encoding='utf8').read()
-
-setup(
- name='parsimonious',
- version='0.10.0',
- description='(Soon to be) the fastest pure-Python PEG parser I could muster',
- long_description=long_description,
- long_description_content_type='text/x-rst',
- author='Erik Rose',
- author_email='[email protected]',
- license='MIT',
- packages=find_packages(exclude=['ez_setup']),
- test_suite='tests',
- url='https://github.com/erikrose/parsimonious',
- include_package_data=True,
- install_requires=['regex>=2022.3.15'],
- classifiers=[
- 'Intended Audience :: Developers',
- 'Natural Language :: English',
- 'Development Status :: 3 - Alpha',
- 'License :: OSI Approved :: MIT License',
- 'Operating System :: OS Independent',
- 'Programming Language :: Python :: 3 :: Only',
- 'Programming Language :: Python :: 3',
- 'Programming Language :: Python :: 3.7',
- 'Programming Language :: Python :: 3.8',
- 'Programming Language :: Python :: 3.9',
- 'Programming Language :: Python :: 3.10',
- 'Topic :: Scientific/Engineering :: Information Analysis',
- 'Topic :: Software Development :: Libraries',
- 'Topic :: Text Processing :: General'],
- keywords=['parse', 'parser', 'parsing', 'peg', 'packrat', 'grammar', 'language'],
-)
+if __name__ == "__main__":
+ setup()
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/parsimonious-0.10.0/tox.ini new/parsimonious-0.11.0/tox.ini
--- old/parsimonious-0.10.0/tox.ini 1970-01-01 01:00:00.000000000 +0100
+++ new/parsimonious-0.11.0/tox.ini 2025-11-12 01:57:16.000000000 +0100
@@ -0,0 +1,17 @@
+[tox]
+envlist = py38, py39, py310, py311, py312, py313
+
+[gh-actions]
+python =
+ 3.8: py38
+ 3.9: py39
+ 3.10: py310
+ 3.11: py311
+ 3.12: py312
+ 3.13: py313
+
+[testenv]
+usedevelop = True
+commands = py.test --tb=native {posargs:parsimonious}
+deps =
+ pytest