Hello community,
here is the log from the commit of package python-hypothesmith for
openSUSE:Factory checked in at 2020-08-25 12:38:33
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-hypothesmith (Old)
and /work/SRC/openSUSE:Factory/.python-hypothesmith.new.3399 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-hypothesmith"
Tue Aug 25 12:38:33 2020 rev:2 rq:828110 version:0.1.4
Changes:
--------
--- /work/SRC/openSUSE:Factory/python-hypothesmith/python-hypothesmith.changes
2020-04-16 23:05:18.031784123 +0200
+++
/work/SRC/openSUSE:Factory/.python-hypothesmith.new.3399/python-hypothesmith.changes
2020-08-25 12:38:41.313413578 +0200
@@ -1,0 +2,42 @@
+Wed Aug 19 15:46:37 UTC 2020 - Benjamin Greiner <[email protected]>
+
+- Update to 0.1.4
+ * Improve handling of identifiers
+ * Fix internal error in `from_grammar("single_input")`
+- do not install myself on multibuild test flavor
+
+-------------------------------------------------------------------
+Sat Aug 8 18:44:29 UTC 2020 - Benjamin Greiner <[email protected]>
+
+- Use github repository download for LICENSE, CHANGELOG.md (needed
+ by tests) and test directory gh#Zac-HD/hypothesmith#5
+- run tests in multibuild flavor, they are quite time-consuming and
+ the test requirements create dependency loops
+- filter empty types file python-hypothesmith-rpmlintrc
+
+-------------------------------------------------------------------
+Thu Aug 6 13:31:26 UTC 2020 - Benjamin Greiner <[email protected]>
+
+- Update to 0.1.3
+ * Update to latest versions of LibCST and Hypothesis, for Python
+ 3.9 support
+- 0.1.2 - 2020-05-17
+ * Emit more debug info to diagnose a compile() issue in CPython
+ nightly
+- 0.1.1 - 2020-05-17
+ * Emit some debug info to help diagnose a possible upstream bug
+ in CPython nightly
+- 0.1.0 - 2020-04-24
+ * Added auto_target=True argument to the from_node() strategy.
+ * Improved from_node() generation of comments and trailing
+ whitespace.
+- 0.0.8 - 2020-04-23
+ * Added a from_node() strategy which uses LibCST to generate
+ source code. This is a proof-of-concept rather than a robust
+ tool, but IMO it's a pretty cool concept.
+- 0.0.7 - 2020-04-19
+ * The from_grammar() strategy now takes an auto_target=True
+ argument, to drive generated examples towards (relatively)
+ larger and more complex programs.
+
+-------------------------------------------------------------------
Old:
----
LICENSE
hypothesmith-0.0.6.tar.gz
New:
----
_multibuild
hypothesmith-0.1.4.tar.gz
hypothesmith-gh-0.1.4.tar.gz
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Other differences:
------------------
++++++ python-hypothesmith.spec ++++++
--- /var/tmp/diff_new_pack.d59m6g/_old 2020-08-25 12:38:42.737413847 +0200
+++ /var/tmp/diff_new_pack.d59m6g/_new 2020-08-25 12:38:42.737413847 +0200
@@ -18,23 +18,42 @@
%{?!python_module:%define python_module() python-%{**} python3-%{**}}
%define skip_python2 1
-Name: python-hypothesmith
-Version: 0.0.6
+# no release tags in repository, but we need LICENSE and tests not
+# packaged in PyPI source https://github.com/Zac-HD/hypothesmith/issues/5
+%define commithash 6124cd71317add93500e0cb04c98cf5606adedea
+%global flavor @BUILD_FLAVOR@%{nil}
+%if "%{flavor}" == "test"
+%define psuffix -test
+%bcond_without test
+%else
+%define psuffix %{nil}
+%bcond_with test
+%endif
+Name: python-hypothesmith%{psuffix}
+Version: 0.1.4
Release: 0
Summary: Hypothesis strategies for generating Python programs, something like CSmith
License: MPL-2.0
URL: https://github.com/Zac-HD/hypothesmith
Source: https://files.pythonhosted.org/packages/source/h/hypothesmith/hypothesmith-%{version}.tar.gz
-# https://github.com/Zac-HD/hypothesmith/issues/5
-Source1: https://raw.githubusercontent.com/Zac-HD/hypothesmith/master/LICENSE
+Source1: https://github.com/Zac-HD/hypothesmith/archive/%{commithash}.tar.gz#/hypothesmith-gh-%{version}.tar.gz
BuildRequires: %{python_module base >= 3.6}
-BuildRequires: %{python_module hypothesis >= 4.36.0}
-BuildRequires: %{python_module lark-parser >= 0.7.2}
BuildRequires: %{python_module setuptools}
BuildRequires: fdupes
BuildRequires: python-rpm-macros
-Requires: python-hypothesis >= 4.36.0
+Requires: python-base >= 3.6
+Requires: python-hypothesis >= 5.23.7
Requires: python-lark-parser >= 0.7.2
+Requires: python-libcst >= 0.3.8
+%if %{with test}
+BuildRequires: %{python_module black}
+BuildRequires: %{python_module hypothesis >= 5.23.7}
+BuildRequires: %{python_module lark-parser >= 0.7.2}
+BuildRequires: %{python_module libcst >= 0.3.8}
+BuildRequires: %{python_module parso}
+BuildRequires: %{python_module pytest-xdist}
+BuildRequires: %{python_module pytest}
+%endif
BuildArch: noarch
%python_subpackages
@@ -42,22 +61,33 @@
Hypothesis strategies for generating Python programs, something like CSmith.
%prep
-%setup -q -n hypothesmith-%{version}
-cp %{SOURCE1} .
+%setup -q -n hypothesmith-%{version} -b 1
+cp -r ../hypothesmith-%{commithash}/{LICENSE,CHANGELOG.md,tests} .
%build
+%if !%{with test}
%python_build
+%endif
%install
+%if !%{with test}
%python_install
%python_expand %fdupes %{buildroot}%{$python_sitelib}
+%endif
%check
-# https://github.com/Zac-HD/hypothesmith/issues/5
+%if %{with test}
+# multibuild: test the source dir, nothing is installed
+export PYTHONPATH=$(pwd)/src
+%pytest -n auto
+%endif
+%if !%{with test}
%files %{python_files}
-%doc README.md
+%doc README.md CHANGELOG.md
%license LICENSE
-%{python_sitelib}/*
+%{python_sitelib}/hypothesmith
+%{python_sitelib}/hypothesmith-%{version}-py*.egg-info
+%endif
%changelog
++++++ _multibuild ++++++
<multibuild>
<package>test</package>
</multibuild>
++++++ hypothesmith-0.0.6.tar.gz -> hypothesmith-0.1.4.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/hypothesmith-0.0.6/PKG-INFO
new/hypothesmith-0.1.4/PKG-INFO
--- old/hypothesmith-0.0.6/PKG-INFO 2020-04-08 05:01:38.000000000 +0200
+++ new/hypothesmith-0.1.4/PKG-INFO 2020-08-17 06:18:13.808235200 +0200
@@ -1,6 +1,6 @@
Metadata-Version: 2.1
Name: hypothesmith
-Version: 0.0.6
+Version: 0.1.4
Summary: Hypothesis strategies for generating Python programs, something like
CSmith
Home-page: https://github.com/Zac-HD/hypothesmith
Author: Zac Hatfield-Dodds
@@ -19,31 +19,55 @@
You can run the tests, such as they are, with `tox` on Python 3.6 or
later.
Use `tox -va` to see what environments are available.
- ## Changelog
+ ## Usage
+ This package provides two Hypothesis strategies for generating Python
source code.
- ### 0.0.6 - 2020-04-08
- - support for non-ASCII identifiers
+ The generated code will always be syntactically valid, and is useful for testing
+ parsers, linters, auto-formatters, and other tools that operate on
source code.
- ### 0.0.5 - 2019-11-27
- - Updated project metadata and started testing on Python 3.8
+ > DO NOT EXECUTE CODE GENERATED BY THESE STRATEGIES.
+ >
+ > It could do literally anything that running Python code is able to
do,
+ > including changing, deleting, or uploading important data. Arbitrary
+ > code can be useful, but "arbitrary code execution" can be very, very
bad.
+
+ #### `hypothesmith.from_grammar(start="file_input", *,
auto_target=True)`
+
+ Generates syntactically-valid Python source code based on the grammar.
+
+ Valid values for ``start`` are ``"single_input"``, ``"file_input"``, or
+ ``"eval_input"``; respectively a single interactive statement, a
module or
+ sequence of commands read from a file, and input for the eval()
function.
+
+ If ``auto_target`` is ``True``, this strategy uses
``hypothesis.target()``
+ internally to drive towards larger and more complex examples. We
recommend
+ leaving this enabled, as the grammar is quite complex and only simple
examples
+ tend to be generated otherwise.
+
+ #### `hypothesmith.from_node(node=libcst.Module, *, auto_target=True)`
+
+ Generates syntactically-valid Python source code based on the node
types
+ defined by the [`LibCST`](https://libcst.readthedocs.io/en/latest/)
project.
+
+ You can pass any subtype of `libcst.CSTNode`. Alternatively, you can
use
+ Hypothesis' built-in `from_type(node_type).map(lambda n: libcst.Module([n]).code)`,
+ after Hypothesmith has registered the required strategies. However,
this does
+ not include automatic targeting and limitations of LibCST may lead to
invalid
+ code being generated.
+
+ ## Notable bugs found with Hypothesmith
+ - [BPO-40661, a segfault in the new
parser](https://bugs.python.org/issue40661),
+ was given maximum priority and blocked the planned release of
CPython 3.9 beta1.
+ - [BPO-38953](https://bugs.python.org/issue38953) `tokenize` ->
`untokenize` roundtrip bugs.
+ - [`lib2to3` errors on \r in
comment](https://github.com/psf/black/issues/970)
+ - [Black fails on files ending in a
backslash](https://github.com/psf/black/issues/1012)
+ - [At least three round-trip bugs in
LibCST](https://github.com/Instagram/LibCST#acknowledgements)
+ (search commits for "hypothesis")
+ - [Invalid code generated by
LibCST](https://github.com/Instagram/LibCST/issues/287)
- ### 0.0.4 - 2019-09-10
- - Depends on more recent Hypothesis version, with upstreamed grammar
generation.
- - Improved filtering rejects fewer valid examples, finding another bug
in Black.
-
- ### 0.0.3 - 2019-08-08
- Checks validity at statement level, which makes filtering much more
efficient.
- Improved testing, input validation, and code comments.
-
- ### 0.0.2 - 2019-08-07
- Improved filtering and fixing of source code generated from the
grammar.
- This version found a novel bug: `"pass #\\r#\\n"` is accepted by the
- built-in `compile()` and `exec()` functions, but not by `black` or
`lib2to3`.
-
- ### 0.0.1 - 2019-08-06
- Initial release. This is a minimal proof of concept, generating from
the
- grammar and rejecting it if we get errors from `black` or `tokenize`.
- Cool, but while promising not very useful at this stage.
+ ### Changelog
+
+ Patch notes [can be found in
`CHANGELOG.md`](https://github.com/Zac-HD/hypothesmith/blob/master/CHANGELOG.md).
Keywords: python testing fuzzing property-based-testing
Platform: UNKNOWN
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/hypothesmith-0.0.6/README.md
new/hypothesmith-0.1.4/README.md
--- old/hypothesmith-0.0.6/README.md 2020-04-08 05:01:30.000000000 +0200
+++ new/hypothesmith-0.1.4/README.md 2020-08-17 06:18:03.000000000 +0200
@@ -10,28 +10,52 @@
You can run the tests, such as they are, with `tox` on Python 3.6 or later.
Use `tox -va` to see what environments are available.
-## Changelog
+## Usage
+This package provides two Hypothesis strategies for generating Python source code.
-### 0.0.6 - 2020-04-08
-- support for non-ASCII identifiers
+The generated code will always be syntactically valid, and is useful for testing
+parsers, linters, auto-formatters, and other tools that operate on source code.
-### 0.0.5 - 2019-11-27
-- Updated project metadata and started testing on Python 3.8
+> DO NOT EXECUTE CODE GENERATED BY THESE STRATEGIES.
+>
+> It could do literally anything that running Python code is able to do,
+> including changing, deleting, or uploading important data. Arbitrary
+> code can be useful, but "arbitrary code execution" can be very, very bad.
+
+#### `hypothesmith.from_grammar(start="file_input", *, auto_target=True)`
+
+Generates syntactically-valid Python source code based on the grammar.
+
+Valid values for ``start`` are ``"single_input"``, ``"file_input"``, or
+``"eval_input"``; respectively a single interactive statement, a module or
+sequence of commands read from a file, and input for the eval() function.
+
+If ``auto_target`` is ``True``, this strategy uses ``hypothesis.target()``
+internally to drive towards larger and more complex examples. We recommend
+leaving this enabled, as the grammar is quite complex and only simple examples
+tend to be generated otherwise.
+
+#### `hypothesmith.from_node(node=libcst.Module, *, auto_target=True)`
+
+Generates syntactically-valid Python source code based on the node types
+defined by the [`LibCST`](https://libcst.readthedocs.io/en/latest/) project.
+
+You can pass any subtype of `libcst.CSTNode`. Alternatively, you can use
+Hypothesis' built-in `from_type(node_type).map(lambda n: libcst.Module([n]).code)`,
+after Hypothesmith has registered the required strategies. However, this does
+not include automatic targeting and limitations of LibCST may lead to invalid
+code being generated.
+
+## Notable bugs found with Hypothesmith
+- [BPO-40661, a segfault in the new parser](https://bugs.python.org/issue40661),
+ was given maximum priority and blocked the planned release of CPython 3.9 beta1.
+- [BPO-38953](https://bugs.python.org/issue38953) `tokenize` -> `untokenize` roundtrip bugs.
+- [`lib2to3` errors on \r in comment](https://github.com/psf/black/issues/970)
+- [Black fails on files ending in a backslash](https://github.com/psf/black/issues/1012)
+- [At least three round-trip bugs in LibCST](https://github.com/Instagram/LibCST#acknowledgements)
+ (search commits for "hypothesis")
+- [Invalid code generated by LibCST](https://github.com/Instagram/LibCST/issues/287)
-### 0.0.4 - 2019-09-10
-- Depends on more recent Hypothesis version, with upstreamed grammar generation.
-- Improved filtering rejects fewer valid examples, finding another bug in Black.
-
-### 0.0.3 - 2019-08-08
-Checks validity at statement level, which makes filtering much more efficient.
-Improved testing, input validation, and code comments.
-
-### 0.0.2 - 2019-08-07
-Improved filtering and fixing of source code generated from the grammar.
-This version found a novel bug: `"pass #\\r#\\n"` is accepted by the
-built-in `compile()` and `exec()` functions, but not by `black` or `lib2to3`.
-
-### 0.0.1 - 2019-08-06
-Initial release. This is a minimal proof of concept, generating from the
-grammar and rejecting it if we get errors from `black` or `tokenize`.
-Cool, but while promising not very useful at this stage.
+### Changelog
+
+Patch notes [can be found in `CHANGELOG.md`](https://github.com/Zac-HD/hypothesmith/blob/master/CHANGELOG.md).
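The three `start` values named in the README hunk are the start symbols of Python's grammar, and each has a matching mode of the built-in `compile()`. The following is a minimal, stdlib-only sketch of that correspondence. `START_TO_MODE` and `accepts()` are illustrative names made up for this note and are not part of hypothesmith. In line with the README warning, the snippet only compiles source and never executes it.

```python
# Illustrative mapping from grammar start symbols to compile() modes.
# START_TO_MODE and accepts() are hypothetical names for this sketch only.
START_TO_MODE = {
    "single_input": "single",  # one interactive statement
    "file_input": "exec",      # a module or sequence of statements
    "eval_input": "eval",      # a single expression, as passed to eval()
}


def accepts(source: str, start: str) -> bool:
    """Return True if `source` compiles under the mode matching `start`."""
    try:
        compile(source, "<string>", START_TO_MODE[start])
        return True
    except (SyntaxError, ValueError):
        return False


print(accepts("x = 1\ny = 2\n", "file_input"))
print(accepts("x = 1\n", "single_input"))
print(accepts("1 + 1", "eval_input"))
print(accepts("x = 1", "eval_input"))  # an assignment is not an expression
```

A check of this kind is a cheap way to confirm that a generated example really belongs to the start symbol it was requested for.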
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/hypothesmith-0.0.6/setup.py
new/hypothesmith-0.1.4/setup.py
--- old/hypothesmith-0.0.6/setup.py 2020-04-08 05:01:30.000000000 +0200
+++ new/hypothesmith-0.1.4/setup.py 2020-08-17 06:18:03.000000000 +0200
@@ -1,4 +1,4 @@
-"""It's a setup.py"""
+"""Packaging config for Hypothesmith."""
import os
@@ -32,7 +32,7 @@
license="MPL 2.0",
description="Hypothesis strategies for generating Python programs, something like CSmith",
zip_safe=False,
- install_requires=["hypothesis>=4.36.0", "lark-parser>=0.7.2"],
+ install_requires=["hypothesis>=5.23.7", "lark-parser>=0.7.2", "libcst>=0.3.8"],
python_requires=">=3.6",
classifiers=[
"Development Status :: 2 - Pre-Alpha",
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/hypothesmith-0.0.6/src/hypothesmith/__init__.py
new/hypothesmith-0.1.4/src/hypothesmith/__init__.py
--- old/hypothesmith-0.0.6/src/hypothesmith/__init__.py 2020-04-08
05:01:30.000000000 +0200
+++ new/hypothesmith-0.1.4/src/hypothesmith/__init__.py 2020-08-17
06:18:03.000000000 +0200
@@ -1,6 +1,7 @@
"""Hypothesis strategies for generating Python source code, somewhat like CSmith."""
+from hypothesmith.cst import from_node
from hypothesmith.syntactic import from_grammar
-__version__ = "0.0.6"
-__all__ = ["from_grammar"]
+__version__ = "0.1.4"
+__all__ = ["from_grammar", "from_node"]
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/hypothesmith-0.0.6/src/hypothesmith/cst.py
new/hypothesmith-0.1.4/src/hypothesmith/cst.py
--- old/hypothesmith-0.0.6/src/hypothesmith/cst.py 1970-01-01
01:00:00.000000000 +0100
+++ new/hypothesmith-0.1.4/src/hypothesmith/cst.py 2020-08-17
06:18:03.000000000 +0200
@@ -0,0 +1,161 @@
+"""
+Generating Python source code from a syntax tree.
+
+Thanks to Instagram for open-sourcing libCST (which is great!) and
+thanks to Tolkien for the name of this module.
+"""
+
+import ast
+import dis
+from inspect import getfullargspec
+from tokenize import (
+    Floatnumber as FLOATNUMBER_RE,
+    Imagnumber as IMAGNUMBER_RE,
+    Intnumber as INTNUMBER_RE,
+)
+from typing import Type
+
+import libcst
+from hypothesis import infer, strategies as st, target
+
+from hypothesmith.syntactic import identifiers
+
+# For some nodes, we just need to ensure that they use the appropriate regex
+# pattern instead of allowing literally any string.
+for node_type, pattern in {
+    libcst.Float: FLOATNUMBER_RE,
+    libcst.Integer: INTNUMBER_RE,
+    libcst.Imaginary: IMAGNUMBER_RE,
+    libcst.SimpleWhitespace: libcst._nodes.whitespace.SIMPLE_WHITESPACE_RE,
+}.items():
+    _strategy = st.builds(node_type, st.from_regex(pattern, fullmatch=True))
+    st.register_type_strategy(node_type, _strategy)
+
+# type-ignore comments are special in the 3.8+ (typed) ast, so boost their chances
+_comments = st.from_regex(libcst._nodes.whitespace.COMMENT_RE, fullmatch=True)
+st.register_type_strategy(
+    libcst.Comment, st.builds(libcst.Comment, _comments | st.just("# type: ignore")),
+)
+
+# `from_type()` has less laziness than other strategies, so we register for these
+# foundational node types *before* referring to them in other strategies.
+st.register_type_strategy(libcst.Name, st.builds(libcst.Name, identifiers()))
+st.register_type_strategy(
+    libcst.SimpleString, st.builds(libcst.SimpleString, st.text().map(repr))
+)
+
+# Ensure that ImportAlias uses Attribute nodes composed only of Name nodes.
+names = st.from_type(libcst.Name)
+name_only_attributes = st.one_of(
+    names,
+    st.builds(libcst.Attribute, names, names),
+    st.builds(libcst.Attribute, st.builds(libcst.Attribute, names, names), names),
+)
+st.register_type_strategy(
+    libcst.ImportAlias, st.builds(libcst.ImportAlias, name_only_attributes)
+)
+
+
+def nonempty_seq(*node: Type[libcst.CSTNode]) -> st.SearchStrategy:
+    return st.lists(st.one_of(*map(st.from_type, node)), min_size=1)
+
+
+# There are around 150 concrete types of CST nodes. Delightfully, libCST uses
+# dataclasses for all these classes, so we can allow the `builds` & `from_type`
+# inference to provide most of our arguments for us.
+# However, in some cases we want to either restrict arguments (e.g. libcst.Name),
+# or supply something nastier than the default argument (e.g. libcst.SimpleWhitespace)
+REGISTERED = (
+    [libcst.AsName, st.from_type(libcst.Name)],
+    [libcst.Assign, nonempty_seq(libcst.AssignTarget)],
+    [libcst.Comparison, infer, nonempty_seq(libcst.ComparisonTarget)],
+    [libcst.Decorator, st.from_type(libcst.Name) | st.from_type(libcst.Attribute)],
+    [libcst.EmptyLine, infer, infer, infer],
+    [libcst.Global, nonempty_seq(libcst.NameItem)],
+    [libcst.Import, nonempty_seq(libcst.ImportAlias)],
+    [
+        libcst.ImportFrom,
+        st.from_type(libcst.Name) | st.from_type(libcst.Attribute),
+        nonempty_seq(libcst.ImportAlias),
+    ],
+    [libcst.NamedExpr, st.from_type(libcst.Name)],
+    [libcst.Nonlocal, nonempty_seq(libcst.NameItem)],
+    [libcst.Set, nonempty_seq(libcst.Element, libcst.StarredElement)],
+    [libcst.Subscript, infer, nonempty_seq(libcst.SubscriptElement)],
+    [libcst.TrailingWhitespace, infer, infer],
+    [libcst.With, nonempty_seq(libcst.WithItem)],
+)
+
+
+# This is where the magic happens: teach `st.from_type` to generate each node type
+for node_type, *strats in REGISTERED:
+    # TODO: once everything else is working, come back here and use `infer` for
+    # all arguments without an explicit strategy - inference is more "interesting"
+    # than just using the default argument... in the proverbial sense.
+    # Mostly this will consist of ensuring that parens remain balanced.
+    args = [name for name in getfullargspec(node_type).args if name != "self"]
+    kwargs = dict(zip(args, strats))
+    st.register_type_strategy(node_type, st.builds(node_type, **kwargs))
+
+# We have special handling for `Try` nodes, because there are two options.
+# If a Try node has no `except` clause, it *must* have a `finally` clause and
+# *must not* have an `else` clause. With one or more except clauses, it may
+# have an else and/or a finally, or neither.
+st.register_type_strategy(
+    libcst.Try,
+    st.builds(libcst.Try, finalbody=st.from_type(libcst.Finally))
+    | st.builds(
+        libcst.Try,
+        body=infer,
+        handlers=st.lists(
+            st.from_type(libcst.ExceptHandler),
+            min_size=1,
+            unique_by=lambda caught: caught.type,
+        ),
+        orelse=infer,
+        finalbody=infer,
+    ),
+)
+
+
+def record_targets(code: str) -> str:
+    # target larger inputs - the Hypothesis engine will do a multi-objective
+    # hill-climbing search using these scores to generate 'better' examples.
+    nodes = list(ast.walk(ast.parse(code)))
+    uniq_nodes = {type(n) for n in nodes}
+    instructions = list(dis.Bytecode(compile(code, "<string>", "exec")))
+    for value, label in [
+        (len(instructions), "(hypothesmith from_node) instructions in bytecode"),
+        (len(nodes), "(hypothesmith from_node) total number of ast nodes"),
+        (len(uniq_nodes), "(hypothesmith from_node) number of unique ast node types"),
+    ]:
+        target(float(value), label=label)
+    return code
+
+
+def compilable(code: str, mode: str = "exec") -> bool:
+    # This is used as a filter on `from_node()`, but note that LibCST aspires to
+    # disallow construction of a CST node which is converted to invalid code.
+    # (that is, if the resulting code would be invalid, raise an error instead)
+    # See also https://github.com/Instagram/LibCST/issues/287
+    try:
+        compile(code, "<string>", mode)
+        return True
+    except (SyntaxError, ValueError):
+        return False
+
+
+def from_node(
+    node: Type[libcst.CSTNode] = libcst.Module, *, auto_target: bool = True
+) -> st.SearchStrategy[str]:
+    """Generate syntactically-valid Python source code for a LibCST node type.
+
+    You can pass any subtype of `libcst.CSTNode`. Alternatively, you can use
+    Hypothesis' built-in `from_type(node_type).map(lambda n: libcst.Module([n]).code)`,
+    after Hypothesmith has registered the required strategies. However, this does
+    not include automatic targeting and limitations of LibCST may lead to invalid
+    code being generated.
+    """
+    assert issubclass(node, libcst.CSTNode)
+    code = st.from_type(node).map(lambda n: libcst.Module([n]).code).filter(compilable)
+    return code.map(record_targets) if auto_target else code
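The `Try` comment in `cst.py` describes a rule of Python's own grammar, so it can be checked with the standard library alone. The sketch below copies the shape of `compilable()` from the diff and applies it to a handful of hand-written `try` statements. The `TRY_FORMS` table is made up for this illustration, and nothing here calls LibCST or Hypothesis.

```python
def compilable(code: str, mode: str = "exec") -> bool:
    """Same shape as the filter in cst.py: compile only, never execute."""
    try:
        compile(code, "<string>", mode)
        return True
    except (SyntaxError, ValueError):
        return False


# Hypothetical examples, one per combination discussed in the comment.
TRY_FORMS = {
    "finally only": "try:\n    pass\nfinally:\n    pass\n",
    "else without except": "try:\n    pass\nelse:\n    pass\n",
    "else and finally without except": (
        "try:\n    pass\nelse:\n    pass\nfinally:\n    pass\n"
    ),
    "except only": "try:\n    pass\nexcept ValueError:\n    pass\n",
    "except, else and finally": (
        "try:\n    pass\nexcept ValueError:\n    pass\n"
        "else:\n    pass\nfinally:\n    pass\n"
    ),
    "bare try": "try:\n    pass\n",
}

for label, source in TRY_FORMS.items():
    print(f"{label}: {compilable(source)}")
```

Only the forms with `finally` alone or with at least one `except` clause compile, which is why the strategy registered for `libcst.Try` is a union of exactly those two shapes.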
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore' old/hypothesmith-0.0.6/src/hypothesmith/syntactic.py
new/hypothesmith-0.1.4/src/hypothesmith/syntactic.py
--- old/hypothesmith-0.0.6/src/hypothesmith/syntactic.py 2020-04-08
05:01:30.000000000 +0200
+++ new/hypothesmith-0.1.4/src/hypothesmith/syntactic.py 2020-08-17
06:18:03.000000000 +0200
@@ -1,13 +1,14 @@
"""Hypothesis strategies for generating Python source code, somewhat like CSmith."""
+import ast
+import dis
import re
import sys
import urllib.request
from functools import lru_cache
from pathlib import Path
-import hypothesis.strategies as st
-from hypothesis import assume
+from hypothesis import assume, strategies as st
from hypothesis.extra.lark import LarkStrategy
from lark import Lark
from lark.indenter import Indenter
@@ -39,12 +40,14 @@
_lead = []
_subs = []
for c in map(chr, range(sys.maxunicode + 1)):
+ if not utf8_encodable(c):
+ continue
if c.isidentifier():
_lead.append(c) # e.g. "a"
if ("_" + c).isidentifier():
_subs.append(c) # e.g. "1"
pattern = "[{}][{}]*".format(re.escape("".join(_lead)), re.escape("".join(_subs)))
- return st.from_regex(pattern, fullmatch=True)
+ return st.from_regex(pattern, fullmatch=True).filter(str.isidentifier)
class PythonIndenter(Indenter):
@@ -69,7 +72,7 @@
class GrammarStrategy(LarkStrategy):
- def __init__(self, grammar: Lark, start: str):
+ def __init__(self, grammar: Lark, start: str, auto_target: bool):
explicit_strategies = {
PythonIndenter.INDENT_type: st.just(" " * PythonIndenter.tab_len),
PythonIndenter.DEDENT_type: st.just(""),
@@ -80,6 +83,24 @@
k: v.map(lambda s: s.replace("\0", "")).filter(utf8_encodable)
for k, v in self.terminal_strategies.items() # type: ignore
}
+        self.auto_target = auto_target and start != "single_input"
+
+    def do_draw(self, data):  # type: ignore
+        result = super().do_draw(data)
+        if self.auto_target:
+            # target larger inputs - the Hypothesis engine will do a multi-objective
+            # hill-climbing search using these scores to generate 'better' examples.
+            nodes = list(ast.walk(ast.parse(result)))
+            uniq_nodes = {type(n) for n in nodes}
+            instructions = list(dis.Bytecode(compile(result, "<string>", "exec")))
+            targets = data.target_observations
+            for value, label in [
+                (instructions, "(hypothesmith) instructions in bytecode"),
+                (nodes, "(hypothesmith) total number of ast nodes"),
+                (uniq_nodes, "(hypothesmith) number of unique ast node types"),
+            ]:
+                targets[label] = max(float(len(value)), targets.get(label, 0.0))
+        return result
def draw_symbol(self, data, symbol, draw_state): # type: ignore
count = len(draw_state.result)
@@ -91,19 +112,45 @@
filename="<string>",
mode=COMPILE_MODES[symbol.name],
)
+ except SystemError as err: # pragma: no cover
+ # Extra output to help track down a possible upstream issue
+ # https://github.com/Zac-HD/stdlib-property-tests/issues/14
+ source_code = "".join(draw_state.result[count:])
+ raise Exception(
+ f"unexpected error while attempting to compile {source_code!r}"
+ f" in mode={COMPILE_MODES[symbol.name]}"
+ ) from err
except SyntaxError:
# Python's grammar doesn't actually fully describe the behaviour of the
# CPython parser and AST-post-processor, so we just filter out errors.
assume(False)
+    def gen_ignore(self, data, draw_state):  # type: ignore
+        # Set a consistent 1/4 chance of generating any ignored tokens (comments,
+        # whitespace, line-continuations) as part of this draw.
+        if data.draw(
+            st.shared(
+                st.sampled_from([False, True, False, False]),
+                key="hypothesmith_gen_ignored",
+            )
+        ):
+            super().gen_ignore(data, draw_state)
+
-def from_grammar(start: str = "file_input") -> st.SearchStrategy[str]:
+def from_grammar(
+    start: str = "file_input", *, auto_target: bool = True
+) -> st.SearchStrategy[str]:
"""Generate syntactically-valid Python source code based on the grammar.
Valid values for ``start`` are ``"single_input"``, ``"file_input"``, or
``"eval_input"``; respectively a single interactive statement, a module or
sequence of commands read from a file, and input for the eval() function.
+ If ``auto_target`` is True, this strategy uses ``hypothesis.target()``
+ internally to drive towards larger and more complex examples. We recommend
+ leaving this enabled, as the grammar is quite complex and only simple examples
+ tend to be generated otherwise.
+
.. warning::
DO NOT EXECUTE CODE GENERATED BY THIS STRATEGY.
@@ -112,5 +159,6 @@
code can be useful, but "arbitrary code execution" can be very, very bad.
"""
assert start in {"single_input", "file_input", "eval_input"}
+ assert isinstance(auto_target, bool)
grammar = Lark(lark_grammar, parser="lalr", postlex=PythonIndenter(), start=start)
- return GrammarStrategy(grammar, start)
+ return GrammarStrategy(grammar, start, auto_target)
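Both `record_targets()` in `cst.py` and `GrammarStrategy.do_draw()` in this file score each generated example by the same three size measures: bytecode instructions, total AST nodes, and distinct AST node types. A stdlib-only sketch of those measures follows. `size_metrics()` and `merge_max()` are hypothetical helper names, and the hand-off to Hypothesis (`hypothesis.target()` or `data.target_observations`) is deliberately left out so the snippet runs without Hypothesis installed.

```python
import ast
import dis
from typing import Dict


def size_metrics(code: str) -> Dict[str, float]:
    """Compute the three size scores used for targeting (compile only)."""
    nodes = list(ast.walk(ast.parse(code)))
    instructions = list(dis.Bytecode(compile(code, "<string>", "exec")))
    return {
        "instructions in bytecode": float(len(instructions)),
        "total number of ast nodes": float(len(nodes)),
        "number of unique ast node types": float(len({type(n) for n in nodes})),
    }


def merge_max(observations: Dict[str, float], new: Dict[str, float]) -> None:
    """Keep the largest score seen per label, as the do_draw() hunk does."""
    for label, value in new.items():
        observations[label] = max(value, observations.get(label, 0.0))


small = size_metrics("pass\n")
large = size_metrics("for i in range(3):\n    x = [i, i + 1]\n")

observations: Dict[str, float] = {}
merge_max(observations, large)
merge_max(observations, small)  # smaller scores do not overwrite larger ones
print(observations == large)
```

Taking the maximum per label means that repeated draws within one example can only raise the recorded score.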
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore'
old/hypothesmith-0.0.6/src/hypothesmith.egg-info/PKG-INFO
new/hypothesmith-0.1.4/src/hypothesmith.egg-info/PKG-INFO
--- old/hypothesmith-0.0.6/src/hypothesmith.egg-info/PKG-INFO 2020-04-08
05:01:38.000000000 +0200
+++ new/hypothesmith-0.1.4/src/hypothesmith.egg-info/PKG-INFO 2020-08-17
06:18:13.000000000 +0200
@@ -1,6 +1,6 @@
Metadata-Version: 2.1
Name: hypothesmith
-Version: 0.0.6
+Version: 0.1.4
Summary: Hypothesis strategies for generating Python programs, something like
CSmith
Home-page: https://github.com/Zac-HD/hypothesmith
Author: Zac Hatfield-Dodds
@@ -19,31 +19,55 @@
You can run the tests, such as they are, with `tox` on Python 3.6 or
later.
Use `tox -va` to see what environments are available.
- ## Changelog
+ ## Usage
+ This package provides two Hypothesis strategies for generating Python
source code.
- ### 0.0.6 - 2020-04-08
- - support for non-ASCII identifiers
+ The generated code will always be syntactically valid, and is useful for testing
+ parsers, linters, auto-formatters, and other tools that operate on
source code.
- ### 0.0.5 - 2019-11-27
- - Updated project metadata and started testing on Python 3.8
+ > DO NOT EXECUTE CODE GENERATED BY THESE STRATEGIES.
+ >
+ > It could do literally anything that running Python code is able to
do,
+ > including changing, deleting, or uploading important data. Arbitrary
+ > code can be useful, but "arbitrary code execution" can be very, very
bad.
+
+ #### `hypothesmith.from_grammar(start="file_input", *,
auto_target=True)`
+
+ Generates syntactically-valid Python source code based on the grammar.
+
+ Valid values for ``start`` are ``"single_input"``, ``"file_input"``, or
+ ``"eval_input"``; respectively a single interactive statement, a
module or
+ sequence of commands read from a file, and input for the eval()
function.
+
+ If ``auto_target`` is ``True``, this strategy uses
``hypothesis.target()``
+ internally to drive towards larger and more complex examples. We
recommend
+ leaving this enabled, as the grammar is quite complex and only simple
examples
+ tend to be generated otherwise.
+
+ #### `hypothesmith.from_node(node=libcst.Module, *, auto_target=True)`
+
+ Generates syntactically-valid Python source code based on the node
types
+ defined by the [`LibCST`](https://libcst.readthedocs.io/en/latest/)
project.
+
+ You can pass any subtype of `libcst.CSTNode`. Alternatively, you can
use
+ Hypothesis' built-in `from_type(node_type).map(lambda n: libcst.Module([n]).code)`,
+ after Hypothesmith has registered the required strategies. However,
this does
+ not include automatic targeting and limitations of LibCST may lead to
invalid
+ code being generated.
+
+ ## Notable bugs found with Hypothesmith
+ - [BPO-40661, a segfault in the new
parser](https://bugs.python.org/issue40661),
+ was given maximum priority and blocked the planned release of
CPython 3.9 beta1.
+ - [BPO-38953](https://bugs.python.org/issue38953) `tokenize` ->
`untokenize` roundtrip bugs.
+ - [`lib2to3` errors on \r in
comment](https://github.com/psf/black/issues/970)
+ - [Black fails on files ending in a
backslash](https://github.com/psf/black/issues/1012)
+ - [At least three round-trip bugs in
LibCST](https://github.com/Instagram/LibCST#acknowledgements)
+ (search commits for "hypothesis")
+ - [Invalid code generated by
LibCST](https://github.com/Instagram/LibCST/issues/287)
- ### 0.0.4 - 2019-09-10
- - Depends on more recent Hypothesis version, with upstreamed grammar
generation.
- - Improved filtering rejects fewer valid examples, finding another bug
in Black.
-
- ### 0.0.3 - 2019-08-08
- Checks validity at statement level, which makes filtering much more
efficient.
- Improved testing, input validation, and code comments.
-
- ### 0.0.2 - 2019-08-07
- Improved filtering and fixing of source code generated from the
grammar.
- This version found a novel bug: `"pass #\\r#\\n"` is accepted by the
- built-in `compile()` and `exec()` functions, but not by `black` or
`lib2to3`.
-
- ### 0.0.1 - 2019-08-06
- Initial release. This is a minimal proof of concept, generating from
the
- grammar and rejecting it if we get errors from `black` or `tokenize`.
- Cool, but while promising not very useful at this stage.
+ ### Changelog
+
+ Patch notes [can be found in
`CHANGELOG.md`](https://github.com/Zac-HD/hypothesmith/blob/master/CHANGELOG.md).
Keywords: python testing fuzzing property-based-testing
Platform: UNKNOWN
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore'
old/hypothesmith-0.0.6/src/hypothesmith.egg-info/SOURCES.txt
new/hypothesmith-0.1.4/src/hypothesmith.egg-info/SOURCES.txt
--- old/hypothesmith-0.0.6/src/hypothesmith.egg-info/SOURCES.txt
2020-04-08 05:01:38.000000000 +0200
+++ new/hypothesmith-0.1.4/src/hypothesmith.egg-info/SOURCES.txt
2020-08-17 06:18:13.000000000 +0200
@@ -1,6 +1,7 @@
README.md
setup.py
src/hypothesmith/__init__.py
+src/hypothesmith/cst.py
src/hypothesmith/py.typed
src/hypothesmith/python3.lark
src/hypothesmith/syntactic.py
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn'
'--exclude=.svnignore'
old/hypothesmith-0.0.6/src/hypothesmith.egg-info/requires.txt
new/hypothesmith-0.1.4/src/hypothesmith.egg-info/requires.txt
--- old/hypothesmith-0.0.6/src/hypothesmith.egg-info/requires.txt
2020-04-08 05:01:38.000000000 +0200
+++ new/hypothesmith-0.1.4/src/hypothesmith.egg-info/requires.txt
2020-08-17 06:18:13.000000000 +0200
@@ -1,2 +1,3 @@
-hypothesis>=4.36.0
+hypothesis>=5.23.7
lark-parser>=0.7.2
+libcst>=0.3.8