commit python-hypothesmith for openSUSE:Factory

root Tue, 25 Aug 2020 03:39:07 -0700

Hello community,

here is the log from the commit of package python-hypothesmith for 
openSUSE:Factory checked in at 2020-08-25 12:38:33
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-hypothesmith (Old)
 and      /work/SRC/openSUSE:Factory/.python-hypothesmith.new.3399 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Package is "python-hypothesmith"

Tue Aug 25 12:38:33 2020 rev:2 rq:828110 version:0.1.4

Changes:
--------
--- /work/SRC/openSUSE:Factory/python-hypothesmith/python-hypothesmith.changes  2020-04-16 23:05:18.031784123 +0200
+++ /work/SRC/openSUSE:Factory/.python-hypothesmith.new.3399/python-hypothesmith.changes        2020-08-25 12:38:41.313413578 +0200
@@ -1,0 +2,42 @@
+Wed Aug 19 15:46:37 UTC 2020 - Benjamin Greiner <[email protected]>
+
+- Update to 0.1.4
+  * Improve handling of identifiers
+  * Fix internal error in `from_grammar("single_input") 
+- do not install myself on multibuild test flavor
+
+-------------------------------------------------------------------
+Sat Aug  8 18:44:29 UTC 2020 - Benjamin Greiner <[email protected]>
+
+- Use github repository download for LICENSE, CHANGELOG.md (needed
+  by tests) and test directory gh#Zac-HD/hypothesmith#5
+- run tests in multibuild flavor, they are quite time-consuming and
+  the test requirements create dependency loops
+- filter empty types file python-hypothesmith-rpmlintrc
+
+-------------------------------------------------------------------
+Thu Aug  6 13:31:26 UTC 2020 - Benjamin Greiner <[email protected]>
+
+- Update to 0.1.3
+  * Update to latest versions of LibCST and Hypothesis, for Python
+    3.9 support
+- 0.1.2 - 2020-05-17
+  * Emit more debug info to diagnose a compile() issue in CPython 
+    nightly
+- 0.1.1 - 2020-05-17
+  * Emit some debug info to help diagnose a possible upstream bug
+    in CPython nightly
+- 0.1.0 - 2020-04-24
+  * Added auto_target=True argument to the from_node() strategy.
+  * Improved from_node() generation of comments and trailing
+    whitespace.
+- 0.0.8 - 2020-04-23
+  * Added a from_node() strategy which uses LibCST to generate
+    source code. This is a proof-of-concept rather than a robust
+    tool, but IMO it's a pretty cool concept.
+- 0.0.7 - 2020-04-19
+  * The from_grammar() strategy now takes an auto_target=True
+    argument, to drive generated examples towards (relatively)
+    larger and more complex programs.
+
+-------------------------------------------------------------------

Old:
----
  LICENSE
  hypothesmith-0.0.6.tar.gz

New:
----
  _multibuild
  hypothesmith-0.1.4.tar.gz
  hypothesmith-gh-0.1.4.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-hypothesmith.spec ++++++
--- /var/tmp/diff_new_pack.d59m6g/_old  2020-08-25 12:38:42.737413847 +0200
+++ /var/tmp/diff_new_pack.d59m6g/_new  2020-08-25 12:38:42.737413847 +0200
@@ -18,23 +18,42 @@
 
 %{?!python_module:%define python_module() python-%{**} python3-%{**}}
 %define skip_python2 1
-Name:           python-hypothesmith
-Version:        0.0.6
+# no release tags in repository, but we need LICENSE and tests not
+# packaged in PyPI source https://github.com/Zac-HD/hypothesmith/issues/5
+%define commithash 6124cd71317add93500e0cb04c98cf5606adedea
+%global flavor @BUILD_FLAVOR@%{nil}
+%if "%{flavor}" == "test"
+%define psuffix -test
+%bcond_without test
+%else
+%define psuffix %{nil}
+%bcond_with test
+%endif
+Name:           python-hypothesmith%{psuffix}
+Version:        0.1.4
 Release:        0
 Summary:        Hypothesis strategies for generating Python programs, something like CSmith
 License:        MPL-2.0
 URL:            https://github.com/Zac-HD/hypothesmith
 Source:         https://files.pythonhosted.org/packages/source/h/hypothesmith/hypothesmith-%{version}.tar.gz
-# https://github.com/Zac-HD/hypothesmith/issues/5
-Source1:        https://raw.githubusercontent.com/Zac-HD/hypothesmith/master/LICENSE
+Source1:        https://github.com/Zac-HD/hypothesmith/archive/%{commithash}.tar.gz#/hypothesmith-gh-%{version}.tar.gz
 BuildRequires:  %{python_module base >= 3.6}
-BuildRequires:  %{python_module hypothesis >= 4.36.0}
-BuildRequires:  %{python_module lark-parser >= 0.7.2}
 BuildRequires:  %{python_module setuptools}
 BuildRequires:  fdupes
 BuildRequires:  python-rpm-macros
-Requires:       python-hypothesis >= 4.36.0
+Requires:       python-base >= 3.6
+Requires:       python-hypothesis >= 5.23.7
 Requires:       python-lark-parser >= 0.7.2
+Requires:       python-libcst >= 0.3.8
+%if %{with test}
+BuildRequires:  %{python_module black}
+BuildRequires:  %{python_module hypothesis >= 5.23.7}
+BuildRequires:  %{python_module lark-parser >= 0.7.2}
+BuildRequires:  %{python_module libcst >= 0.3.8}
+BuildRequires:  %{python_module parso}
+BuildRequires:  %{python_module pytest-xdist}
+BuildRequires:  %{python_module pytest}
+%endif
 BuildArch:      noarch
 %python_subpackages
 
@@ -42,22 +61,33 @@
 Hypothesis strategies for generating Python programs, something like CSmith.
 
 %prep
-%setup -q -n hypothesmith-%{version}
-cp %{SOURCE1} .
+%setup -q -n hypothesmith-%{version} -b 1
+cp -r ../hypothesmith-%{commithash}/{LICENSE,CHANGELOG.md,tests} .
 
 %build
+%if !%{with test}
 %python_build
+%endif
 
 %install
+%if !%{with test}
 %python_install
 %python_expand %fdupes %{buildroot}%{$python_sitelib}
+%endif
 
 %check
-# https://github.com/Zac-HD/hypothesmith/issues/5
+%if %{with test}
+# multibuild: test the source dir, nothing is installed
+export PYTHONPATH=$(pwd)/src
+%pytest -n auto
+%endif
 
+%if !%{with test}
 %files %{python_files}
-%doc README.md
+%doc README.md CHANGELOG.md
 %license LICENSE
-%{python_sitelib}/*
+%{python_sitelib}/hypothesmith
+%{python_sitelib}/hypothesmith-%{version}-py*.egg-info
+%endif
 
 %changelog

++++++ _multibuild ++++++
<multibuild>
  <package>test</package>
</multibuild>
++++++ hypothesmith-0.0.6.tar.gz -> hypothesmith-0.1.4.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/hypothesmith-0.0.6/PKG-INFO new/hypothesmith-0.1.4/PKG-INFO
--- old/hypothesmith-0.0.6/PKG-INFO     2020-04-08 05:01:38.000000000 +0200
+++ new/hypothesmith-0.1.4/PKG-INFO     2020-08-17 06:18:13.808235200 +0200
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: hypothesmith
-Version: 0.0.6
+Version: 0.1.4
 Summary: Hypothesis strategies for generating Python programs, something like CSmith
 Home-page: https://github.com/Zac-HD/hypothesmith
 Author: Zac Hatfield-Dodds
@@ -19,31 +19,55 @@
         You can run the tests, such as they are, with `tox` on Python 3.6 or later.
         Use `tox -va` to see what environments are available.
         
-        ## Changelog
+        ## Usage
+        This package provides two Hypothesis strategies for generating Python source code.
         
-        ### 0.0.6 - 2020-04-08
-        - support for non-ASCII identifiers
+        The generated code will always be syntatically valid, and is useful for testing
+        parsers, linters, auto-formatters, and other tools that operate on source code.
         
-        ### 0.0.5 - 2019-11-27
-        - Updated project metadata and started testing on Python 3.8
+        > DO NOT EXECUTE CODE GENERATED BY THESE STRATEGIES.
+        >
+        > It could do literally anything that running Python code is able to do,
+        > including changing, deleting, or uploading important data.  Arbitrary
+        > code can be useful, but "arbitrary code execution" can be very, very bad.
+        
+        #### `hypothesmith.from_grammar(start="file_input", *, auto_target=True)`
+        
+        Generates syntactically-valid Python source code based on the grammar.
+        
+        Valid values for ``start`` are ``"single_input"``, ``"file_input"``, or
+        ``"eval_input"``; respectively a single interactive statement, a module or
+        sequence of commands read from a file, and input for the eval() function.
+        
+        If ``auto_target`` is ``True``, this strategy uses ``hypothesis.target()``
+        internally to drive towards larger and more complex examples.  We recommend
+        leaving this enabled, as the grammar is quite complex and only simple examples
+        tend to be generated otherwise.
+        
+        #### `hypothesmith.from_node(node=libcst.Module, *, auto_target=True)`
+        
+        Generates syntactically-valid Python source code based on the node types
+        defined by the [`LibCST`](https://libcst.readthedocs.io/en/latest/) project.
+        
+        You can pass any subtype of `libcst.CSTNode`.  Alternatively, you can use
+        Hypothesis' built-in `from_type(node_type).map(lambda n: libcst.Module([n]).code`,
+        after Hypothesmith has registered the required strategies.  However, this does
+        not include automatic targeting and limitations of LibCST may lead to invalid
+        code being generated.
+        
+        ## Notable bugs found with Hypothesmith
+        - [BPO-40661, a segfault in the new parser](https://bugs.python.org/issue40661),
+          was given maximum priority and blocked the planned release of CPython 3.9 beta1.
+        - [BPO-38953](https://bugs.python.org/issue38953) `tokenize` -> `untokenize` roundtrip bugs.
+        - [`lib2to3` errors on \r in comment](https://github.com/psf/black/issues/970)
+        - [Black fails on files ending in a backslash](https://github.com/psf/black/issues/1012)
+        - [At least three round-trip bugs in LibCST](https://github.com/Instagram/LibCST#acknowledgements)
+          (search commits for "hypothesis")
+        - [Invalid code generated by LibCST](https://github.com/Instagram/LibCST/issues/287)
         
-        ### 0.0.4 - 2019-09-10
-        - Depends on more recent Hypothesis version, with upstreamed grammar generation.
-        - Improved filtering rejects fewer valid examples, finding another bug in Black.
-        
-        ### 0.0.3 - 2019-08-08
-        Checks validity at statement level, which makes filtering much more efficient.
-        Improved testing, input validation, and code comments.
-        
-        ### 0.0.2 - 2019-08-07
-        Improved filtering and fixing of source code generated from the grammar.
-        This version found a novel bug: `"pass #\\r#\\n"` is accepted by the
-        built-in `compile()` and `exec()` functions, but not by `black` or `lib2to3`.
-        
-        ### 0.0.1 - 2019-08-06
-        Initial release.  This is a minimal proof of concept, generating from the
-        grammar and rejecting it if we get errors from `black` or `tokenize`.
-        Cool, but while promising not very useful at this stage.
+        ### Changelog
+        
+        Patch notes [can be found in `CHANGELOG.md`](https://github.com/Zac-HD/hypothesmith/blob/master/CHANGELOG.md).
         
 Keywords: python testing fuzzing property-based-testing
 Platform: UNKNOWN
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/hypothesmith-0.0.6/README.md new/hypothesmith-0.1.4/README.md
--- old/hypothesmith-0.0.6/README.md    2020-04-08 05:01:30.000000000 +0200
+++ new/hypothesmith-0.1.4/README.md    2020-08-17 06:18:03.000000000 +0200
@@ -10,28 +10,52 @@
 You can run the tests, such as they are, with `tox` on Python 3.6 or later.
 Use `tox -va` to see what environments are available.
 
-## Changelog
+## Usage
+This package provides two Hypothesis strategies for generating Python source code.
 
-### 0.0.6 - 2020-04-08
-- support for non-ASCII identifiers
+The generated code will always be syntatically valid, and is useful for testing
+parsers, linters, auto-formatters, and other tools that operate on source code.
 
-### 0.0.5 - 2019-11-27
-- Updated project metadata and started testing on Python 3.8
+> DO NOT EXECUTE CODE GENERATED BY THESE STRATEGIES.
+>
+> It could do literally anything that running Python code is able to do,
+> including changing, deleting, or uploading important data.  Arbitrary
+> code can be useful, but "arbitrary code execution" can be very, very bad.
+
+#### `hypothesmith.from_grammar(start="file_input", *, auto_target=True)`
+
+Generates syntactically-valid Python source code based on the grammar.
+
+Valid values for ``start`` are ``"single_input"``, ``"file_input"``, or
+``"eval_input"``; respectively a single interactive statement, a module or
+sequence of commands read from a file, and input for the eval() function.
+
+If ``auto_target`` is ``True``, this strategy uses ``hypothesis.target()``
+internally to drive towards larger and more complex examples.  We recommend
+leaving this enabled, as the grammar is quite complex and only simple examples
+tend to be generated otherwise.
+
+#### `hypothesmith.from_node(node=libcst.Module, *, auto_target=True)`
+
+Generates syntactically-valid Python source code based on the node types
+defined by the [`LibCST`](https://libcst.readthedocs.io/en/latest/) project.
+
+You can pass any subtype of `libcst.CSTNode`.  Alternatively, you can use
+Hypothesis' built-in `from_type(node_type).map(lambda n: libcst.Module([n]).code`,
+after Hypothesmith has registered the required strategies.  However, this does
+not include automatic targeting and limitations of LibCST may lead to invalid
+code being generated.
+
+## Notable bugs found with Hypothesmith
+- [BPO-40661, a segfault in the new parser](https://bugs.python.org/issue40661),
+  was given maximum priority and blocked the planned release of CPython 3.9 beta1.
+- [BPO-38953](https://bugs.python.org/issue38953) `tokenize` -> `untokenize` roundtrip bugs.
+- [`lib2to3` errors on \r in comment](https://github.com/psf/black/issues/970)
+- [Black fails on files ending in a backslash](https://github.com/psf/black/issues/1012)
+- [At least three round-trip bugs in LibCST](https://github.com/Instagram/LibCST#acknowledgements)
+  (search commits for "hypothesis")
+- [Invalid code generated by LibCST](https://github.com/Instagram/LibCST/issues/287)
 
-### 0.0.4 - 2019-09-10
-- Depends on more recent Hypothesis version, with upstreamed grammar generation.
-- Improved filtering rejects fewer valid examples, finding another bug in Black.
-
-### 0.0.3 - 2019-08-08
-Checks validity at statement level, which makes filtering much more efficient.
-Improved testing, input validation, and code comments.
-
-### 0.0.2 - 2019-08-07
-Improved filtering and fixing of source code generated from the grammar.
-This version found a novel bug: `"pass #\\r#\\n"` is accepted by the
-built-in `compile()` and `exec()` functions, but not by `black` or `lib2to3`.
-
-### 0.0.1 - 2019-08-06
-Initial release.  This is a minimal proof of concept, generating from the
-grammar and rejecting it if we get errors from `black` or `tokenize`.
-Cool, but while promising not very useful at this stage.
+### Changelog
+
+Patch notes [can be found in `CHANGELOG.md`](https://github.com/Zac-HD/hypothesmith/blob/master/CHANGELOG.md).
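
The README text above says that the three `start` values correspond to a single interactive statement, a module, and input for `eval()`. As a rough illustration using only the standard library, the sketch below pairs each start symbol with a `compile()` mode. The mapping is an assumption: the diff refers to a `COMPILE_MODES` table but does not show its contents, and `ASSUMED_COMPILE_MODES` and `accepts` are hypothetical names. In keeping with the README warning, the source is only compiled, never executed.

```python
# Assumed mapping from grammar start symbols to compile() modes.
# The diff references COMPILE_MODES without showing it, so this is a guess.
ASSUMED_COMPILE_MODES = {
    "single_input": "single",
    "file_input": "exec",
    "eval_input": "eval",
}


def accepts(source: str, start: str) -> bool:
    """Return True if `source` compiles under the mode assumed for `start`.

    The source is compiled only; it is never executed.
    """
    try:
        compile(source, "<string>", ASSUMED_COMPILE_MODES[start])
        return True
    except (SyntaxError, ValueError):
        return False


module_source = "x = 1\ny = x + 1\n"
expression_source = "1 + 2"

print(accepts(module_source, "file_input"))      # module-level statements
print(accepts(expression_source, "eval_input"))  # a bare expression
print(accepts(module_source, "eval_input"))      # statements are not expressions
```

An assignment is rejected in `eval` mode because that mode accepts only a single expression, which is why the start symbol matters when choosing what to feed a tool under test.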
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/hypothesmith-0.0.6/setup.py new/hypothesmith-0.1.4/setup.py
--- old/hypothesmith-0.0.6/setup.py     2020-04-08 05:01:30.000000000 +0200
+++ new/hypothesmith-0.1.4/setup.py     2020-08-17 06:18:03.000000000 +0200
@@ -1,4 +1,4 @@
-"""It's a setup.py"""
+"""Packaging config for Hypothesmith."""
 
 import os
 
@@ -32,7 +32,7 @@
     license="MPL 2.0",
     description="Hypothesis strategies for generating Python programs, something like CSmith",
     zip_safe=False,
-    install_requires=["hypothesis>=4.36.0", "lark-parser>=0.7.2"],
+    install_requires=["hypothesis>=5.23.7", "lark-parser>=0.7.2", "libcst>=0.3.8"],
     python_requires=">=3.6",
     classifiers=[
         "Development Status :: 2 - Pre-Alpha",
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/hypothesmith-0.0.6/src/hypothesmith/__init__.py new/hypothesmith-0.1.4/src/hypothesmith/__init__.py
--- old/hypothesmith-0.0.6/src/hypothesmith/__init__.py 2020-04-08 05:01:30.000000000 +0200
+++ new/hypothesmith-0.1.4/src/hypothesmith/__init__.py 2020-08-17 06:18:03.000000000 +0200
@@ -1,6 +1,7 @@
 """Hypothesis strategies for generating Python source code, somewhat like CSmith."""
 
+from hypothesmith.cst import from_node
 from hypothesmith.syntactic import from_grammar
 
-__version__ = "0.0.6"
-__all__ = ["from_grammar"]
+__version__ = "0.1.4"
+__all__ = ["from_grammar", "from_node"]
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/hypothesmith-0.0.6/src/hypothesmith/cst.py new/hypothesmith-0.1.4/src/hypothesmith/cst.py
--- old/hypothesmith-0.0.6/src/hypothesmith/cst.py      1970-01-01 01:00:00.000000000 +0100
+++ new/hypothesmith-0.1.4/src/hypothesmith/cst.py      2020-08-17 06:18:03.000000000 +0200
@@ -0,0 +1,161 @@
+"""
+Generating Python source code from a syntax tree.
+
+Thanks to Instagram for open-sourcing libCST (which is great!) and
+thanks to Tolkein for the name of this module.
+"""
+
+import ast
+import dis
+from inspect import getfullargspec
+from tokenize import (
+    Floatnumber as FLOATNUMBER_RE,
+    Imagnumber as IMAGNUMBER_RE,
+    Intnumber as INTNUMBER_RE,
+)
+from typing import Type
+
+import libcst
+from hypothesis import infer, strategies as st, target
+
+from hypothesmith.syntactic import identifiers
+
+# For some nodes, we just need to ensure that they use the appropriate regex
+# pattern instead of allowing literally any string.
+for node_type, pattern in {
+    libcst.Float: FLOATNUMBER_RE,
+    libcst.Integer: INTNUMBER_RE,
+    libcst.Imaginary: IMAGNUMBER_RE,
+    libcst.SimpleWhitespace: libcst._nodes.whitespace.SIMPLE_WHITESPACE_RE,
+}.items():
+    _strategy = st.builds(node_type, st.from_regex(pattern, fullmatch=True))
+    st.register_type_strategy(node_type, _strategy)
+
+# type-ignore comments are special in the 3.8+ (typed) ast, so boost their chances)
+_comments = st.from_regex(libcst._nodes.whitespace.COMMENT_RE, fullmatch=True)
+st.register_type_strategy(
+    libcst.Comment, st.builds(libcst.Comment, _comments | st.just("# type: ignore")),
+)
+
+# `from_type()` has less laziness than other strategies, we we register for these
+# foundational node types *before* referring to them in other strategies.
+st.register_type_strategy(libcst.Name, st.builds(libcst.Name, identifiers()))
+st.register_type_strategy(
+    libcst.SimpleString, st.builds(libcst.SimpleString, st.text().map(repr))
+)
+
+# Ensure that ImportAlias uses Attribute nodes composed only of Name nodes.
+names = st.from_type(libcst.Name)
+name_only_attributes = st.one_of(
+    names,
+    st.builds(libcst.Attribute, names, names),
+    st.builds(libcst.Attribute, st.builds(libcst.Attribute, names, names), names),
+)
+st.register_type_strategy(
+    libcst.ImportAlias, st.builds(libcst.ImportAlias, name_only_attributes)
+)
+
+
+def nonempty_seq(*node: Type[libcst.CSTNode]) -> st.SearchStrategy:
+    return st.lists(st.one_of(*map(st.from_type, node)), min_size=1)
+
+
+# There are around 150 concrete types of CST nodes.  Delightfully, libCST uses
+# dataclasses for all these classes, so we can allow the `builds` & `from_type`
+# inference to provide most of our arguments for us.
+# However, in some cases we want to either restrict arguments (e.g. libcst.Name),
+# or supply something nastier than the default argument (e.g. libcst.SimpleWhitespace)
+REGISTERED = (
+    [libcst.AsName, st.from_type(libcst.Name)],
+    [libcst.Assign, nonempty_seq(libcst.AssignTarget)],
+    [libcst.Comparison, infer, nonempty_seq(libcst.ComparisonTarget)],
+    [libcst.Decorator, st.from_type(libcst.Name) | st.from_type(libcst.Attribute)],
+    [libcst.EmptyLine, infer, infer, infer],
+    [libcst.Global, nonempty_seq(libcst.NameItem)],
+    [libcst.Import, nonempty_seq(libcst.ImportAlias)],
+    [
+        libcst.ImportFrom,
+        st.from_type(libcst.Name) | st.from_type(libcst.Attribute),
+        nonempty_seq(libcst.ImportAlias),
+    ],
+    [libcst.NamedExpr, st.from_type(libcst.Name)],
+    [libcst.Nonlocal, nonempty_seq(libcst.NameItem)],
+    [libcst.Set, nonempty_seq(libcst.Element, libcst.StarredElement)],
+    [libcst.Subscript, infer, nonempty_seq(libcst.SubscriptElement)],
+    [libcst.TrailingWhitespace, infer, infer],
+    [libcst.With, nonempty_seq(libcst.WithItem)],
+)
+
+
+# This is where the magic happens: teach `st.from_type` to generate each node type
+for node_type, *strats in REGISTERED:
+    # TODO: once everything else is working, come back here and use `infer` for
+    # all arguments without an explicit strategy - inference is more "interesting"
+    # than just using the default argument... in the proverbial sense.
+    # Mostly this will consist of ensuring that parens remain balanced.
+    args = [name for name in getfullargspec(node_type).args if name != "self"]
+    kwargs = dict(zip(args, strats))
+    st.register_type_strategy(node_type, st.builds(node_type, **kwargs))
+
+# We have special handling for `Try` nodes, because there are two options.
+# If a Try node has no `except` clause, it *must* have a `finally` clause and
+# *must not* have an `else` clause.  With one or more except clauses, it may
+# have an else and/or a finally, or neither.
+st.register_type_strategy(
+    libcst.Try,
+    st.builds(libcst.Try, finalbody=st.from_type(libcst.Finally))
+    | st.builds(
+        libcst.Try,
+        body=infer,
+        handlers=st.lists(
+            st.from_type(libcst.ExceptHandler),
+            min_size=1,
+            unique_by=lambda caught: caught.type,
+        ),
+        orelse=infer,
+        finalbody=infer,
+    ),
+)
+
+
+def record_targets(code: str) -> str:
+    # target larger inputs - the Hypothesis engine will do a multi-objective
+    # hill-climbing search using these scores to generate 'better' examples.
+    nodes = list(ast.walk(ast.parse(code)))
+    uniq_nodes = {type(n) for n in nodes}
+    instructions = list(dis.Bytecode(compile(code, "<string>", "exec")))
+    for value, label in [
+        (len(instructions), "(hypothesmith from_node) instructions in bytecode"),
+        (len(nodes), "(hypothesmith from_node) total number of ast nodes"),
+        (len(uniq_nodes), "(hypothesmith from_node) number of unique ast node types"),
+    ]:
+        target(float(value), label=label)
+    return code
+
+
+def compilable(code: str, mode: str = "exec") -> bool:
+    # This is used as a filter on `from_node()`, but note that LibCST aspires to
+    # disallow construction of a CST node which is converted to invalid code.
+    # (that is, if the resulting code would be invalid, raise an error instead)
+    # See also https://github.com/Instagram/LibCST/issues/287
+    try:
+        compile(code, "<string>", mode)
+        return True
+    except (SyntaxError, ValueError):
+        return False
+
+
+def from_node(
+    node: Type[libcst.CSTNode] = libcst.Module, *, auto_target: bool = True
+) -> st.SearchStrategy[str]:
+    """Generate syntactically-valid Python source code for a LibCST node type.
+
+    You can pass any subtype of `libcst.CSTNode`.  Alternatively, you can use
+    Hypothesis' built-in `from_type(node_type).map(lambda n: libcst.Module([n]).code`,
+    after Hypothesmith has registered the required strategies.  However, this does
+    not include automatic targeting and limitations of LibCST may lead to invalid
+    code being generated.
+    """
+    assert issubclass(node, libcst.CSTNode)
+    code = st.from_type(node).map(lambda n: libcst.Module([n]).code).filter(compilable)
+    return code.map(record_targets) if auto_target else code
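
To make the `compilable()` filter and the `record_targets()` scores in the new cst.py more concrete, here is a minimal standard-library sketch. It does not use Hypothesis or LibCST. `size_metrics` is a hypothetical helper that computes the same three quantities the diff passes to `target()`, and the two sample programs are illustrative only. As in the original, code is compiled but never executed.

```python
import ast
import dis


def compilable(code: str, mode: str = "exec") -> bool:
    """Mirror the filter from the diff: try to compile, never execute."""
    try:
        compile(code, "<string>", mode)
        return True
    except (SyntaxError, ValueError):
        return False


def size_metrics(code: str) -> dict:
    """Hypothetical helper: the three scores that record_targets() reports."""
    nodes = list(ast.walk(ast.parse(code)))
    instructions = list(dis.Bytecode(compile(code, "<string>", "exec")))
    return {
        "instructions": len(instructions),
        "ast_nodes": len(nodes),
        "unique_node_types": len({type(n) for n in nodes}),
    }


small = "pass\n"
larger = "for i in range(3):\n    total = i * 2\n"

# Both samples compile; a `try` with neither `except` nor `finally` does not,
# which is the constraint the special-cased Try strategy works around.
print(compilable(small), compilable(larger), compilable("try:\n    pass\n"))

small_scores = size_metrics(small)
larger_scores = size_metrics(larger)
print(small_scores)
print(larger_scores)
```

The larger program scores higher on all three metrics, which is the signal the targeting is meant to climb when `auto_target` is enabled.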
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/hypothesmith-0.0.6/src/hypothesmith/syntactic.py new/hypothesmith-0.1.4/src/hypothesmith/syntactic.py
--- old/hypothesmith-0.0.6/src/hypothesmith/syntactic.py        2020-04-08 05:01:30.000000000 +0200
+++ new/hypothesmith-0.1.4/src/hypothesmith/syntactic.py        2020-08-17 06:18:03.000000000 +0200
@@ -1,13 +1,14 @@
 """Hypothesis strategies for generating Python source code, somewhat like CSmith."""
 
+import ast
+import dis
 import re
 import sys
 import urllib.request
 from functools import lru_cache
 from pathlib import Path
 
-import hypothesis.strategies as st
-from hypothesis import assume
+from hypothesis import assume, strategies as st
 from hypothesis.extra.lark import LarkStrategy
 from lark import Lark
 from lark.indenter import Indenter
@@ -39,12 +40,14 @@
     _lead = []
     _subs = []
     for c in map(chr, range(sys.maxunicode + 1)):
+        if not utf8_encodable(c):
+            continue
         if c.isidentifier():
             _lead.append(c)  # e.g. "a"
         if ("_" + c).isidentifier():
             _subs.append(c)  # e.g. "1"
     pattern = "[{}][{}]*".format(re.escape("".join(_lead)), re.escape("".join(_subs)))
-    return st.from_regex(pattern, fullmatch=True)
+    return st.from_regex(pattern, fullmatch=True).filter(str.isidentifier)
 
 
 class PythonIndenter(Indenter):
@@ -69,7 +72,7 @@
 
 
 class GrammarStrategy(LarkStrategy):
-    def __init__(self, grammar: Lark, start: str):
+    def __init__(self, grammar: Lark, start: str, auto_target: bool):
         explicit_strategies = {
             PythonIndenter.INDENT_type: st.just(" " * PythonIndenter.tab_len),
             PythonIndenter.DEDENT_type: st.just(""),
@@ -80,6 +83,24 @@
             k: v.map(lambda s: s.replace("\0", "")).filter(utf8_encodable)
             for k, v in self.terminal_strategies.items()  # type: ignore
         }
+        self.auto_target = auto_target and start != "single_input"
+
+    def do_draw(self, data):  # type: ignore
+        result = super().do_draw(data)
+        if self.auto_target:
+            # target larger inputs - the Hypothesis engine will do a multi-objective
+            # hill-climbing search using these scores to generate 'better' examples.
+            nodes = list(ast.walk(ast.parse(result)))
+            uniq_nodes = {type(n) for n in nodes}
+            instructions = list(dis.Bytecode(compile(result, "<string>", "exec")))
+            targets = data.target_observations
+            for value, label in [
+                (instructions, "(hypothesmith) instructions in bytecode"),
+                (nodes, "(hypothesmith) total number of ast nodes"),
+                (uniq_nodes, "(hypothesmith) number of unique ast node types"),
+            ]:
+                targets[label] = max(float(len(value)), targets.get(label, 0.0))
+        return result
 
     def draw_symbol(self, data, symbol, draw_state):  # type: ignore
         count = len(draw_state.result)
@@ -91,19 +112,45 @@
                     filename="<string>",
                     mode=COMPILE_MODES[symbol.name],
                 )
+            except SystemError as err:  # pragma: no cover
+                # Extra output to help track down a possible upstream issue
+                # https://github.com/Zac-HD/stdlib-property-tests/issues/14
+                source_code = "".join(draw_state.result[count:])
+                raise Exception(
+                    f"unexpected error while attempting to compile {source_code!r}"
+                    f" in mode={COMPILE_MODES[symbol.name]}"
+                ) from err
             except SyntaxError:
                 # Python's grammar doesn't actually fully describe the behaviour of the
                 # CPython parser and AST-post-processor, so we just filter out errors.
                 assume(False)
 
+    def gen_ignore(self, data, draw_state):  # type: ignore
+        # Set a consistent 1/4 chance of generating any ignored tokens (comments,
+        # whitespace, line-continuations) as part of this draw.
+        if data.draw(
+            st.shared(
+                st.sampled_from([False, True, False, False]),
+                key="hypothesmith_gen_ignored",
+            )
+        ):
+            super().gen_ignore(data, draw_state)
+
 
-def from_grammar(start: str = "file_input") -> st.SearchStrategy[str]:
+def from_grammar(
+    start: str = "file_input", *, auto_target: bool = True
+) -> st.SearchStrategy[str]:
     """Generate syntactically-valid Python source code based on the grammar.
 
     Valid values for ``start`` are ``"single_input"``, ``"file_input"``, or
     ``"eval_input"``; respectively a single interactive statement, a module or
     sequence of commands read from a file, and input for the eval() function.
 
+    If ``auto_target`` is True, this strategy uses ``hypothesis.target()``
+    internally to drive towards larger and more complex examples.  We recommend
+    leaving this enabled, as the grammar is quite complex and only simple examples
+    tend to be generated otherwise.
+
     .. warning::
         DO NOT EXECUTE CODE GENERATED BY THIS STRATEGY.
 
@@ -112,5 +159,6 @@
         code can be useful, but "arbitrary code execution" can be very, very bad.
     """
     assert start in {"single_input", "file_input", "eval_input"}
+    assert isinstance(auto_target, bool)
     grammar = Lark(lark_grammar, parser="lalr", postlex=PythonIndenter(), start=start)
-    return GrammarStrategy(grammar, start)
+    return GrammarStrategy(grammar, start, auto_target)
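
The `identifiers()` hunk in syntactic.py above sorts characters into those that may start an identifier and those that may only continue one, then re-checks each generated string with `str.isidentifier`. The sketch below reproduces just that classification step over ASCII, so it runs instantly. `classify` is a hypothetical name, and restricting the range to ASCII is a simplification: the original walks every code point up to `sys.maxunicode` and also skips characters that fail its `utf8_encodable` check.

```python
def classify(chars):
    """Split characters into identifier-start and identifier-continue lists."""
    lead, subs = [], []
    for c in chars:
        if c.isidentifier():
            lead.append(c)  # may start an identifier, e.g. "a" or "_"
        if ("_" + c).isidentifier():
            subs.append(c)  # may follow the first character, e.g. "1"
    return lead, subs


# ASCII only, to keep the sketch fast; the original covers all of Unicode.
lead, subs = classify(map(chr, range(128)))

print(len(lead), len(subs))
print("1" in lead, "1" in subs)
```

Digits appear only in the second list, so a regex of the form `[lead][subs]*` cannot produce a name such as `1abc`. The trailing `.filter(str.isidentifier)` added in this release then validates each whole string rather than trusting the per-character check alone.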
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/hypothesmith-0.0.6/src/hypothesmith.egg-info/PKG-INFO new/hypothesmith-0.1.4/src/hypothesmith.egg-info/PKG-INFO
--- old/hypothesmith-0.0.6/src/hypothesmith.egg-info/PKG-INFO   2020-04-08 05:01:38.000000000 +0200
+++ new/hypothesmith-0.1.4/src/hypothesmith.egg-info/PKG-INFO   2020-08-17 06:18:13.000000000 +0200
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: hypothesmith
-Version: 0.0.6
+Version: 0.1.4
 Summary: Hypothesis strategies for generating Python programs, something like CSmith
 Home-page: https://github.com/Zac-HD/hypothesmith
 Author: Zac Hatfield-Dodds
@@ -19,31 +19,55 @@
         You can run the tests, such as they are, with `tox` on Python 3.6 or later.
         Use `tox -va` to see what environments are available.
         
-        ## Changelog
+        ## Usage
+        This package provides two Hypothesis strategies for generating Python source code.
         
-        ### 0.0.6 - 2020-04-08
-        - support for non-ASCII identifiers
+        The generated code will always be syntatically valid, and is useful for testing
+        parsers, linters, auto-formatters, and other tools that operate on source code.
         
-        ### 0.0.5 - 2019-11-27
-        - Updated project metadata and started testing on Python 3.8
+        > DO NOT EXECUTE CODE GENERATED BY THESE STRATEGIES.
+        >
+        > It could do literally anything that running Python code is able to do,
+        > including changing, deleting, or uploading important data.  Arbitrary
+        > code can be useful, but "arbitrary code execution" can be very, very bad.
+        
+        #### `hypothesmith.from_grammar(start="file_input", *, auto_target=True)`
+        
+        Generates syntactically-valid Python source code based on the grammar.
+        
+        Valid values for ``start`` are ``"single_input"``, ``"file_input"``, or
+        ``"eval_input"``; respectively a single interactive statement, a module or
+        sequence of commands read from a file, and input for the eval() function.
+        
+        If ``auto_target`` is ``True``, this strategy uses ``hypothesis.target()``
+        internally to drive towards larger and more complex examples.  We recommend
+        leaving this enabled, as the grammar is quite complex and only simple examples
+        tend to be generated otherwise.
+        
+        #### `hypothesmith.from_node(node=libcst.Module, *, auto_target=True)`
+        
+        Generates syntactically-valid Python source code based on the node types
+        defined by the [`LibCST`](https://libcst.readthedocs.io/en/latest/) project.
+        
+        You can pass any subtype of `libcst.CSTNode`.  Alternatively, you can use
+        Hypothesis' built-in `from_type(node_type).map(lambda n: libcst.Module([n]).code`,
+        after Hypothesmith has registered the required strategies.  However, this does
+        not include automatic targeting and limitations of LibCST may lead to invalid
+        code being generated.
+        
+        ## Notable bugs found with Hypothesmith
+        - [BPO-40661, a segfault in the new parser](https://bugs.python.org/issue40661),
+          was given maximum priority and blocked the planned release of CPython 3.9 beta1.
+        - [BPO-38953](https://bugs.python.org/issue38953) `tokenize` -> `untokenize` roundtrip bugs.
+        - [`lib2to3` errors on \r in comment](https://github.com/psf/black/issues/970)
+        - [Black fails on files ending in a backslash](https://github.com/psf/black/issues/1012)
+        - [At least three round-trip bugs in LibCST](https://github.com/Instagram/LibCST#acknowledgements)
+          (search commits for "hypothesis")
+        - [Invalid code generated by LibCST](https://github.com/Instagram/LibCST/issues/287)
         
-        ### 0.0.4 - 2019-09-10
-        - Depends on more recent Hypothesis version, with upstreamed grammar generation.
-        - Improved filtering rejects fewer valid examples, finding another bug in Black.
-        
-        ### 0.0.3 - 2019-08-08
-        Checks validity at statement level, which makes filtering much more efficient.
-        Improved testing, input validation, and code comments.
-        
-        ### 0.0.2 - 2019-08-07
-        Improved filtering and fixing of source code generated from the grammar.
-        This version found a novel bug: `"pass #\\r#\\n"` is accepted by the
-        built-in `compile()` and `exec()` functions, but not by `black` or `lib2to3`.
-        
-        ### 0.0.1 - 2019-08-06
-        Initial release.  This is a minimal proof of concept, generating from the
-        grammar and rejecting it if we get errors from `black` or `tokenize`.
-        Cool, but while promising not very useful at this stage.
+        ### Changelog
+        
+        Patch notes [can be found in `CHANGELOG.md`](https://github.com/Zac-HD/hypothesmith/blob/master/CHANGELOG.md).
         
 Keywords: python testing fuzzing property-based-testing
 Platform: UNKNOWN
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/hypothesmith-0.0.6/src/hypothesmith.egg-info/SOURCES.txt new/hypothesmith-0.1.4/src/hypothesmith.egg-info/SOURCES.txt
--- old/hypothesmith-0.0.6/src/hypothesmith.egg-info/SOURCES.txt        2020-04-08 05:01:38.000000000 +0200
+++ new/hypothesmith-0.1.4/src/hypothesmith.egg-info/SOURCES.txt        2020-08-17 06:18:13.000000000 +0200
@@ -1,6 +1,7 @@
 README.md
 setup.py
 src/hypothesmith/__init__.py
+src/hypothesmith/cst.py
 src/hypothesmith/py.typed
 src/hypothesmith/python3.lark
 src/hypothesmith/syntactic.py
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/hypothesmith-0.0.6/src/hypothesmith.egg-info/requires.txt new/hypothesmith-0.1.4/src/hypothesmith.egg-info/requires.txt
--- old/hypothesmith-0.0.6/src/hypothesmith.egg-info/requires.txt       2020-04-08 05:01:38.000000000 +0200
+++ new/hypothesmith-0.1.4/src/hypothesmith.egg-info/requires.txt       2020-08-17 06:18:13.000000000 +0200
@@ -1,2 +1,3 @@
-hypothesis>=4.36.0
+hypothesis>=5.23.7
 lark-parser>=0.7.2
+libcst>=0.3.8
