[pypy-commit] pypy unicode-utf8: hg merge default
Author: Ronan LamyBranch: unicode-utf8 Changeset: r94583:916081855b81 Date: 2018-05-14 15:35 +0100 http://bitbucket.org/pypy/pypy/changeset/916081855b81/ Log:hg merge default diff too long, truncating to 2000 out of 3209 lines diff --git a/pypy/doc/architecture.rst b/pypy/doc/architecture.rst --- a/pypy/doc/architecture.rst +++ b/pypy/doc/architecture.rst @@ -73,3 +73,63 @@ This division between bytecode evaluator and object space gives a lot of flexibility. One can plug in different :doc:`object spaces ` to get different or enriched behaviours of the Python objects. + +Layers +-- + +RPython +~~~ +:ref:`RPython ` is the language in which we write interpreters. +Not the entire PyPy project is written in RPython, only the parts that are +compiled in the translation process. The interesting point is that RPython +has no parser, it's compiled from the live python objects, which makes it +possible to do all kinds of metaprogramming during import time. In short, +Python is a meta programming language for RPython. + +The RPython standard library is to be found in the ``rlib`` subdirectory. + +Consult `Getting Started with RPython`_ for further reading + +Translation +~~~ +The translation toolchain - this is the part that takes care of translating +RPython to flow graphs and then to C. There is more in the +:doc:`architecture ` document written about it. + +It lives in the ``rpython`` directory: ``flowspace``, ``annotator`` +and ``rtyper``. + +PyPy Interpreter + +This is in the ``pypy`` directory. ``pypy/interpreter`` is a standard +interpreter for Python written in RPython. The fact that it is +RPython is not apparent at first. Built-in modules are written in +``pypy/module/*``. Some modules that CPython implements in C are +simply written in pure Python; they are in the top-level ``lib_pypy`` +directory. The standard library of Python (with a few changes to +accomodate PyPy) is in ``lib-python``. 
+ +JIT Compiler + +:ref:`Just-in-Time Compiler (JIT) `: we have a tracing JIT that traces the +interpreter written in RPython, rather than the user program that it +interprets. As a result it applies to any interpreter, i.e. any +language. But getting it to work correctly is not trivial: it +requires a small number of precise "hints" and possibly some small +refactorings of the interpreter. The JIT itself also has several +almost-independent parts: the tracer itself in ``rpython/jit/metainterp``, the +optimizer in ``rpython/jit/metainterp/optimizer`` that optimizes a list of +residual operations, and the backend in ``rpython/jit/backend/`` +that turns it into machine code. Writing a new backend is a +traditional way to get into the project. + +Garbage Collectors +~~ +Garbage Collectors (GC): as you may notice if you are used to CPython's +C code, there are no ``Py_INCREF/Py_DECREF`` equivalents in RPython code. +:ref:`rpython:garbage-collection` is inserted +during translation. Moreover, this is not reference counting; it is a real +GC written as more RPython code. The best one we have so far is in +``rpython/memory/gc/incminimark.py``. + +.. _`Getting started with RPython`: http://rpython.readthedocs.org/en/latest/getting-started.html diff --git a/pypy/doc/build.rst b/pypy/doc/build.rst --- a/pypy/doc/build.rst +++ b/pypy/doc/build.rst @@ -267,14 +267,14 @@ * PyPy 2.5.1 or earlier: normal users would see permission errors. Installers need to run ``pypy -c "import gdbm"`` and other similar commands at install time; the exact list is in - :source:`pypy/tool/release/package.py `. Users + :source:`pypy/tool/release/package.py`. Users seeing a broken installation of PyPy can fix it after-the-fact if they have sudo rights, by running once e.g. ``sudo pypy -c "import gdbm``. * PyPy 2.6 and later: anyone would get ``ImportError: no module named _gdbm_cffi``. 
Installers need to run ``pypy _gdbm_build.py`` in the ``lib_pypy`` directory during the installation process (plus others; - see the exact list in :source:`pypy/tool/release/package.py `). + see the exact list in :source:`pypy/tool/release/package.py`). Users seeing a broken installation of PyPy can fix it after-the-fact, by running ``pypy /path/to/lib_pypy/_gdbm_build.py``. This command produces a file diff --git a/pypy/doc/coding-guide.rst b/pypy/doc/coding-guide.rst --- a/pypy/doc/coding-guide.rst +++ b/pypy/doc/coding-guide.rst @@ -539,7 +539,7 @@ hg help branch -.. _official wiki: http://mercurial.selenic.com/wiki/Branch +.. _official wiki: https://www.mercurial-scm.org/wiki/ .. _using-development-tracker: @@ -547,15 +547,7 @@ Using the development bug/feature tracker - -We have a `development tracker`_, based on Richard Jones' -`roundup`_ application. You can file bugs, -feature requests or see what's going on -for the next milestone, both from an E-Mail and from
[pypy-commit] pypy unicode-utf8: hg merge default
Author: Ronan LamyBranch: unicode-utf8 Changeset: r93393:fc38dc2766eb Date: 2017-12-12 18:28 + http://bitbucket.org/pypy/pypy/changeset/fc38dc2766eb/ Log:hg merge default diff --git a/pypy/module/test_lib_pypy/test_json_extra.py b/extra_tests/test_json.py rename from pypy/module/test_lib_pypy/test_json_extra.py rename to extra_tests/test_json.py --- a/pypy/module/test_lib_pypy/test_json_extra.py +++ b/extra_tests/test_json.py @@ -1,4 +1,5 @@ -import py, json +import pytest +import json def is_(x, y): return type(x) is type(y) and x == y @@ -6,12 +7,14 @@ def test_no_ensure_ascii(): assert is_(json.dumps(u"\u1234", ensure_ascii=False), u'"\u1234"') assert is_(json.dumps("\xc0", ensure_ascii=False), '"\xc0"') -e = py.test.raises(UnicodeDecodeError, json.dumps, - (u"\u1234", "\xc0"), ensure_ascii=False) -assert str(e.value).startswith("'ascii' codec can't decode byte 0xc0 ") -e = py.test.raises(UnicodeDecodeError, json.dumps, - ("\xc0", u"\u1234"), ensure_ascii=False) -assert str(e.value).startswith("'ascii' codec can't decode byte 0xc0 ") +with pytest.raises(UnicodeDecodeError) as excinfo: +json.dumps((u"\u1234", "\xc0"), ensure_ascii=False) +assert str(excinfo.value).startswith( +"'ascii' codec can't decode byte 0xc0 ") +with pytest.raises(UnicodeDecodeError) as excinfo: +json.dumps(("\xc0", u"\u1234"), ensure_ascii=False) +assert str(excinfo.value).startswith( +"'ascii' codec can't decode byte 0xc0 ") def test_issue2191(): assert is_(json.dumps(u"xxx", ensure_ascii=False), u'"xxx"') diff --git a/pypy/doc/whatsnew-head.rst b/pypy/doc/whatsnew-head.rst --- a/pypy/doc/whatsnew-head.rst +++ b/pypy/doc/whatsnew-head.rst @@ -1,42 +1,45 @@ -=== -What's new in PyPy2.7 5.10+ -=== - -.. this is a revision shortly after release-pypy2.7-v5.9.0 -.. startrev:d56dadcef996 - - -.. branch: cppyy-packaging - -Cleanup and improve cppyy packaging - -.. branch: docs-osx-brew-openssl - -.. 
branch: keep-debug-symbols - -Add a smartstrip tool, which can optionally keep the debug symbols in a -separate file, instead of just stripping them away. Use it in packaging - -.. branch: bsd-patches - -Fix failures on FreeBSD, contributed by David Naylor as patches on the issue -tracker (issues 2694, 2695, 2696, 2697) - -.. branch: run-extra-tests - -Run extra_tests/ in buildbot - -.. branch: vmprof-0.4.10 - -Upgrade the _vmprof backend to vmprof 0.4.10 - -.. branch: fix-vmprof-stacklet-switch - -Fix a vmprof+continulets (i.e. greenelts, eventlet, gevent, ...) - -.. branch: win32-vcvars - -.. branch: unicode-utf8-re -.. branch: utf8-io -Utf8 handling for unicode - +=== +What's new in PyPy2.7 5.10+ +=== + +.. this is a revision shortly after release-pypy2.7-v5.9.0 +.. startrev:d56dadcef996 + + +.. branch: cppyy-packaging + +Cleanup and improve cppyy packaging + +.. branch: docs-osx-brew-openssl + +.. branch: keep-debug-symbols + +Add a smartstrip tool, which can optionally keep the debug symbols in a +separate file, instead of just stripping them away. Use it in packaging + +.. branch: bsd-patches + +Fix failures on FreeBSD, contributed by David Naylor as patches on the issue +tracker (issues 2694, 2695, 2696, 2697) + +.. branch: run-extra-tests + +Run extra_tests/ in buildbot + +.. branch: vmprof-0.4.10 + +Upgrade the _vmprof backend to vmprof 0.4.10 + +.. branch: fix-vmprof-stacklet-switch + +Fix a vmprof+continulets (i.e. greenelts, eventlet, gevent, ...) + +.. branch: win32-vcvars + +.. branch rdict-fast-hash + +Make it possible to declare that the hash function of an r_dict is fast in RPython. + +.. branch: unicode-utf8-re +.. 
branch: utf8-io +Utf8 handling for unicode diff --git a/pypy/module/_pypyjson/interp_decoder.py b/pypy/module/_pypyjson/interp_decoder.py --- a/pypy/module/_pypyjson/interp_decoder.py +++ b/pypy/module/_pypyjson/interp_decoder.py @@ -49,7 +49,7 @@ self.ll_chars = rffi.str2charp(s) self.end_ptr = lltype.malloc(rffi.CCHARPP.TO, 1, flavor='raw') self.pos = 0 -self.cache = r_dict(slice_eq, slice_hash) +self.cache = r_dict(slice_eq, slice_hash, simple_hash_eq=True) def close(self): rffi.free_charp(self.ll_chars) diff --git a/rpython/annotator/bookkeeper.py b/rpython/annotator/bookkeeper.py --- a/rpython/annotator/bookkeeper.py +++ b/rpython/annotator/bookkeeper.py @@ -194,13 +194,14 @@ listdef.generalize_range_step(flags['range_step']) return SomeList(listdef) -def getdictdef(self, is_r_dict=False, force_non_null=False): +def getdictdef(self, is_r_dict=False, force_non_null=False, simple_hash_eq=False): """Get the DictDef associated with the current position.""" try: dictdef = self.dictdefs[self.position_key] except KeyError:
[pypy-commit] pypy unicode-utf8: hg merge default
Author: Ronan LamyBranch: unicode-utf8 Changeset: r93383:b8d6b2298b9b Date: 2017-12-12 05:37 + http://bitbucket.org/pypy/pypy/changeset/b8d6b2298b9b/ Log:hg merge default diff --git a/pypy/doc/build.rst b/pypy/doc/build.rst --- a/pypy/doc/build.rst +++ b/pypy/doc/build.rst @@ -149,7 +149,7 @@ xz-devel # For lzma on PyPy3. (XXX plus the SLES11 version of libgdbm-dev and tk-dev) -On Mac OS X:: +On Mac OS X: Most of these build-time dependencies are installed alongside the Developer Tools. However, note that in order for the installation to diff --git a/pypy/doc/cpython_differences.rst b/pypy/doc/cpython_differences.rst --- a/pypy/doc/cpython_differences.rst +++ b/pypy/doc/cpython_differences.rst @@ -355,7 +355,11 @@ containers (as list items or in sets for example), the exact rule of equality used is "``if x is y or x == y``" (on both CPython and PyPy); as a consequence, because all ``nans`` are identical in PyPy, you -cannot have several of them in a set, unlike in CPython. (Issue `#1974`__) +cannot have several of them in a set, unlike in CPython. (Issue `#1974`__). +Another consequence is that ``cmp(float('nan'), float('nan')) == 0``, because +``cmp`` checks with ``is`` first whether the arguments are identical (there is +no good value to return from this call to ``cmp``, because ``cmp`` pretends +that there is a total order on floats, but that is wrong for NaNs). .. __: https://bitbucket.org/pypy/pypy/issue/1974/different-behaviour-for-collections-of diff --git a/pypy/doc/whatsnew-head.rst b/pypy/doc/whatsnew-head.rst --- a/pypy/doc/whatsnew-head.rst +++ b/pypy/doc/whatsnew-head.rst @@ -5,26 +5,33 @@ .. this is a revision shortly after release-pypy2.7-v5.9.0 .. startrev:d56dadcef996 + .. branch: cppyy-packaging + Cleanup and improve cppyy packaging .. branch: docs-osx-brew-openssl .. branch: keep-debug-symbols + Add a smartstrip tool, which can optionally keep the debug symbols in a separate file, instead of just stripping them away. Use it in packaging .. 
branch: bsd-patches + Fix failures on FreeBSD, contributed by David Naylor as patches on the issue tracker (issues 2694, 2695, 2696, 2697) .. branch: run-extra-tests + Run extra_tests/ in buildbot .. branch: vmprof-0.4.10 + Upgrade the _vmprof backend to vmprof 0.4.10 .. branch: fix-vmprof-stacklet-switch + Fix a vmprof+continulets (i.e. greenelts, eventlet, gevent, ...) .. branch: win32-vcvars diff --git a/pypy/doc/whatsnew-pypy2-5.6.0.rst b/pypy/doc/whatsnew-pypy2-5.6.0.rst --- a/pypy/doc/whatsnew-pypy2-5.6.0.rst +++ b/pypy/doc/whatsnew-pypy2-5.6.0.rst @@ -101,7 +101,7 @@ .. branch: newinitwarn -Match CPython's stricter handling of __new/init__ arguments +Match CPython's stricter handling of ``__new__``/``__init__`` arguments .. branch: openssl-1.1 diff --git a/pypy/doc/windows.rst b/pypy/doc/windows.rst --- a/pypy/doc/windows.rst +++ b/pypy/doc/windows.rst @@ -11,7 +11,7 @@ To build pypy-c you need a working python environment, and a C compiler. It is possible to translate with a CPython 2.6 or later, but this is not -the preferred way, because it will take a lot longer to run depending +the preferred way, because it will take a lot longer to run depending on your architecture, between two and three times as long. So head to `our downloads`_ and get the latest stable version. 
@@ -103,6 +103,7 @@ must also copy the ``vcvarsall.bat`` file fron the ``...\9.0`` directory to the ``...\9.0\VC`` directory, and edit it, changing the lines that set ``VCINSTALLDIR`` and ``WindowsSdkDir``:: + set VCINSTALLDIR=%~dp0\ set WindowsSdkDir=%~dp0\..\WinSDK\ diff --git a/pypy/module/__builtin__/test/test_builtin.py b/pypy/module/__builtin__/test/test_builtin.py --- a/pypy/module/__builtin__/test/test_builtin.py +++ b/pypy/module/__builtin__/test/test_builtin.py @@ -404,6 +404,7 @@ def test_cmp(self): +assert cmp(float('nan'), float('nan')) == 0 assert cmp(9,9) == 0 assert cmp(0,9) < 0 assert cmp(9,0) > 0 diff --git a/pypy/module/posix/test/test_posix2.py b/pypy/module/posix/test/test_posix2.py --- a/pypy/module/posix/test/test_posix2.py +++ b/pypy/module/posix/test/test_posix2.py @@ -31,9 +31,15 @@ pdir.join('file2').write("test2") pdir.join('another_longer_file_name').write("test3") mod.pdir = pdir -unicode_dir = udir.ensure('fi\xc5\x9fier.txt', dir=True) +if sys.platform == 'darwin': +# see issue https://bugs.python.org/issue31380 +unicode_dir = udir.ensure('fixc5x9fier.txt', dir=True) +file_name = 'cafxe9' +else: +unicode_dir = udir.ensure('fi\xc5\x9fier.txt', dir=True) +file_name = 'caf\xe9' unicode_dir.join('somefile').write('who cares?') -unicode_dir.join('caf\xe9').write('who knows?') +unicode_dir.join(file_name).write('who knows?') mod.unicode_dir = unicode_dir # in applevel tests, os.stat uses
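The ``cpython_differences.rst`` hunk in this changeset describes the container equality rule ``x is y or x == y`` and its consequence for NaNs. A minimal sketch of that rule follows; the helper names ``container_eq`` and ``contains`` are made up for illustration and are not PyPy or CPython APIs. Whether two separately created NaNs count as the same object depends on the implementation, so the sketch only asserts what holds on both.

```python
def container_eq(x, y):
    """Equality rule described for containers: identity first, then ==."""
    return x is y or x == y


def contains(items, needle):
    """Membership test spelled out with the rule above (illustrative helper)."""
    return any(container_eq(item, needle) for item in items)


nan = float('nan')

# A NaN never compares equal to itself with ==.
assert not (nan == nan)

# The identity shortcut still makes the very same object "equal" in containers.
assert container_eq(nan, nan)
assert contains([1.0, nan], nan)
assert nan in [1.0, nan]

# For a second NaN object the outcome depends on whether the implementation
# treats all NaNs as identical, so no result is asserted here.
other = float('nan')
same_object = other is nan
```

The ``cmp(float('nan'), float('nan')) == 0`` statement from the diff is Python 2 only and is therefore left out of the sketch.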
[pypy-commit] pypy unicode-utf8: hg merge default
Author: Ronan LamyBranch: unicode-utf8 Changeset: r93262:56cea686737b Date: 2017-12-03 20:43 + http://bitbucket.org/pypy/pypy/changeset/56cea686737b/ Log:hg merge default diff --git a/extra_tests/test_textio.py b/extra_tests/test_textio.py --- a/extra_tests/test_textio.py +++ b/extra_tests/test_textio.py @@ -1,28 +1,48 @@ from hypothesis import given, strategies as st from io import BytesIO, TextIOWrapper +import os -LINESEP = ['', '\r', '\n', '\r\n'] +def translate_newlines(text): +text = text.replace('\r\n', '\n') +text = text.replace('\r', '\n') +return text.replace('\n', os.linesep) @st.composite -def text_with_newlines(draw): -sep = draw(st.sampled_from(LINESEP)) -lines = draw(st.lists(st.text(max_size=10), max_size=10)) -return sep.join(lines) +def st_readline_universal( +draw, st_nlines=st.integers(min_value=0, max_value=10)): +n_lines = draw(st_nlines) +lines = draw(st.lists( +st.text(st.characters(blacklist_characters='\r\n')), +min_size=n_lines, max_size=n_lines)) +limits = [] +for line in lines: +limit = draw(st.integers(min_value=0, max_value=len(line) + 5)) +limits.append(limit) +limits.append(-1) +endings = draw(st.lists( +st.sampled_from(['\n', '\r', '\r\n']), +min_size=n_lines, max_size=n_lines)) +return ( +''.join(line + ending for line, ending in zip(lines, endings)), +limits) -@given(txt=text_with_newlines(), - mode=st.sampled_from(['\r', '\n', '\r\n', '']), - limit=st.integers(min_value=-1)) -def test_readline(txt, mode, limit): +@given(data=st_readline_universal(), + mode=st.sampled_from(['\r', '\n', '\r\n', '', None])) +def test_readline(data, mode): +txt, limits = data textio = TextIOWrapper( -BytesIO(txt.encode('utf-8')), encoding='utf-8', newline=mode) +BytesIO(txt.encode('utf-8', 'surrogatepass')), +encoding='utf-8', errors='surrogatepass', newline=mode) lines = [] -while True: +for limit in limits: line = textio.readline(limit) -if limit > 0: -assert len(line) < limit +if limit >= 0: +assert len(line) <= limit if line: lines.append(line) 
-else: +elif limit: break -assert u''.join(lines) == txt +if mode is None: +txt = translate_newlines(txt) +assert txt.startswith(u''.join(lines)) diff --git a/lib_pypy/resource.py b/lib_pypy/resource.py --- a/lib_pypy/resource.py +++ b/lib_pypy/resource.py @@ -20,6 +20,7 @@ or via the attributes ru_utime, ru_stime, ru_maxrss, and so on.""" __metaclass__ = _structseq.structseqtype +name = "resource.struct_rusage" ru_utime = _structseq.structseqfield(0,"user time used") ru_stime = _structseq.structseqfield(1,"system time used") diff --git a/pypy/doc/whatsnew-head.rst b/pypy/doc/whatsnew-head.rst --- a/pypy/doc/whatsnew-head.rst +++ b/pypy/doc/whatsnew-head.rst @@ -26,3 +26,6 @@ .. branch: fix-vmprof-stacklet-switch Fix a vmprof+continulets (i.e. greenelts, eventlet, gevent, ...) + +.. branch: win32-vcvars + diff --git a/pypy/doc/windows.rst b/pypy/doc/windows.rst --- a/pypy/doc/windows.rst +++ b/pypy/doc/windows.rst @@ -25,8 +25,10 @@ This compiler, while the standard one for Python 2.7, is deprecated. Microsoft has made it available as the `Microsoft Visual C++ Compiler for Python 2.7`_ (the link -was checked in Nov 2016). Note that the compiler suite will be installed in -``C:\Users\\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python``. +was checked in Nov 2016). Note that the compiler suite may be installed in +``C:\Users\\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python`` +or in +``C:\Program Files (x86)\Common Files\Microsoft\Visual C++ for Python``. A current version of ``setuptools`` will be able to find it there. For Windows 10, you must right-click the download, and under ``Properties`` -> ``Compatibility`` mark it as ``Run run this program in comatibility mode for`` @@ -41,7 +43,6 @@ --- We routinely test translation using v9, also known as Visual Studio 2008. -Our buildbot is still using the Express Edition, not the compiler noted above. Other configurations may work as well. 
The translation scripts will set up the appropriate environment variables @@ -81,6 +82,30 @@ .. _build instructions: http://pypy.org/download.html#building-from-source +Setting Up Visual Studio for building SSL in Python3 + + +On Python3, the ``ssl`` module is based on ``cffi``, and requires a build step after +translation. However ``distutils`` does not support the Micorosft-provided Visual C +compiler, and ``cffi`` depends on ``distutils`` to find the compiler. The +traditional solution to this problem is to install the ``setuptools`` module +via
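The ``extra_tests/test_textio.py`` change in this changeset exercises ``TextIOWrapper.readline`` under the different ``newline`` modes and with a size limit. A small stdlib-only sketch of the behaviour being tested, assuming Python 3; ``read_lines`` is a hypothetical helper, not part of the test suite.

```python
from io import BytesIO, TextIOWrapper


def read_lines(raw, newline, limit=-1):
    """Collect readline() results until EOF for the given newline mode."""
    textio = TextIOWrapper(BytesIO(raw), encoding='utf-8', newline=newline)
    lines = []
    while True:
        line = textio.readline(limit)
        if not line:
            break
        lines.append(line)
    return lines


raw = b'a\r\nb\rc\n'

# newline='' recognises every line ending but returns it untranslated.
untranslated = read_lines(raw, '')

# newline=None recognises every line ending and translates it to '\n'.
translated = read_lines(raw, None)

# With a limit, each returned piece holds at most `limit` characters,
# and no characters are lost across the pieces.
limited = read_lines(raw, '', limit=2)
```

The ``len(line) <= limit`` bound checked in the updated test corresponds to the property of ``limited`` here.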
[pypy-commit] pypy unicode-utf8: hg merge default
Author: Ronan Lamy
Branch: unicode-utf8
Changeset: r93177:a40f7eee2bcf
Date: 2017-11-26 01:27 +
http://bitbucket.org/pypy/pypy/changeset/a40f7eee2bcf/

Log: hg merge default

diff --git a/extra_tests/test_textio.py b/extra_tests/test_textio.py
--- a/extra_tests/test_textio.py
+++ b/extra_tests/test_textio.py
@@ -14,7 +14,8 @@
        mode=st.sampled_from(['\r', '\n', '\r\n', '']),
        limit=st.integers(min_value=-1))
 def test_readline(txt, mode, limit):
-    textio = TextIOWrapper(BytesIO(txt.encode('utf-8')), newline=mode)
+    textio = TextIOWrapper(
+        BytesIO(txt.encode('utf-8')), encoding='utf-8', newline=mode)
     lines = []
     while True:
         line = textio.readline(limit)
diff --git a/pypy/module/_io/interp_stringio.py b/pypy/module/_io/interp_stringio.py
--- a/pypy/module/_io/interp_stringio.py
+++ b/pypy/module/_io/interp_stringio.py
@@ -2,21 +2,115 @@
 from pypy.interpreter.typedef import (
     TypeDef, generic_new_descr, GetSetProperty)
 from pypy.interpreter.gateway import interp2app, unwrap_spec, WrappedDefault
-from pypy.module._io.interp_textio import W_TextIOBase, W_IncrementalNewlineDecoder
+from pypy.module._io.interp_textio import (
+    W_TextIOBase, W_IncrementalNewlineDecoder)
 from pypy.module._io.interp_iobase import convert_size

+class UnicodeIO(object):
+    def __init__(self, data=None, pos=0):
+        if data is None:
+            data = []
+        self.data = data
+        self.pos = pos
+
+    def resize(self, newlength):
+        if len(self.data) > newlength:
+            self.data = self.data[:newlength]
+        if len(self.data) < newlength:
+            self.data.extend([u'\0'] * (newlength - len(self.data)))
+
+    def read(self, size):
+        start = self.pos
+        available = len(self.data) - start
+        if available <= 0:
+            return u''
+        if size >= 0 and size <= available:
+            end = start + size
+        else:
+            end = len(self.data)
+        assert 0 <= start <= end
+        self.pos = end
+        return u''.join(self.data[start:end])
+
+    def _convert_limit(self, limit):
+        if limit < 0 or limit > len(self.data) - self.pos:
+            limit = len(self.data) - self.pos
+            assert limit >= 0
+        return limit
+
+    def readline_universal(self, limit):
+        # Universal newline search. Find any of \r, \r\n, \n
+        limit = self._convert_limit(limit)
+        start = self.pos
+        end = start + limit
+        pos = start
+        while pos < end:
+            ch = self.data[pos]
+            pos += 1
+            if ch == '\n':
+                break
+            if ch == '\r':
+                if pos >= end:
+                    break
+                if self.data[pos] == '\n':
+                    pos += 1
+                    break
+                else:
+                    break
+        self.pos = pos
+        result = u''.join(self.data[start:pos])
+        return result
+
+    def readline(self, marker, limit):
+        start = self.pos
+        limit = self._convert_limit(limit)
+        end = start + limit
+        found = False
+        for pos in range(start, end - len(marker) + 1):
+            ch = self.data[pos]
+            if ch == marker[0]:
+                for j in range(1, len(marker)):
+                    if self.data[pos + j] != marker[j]:
+                        break # from inner loop
+                else:
+                    pos += len(marker)
+                    found = True
+                    break
+        if not found:
+            pos = end
+        self.pos = pos
+        result = u''.join(self.data[start:pos])
+        return result
+
+    def write(self, string):
+        length = len(string)
+        if self.pos + length > len(self.data):
+            self.resize(self.pos + length)
+
+        for i in range(length):
+            self.data[self.pos + i] = string[i]
+        self.pos += length
+
+    def seek(self, pos):
+        self.pos = pos
+
+    def truncate(self, size):
+        if size < len(self.data):
+            self.resize(size)
+
+    def getvalue(self):
+        return u''.join(self.data)
+

 class W_StringIO(W_TextIOBase):
     def __init__(self, space):
         W_TextIOBase.__init__(self, space)
-        self.buf = []
-        self.pos = 0
+        self.buf = UnicodeIO()

-    @unwrap_spec(w_newline = WrappedDefault("\n"))
+    @unwrap_spec(w_newline=WrappedDefault("\n"))
     def descr_init(self, space, w_initvalue=None, w_newline=None):
         # In case __init__ is called multiple times
-        self.buf = []
-        self.pos = 0
+        self.buf = UnicodeIO()
         self.w_decoder = None
         self.readnl = None
         self.writenl = None
@@ -27,7 +121,7 @@
             newline = space.unicode_w(w_newline)

         if (newline is not None and newline != u"" and newline != u"\n" and
-            newline != u"\r" and newline != u"\r\n"):
+
[pypy-commit] pypy unicode-utf8: hg merge default
Author: Ronan Lamy
Branch: unicode-utf8
Changeset: r93170:f9a1926628b2
Date: 2017-11-24 20:22 +
http://bitbucket.org/pypy/pypy/changeset/f9a1926628b2/

Log: hg merge default

diff --git a/extra_tests/test_textio.py b/extra_tests/test_textio.py
new file mode 100644
--- /dev/null
+++ b/extra_tests/test_textio.py
@@ -0,0 +1,27 @@
+from hypothesis import given, strategies as st
+
+from io import BytesIO, TextIOWrapper
+
+LINESEP = ['', '\r', '\n', '\r\n']
+
+@st.composite
+def text_with_newlines(draw):
+    sep = draw(st.sampled_from(LINESEP))
+    lines = draw(st.lists(st.text(max_size=10), max_size=10))
+    return sep.join(lines)
+
+@given(txt=text_with_newlines(),
+       mode=st.sampled_from(['\r', '\n', '\r\n', '']),
+       limit=st.integers(min_value=-1))
+def test_readline(txt, mode, limit):
+    textio = TextIOWrapper(BytesIO(txt.encode('utf-8')), newline=mode)
+    lines = []
+    while True:
+        line = textio.readline(limit)
+        if limit > 0:
+            assert len(line) < limit
+        if line:
+            lines.append(line)
+        else:
+            break
+    assert u''.join(lines) == txt
diff --git a/pypy/module/_continuation/test/conftest.py b/pypy/module/_continuation/test/conftest.py
new file mode 100644
--- /dev/null
+++ b/pypy/module/_continuation/test/conftest.py
@@ -0,0 +1,7 @@
+import pytest
+import sys
+
+def pytest_configure(config):
+    if sys.platform.startswith('linux'):
+        from rpython.rlib.rvmprof.cintf import configure_libbacktrace_linux
+        configure_libbacktrace_linux()
diff --git a/pypy/module/_io/interp_stringio.py b/pypy/module/_io/interp_stringio.py
--- a/pypy/module/_io/interp_stringio.py
+++ b/pypy/module/_io/interp_stringio.py
@@ -174,18 +174,16 @@
         start = self.pos
         if limit < 0 or limit > len(self.buf) - self.pos:
             limit = len(self.buf) - self.pos
+            assert limit >= 0

-        assert limit >= 0
-        end = start + limit
-
-        endpos, consumed = self._find_line_ending(
+        endpos, found = self._find_line_ending(
             # XXX: super inefficient, makes a copy of the entire contents.
             u"".join(self.buf),
             start,
-            end
+            limit
         )
-        if endpos < 0:
-            endpos = end
+        if not found:
+            endpos = start + limit
         assert endpos >= 0
         self.pos = endpos
         return space.newunicode(u"".join(self.buf[start:endpos]))
diff --git a/pypy/module/_io/interp_textio.py b/pypy/module/_io/interp_textio.py
--- a/pypy/module/_io/interp_textio.py
+++ b/pypy/module/_io/interp_textio.py
@@ -221,44 +221,49 @@
     def newlines_get_w(self, space):
         return space.w_None

-    def _find_line_ending(self, line, start, end):
-        size = end - start
+    def _find_newline_universal(self, line, start, limit):
+        # Universal newline search. Find any of \r, \r\n, \n
+        # The decoder ensures that \r\n are not split in two pieces
+        limit = min(limit, len(line) - start)
+        end = start + limit
+        i = start
+        while i < end:
+            ch = line[i]
+            i += 1
+            if ch == '\n':
+                return i, True
+            if ch == '\r':
+                if i >= end:
+                    break
+                if line[i] == '\n':
+                    return i + 1, True
+                else:
+                    return i, True
+        return end, False
+
+    def _find_marker(self, marker, line, start, limit):
+        limit = min(limit, len(line) - start)
+        end = start + limit
+        for i in range(start, end - len(marker) + 1):
+            ch = line[i]
+            if ch == marker[0]:
+                for j in range(1, len(marker)):
+                    if line[i + j] != marker[j]:
+                        break # from inner loop
+                else:
+                    return i + len(marker), True
+        return end - len(marker) + 1, False
+
+    def _find_line_ending(self, line, start, limit):
         if self.readuniversal:
-            # Universal newline search. Find any of \r, \r\n, \n
-            # The decoder ensures that \r\n are not split in two pieces
-            i = start
-            while True:
-                # Fast path for non-control chars.
-                while i < end and line[i] > '\r':
-                    i += 1
-                if i >= end:
-                    return -1, size
-                ch = line[i]
-                i += 1
-                if ch == '\n':
-                    return i, 0
-                if ch == '\r':
-                    if line[i] == '\n':
-                        return i + 1, 0
-                    else:
-                        return i, 0
+            return self._find_newline_universal(line, start, limit)
         if self.readtranslate:
             # Newlines are already translated, only search for \n
             newline = '\n'
         else:
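A standalone sketch of the universal-newline search that this changeset adds as ``_find_newline_universal``. The free-function name and the plain-string arguments are assumptions for illustration; the original is a method on the text IO base class and is translated as RPython.

```python
def find_newline_universal(line, start, limit):
    """Return (end_position, found) for the first \\r, \\r\\n or \\n.

    The scan covers at most `limit` characters starting at `start`.
    """
    limit = min(limit, len(line) - start)
    end = start + limit
    i = start
    while i < end:
        ch = line[i]
        i += 1
        if ch == '\n':
            return i, True
        if ch == '\r':
            if i >= end:
                # The window ends right after '\r'; a following '\n'
                # cannot be seen yet, so no line ending is reported.
                break
            if line[i] == '\n':
                return i + 1, True
            else:
                return i, True
    return end, False


crlf = find_newline_universal('ab\r\ncd', 0, 10)
lone_cr = find_newline_universal('ab\rcd', 0, 10)
cut_off = find_newline_universal('ab\r\ncd', 0, 3)
no_newline = find_newline_universal('abc', 0, 10)
```

Compared with the removed loop, the second element of the result is a boolean ``found`` flag, and a ``'\r'`` at the very end of the scanned window is treated as "not found" instead of indexing past the window.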
[pypy-commit] pypy unicode-utf8: hg merge default
Author: Ronan Lamy
Branch: unicode-utf8
Changeset: r93149:0797bb6394b6
Date: 2017-11-23 18:07 +
http://bitbucket.org/pypy/pypy/changeset/0797bb6394b6/

Log: hg merge default

diff --git a/pypy/module/_io/interp_textio.py b/pypy/module/_io/interp_textio.py
--- a/pypy/module/_io/interp_textio.py
+++ b/pypy/module/_io/interp_textio.py
@@ -223,14 +223,7 @@

     def _find_line_ending(self, line, start, end):
         size = end - start
-        if self.readtranslate:
-            # Newlines are already translated, only search for \n
-            pos = line.find('\n', start, end)
-            if pos >= 0:
-                return pos + 1, 0
-            else:
-                return -1, size
-        elif self.readuniversal:
+        if self.readuniversal:
             # Universal newline search. Find any of \r, \r\n, \n
             # The decoder ensures that \r\n are not split in two pieces
             i = start
@@ -249,16 +242,22 @@
                         return i + 1, 0
                     else:
                         return i, 0
+        if self.readtranslate:
+            # Newlines are already translated, only search for \n
+            newline = '\n'
         else:
             # Non-universal mode.
-            pos = line.find(self.readnl, start, end)
-            if pos >= 0:
-                return pos + len(self.readnl), 0
-            else:
-                pos = line.find(self.readnl[0], start, end)
-                if pos >= 0:
-                    return -1, pos - start
-                return -1, size
+            newline = self.readnl
+        end_scan = end - len(newline) + 1
+        for i in range(start, end_scan):
+            ch = line[i]
+            if ch == newline[0]:
+                for j in range(1, len(newline)):
+                    if line[i + j] != newline[j]:
+                        break
+                else:
+                    return i + len(newline), 0
+        return -1, end_scan


 W_TextIOBase.typedef = TypeDef(
@@ -548,6 +547,10 @@
         self.decoded_chars_used += size
         return chars

+    def _has_data(self):
+        return (self.decoded_chars is not None and
+            self.decoded_chars_used < len(self.decoded_chars))
+
     def _read_chunk(self, space):
         """Read and decode the next chunk of data from the BufferedReader.
         The return value is True unless EOF was reached. The decoded string
@@ -595,6 +598,19 @@

         return not eof

+    def _ensure_data(self, space):
+        while not self._has_data():
+            try:
+                if not self._read_chunk(space):
+                    self._unset_decoded()
+                    self.snapshot = None
+                    return False
+            except OperationError as e:
+                if trap_eintr(space, e):
+                    continue
+                raise
+        return True
+
     def next_w(self, space):
         self._check_attached(space)
         self.telling = False
@@ -628,23 +644,13 @@
         builder = StringBuilder(size)

         # Keep reading chunks until we have n characters to return
-        while True:
+        while remaining > 0:
+            if not self._ensure_data(space):
+                break
             data = self._get_decoded_chars(remaining)
             builder.append(data)
             remaining -= len(data)

-            if remaining <= 0: # Done
-                break
-
-            try:
-                if not self._read_chunk(space):
-                    # EOF
-                    break
-            except OperationError as e:
-                if trap_eintr(space, e):
-                    continue
-                raise
-
         return space.new_from_utf8(builder.build())

     def readline_w(self, space, w_limit=None):
@@ -660,20 +666,9 @@

         while True:
             # First, get some data if necessary
-            has_data = True
-            while not self.decoded_chars:
-                try:
-                    if not self._read_chunk(space):
-                        has_data = False
-                        break
-                except OperationError as e:
-                    if trap_eintr(space, e):
-                        continue
-                    raise
+            has_data = self._ensure_data(space)
             if not has_data:
                 # end of file
-                self._unset_decoded()
-                self.snapshot = None
                 start = endpos = offset_to_buffer = 0
                 break

___
pypy-commit mailing list
pypy-commit@python.org
https://mail.python.org/mailman/listinfo/pypy-commit
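The refactoring in this changeset moves the "refill the decoded buffer, retrying on EINTR" loop into a single ``_ensure_data`` helper used by both ``read`` and ``readline``. A toy sketch of that pattern in plain Python 3 follows. ``ChunkedReader`` and its list-of-chunks source are hypothetical stand-ins, and ``InterruptedError`` plays the role that ``trap_eintr`` plays in the diff; none of this is the PyPy class itself.

```python
class ChunkedReader(object):
    """Toy stand-in for the decoded-chars buffer (not the PyPy class)."""

    def __init__(self, chunks):
        # Pending chunks; an exception instance simulates an interrupted read.
        self._chunks = list(chunks)
        self.decoded_chars = None
        self.decoded_chars_used = 0

    def _has_data(self):
        return (self.decoded_chars is not None and
                self.decoded_chars_used < len(self.decoded_chars))

    def _read_chunk(self):
        """Load the next chunk; return False at EOF."""
        if not self._chunks:
            return False
        item = self._chunks.pop(0)
        if isinstance(item, Exception):
            raise item
        self.decoded_chars = item
        self.decoded_chars_used = 0
        return True

    def _ensure_data(self):
        """Refill until unread characters exist; return False at EOF."""
        while not self._has_data():
            try:
                if not self._read_chunk():
                    self.decoded_chars = None
                    return False
            except InterruptedError:
                continue  # retry the interrupted read
        return True

    def _get_decoded_chars(self, size):
        start = self.decoded_chars_used
        chars = self.decoded_chars[start:start + size]
        self.decoded_chars_used += len(chars)
        return chars

    def read(self, size):
        parts = []
        remaining = size
        while remaining > 0:
            if not self._ensure_data():
                break
            data = self._get_decoded_chars(remaining)
            parts.append(data)
            remaining -= len(data)
        return ''.join(parts)


reader = ChunkedReader(['ab', InterruptedError(), '', 'cde'])
first = reader.read(4)
second = reader.read(10)
third = reader.read(1)
```

Checking ``_has_data()`` instead of the truthiness of the buffer means an empty or fully consumed chunk triggers another refill, which is the condition the diff introduces in place of ``while not self.decoded_chars``.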
[pypy-commit] pypy unicode-utf8: hg merge default
Author: Armin Rigo
Branch: unicode-utf8
Changeset: r92741:194830b5fc5a
Date: 2017-10-12 18:37 +0200
http://bitbucket.org/pypy/pypy/changeset/194830b5fc5a/

Log: hg merge default

diff too long, truncating to 2000 out of 16196 lines

diff --git a/.hgtags b/.hgtags
--- a/.hgtags
+++ b/.hgtags
@@ -40,3 +40,7 @@
 2875f328eae2216a87f3d6f335092832eb031f56 release-pypy3.5-v5.7.1
 c925e73810367cd960a32592dd7f728f436c125c release-pypy2.7-v5.8.0
 a37ecfe5f142bc971a86d17305cc5d1d70abec64 release-pypy3.5-v5.8.0
+03d614975835870da65ff0481e1edad68ebbcb8d release-pypy2.7-v5.9.0
+d72f9800a42b46a8056951b1da2426d2c2d8d502 release-pypy3.5-v5.9.0
+03d614975835870da65ff0481e1edad68ebbcb8d release-pypy2.7-v5.9.0
+84a2f3e6a7f88f2fe698e473998755b3bd1a12e2 release-pypy2.7-v5.9.0
diff --git a/LICENSE b/LICENSE
--- a/LICENSE
+++ b/LICENSE
@@ -60,8 +60,8 @@
   Wim Lavrijsen
   Eric van Riet Paap
   Richard Emslie
+  Remi Meier
   Alexander Schremmer
-  Remi Meier
   Dan Villiom Podlaski Christiansen
   Lukas Diekmann
   Sven Hager
@@ -102,6 +102,7 @@
   Michael Foord
   Stephan Diehl
   Stefano Rivera
+  Jean-Paul Calderone
   Stefan Schwarzer
   Tomek Meka
   Valentino Volonghi
@@ -110,14 +111,13 @@
   Bob Ippolito
   Bruno Gola
   David Malcolm
-  Jean-Paul Calderone
   Squeaky
   Edd Barrett
   Timo Paulssen
   Marius Gedminas
+  Nicolas Truessel
   Alexandre Fayolle
   Simon Burton
-  Nicolas Truessel
   Martin Matusiak
   Laurence Tratt
   Wenzhu Man
@@ -156,6 +156,7 @@
   Stefan H. Muller
   Tim Felgentreff
   Eugene Oden
+  Dodan Mihai
   Jeff Terrace
   Henry Mason
   Vasily Kuznetsov
@@ -182,11 +183,13 @@
   Rocco Moretti
   Gintautas Miliauskas
   Lucian Branescu Mihaila
+  Mariano Anaya
   anatoly techtonik
-  Dodan Mihai
   Karl Bartel
+  Stefan Beyer
   Gabriel Lavoie
   Jared Grubb
+  Alecsandru Patrascu
   Olivier Dormond
   Wouter van Heyst
   Sebastian Pawluś
@@ -194,6 +197,7 @@
   Victor Stinner
   Andrews Medina
   Aaron Iles
+  p_ziesch...@yahoo.de
   Toby Watson
   Daniel Patrick
   Stuart Williams
@@ -204,6 +208,7 @@
   Michael Cheng
   Mikael Schönenberg
   Stanislaw Halik
+  Mihnea Saracin
   Berkin Ilbeyi
   Gasper Zejn
   Faye Zhao
@@ -214,14 +219,12 @@
   Jonathan David Riehl
   Beatrice During
   Alex Perry
-  p_ziesch...@yahoo.de
   Robert Zaremba
   Alan McIntyre
   Alexander Sedov
   Vaibhav Sood
   Reuben Cummings
   Attila Gobi
-  Alecsandru Patrascu
   Christopher Pope
   Tristan Arthur
   Christian Tismer
@@ -243,7 +246,6 @@
   Jacek Generowicz
   Sylvain Thenault
   Jakub Stasiak
-  Stefan Beyer
   Andrew Dalke
   Alejandro J. Cura
   Vladimir Kryachko
@@ -275,6 +277,7 @@
   Christoph Gerum
   Miguel de Val Borro
   Artur Lisiecki
+  afteryu
   Toni Mattis
   Laurens Van Houtven
   Bobby Impollonia
@@ -305,6 +308,7 @@
   Anna Katrina Dominguez
   Kim Jin Su
   Amber Brown
+  Anthony Sottile
   Nate Bragg
   Ben Darnell
   Juan Francisco Cantero Hurtado
@@ -325,12 +329,14 @@
   Mike Bayer
   Rodrigo Araújo
   Daniil Yarancev
+  Min RK
   OlivierBlanvillain
   Jonas Pfannschmidt
   Zearin
   Andrey Churin
   Dan Crosta
   reub...@gmail.com
+  Stanisław Halik
   Julien Phalip
   Roman Podoliaka
   Eli Stevens
diff --git a/lib-python/2.7/ctypes/test/test_byteswap.py b/lib-python/2.7/ctypes/test/test_byteswap.py
--- a/lib-python/2.7/ctypes/test/test_byteswap.py
+++ b/lib-python/2.7/ctypes/test/test_byteswap.py
@@ -23,7 +23,6 @@
             setattr(bits, "i%s" % i, 1)
             dump(bits)

-    @xfail
     def test_endian_short(self):
         if sys.byteorder == "little":
             self.assertIs(c_short.__ctype_le__, c_short)
@@ -51,7 +50,6 @@
         self.assertEqual(bin(s), "3412")
         self.assertEqual(s.value, 0x1234)

-    @xfail
     def test_endian_int(self):
         if sys.byteorder == "little":
             self.assertIs(c_int.__ctype_le__, c_int)
@@ -80,7 +78,6 @@
         self.assertEqual(bin(s), "78563412")
         self.assertEqual(s.value, 0x12345678)

-    @xfail
     def test_endian_longlong(self):
         if sys.byteorder == "little":
             self.assertIs(c_longlong.__ctype_le__, c_longlong)
@@ -109,7 +106,6 @@
         self.assertEqual(bin(s), "EFCDAB9078563412")
         self.assertEqual(s.value, 0x1234567890ABCDEF)

-    @xfail
     def test_endian_float(self):
         if sys.byteorder == "little":
             self.assertIs(c_float.__ctype_le__, c_float)
@@ -128,7 +124,6 @@
         self.assertAlmostEqual(s.value, math.pi, 6)
         self.assertEqual(bin(struct.pack(">f", math.pi)), bin(s))

-    @xfail
     def test_endian_double(self):
         if sys.byteorder == "little":
             self.assertIs(c_double.__ctype_le__, c_double)
@@ -156,7 +151,6 @@
         self.assertIs(c_char.__ctype_le__, c_char)
         self.assertIs(c_char.__ctype_be__, c_char)

-    @xfail
     def test_struct_fields_1(self):
         if sys.byteorder == "little":
             base = BigEndianStructure
@@ -192,7 +186,6 @@
                 pass
[pypy-commit] pypy unicode-utf8: hg merge default
Author: Armin Rigo
Branch: unicode-utf8
Changeset: r92242:367afaf4ad3a
Date: 2017-08-24 11:43 +0200
http://bitbucket.org/pypy/pypy/changeset/367afaf4ad3a/

Log: hg merge default

Manual merges may go wrong

diff too long, truncating to 2000 out of 88216 lines

diff --git a/.hgignore b/.hgignore
--- a/.hgignore
+++ b/.hgignore
@@ -1,6 +1,6 @@
 syntax: glob
 *.py[co]
-*.sw[po]
+*.sw[pon]
 *~
 .*.swp
 .idea
@@ -8,6 +8,8 @@
 .pydevproject
 __pycache__

+.cache/
+.gdb_history
 syntax: regexp
 ^testresult$
 ^site-packages$
@@ -23,16 +25,17 @@
 ^pypy/module/cpyext/test/.+\.manifest$
 ^pypy/module/test_lib_pypy/ctypes_tests/.+\.o$
 ^pypy/module/test_lib_pypy/ctypes_tests/_ctypes_test\.o$
-^pypy/module/cppyy/src/.+\.o$
-^pypy/module/cppyy/bench/.+\.so$
-^pypy/module/cppyy/bench/.+\.root$
-^pypy/module/cppyy/bench/.+\.d$
-^pypy/module/cppyy/src/.+\.errors$
-^pypy/module/cppyy/test/.+_rflx\.cpp$
-^pypy/module/cppyy/test/.+\.so$
-^pypy/module/cppyy/test/.+\.rootmap$
-^pypy/module/cppyy/test/.+\.exe$
-^pypy/module/cppyy/test/.+_cint.h$
+^pypy/module/_cppyy/src/.+\.o$
+^pypy/module/_cppyy/bench/.+\.so$
+^pypy/module/_cppyy/bench/.+\.root$
+^pypy/module/_cppyy/bench/.+\.d$
+^pypy/module/_cppyy/src/.+\.errors$
+^pypy/module/_cppyy/test/.+_rflx\.cpp$
+^pypy/module/_cppyy/test/.+\.so$
+^pypy/module/_cppyy/test/.+\.rootmap$
+^pypy/module/_cppyy/test/.+\.exe$
+^pypy/module/_cppyy/test/.+_cint.h$
+^pypy/module/_cppyy/.+/*\.pcm$
 ^pypy/module/test_lib_pypy/cffi_tests/__pycache__.+$
 ^pypy/doc/.+\.html$
 ^pypy/doc/config/.+\.rst$
@@ -49,6 +52,11 @@
 ^rpython/translator/goal/target.+-c$
 ^rpython/translator/goal/.+\.exe$
 ^rpython/translator/goal/.+\.dll$
+^rpython/rlib/rvmprof/src/shared/libbacktrace/Makefile$
+^rpython/rlib/rvmprof/src/shared/libbacktrace/config.guess$
+^rpython/rlib/rvmprof/src/shared/libbacktrace/config.h$
+^rpython/rlib/rvmprof/src/shared/libbacktrace/config.log$
+^rpython/rlib/rvmprof/src/shared/libbacktrace/config.status$
 ^pypy/goal/pypy-translation-snapshot$
 ^pypy/goal/pypy-c
 ^pypy/goal/.+\.exe$
@@ -60,6 +68,9 @@
 ^lib_pypy/ctypes_config_cache/_.+_cache\.py$
 ^lib_pypy/ctypes_config_cache/_.+_.+_\.py$
 ^lib_pypy/_libmpdec/.+.o$
+^lib_pypy/.+.c$
+^lib_pypy/.+.o$
+^lib_pypy/.+.so$
 ^pypy/doc/discussion/.+\.html$
 ^include/.+\.h$
 ^include/.+\.inl$
@@ -74,8 +85,7 @@
 ^rpython/doc/_build/.*$
 ^compiled
 ^.git/
-^.hypothesis/
+.hypothesis/
 ^release/
 ^rpython/_cache$

-pypy/module/cppyy/.+/*\.pcm
diff --git a/.hgtags b/.hgtags
--- a/.hgtags
+++ b/.hgtags
@@ -34,3 +34,9 @@
 050d84dd78997f021acf0e133934275d63547cc0 release-pypy2.7-v5.4.1
 0e2d9a73f5a1818d0245d75daccdbe21b2d5c3ef release-pypy2.7-v5.4.1
 aff251e543859ce4508159dd9f1a82a2f553de00 release-pypy2.7-v5.6.0
+fa3249d55d15b9829e1be69cdf45b5a44cec902d release-pypy2.7-v5.7.0
+b16a4363e930f6401bceb499b9520955504c6cb0 release-pypy3.5-v5.7.0
+1aa2d8e03cdfab54b7121e93fda7e98ea88a30bf release-pypy2.7-v5.7.1
+2875f328eae2216a87f3d6f335092832eb031f56 release-pypy3.5-v5.7.1
+c925e73810367cd960a32592dd7f728f436c125c release-pypy2.7-v5.8.0
+a37ecfe5f142bc971a86d17305cc5d1d70abec64 release-pypy3.5-v5.8.0
diff --git a/LICENSE b/LICENSE
--- a/LICENSE
+++ b/LICENSE
@@ -1,3 +1,5 @@
+#encoding utf-8
+
 License
 ===

@@ -37,14 +39,14 @@

   Armin Rigo
   Maciej Fijalkowski
-  Carl Friedrich Bolz
+  Carl Friedrich Bolz-Tereick
   Amaury Forgeot d'Arc
   Antonio Cuni
+  Matti Picus
   Samuele Pedroni
-  Matti Picus
+  Ronan Lamy
   Alex Gaynor
   Philip Jenvey
-  Ronan Lamy
   Brian Kearns
   Richard Plangger
   Michael Hudson
@@ -55,12 +57,12 @@
   Hakan Ardo
   Benjamin Peterson
   Anders Chrigstrom
+  Wim Lavrijsen
   Eric van Riet Paap
-  Wim Lavrijsen
   Richard Emslie
   Alexander Schremmer
+  Remi Meier
   Dan Villiom Podlaski Christiansen
-  Remi Meier
   Lukas Diekmann
   Sven Hager
   Anders Lehmann
@@ -83,8 +85,8 @@
   Lawrence Oluyede
   Bartosz Skowron
   Daniel Roberts
+  Adrien Di Mascio
   Niko Matsakis
-  Adrien Di Mascio
   Alexander Hesse
   Ludovic Aubry
   Jacob Hallen
@@ -99,278 +101,288 @@
   Vincent Legoll
   Michael Foord
   Stephan Diehl
+  Stefano Rivera
   Stefan Schwarzer
+  Tomek Meka
   Valentino Volonghi
-  Tomek Meka
-  Stefano Rivera
   Patrick Maupin
   Devin Jeanpierre
   Bob Ippolito
   Bruno Gola
   David Malcolm
   Jean-Paul Calderone
+  Squeaky
+  Edd Barrett
   Timo Paulssen
-  Edd Barrett
-  Squeaky
   Marius Gedminas
   Alexandre Fayolle
   Simon Burton
+  Nicolas Truessel
   Martin Matusiak
-  Nicolas Truessel
+  Laurence Tratt
+  Wenzhu Man
   Konstantin Lopuhin
-  Wenzhu Man
   John Witulski
-  Laurence Tratt
+  Greg Price
   Ivan Sichmann Freitas
-  Greg Price
   Dario Bertini
+  Jeremy Thurgood
   Mark Pearse
   Simon Cross
-  Jeremy Thurgood
+  Tobias Pape
   Andreas Stührk
-  Tobias Pape
   Jean-Philippe St. Pierre
   Guido van Rossum
   Pavel Vinogradov
   Paweł Piotr Przeradowski
+  William Leslie
+  marky1991
+  Ilya Osadchiy
+  Tobias Oberstein
   Paul deGrandis
-  Ilya Osadchiy
-