Re: [Python-Dev] Python ROCKS! Thanks guys! [anecdote]
On Thu, Oct 13, 2016 at 12:01 PM, Ryan Gonzalez wrote:
> On Wed, Oct 12, 2016 at 7:05 PM, Chris Angelico wrote:
>>
>> I work with a full-stack web development bootcamp. Most of the course
>> focuses on JavaScript (Node.js, React, jQuery, etc),
>
> Poor students... ;)

The bootcamp guarantees them employment upon graduation (subject to some constraints, yada yada), so it teaches those skills that are most likely to be employable. But one of those skills is learning new technologies, hence the freedom to pick anything they like.

> I think the craziest thing is probably that, based on how you said it, these
> two students haven't even begun to enter the entire Python standard library,
> which you'd have to download a zillion npm modules (like the glorious
> left-pad) in order to match. Once they realize that, they'll never be going
> back!

Yes, this is true; but pip/PyPI is still an important part of web dev in Python (eg Flask, SQLAlchemy, etc). My explanation to them is that the dependency tree in Python is not as deep as the equivalent in Ruby or JavaScript, but it's no less there. But Python definitely does offer a far richer standard library. It's the little things, sometimes:

    var choice = messages[Math.floor(Math.random() * messages.length)];

    choice = random.choice(messages)

The debate on whether it's worth sacrificing this kind of advantage in order to use the same language on both client and server is unlikely to be resolved any time soon. In the meantime, I'm very happy to be able to introduce a few people to the joy of Pythonning.

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python ROCKS! Thanks guys! [anecdote]
Thanks for passing this feedback along, Chris! It's always wonderful to see developers feeling empowered by the potential that open source tools offer them.

On 13 October 2016 at 11:01, Ryan Gonzalez wrote:
> Poor students... ;)

Folks, as tempting as it may be to make jokes at the expense of other programming languages, please try to ensure that references to them on the core Python lists are formulated on the basis of "What can we learn from their experiences?", rather than as generic putdowns of entire software development ecosystems.

Even as a lighthearted joke (as here), it isn't helpful to the design process to categorise programming languages as being generically "better" or "worse" than each other, rather than seeing them as embodiments of different ways of thinking about algorithmic problem solving.

In combination with the W3C HTML5 and CSS standardisation work, the JavaScript community have put together a superb set of tools for creating user interfaces that are independent of the backend API server implementation language, as well as useful tools for remote data access and data transformation pipelines. The fact that all this work is being done in the open and made freely available as open source software means that the Python community is able to benefit from these capabilities as much as anyone.

Regards,
Nick.

P.S. If anyone would like more background on why the "Our language is universally better than your language" approach can be problematic (even in jest!), please take a look at Aurynn Shaw's piece on Contempt Culture in programming communities and the barriers that can create to effective collaboration: http://blog.aurynn.com/contempt-culture

There's also my own http://www.curiousefficiency.org/posts/2015/10/languages-to-improve-your-python.html which looks at some other ways in which dismissing ecosystems out of hand can inhibit our ability to learn from both their mistakes and their successes.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor
On 13 October 2016 at 12:54, Nick Coghlan wrote:
> Method proliferation on builtins is a Big Deal(TM)

I wanted to quantify this concept, so here's a quick metric that helps convey how every time we add a new builtin method we're immediately making Python harder to comprehend:

    >>> def get_builtin_types():
    ...     import builtins
    ...     return {name: obj for name, obj in vars(builtins).items()
    ...             if isinstance(obj, type)
    ...             and not (name.startswith("__")
    ...                      or issubclass(obj, BaseException))}
    ...
    >>> len(get_builtin_types())
    26
    >>> def get_builtin_methods():
    ...     return [(name, method_name)
    ...             for name, obj in get_builtin_types().items()
    ...             for method_name, method in vars(obj).items()
    ...             if not method_name.startswith("__")]
    ...
    >>> len(get_builtin_methods())
    230

Putting special purpose functionality behind an import gate helps to provide a more explicit context of use (in this case, IO buffer manipulation) vs the relatively domain independent namespace that is the builtins.

Cheers,
Nick.

P.S. Since I was poking around in the builtins anyway, here are some other simple language complexity metrics:

    >>> import builtins
    >>> len(vars(builtins))
    151
    >>> def get_interpreter_builtins():
    ...     import builtins
    ...     return {name: obj for name, obj in vars(builtins).items()
    ...             if name.startswith("__")}
    ...
    >>> len(get_interpreter_builtins())
    8
    >>> def get_builtin_exceptions():
    ...     import builtins
    ...     return {name: obj for name, obj in vars(builtins).items()
    ...             if isinstance(obj, type)
    ...             and issubclass(obj, BaseException)}
    ...
    >>> len(get_builtin_exceptions())
    65
    >>> def get_builtin_functions():
    ...     import builtins
    ...     return {name: obj for name, obj in vars(builtins).items()
    ...             if isinstance(obj, type(repr))}
    ...
    >>> len(get_builtin_functions())
    42
    >>> def get_other_builtins():
    ...     import builtins
    ...     return {name: obj for name, obj in vars(builtins).items()
    ...             if not name.startswith("__")
    ...             and not isinstance(obj, (type, type(repr)))}
    ...
    >>> len(get_other_builtins())
    12

The "other" builtins are the builtin constants (None, True, False, Ellipsis, NotImplemented) and various artifacts from doing this at the interactive prompt (license, credits, copyright, quit, exit, help, "_")

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor
> Also, why do the conversion from bytearray to bytes? It is definitely not
> always needed.
>
> >>> ba = bytearray(b'abc')
> >>> b = b'def'
> >>> ba + b
> bytearray(b'abcdef')
> >>> b'%s %s' % (ba, b)
> b'abc def'
> >>> b + ba
> b'defabc'
> >>> ba.extend(b)
> >>> ba
> bytearray(b'abcdef')
>
> Even if it is sometimes needed, why do it always? The essence of read_line
> is to slice out a line, delete it from the buffer, and return the line. Let
> the caller explicitly convert when needed.
>
> --
> Terry Jan Reedy

Because it's a public module API. While bytearray is mostly API compatible (passes duck typing), isinstance(b, bytes) is False when b is a bytearray.

So I feel that changing the return type from bytes to bytearray is the last option. I want to return bytes if possible.

--
INADA Naoki
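The compatibility concern above can be seen in a few lines. This is only an illustrative sketch; the variable names are made up for the example and are not taken from asyncio or any other library.

```python
# Hypothetical example data: one line pulled out of a receive buffer.
line_as_bytearray = bytearray(b"GET / HTTP/1.1")
line_as_bytes = bytes(line_as_bytearray)

# Duck typing: read-only operations behave the same on both types.
both_start_with_get = (line_as_bytearray.startswith(b"GET")
                       and line_as_bytes.startswith(b"GET"))

# An explicit type check in caller code tells them apart.
bytearray_passes_check = isinstance(line_as_bytearray, bytes)
bytes_passes_check = isinstance(line_as_bytes, bytes)

# bytearray is mutable and therefore unhashable, so it cannot be a dict key.
try:
    {line_as_bytearray: 1}
except TypeError:
    bytearray_is_hashable = False
else:
    bytearray_is_hashable = True
```

Callers that only use duck typing would not notice a change of return type, but callers that use isinstance() checks or put the result in a dict or set would.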
Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor
On 13 October 2016 at 02:37, Stephen J. Turnbull wrote:
> Victor Stinner writes:
>  > 2016-10-12 11:34 GMT+02:00 INADA Naoki :
>  >
>  > > I see. My proposal should be another PEP (if PEP is required).
>  >
>  > I don't think that adding a single method deserves its own method.
>
> You mean "deserves own PEP", right? I interpreted Nick to say that
> "the reasons that applied to PEP 367 don't apply here, so you can Just
> Do It" (subject to the usual criteria for review, but omit the PEP).

Sort of. Adding this to PEP 467 doesn't make sense (as it's not related to easing migration from Python 2 or addressing the mutable->immutable design legacy), but I don't have an opinion yet on whether this should be a PEP or not - that really depends on whether we tackle it as an implementation detail of asyncio, or as a public API in its own right.

Method proliferation on builtins is a Big Deal(TM), and efficient buffer management for IO protocol development is a relatively arcane speciality (as well as one where there are dedicated OS level capabilities we may want to exploit some day), which is why I think a dedicated helper module is likely a better way to go. For example:

- add `asyncio._iobuffers` as a pure Python memoryview based implementation of the desired buffer management semantics
- add `_iobuffers` as an optional asyncio independent accelerator module for `asyncio._iobuffers`

If that works out satisfactorily, *then* consider a PEP to either make `iobuffers` a public module in its own right (ala the `selectors` module from the original asyncio implementation), or to expose some of its features directly via the builtin binary data types.

The logical leap I strongly disagree with is going straight from "asyncio needs some better low level IO buffer manipulation primitives" to "we should turn the builtin types into low level IO buffer manipulation primitives that are sufficient for asyncio's needs". The notion of "we shouldn't need to define our own domain specific helper libraries" isn't a given for standard library modules any more than it is for 3rd party ones.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Python ROCKS! Thanks guys! [anecdote]
On Wed, Oct 12, 2016 at 7:05 PM, Chris Angelico wrote:
> I work with a full-stack web development bootcamp. Most of the course
> focuses on *JavaScript (Node.js, React, jQuery, etc),*

Poor students... ;)

> but there's a
> one-week period in which each student gets to pick some technology to
> learn, and at the end of the week, demos to the group some project
> s/he has mastered. Two chose to learn Python, and I've been mentoring
> them through this week.
>
> The comments from each of them have been fairly glowing. Python is
> this incredible thing that has immense power and flexibility;
> significant-whitespace hasn't been a cause of confusion (not even
> mentioned after the first day); and
>
> The most notable features of Python, for these two
> JS-only-up-until-now guys, are the simplicity of the 'for' loop
> (including that you don't need lots of different forms - you can
> iterate over a dictionary without having to learn some new type of
> loop), the list comprehension, and metaprogramming - mainly function
> decorator syntax. And both of them are starting to talk about being
> "converts" to Python :)
>
> Great job, all. Not that it's particularly difficult to compete with a
> language that was originally designed and developed in under two
> weeks, but still. :D

I think the craziest thing is probably that, based on how you said it, these two students haven't even begun to enter the entire Python standard library, which you'd have to download a zillion npm modules (like the glorious left-pad) in order to match. Once they realize that, they'll never be going back!

> ChrisA

--
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your program. Something’s wrong.
http://kirbyfan64.github.io/
[Python-Dev] Python ROCKS! Thanks guys! [anecdote]
I work with a full-stack web development bootcamp. Most of the course focuses on JavaScript (Node.js, React, jQuery, etc), but there's a one-week period in which each student gets to pick some technology to learn, and at the end of the week, demos to the group some project s/he has mastered. Two chose to learn Python, and I've been mentoring them through this week.

The comments from each of them have been fairly glowing. Python is this incredible thing that has immense power and flexibility; significant-whitespace hasn't been a cause of confusion (not even mentioned after the first day); and

The most notable features of Python, for these two JS-only-up-until-now guys, are the simplicity of the 'for' loop (including that you don't need lots of different forms - you can iterate over a dictionary without having to learn some new type of loop), the list comprehension, and metaprogramming - mainly function decorator syntax. And both of them are starting to talk about being "converts" to Python :)

Great job, all. Not that it's particularly difficult to compete with a language that was originally designed and developed in under two weeks, but still. :D

ChrisA
Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)
On Wed, Oct 12, 2016 at 3:28 AM, INADA Naoki wrote:
> When Tornado drop Python 2.7 support, they can use bytearray, and
> iostream can be more simple and fast.

FYI 2.7 does have bytearray. (You still have to implement the O(1) deletion part as a layer on top, like Victor points out, but I suspect that'd still be dramatically simpler than what they're doing now...)

-n

--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)
On Wed, Oct 12, 2016 at 4:55 AM, Victor Stinner wrote:
> 2016-10-12 10:01 GMT+02:00 Nathaniel Smith :
>> It's more complicated than that -- the right algorithm is the one that
>> Antoine implemented in 3.4.
>> (...)
>> My point is that
>> forcing everyone who writes network code in Python to do that is
>> silly, especially given that CPython's apparently been shipping this
>> feature for years.
>
> "For years" means since March 2014, Python 3.4.0 release, so 2 years ago.
>
> We can document the optimization as a CPython implementation detail
> and explain that it's only in Python >= 3.4.
>
> So an application which should work on Python 2.7 as well cannot rely
> on this optimization for example.

The proposal is that it should be documented as being part of the language spec starting in 3.4 (or whatever). So applications that support Python 2.7 can't rely on it, sure. But if I have an application that requires, say, 3.5+ but I don't want to depend on CPython-only implementation details, then I'm still allowed to use it.

AFAIK basically the only project that would be affected by this is PyPy, and when I asked on #pypy they said:

    njs`: I think we either plan to or already support this

so I'm not sure why this is controversial.

-n

--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor
On 10/12/2016 5:42 AM, INADA Naoki wrote:
> On Wed, Oct 12, 2016 at 2:32 PM, Serhiy Storchaka wrote:
>> On 12.10.16 07:08, INADA Naoki wrote:
>>>
>>> Sample code:
>>>
>>>     def read_line(buf: bytearray) -> bytes:
>>>         try:
>>>             n = buf.index(b'\r\n')
>>>         except ValueError:
>>>             return b''
>>>
>>>         line = bytes(buf)[:n]  # bytearray -> bytes -> bytes
>>
>> Wouldn't be more correct to write this as bytes(buf[:n])?
>
> Yes, you're right!
> I shouldn't copy whole data only for cast from bytearray to byte.

Also, why do the conversion from bytearray to bytes? It is definitely not always needed.

>>> ba = bytearray(b'abc')
>>> b = b'def'
>>> ba + b
bytearray(b'abcdef')
>>> b'%s %s' % (ba, b)
b'abc def'
>>> b + ba
b'defabc'
>>> ba.extend(b)
>>> ba
bytearray(b'abcdef')

Even if it is sometimes needed, why do it always? The essence of read_line is to slice out a line, delete it from the buffer, and return the line. Let the caller explicitly convert when needed.

--
Terry Jan Reedy
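The sample code quoted in this message stops after the slice. A possible completion of read_line along the lines described here (slice out a line, delete it from the buffer, return it) might look like the sketch below. It is not the code from asyncio or from the original proposal, and it keeps the bytes return type only because that is what the sample's annotation says.

```python
def read_line(buf: bytearray) -> bytes:
    """Remove and return the first CRLF-terminated line of buf, without the CRLF.

    Returns b'' and leaves buf unchanged when no complete line is buffered.
    Note that an empty line also yields b'', so callers that need to tell the
    two cases apart would need a different signature.
    """
    try:
        n = buf.index(b"\r\n")
    except ValueError:
        return b""
    line = bytes(buf[:n])  # copies the line only, not the whole buffer
    del buf[:n + 2]        # drop the line and its CRLF from the front
    return line


buf = bytearray(b"foo\r\nbar\r\nba")
first = read_line(buf)
second = read_line(buf)
third = read_line(buf)  # only a partial line is left
```

After the three calls, buf still holds the unterminated remainder, ready for more data to be appended.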
Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor
Oops, right, I wanted to write "I don't think that adding a single method deserves its own PEP."

Victor

2016-10-12 18:37 GMT+02:00 Stephen J. Turnbull:
> Victor Stinner writes:
>  > 2016-10-12 11:34 GMT+02:00 INADA Naoki :
>  >
>  > > I see. My proposal should be another PEP (if PEP is required).
>  >
>  > I don't think that adding a single method deserves its own method.
>
> You mean "deserves own PEP", right? I interpreted Nick to say that
> "the reasons that applied to PEP 367 don't apply here, so you can Just
> Do It" (subject to the usual criteria for review, but omit the PEP).
>
> I'm not sure whether he was channeling Guido or that should be
> qualified with an IMO or IMHO.
Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor
Victor Stinner writes:
 > 2016-10-12 11:34 GMT+02:00 INADA Naoki:
 >
 > > I see. My proposal should be another PEP (if PEP is required).
 >
 > I don't think that adding a single method deserves its own method.

You mean "deserves own PEP", right? I interpreted Nick to say that "the reasons that applied to PEP 367 don't apply here, so you can Just Do It" (subject to the usual criteria for review, but omit the PEP).

I'm not sure whether he was channeling Guido or that should be qualified with an IMO or IMHO.
Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467
On Oct 11 2016, Nathaniel Smith wrote:
> On Tue, Oct 11, 2016 at 9:08 PM, INADA Naoki wrote:
>> From Python 3.4, bytearray is good solution for I/O buffer, thanks to
>> #19087 [1].
>> Actually, asyncio uses bytearray as I/O buffer often.
>
> Whoa what?! This is awesome, I had no idea that bytearray had O(1)
> deletes at the front. I literally reimplemented this myself on top of
> bytearray for some 3.5-only code recently because I assumed bytearray
> had the same asymptotics as list, and AFAICT this is totally
> undocumented.

Indeed, same here.

Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)
2016-10-12 10:01 GMT+02:00 Nathaniel Smith:
> It's more complicated than that -- the right algorithm is the one that
> Antoine implemented in 3.4.
> (...)
> My point is that
> forcing everyone who writes network code in Python to do that is
> silly, especially given that CPython's apparently been shipping this
> feature for years.

"For years" means since March 2014, Python 3.4.0 release, so 2 years ago.

We can document the optimization as a CPython implementation detail and explain that it's only in Python >= 3.4.

So an application which should work on Python 2.7 as well cannot rely on this optimization for example.

Victor
Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor
2016-10-12 11:34 GMT+02:00 INADA Naoki:
> I see. My proposal should be another PEP (if PEP is required).

I don't think that adding a single method deserves its own method.

I like the idea with Serhiy's API (as Python 2 buffer constructor):

    bytes.frombuf(buffer, [offset, size])
    bytearray.frombuf(buffer, [offset, size])
    memoryview.frombuf(buffer, [offset, size])

Victor
Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)
>> [1] My use case is parsing HTTP out of a receive buffer. If deleting
>> the first k bytes of an N byte buffer is O(N), then not only does
>> parsing becomes O(N^2) in the worst case, but it's the sort of O(N^2)
>> that random untrusted network clients can trigger at will to DoS your
>> server.
>
> Deleting from buffer can be avoided if pass the starting index together with
> the buffer. For example:
>
> def read_line(buf: bytes, start: int) -> (bytes, int):
>     try:
>         end = buf.index(b'\r\n', start)
>     except ValueError:
>         return b'', start
>
>     return buf[start:end], end+2

In the case of asyncio, we can't assume the order of append and consume. For example, when stream processing an HTTP chunked response, appending to the receive buffer and consuming a chunk from the buffer can happen in arbitrary order.

That's why bytes is not good for a receive buffer. Efficient append is a "must have".

For example, Tornado implements its receive buffer as a deque of bytes. See this code:
https://github.com/tornadoweb/tornado/blob/master/tornado/iostream.py#L784-L817

When Tornado drops Python 2.7 support, they can use bytearray, and iostream can be simpler and faster.

So I hope "amortized O(1) deletion from the front" becomes part of the language spec, at least for Python 3.5+

--
INADA Naoki
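A rough sketch of the kind of receive buffer described here, where appends and consumes interleave in arbitrary order. The class name ReceiveBuffer and its methods are hypothetical and are not taken from asyncio or Tornado; the sketch relies only on bytearray's in-place append and front deletion, whose cost is what this thread is debating.

```python
class ReceiveBuffer:
    """Hypothetical line-oriented receive buffer backed by one bytearray."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes) -> None:
        # In-place append; may be called at any time between reads.
        self._buf += data

    def next_line(self):
        """Return the next CRLF-terminated line, or None if incomplete."""
        end = self._buf.find(b"\r\n")
        if end < 0:
            return None
        line = bytes(self._buf[:end])
        # Deletion from the front; cheap on CPython >= 3.4 per this thread.
        del self._buf[:end + 2]
        return line


rb = ReceiveBuffer()
rb.feed(b"5\r\nhel")       # a complete line plus a partial one
size_line = rb.next_line()  # the complete line
incomplete = rb.next_line() # nothing complete yet
rb.feed(b"lo\r\n")          # more data arrives later
body_line = rb.next_line()
```

Returning None for "no complete line yet" is a design choice for this sketch; it keeps an empty line (b'') distinguishable from missing data.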
Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor
On Wed, Oct 12, 2016 at 2:32 PM, Serhiy Storchaka wrote:
> On 12.10.16 07:08, INADA Naoki wrote:
>>
>> Sample code:
>>
>>     def read_line(buf: bytearray) -> bytes:
>>         try:
>>             n = buf.index(b'\r\n')
>>         except ValueError:
>>             return b''
>>
>>         line = bytes(buf)[:n]  # bytearray -> bytes -> bytes
>
> Wouldn't be more correct to write this as bytes(buf[:n])?

Yes, you're right!
I shouldn't copy whole data only for cast from bytearray to byte.

>> Adding one more constructor to bytes:
>>
>>     # when length=-1 (default), use until end of *byteslike*.
>>     bytes.frombuffer(byteslike, length=-1, offset=0)
>
> This interface looks unusual. Would not be better to support the interface
> of buffer in Python 2: buffer(object [, offset[, size]])?

It looks better.

(Actually speaking, I love deprecated old buffer for simplicity. memoryview supports non bytes-like complex data types.)

Thanks,

--
INADA Naoki
Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor
On Wed, Oct 12, 2016 at 2:07 PM, Nick Coghlan wrote:
> I don't think it makes sense to add any more ideas to PEP 467. That
> needed to be a PEP because it proposed breaking backwards
> compatibility in a couple of areas, and because of the complex history
> of Python 3's "bytes-as-tuple-of-ints" and Python 2's "bytes-as-str"
> semantics.
>
> Other enhancements to the binary data handling APIs in Python 3 can be
> considered on their own merits.

I see. My proposal should be another PEP (if PEP is required).

>> * It isn't "one obvious way": Developers including me may forget to
>> use context manager.
>> And since it works on CPython, it's hard to point it out.
>
> To add to the confusion, there's also
> https://docs.python.org/3/library/stdtypes.html#memoryview.tobytes
> giving:
>
>     line = memoryview(buf)[:n].tobytes()
>
> However, folks *do* need to learn that many mutable data types will
> lock themselves against modification while you have a live memory view
> on them, so it's important to release views promptly and reliably when
> we don't need them any more.

I agree. io.TextIOWrapper objects report a ResourceWarning for an unclosed file. I think the same warning for unclosed memoryview objects may help developers.

>> Quick benchmark:
>>
>> (temporary bytes)
>> $ python3 -m perf timeit -s 'buf = bytearray(b"foo\r\nbar\r\nbaz\r\n")' -- 'bytes(buf)[:3]'
>>
>> Median +- std dev: 652 ns +- 19 ns
>>
>> (temporary memoryview without "with")
>> $ python3 -m perf timeit -s 'buf = bytearray(b"foo\r\nbar\r\nbaz\r\n")' -- 'bytes(memoryview(buf)[:3])'
>>
>> Median +- std dev: 886 ns +- 26 ns
>>
>> (temporary memoryview with "with")
>> $ python3 -m perf timeit -s 'buf = bytearray(b"foo\r\nbar\r\nbaz\r\n")' -- '
>> with memoryview(buf) as m:
>>     bytes(m[:3])
>> '
>>
>> Median +- std dev: 1.11 us +- 0.03 us
>
> This is normal though, as memory views trade lower O(N) costs (reduced
> data copying) for higher O(1) setup costs (creating and managing the
> view, indirection for data access).

Yes. When data is small, the benefit of less data copying can easily be hidden.

One big difficulty of I/O frameworks like asyncio is that we can't assume the data size. A framework should be optimized for both many small chunks and large data.

With memoryview, when we optimize for large data (e.g. downloading a large file), performance for massive amounts of small data (e.g. a small JSON API) becomes worse. Actually, one pull request gave up on using memoryview because of it:

https://github.com/python/asyncio/pull/395#issuecomment-249044218

>> Proposed solution
>> ===
>>
>> Adding one more constructor to bytes:
>>
>>     # when length=-1 (default), use until end of *byteslike*.
>>     bytes.frombuffer(byteslike, length=-1, offset=0)
>>
>> With this API
>>
>>     with memoryview(buf) as m:
>>         line = bytes(m[:n])
>>
>> becomes
>>
>>     line = bytes.frombuffer(buf, n)
>
> Does that need to be a method on the builtin rather than a separate
> helper function, though? Once you define:
>
>     def snapshot(buf, length=None, offset=0):
>         with memoryview(buf) as m:
>             return m[offset:length].tobytes()
>
> then that can be replaced by a more optimised C implementation without
> users needing to care about the internal details.

I'm thinking about adding such a helper function to the asyncio speedup C extension. But there are some other non-blocking I/O frameworks: Tornado, Twisted, and curio.

And relying on a C extension makes it harder to optimize for other Python implementations. If it is in the standard library, PyPy and other Python implementations can optimize it.

> That is, getting back to a variant on one of Serhiy's suggestions in
> the last PEP 467 discussion, it may make sense for us to offer a
> "buffertools" library that's specifically aimed at supporting
> efficient buffer manipulation operations that minimise data copying.
> The pure Python implementations would work entirely through
> memoryview, but we could also have selected C accelerated operations
> if that showed a noticeable improvement on asyncio's benchmarks.

It seems a nice idea. I'll read the discussion.

> Regards,
> Nick.
>
> P.S. The length/offset API design is also problematic due to the way
> it differs from range() & slice(), but I don't think it makes sense to
> get into that kind of detail before discussing the larger question of
> adding a new helper module for working efficiently with memory buffers
> vs further widening the method API for the builtin bytes type
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia

I avoided the slice API intentionally, because if it looks like slice, someone will propose adding `step` support only for consistency.

But, as Serhiy said, being consistent with the old buffer API is nice.

--
INADA Naoki
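For reference, a pure Python helper of the kind discussed in this message could be sketched as follows. The name snapshot comes from the quoted suggestion, but the (offset, size) parameter order follows the old buffer() style mentioned elsewhere in the thread and is an assumption, as is the byte-oriented input. The second half shows the locking behaviour mentioned above: a bytearray cannot be resized while a memoryview on it is live.

```python
def snapshot(buf, offset=0, size=None):
    """Copy `size` bytes of `buf` starting at `offset` into a new bytes object.

    Assumes a byte-oriented buffer such as bytes or bytearray. Both views are
    released before returning, so `buf` can be resized again afterwards.
    """
    end = None if size is None else offset + size
    with memoryview(buf) as m, m[offset:end] as part:
        return part.tobytes()


buf = bytearray(b"foo\r\nbar\r\n")
first = snapshot(buf, 0, 3)
second = snapshot(buf, 5, 3)
rest = snapshot(buf, 5)

# A live memoryview locks the bytearray against resizing.
view = memoryview(buf)
try:
    buf.extend(b"baz")
except BufferError:
    locked_while_viewed = True
else:
    locked_while_viewed = False
view.release()
buf.extend(b"baz")  # works once the view has been released
```

Releasing the views explicitly, rather than relying on reference counting, is what keeps the helper well behaved on implementations other than CPython.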
Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)
On Wed, Oct 12, 2016 at 12:17 AM, Serhiy Storchaka wrote:
> On 12.10.16 09:31, Nathaniel Smith wrote:
>>
>> But amortized O(1) deletes from the front of bytearray are totally
>> different, and more like amortized O(1) appends to list: there are
>> important use cases[1] that simply cannot be implemented without some
>> feature like this, and putting the implementation inside bytearray is
>> straightforward, deterministic, and more efficiently than hacking
>> together something on top. Python should just guarantee it, IMO.
>>
>> -n
>>
>> [1] My use case is parsing HTTP out of a receive buffer. If deleting
>> the first k bytes of an N byte buffer is O(N), then not only does
>> parsing becomes O(N^2) in the worst case, but it's the sort of O(N^2)
>> that random untrusted network clients can trigger at will to DoS your
>> server.
>
> Deleting from buffer can be avoided if pass the starting index together with
> the buffer. For example:
>
> def read_line(buf: bytes, start: int) -> (bytes, int):
>     try:
>         end = buf.index(b'\r\n', start)
>     except ValueError:
>         return b'', start
>
>     return buf[start:end], end+2

It's more complicated than that -- the right algorithm is the one that Antoine implemented in 3.4.

But yes, having implemented this by hand, I am aware that it can be implemented by hand :-). My point is that forcing everyone who writes network code in Python to do that is silly, especially given that CPython's apparently been shipping this feature for years.

-n

--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)
On 12.10.16 09:31, Nathaniel Smith wrote:
> But amortized O(1) deletes from the front of bytearray are totally
> different, and more like amortized O(1) appends to list: there are
> important use cases[1] that simply cannot be implemented without some
> feature like this, and putting the implementation inside bytearray is
> straightforward, deterministic, and more efficiently than hacking
> together something on top. Python should just guarantee it, IMO.
>
> -n
>
> [1] My use case is parsing HTTP out of a receive buffer. If deleting
> the first k bytes of an N byte buffer is O(N), then not only does
> parsing becomes O(N^2) in the worst case, but it's the sort of O(N^2)
> that random untrusted network clients can trigger at will to DoS your
> server.

Deleting from buffer can be avoided if pass the starting index together with the buffer. For example:

def read_line(buf: bytes, start: int) -> (bytes, int):
    try:
        end = buf.index(b'\r\n', start)
    except ValueError:
        return b'', start

    return buf[start:end], end+2
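As a usage sketch for the index-passing approach above: the function body is the one from this message (with a plain return annotation omitted), while the sample data and the driving loop are made up for illustration. Comparing the returned position with the one passed in is what separates "empty line" from "no complete line yet", since both return b''.

```python
def read_line(buf: bytes, start: int):
    """Return (line, next_start); next_start == start means no full line."""
    try:
        end = buf.index(b"\r\n", start)
    except ValueError:
        return b"", start
    return buf[start:end], end + 2


# Hypothetical input: two header lines, an empty line, then a partial line.
data = b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\npartial"

lines = []
pos = 0
while True:
    line, new_pos = read_line(data, pos)
    if new_pos == pos:  # position did not advance: nothing complete is left
        break
    lines.append(line)
    pos = new_pos

remainder = data[pos:]  # unconsumed bytes, never deleted from `data`
```

Nothing is removed from the buffer here; the caller only tracks an offset, which is the trade-off the rest of the thread argues about once new data has to be appended.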
[Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)
On Tue, Oct 11, 2016 at 10:53 PM, Serhiy Storchaka wrote:
> On 12.10.16 08:03, Nathaniel Smith wrote:
>>
>> On Tue, Oct 11, 2016 at 9:08 PM, INADA Naoki
>> wrote:
>>>
>>> From Python 3.4, bytearray is good solution for I/O buffer, thanks to
>>> #19087 [1].
>>> Actually, asyncio uses bytearray as I/O buffer often.
>>
>> Whoa what?! This is awesome, I had no idea that bytearray had O(1)
>> deletes at the front. I literally reimplemented this myself on top of
>> bytearray for some 3.5-only code recently because I assumed bytearray
>> had the same asymptotics as list, and AFAICT this is totally
>> undocumented. Shouldn't we write this down somewhere? Maybe here? ->
>> https://docs.python.org/3/library/functions.html#bytearray
>
> I afraid this is CPython implementation detail (like string concatenation
> optimization). Other implementations can have O(N) deletes at the front of
> bytearray.

Well, it shouldn't be :-).

The problem with the string concatenation optimization is that to work, it requires both the use of refcounting GC and that you get lucky with how the underlying malloc implementation happened to lay things out in memory. Obviously it shouldn't be part of the language spec.

But amortized O(1) deletes from the front of bytearray are totally different, and more like amortized O(1) appends to list: there are important use cases[1] that simply cannot be implemented without some feature like this, and putting the implementation inside bytearray is straightforward, deterministic, and more efficient than hacking together something on top. Python should just guarantee it, IMO.

-n

[1] My use case is parsing HTTP out of a receive buffer. If deleting the first k bytes of an N byte buffer is O(N), then not only does parsing become O(N^2) in the worst case, but it's the sort of O(N^2) that random untrusted network clients can trigger at will to DoS your server.

--
Nathaniel J. Smith -- https://vorpus.org