Re: [Python-Dev] Python ROCKS! Thanks guys! [anecdote]

2016-10-12 Thread Chris Angelico
On Thu, Oct 13, 2016 at 12:01 PM, Ryan Gonzalez  wrote:
> On Wed, Oct 12, 2016 at 7:05 PM, Chris Angelico  wrote:
>>
>> I work with a full-stack web development bootcamp. Most of the course
>> focuses on JavaScript (Node.js, React, jQuery, etc),
>
>
> Poor students... ;)

The bootcamp guarantees them employment upon graduation (subject to
some constraints, yada yada), so it teaches those skills that are most
likely to be employable. But one of those skills is learning new
technologies, hence the freedom to pick anything they like.

> I think the craziest thing is probably that, based on how you said it, these
> two students haven't even begun to enter the entire Python standard library,
> which you'd have to download a zillion npm modules (like the glorious
> left-pad) in order to match. Once they realize that, they'll never be going
> back!

Yes, this is true; but pip/PyPI is still an important part of web dev
in Python (eg Flask, SQLAlchemy, etc). My explanation to them is that
the dependency tree in Python is not as deep as the equivalent in Ruby
or JavaScript, but it's no less there. But Python definitely does
offer a far richer standard library. It's the little things,
sometimes:

var choice = messages[Math.floor(Math.random() * messages.length)];

choice = random.choice(messages)
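
A few more of these "little things", as a minimal sketch. The `messages` list and the sample strings are made up for illustration; only standard library calls are used.

```python
import random
import textwrap

# Hypothetical sample data, not taken from the original message.
messages = ["Hello", "Howdy", "G'day"]

# Pick one element uniformly at random; no index arithmetic required.
choice = random.choice(messages)

# Left-padding is a plain str method, so no third-party module is needed.
padded = "42".rjust(5, "0")

# Wrapping text to a fixed width also ships with the standard library.
wrapped = textwrap.wrap("The quick brown fox jumps over the lazy dog", width=20)

print(choice, padded, wrapped)
```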

The debate on whether it's worth sacrificing this kind of advantage in
order to use the same language on both client and server is unlikely
to be resolved any time soon. In the meantime, I'm very happy to be
able to introduce a few people to the joy of Pythonning.

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python ROCKS! Thanks guys! [anecdote]

2016-10-12 Thread Nick Coghlan
Thanks for passing this feedback along, Chris! It's always wonderful
to see developers feeling empowered by the potential that open source
tools offer them.

On 13 October 2016 at 11:01, Ryan Gonzalez  wrote:
> Poor students... ;)

Folks, as tempting as it may be to make jokes at the expense of other
programming languages, please try to ensure that references to them on
the core Python lists are formulated on the basis of "What can we
learn from their experiences?", rather than as generic putdowns of
entire software development ecosystems. Even as a lighthearted joke
(as here), it isn't helpful to the design process to categorise
programming languages as being generically "better" or "worse" than
each other, rather than seeing them as embodiments of different ways
of thinking about algorithmic problem solving.

In combination with the W3C HTML5 and CSS standardisation work, the
JavaScript community have put together a superb set of tools for
creating user interfaces that are independent of the backend API
server implementation language, as well as useful tools for remote
data access and data transformation pipelines. The fact that all this
work is being done in the open and made freely available as open
source software means that the Python community is able to benefit
from these capabilities as much as anyone.

Regards,
Nick.

P.S. If anyone would like more background on why the "Our language is
universally better than your language" approach can be problematic
(even in jest!), please take a look at Aurynn Shaw's piece on Contempt
Culture in programming communities and the barriers that can create to
effective collaboration: http://blog.aurynn.com/contempt-culture

There's also my own
http://www.curiousefficiency.org/posts/2015/10/languages-to-improve-your-python.html
which looks at some other ways in which dismissing ecosystems out of
hand can inhibit our ability to learn from both their mistakes and
their successes.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor

2016-10-12 Thread Nick Coghlan
On 13 October 2016 at 12:54, Nick Coghlan  wrote:
> Method proliferation on builtins is a Big Deal(TM)

I wanted to quantify this concept, so here's a quick metric that helps
convey how every time we add a new builtin method we're immediately
making Python harder to comprehend:

>>> def get_builtin_types():
...     import builtins
...     return {name: obj for name, obj in vars(builtins).items()
...             if isinstance(obj, type) and not (name.startswith("__") or
...             issubclass(obj, BaseException))}
...
>>> len(get_builtin_types())
26
>>> def get_builtin_methods():
...     return [(name, method_name) for name, obj in
...             get_builtin_types().items() for method_name, method in
...             vars(obj).items() if not method_name.startswith("__")]
...
>>> len(get_builtin_methods())
230

Putting special purpose functionality behind an import gate helps to
provide a more explicit context of use (in this case, IO buffer
manipulation) vs the relatively domain independent namespace that is
the builtins.
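
To see where that method count comes from, the same metric can be broken down per type. This is only a sketch along the lines of the session above: `public_methods_by_type` is a hypothetical helper name, it counts every public attribute in `vars()` (not just methods), and the exact numbers vary between Python versions.

```python
import builtins

def public_methods_by_type():
    """Map each non-exception builtin type name to its public attribute names."""
    result = {}
    for name, obj in vars(builtins).items():
        if not isinstance(obj, type) or name.startswith("__"):
            continue
        if issubclass(obj, BaseException):
            continue
        result[name] = sorted(m for m in vars(obj) if not m.startswith("__"))
    return result

counts = {name: len(methods) for name, methods in public_methods_by_type().items()}

# Show the five types with the largest public surface.
for name, count in sorted(counts.items(), key=lambda item: item[1], reverse=True)[:5]:
    print(name, count)
```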

Cheers,
Nick.

P.S. Since I was poking around in the builtins anyway, here are some
other simple language complexity metrics:

>>> import builtins
>>> len(vars(builtins))
151
>>> def get_interpreter_builtins():
...     import builtins
...     return {name: obj for name, obj in vars(builtins).items()
...             if name.startswith("__")}
...
>>> len(get_interpreter_builtins())
8
>>> def get_builtin_exceptions():
...     import builtins
...     return {name: obj for name, obj in vars(builtins).items()
...             if isinstance(obj, type) and issubclass(obj, BaseException)}
...
>>> len(get_builtin_exceptions())
65
>>> def get_builtin_functions():
...     import builtins
...     return {name: obj for name, obj in vars(builtins).items()
...             if isinstance(obj, type(repr))}
...
>>> len(get_builtin_functions())
42
>>> def get_other_builtins():
...     import builtins
...     return {name: obj for name, obj in vars(builtins).items()
...             if not name.startswith("__") and not isinstance(obj, (type,
...             type(repr)))}
...
>>> len(get_other_builtins())
12

The "other" builtins are the builtin constants (None, True, False,
Ellipsis, NotImplemented) and various artifacts from doing this at the
interactive prompt (license, credits, copyright, quit, exit, help,
"_")

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor

2016-10-12 Thread INADA Naoki
>
> Also, why do the conversion from bytearray to bytes?  It is definitely not
> always needed.
>
> >>> ba = bytearray(b'abc')
> >>> b = b'def'
> >>> ba + b
> bytearray(b'abcdef')
> >>> b'%s %s' % (ba, b)
> b'abc def'
> >>> b + ba
> b'defabc'
> >>> ba.extend(b)
> >>> ba
> bytearray(b'abcdef')
>
> Even if it is sometimes needed, why do it always?  The essence of read_line
> is to slice out a line, delete it from the buffer, and return the line.  Let
> the caller explicitly convert when needed.
>
> --
> Terry Jan Reedy
>

Because it's a public module API.

While bytearray is mostly API compatible with bytes (it passes duck typing),
isinstance(b, bytes) is False when b is a bytearray.

So I feel that changing the return type from bytes to bytearray is a last resort.
I want to return bytes if possible.
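
A minimal sketch of that compatibility gap. The `describe` function is a hypothetical caller, used only to show an exact type check:

```python
def describe(payload):
    """Hypothetical caller that branches on the exact type."""
    if isinstance(payload, bytes):
        return "bytes"
    return "not bytes"

line = bytearray(b"GET / HTTP/1.1")

# bytearray supports most bytes methods, so duck-typed callers keep working.
assert line.startswith(b"GET")
assert line.split() == [b"GET", b"/", b"HTTP/1.1"]

# Type checks and hashing are where the two types diverge.
print(describe(line))         # not bytes
print(describe(bytes(line)))  # bytes
try:
    {line: 1}
except TypeError as exc:
    print("unhashable:", exc)
```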

-- 
INADA Naoki  


Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor

2016-10-12 Thread Nick Coghlan
On 13 October 2016 at 02:37, Stephen J. Turnbull
 wrote:
> Victor Stinner writes:
>  > 2016-10-12 11:34 GMT+02:00 INADA Naoki :
>
>  > > I see.  My proposal should be another PEP (if PEP is required).
>  >
>  > I don't think that adding a single method deserves its own method.
>
> You mean "deserves own PEP", right?  I interpreted Nick to say that
> "the reasons that applied to PEP 467 don't apply here, so you can Just
> Do It" (subject to the usual criteria for review, but omit the PEP).

Sort of. Adding this to PEP 467 doesn't make sense (as it's not
related to easing migration from Python 2 or addressing the
mutable->immutable design legacy), but I don't have an opinion yet on
whether this should be a PEP or not - that really depends on whether
we tackle it as an implementation detail of asyncio, or as a public
API in its own right.

Method proliferation on builtins is a Big Deal(TM), and efficient
buffer management for IO protocol development is a relatively arcane
speciality (as well as one where there are dedicated OS level
capabilities we may want to exploit some day), which is why I think a
dedicated helper module is likely a better way to go. For example:

- add `asyncio._iobuffers` as a pure Python memoryview based
implementation of the desired buffer management semantics
- add `_iobuffers` as an optional asyncio independent accelerator
module for `asyncio._iobuffers`

If that works out satisfactorily, *then* consider a PEP to either make
`iobuffers` a public module in its own right (ala the `selectors`
module from the original asyncio implementation), or to expose some of
its features directly via the builtin binary data types.

The logical leap I strongly disagree with is going straight from
"asyncio needs some better low level IO buffer manipulation
primitives" to "we should turn the builtin types into low level IO
buffer manipulation primitives that are sufficient for asyncio's
needs". The notion of "we shouldn't need to define our own domain
specific helper libraries" isn't a given for standard library modules
any more than it is for 3rd party ones.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Python ROCKS! Thanks guys! [anecdote]

2016-10-12 Thread Ryan Gonzalez
On Wed, Oct 12, 2016 at 7:05 PM, Chris Angelico  wrote:

> I work with a full-stack web development bootcamp. Most of the course
> focuses on *JavaScript (Node.js, React, jQuery, etc),*


Poor students... ;)


> but there's a
> one-week period in which each student gets to pick some technology to
> learn, and at the end of the week, demos to the group some project
> s/he has mastered. Two chose to learn Python, and I've been mentoring
> them through this week.
>
> The comments from each of them have been fairly glowing. Python is
> this incredible thing that has immense power and flexibility;
> significant-whitespace hasn't been a cause of confusion (not even
> mentioned after the first day); and
>
> The most notable features of Python, for these two
> JS-only-up-until-now guys, are the simplicity of the 'for' loop
> (including that you don't need lots of different forms - you can
> iterate over a dictionary without having to learn some new type of
> loop), the list comprehension, and metaprogramming - mainly function
> decorator syntax. And both of them are starting to talk about being
> "converts" to Python :)
>
> Great job, all. Not that it's particularly difficult to compete with a
> language that was originally designed and developed in under two
> weeks, but still. :D
>
>
I think the craziest thing is probably that, based on how you said it,
these two students haven't even begun to enter the entire Python standard
library, which you'd have to download a zillion npm modules (like the
glorious left-pad) in order to match. Once they realize that, they'll never
be going back!


> ChrisA



-- 
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something’s wrong.
http://kirbyfan64.github.io/


[Python-Dev] Python ROCKS! Thanks guys! [anecdote]

2016-10-12 Thread Chris Angelico
I work with a full-stack web development bootcamp. Most of the course
focuses on JavaScript (Node.js, React, jQuery, etc), but there's a
one-week period in which each student gets to pick some technology to
learn, and at the end of the week, demos to the group some project
s/he has mastered. Two chose to learn Python, and I've been mentoring
them through this week.

The comments from each of them have been fairly glowing. Python is
this incredible thing that has immense power and flexibility;
significant-whitespace hasn't been a cause of confusion (not even
mentioned after the first day); and

The most notable features of Python, for these two
JS-only-up-until-now guys, are the simplicity of the 'for' loop
(including that you don't need lots of different forms - you can
iterate over a dictionary without having to learn some new type of
loop), the list comprehension, and metaprogramming - mainly function
decorator syntax. And both of them are starting to talk about being
"converts" to Python :)

Great job, all. Not that it's particularly difficult to compete with a
language that was originally designed and developed in under two
weeks, but still. :D

ChrisA


Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)

2016-10-12 Thread Nathaniel Smith
On Wed, Oct 12, 2016 at 3:28 AM, INADA Naoki  wrote:
> When Tornado drops Python 2.7 support, they can use bytearray, and
> iostream can be simpler and faster.

FYI 2.7 does have bytearray. (You still have to implement the O(1)
deletion part as a layer on top, like Victor points out, but I suspect
that'd still be dramatically simpler than what they're doing now...)
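
As a rough sketch of such a layer: keep a start offset into a bytearray and compact only occasionally. `FrontDeletableBuffer` is a hypothetical name, and this is not necessarily the algorithm CPython 3.4 uses internally.

```python
class FrontDeletableBuffer:
    """Hypothetical wrapper giving amortized O(1) deletes from the front."""

    def __init__(self):
        self._data = bytearray()
        self._start = 0  # number of already-consumed bytes at the front

    def __len__(self):
        return len(self._data) - self._start

    def append(self, chunk):
        self._data += chunk

    def consume(self, n):
        """Remove and return up to n bytes from the front."""
        end = min(self._start + n, len(self._data))
        out = bytes(self._data[self._start:end])
        self._start = end
        # Compact only when at least half of the storage is dead space, so
        # each byte is moved a constant number of times on average.
        if self._start * 2 >= len(self._data):
            del self._data[:self._start]
            self._start = 0
        return out

buf = FrontDeletableBuffer()
buf.append(b"hello world")
print(buf.consume(6))  # b'hello '
print(len(buf))        # 5
```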

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)

2016-10-12 Thread Nathaniel Smith
On Wed, Oct 12, 2016 at 4:55 AM, Victor Stinner
 wrote:
> 2016-10-12 10:01 GMT+02:00 Nathaniel Smith :
>> It's more complicated than that -- the right algorithm is the one that
>> Antoine implemented in 3.4.
>> (...)
>> My point is that
>> forcing everyone who writes network code in Python to do that is
>> silly, especially given that CPython's apparently been shipping this
>> feature for years.
>
> "For years" means since March 2014, Python 3.4.0 release, so 2 years ago.
>
> We can document the optimization as a CPython implementation detail
> and explain that it's only in Python >= 3.4.
>
> So an application which should work on Python 2.7 as well cannot rely
> on this optimization for example.

The proposal is that it should be documented as being part of the
language spec starting in 3.4 (or whatever). So applications that
support Python 2.7 can't rely on it, sure. But if I have an
application that requires, say, 3.5+ but I don't want to depend on
CPython-only implementation details, then I'm still allowed to use it.

AFAIK basically the only project that would be affected by this is
PyPy, and when I asked on #pypy they said:

 njs`: I think we either plan to or already support this

so I'm not sure why this is controversial.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor

2016-10-12 Thread Terry Reedy

On 10/12/2016 5:42 AM, INADA Naoki wrote:
> On Wed, Oct 12, 2016 at 2:32 PM, Serhiy Storchaka  wrote:
>> On 12.10.16 07:08, INADA Naoki wrote:
>>> Sample code:
>>>
>>>     def read_line(buf: bytearray) -> bytes:
>>>         try:
>>>             n = buf.index(b'\r\n')
>>>         except ValueError:
>>>             return b''
>>>
>>>         line = bytes(buf)[:n]  # bytearray -> bytes -> bytes
>>
>> Wouldn't it be more correct to write this as bytes(buf[:n])?
>
> Yes, you're right!
> I shouldn't copy the whole data just to cast from bytearray to bytes.


Also, why do the conversion from bytearray to bytes?  It is definitely 
not always needed.


>>> ba = bytearray(b'abc')
>>> b = b'def'
>>> ba + b
bytearray(b'abcdef')
>>> b'%s %s' % (ba, b)
b'abc def'
>>> b + ba
b'defabc'
>>> ba.extend(b)
>>> ba
bytearray(b'abcdef')

Even if it is sometimes needed, why do it always?  The essence of 
read_line is to slice out a line, delete it from the buffer, and return 
the line.  Let the caller explicitly convert when needed.
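
A sketch of what that could look like, assuming CRLF-terminated lines as in the sample code. The signature is hypothetical, and like the original it cannot tell an empty line apart from "no complete line yet".

```python
def read_line(buf: bytearray) -> bytearray:
    """Slice the first CRLF-terminated line out of buf and remove it from buf.

    Returns an empty bytearray when no complete line is buffered yet.
    """
    try:
        n = buf.index(b"\r\n")
    except ValueError:
        return bytearray()
    line = buf[:n]    # slicing a bytearray yields a new bytearray
    del buf[:n + 2]   # drop the line and its CRLF terminator
    return line

buf = bytearray(b"foo\r\nbar\r\nba")
first = read_line(buf)
print(first, buf)    # bytearray(b'foo') bytearray(b'bar\r\nba')
print(bytes(first))  # the caller converts only when it needs bytes
```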


--
Terry Jan Reedy



Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor

2016-10-12 Thread Victor Stinner
Oops, right, I wanted to write "I don't think that adding a single
method deserves its own PEP."

Victor

2016-10-12 18:37 GMT+02:00 Stephen J. Turnbull
:
> Victor Stinner writes:
>  > 2016-10-12 11:34 GMT+02:00 INADA Naoki :
>
>  > > I see.  My proposal should be another PEP (if PEP is required).
>  >
>  > I don't think that adding a single method deserves its own method.
>
> You mean "deserves own PEP", right?  I interpreted Nick to say that
> "the reasons that applied to PEP 467 don't apply here, so you can Just
> Do It" (subject to the usual criteria for review, but omit the PEP).
>
> I'm not sure whether he was channeling Guido or that should be
> qualified with an IMO or IMHO.
>
>
>


Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor

2016-10-12 Thread Stephen J. Turnbull
Victor Stinner writes:
 > 2016-10-12 11:34 GMT+02:00 INADA Naoki :

 > > I see.  My proposal should be another PEP (if PEP is required).
 > 
 > I don't think that adding a single method deserves its own method.

You mean "deserves own PEP", right?  I interpreted Nick to say that
"the reasons that applied to PEP 467 don't apply here, so you can Just
Do It" (subject to the usual criteria for review, but omit the PEP).

I'm not sure whether he was channeling Guido or that should be
qualified with an IMO or IMHO.





Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467

2016-10-12 Thread Nikolaus Rath
On Oct 11 2016, Nathaniel Smith  wrote:
> On Tue, Oct 11, 2016 at 9:08 PM, INADA Naoki  wrote:
>> Since Python 3.4, bytearray is a good solution for an I/O buffer, thanks to
>> #19087 [1].
>> Actually, asyncio often uses bytearray as an I/O buffer.
>
> Whoa what?! This is awesome, I had no idea that bytearray had O(1)
> deletes at the front. I literally reimplemented this myself on top of
> bytearray for some 3.5-only code recently because I assumed bytearray
> had the same asymptotics as list, and AFAICT this is totally
> undocumented. 

Indeed, same here.


Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«


Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)

2016-10-12 Thread Victor Stinner
2016-10-12 10:01 GMT+02:00 Nathaniel Smith :
> It's more complicated than that -- the right algorithm is the one that
> Antoine implemented in 3.4.
> (...)
> My point is that
> forcing everyone who writes network code in Python to do that is
> silly, especially given that CPython's apparently been shipping this
> feature for years.

"For years" means since March 2014, Python 3.4.0 release, so 2 years ago.

We can document the optimization as a CPython implementation detail
and explain that it's only in Python >= 3.4.

So an application which should work on Python 2.7 as well cannot rely
on this optimization for example.

Victor


Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor

2016-10-12 Thread Victor Stinner
2016-10-12 11:34 GMT+02:00 INADA Naoki :
> I see.  My proposal should be another PEP (if PEP is required).

I don't think that adding a single method deserves its own method.

I like the idea of Serhiy's API (like the Python 2 buffer constructor):

bytes.frombuf(buffer, [offset, size])
bytearray.frombuf(buffer, [offset, size])
memoryview.frombuf(buffer, [offset, size])
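
Since no such method exists today, here is a hedged sketch of the semantics as a plain function. `frombuf` taking the target class as its first argument is an assumption for illustration, it only covers bytes and bytearray, and it assumes a one-dimensional byte buffer.

```python
def frombuf(cls, buffer, offset=0, size=-1):
    """Hypothetical stand-in for the proposed cls.frombuf(buffer, [offset, size]).

    Copies size bytes of buffer starting at offset into a new cls instance;
    a negative size means "until the end of the buffer".
    """
    with memoryview(buffer) as view:
        stop = None if size < 0 else offset + size
        return cls(view[offset:stop])

buf = bytearray(b"foo\r\nbar\r\n")
print(frombuf(bytes, buf, 0, 3))   # b'foo'
print(frombuf(bytearray, buf, 5))  # bytearray(b'bar\r\n')
```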

Victor


Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)

2016-10-12 Thread INADA Naoki
>>
>> [1] My use case is parsing HTTP out of a receive buffer. If deleting
>> the first k bytes of an N byte buffer is O(N), then not only does
>> parsing become O(N^2) in the worst case, but it's the sort of O(N^2)
>> that random untrusted network clients can trigger at will to DoS your
>> server.
>
>
> Deleting from the buffer can be avoided if you pass the starting index together
> with the buffer. For example:
>
>     def read_line(buf: bytes, start: int) -> (bytes, int):
>         try:
>             end = buf.index(b'\r\n', start)
>         except ValueError:
>             return b'', start
>
>         return buf[start:end], end+2
>
>

In the case of asyncio, we can't assume the order of append and consume.

For example, when stream-processing an HTTP chunked response, appending to
the receive buffer and consuming a chunk from the buffer can happen in
arbitrary order.
That's why bytes is not good for a receive buffer. Efficient append is a
"must have".

For example, Tornado implements its receive buffer as a deque of bytes. See this code:
https://github.com/tornadoweb/tornado/blob/master/tornado/iostream.py#L784-L817

When Tornado drops Python 2.7 support, they can use bytearray, and
iostream can be simpler and faster.

So I hope "amortized O(1) deletion from the front" becomes part of the language spec,
at least for Python 3.5+.
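
A minimal sketch of a bytearray-backed receive buffer where appends and consumes interleave. `ReceiveBuffer` and its method names are hypothetical, not asyncio or Tornado API.

```python
class ReceiveBuffer:
    """Hypothetical receive buffer backed by a single bytearray."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes) -> None:
        # Appending to a bytearray is amortized O(1) per byte.
        self._buf += data

    def read_exactly(self, n: int):
        """Return n bytes from the front, or None if fewer are buffered."""
        if len(self._buf) < n:
            return None
        chunk = bytes(self._buf[:n])
        # Cheap on CPython 3.4+; other implementations may copy here.
        del self._buf[:n]
        return chunk

rb = ReceiveBuffer()
rb.feed(b"ab")
print(rb.read_exactly(4))  # None, not enough data yet
rb.feed(b"cdef")
print(rb.read_exactly(4))  # b'abcd'
rb.feed(b"gh")
print(rb.read_exactly(4))  # b'efgh'
```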

-- 
INADA Naoki  


Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor

2016-10-12 Thread INADA Naoki
On Wed, Oct 12, 2016 at 2:32 PM, Serhiy Storchaka  wrote:
> On 12.10.16 07:08, INADA Naoki wrote:
>>
>> Sample code:
>>
>>     def read_line(buf: bytearray) -> bytes:
>>         try:
>>             n = buf.index(b'\r\n')
>>         except ValueError:
>>             return b''
>>
>>         line = bytes(buf)[:n]  # bytearray -> bytes -> bytes
>
>
> Wouldn't it be more correct to write this as bytes(buf[:n])?

Yes, you're right!
I shouldn't copy the whole data just to cast from bytearray to bytes.
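
To make the difference concrete, here is a small sketch of the three spellings discussed in this thread. All of them produce the same bytes and differ only in how much data gets copied along the way.

```python
buf = bytearray(b"foo\r\nbar\r\nbaz\r\n")
n = buf.index(b"\r\n")

# Copies all of buf into a temporary bytes object, then slices (second copy).
line_a = bytes(buf)[:n]

# Copies only the first n bytes into a temporary bytearray, then into bytes.
line_b = bytes(buf[:n])

# Copies only the first n bytes once, at the cost of creating a memoryview.
with memoryview(buf) as view:
    line_c = bytes(view[:n])

print(line_a, line_b, line_c)  # b'foo' b'foo' b'foo'
```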

>
>> Adding one more constructor to bytes:
>>
>>     # when length=-1 (default), use until end of *byteslike*.
>>     bytes.frombuffer(byteslike, length=-1, offset=0)
>
>
> This interface looks unusual. Wouldn't it be better to support the interface
> of buffer in Python 2: buffer(object [, offset[, size]])?
>

It looks better.

(Actually, I love the deprecated old buffer for its simplicity.
memoryview supports non bytes-like complex data types.)

Thanks,

-- 
INADA Naoki  


Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor

2016-10-12 Thread INADA Naoki
On Wed, Oct 12, 2016 at 2:07 PM, Nick Coghlan  wrote:
> I don't think it makes sense to add any more ideas to PEP 467. That
> needed to be a PEP because it proposed breaking backwards
> compatibility in a couple of areas, and because of the complex history
> of Python 3's "bytes-as-tuple-of-ints" and Python 2's "bytes-as-str"
> semantics.
>
> Other enhancements to the binary data handling APIs in Python 3 can be
> considered on their own merits.
>

I see.  My proposal should be another PEP (if PEP is required).


>>
>> * It isn't "one obvious way": Developers including me may forget to
>> use context manager.
>>   And since it works on CPython, it's hard to point it out.
>
> To add to the confusion, there's also
> https://docs.python.org/3/library/stdtypes.html#memoryview.tobytes
> giving:
>
> line = memoryview(buf)[:n].tobytes()
>
> However, folks *do* need to learn that many mutable data types will
> lock themselves against modification while you have a live memory view
> on them, so it's important to release views promptly and reliably when
> we don't need them any more.
>

I agree.
io.TextIOWrapper objects report a ResourceWarning for an unclosed file.
I think the same warning for unreleased memoryview objects may help developers.
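
A short sketch of the locking behaviour in question, as observed on CPython: a bytearray refuses to be resized while a memoryview on it is alive, and releasing the view (explicitly or via `with`) unlocks it.

```python
buf = bytearray(b"foo\r\nbar\r\n")
locked = False

view = memoryview(buf)
try:
    buf.extend(b"baz\r\n")  # resizing is refused while a view is exported
except BufferError as exc:
    locked = True
    print("locked:", exc)
finally:
    view.release()          # explicit release unlocks the bytearray

# The with statement releases the view deterministically on exit.
with memoryview(buf) as view:
    line = view[:3].tobytes()

buf.extend(b"baz\r\n")      # works now that no view is alive
print(line, len(buf))       # b'foo' 15
```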


>> Quick benchmark:
>>
>> (temporary bytes)
>> $ python3 -m perf timeit -s 'buf = bytearray(b"foo\r\nbar\r\nbaz\r\n")' -- 'bytes(buf)[:3]'
>> Median +- std dev: 652 ns +- 19 ns
>>
>> (temporary memoryview without "with")
>> $ python3 -m perf timeit -s 'buf = bytearray(b"foo\r\nbar\r\nbaz\r\n")' -- 'bytes(memoryview(buf)[:3])'
>> Median +- std dev: 886 ns +- 26 ns
>>
>> (temporary memoryview with "with")
>> $ python3 -m perf timeit -s 'buf = bytearray(b"foo\r\nbar\r\nbaz\r\n")' -- '
>> with memoryview(buf) as m:
>>     bytes(m[:3])
>> '
>> Median +- std dev: 1.11 us +- 0.03 us
>
> This is normal though, as memory views trade lower O(N) costs (reduced
> data copying) for higher O(1) setup costs (creating and managing the
> view, indirection for data access).

Yes.  When the data is small, the benefit of less data copying can easily be hidden.

One big difficulty for I/O frameworks like asyncio is that we can't assume the data size.
A framework should be optimized for both many small chunks and large data.

With memoryview, when we optimize for large data (e.g. downloading a large file),
performance for lots of small data (e.g. a small JSON API) becomes worse.

Actually, one pull request gave up on using memoryview because of this.

https://github.com/python/asyncio/pull/395#issuecomment-249044218
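
A rough way to observe this trade-off locally, as a sketch only: it uses the standard library `timeit` rather than the `perf` module quoted above, the 1 MiB buffer size is an arbitrary choice, and timings will differ between machines and Python versions.

```python
import timeit

def copy_slice(buf, n):
    # Slice first (temporary bytearray), then convert to bytes.
    return bytes(buf[:n])

def copy_view(buf, n):
    # Go through a memoryview so that only one copy is made.
    with memoryview(buf) as view:
        return bytes(view[:n])

small = bytearray(b"foo\r\nbar\r\nbaz\r\n")
large = bytearray(b"x" * (1 << 20))  # 1 MiB, an arbitrary size for illustration

for label, buf, n in [("small", small, 3), ("large", large, len(large) // 2)]:
    for func in (copy_slice, copy_view):
        seconds = timeit.timeit(lambda: func(buf, n), number=200)
        print(f"{label:5} {func.__name__:10} {seconds / 200 * 1e6:10.2f} us per call")
```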


>
>> Proposed solution
>> ===
>>
>> Adding one more constructor to bytes:
>>
>>     # when length=-1 (default), use until end of *byteslike*.
>>     bytes.frombuffer(byteslike, length=-1, offset=0)
>>
>> With this API
>>
>>     with memoryview(buf) as m:
>>         line = bytes(m[:n])
>>
>> becomes
>>
>>     line = bytes.frombuffer(buf, n)
>
> Does that need to be a method on the builtin rather than a separate
> helper function, though? Once you define:
>
>     def snapshot(buf, length=None, offset=0):
>         with memoryview(buf) as m:
>             return m[offset:length].tobytes()
>
> then that can be replaced by a more optimised C implementation without
> users needing to care about the internal details.

I'm thinking about adding such a helper function to the asyncio speedup C extension.
But there are other non-blocking I/O frameworks: Tornado,
Twisted, and curio.

And relying on a C extension makes it harder to optimize for other Python
implementations.
If it is in the standard library, PyPy and other Python implementations can
optimize it.


>
> That is, getting back to a variant on one of Serhiy's suggestions in
> the last PEP 467 discussion, it may make sense for us to offer a
> "buffertools" library that's specifically aimed at supporting
> efficient buffer manipulation operations that minimise data copying.
> The pure Python implementations would work entirely through
> memoryview, but we could also have selected C accelerated operations
> if that showed a noticeable improvement on asyncio's benchmarks.
>

It seems like a nice idea. I'll read the discussion.


> Regards,
> Nick.
>
> P.S. The length/offset API design is also problematic due to the way
> it differs from range() & slice(), but I don't think it makes sense to
> get into that kind of detail before discussing the larger question of
> adding a new helper module for working efficiently with memory buffers
> vs further widening the method API for the builtin bytes type
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia

I avoided the slice API intentionally, because if it looks like slice,
someone will propose
adding `step` support just for consistency.

But, as Serhiy said, being consistent with the old buffer API is nice.

-- 
INADA Naoki  

Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)

2016-10-12 Thread Nathaniel Smith
On Wed, Oct 12, 2016 at 12:17 AM, Serhiy Storchaka  wrote:
> On 12.10.16 09:31, Nathaniel Smith wrote:
>>
>> But amortized O(1) deletes from the front of bytearray are totally
>> different, and more like amortized O(1) appends to list: there are
>> important use cases[1] that simply cannot be implemented without some
>> feature like this, and putting the implementation inside bytearray is
>> straightforward, deterministic, and more efficient than hacking
>> together something on top. Python should just guarantee it, IMO.
>>
>> -n
>>
>> [1] My use case is parsing HTTP out of a receive buffer. If deleting
>> the first k bytes of an N byte buffer is O(N), then not only does
>> parsing become O(N^2) in the worst case, but it's the sort of O(N^2)
>> that random untrusted network clients can trigger at will to DoS your
>> server.
>
>
> Deleting from the buffer can be avoided if you pass the starting index together
> with the buffer. For example:
>
>     def read_line(buf: bytes, start: int) -> (bytes, int):
>         try:
>             end = buf.index(b'\r\n', start)
>         except ValueError:
>             return b'', start
>
>         return buf[start:end], end+2

It's more complicated than that -- the right algorithm is the one that
Antoine implemented in 3.4. But yes, having implemented this by hand,
I am aware that it can be implemented by hand :-). My point is that
forcing everyone who writes network code in Python to do that is
silly, especially given that CPython's apparently been shipping this
feature for years.
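
For illustration, the by-hand approach can be sketched roughly as
follows. This is only a sketch of the general offset-tracking technique,
not the code that shipped in CPython 3.4 or in any project mentioned in
this thread; `FrontDeleteBuffer` and its method names are hypothetical.

```python
class FrontDeleteBuffer:
    """Hypothetical illustration; not CPython's bytearray implementation."""

    def __init__(self) -> None:
        self._data = bytearray()
        self._start = 0  # index of the first live byte in self._data

    def __len__(self) -> int:
        return len(self._data) - self._start

    def append(self, chunk: bytes) -> None:
        self._data += chunk

    def find(self, sep: bytes) -> int:
        """Offset of sep relative to the logical start, or -1."""
        pos = self._data.find(sep, self._start)
        return -1 if pos < 0 else pos - self._start

    def consume(self, k: int) -> bytes:
        """Remove and return the first k live bytes."""
        k = max(0, min(k, len(self)))
        out = bytes(self._data[self._start:self._start + k])
        self._start += k
        # Compact only once the dead prefix exceeds half the storage, so
        # the bytes moved are always fewer than the bytes already consumed.
        if self._start > len(self._data) // 2:
            del self._data[:self._start]
            self._start = 0
        return out


buf = FrontDeleteBuffer()
buf.append(b'abc\r\ndef')
end = buf.find(b'\r\n')
line = buf.consume(end)   # the first line, without its terminator
buf.consume(2)            # drop the CRLF itself
```

Because compaction is deferred, the cost of moving the remaining bytes
is paid for by the bytes that were consumed earlier, which is what makes
the deletes amortized O(1) even where `del data[:k]` itself is O(N).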

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)

2016-10-12 Thread Serhiy Storchaka

On 12.10.16 09:31, Nathaniel Smith wrote:

> But amortized O(1) deletes from the front of bytearray are totally
> different, and more like amortized O(1) appends to list: there are
> important use cases[1] that simply cannot be implemented without some
> feature like this, and putting the implementation inside bytearray is
> straightforward, deterministic, and more efficient than hacking
> together something on top. Python should just guarantee it, IMO.
>
> -n
>
> [1] My use case is parsing HTTP out of a receive buffer. If deleting
> the first k bytes of an N byte buffer is O(N), then not only does
> parsing become O(N^2) in the worst case, but it's the sort of O(N^2)
> that random untrusted network clients can trigger at will to DoS your
> server.


Deleting from the buffer can be avoided if you pass the starting index
together with the buffer. For example:

def read_line(buf: bytes, start: int) -> (bytes, int):
    try:
        end = buf.index(b'\r\n', start)
    except ValueError:
        return b'', start

    return buf[start:end], end+2
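
A minimal usage sketch of the function above, assuming the data is
already in memory. `iter_lines` is a hypothetical helper name introduced
only for this example, and the return annotation is spelled with
`typing.Tuple` here:

```python
from typing import Iterator, Tuple


def read_line(buf: bytes, start: int) -> Tuple[bytes, int]:
    """Return (line, next_start); an unchanged start means no full line yet."""
    try:
        end = buf.index(b'\r\n', start)
    except ValueError:
        return b'', start
    return buf[start:end], end + 2


def iter_lines(buf: bytes) -> Iterator[bytes]:
    """Hypothetical helper: yield each complete CRLF-terminated line."""
    start = 0
    while True:
        line, next_start = read_line(buf, start)
        if next_start == start:  # no further CRLF in the buffer
            return
        yield line
        start = next_start


lines = list(iter_lines(b'GET / HTTP/1.1\r\nHost: example.com\r\n\r\npartial'))
```

Note that an empty line and "no complete line yet" both return b'', so
the caller has to compare the returned index with `start` to tell them
apart.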




[Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)

2016-10-12 Thread Nathaniel Smith
On Tue, Oct 11, 2016 at 10:53 PM, Serhiy Storchaka  wrote:
> On 12.10.16 08:03, Nathaniel Smith wrote:
>>
>> On Tue, Oct 11, 2016 at 9:08 PM, INADA Naoki 
>> wrote:
>>>
>>> Since Python 3.4, bytearray is a good solution for I/O buffers, thanks
>>> to #19087 [1].
>>> Actually, asyncio often uses bytearray as an I/O buffer.
>>
>>
>> Whoa what?! This is awesome, I had no idea that bytearray had O(1)
>> deletes at the front. I literally reimplemented this myself on top of
>> bytearray for some 3.5-only code recently because I assumed bytearray
>> had the same asymptotics as list, and AFAICT this is totally
>> undocumented. Shouldn't we write this down somewhere? Maybe here? ->
>> https://docs.python.org/3/library/functions.html#bytearray
>
>
> I'm afraid this is a CPython implementation detail (like the string
> concatenation optimization). Other implementations can have O(N) deletes at
> the front of bytearray.

Well, it shouldn't be :-).

The problem with the string concatenation optimization is that to
work, it requires both the use of refcounting GC and that you get
lucky with how the underlying malloc implementation happened to lay
things out in memory. Obviously it shouldn't be part of the language
spec.
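
To make the comparison concrete, here is the pattern that optimization
targets next to the portable alternative. The function names are
hypothetical. The loop version is only fast when the interpreter can
resize the string in place, whereas `str.join` is linear in the total
length on any implementation:

```python
def build_with_concat(parts):
    # Relies on in-place resizing to avoid copying s on every iteration.
    s = ''
    for p in parts:
        s += p
    return s


def build_with_join(parts):
    # Linear in the total length regardless of GC or allocator behaviour.
    return ''.join(parts)


parts = ['spam', 'eggs', 'ham']
same = build_with_concat(parts) == build_with_join(parts)
```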

But amortized O(1) deletes from the front of bytearray are totally
different, and more like amortized O(1) appends to list: there are
important use cases[1] that simply cannot be implemented without some
feature like this, and putting the implementation inside bytearray is
straightforward, deterministic, and more efficient than hacking
together something on top. Python should just guarantee it, IMO.

-n

[1] My use case is parsing HTTP out of a receive buffer. If deleting
the first k bytes of an N byte buffer is O(N), then not only does
parsing become O(N^2) in the worst case, but it's the sort of O(N^2)
that random untrusted network clients can trigger at will to DoS your
server.
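
A rough sketch of the receive-buffer pattern in question, assuming
CRLF-terminated lines. `pop_lines` is a hypothetical name, and real HTTP
parsing needs considerably more than this:

```python
def pop_lines(buf: bytearray) -> list:
    """Remove and return every complete CRLF-terminated line from buf."""
    lines = []
    while True:
        end = buf.find(b'\r\n')
        if end < 0:
            return lines  # any incomplete tail stays in buf
        lines.append(bytes(buf[:end]))
        del buf[:end + 2]  # the front delete whose cost is at issue


recv_buf = bytearray()
recv_buf += b'GET / HTTP/1.1\r\nHo'   # first chunk from the network
first = pop_lines(recv_buf)
recv_buf += b'st: example.com\r\n'    # rest of the second line arrives
second = pop_lines(recv_buf)
```

If `del buf[:end + 2]` costs O(N), a buffer holding many short lines
makes this loop O(N^2) overall. With amortized O(1) front deletes, the
deletes stop being the bottleneck.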

-- 
Nathaniel J. Smith -- https://vorpus.org