[issue47010] Implement zero copy writes in SelectorSocketTransport in asyncio

2022-03-21 Thread jakirkham


Change by jakirkham :


--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue47010>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45819] Avoid releasing the GIL in nonblocking socket operations

2022-02-02 Thread jakirkham


Change by jakirkham :


--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue45819>
___



[issue41226] Supporting `strides` in `memoryview.cast`

2022-01-13 Thread jakirkham


jakirkham  added the comment:

The 2nd argument is the `strides`. IOW it is just specifying how to traverse 
the buffer in memory to visit each of the dimensions.

For the first example where `strides` is not specified, Python makes them 
C-ordered. IOW `m2.strides` would be `(3, 1)`. Effectively this is represented 
like this:

```
[[b"a", b"b", b"c"],
 [b"d", b"e", b"f"]]
```

For the second case where strides are overridden (so `m2.strides` would be `(1, 
2)`), we get something like this:

```
[[b"a", b"c", b"e"],
 [b"b", b"d", b"f"]]
```

In either case the `1` here has specified which dimension is fastest to 
traverse along. IOW that content is adjacent in memory.

Should add the reason it is `1` is that for `uint8_t` (or format "B"), this is 
that type's size. If we had a different format, this would be the size of that 
format.
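
To make the two layouts concrete, here is a minimal sketch. `memoryview.cast` currently accepts only a format and a shape, so the strided case is emulated with a hypothetical helper, `strided_view`, that computes byte offsets by hand. It is an illustration under that assumption, not a proposed implementation.

```python
def strided_view(buf, shape, strides):
    """Hypothetical helper: index a flat bytes-like object as a 2-D array.

    Element [i][j] is read from byte offset i * strides[0] + j * strides[1].
    Only meaningful for 1-byte items (format "B").
    """
    rows, cols = shape
    result = []
    for i in range(rows):
        row = []
        for j in range(cols):
            off = i * strides[0] + j * strides[1]
            row.append(bytes(buf[off:off + 1]))
        result.append(row)
    return result


m = memoryview(b"abcdef")
m2 = m.cast("B", (2, 3))           # real API: format and shape only
c_strides = m2.strides             # strides Python picks by default (C order)

c_layout = strided_view(m, (2, 3), c_strides)
f_layout = strided_view(m, (2, 3), (1, 2))   # emulated F-order strides
```

With C-order strides each row is a run of adjacent bytes; with `(1, 2)` each column is.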

HTH

--

___
Python tracker 
<https://bugs.python.org/issue41226>
___



[issue46367] multiprocessing's "spawn" doesn't actually use spawn

2022-01-13 Thread jakirkham


New submission from jakirkham :

Reporting an issue recently encountered by a colleague.

It appears that `multiprocessing`'s "spawn" mode doesn't actually use POSIX 
spawn, but instead uses fork+exec[1]. While this is certainly a useful feature 
in its own right, it is not quite what one would expect from something described 
as spawn. AFAICT the documentation doesn't point this out.

This is important as some libraries are not fork-safe and even fork+exec is not 
sufficient to protect them. Would be helpful if "spawn" did use POSIX spawn and 
the current behavior was covered under a clearer name (like "forkexec").


Ref:
1. 
https://github.com/python/cpython/blob/af6b4068859a5d0c8afd696f3c0c0155660211a4/Lib/multiprocessing/util.py#L448-L458
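
For comparison, a minimal sketch of what launching a fresh interpreter through POSIX spawn looks like from Python, using `os.posix_spawn` (Unix only, Python 3.8+). The helper name `run_child_with_posix_spawn` is made up for illustration, and `os.waitstatus_to_exitcode` assumes Python 3.9+. This is not how `multiprocessing` is implemented.

```python
import multiprocessing as mp
import os
import sys


def run_child_with_posix_spawn(code):
    """Run `code` in a new interpreter via posix_spawn and return its exit code."""
    argv = [sys.executable, "-c", code]
    pid = os.posix_spawn(sys.executable, argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


# Start methods multiprocessing offers on this platform.
start_methods = mp.get_all_start_methods()

# The child exits with status 3; the parent never calls fork() itself.
exit_code = run_child_with_posix_spawn("raise SystemExit(3)")
```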

--
components: Library (Lib)
messages: 410512
nosy: jakirkham
priority: normal
severity: normal
status: open
title: multiprocessing's "spawn" doesn't actually use spawn
type: behavior
versions: Python 3.10, Python 3.11, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue46367>
___



[issue44556] ctypes unittest crashes with libffi 3.4.2

2021-11-19 Thread jakirkham


jakirkham  added the comment:

We ran into the same issue in conda-forge ( 
https://github.com/conda-forge/python-feedstock/issues/522 ).

The problem is that Apple also supplies its own `libffi`. If the build 
scripts in CPython fail to find the user-provided `libffi`, they end up pulling 
the headers from Apple's `libffi`, but the linker will link to the user-provided 
`libffi`. IOW these two incompatible `libffi`s get mashed together. As a 
result one gets crashes like the one illustrated in this bug.

In conda-forge, we are resolving this by forcing our `pkg-config` to be used to 
ensure we pick up the headers from our `libffi` as well as the libraries. Other 
users may be able to work around this issue by explicitly setting 
`LIBFFI_INCLUDE_DIR`.

That said, it would be preferable to have a clear way to specify the `libffi` 
used and ensure that Apple's one doesn't get accidentally pulled in. If this 
exists and we are just missing these details, some pointers to this effect 
would be very helpful.

--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue44556>
___



[issue41226] Supporting `strides` in `memoryview.cast`

2021-09-29 Thread jakirkham


Change by jakirkham :


--
components: +Library (Lib)
type:  -> enhancement

___
Python tracker 
<https://bugs.python.org/issue41226>
___



[issue40718] Support out-of-band pickling for builtin types

2021-09-29 Thread jakirkham


Change by jakirkham :


--
components: +Library (Lib)

___
Python tracker 
<https://bugs.python.org/issue40718>
___



[issue40718] Support out-of-band pickling for builtin types

2021-09-27 Thread jakirkham


Change by jakirkham :


--
versions: +Python 3.10, Python 3.11

___
Python tracker 
<https://bugs.python.org/issue40718>
___



[issue41223] `object`-backed `memoryview`'s `tolist` errors

2021-09-27 Thread jakirkham


Change by jakirkham :


--
versions: +Python 3.11

___
Python tracker 
<https://bugs.python.org/issue41223>
___



[issue43096] Adding `read_into` method to `asyncio.StreamReader`

2021-09-27 Thread jakirkham


Change by jakirkham :


--
versions: +Python 3.11

___
Python tracker 
<https://bugs.python.org/issue43096>
___



[issue41226] Supporting `strides` in `memoryview.cast`

2021-09-27 Thread jakirkham


Change by jakirkham :


--
versions: +Python 3.11

___
Python tracker 
<https://bugs.python.org/issue41226>
___



[issue45304] Supporting out-of-band buffers (pickle protocol 5) in multiprocessing

2021-09-27 Thread jakirkham


New submission from jakirkham :

In Python 3.8+, pickle protocol 5 ( PEP 574 ) was added, which supports 
out-of-band buffer collection[1]. The idea is that when pickling an object 
with a large amount of data attached to it (like an array, dataframe, etc.) one 
can collect this data alongside the normal pickled data without causing a copy. 
This is particularly important when serializing data for communication between 
two Python instances. IOW this is quite valuable when using a 
`multiprocessing.pool.Pool`[2] or a 
`concurrent.futures.ProcessPoolExecutor`[3]. However AFAICT neither of these 
leverages this functionality[4][5]. To ensure zero-copy processing of large 
data, it would be helpful for pickle protocol 5 to be used in both of these 
pools.


[1] https://docs.python.org/3/library/pickle.html#pickle-oob
[2] 
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool
[3] 
https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor
[4] 
https://github.com/python/cpython/blob/16b5bc68964c6126845f4cdd54b24996e71ae0ba/Lib/multiprocessing/queues.py#L372
[5] 
https://github.com/python/cpython/blob/16b5bc68964c6126845f4cdd54b24996e71ae0ba/Lib/multiprocessing/queues.py#L245
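
For readers unfamiliar with out-of-band buffers, here is a small sketch adapted from the example in the `pickle` documentation. The class `ZeroCopyByteArray` is an illustrative type, not something `multiprocessing` uses; it only shows what "collecting the data alongside the pickle" means.

```python
import pickle


class ZeroCopyByteArray(bytearray):
    """Illustrative bytearray subclass that hands its data out-of-band."""

    def __reduce_ex__(self, protocol):
        if protocol >= 5:
            # PickleBuffer lets the pickler pass the memory along without copying.
            return type(self)._reconstruct, (pickle.PickleBuffer(self),), None
        # Older protocols fall back to an in-band copy.
        return type(self)._reconstruct, (bytearray(self),)

    @classmethod
    def _reconstruct(cls, obj):
        with memoryview(obj) as m:
            original = m.obj  # object that owns the underlying memory
            if type(original) is cls:
                return original
            return cls(original)


b = ZeroCopyByteArray(b"abc")
buffers = []
data = pickle.dumps(b, protocol=5, buffer_callback=buffers.append)
new_b = pickle.loads(data, buffers=buffers)
```

Here `buffers` receives one `PickleBuffer` and the pickle stream itself stays small; a pool would still need to ship those buffers to the other process by some means.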

--
components: IO, Library (Lib)
messages: 402736
nosy: jakirkham
priority: normal
severity: normal
status: open
title: Supporting out-of-band buffers (pickle protocol 5) in multiprocessing
type: performance
versions: Python 3.10, Python 3.11, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue45304>
___



[issue43976] Allow Python distributors to add custom site install schemes

2021-08-25 Thread jakirkham


Change by jakirkham :


--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue43976>
___



[issue28646] Using a read-only buffer.

2021-08-17 Thread jakirkham


Change by jakirkham :


--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue28646>
___



[issue42853] `OverflowError: signed integer is greater than maximum` in ssl.py for files larger than 2GB

2021-07-28 Thread jakirkham


jakirkham  added the comment:

Agree with Bruce. It seems like we could have support for OpenSSL 1.1.1 at that 
level with a compile-time fallback for previous OpenSSL versions that breaks up 
the work. Would hope this solution also yields something we can backport more 
easily.
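
As a rough Python analogue of "breaking up the work" (the real fallback would live in C inside the `_ssl` module), the sketch below splits a large buffer into slices that each fit in a signed 32-bit length. `iter_chunks` is a hypothetical helper introduced only for illustration.

```python
def iter_chunks(data, max_chunk=2**31 - 1):
    """Yield zero-copy memoryview slices of `data`, each at most `max_chunk` bytes."""
    view = memoryview(data).cast("B")
    for start in range(0, len(view), max_chunk):
        yield view[start:start + max_chunk]


# Tiny limit so the behaviour is visible; real use would keep the default.
chunks = [bytes(c) for c in iter_chunks(b"abcdefg", max_chunk=3)]

# Intended use (not executed here): for chunk in iter_chunks(payload): sock.sendall(chunk)
```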

--

___
Python tracker 
<https://bugs.python.org/issue42853>
___



[issue37355] SSLSocket.read does a GIL round-trip for every 16KB TLS record

2021-07-01 Thread jakirkham


Change by jakirkham :


--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue37355>
___



[issue42853] `OverflowError: signed integer is greater than maximum` in ssl.py for files larger than 2GB

2021-06-24 Thread jakirkham


jakirkham  added the comment:

Not following. Why would it break? Presumably once one builds Python for a 
particular OS they keep it there (like system package managers for example).

Or alternatively they build on a much older OS and then deploy to newer ones. 
The latter case is what we do in conda-forge (Anaconda does the same thing). We 
also build our own OpenSSL so have control of that knob too. Though I've seen 
application developers do similar things.

Do you have an example where this wouldn't work? Maybe that would help as we 
can define a better solution there.

--

___
Python tracker 
<https://bugs.python.org/issue42853>
___



[issue42853] `OverflowError: signed integer is greater than maximum` in ssl.py for files larger than 2GB

2021-06-24 Thread jakirkham


jakirkham  added the comment:

Right, with this change ( https://github.com/python/cpython/pull/25468 ). Thanks 
for adding that, Christian :)

I guess what I'm wondering is if in older Python versions we could do an 
`#ifdef` check to try and use `SSL_read_ex` & `SSL_write_ex` if the symbols are 
found at build time? This would allow package maintainers the option to build 
with a newer OpenSSL to fix this issue (even on older Pythons)

--

___
Python tracker 
<https://bugs.python.org/issue42853>
___



[issue42853] `OverflowError: signed integer is greater than maximum` in ssl.py for files larger than 2GB

2021-06-02 Thread jakirkham


jakirkham  added the comment:

Would it be possible to check for these newer OpenSSL symbols during the builds 
of Python 3.8 & 3.9 (using them when available and otherwise falling back to 
the older API)? This would allow people to build Python 3.8 & 3.9 
with the newer OpenSSL, benefiting from the fix.

That said, not sure if there are other obstacles to using OpenSSL 1.1.1 with 
Python 3.8 & 3.9

--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue42853>
___



[issue27149] Implement socket.sendmsg() for Windows

2021-02-26 Thread jakirkham


Change by jakirkham :


--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue27149>
___



[issue40007] An attempt to make asyncio.transport.writelines (selector) use Scatter I/O

2021-02-26 Thread jakirkham


Change by jakirkham :


--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue40007>
___



[issue43096] Adding `read_into` method to `asyncio.StreamReader`

2021-02-01 Thread jakirkham


New submission from jakirkham :

To allow reading into a provided buffer without copying, it would be useful to 
have a `read_into` method on `asyncio.StreamReader`, which takes a buffer 
supporting the buffer protocol and fills it.
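
To show the intended call shape, here is a sketch built on the existing `StreamReader.read`. The free function `read_into` is hypothetical and still copies once from the reader's internal buffer, which is exactly the copy a built-in method could avoid.

```python
import asyncio


async def read_into(reader, buf):
    """Hypothetical stand-in: fill `buf` from `reader`, return bytes written."""
    view = memoryview(buf).cast("B")
    data = await reader.read(len(view))   # existing API; returns a new bytes object
    view[:len(data)] = data               # the extra copy a native method would skip
    return len(data)


async def demo():
    reader = asyncio.StreamReader()
    reader.feed_data(b"hello")            # simulate data arriving from a transport
    reader.feed_eof()
    buf = bytearray(8)
    n = await read_into(reader, buf)
    return n, bytes(buf[:n])


result = asyncio.run(demo())
```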

--
components: asyncio
messages: 386114
nosy: asvetlov, jakirkham, yselivanov
priority: normal
severity: normal
status: open
title: Adding `read_into` method to `asyncio.StreamReader`
type: enhancement
versions: Python 3.10

___
Python tracker 
<https://bugs.python.org/issue43096>
___



[issue37293] concurrent.futures.InterpreterPoolExecutor

2020-08-05 Thread jakirkham


Change by jakirkham :


--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue37293>
___



[issue41377] memoryview of str (unicode)

2020-07-23 Thread jakirkham


jakirkham  added the comment:

Thanks for the clarification, Eric! :)

Is this the sort of thing that we could capture in the `format`[1] field (like 
with `"B"`, `"H"`, and `"I"`[2]) or are there potential issues there?

[1]: https://docs.python.org/3/c-api/buffer.html#c.Py_buffer.format
[2]: https://docs.python.org/3/library/struct.html#format-characters
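
For context, this is how the `format` field already distinguishes item widths for typed arrays. The sketch uses `array.array` as the exporter; whether `str` could do something similar is the open question above.

```python
import array
import struct


def describe(code):
    """Return (format, itemsize, nbytes) for a 3-element array of type `code`."""
    mv = memoryview(array.array(code, [1, 2, 3]))
    return mv.format, mv.itemsize, mv.nbytes


# "B", "H", "I" are unsigned 8-, 16- and (typically) 32-bit integers.
formats = {code: describe(code) for code in ("B", "H", "I")}
```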

--

___
Python tracker 
<https://bugs.python.org/issue41377>
___



[issue41377] memoryview of str (unicode)

2020-07-23 Thread jakirkham

New submission from jakirkham :

When working with lower level C/C++ code, the Python Buffer Protocol[1] has 
been immensely useful as it allows common Python `bytes`-like objects to expose 
the underlying memory buffer in a pointer that C/C++ code can easily work with 
zero-copy. In fact `memoryview` objects can be quite handy when facilitating 
coercion of Python objects supporting the Python Buffer Protocol to something 
that Python and/or C/C++ code can use easily. This works with several Python 
objects, many Python APIs, and is relied on heavily by many 
performance-conscious third-party libraries.

However one object that gets a lot of use in Python that doesn't support this 
API is the Python `str` (previously `unicode`) object (see code below).

```python
In [1]: s = "Hello World!"  

In [2]: mv = memoryview(s)  
---
TypeError Traceback (most recent call last)
 in 
> 1 mv = memoryview(s)

TypeError: memoryview: a bytes-like object is required, not 'str'
```

The canonical answer today is [to encode to `bytes` first]( 
https://stackoverflow.com/a/54449407 ) and decode to `str` later. While this is 
ok for a smallish piece of text, it can start to slow down considerably for 
larger pieces of text. So being able to skip this encode/decode step can be 
quite impactful.

```python
In [1]: s = "Hello World!"  

In [2]: %timeit s.encode(); 
54.9 ns ± 0.0788 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [3]: s = 100_000_000 * "Hello World!"

In [4]: %timeit s.encode(); 
729 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
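
For reference, the encode-first workaround mentioned above looks roughly like this. Both the `encode` and the final `str(...)` call allocate and copy, which is the cost being measured.

```python
s = "Hello World!"

encoded = s.encode("utf-8")      # copy 1: str -> bytes
mv = memoryview(encoded)         # zero-copy view over the bytes object

# ... hand `mv` to code that consumes the buffer protocol ...

roundtrip = str(mv, "utf-8")     # copy 2: bytes-like -> str
```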

AIUI (though I could be misunderstanding things) `str` objects do use some kind 
of typed array of unicode characters (either 16-bit narrow or 32-bit wide). So 
it seems like it *should* be possible to expose this as a 1-D contiguous array 
that C/C++ code could use. Though I may be misunderstanding how `str`s actually 
work under-the-hood (if so apologies).
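
As a side note on the storage question: as far as I know, CPython 3.3+ (PEP 393) stores each `str` with 1, 2, or 4 bytes per character depending on the widest code point it contains. The sketch below estimates this with `sys.getsizeof`; it is CPython-specific, and `bytes_per_char` is a helper made up for this illustration.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate per-character storage by comparing strings of length 1 and n + 1."""
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n


widths = {
    "ascii": bytes_per_char("a"),
    "latin-1": bytes_per_char("\xe9"),
    "bmp": bytes_per_char("\u0394"),
    "astral": bytes_per_char("\U0001F600"),
}
```

If that holds, a buffer exported by `str` would need to report an item size that varies from one string to the next.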

It would be quite helpful to bypass this encoding/decoding step and instead 
work directly with the underlying buffer in these situations where C/C++ is 
involved to help performance critical code.

[1]: https://docs.python.org/3/c-api/buffer.html

--
components: Library (Lib)
messages: 374147
nosy: jakirkham
priority: normal
severity: normal
status: open
title: memoryview of str (unicode)
type: enhancement
versions: Python 3.10, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue41377>
___



[issue41226] Supporting `strides` in `memoryview.cast`

2020-07-06 Thread jakirkham


New submission from jakirkham :

Currently one can reshape a `memoryview` using `.cast(...)` like so...

```
In [1]: m = memoryview(b"abcdef")

In [2]: m2 = m.cast("B", (2, 3))
```

However it is not currently possible to specify the `strides` when reshaping 
the `memoryview`. This would be useful if the `memoryview` should be F-order or 
otherwise strided. To that end, syntax like this would be useful...

```
In [1]: m = memoryview(b"abcdef")

In [2]: m2 = m.cast("B", (2, 3), (1, 2))
```
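
For reference, the strides in the example follow from the shape and item size. A small sketch, with the hypothetical helpers `c_strides` and `f_strides`, shows where `(3, 1)` and `(1, 2)` come from:

```python
def c_strides(shape, itemsize=1):
    """Row-major (C-order) strides: the last dimension varies fastest."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def f_strides(shape, itemsize=1):
    """Column-major (F-order) strides: the first dimension varies fastest."""
    strides = []
    step = itemsize
    for dim in shape:
        strides.append(step)
        step *= dim
    return tuple(strides)


default = memoryview(b"abcdef").cast("B", (2, 3)).strides
c_order = c_strides((2, 3))
f_order = f_strides((2, 3))
```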

------
messages: 373202
nosy: jakirkham
priority: normal
severity: normal
status: open
title: Supporting `strides` in `memoryview.cast`
versions: Python 3.10, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue41226>
___



[issue41223] `object`-backed `memoryview`'s `tolist` errors

2020-07-06 Thread jakirkham


New submission from jakirkham :

When working with an `object`-backed `memoryview`, it seems we are unable to 
coerce it to a `list`. This would be useful as it would provide a way to get 
the underlying `object`'s into something a bit easier to work with.

```
In [1]: import numpy

In [2]: a = numpy.array(["abc", "def", "ghi"], dtype=object)

In [3]: m = memoryview(a)   

In [4]: m.tolist()  
---
NotImplementedError   Traceback (most recent call last)
 in 
> 1 m.tolist()

NotImplementedError: memoryview: format O not supported
```

--
messages: 373175
nosy: jakirkham
priority: normal
severity: normal
status: open
title: `object`-backed `memoryview`'s `tolist` errors
versions: Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue41223>
___



[issue39645] Expand concurrent.futures.Future's public API

2020-06-12 Thread jakirkham


Change by jakirkham :


--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue39645>
___



[issue35845] Can't read a F-contiguous memoryview in physical order

2020-06-05 Thread jakirkham


jakirkham  added the comment:

Sorry if I'm just misunderstanding the discussion here. Would it make sense to 
have an `order` keyword argument to `cast` as well? This seems useful when 
interpreting a flattened F-order `bytes` object (say on the receiving end of a 
transmission).

--
nosy: +jakirkham

___
Python tracker 
<https://bugs.python.org/issue35845>
___



[issue40718] Support out-of-band pickling for builtin types

2020-05-21 Thread jakirkham


New submission from jakirkham :

It would be nice (where possible) to support out-of-band pickling of builtin 
`bytes`-like types. This would allow binary data from these objects to be 
shipped along separately zero-copy and later reconstructed during unpickling.

It seems that `bytes`, `bytearray`, and `array` would be good candidates for 
this behavior. Not sure if `mmap` or `memoryview` would make sense as it might 
not be clear on how to reconstruct them during unpickling, but if someone sees 
a way those would be nice to support too.

To illustrate this a bit, here is the behavior with a `bytes` object today:

```
In [1]: import pickle

In [2]: b = b"abc"

In [3]: l = []

In [4]: p = pickle.dumps(b, protocol=5, buffer_callback=l.append)

In [5]: l
Out[5]: []
```

With this change, we would see this behavior instead:

```
In [1]: import pickle

In [2]: b = b"abc"

In [3]: l = []

In [4]: p = pickle.dumps(b, protocol=5, buffer_callback=l.append)

In [5]: l
Out[5]: [<pickle.PickleBuffer at 0x...>]
```
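
Until builtin types do this themselves, one can opt in by wrapping the object in `pickle.PickleBuffer` explicitly. A minimal sketch follows; note that unpickling then yields a buffer-providing object rather than a `bytes` instance, so this is a workaround, not the behavior requested here.

```python
import pickle

b = b"abc"
buffers = []

# Wrapping in PickleBuffer asks the pickler to hand the data to buffer_callback.
payload = pickle.dumps(pickle.PickleBuffer(b), protocol=5,
                       buffer_callback=buffers.append)

# The same buffers must be supplied again when loading.
restored = pickle.loads(payload, buffers=buffers)
restored_bytes = memoryview(restored).tobytes()
```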

(This is my first Python bug submission. So apologies if I got turned around 
here. Please go easy on me :)

--
messages: 369533
nosy: jakirkham
priority: normal
severity: normal
status: open
title: Support out-of-band pickling for builtin types
type: performance
versions: Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue40718>
___