Kyle Stanley <aeros...@gmail.com> added the comment:

> I understand that there's *some* overhead associated with spawning a new 
> thread, but from my impression it's not substantial enough to make a 
> significant impact in most cases.

Although I think this still stands to some degree, I will have to rescind the 
following:

> Each individual instance of threading.Thread is only 64 bytes.

The 64 bytes was measured by `sys.getsizeof(threading.Thread())`, which only 
provides a surface-level assessment: it reports the shallow size of the thread 
object itself, without counting any of the objects it references (such as its 
instance dict, locks, and so on).
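
To illustrate the distinction (a minimal demonstration; exact numbers will 
vary by platform and Python build):

import sys
import threading

t = threading.Thread()
# Shallow size: only the Thread object itself (64 bytes on the build above),
# not the instance dict, locks, etc. that it references.
print(sys.getsizeof(t))
# The instance __dict__ alone is already larger than that:
print(sys.getsizeof(t.__dict__))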

To get a better estimate, I implemented a custom get_size() function that 
recursively adds the size of the object and of every unique object reachable 
via gc.get_referents() (ignoring several redundant and/or unnecessary types). 
For more details, see 
https://gist.github.com/aeros/632bd035b6f95e89cdf4bb29df970a2a. Feel free to 
critique it if you spot any issues that would affect its purpose of measuring 
the size of threads.
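
The core of it is roughly the following sketch (simplified; the gist has the 
full version, and the exact set of excluded types there differs slightly):

import gc
import sys
from types import FunctionType, ModuleType

# Types excluded from the traversal (shared and/or redundant objects that
# would inflate the count).
EXCLUDE = (type, FunctionType, ModuleType)

def get_size(obj):
    """Sum sys.getsizeof() over obj and everything reachable from it."""
    seen = set()
    size = 0
    objects = [obj]
    while objects:
        new = []
        for o in objects:
            if id(o) not in seen and not isinstance(o, EXCLUDE):
                seen.add(id(o))
                size += sys.getsizeof(o)
                new.append(o)
        # Follow every reference the GC knows about, breadth-first.
        objects = gc.get_referents(*new)
    return size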

I then used this function on three different threads to figure out how much 
memory each one required:

Python 3.8.0+ (heads/3.8:1d2862a323, Nov  4 2019, 06:59:53) 
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import threading
>>> from get_size import get_size
>>> a = threading.Thread()
>>> b = threading.Thread()
>>> c = threading.Thread()
>>> get_size(a)
3995
>>> get_size(b)
1469
>>> get_size(c)
1469

1469 bytes seems to be roughly the amount of additional memory required for 
each new thread, at least on Linux kernel 5.3.8 and Python 3.8. (The first 
call reports more, presumably because it also picks up objects that are 
allocated once and then shared by subsequent threads.) I don't know if this is 
100% accurate, but it at least provides a better estimate than 
sys.getsizeof().

> But it spawns a new Python thread per process which can be a blocker issue if 
> a server memory is limited. What if you want to spawn 100 processes? Or 1000 
> processes? What is the memory usage?

From my understanding, ~1.5KB/thread seems to be quite negligible for most 
modern equipment. The server's memory would have to be very limited for 
spawning an additional 1000 threads to be a bottleneck/blocker issue:

Python 3.8.0+ (heads/3.8:1d2862a323, Nov  4 2019, 06:59:53) 
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import threading
>>> from get_size import get_size
>>> threads = []
>>> for _ in range(1000):
...     th = threading.Thread()
...     threads.append(th)
... 
>>> get_size(threads)
1482435

(~1.5MB total, or ~1.5KB per thread, consistent with the earlier figure)

Victor (or anyone else), in your experience, would the additional ~1.5KB per 
process be an issue for 99% of production servers? If not, it seems to me like 
the additional maintenance cost of keeping SafeChildWatcher and 
FastChildWatcher in asyncio's API wouldn't be worthwhile.
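
For reference, this is the pattern in question: with the default 
ThreadedChildWatcher on 3.8, each subprocess below is waited on by its own 
thread (a minimal sketch, using "true" as a stand-in workload; counts much 
larger than this may also run into OS fd limits):

import asyncio

async def main():
    # Each subprocess gets a dedicated waiter thread under
    # ThreadedChildWatcher, which is where the ~1.5KB/process comes in.
    procs = [await asyncio.create_subprocess_exec("true")
             for _ in range(100)]
    await asyncio.gather(*(proc.wait() for proc in procs))

asyncio.run(main())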

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue38591>
_______________________________________