> On 17 Mar 2020, at 16:43, Mark Shannon <m...@hotpy.org> wrote:
> 
> On 17/03/2020 3:38 pm, Steve Dower wrote:
>> On 17Mar2020 1447, Mark Shannon wrote:
>>> On 16/03/2020 3:04 pm, Victor Stinner wrote:
>>>> In short, the answer is yes.
>>> 
>>> I said "no" then and gave reasons. AFAICT no one has faulted my reasoning.
>> I said "yes" then and was also not faulted.
> 
> I'll do that now then ;)
> 
> The accessibility of a thread-local variable is a strict superset of that of 
> a function-local variable.
> 
> Therefore storing the thread state in a thread-local variable is at least as 
> capable as passing thread-state as a parameter.
> 
>>> Let me reiterate why using a thread-local variable is better than passing 
>>> the thread state down the C stack.
>>> 
>>> 1. Using a thread-local variable for the thread state requires much smaller 
>>> changes to the code base.
>> Using thread-local variables enforces a threading model on the host 
>> application, rather than working with the existing threading model. So 
>> anyone embedding is forced into *significantly* more code as a result.
> 
> Putting a value in a function-local variable enforces stronger restrictions 
> than putting it in a thread-local variable.
> 
> I am proposing that we *don't* change the API. How does that make more work 
> for anyone using the API?

Are you saying that all interpreters can use the same thread-local-variable for 
tstate?

Or do you need N thread-local-variables for N interpreters?
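
To make the first option concrete, here is a minimal C11 sketch of a single thread-local slot serving several interpreters. It assumes each thread state records which interpreter owns it; the names `SketchInterp`, `SketchThreadState`, `sketch_current` and `sketch_current_interp` are hypothetical and are not CPython's definitions. It only illustrates the question, it does not claim to be what Mark is proposing.

```c
#include <stddef.h>

/* Hypothetical types for illustration; not CPython's real structs. */
typedef struct SketchInterp {
    int interp_id;
} SketchInterp;

typedef struct SketchThreadState {
    SketchInterp *interp;   /* the interpreter this thread state belongs to */
} SketchThreadState;

/* A single thread-local slot, shared by all interpreters. */
static _Thread_local SketchThreadState *sketch_current = NULL;

/* The active interpreter is derived from whichever thread state is
   currently installed on this OS thread (NULL if none is installed). */
static SketchInterp *sketch_current_interp(void) {
    return sketch_current ? sketch_current->interp : NULL;
}
```

Under that assumption, switching a thread from one interpreter to another means installing a different thread state in the same slot, rather than having N thread-local variables.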

Barry


> 
>> We can (and should) maintain a public-facing API that uses TLS to pass the 
>> current thread state around - we have compatibility constraints. But we can 
>> also add private versions that take the thread state (once you've started 
>> trying/struggling to really embed CPython, you'll happily take a dependency 
>> on "private" APIs).
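
The layering described above (a public entry point that finds the thread state through TLS, plus a private variant that takes it as an argument) might look roughly like the following C11 sketch. Every name here (`DemoThreadState`, `demo_current`, `demo_get`, `demo_get_id`, `demo_get_id_ts`) is hypothetical and stands in for no real CPython API.

```c
#include <stddef.h>

/* Hypothetical stand-in for a thread state; not CPython's PyThreadState. */
typedef struct DemoThreadState {
    int id;
} DemoThreadState;

/* One slot per OS thread (C11 thread-local storage). */
static _Thread_local DemoThreadState *demo_current = NULL;

/* Inline accessor for the current thread's state. */
static inline DemoThreadState *demo_get(void) {
    return demo_current;
}

/* "Private" variant: the caller passes the thread state explicitly. */
static int demo_get_id_ts(DemoThreadState *ts) {
    return ts->id;
}

/* "Public" variant: unchanged signature, fetches the state from TLS. */
static int demo_get_id(void) {
    return demo_get_id_ts(demo_get());
}
```

In this sketch the two styles coexist: existing callers keep using `demo_get_id()`, while code that already holds a thread state can call `demo_get_id_ts()` directly.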
> 
> Again, I am requesting that we *don't* change the API.
> Not changing the API maintains backwards compatibility better than changing 
> it, surely.
> 
>> If the only available API requires TLS, then you're likely to see the caller 
>> wrap it all up in a function that updates TLS before calling. Or 
>> alternatively, introduce dedicated threads for running Python snippets on, 
>> and all the (dead)locking that results (yes, I've done both).
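
The "wrap it all up in a function that updates TLS before calling" pattern can be sketched as below. This is an illustration under assumptions, not code from any embedder: `HostThreadState`, `host_current`, `host_api_call` and `host_call_with` are hypothetical names, and the "API" is reduced to a counter increment.

```c
#include <stddef.h>

/* Hypothetical thread state that just counts API calls made against it. */
typedef struct HostThreadState {
    int calls;
} HostThreadState;

static _Thread_local HostThreadState *host_current = NULL;

/* Hypothetical TLS-only API: works on whatever state is installed. */
static void host_api_call(void) {
    if (host_current != NULL) {
        host_current->calls++;
    }
}

/* Embedder-side wrapper: install ts, make the call, restore the old value. */
static void host_call_with(HostThreadState *ts) {
    HostThreadState *saved = host_current;
    host_current = ts;
    host_api_call();
    host_current = saved;
}
```

Every call site that wants to act on a specific thread state has to go through a save/install/restore sequence like `host_call_with()`, which is the extra embedding code being referred to.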
> 
> All the platforms that we support have thread-local storage.
> If a platform doesn't have threads at all, then thread-local just degenerates 
> to a global.
> 
>> Our goal as core CPython developers should be to sacrifice our own effort to 
>> reduce the effort needed by our users, not to do things that make our own 
>> lives easier but harm them.
> 
> Indeed. We might want to speed Python up a bit as well :)
> 
>>> 2. Using a thread-local variable is less error prone. When passing tstate 
>>> as a parameter, what happens if the tstate argument is from a different 
>>> thread or is NULL? Are you adding checks for those cases?
>>> What are the performance implications of adding those checks?
>> Undefined behaviour is totally acceptable here. We can assert in debug 
>> builds - developers who make use of this can test with debug builds.
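
A minimal sketch of what "assert in debug builds" could mean for a function that takes the thread state as a parameter. The names `DbgThreadState`, `dbg_current` and `dbg_enter` are hypothetical; the only real facility used is the standard `assert` macro, which is disabled when `NDEBUG` is defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread state with a simple nesting counter. */
typedef struct DbgThreadState {
    int depth;
} DbgThreadState;

static _Thread_local DbgThreadState *dbg_current = NULL;

/* Takes the thread state explicitly. Without NDEBUG, the asserts catch a
   NULL state or one that is not installed on the calling thread. With
   NDEBUG they compile away, so misuse is simply undefined behaviour. */
static int dbg_enter(DbgThreadState *ts) {
    assert(ts != NULL);
    assert(ts == dbg_current);
    return ++ts->depth;
}
```

The checks cost nothing in release builds, at the price of leaving a wrong or NULL argument unchecked there.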
> 
> I'm not sure what your point is about undefined behaviour.
> 
>>> 3. Using a thread-local variable is likely to be a little bit faster. 
>>> Passing an argument down the stack increases register pressure and spills.
>>> Accessing a thread-local is slower at the point of access, but the cost is 
>>> incurred only when it is needed, so is cheaper overall.
>> Compilers can optimise parameters/locals in ways that are far more efficient 
>> than they can do for anything stored outside the call stack. Especially for 
>> internal calls. Going through public/exported functions is a little more 
>> restricted in terms of optimisations, but if we identify an issue here then 
>> we can work on that then.
> 
> Please skip the patronizing "how compilers work" stuff.
> I know how register allocators work.
> 
>> [OTHER POST]
>>> Just to be clear, this is what I mean by a thread local variable:
>>> https://godbolt.org/z/dpSo-Q
>> Showing what one particular compiler generates for one particular situation 
>> is terrible information (I won't bother calling it "evidence").
> 
> The particular situation is the use of a thread-local variable, which is the 
> point under discussion.
> 
> Here are the links for clang and MSVC:
> 
> https://godbolt.org/z/YnbbqD
> https://www.godbolt.ms/z/9nQEqf
> 
>>>> One motivation is to ease the implementation of subinterpreters (PEP
>>>> 554). But PEP 554 describes the public API more than the
>>>> implementation.
>>> 
>>> I don't see how this eases the implementation of subinterpreters.
>>> Surely it makes it harder by causing merge conflicts.
>> That's a very selfish point-of-view :)
> 
> Why? Merge conflicts are a problem for everyone.
> 
>> It eases it because many more operations need to know the current Python 
>> "thread" in order to access things that used to be globals, such as 
>> PyTypeObject instances. Having the thread state easily and efficiently 
>> accessible does make a difference here.
> 
> Indeed. I want it to be easily and efficiently accessible.
> Putting it in a thread-local variable does both.
> For additional efficiency `_PyThreadState_GET()` can be defined as `inline`.
> 
>>>> The long-term goal is to be able to run multiple isolated interpreters
>>>> in parallel.
>>> 
>>> An admirable goal, IMO.
>>> But how is it to be done? That is the question.
>> By isolating thread states properly from global state.
> 
> Yes. But why do you think passing a value down the stack does that better 
> than a thread-local variable?
> 
>>>> Sorry about that :-/ A lot of Python internals should be modified to
>>>> implement subinterpreters.
>>> 
>>> I don't think they *should*. When they *must*, then do that.
>>> Changes should only be made if necessary for correctness or for a 
>>> significant improvement of performance.
>> They must - I think Victor just chose the wrong English word there. 
>> Correctness is the first thing to fall when you access globals in 
>> multithreaded code, and the CPython code base accesses a lot of globals.
> 
> Indeed, but this discussion has nothing to do with global variables, it is 
> about how we access thread-state.
> 
> Cheers,
> Mark.
> 
>> Cheers,
>> Steve
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/ERBHVDSI7OVKDZEOAU52PBFBTGA4USJN/
> Code of Conduct: http://python.org/psf/codeofconduct/
> 
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/T3MO47265RKWH2BKI2WNXZJVPTYJJ2RD/
