[Python-ideas] Re: isolating user allocations

Wenjun Huang Mon, 20 Jul 2020 19:28:44 -0700

Hi Guido,

Thank you for bearing with me. I wasn't trying to say you guys are mean btw.

I thought that the interpreter might allocate some memory for its own use.
Perhaps I was wrong, but I'll work with your examples here just to be sure.

Stack frames would count as interpreter objects here, because they aren't
created as a consequence of creating a user object; they are the result of
function calls. By the same reasoning, the empty slots in hash tables and
the cached string hashes would count as user allocations, because they come
into existence through explicitly created objects. I think a transitive
relation would work here (i.e. if an explicit object allocation triggers an
implicit allocation, then the latter is considered a user allocation).
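
To make the rule concrete, here is a rough sketch of the transitive
classification as a toy model. Nothing here corresponds to an existing
CPython API: the Allocation record, its explicit / triggered_by fields and
is_user_allocation() are all names I made up for illustration, and a real
implementation would need the allocator to record this information.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Allocation:
    # Hypothetical record of a single allocation (not a CPython structure).
    label: str
    explicit: bool = False                       # requested directly by user code
    triggered_by: Optional["Allocation"] = None  # allocation that caused this one


def is_user_allocation(alloc: Optional[Allocation]) -> bool:
    """Walk the trigger chain; any explicit ancestor makes it a user allocation."""
    while alloc is not None:
        if alloc.explicit:
            return True
        alloc = alloc.triggered_by
    return False


d = Allocation("dict object", explicit=True)
table = Allocation("hash table with empty slots", triggered_by=d)
frame = Allocation("stack frame for a call")

print(is_user_allocation(table))  # True: triggered by an explicit dict
print(is_user_allocation(frame))  # False: no explicit allocation behind it
```

Under this model a stack frame only becomes a user allocation if something
explicit is recorded as its trigger, which is exactly the judgment call being
discussed.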

Now, maybe getting this to work doesn't benefit profiler users so much, but
there are other potential uses as well. Hopefully they can be more
compelling. I didn't bring these up earlier because I thought the profiling
case was easier to discuss.

For example, provenance of data can be tracked through taint analysis, but
if all objects are lumped together then we have to taint the entire
interpreter.

Another example would be partial GIL sidestepping. The approach would be to
turn threads into processes and allocate all user objects in shared memory,
with accesses synchronized. This way we get parallel execution while
keeping threading semantics. However, this is not possible if we can't
isolate user objects, as there's no sensible default way to synchronize
interpreter state across processes. This design has been done before for
C/C++ (https://people.cs.umass.edu/~emery/pubs/dthreads-sosp11.pdf), but
for different reasons.
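
As a very rough illustration of the shape of that idea, and only for a
single integer rather than arbitrary user objects, the standard
multiprocessing module can already place a value in shared memory and guard
it with a lock. The function names deposit() and run_as_processes() are
made up for this sketch; it says nothing about how the general case, where
every user object lives in shared memory, would be implemented.

```python
import multiprocessing as mp


def deposit(balance, times):
    """Add 1 to the shared balance `times` times, locking on every access."""
    for _ in range(times):
        with balance.get_lock():  # synchronized access to the shared object
            balance.value += 1


def run_as_processes(n_workers, times):
    """Run what would have been threads as processes sharing one integer."""
    balance = mp.Value("i", 0)  # stored in shared memory, carries its own lock
    workers = [
        mp.Process(target=deposit, args=(balance, times))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return balance.value
```

Calling run_as_processes(4, 1000) from under an if __name__ == "__main__":
guard should give 4000, since every increment happens under the lock. The
interpreter state of each process (frames, caches and so on) stays private,
which is the part that depends on being able to tell user objects apart.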

On Mon, Jul 20, 2020 at 8:16 PM Guido van Rossum <[email protected]> wrote:

> On Mon, Jul 20, 2020 at 4:09 PM Wenjun Huang <[email protected]>
> wrote:
>
>> Hi Barry,
>>
>> It's not just about leaks. You might want to know if certain objects are
>> occupying a lot of memory by themselves. Then you can optimize the memory
>> usage of these objects.
>>
>> Another possibility is to do binary instrumentation and see how the user
>> code is interacting with objects. If we can't tell which objects are
>> created by the interpreter internals, then interpreter accesses and user
>> accesses would be mixed together. It's likely that some accesses would be
>> connected of course, but I don't think this should be outright labeled as
>> useless.
>>
>
> I have to side with Barry -- I don't understand why the difference between
> "interpreter internals" and "user objects" matters. Can you give some
> examples of interpreter internals that aren't being allocated in direct
> response to user code? For example you might call stack frames internals.
> But a stack frame is only created when a user calls a function, so maybe
> that's a user object too? Or take dictionaries. These contain hash tables
> with empty spaces in them. Are the empty spaces internals? Or strings.
> These cache the hash value. Are the 8 bytes for the hash value interpreter
> internals?
>
> So, here's my request -- can you clarify your need for the
> differentiation? Other than just pointing to Scalene. If Scalene has a
> reason for making this differentiation can you explain what Scalene users
> get out of this? Suppose Scalene tells me "your objects take 84.3% of the
> memory and interpreter internals take the other 17.7%" what can I as a user
> do with that information?
>
>
>> Also, I'm not saying "we must implement this because it's so useful."
>> My original intention is to understand:
>> (1) is the differentiation being done at all?
>>
>
> It's not. We're not being mean here. If it was being done someone would
> have told you after your first message.
>
>
>> (2) if it's not being done, why?
>>
>
> Because nobody saw a need for it. In fact, apart from you, there still
> isn't anyone who sees the need for it, since you haven't explained your
> need. (This, too, should have been obvious to you given the responses you've
> gotten so far. :-)
>
>
>> (3) does it make sense to implement it?
>>
>
> Probably not. I certainly don't expect it to be easy. So it won't "make
> sense" unless you have actually explained your reason for wanting this and
> convinced some folks that this is a good reason. See the answer for (1) and
> (2) above.
>
>
>> So far I think I've got the answers to 1 & 2--it's not being done because
>> people don't find it useful. The answer to 3 is most likely "no" due to the
>> costs, but it would be nice if someone could weigh in on this part. Maybe
>> there's some workaround.
>>
>
> If you were asking me to weigh in *now* I'd say "no", if only because you
> haven't explained the reason why this is needed. And if you have an
> implementation idea in mind, please don't be shy.
>
> --
> --Guido van Rossum (python.org/~guido)
> *Pronouns: he/him **(why is my pronoun here?)*
> <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
>

_______________________________________________
Python-ideas mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/XHSHLQYFJ5S45J66PO45JVYXONZMUMV2/
Code of Conduct: http://python.org/psf/codeofconduct/
