[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-28 Thread Nick Coghlan
On Tue., 24 Sep. 2019, 10:29 am Nathaniel Smith,  wrote:

> On Mon, Sep 23, 2019 at 1:30 PM Vinay Sajip via Python-Dev
>  wrote:
> >
> > OK - but that's just one I picked at random. There are others like it -
> what would be the process for deciding which ones need to be made private
> and moved? Should an issue be raised to track this?
>
> There are really two issues here:
>
> - hiding the symbols that *aren't* marked PyAPI_*, consistently across
> platforms.
> - finding symbols that are currently marked PyAPI_*, but shouldn't be.
>
> The first one is a pretty straightforward technical improvement. The
> second one is a longer-term project that could easily get bogged down
> in complex judgement calls. So let's worry about them separately. Even
> if there are too many symbols marked PyAPI_*, we can still get started
> on hiding all the symbols that we *know* should be hidden.
>

One of the other things to keep in mind is that a lot of new symbol exports
are created by copying an existing one, rather than by going & reading the
C API documentation on exporting symbols correctly.

Historically, though, it wasn't especially obvious from the header files
themselves that "exported for internal linkage" and "exported as part of
the public C API" should be written differently, so it was easy to use the
wrong style, and patch reviewers wouldn't necessarily notice.

With the structural split now in place, the creep of incorrectly published
symbols should be contained, and the public header files can start being
reviewed for cases like this one.

Cheers,
Nick.




>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7DW34MYR3DZKUXW76DMLB54V22PWMCVX/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Vinay Sajip via Python-Dev
Fair enough. Pull request created:

https://github.com/python/cpython/pull/16347

Regards,

Vinay Sajip
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IQGK5HFMV6ESBF7CS32U2BWVM23DIKXF/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Kyle Stanley
> How easy would it be to search the sources and the docs for the ones
that are currently public but not documented?

Comparing the docs and .c files for functions that are not documented
would be fairly trivial, but it's very difficult to define something as
being
public if it's not documented, since that's the main definition used. If
it's not
in the documentation, there's not a definitive way tell that it's public
(AFAIK).

The presence or lack of the name starting with an underscore can sometimes
be an indicator, but that's not consistently reliable.

On Mon, Sep 23, 2019 at 7:59 PM MRAB  wrote:

> On 2019-09-23 21:22, Vinay Sajip via Python-Dev wrote:
> > OK - but that's just one I picked at random. There are others like it -
> what would be the process for deciding which ones need to be made private
> and moved? Should an issue be raised to track this?
> >
> It's not really a question of which ones should be made private, but of
> which ones should be remain public. If it's mentioned in the docs, then
> it's public, otherwise it should be private.
> How easy would it be to search the sources and the docs for the ones
> that are currently public but not documented?
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/4KUPJBNXXL7VD6JPAC7YCCTD56A5OUAZ/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/U3WTOSLDLIZBIB4IJPJ4LSMWQNLY7W2V/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Nathaniel Smith
On Mon, Sep 23, 2019 at 1:30 PM Vinay Sajip via Python-Dev
 wrote:
>
> OK - but that's just one I picked at random. There are others like it - what 
> would be the process for deciding which ones need to be made private and 
> moved? Should an issue be raised to track this?

There are really two issues here:

- hiding the symbols that *aren't* marked PyAPI_*, consistently across
platforms.
- finding symbols that are currently marked PyAPI_*, but shouldn't be.

The first one is a pretty straightforward technical improvement. The
second one is a longer-term project that could easily get bogged down
in complex judgement calls. So let's worry about them separately. Even
if there are too many symbols marked PyAPI_*, we can still get started
on hiding all the symbols that we *know* should be hidden.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VLTZW3HQUK3ZAQEEKZIHCJRUOXRMUPVV/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread MRAB

On 2019-09-23 21:22, Vinay Sajip via Python-Dev wrote:

OK - but that's just one I picked at random. There are others like it - what 
would be the process for deciding which ones need to be made private and moved? 
Should an issue be raised to track this?

It's not really a question of which ones should be made private, but of 
which ones should be remain public. If it's mentioned in the docs, then 
it's public, otherwise it should be private.
How easy would it be to search the sources and the docs for the ones 
that are currently public but not documented?

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4KUPJBNXXL7VD6JPAC7YCCTD56A5OUAZ/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Vinay Sajip via Python-Dev
OK - but that's just one I picked at random. There are others like it - what 
would be the process for deciding which ones need to be made private and moved? 
Should an issue be raised to track this?

Regards,

Vinay Sajip
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/USKL36S4SRDEK7TY46C2HIUHXFVRDQUQ/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Victor Stinner
Le lun. 23 sept. 2019 à 21:36, Vinay Sajip via Python-Dev
 a écrit :
> Nathaniel Smith wrote:
> > Windows already has working symbol visibility handling, and PyAPI_FUNC is
> > what controls it. So adding symbol visibility handling to Linux/macOS is
> > just about making all the platforms consistent. There might be some weird
> > choices being made, but I don't think you need to sort all those out as
> > part of this.
>
> Well, _Py_DecodeLocaleEx is declared with PyAPI_FUNC, so would you expect it 
> to be exposed on Windows?

_Py_DecodeLocaleEx() should not be used outside Python. It's really a
low-level function which should not be used directly.

Its definition should be moved to the internal API and it should use
extern rather than PyAPI_FUNC(). It's a private API so we can make it
internal.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MEAAF636JYXNN7RKXJ4TKKUUBJALIMN4/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Vinay Sajip via Python-Dev
Ah - I checked, and it's there OK ... (head scratch)

Regards,

Vinay Sajip
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/POLJVZ7UJGQ74E5YYZ3G5HPNDLP6G5T5/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Vinay Sajip via Python-Dev
Nathaniel Smith wrote:
> Windows already has working symbol visibility handling, and PyAPI_FUNC is
> what controls it. So adding symbol visibility handling to Linux/macOS is
> just about making all the platforms consistent. There might be some weird
> choices being made, but I don't think you need to sort all those out as
> part of this.

Well, _Py_DecodeLocaleEx is declared with PyAPI_FUNC, so would you expect it to 
be exposed on Windows? I haven't a Windows machine handy right now, but I would 
expect it not to be exposed, as it doesn't appear in PC/python3.def. I will 
check when I get a chance.

Regards,

Vinay Sajip
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Q6H72YBNCKQYUQ4O4TMP65R4VEPEVLMT/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Vinay Sajip via Python-Dev
Thanks for the pointer. According to that information, everything in 
Include/fileutils.h should be the portable public C API. But there are 
definitions in there which start with _Py - and the information relating to 
Include/cpython/*.h suggests that API prefixed with _Py is conventionally 
private. Is there a reason why (for example) _Py_DecodeLocaleEx appears in 
Include/*.h with a prefix conventionally suggesting it's private, and which is 
backed up by the fact that it's not mentioned in the C API documentation? If 
it's just there for the moment because no-one got around to moving it to 
Include/cpython/*.h, it would be useful to know that. I'm not trying to 
nit-pick here - if we're going to apply visibility rules, then things like this 
need to be absolutely clear.

Just for info, I tried configuring and compiling with -fvisibility=hidden and 
declared default visibility on PyAPI_FUNC, PyAPI_DATA and PyMODINIT_FUNC, as 
well as one or two other declarations. The resulting build passes all tests run 
locally, except test_tools (which is currently also failing for me on master). 
I haven't yet pushed these changes to my public CPython fork, as it's still 
work in progress - which is why I'm asking these questions.

Regards,

Vinay Sajip
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/G6P4RSVKFF7E6QYHBO4ZRLNHSNJWA6MT/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Nathaniel Smith
On Mon, Sep 23, 2019, 08:28 Vinay Sajip via Python-Dev <
python-dev@python.org> wrote:

> > requires some newer tools like -fvisibility=hidden that work
> > differently across different platforms, and so far no-one's done the
> > work to sort out the details.
>
> I've started looking at this, but quite apart from the specifics of
> applying -fvisibility=hidden, there are some things that aren't yet clear
> to me about the intent behind some of our symbol definitions. For example,
> the file Include/fileutils.h contains the definitions
>
> PyAPI_FUNC(wchar_t *) Py_DecodeLocale(const char *arg, size_t *size);
>
> and
>
> PyAPI_FUNC(int) _Py_DecodeLocaleEx(const char *arg,
> wchar_t **wstr,
> size_t *wlen,
> const char **reason,
> int current_locale,
> _Py_error_handler errors);
>
> However, only the first of these is documented, though the definition via
> PyAPI_FUNC implies that both are part of the public API. If this is the
> case, why aren't both documented? If _Py_DecodeLocaleEx is not part of the
> public API (and the leading underscore suggests so), should it be polluting
> the symbol space?
>
> The comment for PyAPI_FUNC is "Declares a public Python API function and
> return type". Is this really the case, or has PyAPI_FUNC been coopted to
> provide external linkage for use by Python-internal code in different
> compilation units?  _Py_DecodeLocaleEx is called in
> Modules/_testcapimodule.c and also in Objects/unicodeobject.c.
>
> If we want to take steps to restrict symbol visibility, it will
> potentially affect all of the code base - so presumably, a PEP would be
> required, even though it's an implementation detail from the point of view
> of the language itself?
>

Windows already has working symbol visibility handling, and PyAPI_FUNC is
what controls it. So adding symbol visibility handling to Linux/macOS is
just about making all the platforms consistent. There might be some weird
choices being made, but I don't think you need to sort all those out as
part of this.

-n
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TKSQQ5TLVRY4XYMSKVJ7X5GMXANOIDGG/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Brett Cannon
Vinay Sajip wrote:
> > Right, I'm
> > pretty sure that right now Python doesn't have any way to
> > share symbols between .c files without also exposing them in the C
> > API.
> > On other C projects I've worked on, the public API is expressed in one set
> of header files, and internal APIs that need to be exposed across modules are 
> described in
> a different set of internal header files, and developers who incorrectly use 
> internal APIs
> by including the internal headers could see breakage when the internals 
> change ... excuse
> my naïveté, as I haven't done much at Python's C level - does this 
> discipline/approach not
> apply to CPython?

As of Python 3.8 we do this sort of separation: 
https://docs.python.org/3.8/whatsnew/3.8.html#build-and-c-api-changes.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GS2GJDU6J4RXWPHDB5UJJTHLVULOGVLX/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Vinay Sajip via Python-Dev
> requires some newer tools like -fvisibility=hidden that work
> differently across different platforms, and so far no-one's done the
> work to sort out the details.

I've started looking at this, but quite apart from the specifics of applying 
-fvisibility=hidden, there are some things that aren't yet clear to me about 
the intent behind some of our symbol definitions. For example, the file 
Include/fileutils.h contains the definitions

PyAPI_FUNC(wchar_t *) Py_DecodeLocale(const char *arg, size_t *size);

and 

PyAPI_FUNC(int) _Py_DecodeLocaleEx(const char *arg,
wchar_t **wstr,
size_t *wlen,
const char **reason,
int current_locale,
_Py_error_handler errors);

However, only the first of these is documented, though the definition via 
PyAPI_FUNC implies that both are part of the public API. If this is the case, 
why aren't both documented? If _Py_DecodeLocaleEx is not part of the public API 
(and the leading underscore suggests so), should it be polluting the symbol 
space?

The comment for PyAPI_FUNC is "Declares a public Python API function and return 
type". Is this really the case, or has PyAPI_FUNC been coopted to provide 
external linkage for use by Python-internal code in different compilation 
units?  _Py_DecodeLocaleEx is called in Modules/_testcapimodule.c and also in 
Objects/unicodeobject.c.

If we want to take steps to restrict symbol visibility, it will potentially 
affect all of the code base - so presumably, a PEP would be required, even 
though it's an implementation detail from the point of view of the language 
itself?

Regards,

Vinay Sajip
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/H726K4PDW6PI34VVBX6HN26MAE5IZUNG/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Vinay Sajip via Python-Dev
> Moving some of the especially common identifier references into the
> interpreter state struct may make sense.
> Adding more process globals wouldn't be desirable though, as they're one of
> the more common ways of breaking encapsulation between subinterpreters
> (hence Eric's efforts to eliminate as many of them as he reasonably can).

Well, another way of breaking encapsulation is by having ad hoc statics (as 
opposed to globals) which should really be in the interpreter state - it 
prevents compartmentalizing the GIL into per-subinterpreter GILs, for example. 
That's my motivation for looking at this area - I spent a bit of time working 
with Eric at the recent core dev sprint, and wanted to explore the problems in 
this area.

Regards,

Vinay
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/43BYLJIZXPTVBGZTTQUTSPICZVWIF6HR/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Nick Coghlan
On Sat., 21 Sep. 2019, 5:28 am Vinay Sajip via Python-Dev, <
python-dev@python.org> wrote:

> I just ran an analysis of static variable definitions in CPython code,
> using clang, on Ubuntu and Windows. The results should be available here:
>
> https://cpython.red-dove.com/
>
> As I understand it, _Py_IDENTIFIER instances are supposed to hold constant
> strings that are used in Python - e.g. "__class__", "__dict__" and so on. I
> noticed that there are numerous duplicates of these - e.g. 8 instances of
> __name__, 11 instances of __dict__, and so on - each instance is defined as
> static in its source file and so completely distinct from the others.
>
> I realise the overall amount of memory used by these structures isn't
> large, but is there any particular benefit to having these multiple copies?
> The current situation seems a little untidy, at least. What would be the
> disadvantage of making them extern in the headers and allocating them once
> in some consts.c module? After all, they seem for the most part to be
> well-known constant strings that don't need to be encapsulated in any
> particular C compilation unit.
>

Moving some of the especially common identifier references into the
interpreter state struct may make sense.

Adding more process globals wouldn't be desirable though, as they're one of
the more common ways of breaking encapsulation between subinterpreters
(hence Eric's efforts to eliminate as many of them as he reasonably can).

Cheers,
Nick.


>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IVMLDHZOE2WCGYA5PZQ24R3PYBQEUVXH/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-20 Thread Nathaniel Smith
On Fri, Sep 20, 2019 at 2:58 PM Vinay Sajip via Python-Dev
 wrote:
>
> > > Right, I'm pretty sure that right now Python doesn't have any way to
> > share symbols between .c files without also exposing them in the C
> > API.
>
> On other C projects I've worked on, the public API is expressed in one set of 
> header files, and internal APIs that need to be exposed across modules are 
> described in a different set of internal header files, and developers who 
> incorrectly use internal APIs by including the internal headers could see 
> breakage when the internals change ... excuse my naïveté, as I haven't done 
> much at Python's C level - does this discipline/approach not apply to CPython?

Visibility in C is complicated :-). The level I'm talking about is
symbol visibility, which is determined by the linker, not by headers.
If a symbol is exported, then even if you hide the headers, it's still
part of the library ABI, can still collide with user symbols, can
still by accessed by determined users, etc. It's still fixable, but it
requires some newer tools like -fvisibility=hidden that work
differently across different platforms, and so far no-one's done the
work to sort out the details.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GOQDPXFWEBQT7DDQKHHGHWZQMTOG2COI/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-20 Thread Vinay Sajip via Python-Dev
> > Right, I'm pretty sure that right now Python doesn't have any way to
> share symbols between .c files without also exposing them in the C
> API.

On other C projects I've worked on, the public API is expressed in one set of 
header files, and internal APIs that need to be exposed across modules are 
described in a different set of internal header files, and developers who 
incorrectly use internal APIs by including the internal headers could see 
breakage when the internals change ... excuse my naïveté, as I haven't done 
much at Python's C level - does this discipline/approach not apply to CPython?

Regards,

Vinay Sajip
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XI4Y4M7WBXSMPY2ZDMD6IQWBAOFR6WYE/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-20 Thread Nathaniel Smith
On Fri, Sep 20, 2019 at 1:00 PM Andrew Svetlov  wrote:
> This target is very important for keeping public API as small as possible.

Right, I'm pretty sure that right now Python doesn't have any way to
share symbols between .c files without also exposing them in the C
API.

This is fixable using "symbol visibility" features, and it would be
nice to have the option to share stuff between our own C files without
also sharing it with the world, for lots of reasons. But it might be
necessary to implement that first before doing anything to share
_Py_IDENTIFIERs.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/V4Z6EYNZMNNJFU3KFKHKFE2KCB5L5DSP/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-20 Thread Vinay Sajip via Python-Dev
> From my understanding, _Py_IDENTIFIER was designed for static usage.
> The overhead is quite low; keeping it as a private (static)
> module-level internal variable helps to decouple things.

I understand the importance of decoupling in general, but meanings of __name__, 
__dict__ etc. aren't going to change. What practical difficulties (in terms of 
coupling) do you foresee if e.g. all C modules know that somewhere there is a 
constant value representing the constant string "__dict__", and so on? There's 
no suggestion that some C modules will use a different constant string value to 
refer to e.g. the "__dict__" attribute than other C modules, is there? I'm not 
suggesting that all _Py_IDENTIFIERs need to be extern, just the ones whose 
meanings are public and can't be changed.

I'm not coming from a worried-about-the-memory-overhead perspective - it's more 
about the manageability of ad hoc static variables which can be "considered 
harmful" in certain scenarios - such as sub-interpreters, for example. 
_Py_IDENTIFIER seems to account for a fair proportion of all static variable 
declarations.

> This target is very important for keeping public API as small as possible.

I'm not suggesting that _Py_IDENTIFIERs become part of the public API. Surely 
it's not the case that any and every extern identifier becomes part of the 
public API? (I know people can misuse things and use unsupported internal APIs, 
but then they're on their own, right?)

Regards,

Vinay Sajip
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZTLSGAHCKHLQK5BBKYHLEQGJHCU4OAFB/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-20 Thread Serhiy Storchaka

20.09.19 22:19, Vinay Sajip via Python-Dev пише:

I just ran an analysis of static variable definitions in CPython code, using 
clang, on Ubuntu and Windows. The results should be available here:

https://cpython.red-dove.com/

As I understand it, _Py_IDENTIFIER instances are supposed to hold constant strings that are used in 
Python - e.g. "__class__", "__dict__" and so on. I noticed that there are 
numerous duplicates of these - e.g. 8 instances of __name__, 11 instances of __dict__, and so on - 
each instance is defined as static in its source file and so completely distinct from the others.


Note that corresponding strings are interned, so all 
_Py_IDENTIFIER(__name__) share the same Python object.


There is also the _PyArg_Parser structure which contains reference to 
the lazy initialized tuple of strings.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZYQBFVDXPAVVGSA6O4MXZ5ADAZU4Q37T/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-20 Thread Andrew Svetlov
>From my understanding, _Py_IDENTIFIER was designed for static usage.
The overhead is quite low; keeping it as a private (static)
module-level internal variable helps to decouple things.
This target is very important for keeping public API as small as possible.

On Fri, Sep 20, 2019 at 10:32 PM Vinay Sajip via Python-Dev
 wrote:
>
> I just ran an analysis of static variable definitions in CPython code, using 
> clang, on Ubuntu and Windows. The results should be available here:
>
> https://cpython.red-dove.com/
>
> As I understand it, _Py_IDENTIFIER instances are supposed to hold constant 
> strings that are used in Python - e.g. "__class__", "__dict__" and so on. I 
> noticed that there are numerous duplicates of these - e.g. 8 instances of 
> __name__, 11 instances of __dict__, and so on - each instance is defined as 
> static in its source file and so completely distinct from the others.
>
> I realise the overall amount of memory used by these structures isn't large, 
> but is there any particular benefit to having these multiple copies? The 
> current situation seems a little untidy, at least. What would be the 
> disadvantage of making them extern in the headers and allocating them once in 
> some consts.c module? After all, they seem for the most part to be well-known 
> constant strings that don't need to be encapsulated in any particular C 
> compilation unit.
>
> Regards,
>
> Vinay Sajip
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/GVQOMWXUDYLBXZ2MMIDX5D6X7X42VQV7/



-- 
Thanks,
Andrew Svetlov
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E24YQUYJMXC4LZXZEU7YELWIDDVWBDF3/