Re: Retrieving non-/etc/passwd users with Python 3?

2021-03-31 Thread Loris Bennett
Christian Heimes  writes:

> On 31/03/2021 14.45, Loris Bennett wrote:
>> Chris Angelico  writes:
>> 
>>> On Wed, Mar 31, 2021 at 11:21 PM Loris Bennett
>>>  wrote:

>>>> Hi,
>>>>
>>>> I want to get a list of users on a Linux system using Python 3.6.  All
>>>> the users I am interested in are just available via LDAP and are not in
>>>> /etc/passwd.  Thus, in a bash shell I can use 'getent' to display them.
>>>>
>>>> When I try to install the PyPI package
>>>>
>>>>   getent
>>>>
>>>> I get the error
>>>>
>>>> File "/tmp/pip-build-vu4lziex/getent/setup.py", line 9, in <module>
>>>>   long_description = file('README.rst').read(),
>>>>   NameError: name 'file' is not defined
>>>>
>>>> I duckduckwent a bit and the problem seems to be that 'file' from Python
>>>> 2 has been replaced by 'open' in Python 3.
>>>>
>>>> So what's the standard way of getting a list of users in this case?

>>>
>>> I don't have LDAP experience so I don't know for sure, but is the
>>> stdlib "pwd" module suitable, or does it only read /etc/passwd?
>>>
>>> https://docs.python.org/3/library/pwd.html
>>>
>>> Failing that, one option - and not as bad as you might think - is
>>> simply to run getent using the subprocess module, and parse its
>>> output. Sometimes that's easier than finding (or porting!) a library.
>> 
>> D'oh!  Thanks, 'pwd' is indeed exactly what I need.  When I read the
>> documentation here
>> 
>>   https://docs.python.org/3.6/library/pwd.html 
>> 
>> I mistakenly got the impression that it was only going to give me the
>> local users.  It doesn't actually say that, but it mentions /etc/shadow
>> and not getent.  However, it does talk about the "account and password
>> database", which is a clue (although our passwords are on another
>> system entirely), since "database" is more getent terminology.
>> 
>> In any case, I think 'pwd' is hiding its light under a bushel a bit
>> here.
>
> Please open a documentation bug :)

I'll have a look :)

> The pwd and grp modules use the libc API to get users from the local
> account database. On Linux and glibc the account database is handled by
> NSS and nsswitch.conf.
>
> By the way I recommend that you use SSSD instead of talking to LDAP
> directly. You'll have a much more pleasant experience.

Yes, we do use SSSD, but my grasp of what it does is pretty much limited
to "as well as looking at the local /etc/passwd it can also talk to
LDAP" :/

Cheers,

Loris

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Canonical conversion of dict of dicts to list of dicts

2021-03-31 Thread Greg Ewing

On 31/03/21 7:37 pm, dn wrote:

> Python offers mutable (can be changed) and immutable (can't) objects
> (remember: 'everything is an object'):
> https://docs.python.org/3/reference/datamodel.html?highlight=mutable%20data


While that's true, it's actually irrelevant to this situation.


>   $ a = "bob"
>   $ b = a
>   $ b = "bert"
>   $ a
>   'bob'


Here, you're not even attempting to modify the object that is
bound to b; instead, you're rebinding the name b to a different
object. Whether the object to which b was previously bound is
mutable or not makes no difference.

You can see this if you do the equivalent thing with lists:

>>> a = ["alice", "bob", "carol"]
>>> b = a
>>> b
['alice', 'bob', 'carol']
>>> b = ['dave', 'edward', 'felicity']
>>> a
['alice', 'bob', 'carol']
>>> b
['dave', 'edward', 'felicity']
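
For contrast, here is a small sketch of the case where mutability does come into play: changing the list in place through one name is visible through the other, whereas rebinding is not. The extra names ("dave", "edward") are only for illustration.

```python
a = ["alice", "bob", "carol"]
b = a                  # a and b are two names bound to the same list object

b.append("dave")       # mutation: the one shared object changes in place
print(a)               # ['alice', 'bob', 'carol', 'dave']

b = ["edward"]         # rebinding: b now names a different object
print(a)               # ['alice', 'bob', 'carol', 'dave']  (unchanged)
print(a is b)          # False
```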

--
Greg


Re: Horrible abuse of __init_subclass__, or elegant hack?

2021-03-31 Thread dn via Python-list
On 01/04/2021 13.54, Chris Angelico wrote:
> On Thu, Apr 1, 2021 at 11:39 AM dn via Python-list
>  wrote:
>>
>> On 01/04/2021 12.14, Chris Angelico wrote:
>>> I think this code makes some sort of argument in the debate about
>>> whether Python has too much flexibility or if it's the best
>>> metaprogramming toolset in the world. I'm not sure which side of the
>>> debate it falls on, though.
>>>
>>> class Building:
>>>     resource = None
>>>     @classmethod
>>>     def __init_subclass__(bldg):
>>>         super().__init_subclass__()
>>>         print("Building:", bldg.__name__)
>>>         def make_recipe(recip):
>>>             print(recip.__name__.replace("_", " "), "is made in a",
>>>                 bldg.__name__.replace("_", " "))
>>>         bldg.__init_subclass__ = classmethod(make_recipe)
>>>
>>>
>>> class Extractor(Building): ...
>>> class Refinery(Building): ...
>>>
>>> class Crude(Extractor):
>>>     resource = "Oil"
>>>     time: 1
>>>     Crude: 1
>>>
>>> class Plastic(Refinery):
>>>     Crude: 3
>>>     time: 6
>>>     Residue: 1
>>>     Plastic: 2
>>>
>>> class Rubber(Refinery):
>>>     Crude: 3
>>>     time: 6
>>>     Residue: 2
>>>     Rubber: 2
>>
>>
>> [pauses for a moment, to let his mind unwind and return to (what passes
>> as) 'reality']
> 
> Real and imaginary are the same thing, just rotated a quarter turn

In which dimension(s)?


>> Without looking into the details/context: surely there's a more
>> straightforward approach?
> 
> Perhaps, but there are potentially a LOT of recipes, and I needed to
> be able to cleanly edit those, even if the code at the top was a mess.
> (The goal here is to map out production patterns in the game
> "Satisfactory", for the curious. It's easy to add other things, like
> computer manufacturing or bauxite processing, simply by adding more
> recipes.)
> 
> My original plan was basically pairwise tuple summing (deriving a set
> of "oil in, water in, rubber out, fuel out" for each set of recipes,
> where some might be zero), but it turned out that that wasn't flexible
> enough, and it really needed more options than that.

Which was where my mind was going*. Why not a dict of inputs, processes,
and outputs**? Each dict having variable length, from None, and
key:values assigned at declaration/init. In the case of process, the
contained objects could be Python-functions. With "compact
representation" (3.6+) the functions could also be relied upon to
represent a 'production line' or pipeline of functions.

* but it strayed so far, I had to ask for it back
** in my mindless state, this combination of three activities seemed
familiar, to the point of providing much-needed comfort.
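
A rough sketch of what such a dict-based layout might look like, purely as an illustration. The quantities come from the recipe classes quoted above; the `RECIPES` layout and the `net_rate` helper are hypothetical, and no unit is assumed for "time".

```python
# Hypothetical recipe table: one dict per recipe, with inputs and outputs
# kept as separate dicts instead of class-body annotations.
RECIPES = {
    "Plastic": {"building": "Refinery", "time": 6,
                "inputs": {"Crude": 3},
                "outputs": {"Residue": 1, "Plastic": 2}},
    "Rubber": {"building": "Refinery", "time": 6,
               "inputs": {"Crude": 3},
               "outputs": {"Residue": 2, "Rubber": 2}},
}

def net_rate(recipe, item):
    """Net quantity of `item` per unit of time (negative means consumed)."""
    produced = recipe["outputs"].get(item, 0)
    consumed = recipe["inputs"].get(item, 0)
    return (produced - consumed) / recipe["time"]

print(net_rate(RECIPES["Plastic"], "Crude"))   # -0.5
```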


>> As to this, I'm slightly amused, but perhaps not in a good way:
>>
>> class Sanatorium( Building ):
>>    patient_name = "Chris"
>>    duration_of_treatment = "life"
> 
> I already have certificates from Rutledge's Asylum and MaayaInsane's
> (unnamed) asylum, so that seems pretty likely.

Noted you on the list of lauded alumni at the latter.

When you left the former, did they allow you to keep the t-shirt, or did
you have to buy your own memorabilia?
(https://mysterious.americanmcgee.com/products/rutledge-asylum-mug)

The latter's treatment list sounds remarkably like .mil training. I know
of plenty with that t-shirt - but can't think of a one sporting a mug...
Should you have one, kindly bring it (with appropriate contents) come
ANZAC Day at the end of this month...


>> Thus, design suggestion: add a 'back-door' to the __init_subclass__ to
>> ensure access to the Internet from any/all buildings!
> 
> Perfect. Nobody'll find it. I'll have full access to Usenet News from
> a secret panel in one of the padded sections of the wall.

Surely, in such a state of mind, one's natural 'home' would be "the dark
web"?

Curiously, last night, the nieces (now long past that age) were
talking about the different modes they used to get to various of their
schools. (nostalgia in one's twenties???) Which, it should have been
expected, opened the way for their father to indulge in the usual
grey-hair stories as: walking so many miles barefoot through the snow,
falling backwards off the horse because there were so many others
climbing up in-front, ... After rolling their eyes (compulsory Dad-joke,
-dance, -comment, ... behavioral-response) they attempted to de-rail
his, um, railing, by reminding everyone that I went to (boarding) school
(aka gentle asylum for young boys - at a considerable distance from
'polite society') by long-distance train. The only understanding of
which came when they watched the Harry Potter films and saw the
school-kids collecting at a ?London terminus on their way to magic-school.

Magic, you ask? Well, maybe more "sinister". We did manage to find a
loose floor-board, but a sad life-lesson was learned, when certain ones
(un-named*) took it upon themselves to eat all of the contraband
secreted there. Another dorm[itory] did manage to prise-open a
wall-panel. Their boasting creat

Re: memory consumption

2021-03-31 Thread Rob Cliffe via Python-list



On 31/03/2021 09:35, Alexey wrote:
> On Wednesday, 31 March 2021 at 01:20:06 UTC+3, Dan Stromberg wrote:
>
>> What if you increase the machine's (operating system's) swap space? Does
>> that take care of the problem in practice?
>
> I can't do that because it will affect other containers running on this host.
> In my opinion it may significantly reduce their performance.

Probably still worth trying.  Always better to measure than to guess.
Rob Cliffe


Re: Horrible abuse of __init_subclass__, or elegant hack?

2021-03-31 Thread Chris Angelico
On Thu, Apr 1, 2021 at 11:39 AM dn via Python-list
 wrote:
>
> On 01/04/2021 12.14, Chris Angelico wrote:
> > I think this code makes some sort of argument in the debate about
> > whether Python has too much flexibility or if it's the best
> > metaprogramming toolset in the world. I'm not sure which side of the
> > debate it falls on, though.
> >
> > class Building:
> >     resource = None
> >     @classmethod
> >     def __init_subclass__(bldg):
> >         super().__init_subclass__()
> >         print("Building:", bldg.__name__)
> >         def make_recipe(recip):
> >             print(recip.__name__.replace("_", " "), "is made in a",
> >                 bldg.__name__.replace("_", " "))
> >         bldg.__init_subclass__ = classmethod(make_recipe)
> >
> >
> > class Extractor(Building): ...
> > class Refinery(Building): ...
> >
> > class Crude(Extractor):
> >     resource = "Oil"
> >     time: 1
> >     Crude: 1
> >
> > class Plastic(Refinery):
> >     Crude: 3
> >     time: 6
> >     Residue: 1
> >     Plastic: 2
> >
> > class Rubber(Refinery):
> >     Crude: 3
> >     time: 6
> >     Residue: 2
> >     Rubber: 2
>
>
> [pauses for a moment, to let his mind unwind and return to (what passes
> as) 'reality']

Real and imaginary are the same thing, just rotated a quarter turn

> Without looking into the details/context: surely there's a more
> straightforward approach?

Perhaps, but there are potentially a LOT of recipes, and I needed to
be able to cleanly edit those, even if the code at the top was a mess.
(The goal here is to map out production patterns in the game
"Satisfactory", for the curious. It's easy to add other things, like
computer manufacturing or bauxite processing, simply by adding more
recipes.)

My original plan was basically pairwise tuple summing (deriving a set
of "oil in, water in, rubber out, fuel out" for each set of recipes,
where some might be zero), but it turned out that that wasn't flexible
enough, and it really needed more options than that.

> As to this, I'm slightly amused, but perhaps not in a good way:
>
> class Sanatorium( Building ):
>    patient_name = "Chris"
>    duration_of_treatment = "life"

I already have certificates from Rutledge's Asylum and MaayaInsane's
(unnamed) asylum, so that seems pretty likely.

> Thus, design suggestion: add a 'back-door' to the __init_subclass__ to
> ensure access to the Internet from any/all buildings!

Perfect. Nobody'll find it. I'll have full access to Usenet News from
a secret panel in one of the padded sections of the wall.

ChrisA


Re: Horrible abuse of __init_subclass__, or elegant hack?

2021-03-31 Thread dn via Python-list
On 01/04/2021 12.14, Chris Angelico wrote:
> I think this code makes some sort of argument in the debate about
> whether Python has too much flexibility or if it's the best
> metaprogramming toolset in the world. I'm not sure which side of the
> debate it falls on, though.
> 
> class Building:
>     resource = None
>     @classmethod
>     def __init_subclass__(bldg):
>         super().__init_subclass__()
>         print("Building:", bldg.__name__)
>         def make_recipe(recip):
>             print(recip.__name__.replace("_", " "), "is made in a",
>                 bldg.__name__.replace("_", " "))
>         bldg.__init_subclass__ = classmethod(make_recipe)
>
>
> class Extractor(Building): ...
> class Refinery(Building): ...
>
> class Crude(Extractor):
>     resource = "Oil"
>     time: 1
>     Crude: 1
>
> class Plastic(Refinery):
>     Crude: 3
>     time: 6
>     Residue: 1
>     Plastic: 2
>
> class Rubber(Refinery):
>     Crude: 3
>     time: 6
>     Residue: 2
>     Rubber: 2


[pauses for a moment, to let his mind unwind and return to (what passes
as) 'reality']

Without looking into the details/context: surely there's a more
straightforward approach?


As to this, I'm slightly amused, but perhaps not in a good way:

class Sanatorium( Building ):
   patient_name = "Chris"
   duration_of_treatment = "life"


Thus, design suggestion: add a 'back-door' to the __init_subclass__ to
ensure access to the Internet from any/all buildings!
-- 
Regards,
=dn


Re: Horrible abuse of __init_subclass__, or elegant hack?

2021-03-31 Thread Ethan Furman

On 3/31/21 4:14 PM, Chris Angelico wrote:

> I think this code makes some sort of argument in the debate about
> whether Python has too much flexibility or if it's the best
> metaprogramming toolset in the world. I'm not sure which side of the
> debate it falls on, though.


Well, `__init_subclass__` is there to provide metaclass power without needing a 
full-blown metaclass.

I vote elegant hack.  :)

--
~Ethan~


Horrible abuse of __init_subclass__, or elegant hack?

2021-03-31 Thread Chris Angelico
I think this code makes some sort of argument in the debate about
whether Python has too much flexibility or if it's the best
metaprogramming toolset in the world. I'm not sure which side of the
debate it falls on, though.

class Building:
    resource = None
    @classmethod
    def __init_subclass__(bldg):
        super().__init_subclass__()
        print("Building:", bldg.__name__)
        def make_recipe(recip):
            print(recip.__name__.replace("_", " "), "is made in a",
                bldg.__name__.replace("_", " "))
        bldg.__init_subclass__ = classmethod(make_recipe)


class Extractor(Building): ...
class Refinery(Building): ...

class Crude(Extractor):
    resource = "Oil"
    time: 1
    Crude: 1

class Plastic(Refinery):
    Crude: 3
    time: 6
    Residue: 1
    Plastic: 2

class Rubber(Refinery):
    Crude: 3
    time: 6
    Residue: 2
    Rubber: 2

Full code is here if you want context:
https://github.com/Rosuav/shed/blob/master/satisfactory-production.py

Subclassing Building defines a class that is a building. (The ellipsis
body is a placeholder; I haven't implemented stuff where the buildings
know about their power consumptions and such. Eventually they'll have
other attributes.) But subclassing a building defines a recipe that is
produced in that building. Markers placed before the "time" are
ingredients, those after the "time" are products.
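
To illustrate the "before time / after time" convention, here is a minimal sketch of how a recipe's annotations could be split. This is not taken from the linked script: `split_recipe` is a hypothetical helper, and it assumes `__annotations__` preserves the order in which the markers were written (true for dicts in Python 3.7+ and in CPython 3.6).

```python
def split_recipe(cls):
    """Split a recipe class's annotations into (ingredients, time, products)."""
    ingredients, products = {}, {}
    time = None
    for name, qty in cls.__annotations__.items():
        if name == "time":
            time = qty                  # the marker separating the two halves
        elif time is None:
            ingredients[name] = qty     # seen before "time": consumed
        else:
            products[name] = qty        # seen after "time": produced
    return ingredients, time, products


class Plastic:      # stand-in for a recipe class; no Building base needed here
    Crude: 3
    time: 6
    Residue: 1
    Plastic: 2


print(split_recipe(Plastic))
# ({'Crude': 3}, 6, {'Residue': 1, 'Plastic': 2})
```

The annotation "values" are plain integers here, so nothing is ever assigned in the class body; only the annotations dict carries the recipe.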

There are actually a lot of interesting wrinkles to trying to replace
__init_subclass__ on the fly. Things get quite entertaining if you
don't use the decorator, or if you define and decorate the function
outside of the class, or various other combinations.

On a scale of 1 to "submit this to The Daily WTF immediately", how bad
is this code? :)

ChrisA


Re: Unable to find 'edit with ide' option in the Context menu

2021-03-31 Thread Terry Reedy

On 3/31/2021 2:11 AM, Arjav Jain wrote:

> I recently installed the latest version of Python, but I am facing a
> problem with Python files: when I right-click any Python file,
> there is no `Edit with IDLE' option. I have repaired the Python
> installation too, but this doesn't solve my problem. Please help!


Did you check (or leave checked) [x] Install tkinter and IDLE?
Can you start IDLE otherwise?

--
Terry Jan Reedy



Re: memory consumption

2021-03-31 Thread Alexey
On Wednesday, 31 March 2021 at 18:17:46 UTC+3, Dieter Maurer wrote:
> Alexey wrote at 2021-3-31 02:43 -0700:
> >On Wednesday, 31 March 2021 at 06:54:52 UTC+3, Inada Naoki wrote:
> > ...
> >> You can get some hints from sys._debugmallocstats(). It prints
> >> obmalloc (allocator for small objects) stats to stderr.
> >> Try printing stats before and after 1st run, and after 2nd run. And
> >> post it in this thread if you can. (no sensitive information in the
> >> stats).
> `glibc` has similar functions to monitor the memory allocation 
> at the C level: `mallinfo[2]`, `malloc_stats`, `malloc_info`. 
> 
> The `mallinfo` functions can be called via `ctypes`. 
> Provided your `glibc` has `mallinfo2`, I recommend its use. 
> 
> In order to use `malloc_info` from Python, you need 
> a C extension. I have one implemented via `cython`. Let me know, 
> if you are interested.

I think I found something. I'll return tomorrow with update.


Re: memory consumption

2021-03-31 Thread Dieter Maurer
Alexey wrote at 2021-3-31 02:43 -0700:
>On Wednesday, 31 March 2021 at 06:54:52 UTC+3, Inada Naoki wrote:
> ...
>> You can get some hints from sys._debugmallocstats(). It prints
>> obmalloc (allocator for small objects) stats to stderr.
>> Try printing stats before and after 1st run, and after 2nd run. And
>> post it in this thread if you can. (no sensitive information in the
>> stats).

`glibc` has similar functions to monitor the memory allocation
at the C level: `mallinfo[2]`, `malloc_stats`, `malloc_info`.

The `mallinfo` functions can be called via `ctypes`.
Provided your `glibc` has `mallinfo2`, I recommend its use.

In order to use `malloc_info` from Python, you need
a C extension. I have one implemented via `cython`. Let me know,
if you are interested.


Source code link was: Re: Ann: New Python curses book

2021-03-31 Thread Alan Gauld via Python-list
On 31/03/2021 00:09, Alan Gauld via Python-list wrote:

> Watch this space. Hopefully tomorrow.

The source code is now available in a zip file at:

http://www.alan-g.me.uk/hills/PythonCursesCode.zip

Or via a link on the programming section of my
home page

http://www.alan-g.me.uk/

It is licensed using a Creative Commons license.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos




Unable to find 'edit with ide' option in the Context menu

2021-03-31 Thread Arjav Jain
   I recently installed the latest version of Python, but I am facing a
   problem with Python files: when I right-click any Python file,
   there is no `Edit with IDLE' option. I have repaired the Python
   installation too, but this doesn't solve my problem. Please help!



Re: Retrieving non-/etc/passwd users with Python 3?

2021-03-31 Thread Christian Heimes
On 31/03/2021 14.45, Loris Bennett wrote:
> Chris Angelico  writes:
> 
>> On Wed, Mar 31, 2021 at 11:21 PM Loris Bennett
>>  wrote:
>>>
>>> Hi,
>>>
>>> I want to get a list of users on a Linux system using Python 3.6.  All
>>> the users I am interested in are just available via LDAP and are not in
>>> /etc/passwd.  Thus, in a bash shell I can use 'getent' to display them.
>>>
>>> When I try to install the PyPI package
>>>
>>>   getent
>>>
>>> I get the error
>>>
>>> File "/tmp/pip-build-vu4lziex/getent/setup.py", line 9, in <module>
>>>   long_description = file('README.rst').read(),
>>>   NameError: name 'file' is not defined
>>>
>>> I duckduckwent a bit and the problem seems to be that 'file' from Python
>>> 2 has been replaced by 'open' in Python 3.
>>>
>>> So what's the standard way of getting a list of users in this case?
>>>
>>
>> I don't have LDAP experience so I don't know for sure, but is the
>> stdlib "pwd" module suitable, or does it only read /etc/passwd?
>>
>> https://docs.python.org/3/library/pwd.html
>>
>> Failing that, one option - and not as bad as you might think - is
>> simply to run getent using the subprocess module, and parse its
>> output. Sometimes that's easier than finding (or porting!) a library.
> 
> D'oh!  Thanks, 'pwd' is indeed exactly what I need.  When I read the
> documentation here
> 
>   https://docs.python.org/3.6/library/pwd.html 
> 
> I mistakenly got the impression that it was only going to give me the
> local users.  It doesn't actually say that, but it mentions /etc/shadow
> and not getent.  However, it does talk about the "account and password
> database", which is a clue (although our passwords are on another
> system entirely), since "database" is more getent terminology.
> 
> In any case, I think 'pwd' is hiding its light under a bushel a bit
> here.

Please open a documentation bug :)

The pwd and grp modules use the libc API to get users from the local
account database. On Linux and glibc the account database is handled by
NSS and nsswitch.conf.
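
As a minimal sketch of what that looks like from Python: `pwd.getpwall()` returns whatever the configured NSS sources are willing to enumerate. The helper name `list_users` and the `min_uid` cut-off are illustrative only. One assumption worth checking locally: some backends (SSSD, as far as I know, unless enumeration is enabled) do not list remote users in bulk, in which case `pwd.getpwnam(name)` still resolves an individual account.

```python
import pwd

def list_users(min_uid=0):
    """Return the sorted login names the account database enumerates."""
    return sorted({entry.pw_name
                   for entry in pwd.getpwall()
                   if entry.pw_uid >= min_uid})

# Example: skip system accounts, assuming regular users start at UID 1000.
print(list_users(min_uid=1000))
```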

By the way I recommend that you use SSSD instead of talking to LDAP
directly. You'll have a much more pleasant experience.

Christian



Re: Retrieving non-/etc/passwd users with Python 3?

2021-03-31 Thread Loris Bennett
Chris Angelico  writes:

> On Wed, Mar 31, 2021 at 11:21 PM Loris Bennett
>  wrote:
>>
>> Hi,
>>
>> I want to get a list of users on a Linux system using Python 3.6.  All
>> the users I am interested in are just available via LDAP and are not in
>> /etc/passwd.  Thus, in a bash shell I can use 'getent' to display them.
>>
>> When I try to install the PyPI package
>>
>>   getent
>>
>> I get the error
>>
>> File "/tmp/pip-build-vu4lziex/getent/setup.py", line 9, in <module>
>>   long_description = file('README.rst').read(),
>>   NameError: name 'file' is not defined
>>
>> I duckduckwent a bit and the problem seems to be that 'file' from Python
>> 2 has been replaced by 'open' in Python 3.
>>
>> So what's the standard way of getting a list of users in this case?
>>
>
> I don't have LDAP experience so I don't know for sure, but is the
> stdlib "pwd" module suitable, or does it only read /etc/passwd?
>
> https://docs.python.org/3/library/pwd.html
>
> Failing that, one option - and not as bad as you might think - is
> simply to run getent using the subprocess module, and parse its
> output. Sometimes that's easier than finding (or porting!) a library.

D'oh!  Thanks, 'pwd' is indeed exactly what I need.  When I read the
documentation here

  https://docs.python.org/3.6/library/pwd.html 

I mistakenly got the impression that it was only going to give me the
local users.  It doesn't actually say that, but it mentions /etc/shadow
and not getent.  However, it does talk about the "account and password
database", which is a clue (although our passwords are on another
system entirely), since "database" is more getent terminology.

In any case, I think 'pwd' is hiding its light under a bushel a bit
here.

Cheers,

Loris

-- 
This signature is currently under construction.


Re: Retrieving non-/etc/passwd users with Python 3?

2021-03-31 Thread Chris Angelico
On Wed, Mar 31, 2021 at 11:21 PM Loris Bennett
 wrote:
>
> Hi,
>
> I want to get a list of users on a Linux system using Python 3.6.  All
> the users I am interested in are just available via LDAP and are not in
> /etc/passwd.  Thus, in a bash shell I can use 'getent' to display them.
>
> When I try to install the PyPI package
>
>   getent
>
> I get the error
>
> File "/tmp/pip-build-vu4lziex/getent/setup.py", line 9, in <module>
>   long_description = file('README.rst').read(),
>   NameError: name 'file' is not defined
>
> I duckduckwent a bit and the problem seems to be that 'file' from Python
> 2 has been replaced by 'open' in Python 3.
>
> So what's the standard way of getting a list of users in this case?
>

I don't have LDAP experience so I don't know for sure, but is the
stdlib "pwd" module suitable, or does it only read /etc/passwd?

https://docs.python.org/3/library/pwd.html

Failing that, one option - and not as bad as you might think - is
simply to run getent using the subprocess module, and parse its
output. Sometimes that's easier than finding (or porting!) a library.
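
A minimal sketch of the subprocess route, written to run on Python 3.6. It assumes `getent passwd` prints the usual seven colon-separated passwd fields; the function names and the sample account are made up for illustration.

```python
import subprocess

PASSWD_FIELDS = ("name", "passwd", "uid", "gid", "gecos", "dir", "shell")

def parse_passwd(text):
    """Parse passwd-format text into a list of dicts, one per account."""
    users = []
    for line in text.splitlines():
        parts = line.split(":")
        if len(parts) != len(PASSWD_FIELDS):
            continue                      # skip blank or malformed lines
        entry = dict(zip(PASSWD_FIELDS, parts))
        entry["uid"] = int(entry["uid"])
        entry["gid"] = int(entry["gid"])
        users.append(entry)
    return users

def getent_passwd():
    """Run `getent passwd` and return the parsed entries."""
    result = subprocess.run(["getent", "passwd"],
                            stdout=subprocess.PIPE,
                            universal_newlines=True,   # text mode on 3.6
                            check=True)
    return parse_passwd(result.stdout)

# Parsing can be tried without running getent at all:
sample = "alice:x:1000:1000:Alice Example:/home/alice:/bin/bash\n"
print(parse_passwd(sample)[0]["name"])    # alice
```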

ChrisA


Retrieving non-/etc/passwd users with Python 3?

2021-03-31 Thread Loris Bennett
Hi,

I want to get a list of users on a Linux system using Python 3.6.  All
the users I am interested in are just available via LDAP and are not in
/etc/passwd.  Thus, in a bash shell I can use 'getent' to display them.

When I try to install the PyPI package

  getent

I get the error

File "/tmp/pip-build-vu4lziex/getent/setup.py", line 9, in <module>
  long_description = file('README.rst').read(),
  NameError: name 'file' is not defined

I duckduckwent a bit and the problem seems to be that 'file' from Python
2 has been replaced by 'open' in Python 3.

So what's the standard way of getting a list of users in this case?

Cheers,

Loris

-- 
This signature is currently under construction.


Re: memory consumption

2021-03-31 Thread Alexey
On Wednesday, 31 March 2021 at 14:16:30 UTC+3, Inada Naoki wrote:
> > ** Before first run:
> > # arenas allocated total = 776 
> > # arenas reclaimed = 542 
> > # arenas highwater mark = 234 
> > # arenas allocated current = 234 
> > 234 arenas * 262144 bytes/arena = 61,341,696
> > ** After first run:
> > # arenas allocated total = 47,669 
> > # arenas reclaimed = 47,316 
> > # arenas highwater mark = 10,114 
> > # arenas allocated current = 353 
> > 353 arenas * 262144 bytes/arena = 92,536,832
> > ** After second run:
> > # arenas allocated total = 63,635 
> > # arenas reclaimed = 63,238 
> > # arenas highwater mark = 10,114 
> > # arenas allocated current = 397 
> > 397 arenas * 262144 bytes/arena = 104,071,168
> OK, memory allocated by obmalloc is 61MB -> 92MB -> 104MB. 
> 
> Memory usage is increasing, but it is much smaller than 1GB. About 90% of
> the memory is allocated by malloc().
>
> You should try jemalloc. Trying jemalloc is not hard. You don't need
> to rebuild Python.
> Google "jemalloc LD_PRELOAD".
> 
> 
> -- 
> Inada Naoki 

With jemalloc it looks like a memory leak :D
After the first run it grabs 980 MB, after the second run 1.4 GB, then 2.6 GB, and so on.


Re: memory consumption

2021-03-31 Thread Inada Naoki
> ** Before first run:
> # arenas allocated total   =  776
> # arenas reclaimed =  542
> # arenas highwater mark=  234
> # arenas allocated current =  234
> 234 arenas * 262144 bytes/arena=   61,341,696
> ** After first run:
> # arenas allocated total   =   47,669
> # arenas reclaimed =   47,316
> # arenas highwater mark=   10,114
> # arenas allocated current =  353
> 353 arenas * 262144 bytes/arena=   92,536,832
>  ** After second run:
> # arenas allocated total   =   63,635
> # arenas reclaimed =   63,238
> # arenas highwater mark=   10,114
> # arenas allocated current =  397
> 397 arenas * 262144 bytes/arena=  104,071,168


OK, memory allocated by obmalloc is 61MB -> 92MB -> 104MB.
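
Those three figures are simply the "arenas allocated current" counts multiplied by the arena size, which the following few lines reproduce (the numbers are copied from the quoted stats; the labels are just dictionary keys).

```python
ARENA_BYTES = 262144    # 256 KiB per arena, as printed in the stats

current_arenas = {
    "before first run": 234,
    "after first run": 353,
    "after second run": 397,
}

totals = {label: count * ARENA_BYTES for label, count in current_arenas.items()}

for label, total in totals.items():
    print(f"{label}: {total:,} bytes = {total / 1e6:.1f} MB")
# before first run: 61,341,696 bytes = 61.3 MB
# after first run: 92,536,832 bytes = 92.5 MB
# after second run: 104,071,168 bytes = 104.1 MB
```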

Memory usage is increasing, but it is much smaller than 1GB. About 90% of
the memory is allocated by malloc().

You should try jemalloc. Trying jemalloc is not hard. You don't need
to rebuild Python.
Google "jemalloc LD_PRELOAD".


-- 
Inada Naoki  


Re: memory consumption

2021-03-31 Thread Alexey
On Wednesday, 31 March 2021 at 11:52:43 UTC+3, Marco Ippolito wrote:
> > > At which point does the problem start manifesting itself?
> > The problem spot is my cache(dict). I simplified my code to just load
> > all the objects to this dict and then clear it.
> What's the memory utilisation just _before_ performing this load? I am
> assuming it's much less than this 1 GB you can't seem to drop under after
> you run your `.clear()`.

Around 100 MB before the first run.

> > After loading "top" 
> 
> You may be using `top` in command line mode already but in case you aren't, 
> consider sorting processes whose command name is `python` (or whatever filter 
> selects your program) by RSS, like so, for easier collection of 
> machine-readable statistics:

I'm using the following command to highlight what I need:
top -c -p $(pgrep -d',' -f python)
then sorting by RSS and switching to MB by pressing 'e'.


> # ps -o rss,ppid,pid,args --sort -rss $(pgrep python)
>   RSS  PPID   PID COMMAND
> 32836 14130 14377 python3
> 10644 14540 14758 python3
>
> > For debugging I use Pycharm
> Sounds good, you can then use the GUI to set the breakpoint and consult
> external statistics-gathering programs (like the `ps` invocation above) as
> you step through your code.
> 
> Pycharm also allows you to see which variables are in scope in a particular 
> stack frame, so you'll have an easier time reasoning about garbage collection 
> in terms of which references might be preventing GC.

That's what I tried in the first place and I see no references 
to this dict. I'll try that one more time anyway.
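
The same before/after readings can also be taken from inside the process, which makes them easy to log next to each step. A minimal sketch, assuming a Linux `/proc` filesystem (it returns `None` elsewhere); `current_rss_kib` and `measure` are hypothetical helper names, and the small dict stands in for the real cache.

```python
def current_rss_kib():
    """Return this process's resident set size in KiB, or None if unknown."""
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])   # reported in kB
    except OSError:
        pass
    return None

def measure(load, clear):
    """Return RSS readings taken before load(), after load(), after clear()."""
    before = current_rss_kib()
    load()
    loaded = current_rss_kib()
    clear()
    after = current_rss_kib()
    return before, loaded, after

cache = {}
readings = measure(lambda: cache.update((i, str(i)) for i in range(100000)),
                   cache.clear)
print(readings)
```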


Re: memory consumption

2021-03-31 Thread Alexey
On Wednesday, 31 March 2021 at 06:54:52 UTC+3, Inada Naoki wrote:

> First of all, I recommend upgrading your Python. Python 3.6 is a bit old.

I was thinking about that.

> As you say, Python cannot return the memory to the OS until the whole
> arena becomes unused.
> If your task releases all objects allocated during the run, Python can
> release the memory.
> But if your task keeps at least one object, it may prevent releasing
> the whole arena (256KB).
>
> Python manages only small (up to 512 bytes) objects itself. Larger objects
> are allocated by malloc().
> And glibc malloc may not be efficient for some usage. jemalloc is better
> for many use cases.
> 
> You can get some hints from sys._debugmallocstats(). It prints 
> obmalloc (allocator for small objects) stats to stderr. 
> Try printing stats before and after 1st run, and after 2nd run. And 
> post it in this thread if you can. (no sensitive information in the
> stats).
> 
> That is all I can advise. 
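
A minimal sketch of how such labelled snapshots could be produced. `sys._debugmallocstats()` is a CPython-specific function that writes straight to stderr; the wrapper name `dump_allocator_stats` and the `run_job` placeholder in the comments are hypothetical.

```python
import sys

def dump_allocator_stats(label):
    """Write a labelled obmalloc report to stderr; False if unsupported."""
    # Flush first so the label appears before the C-level output.
    print(f"** {label}:", file=sys.stderr, flush=True)
    stats = getattr(sys, "_debugmallocstats", None)
    if stats is None:
        return False          # not CPython, or built without this hook
    stats()
    return True

# Intended use around the workload (run_job is a placeholder):
#   dump_allocator_stats("Before first run")
#   run_job()
#   dump_allocator_stats("After first run")
#   run_job()
#   dump_allocator_stats("After second run")
```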

** Before first run:
class   size   num pools   blocks in use  avail blocks
-----   ----   ---------   -------------  ------------
    0      8           5            2404           126
    1     16           3             611           148
    2     24          13            2103            81
    3     32         371           46703            43
    4     40         292           29422            70
    5     48         233           19499            73
    6     56        1360           97903            17
    7     64        1614          101654            28
    8     72        1964          109947            37
    9     80        1056           52783            17
   10     88         436           20023            33
   11     96         297           12455            19
   12    104         266           10095            13
   13    112         193            6930            18
   14    120         127            4170            21
   15    128         217            6712            15
   16    136        1299           37669             2
   17    144        1223           34239             5
   18    152         113            2920            18
   19    160          78            1949             1
   20    168        1474           35369             7
   21    176          54            1237             5
   22    184          46             991            21
   23    192          42             864            18
   24    200          53            1054             6
   25    208          39             713            28
   26    216          54             955            17
   27    224         575           10350             0
   28    232          43             724             7
   29    240          32             497            15
   30    248          73            1153            15
   31    256          29             431             4
   32    264          25             375             0
   33    272          46             637             7
   34    280          24             328             8
   35    288          20             280             0
   36    296         398            5167             7
   37    304          21             261            12
   38    312          22             256             8
   39    320          17             195             9
   40    328          18             215             1
   41    336          57             675             9
   42    344          17             183             4
   43    352          18             194             4
   44    360          14             153             1
   45    368          14             153             1
   46    376          15             148             2
   47    384          15             148             2
   48    392          14             131             9
   49    400          15             149             1
   50    408          17             147             6
   51    416          16             142             2
   52    424          25             221             4
   53    432          36             317             7
   54    440          44             393             3
   55    448          45             399             6
   56    456          53             420             4
   57    464          46             363             5
   58    472          36             288             0
   59    480          35             274             6
   60    488          29             227             5
   61    496          29             230             2
   62    504          21             161             7
   63    512          85             589             6

# arenas allocated total           =          776
# arenas reclaimed                 =          542
# arenas highwater mark            =          234
# arenas allocated current         =          234
234 arenas * 262144 bytes/arena    =   61,341,696

# bytes in allocated blocks        =   59,737,176
# bytes in available blocks  

Re: memory consumption

2021-03-31 Thread Alexey
Wednesday, 31 March 2021 at 05:45:27 UTC+3, cameron...@gmail.com:
> Since everyone is talking about vague OS memory use and not at all about 
> working set size of Python objects, let me ...
> On 29Mar2021 03:12, Alexey  wrote: 
> >I'm experiencing problems with memory consumption. 
> > 
> >I have a class which is doing an ETL job. What's happening inside: 
> > - fetching existing objects from DB via SQLAlchemy
> Do you need to? Or do you only need to fetch their ids? Or do you only 
> need to fetch a subset of the objects? 

I really need all the objects because I'm performing update and create 
operations. If I fetch them on the fly, this will take hours or even days 
to complete.

> It is easy to accidentally suck in way too many db session entity 
> objects, or at any rate, more than you need to.
> > - iterate over raw data
> Can you prescan the data to determine which objects you care about, 
> reducing the number of objects you need to obtain?

In this case I still need to iterate over the raw and the old data. As I said 
before, if I try it without caching it will take days.

> > - create new/update existing objects
> Depending on what you're doing, you may not need to "create new/update 
> existing objects". You could collate changes and do an UPSERT (the 
> incantation varies a little depending on the SQL dialect behind 
> SQLAlchemy). 
Good advice.
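
A rough sketch of the "collate changes and do an UPSERT" idea quoted above. It uses the standard-library sqlite3 module so that it runs anywhere. SQLite (3.24 or later) and PostgreSQL both accept INSERT ... ON CONFLICT ... DO UPDATE, but the parameter placeholder style depends on the database driver. The items table, its columns, and the helper names chunks and upsert_items are made up for illustration.

```python
import sqlite3
from itertools import islice


def chunks(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def upsert_items(conn, rows, batch_size=1000):
    """Insert new rows and update existing ones, batch by batch.

    `rows` is an iterable of (id, name) tuples. No ORM objects are
    created or kept, only plain tuples for the current batch.
    """
    sql = (
        "INSERT INTO items (id, name) VALUES (?, ?) "
        "ON CONFLICT (id) DO UPDATE SET name = excluded.name"
    )
    done = 0
    for chunk in chunks(rows, batch_size):
        conn.executemany(sql, chunk)
        conn.commit()
        done += len(chunk)      # a natural place for progress reporting
    return done


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO items VALUES (1, 'old')")
count = upsert_items(conn, [(1, "new"), (2, "added")], batch_size=1)
result = conn.execute("SELECT id, name FROM items ORDER BY id").fetchall()
print(count, result)
```

Row 1 already exists and is updated; row 2 is new and is inserted.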

> > - commit changes 
> 
> Do you discard the SQLAlchemy session after this? Otherwise it may lurk 
> and hold onto the objects. Commit doesn't forget the objects. 
I tried expire_all() and expunge_all(). Should I try rollback()?

> For my current client we have a script to import historic data from a 
> legacy system. It has many of the issues you're dealing with: the naive 
> (ORM) way consumes gads of memory, and can be very slow too (updating 
> objects in an ad hoc manner tends to do individual UPDATE SQL commands, 
> very latency laden). 
> 
> I wrote a generic batch UPSERT function which took an accrued list of 
> changes and prepared a PostgreSQL INSERT...ON CONFLICT statement. The 
> main script hands it the accrued updates and it runs batches (which lets 
> us do progress reporting). Orders of magnitude faster, _and_ does not 
> require storing the db objects. 
> 
> On the subject of "fetching existing objects from DB via SQLAlchemy": you 
> may not need to do that, either. Can you identify _which_ objects are of 
> interest? Associated with the same script I've got a batch_select 
> function: it takes an iterable of object ids and collects them in 
> batches, where before we were really scanning the whole db because we 
> had an arbitrary scattering of relevant object ids from the raw data. 

I'll try to analyze if it's possible to rewrite code this way
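
The batched lookup described above (collect ids into batches, then run SELECT ... WHERE id IN (batch-of-ids)) might look roughly like this. It is a sketch against the standard-library sqlite3 module, not the function from that script; the items table and the batch size are assumptions.

```python
import sqlite3
from itertools import islice


def batch_select(conn, ids, batch_size=500):
    """Yield (id, name) rows for the given ids, one query per batch."""
    it = iter(ids)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        # One "?" placeholder per id in this batch.
        placeholders = ", ".join("?" * len(batch))
        sql = f"SELECT id, name FROM items WHERE id IN ({placeholders})"
        yield from conn.execute(sql, batch)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(i, f"n{i}") for i in range(1, 11)],
)
wanted = [2, 5, 99, 7]                      # 99 does not exist
found = sorted(batch_select(conn, wanted, batch_size=2))
print(found)
```

Only the rows whose ids appear in the raw data are fetched, so the memory held at any moment is bounded by the batch size rather than by the table size.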

> It basically collected ids into batches, and ran a SELECT...WHERE id in 
> (batch-of-ids). It's really fast considering, and also scales _way_ down 
> when the set of arbitrary ids is small. 
> 
> I'm happy to walk through the mechanics of these with you; the code at 
> this end is Django's ORM, but I prefer SQLAlchemy anyway - the project 
> dictated the ORM here.
> >Before processing data I create internal cache(dictionary) and store all 
> >existing objects in it. 
> >Every 1 items I do bulk insert and flush. At the end I run commit 
> >command.
> Yah. I suspect the session data are not being released. Also, SQLAlchemy 
> may be caching sessions or something across runs, since this is a celery 
> worker which survives from one task to the next. 

I tried to dig in this direction. I created a few graphs with "objgraph", 
but there are so many references under the hood. I'll try to measure the size 
of the session object before and after building the cache.
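
As an alternative to reading object graphs, the standard-library tracemalloc module can compare Python-level allocations before and after a step. This is a different technique from objgraph, shown only as a sketch; build_cache is a made-up stand-in for the real cache-building code. Note that tracemalloc reports memory allocated by Python, not the resident size the OS sees.

```python
import tracemalloc


def build_cache(n):
    """Stand-in for the real cache: maps ids to small dicts."""
    return {i: {"id": i, "name": f"item-{i}"} for i in range(n)}


tracemalloc.start()
before = tracemalloc.take_snapshot()
cache = build_cache(50_000)
after = tracemalloc.take_snapshot()

# Differences grouped by source line, largest change first.
diff = after.compare_to(before, "lineno")
top = diff[:5]
for stat in top:
    print(stat)

grown = sum(stat.size_diff for stat in diff)   # net bytes allocated
print("net growth in bytes:", grown)

cache.clear()
tracemalloc.stop()
```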

> You could try explicitly creating a new SQLAlchemy session around your 
> task.
> >Problem. Before executing, my interpreter process weighs ~100 MB; after the 
> >first run memory increases to 500 MB, 
> >and after the second run it weighs 1 GB. If I continue to run this class, 
> >memory won't increase, so I think 
> >it's not a memory leak, but rather Python won't release allocated memory back 
> >to the OS. Maybe I'm wrong.
> I don't know enough about Python's "release OS memory" phase. But 
> reducing the task memory footprint will help regardless. 

Definitely. I'll think about it.
Thank you!

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: memory consumption

2021-03-31 Thread Marco Ippolito
> > At which point does the problem start manifesting itself?
> The problem spot is my cache(dict). I simplified my code to just load 
> all the objects to this dict and then clear it.

What's the memory utilisation just _before_ performing this load? I am assuming
it's much less than this 1 GB you can't seem to drop under after you run your
`.clear()`.

> After loading "top"

You may be using `top` in command line mode already but in case you aren't,
consider sorting processes whose command name is `python` (or whatever filter
selects your program) by RSS, like so, for easier collection of
machine-readable statistics:

# ps -o rss,ppid,pid,args --sort -rss $(pgrep python)
    RSS    PPID     PID COMMAND
  32836   14130   14377 python3
  10644   14540   14758 python3
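
The same RSS figure can also be sampled from inside the process at chosen points in the code. The sketch below assumes Linux, since it reads /proc/self/statm; current_rss_kib is an illustrative name, and it returns None on systems without that file.

```python
import os


def current_rss_kib():
    """Return this process's resident set size in KiB, or None if unknown.

    Reads /proc/self/statm (Linux only); the second field is the number
    of resident pages.
    """
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
    except (OSError, ValueError, IndexError):
        return None
    return resident_pages * os.sysconf("SC_PAGE_SIZE") // 1024


rss_before = current_rss_kib()
data = [bytes(1024) for _ in range(10_000)]    # stand-in for loading the cache
rss_loaded = current_rss_kib()
del data
rss_after = current_rss_kib()
print("RSS KiB before/loaded/after:", rss_before, rss_loaded, rss_after)
```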

> For debugging I use Pycharm

Sounds good, you can then use the GUI to set the breakpoint and consult
external statistics-gathering programs (like the `ps` invocation above) as you
step through your code.

Pycharm also allows you to see which variables are in scope in a particular
stack frame, so you'll have an easier time reasoning about garbage collection
in terms of which references might be preventing GC.


Re: memory consumption

2021-03-31 Thread Alexey
Wednesday, 31 March 2021 at 01:20:06 UTC+3, Dan Stromberg:
> On Tue, Mar 30, 2021 at 1:25 AM Alexey  wrote: 
> 
> > 
> > I'm sorry. I didn't understand your question correctly. If I have 4 workers, 
> > they require 4 GB 
> > in the idle state and some extra memory when they execute other tasks. If I 
> > increase the worker 
> > count to 16, they'll eat all the memory I have (16 GB) on my machine and 
> > will crash as soon 
> > as the system starts swapping. 
> >
> What if you increase the machine's (operating system's) swap space? Does 
> that take care of the problem in practice?

I can't do that because it will affect other containers running on this host.
In my opinion it may significantly reduce their performance.


Re: memory consumption

2021-03-31 Thread Alexey
Tuesday, 30 March 2021 at 18:43:54 UTC+3, Alan Gauld:
> On 29/03/2021 11:12, Alexey wrote: 

> The first thing you really need to tell us is which 
> OS you are using? Memory management varies wildly 
> depending on OS. Even different flavours of *nix 
> do it differently. 

I'm using Ubuntu (5.8.0-45-generic #51~20.04.1-Ubuntu) in
development and CentOS 7 in production.

> However, most do it effectively, so you as a programmer 
> shouldn't have to worry too much provided you aren't 
> leaking, which you don't think you are.
> > and after the second run it weighs 1 GB. If I continue 
> > to run this class, memory won't increase, so I think 
> > it's not a memory leak, but rather Python won't release 
> > allocated memory back to the OS. Maybe I'm wrong.
> A 1GB process on modern computers is hardly a big problem? 
> Most machines have 4G and many have 16G or even 32G 
> nowadays. 

With one worker it's OK. But when 8 workers are holding 8 GB 
of garbage it becomes a problem, and I can't ignore it due 
to company rules.


Re: memory consumption

2021-03-31 Thread Alexey
Tuesday, 30 March 2021 at 18:43:51 UTC+3, Marco Ippolito:
> Have you tried to identify where in your code the surprising memory 
> allocations are made? 
Yes. 
> You could "bisect search" by adding breakpoints: 
> 
> https://docs.python.org/3/library/functions.html#breakpoint 
> 
> At which point does the problem start manifesting itself?
The problem spot is my cache (a dict). I simplified my code to just load 
all the objects into this dict and then clear it. After loading, "top" 
showed resident memory usage at 3.3 GB; immediately after that I
called self.__cache.clear() and memory dropped to 1 GB. Then I tried to find 
any references to this dict, with no luck. I also tried "del self.__cache". 
For debugging I use PyCharm.
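
One way to look for whatever still refers to the cache is gc.get_referrers(). A small sketch follows; the Loader class is a made-up stand-in for the ETL class. Note that an attribute written as self.__cache inside a class is stored under the mangled name _Loader__cache, which matters when searching for it by name from outside the class.

```python
import gc


class Loader:
    def __init__(self):
        self.__cache = {}            # stored as _Loader__cache (name mangling)

    def load(self, n):
        for i in range(n):
            self.__cache[i] = object()
        return self.__cache          # a second reference escapes here


def describe_referrers(obj):
    """Return the type names of tracked objects that refer to `obj`."""
    return sorted(type(r).__name__ for r in gc.get_referrers(obj))


loader = Loader()
leaked = loader.load(3)
names = describe_referrers(leaked)
print("cache attribute name:", [k for k in vars(loader) if "cache" in k])
print("referrer types:", names)
```

The exact referrer types differ between CPython versions, so the printed list is best read as a set of leads to follow rather than a definitive answer. Even when no referrers remain, the resident size may stay high for the arena-related reasons mentioned earlier in the thread.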