Re: Python Dialogs
r...@zedat.fu-berlin.de (Stefan Ram) writes: > Me (indented by 2) and the chatbot (flush left). Lines lengths > 72! Is there a name for this kind of indentation, i.e. the stuff you are writing not being flush left? It is sort of contrary to what I think of as "normal" indentation. You seem to use it in all your postings, too, which hurts my brain, but I guess that's my problem :-) [snip (40 lines)] -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Popping key causes dict derived from object to revert to object
"Michael F. Stemper" writes: > On 25/03/2024 01.56, Loris Bennett wrote: >> Grant Edwards writes: >> >>> On 2024-03-22, Loris Bennett via Python-list wrote: >>> >>>> Yes, I was mistakenly thinking that popping the element would >>>> leave me with the dict minus the popped key-value pair. >>> >>> It does. >> Indeed, but I was thinking in the context of >>dict_list = [d.pop('a') for d in dict_list] >> and incorrectly expecting to get a list of 'd' without key 'a', >> instead >> of a list of the 'd['a']'. > I apologize if this has already been mentioned in this thread, but are > you aware of "d.keys()" and "d.values()"? > > >>> d = {} > >>> d['do'] = 'a deer, a female deer' > >>> d['re'] = 'a drop of golden sunshine' > >>> d['mi'] = 'a name I call myself' > >>> d['fa'] = 'a long, long way to run' > >>> d.keys() > ['fa', 'mi', 'do', 're'] > >>> d.values() > ['a long, long way to run', 'a name I call myself', 'a deer, a female deer', > 'a drop of golden sunshine'] > >>> Yes, I am, thank you. However, I didn't want either the keys or the values. Instead I wanted to remove a key within a list comprehension. Cheers, Loris PS: "a drop of golden *sun*" - rhymes with "a long, long way to run" -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
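Incidentally, the quoted REPL output above is Python 2 style; in Python 3, keys() and values() return view objects and, since 3.7, the language guarantees insertion order:

```python
# Python 3: dict.keys()/dict.values() return views, not lists, and
# iteration follows insertion order (guaranteed since Python 3.7).
d = {}
d['do'] = 'a deer, a female deer'
d['re'] = 'a drop of golden sun'
print(d.keys())          # dict_keys(['do', 're'])
print(list(d.values()))  # ['a deer, a female deer', 'a drop of golden sun']
```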
Re: Popping key causes dict derived from object to revert to object
Grant Edwards writes: > On 2024-03-22, Loris Bennett via Python-list wrote: > >> Yes, I was mistakenly thinking that popping the element would >> leave me with the dict minus the popped key-value pair. > > It does. Indeed, but I was thinking in the context of dict_list = [d.pop('a') for d in dict_list] and incorrectly expecting to get a list of 'd' without key 'a', instead of a list of the 'd['a']'. >> Seems like there is no such function. > > Yes, there is. You can do that with either pop or del: > > >>> d = {'a':1, 'b':2, 'c':3} > >>> d > {'a': 1, 'b': 2, 'c': 3} > >>> d.pop('b') > 2 > >>> d > {'a': 1, 'c': 3} > > > >>> d = {'a':1, 'b':2, 'c':3} > >>> del d['b'] > >>> d > {'a': 1, 'c': 3} > > In both cases, you're left with the dict minus the key/value pair. > > In the first case, the deleted value was printed by the REPL because it > was returned by the expression "d.pop('b')" (a method call). > > In the second case, no value is shown by the REPL because "del d['b']" > is a statement, not an expression. Thanks for pointing out 'del'. My main problem, however, was failing to realise that the list comprehension is populated by the return value of the 'pop', not the popped dict. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
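To make the difference concrete, here is a minimal sketch of both behaviours (the variable names are invented for illustration):

```python
dict_list = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]

# A list comprehension collects the *return value* of each expression,
# and d.pop('a') returns the popped value, not the dict:
popped = [d.pop('a') for d in dict_list]
print(popped)     # [1, 3]
print(dict_list)  # [{'b': 2}, {'b': 4}]  (mutated as a side effect)

# To get a list of new dicts without key 'a', build fresh dicts instead:
dict_list2 = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
without_a = [{k: v for k, v in d.items() if k != 'a'} for d in dict_list2]
print(without_a)  # [{'b': 2}, {'b': 4}]
```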
Re: Popping key causes dict derived from object to revert to object
writes: > Loris wrote: > > "Yes, I was mistakenly thinking that popping the element would leave > me with the dict minus the popped key-value pair. Seems like there is no > such function." > > Others have tried to explain and pointed out you can del and then use the > changed dict. > > But consider the odd concept of writing your own trivial function. > > def remaining(adict, anitem): > _ = adict.pop(anitem) > # alternatively use del on dict and item > return adict > > remaining({"first": 1, "second": 2, "third": 3}, "second") > {'first': 1, 'third': 3} > > > Or do you want to be able to call it as in dict.remaining(key) by > subclassing your own variant of dict and adding a similar method? No, 'del' does indeed do what I wanted, although I have now decided I want something else :-) Nevertheless it is good to know that 'del' exists, so that I don't have to reinvent it. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Popping key causes dict derived from object to revert to object
Mark Bourne writes: > Loris Bennett wrote: >> Hi, >> I am using SQLAlchemy to extract some rows from a table of 'events'. >> From the call to the DB I get a list of objects of the type >>sqlalchemy.orm.state.InstanceState >> I would like to print these rows to the terminal using the >> 'tabulate' >> package, the documentation for which says >>The module provides just one function, tabulate, which takes a >> list of >>lists or another tabular data type as the first argument, and outputs >>a nicely formatted plain-text table >> So as I understand it, I need to convert the InstanceState-objects >> to, >> say, dicts, in order to print them. However I also want to remove one >> of the keys from the output and assumed I could just pop it off each >> event dict, thus: >>event_dicts = [vars(e) for e in events] >> print(type(event_dicts[0])) >> event_dicts = [e.pop('_sa_instance_state', None) for e in event_dicts] >> print(type(event_dicts[0])) > > vars() returns the __dict__ attribute of the object. It may not be a > good idea to modify that dictionary directly (it will also affect the > object), although it might be OK if you're not going to do anything > else with the original objects. To be safer, you could copy the event > objects: > event_dicts = [dict(vars(e)) for e in events] > or: > event_dicts = [vars(e).copy() for e in events] Thanks for making this clear to me. However, in the end I actually decided to use the list comprehension without either 'dict()' or 'vars()'. Instead I just select the keys I want and so don't need to pop the unwanted key later and can simultaneously tweak the names of the keys for better printing to the terminal. >> However, this prints >> <class 'dict'> >> <class 'sqlalchemy.orm.state.InstanceState'> >> If I comment out the third line, which pops the unwanted key, I get >> <class 'dict'> >> <class 'dict'> >> Why does popping one of the keys cause the elements of the list to >> revert back to their original class? 
> > As Dieter pointed out, the main problem here is that pop() returns the > value removed, not the dictionary with the rest of the values. You > probably want something more like: > for e in event_dicts: > del e['_sa_instance_state'] > (There's not really any point popping the value if you're not going to > do anything with it - just delete the key from the dictionary) Yes, I was mistakenly thinking that popping the element would leave me with the dict minus the popped key-value pair. Seems like there is no such function. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
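A minimal sketch of the safe pattern described above, with a hypothetical Event class standing in for the SQLAlchemy row objects:

```python
class Event:  # hypothetical stand-in for an SQLAlchemy-mapped object
    def __init__(self):
        self.name = 'login'
        self._sa_instance_state = object()  # placeholder for the real state

events = [Event()]

# Copy vars(e) so deleting the key does not touch the object's __dict__:
event_dicts = [dict(vars(e)) for e in events]
for d in event_dicts:
    del d['_sa_instance_state']

print(event_dicts[0])                            # {'name': 'login'}
print(hasattr(events[0], '_sa_instance_state'))  # True: original untouched
```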
Popping key causes dict derived from object to revert to object
Hi, I am using SQLAlchemy to extract some rows from a table of 'events'. From the call to the DB I get a list of objects of the type sqlalchemy.orm.state.InstanceState I would like to print these rows to the terminal using the 'tabulate' package, the documentation for which says The module provides just one function, tabulate, which takes a list of lists or another tabular data type as the first argument, and outputs a nicely formatted plain-text table So as I understand it, I need to convert the InstanceState-objects to, say, dicts, in order to print them. However I also want to remove one of the keys from the output and assumed I could just pop it off each event dict, thus: event_dicts = [vars(e) for e in events] print(type(event_dicts[0])) event_dicts = [e.pop('_sa_instance_state', None) for e in event_dicts] print(type(event_dicts[0])) However, this prints <class 'dict'> <class 'sqlalchemy.orm.state.InstanceState'> If I comment out the third line, which pops the unwanted key, I get <class 'dict'> <class 'dict'> Why does popping one of the keys cause the elements of the list to revert back to their original class? Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Configuring an object via a dictionary
Tobiah writes: > I should mention that I wanted to answer your question, > but I wouldn't actually do this. I'd rather opt for > your self.config = config solution. The config options > should have their own namespace. > > I don't mind at all referencing foo.config['option'], > or you could make foo.config an object by itself so > you can do foo.config.option. You'd fill its attributes > in the same way I suggested for your main object. Thanks for the thoughts. I'll go for self.config = config after all, since, as you say, the clutter caused by the referencing is not that significant. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Configuring an object via a dictionary
Hi, I am initialising an object via the following: def __init__(self, config): self.connection = None self.source_name = config['source_name'] self.server_host = config['server_host'] self.server_port = config['server_port'] self.user_base = config['user_base'] self.user_identifier = config['user_identifier'] self.group_base = config['group_base'] self.group_identifier = config['group_identifier'] self.owner_base = config['owner_base'] However, some entries in the configuration might be missing. What is the best way of dealing with this? I could of course simply test each element of the dictionary before trying to use. I could also just write self.config = config but then addressing the elements will add more clutter to the code. However, with a view to asking forgiveness rather than permission, is there some simple way just to assign the dictionary elements which do in fact exist to self-variables? Or should I be doing this completely differently? Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
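One way to keep the explicit attributes while tolerating missing entries is dict.get() with a table of defaults; a sketch (the class name and default values here are invented):

```python
# Hypothetical defaults; the real option names come from the config file.
DEFAULTS = {
    'source_name': None,
    'server_host': 'localhost',
    'server_port': 389,
}

class Connection:
    def __init__(self, config):
        self.connection = None
        # dict.get() returns the default when a key is missing, so an
        # incomplete configuration does not raise KeyError:
        for key, default in DEFAULTS.items():
            setattr(self, key, config.get(key, default))

c = Connection({'source_name': 'ldap1', 'server_port': 636})
print(c.server_host)  # 'localhost' (fell back to the default)
print(c.server_port)  # 636
```

This keeps the "which options exist and what they default to" decision in one table instead of a chain of if-tests.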
Accessing configuration across multiple modules
Hi, I am using Typer to create a command-line program with multiple levels of subcommands, so a typical call might look like mytool --config-file=~/test/mytool.conf serviceXYZ list people In the top-level mytool.main, I evaluate the option '--config-file' and read the config file to initialize the logging. This works fine. However, in the module which lists people, namely mytool.serviceXYZ.cli_people I need to set up a connection to an LDAP server in order to actually read the data. If the LDAP connection details are also in the config file, what is the best way of making them accessible at the point where the object wrapping the LDAP server is initialized? I found a suggestion here which involves creating a separate module for the configuration and then importing it https://codereview.stackexchange.com/questions/269550/python-share-global-variables-across-modules-from-user-defined-config-file I think I could probably get that to work, but are there any better alternatives? Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
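The shared-module pattern works because modules are cached in sys.modules, so every importer sees the same object. A self-contained sketch (a real project would put the settings in its own settings.py file; the module is created on the fly here only so the example runs as one script, and the attribute names are invented):

```python
import sys
import types

# Stand-in for a real settings.py module:
settings = types.ModuleType('settings')
settings.ldap_host = None
sys.modules['settings'] = settings

# main.py would do this once, after parsing the config file:
import settings as s1
s1.ldap_host = 'ldap.example.org'

# mytool/serviceXYZ/cli_people.py imports the very same module object,
# because the import system returns the cached entry from sys.modules:
import settings as s2
print(s2.ldap_host)  # 'ldap.example.org'
```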
Re: Printing dict value for possibly undefined key
DL Neil writes: > On 11/25/2023 3:31 AM, Loris Bennett via Python-list wrote: >> Hi, >> I want to print some records from a database table where one of the >> fields contains a JSON string which is read into a dict. I am doing >> something like >>print(f"{id} {d['foo']} {d['bar']}") >> However, the dict does not always have the same keys, so d['foo'] or >> d['bar'] may be undefined. I can obviously do something like >>if not 'foo' in d: >> d['foo']="NULL" >>if not 'bar' in d: >> d['bar']="NULL" >>print(f"{id} {d['foo']} {d['bar']}") >> Is there any more compact way of achieving the same thing? > > > What does "the dict does not always have the same keys" mean? > > a) there are two (or...) keys, but some records don't include both; > > b) there may be keys other than 'foo' and 'bar' which not-known in-advance; > > c) something else. Sorry for being unclear. There is either 'foo' or 'bar' or both, plus some other keys which are always present. > As mentioned, dict.get() solves one of these. > > Otherwise, there are dict methods which collect/reveal all the keys, > all the values, or both - dict.keys(), .values(), .items(), resp. That is also a good point. I had forgotten about dict.keys() and dict.values(), and hadn't been aware of dict.items(). Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Printing dict value for possibly undefined key
duncan smith writes: > On 24/11/2023 16:35, duncan smith wrote: >> On 24/11/2023 14:31, Loris Bennett wrote: >>> Hi, >>> >>> I want to print some records from a database table where one of the >>> fields contains a JSON string which is read into a dict. I am doing >>> something like >>> >>> print(f"{id} {d['foo']} {d['bar']}") >>> >>> However, the dict does not always have the same keys, so d['foo'] or >>> d['bar'] may be undefined. I can obviously do something like >>> >>> if not 'foo' in d: >>> d['foo']="NULL" >>> if not 'bar' in d: >>> d['bar']="NULL" >>> print(f"{id} {d['foo']} {d['bar']}") >>> >>> Is there any more compact way of achieving the same thing? >>> >>> Cheers, >>> >>> Loris >>> >> Yes. e.g. >> d.get('foo', "NULL") >> Duncan > > Or make d a defaultdict. > > from collections import defaultdict > > dic = defaultdict(lambda:'NULL') > dic['foo'] = 'astring' > dic['foo'] > 'astring' > dic['bar'] > 'NULL' > > Duncan > I have gone with the 'd.get' solution, as I just need to print the dict to the terminal. The dict is actually from a list of dicts which is generated by querying a database, so I don't think the defaultdict approach would be so appropriate, but it's good to know about it. Thanks, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Printing dict value for possibly undefined key
Hi, I want to print some records from a database table where one of the fields contains a JSON string which is read into a dict. I am doing something like print(f"{id} {d['foo']} {d['bar']}") However, the dict does not always have the same keys, so d['foo'] or d['bar'] may be undefined. I can obviously do something like if not 'foo' in d: d['foo']="NULL" if not 'bar' in d: d['bar']="NULL" print(f"{id} {d['foo']} {d['bar']}") Is there any more compact way of achieving the same thing? Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
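As the replies in this thread note, dict.get() collapses the if-tests into the f-string itself; a minimal sketch (record_id stands in for the database id, to avoid shadowing the id() builtin):

```python
d = {'foo': 42}   # 'bar' happens to be missing in this record
record_id = 1     # stand-in for the database id

# dict.get() supplies the fallback inline, no pre-checks needed:
print(f"{record_id} {d.get('foo', 'NULL')} {d.get('bar', 'NULL')}")  # 1 42 NULL
```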
Re: SQL rollback of multiple inserts involving constraints
Jacob Kruger writes: > Think performing a session/transaction flush after the first two > inserts should offer the workaround before you've committed all > transaction actions to the database finally: > > https://medium.com/@oba2311/sqlalchemy-whats-the-difference-between-a-flush-and-commit-baec6c2410a9 > > > HTH Yes, thank you, it does. I hadn't been aware of 'flush'. > Jacob Kruger > +2782 413 4791 > "Resistance is futile!...Acceptance is versatile..." > > > On 2023/11/10 11:15, Loris Bennett via Python-list wrote: >> Hi, >> >> In my MariaDB database I have a table 'people' with 'uid' as the primary >> key and a table 'groups' with 'gid' as the primary key. I have a third >> table 'memberships' with 'uid' and 'gid' being the primary key and the >> constraint that values for 'uid' and 'gid' exist in the tables 'people' >> and 'groups', respectively. I am using SQLAlchemy and writing a method >> to setup a membership for a new person in a new group. >> >> I had assumed that I should be able to perform all three inserts >> (person, group, membership) with a single transaction and then rollback >> if there is a problem. However, the problem is that if the both the >> insert into 'people' and that into 'groups' are not first committed, the >> constraint on the insertion of the membership fails. >> >> What am I doing wrong? >> >> Apologies if this is actually an SQL question rather than something >> related to SQLAlchemy. >> >> Cheers, >> >> Loris >> > -- Dr. Loris Bennett (Herr/Mr) ZEDAT, Freie Universität Berlin -- https://mail.python.org/mailman/listinfo/python-list
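Since the principle is SQL-level rather than SQLAlchemy-specific, here is a sketch of it in stdlib sqlite3 (table and column names simplified): all three inserts go into one transaction, and the foreign-key checks can already see the uncommitted parent rows. In SQLAlchemy, session.flush() plays the analogous role of sending the pending INSERTs to the database within the still-open transaction.

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute('PRAGMA foreign_keys = ON')  # enforce FK constraints in SQLite
con.executescript('''
    CREATE TABLE people (uid TEXT PRIMARY KEY);
    CREATE TABLE groups (gid TEXT PRIMARY KEY);
    CREATE TABLE memberships (
        uid TEXT REFERENCES people(uid),
        gid TEXT REFERENCES groups(gid),
        PRIMARY KEY (uid, gid));
''')

with con:  # one transaction: commits on success, rolls back on error
    con.execute("INSERT INTO people VALUES ('alice')")
    con.execute("INSERT INTO groups VALUES ('staff')")
    # The parent rows are visible to this statement even though nothing
    # has been committed yet, so the FK constraint is satisfied:
    con.execute("INSERT INTO memberships VALUES ('alice', 'staff')")

print(con.execute('SELECT COUNT(*) FROM memberships').fetchone()[0])  # 1
```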
SQL rollback of multiple inserts involving constraints
Hi, In my MariaDB database I have a table 'people' with 'uid' as the primary key and a table 'groups' with 'gid' as the primary key. I have a third table 'memberships' with 'uid' and 'gid' being the primary key and the constraint that values for 'uid' and 'gid' exist in the tables 'people' and 'groups', respectively. I am using SQLAlchemy and writing a method to set up a membership for a new person in a new group. I had assumed that I should be able to perform all three inserts (person, group, membership) with a single transaction and then roll back if there is a problem. However, the problem is that if both the insert into 'people' and that into 'groups' are not first committed, the constraint on the insertion of the membership fails. What am I doing wrong? Apologies if this is actually an SQL question rather than something related to SQLAlchemy. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: NameError: name '__version__' is not defined
"Loris Bennett" writes: > "Loris Bennett" writes: > >> Hi, >> >> I have two applications. One uses the system version of Python, which >> is 3.6.8, whereas the other uses Python 3.10.8 installed in a non-system >> path. For both applications I am using poetry with a pyproject.toml >> file which contains the version information and __init__.py at the root >> which contains >> >> try: >> import importlib.metadata as importlib_metadata >> except ModuleNotFoundError: >> import importlib_metadata >> >> __version__ = importlib_metadata.version(__name__) >> >> For the application with the system Python this mechanism works, but for >> the non-system Python I get the error: >> >> NameError: name '__version__' is not defined >> >> For the 3.6 application I have >> >> PYTHONPATH=/nfs/local/lib/python3.6/site-packages >> PYTHONUSERBASE=/nfs/local >> PYTHON_VERSION=3.6 >> PYTHON_VIRTUALENV= >> >> and for the 3.10 application I have >> >> >> PYTHONPATH=/nfs/easybuild/software/Python/3.10.8-GCCcore-12.2.0/easybuild/python:/nfs/local/lib/python3.10/site-packages >> PYTHONUSERBASE=/nfs/local >> PYTHON_VERSION=3.10 >> PYTHON_VIRTUALENV= >> >> The applications are installed in /nfs/local/lib/python3.6/site-packages >> and /nfs/local/lib/python3.10/site-packages, respectively. >> >> Can anyone see where this is going wrong? I thought it should be >> enough that the packages with the metadata is available via PYTHONPATH, >> but this seems not to be sufficient. So I must be overseeing something. > > If in the 3.10 application I add > > print(f"__init__ Version: {__version__}") > > to __init__.py the correct version is printed. So the problem is that > the variable is not available at the point I am trying access it. 
The > relevant code (a far as I can tell) in main.py looks like this: > > import typer > > app = typer.Typer() > > > @app.callback() > def version_callback(value: bool): > if value: > typer.echo(f"Version: {__version__}") > raise typer.Exit() > > > @app.callback() > def common( > ctx: typer.Context, > version: bool = typer.Option(None, "--version", >help="Show version", >callback=version_callback), > ): > pass > > if __name__ == "__main__": > > app() > > This is the first time I have used typer, so it is more than likely that > I have made some mistakes. OK, I worked it out. Instead of typer.echo(f"Version: {__version__}") I need typer.echo(f"Version: {mypackage.__version__}") Thanks for the help :-) Even if no-one replies, it still helps me to have to formulate the problem for an audience of people who probably know more than I do. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: NameError: name '__version__' is not defined
"Loris Bennett" writes: > Hi, > > I have two applications. One uses the system version of Python, which > is 3.6.8, whereas the other uses Python 3.10.8 installed in a non-system > path. For both applications I am using poetry with a pyproject.toml > file which contains the version information and __init__.py at the root > which contains > > try: > import importlib.metadata as importlib_metadata > except ModuleNotFoundError: > import importlib_metadata > > __version__ = importlib_metadata.version(__name__) > > For the application with the system Python this mechanism works, but for > the non-system Python I get the error: > > NameError: name '__version__' is not defined > > For the 3.6 application I have > > PYTHONPATH=/nfs/local/lib/python3.6/site-packages > PYTHONUSERBASE=/nfs/local > PYTHON_VERSION=3.6 > PYTHON_VIRTUALENV= > > and for the 3.10 application I have > > > PYTHONPATH=/nfs/easybuild/software/Python/3.10.8-GCCcore-12.2.0/easybuild/python:/nfs/local/lib/python3.10/site-packages > PYTHONUSERBASE=/nfs/local > PYTHON_VERSION=3.10 > PYTHON_VIRTUALENV= > > The applications are installed in /nfs/local/lib/python3.6/site-packages > and /nfs/local/lib/python3.10/site-packages, respectively. > > Can anyone see where this is going wrong? I thought it should be > enough that the packages with the metadata is available via PYTHONPATH, > but this seems not to be sufficient. So I must be overseeing something. If in the 3.10 application I add print(f"__init__ Version: {__version__}") to __init__.py the correct version is printed. So the problem is that the variable is not available at the point I am trying access it. 
The relevant code (as far as I can tell) in main.py looks like this: import typer app = typer.Typer() @app.callback() def version_callback(value: bool): if value: typer.echo(f"Version: {__version__}") raise typer.Exit() @app.callback() def common( ctx: typer.Context, version: bool = typer.Option(None, "--version", help="Show version", callback=version_callback), ): pass if __name__ == "__main__": app() This is the first time I have used typer, so it is more than likely that I have made some mistakes. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
NameError: name '__version__' is not defined
Hi, I have two applications. One uses the system version of Python, which is 3.6.8, whereas the other uses Python 3.10.8 installed in a non-system path. For both applications I am using poetry with a pyproject.toml file which contains the version information and __init__.py at the root which contains try: import importlib.metadata as importlib_metadata except ModuleNotFoundError: import importlib_metadata __version__ = importlib_metadata.version(__name__) For the application with the system Python this mechanism works, but for the non-system Python I get the error: NameError: name '__version__' is not defined For the 3.6 application I have PYTHONPATH=/nfs/local/lib/python3.6/site-packages PYTHONUSERBASE=/nfs/local PYTHON_VERSION=3.6 PYTHON_VIRTUALENV= and for the 3.10 application I have PYTHONPATH=/nfs/easybuild/software/Python/3.10.8-GCCcore-12.2.0/easybuild/python:/nfs/local/lib/python3.10/site-packages PYTHONUSERBASE=/nfs/local PYTHON_VERSION=3.10 PYTHON_VIRTUALENV= The applications are installed in /nfs/local/lib/python3.6/site-packages and /nfs/local/lib/python3.10/site-packages, respectively. Can anyone see where this is going wrong? I thought it should be enough that the packages with the metadata are available via PYTHONPATH, but this seems not to be sufficient. So I must be overlooking something. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Installing package as root to a system directory
Hi, I use poetry to develop system software packages as a normal user. To install the packages I use, again as a normal user export PYTHONUSERBASE=/some/path pip3 install --user somepackage.whl and add /some/path to /usr/lib64/python3.6/site-packages/zedat.pth This works well enough, but seems to me to be a little clunky, mainly because the files don't then belong to root. The most correct way, in my case, would probably be to create an RPM out of the Python package, but that seems like it would be too much overhead. What other approaches do people use? Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
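For reference, a sketch of the install flow described above (/some/path is the placeholder from the post; the actual pip command is left commented out, but the target directory pip would use can be checked first):

```shell
# PYTHONUSERBASE redirects "user" installs away from ~/.local:
export PYTHONUSERBASE=/some/path

# The user site-packages directory that pip3 install --user would
# target under this setting:
python3 -c 'import site; print(site.getusersitepackages())'

# Then, in practice:
#   pip3 install --user somepackage.whl
```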
Displaying CPU instruction sets used for TensorFlow build?
Hi, Does anyone know how I can display the CPU instruction sets which were used when TensorFlow was compiled? I initially compiled TF on a machine with a CPU which supports AVX512_VNNI. I subsequently recompiled on a second machine without AVX512_VNNI, but when I run a test program on the second machine, I get the error: The TensorFlow library was compiled to use AVX512_VNNI instructions, but these aren't available on your machine. I would like to check the instruction sets explicitly, so I can tell whether I am using the version of TF I think I am, or whether the test program has some sort of problem. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
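This doesn't answer the build-flags question directly, but it can help narrow things down: checking what the *machine* supports, by parsing /proc/cpuinfo (Linux only), tells you whether the mismatch is really on the CPU side. The helper function here is hypothetical, written so it can be exercised on a sample string:

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo content."""
    for line in cpuinfo_text.splitlines():
        if line.startswith('flags'):
            return set(line.split(':', 1)[1].split())
    return set()

sample = "processor : 0\nflags : fpu avx2 avx512f avx512_vnni\n"
print('avx512_vnni' in cpu_flags(sample))  # True

# On a real Linux machine:
# with open('/proc/cpuinfo') as f:
#     print('avx512_vnni' in cpu_flags(f.read()))
```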
Initialising a Config class
Hi, Having solved my problem regarding setting up 'logger' such that it is accessible throughout my program (thanks to the help on this list), I now have problem related to a slightly similar issue. My reading suggests that setting up a module with a Config class which can be imported by any part of the program might be a reasonable approach: import configparser class Config: def __init__(self, config_file): config = configparser.ConfigParser() config.read(config_file) However, in my config file I am using sections, so 'config' is a dict of dicts. Is there any cleverer generic way of initialising the class than self.config = config ? This seems a bit clunky, because I'll end up with something like import config ... c = config.Config(config_file) uids = get_uids(int(c.config["uids"]["minimum_uid"])) rather than something like, maybe uids = get_uids(int(c.minimum_uid)) or uids = get_uids(int(c.uids_minimum_uid)) So the question is: How can I map a dict of dicts onto class attributes in a generic way such that only code which wants to use a new configuration parameter needs to be changed and not the Config class itself? Or should I be doing this differently? Note that the values from ConfigParser are all strings, so I am fine with the attributes being strings - I'll just convert them as needed at the point of use (but maybe there is also a better way of handling that within a class). Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
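One generic way to get the c.uids_minimum_uid style of access is to flatten the parsed sections with setattr(), so new options need no change to the Config class; a minimal sketch:

```python
import configparser

class Config:
    def __init__(self, config_parser):
        # Flatten each section into attributes named '<section>_<option>';
        # values stay strings, as ConfigParser delivers them:
        for section in config_parser.sections():
            for option, value in config_parser[section].items():
                setattr(self, f"{section}_{option}", value)

cp = configparser.ConfigParser()
cp.read_string("[uids]\nminimum_uid = 1000\n")
c = Config(cp)
print(c.uids_minimum_uid)  # '1000' (still a string)
```

A typo in a parameter name then only surfaces as an AttributeError at the point of use, which is the usual trade-off with this kind of dynamic attribute mapping.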
Re: When is logging.getLogger(__name__) needed?
"Loris Bennett" writes: > dn writes: > >> On 01/04/2023 02.01, Loris Bennett wrote: >>> Hi, >>> In my top level program file, main.py, I have >>>def main_function(): >>>parser = argparse.ArgumentParser(description="my prog") >>>... >>>args = parser.parse_args() >>>config = configparser.ConfigParser() >>>if args.config_file is None: >>>config_file = DEFAULT_CONFIG_FILE >>>else: >>>config_file = args.config_file >>>config.read(config_file) >>>logging.config.fileConfig(fname=config_file) >>>logger = logging.getLogger(__name__) >>>do_some_stuff() >>> my_class_instance = myprog.MyClass() >>>def do_some_stuff(): >>>logger.info("Doing stuff") >>> This does not work, because 'logger' is not known in the function >>> 'do_some_stuff'. >>> However, if in 'my_prog/my_class.py' I have >>>class MyClass: >>>def __init__(self): >>>logger.debug("created instance of MyClass") >>> this 'just works'. >>> I can add >>>logger = logging.getLogger(__name__) >>> to 'do_some_stuff', but why is this necessary in this case but not >>> in >>> the class? >>> Or should I be doing this entirely differently? >> >> Yes: differently. >> >> To complement @Peter's response, two items for consideration: >> >> 1 once main_function() has completed, have it return logger and other >> such values/constructs. The target-identifiers on the LHS of the >> function-call will thus be within the global scope. >> >> 2 if the purposes of main_function() are condensed-down to a few >> (English, or ..., language) phrases, the word "and" will feature, eg >> - configure env according to cmdLN args, >> - establish log(s), >> - do_some_stuff(), ** AND ** >> - instantiate MyClass. >> >> If these (and do_some_stuff(), like MyClass' methods) were split into >> separate functions* might you find it easier to see them as separate >> sub-solutions? 
Each sub-solution would be able to contribute to the >> whole - the earlier ones as creating (outputting) a description, >> constraint, or basis; which becomes input to a later function/method. > So if I want to modify the logging via the command line I might have the > following: > > - > > #!/usr/bin/env python3 > > import argparse > import logging > > > def get_logger(log_level): > """Get global logger""" > > logger = logging.getLogger('example') > logger.setLevel(log_level) > ch = logging.StreamHandler() > formatter = logging.Formatter('%(levelname)s - %(message)s') > ch.setFormatter(formatter) > logger.addHandler(ch) > > return logger > > > def do_stuff(): > """Do some stuff""" > > #logger.info("Doing stuff!") Looks like I just need logger = logging.getLogger('example') logger.info("Doing stuff!") > > def main(): > """Main""" > > parser = argparse.ArgumentParser() > parser.add_argument("--log-level", dest="log_level", type=int) > args = parser.parse_args() > > print(f"log level: {args.log_level}") > > logger = get_logger(args.log_level) > logger.debug("Logger!") > do_stuff() > > > if __name__ == "__main__": > main() > > - > > How can I get logging for 'do_stuff' in this case without explicitly > passing 'logger' as an argument or using 'global'? > > Somehow I am failing to understand how to get 'logger' defined > sufficiently high up in the program that all references 'lower down' in > the program will be automatically resolved. > >> * there is some debate amongst developers about whether "one function, >> one purpose" should be a rule, a convention, or tossed in the >> trash. YMMV! >> >> Personal view: SOLID's "Single" principle applies: there should be >> only one reason (hanging over the head of each method/function, like >> the Sword of Damocles) for it to change - or one 'user' who could >> demand a change to that function. In other words, an updated cmdLN >> option shouldn't affect a function which establishes logging, for >> example.
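The reason re-calling getLogger works is that the logging module keeps a process-wide registry of loggers by name, so every call with the same name returns the same object; no shared global variable is needed:

```python
import logging

# logging.getLogger() returns the *same* logger object for a given name,
# so each function (or module) can simply ask for it again:
a = logging.getLogger('example')
b = logging.getLogger('example')
print(a is b)  # True
```

Handlers, levels and formatters set on the logger in one place are therefore seen by every later getLogger('example') call.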
Re: When is logging.getLogger(__name__) needed?
dn writes: > On 01/04/2023 02.01, Loris Bennett wrote: >> Hi, >> In my top level program file, main.py, I have >>def main_function(): >>parser = argparse.ArgumentParser(description="my prog") >>... >>args = parser.parse_args() >>config = configparser.ConfigParser() >>if args.config_file is None: >>config_file = DEFAULT_CONFIG_FILE >>else: >>config_file = args.config_file >>config.read(config_file) >>logging.config.fileConfig(fname=config_file) >>logger = logging.getLogger(__name__) >>do_some_stuff() >> my_class_instance = myprog.MyClass() >>def do_some_stuff(): >>logger.info("Doing stuff") >> This does not work, because 'logger' is not known in the function >> 'do_some_stuff'. >> However, if in 'my_prog/my_class.py' I have >>class MyClass: >>def __init__(self): >>logger.debug("created instance of MyClass") >> this 'just works'. >> I can add >>logger = logging.getLogger(__name__) >> to 'do_some_stuff', but why is this necessary in this case but not >> in >> the class? >> Or should I be doing this entirely differently? > > Yes: differently. > > To complement @Peter's response, two items for consideration: > > 1 once main_function() has completed, have it return logger and other > such values/constructs. The target-identifiers on the LHS of the > function-call will thus be within the global scope. > > 2 if the purposes of main_function() are condensed-down to a few > (English, or ..., language) phrases, the word "and" will feature, eg > - configure env according to cmdLN args, > - establish log(s), > - do_some_stuff(), ** AND ** > - instantiate MyClass. > > If these (and do_some_stuff(), like MyClass' methods) were split into > separate functions* might you find it easier to see them as separate > sub-solutions? Each sub-solution would be able to contribute to the > whole - the earlier ones as creating (outputting) a description, > constraint, or basis; which becomes input to a later function/method. 
So if I want to modify the logging via the command line I might have the following: - #!/usr/bin/env python3 import argparse import logging def get_logger(log_level): """Get global logger""" logger = logging.getLogger('example') logger.setLevel(log_level) ch = logging.StreamHandler() formatter = logging.Formatter('%(levelname)s - %(message)s') ch.setFormatter(formatter) logger.addHandler(ch) return logger def do_stuff(): """Do some stuff""" #logger.info("Doing stuff!") def main(): """Main""" parser = argparse.ArgumentParser() parser.add_argument("--log-level", dest="log_level", type=int) args = parser.parse_args() print(f"log level: {args.log_level}") logger = get_logger(args.log_level) logger.debug("Logger!") do_stuff() if __name__ == "__main__": main() - How can I get logging for 'do_stuff' in this case without explicitly passing 'logger' as an argument or using 'global'? Somehow I am failing to understand how to get 'logger' defined sufficiently high up in the program that all references 'lower down' in the program will be automatically resolved. > * there is some debate amongst developers about whether "one function, > one purpose" should be a rule, a convention, or tossed in the > trash. YMMV! > > Personal view: SOLID's "Single" principle applies: there should be > only one reason (hanging over the head of each method/function, like > the Sword of Damocles) for it to change - or one 'user' who could > demand a change to that function. In other words, an updated cmdLN > option shouldn't affect a function which establishes logging, for > example. > > > Web.Refs: > https://people.engr.tamu.edu/choe/choe/courses/20fall/315/lectures/slide23-solid.pdf > https://www.hanselminutes.com/145/solid-principles-with-uncle-bob-robert-c-martin > https://idioms.thefreedictionary.com/sword+of+Damocles > https://en.wikipedia.org/wiki/Damocles I don't really get the "one reason" idea and the Sword of Damocles analogy. 
The latter to me is more like "there's always a downside", since the perks of being king may mean someone might try to usurp the throne and kill you. Where is the "single principle" aspect? However, the idea of "one responsibility" in the sense of "do only one thing" seems relatively clear, especially if I think in terms of writing unit tests. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
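[Editorial note: the pattern the thread converges on can be sketched in a few lines. Each module creates its own module-level logger; only main() touches handlers and levels. The logger name 'example' and the message strings are taken from the posts above; everything else is a minimal illustration, not the poster's actual program.]

```python
import logging

# Module-level logger: logging.getLogger() returns the *same* object
# for the same name every time, so every function in this module can
# share it without 'global' and without a logger argument.
logger = logging.getLogger("example")


def do_stuff():
    """Helper that logs without being handed a logger."""
    logger.info("Doing stuff!")


def main(log_level=logging.INFO):
    # Handlers and levels are configured in exactly one place.
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(log_level)
    do_stuff()


main()
```

Because the registry of loggers is process-wide, do_stuff() needs no knowledge of where configuration happened; this is exactly why the class in my_class.py 'just worked'.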
Re: When is logging.getLogger(__name__) needed?
Peter Otten <__pete...@web.de> writes: > On 31/03/2023 15:01, Loris Bennett wrote: [snip (53 lines)] > Your problem has nothing to do with logging -- it's about visibility > ("scope") of names: > >>>> def use_name(): > print(name) > > >>>> def define_name(): > name = "Loris" > > >>>> use_name() > Traceback (most recent call last): > File "", line 1, in > use_name() > File "", line 2, in use_name > print(name) > NameError: name 'name' is not defined > > Binding (=assigning to) a name inside a function makes it local to that > function. If you want a global (module-level) name you have to say so: > >>>> def define_name(): > global name > name = "Peter" > > >>>> define_name() >>>> use_name() > Peter Thanks for the example and reminding me about Python's scopes. With global name def use_name(): print(name) def define_name(): name = "Peter" define_name() use_name() I was initially surprised by the following error: ~/tmp $ python3 global.py Traceback (most recent call last): File "/home/loris/tmp/global.py", line 10, in use_name() File "/home/loris/tmp/global.py", line 4, in use_name print(name) NameError: name 'name' is not defined but I was misinterpreting global name to mean define a global variable 'name' whereas it actually seems to mean more like use the global variable 'name' Correct? -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
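[Editorial note: correct - and the asymmetry is that 'global' is only needed for *assignment*, and only takes effect inside the function that assigns; at module level it is a no-op, which is why the version above still failed. A short sketch using the names from the exchange:]

```python
# 'global' belongs inside the function that assigns to the name.
def define_name():
    global name      # assignments below bind the module-level 'name'
    name = "Peter"


def use_name():
    return name      # merely *reading* a global needs no declaration


define_name()
print(use_name())    # -> Peter
```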
When is logging.getLogger(__name__) needed?
Hi, In my top level program file, main.py, I have def main_function(): parser = argparse.ArgumentParser(description="my prog") ... args = parser.parse_args() config = configparser.ConfigParser() if args.config_file is None: config_file = DEFAULT_CONFIG_FILE else: config_file = args.config_file config.read(config_file) logging.config.fileConfig(fname=config_file) logger = logging.getLogger(__name__) do_some_stuff() my_class_instance = myprog.MyClass() def do_some_stuff(): logger.info("Doing stuff") This does not work, because 'logger' is not known in the function 'do_some_stuff'. However, if in 'my_prog/my_class.py' I have class MyClass: def __init__(self): logger.debug("created instance of MyClass") this 'just works'. I can add logger = logging.getLogger(__name__) to 'do_some_stuff', but why is this necessary in this case but not in the class? Or should I be doing this entirely differently? Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Python file location
windhorn writes: > I have an older laptop I use for programming, particularly Python and > Octave, running a variety of Debian Linux, and I am curious if there > is a "standard" place in the file system to store this type of program > file. OK, I know they should go in a repository and be managed by an > IDE but this seems like way overkill for the kind of programming that > I do, normally a single file. Any suggestions welcome, thanks. How about /usr/local/bin? That should already be included in $PATH. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
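[Editorial note: for a single-file script, "installing" is just copying it somewhere on $PATH and marking it executable. A sketch of the idea - it uses a throwaway prefix so it runs unprivileged; as root, the same install(1) line would target /usr/local/bin directly. The script contents here are a made-up example.]

```shell
# Create a toy one-file Python script.
printf '#!/usr/bin/env python3\nprint("hello")\n' > hello.py

# Stand-in for /usr/local/bin so this runs without root.
prefix=$(mktemp -d)
mkdir -p "$prefix/bin"

# install(1) copies and sets the mode in one step; as root you would
# run: sudo install -m 0755 hello.py /usr/local/bin/hello
install -m 0755 hello.py "$prefix/bin/hello"

# Once the directory is on $PATH, the script runs by name.
PATH="$prefix/bin:$PATH"
hello
```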
Re: Standard class for time *period*?
Cameron Simpson writes: > On 28Mar2023 08:05, Dennis Lee Bieber wrote: >> So far, you seem to be the only person who has ever asked for >> a single >> entity incorporating an EPOCH (datetime.datetime) + a DURATION >> (datetime.timedelta). > > But not the only person to want one. I've got a timeseries data format > where (within a file) time slots are represented as a seconds offset, > and the file has an associated epoch starting point. Dual to that is > that a timeslot has an offset from the file start, and that is > effectively a (file-epoch, duration) notion. > > I've got plenty of code using that which passes around UNIX timestamp > start/stop pairs. Various conversions happen to select the appropriate > file (this keeps the files of bounded size while supporting an > unbounded time range). > > Even a UNIX timestamp has an implied epoch, and therefore kind of > represents that epoch plus the timestamp as a duration. > > I'm not sure I understand Loris' other requirements though. It might > be hard to write a general thing which was also still useful. I am glad to hear that I am not alone :-) However, my use-case is fairly trivial, indeed less complicated than yours. So, in truth I don't really need a Period class. I just thought it might be a sufficiently generic itch that someone else with a more complicated use-case could have already scratched. After all, the datetime classes exist, even though I use only a tiny part of their total functionality. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Standard class for time *period*?
Thomas Passin writes: > On 3/27/2023 11:34 AM, rbowman wrote: >> On Mon, 27 Mar 2023 15:00:52 +0200, Loris Bennett wrote: >> >>>I need to deal with what I call a 'period', which is a span of time >>>limited by two dates, start and end. The period has a 'duration', >>>which is the elapsed time between start and end. The duration is >>>essentially a number of seconds, but in my context, because the >>>durations are usually hours or days, I would generally want to display >>>the duration in a format such as "dd-hh:mm:ss" >> https://www.programiz.com/python-programming/datetime >> Scroll down to timedelta. If '14 days, 13:55:39' isn't good enough >> you'll >> have to format it yourself. > > I second this. timedelta should give the OP exactly what he's talking > about. No, it doesn't. I already know about timedelta. I must have explained the issue badly, because everyone seems to be fixating on the formatting, which is not a problem and is incidental to what I am really interested in, namely: 1. Is there a standard class for a 'period', i.e. length of time specified by a start point and an end point? The start and end points could obviously be datetimes and the difference a timedelta, but the period '2022-03-01 00:00 to 2022-03-02 00:00' would be different to '2023-03-01 00:00 to 2023-03-02 00:00' even if the *duration* of both periods is the same. 2. If such a standard class doesn't exist, why does it not exist? Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Standard class for time *period*?
Dennis Lee Bieber writes: > On Tue, 28 Mar 2023 08:14:55 +0200, "Loris Bennett" > declaimed the following: > >> >>No, it doesn't. I already know about timedelta. I must have explained >>the issue badly, because everyone seems to be fixating on the >>formatting, which is not a problem and is incidental to what I am really >>interested in, namely: >> >>1. Is there a standard class for a 'period', i.e. length of time >> specified by a start point and an end point? The start and end >> points could obviously be datetimes and the difference a timedelta, >> but the period '2022-03-01 00:00 to 2022-03-02 00:00' would be >> different to '2023-03-01 00:00 to 2023-03-02 00:00' even if the >> *duration* of both periods is the same. >> >>2. If such a standard class doesn't exist, why does it not exist? >> > > So far, you seem to be the only person who has ever asked for a single > entity incorporating an EPOCH (datetime.datetime) + a DURATION > (datetime.timedelta). You are asking for two durations of the same length > to be considered different if they were computed from different "zero" > references (epochs). I thought I was asking for two periods of the same duration to be considered different, if they have different starting points :-) > I don't think I've ever encountered an application > that doesn't use a single epoch (possibly per run) with all internal > computations using a timedelta FROM THAT EPOCH! (The exception may have > been computing star atlases during the conversion from B1900 to J2000 > reference frames.) But even if I have a single epoch, January 2022 is obviously different to January 2023, even though the duration might be the same. I am just surprised that there is no standard Period class, with which I could create objects and then be able to sort, check for identity, equality of length, amount of overlap, etc. I suppose people just use a datetime.datetime pair, as I have been doing so far, and tack on just the specific bit of functionality they need.
-- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
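[Editorial note: the Period class the thread wishes for is only a few lines on top of the datetime pair people evidently use. A minimal sketch - the class and method names are invented here, not from any standard library:]

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


# A span anchored at a start point, so two periods of equal duration
# but different starts compare unequal - unlike a bare timedelta.
@dataclass(frozen=True, order=True)
class Period:
    start: datetime
    end: datetime

    @property
    def duration(self) -> timedelta:
        return self.end - self.start

    def overlap(self, other: "Period") -> timedelta:
        """Length of time common to both periods (zero if disjoint)."""
        latest_start = max(self.start, other.start)
        earliest_end = min(self.end, other.end)
        return max(earliest_end - latest_start, timedelta(0))


jan_2022 = Period(datetime(2022, 1, 1), datetime(2022, 2, 1))
jan_2023 = Period(datetime(2023, 1, 1), datetime(2023, 2, 1))
assert jan_2022.duration == jan_2023.duration   # same length...
assert jan_2022 != jan_2023                     # ...but different periods
```

With frozen=True, order=True the instances sort and hash by (start, end), which covers the sorting and identity checks mentioned above.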
Re: Standard class for time *period*?
r...@zedat.fu-berlin.de (Stefan Ram) writes: > r...@zedat.fu-berlin.de (Stefan Ram) writes: >>d = datetime_diff.days >>h, rem = divmod( datetime_diff.seconds, 3600 ) >>m, s = divmod( rem, 60 ) >>print( f'{d:02}-{h:02}:{m:02}:{s:02}' ) > > If the default formatting is acceptable to you, you can also > print the datetime_diff in a shorter way: > > main.py > > from datetime import datetime > > format = '%Y-%m-%dT%H:%M:%S%z' > datetime_0 = datetime.strptime( '2023-03-27T14:00:52+01:00', format ) > datetime_1 = datetime.strptime( '2023-03-27T14:27:23+01:00', format ) > > print( datetime_1 - datetime_0 ) > > sys.stdout > > 0:26:31 > > . Days will also be shown if greater than zero. Thanks for the examples, but I am not really interested in how to write a bunch of code to do what I need, as I can already do that. I am interested in whether there is a standard class for this and, if there is not such a class, why this is the case. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Standard class for time *period*?
Hi, I have been around long enough to know that, due to time-zones, daylight saving and whatnot, time-related stuff is complicated. So even if I think something connected with time should exist, there may well be a very good reason why it does not. My problem: I need to deal with what I call a 'period', which is a span of time limited by two dates, start and end. The period has a 'duration', which is the elapsed time between start and end. The duration is essentially a number of seconds, but in my context, because the durations are usually hours or days, I would generally want to display the duration in a format such as "dd-hh:mm:ss" My (possibly ill-founded) expectation: There is a standard class which encapsulates this sort of functionality. My (possibly insufficiently researched) conclusion: Such a standard class does not exist. What is at fault here? My expectation or my conclusion? Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Implementing a plug-in mechanism
Simon Ward writes: > On Thu, Mar 16, 2023 at 07:45:18AM +1300, dn via Python-list wrote: >> There is a PyPi library called pluggy (not used it). I've used >> informal approaches using an ABC as a framework/reminder (see >> @George's response). > > typing.Protocol is also useful here as the plugin interface can be > defined separately not requiring inheriting from an ABC. Thanks to all for the helpful suggestions. I realise that I don't actually need to be able to load a bunch of arbitrary plugins, but merely to be able to override one (or, perhaps later, more) piece of default behaviour. Therefore I think the following very simple scheme will work for me: $ tree -L 3 . └── myproj ├── __init__.py ├── mailer.py ├── main.py └── plugins └── normanmailer.py Where main.py is #!/usr/bin/env python3 # -*- coding: utf-8 -*- if __name__ == "__main__": try: import plugins.mailer as mailer print("Found plugins.mailer") except ModuleNotFoundError: import mailer print("Found mailer") m = mailer.Mailer() m.run() mailer.py is class Mailer(): def run(self): print("This is a generic Mailer object!") and plugins/normanmailer.py is class Mailer(): def run(self): print("This is a customized Mailer object!") This then gives me $ poetry run myproj\/main.py Found mailer This is a generic Mailer object! $ mv myproj/plugins/{norman,}mailer.py $ poetry run myproj\/main.py Found plugins.mailer This is a customized Mailer object! I suspect I was using slightly incorrect/misleading terminology. I don't want to be able to load arbitrary functionality via plugins, e.g. sending an email, dumping to a database, uploading to a cloud. That would, as far as I can tell, necessitate having some mechanism to select the functionality. Instead I just want to modify the behaviour of a piece of fixed functionality, e.g. sending a mail. So I am really talking about customisation here. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
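[Editorial note: one wrinkle in the try/except scheme above is that `except ModuleNotFoundError` also fires when plugins/mailer.py exists but one of *its own* imports is broken, silently falling back to the generic mailer. Probing availability first with importlib.util.find_spec avoids masking such errors. A sketch - the load_mailer helper and the throwaway module it fabricates for demonstration are invented here:]

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def load_mailer():
    """Return plugins.mailer if it is importable, else the generic mailer."""
    try:
        spec = importlib.util.find_spec("plugins.mailer")
    except ModuleNotFoundError:
        # Raised when the 'plugins' package itself is absent.
        spec = None
    if spec is not None:
        return importlib.import_module("plugins.mailer")
    return importlib.import_module("mailer")


# Demo: fabricate a throwaway generic mailer.py on sys.path.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "mailer.py"), "w") as fh:
    fh.write("class Mailer:\n    def run(self):\n        return 'generic'\n")
sys.path.insert(0, tmpdir)

mailer = load_mailer()
print(mailer.Mailer().run())  # no plugins package on the path, so: generic
```

If plugins/mailer.py is present but broken, import_module then raises at the real point of failure instead of the error being swallowed.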
Implementing a plug-in mechanism
Hi, I have written a program which, as part of the non-core functionality, contains a module to generate email. This is currently very specific to my organisation, so the main program contains import myorg.mailer This module is specific to my organisation in that it can ask an internal server to generate individualised salutations for a given UID which is known within the organisation. I want to share the code with other institutions, so I would like to 1. replace the organisation-specific mailer with a generic one 2. allow an organisation-specific mailer to be used instead of the generic one, if so desired Is importlib the way to go here or is there another approach? Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Distributing program for Linux
Anssi Saari writes: > "Loris Bennett" writes: > >> I am aware that an individual user could use (mini)conda to install a >> more recent version of Python in his/her home directory, but I am >> interested in how root would install such a program. > > Root would install the script and required Python version somewhere > depending on any site-specific practices and then use things like pyenv, > stow, environment modules or whatever to give the users access to it. The program is not for normal users, but is a system program. Many admins who might install the program will be using environment modules, so that, coupled with setting #!/usr/bin/env python3 for the scripts, looks like it might be a reasonable solution. > Root might even package your script with the interpreter required into > one binary. See Tools/freeze in the source distribution. Well, anyone who has the sources can do that, if so inclined. Personally, I already have enough versions of Python (currently 12 versions installed via EasyBuild plus the 2 from the OS itself) without creating fat binaries which contain a copy of one of those versions. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Distributing program for Linux
"Weatherby,Gerard" writes: > It’s really going to depend on the distribution and whether you have root > access. I am interested in providing a package for people with root access for a variety of distributions. > If you have Ubuntu and root access, you can add the deadsnakes repo, > https://launchpad.net/~deadsnakes, and install whatever Python you > want. I myself have this part covered via EasyBuild, https://easybuild.io/, which will work on any distro. How anybody else installs a given version Python will be left as an exercise for them (potential users of the software are however unlikely to be using Ubuntu). > The default ‘python3’ remains but you can called a specific Python, (e.g. > python3.10). > > A typical shebang line would be: > > #!/usr/bin/env python3.10 I am currently using poetry to build the package and uses a 'scripts' section in the pyproject.toml file to produce stubs to call the main program. These have the shebang #!/usr/bin/python3 So if I can get poetry to use '/usr/bin/env' instead, then I can probably just rely on whatever mechanism other people use to switch between Python versions to do the right thing. Cheers, Loris > From: Python-list > on behalf of Loris Bennett > Date: Tuesday, March 14, 2023 at 12:27 PM > To: python-list@python.org > Subject: Distributing program for Linux > *** Attention: This is an external email. Use caution responding, opening > attachments or clicking on links. *** > > Hi, > > If I write a system program which has Python >= 3.y as a dependency, > what are the options for someone whose Linux distribution provides > Python 3.x, where x < y? > > I am aware that an individual user could use (mini)conda to install a > more recent version of Python in his/her home directory, but I am > interested in how root would install such a program. > > Cheers, > > Loris > > -- > This signature is currently under constuction. 
> -- > https://mail.python.org/mailman/listinfo/python-list -- Dr. Loris Bennett (Herr/Mr) ZEDAT, Freie Universität Berlin -- https://mail.python.org/mailman/listinfo/python-list
Distributing program for Linux
Hi, If I write a system program which has Python >= 3.y as a dependency, what are the options for someone whose Linux distribution provides Python 3.x, where x < y? I am aware that an individual user could use (mini)conda to install a more recent version of Python in his/her home directory, but I am interested in how root would install such a program. Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: putting JUNK at the end of a [.py] file
Hen Hanna writes: > in a LaTeX file, after the (1st) \end{document} line, > i can put any random Junk i want(afterwards) until the end of the > file. > > > Is there a similar Method for a.py file ? > > Since i know of no such trick, i sometimes put this (below) at the end of a > .py file. > > > > dummy= (""" junk and more junk > words in Dict > 239 words in Dict > ((( notes or Code fragmetns ))) > """ ) > > > > **maybe i don't need the dummy= but it looks better. Apropos looking better: In my newsreader with a fixed-width font, I find your postings rather hard to read as there is a lot of, for me, inexplicable white space. Is there a particular reason why your postings are formatted like this? Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: argparse — adding a --version flag in the face of positional args
Mats Wichmann writes: > On 11/27/22 16:40, Skip Montanaro wrote: >> I have a script to which I'd like to add a --version flag. It should print >> the version number then exit, much in the same way --help prints the help >> text then exits. I haven't been able to figure that out. I always get a >> complaint about the required positional argument. >> I think I could use something like nargs='*', but that would push >> off >> detection of the presence of the positional arg to the application. >> Shouldn't I be able to tell argparse I'm going to process --verbose, then >> exit? > > ummm, hate to say this, but have you checked the documentation? this > case is supported using an action named 'version' without doing very > much. I hadn't noticed the action 'version'. I just use parser.add_argument( "-v", "--version", action="store_true", dest="version", help="print version" ) ... if args.version: print(f"Version {my_module.__version__}") sys.exit(0) where the version is specified in a pyproject.toml file and __init__.py contains try: import importlib.metadata as importlib_metadata except ModuleNotFoundError: import importlib_metadata __version__ = importlib_metadata.version(__name__) I use poetry to then build the corresponding versioned package. What am I missing by not using the action 'version'? Do I just save having to explicitly test for the version arg? Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
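[Editorial note: the answer to "what am I missing" is Skip's original problem - with action="store_true", argparse still demands the required positional before your own version check can run, whereas action="version" prints and exits first, just like --help. A sketch; the prog name and version string are placeholders:]

```python
import argparse
import contextlib
import io

parser = argparse.ArgumentParser(prog="myprog")
parser.add_argument("filename")               # a required positional
parser.add_argument("-v", "--version", action="version",
                    version="%(prog)s 1.2.3")

# --version wins even though 'filename' is missing: the action prints
# the version string and raises SystemExit before validation.
buf = io.StringIO()
with contextlib.redirect_stdout(buf), contextlib.suppress(SystemExit):
    parser.parse_args(["--version"])
print(buf.getvalue().strip())  # -> myprog 1.2.3
```

With store_true, `myprog --version` (no filename) would instead die with "the following arguments are required: filename".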
Re: Preprocessing not quite fixed-width file before parsing
Thomas Passin writes: > On 11/24/2022 9:06 AM, Loris Bennett wrote: >> Thomas Passin writes: >> >>> On 11/23/2022 11:00 AM, Loris Bennett wrote: >>>> Hi, >>>> I am using pandas to parse a file with the following structure: >>>> Name filesettype KB quota limit >>>> in_doubtgrace |files quotalimit in_doubtgrace >>>> shortname sharedhome USR14097664 524288000 545259520 0 >>>> none | 107110 000 none >>>> gracedays sharedhome USR 774858944 524288000 775946240 0 >>>> 5 days | 1115717 000 none >>>> nametoolong sharedhome USR27418496 524288000 545259520 >>>> 0 none |11581 000 none >>>> I was initially able to use >>>> df = pandas.read_csv(file_name, delimiter=r"\s+") >>>> because all the values for 'grace' were 'none'. Now, however, >>>> non-"none" values have appeared and this fails. >>>> I can't use >>>> pandas.read_fwf >>>> even with an explicit colspec, because the names in the first column >>>> which are too long for the column will displace the rest of the data to >>>> the right. >>>> The report which produces the file could in fact also generate a >>>> properly delimited CSV file, but I have a lot of historical data in the >>>> readable but poorly parsable format above that I need to deal with. >>>> If I were doing something similar in the shell, I would just pipe >>>> the >>>> file through sed or something to replace '5 days' with, say '5_days'. >>>> How could I achieve a similar sort of preprocessing in Python, ideally >>>> without having to generate a lot of temporary files? >>> >>> This is really annoying, isn't it? A space-separated line with spaces >>> in data entries. If the example you give is typical, I don't think >>> there is a general solution. If you know there are only certain >>> values like that, then you could do a search-and-replace for them in >>> Python just like the example you gave for "5 days". 
>>> >>> If you know that the field that might contain entries with spaces is >>> the same one, e.g., the one just before the "|" marker, you could make >>> use of that. But it could be tricky. >>> >>> I don't know how many files like this you will need to process, nor >>> how many rows they might contain. If I were to do tackle this job, I >>> would probably do some quality checking first. Using this example >>> file, figure out how many fields there are supposed to be. First, >>> split the file into lines: >>> >>> with open("filename") as f: >>> lines = f.readlines() >>> >>> # Check space-separated fields defined in first row: >>> fields = lines[0].split() >>> num_fields = len(fields) >>> print(num_fields) # e.g., 100) >>> >>> # Find lines that have the wrong number of fields >>> bad_lines = [] >>> for line in lines: >>> fields = line.split() >>> if len(fields) != num_fields: >>> bad_lines.append(line) >>> >>> print(len(bad_lines)) >>> >>> # Inspect a sample >>> for line in bad_lines[:10]: >>> print(line) >>> >>> This will give you an idea of how many problems lines there are, and >>> if they can all be fixed by a simple replacement. If they can and >>> this is the only file you need to handle, just fix it up and run it. >>> I would replace the spaces with tabs or commas. Splitting a line on >>> spaces (split()) takes care of the issue of having a variable number >>> of spaces, so that's easy enough. >>> >>> If you will need to handle many files, and you can automate the fixes >>> - possibly with a regular expression - then you should preprocess each >>> file before giving it to pandas. Something like this: >>> >>> def fix_line(line): >>> """Test line for field errors and fix errors if any.""" >>> # >>> return fixed >>> >>> # For each file >>> with open("filename") as f: >>> lines = f.readlines() >>> >>> fixed_lines = [] >>> for line in lines: >>>
Re: Preprocessing not quite fixed-width file before parsing
Thomas Passin writes: > On 11/23/2022 11:00 AM, Loris Bennett wrote: >> Hi, >> I am using pandas to parse a file with the following structure: >> Name filesettype KB quota limit >> in_doubtgrace |files quotalimit in_doubtgrace >> shortname sharedhome USR14097664 524288000 545259520 0 >> none | 107110 000 none >> gracedays sharedhome USR 774858944 524288000 775946240 0 >> 5 days | 1115717 000 none >> nametoolong sharedhome USR27418496 524288000 545259520 0 >>none |11581 000 none >> I was initially able to use >>df = pandas.read_csv(file_name, delimiter=r"\s+") >> because all the values for 'grace' were 'none'. Now, however, >> non-"none" values have appeared and this fails. >> I can't use >>pandas.read_fwf >> even with an explicit colspec, because the names in the first column >> which are too long for the column will displace the rest of the data to >> the right. >> The report which produces the file could in fact also generate a >> properly delimited CSV file, but I have a lot of historical data in the >> readable but poorly parsable format above that I need to deal with. >> If I were doing something similar in the shell, I would just pipe >> the >> file through sed or something to replace '5 days' with, say '5_days'. >> How could I achieve a similar sort of preprocessing in Python, ideally >> without having to generate a lot of temporary files? > > This is really annoying, isn't it? A space-separated line with spaces > in data entries. If the example you give is typical, I don't think > there is a general solution. If you know there are only certain > values like that, then you could do a search-and-replace for them in > Python just like the example you gave for "5 days". > > If you know that the field that might contain entries with spaces is > the same one, e.g., the one just before the "|" marker, you could make > use of that. But it could be tricky. > > I don't know how many files like this you will need to process, nor > how many rows they might contain. 
If I were to do tackle this job, I > would probably do some quality checking first. Using this example > file, figure out how many fields there are supposed to be. First, > split the file into lines: > > with open("filename") as f: > lines = f.readlines() > > # Check space-separated fields defined in first row: > fields = lines[0].split() > num_fields = len(fields) > print(num_fields) # e.g., 100) > > # Find lines that have the wrong number of fields > bad_lines = [] > for line in lines: >fields = line.split() >if len(fields) != num_fields: > bad_lines.append(line) > > print(len(bad_lines)) > > # Inspect a sample > for line in bad_lines[:10]: > print(line) > > This will give you an idea of how many problems lines there are, and > if they can all be fixed by a simple replacement. If they can and > this is the only file you need to handle, just fix it up and run it. > I would replace the spaces with tabs or commas. Splitting a line on > spaces (split()) takes care of the issue of having a variable number > of spaces, so that's easy enough. > > If you will need to handle many files, and you can automate the fixes > - possibly with a regular expression - then you should preprocess each > file before giving it to pandas. Something like this: > > def fix_line(line): >"""Test line for field errors and fix errors if any.""" ># >return fixed > > # For each file > with open("filename") as f: > lines = f.readlines() > > fixed_lines = [] > for line in lines: > fixed = fix_line(line) > fields = fixed.split() > tabified = '\t'.join(fields) # Could be done by fix_line() > fixed_lines.append(tabified) > > # Now use an IOString to feed the file to pandas > # From memory, some details may not be right > f = IOString() > f.writelines(fixed_lines) > > # Give f to pandas as if it were an external file > # ... > Thanks to both Gerard and Thomas for the pointer to IOString. 
I ended up just reading the file line-by-line, using a regex to replace ' |' with ' |' and writing the new lines to an io.StringIO, which I then passed to pandas.read_csv. The wrapper approach looks interesting, but it looks like I need to read up more on contexts before adding that to my own code, otherwise I may not understand it in a month's time. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
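[Editorial note: the shape of that solution in miniature, with io.StringIO standing in for a temporary file. The sample rows and the `(\d+) days` -> `\1_days` substitution are illustrative assumptions, not the poster's actual report format or regex:]

```python
import io
import re

import pandas as pd

# A small stand-in for the quota report: the '5 days' value breaks
# whitespace splitting because it contains a space.
raw = """Name type KB grace
shortname USR 14097664 none
gracedays USR 774858944 5 days
"""

# Join the two-word value so every row has the same field count...
fixed = re.sub(r"(\d+) days", r"\1_days", raw)

# ...then feed the result to pandas via an in-memory file object,
# with no temporary file on disk.
df = pd.read_csv(io.StringIO(fixed), delimiter=r"\s+")
print(df.loc[1, "grace"])  # -> 5_days
```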
Preprocessing not quite fixed-width file before parsing
Hi, I am using pandas to parse a file with the following structure: Name filesettype KB quota limit in_doubt grace |files quotalimit in_doubtgrace shortname sharedhome USR14097664 524288000 545259520 0 none | 107110 000 none gracedays sharedhome USR 774858944 524288000 775946240 0 5 days | 1115717 000 none nametoolong sharedhome USR27418496 524288000 545259520 0 none |11581 000 none I was initially able to use df = pandas.read_csv(file_name, delimiter=r"\s+") because all the values for 'grace' were 'none'. Now, however, non-"none" values have appeared and this fails. I can't use pandas.read_fwf even with an explicit colspec, because the names in the first column which are too long for the column will displace the rest of the data to the right. The report which produces the file could in fact also generate a properly delimited CSV file, but I have a lot of historical data in the readable but poorly parsable format above that I need to deal with. If I were doing something similar in the shell, I would just pipe the file through sed or something to replace '5 days' with, say '5_days'. How could I achieve a similar sort of preprocessing in Python, ideally without having to generate a lot of temporary files? Cheers, Loris -- This signature is currently under constuction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Gdal
Hi Conrado,

"Jorge Conrado Conforte" writes:

> Hi,
>
> I installed GDAL using the pip command and conda. But, I did:
>
> import gdal and I had:

Depending on your GDAL version, you might find you have to do

    from osgeo import gdal

See https://gdal.org/api/python_bindings.html#tutorials

Cheers,

Loris

> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ModuleNotFoundError: No module named 'gdal'
>
> I need gdal to remap some data.
>
> Please, help me
>
> Conrado

-- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
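A version-robust import along the lines of the advice above could look like this sketch. Recent GDAL releases expose the Python bindings under the osgeo package, while very old installations provided a top-level gdal module; the final fallback to None is purely so the snippet also runs on machines without GDAL installed at all:

```python
try:
    from osgeo import gdal  # modern layout of the GDAL bindings
except ImportError:
    try:
        import gdal  # legacy flat layout on very old installations
    except ImportError:
        gdal = None  # GDAL not installed at all

# Downstream code can then check for None and fail with a clear message.
print(gdal is not None)
```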
poetry: local dependencies in dev and prod environments
Hi,

Say I have two modules: a main application and a utility. With poetry I can add the utility as a local dependency to the main application thus:

    poetry add ../utilities/mailer/dist/my_mailer-0.1.0-py3-none-any.whl

This then generates the following in the pyproject.toml of the main application:

    [tool.poetry.dependencies]
    python = "^3.6"
    my-mailer = {path = "../utilities/mailer/dist/my_mailer-0.1.0-py3-none-any.whl"}

With this configuration I can run the application in the development environment. However, if I want to install both the main application and the utility to my 'site-packages' directory, I need to change pyproject.toml to

    [tool.poetry.dependencies]
    python = "^3.6"
    my-mailer = ">0.1.0"

This seems a bit clunky to me and I wonder whether there is a better way of handling the situation. The problem is that the venv used by poetry for the main application correctly ignores the 'site-packages' directory, which contains the utility when it is installed. However, it would be handy if I could allow just this single package from 'site-packages' to be used within the venv. Is this possible?

Cheers,

Loris

-- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Exclude 'None' from list comprehension of dicts
Antoon Pardon writes: > Op 4/08/2022 om 13:51 schreef Loris Bennett: >> Hi, >> >> I am constructing a list of dictionaries via the following list >> comprehension: >> >>data = [get_job_efficiency_dict(job_id) for job_id in job_ids] >> >> However, >> >>get_job_efficiency_dict(job_id) >> >> uses 'subprocess.Popen' to run an external program and this can fail. >> In this case, the dict should just be omitted from 'data'. >> >> I can have 'get_job_efficiency_dict' return 'None' and then run >> >>filtered_data = list(filter(None, data)) >> >> but is there a more elegant way? > > Just wondering, why don't you return an empty dictionary in case of a failure? > In that case your list will be all dictionaries and empty ones will be > processed > fast enough. When the list of dictionaries is processed, I would have to check each element to see if it is empty. That strikes me as being less efficient than filtering out the empty dictionaries in one go, although obviously one would need to benchmark that. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
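For completeness, the filtering discussed in this thread can also be done in a single pass with the walrus operator (Python 3.8+), so failed lookups are skipped without a separate emptiness check afterwards. get_job_efficiency_dict is stubbed here, since the real one wraps subprocess.Popen:

```python
def get_job_efficiency_dict(job_id):
    # Stand-in for the real Popen-based lookup: pretend even IDs fail.
    return None if job_id % 2 == 0 else {"job_id": job_id}

job_ids = [1, 2, 3, 4, 5]

# Bind the result once per iteration and keep it only if the call
# succeeded; no second pass over the list is needed.
data = [d for job_id in job_ids
        if (d := get_job_efficiency_dict(job_id)) is not None]
print(data)  # [{'job_id': 1}, {'job_id': 3}, {'job_id': 5}]
```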
Exclude 'None' from list comprehension of dicts
Hi, I am constructing a list of dictionaries via the following list comprehension: data = [get_job_efficiency_dict(job_id) for job_id in job_ids] However, get_job_efficiency_dict(job_id) uses 'subprocess.Popen' to run an external program and this can fail. In this case, the dict should just be omitted from 'data'. I can have 'get_job_efficiency_dict' return 'None' and then run filtered_data = list(filter(None, data)) but is there a more elegant way? Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Exclude 'None' from list comprehension of dicts
r...@zedat.fu-berlin.de (Stefan Ram) writes: > "Loris Bennett" writes: >>data = [get_job_efficiency_dict(job_id) for job_id in job_ids] > ... >>filtered_data = list(filter(None, data)) > > You could have "get_job_efficiency_dict" return an iterable > that yields either zero dictionaries or one dictionary. > For example, a list with either zero entries or one entry. > > Then, use "itertools.chain.from_iterable" to merge all those > lists with empty lists effectively removed. E.g., > > print( list( itertools.chain.from_iterable( [[ 1 ], [], [ 2 ], [ 3 ]]))) > > will print > > [1, 2, 3] 'itertool' is a bit of a blind-spot of mine, so thanks for pointing that out. > . Or, consider a boring old "for" loop: > > data = [] > for job_id in job_ids: > dictionary = get_job_efficiency_dict( job_id ) > if dictionary: > data.append( dictionary ) > > . It might not be "elegant", but it's quite readable to me. To me too. However, 'data' can occasionally consist of many 10,000s of elements. Would there be a potential performance problem here? Even if there is, it wouldn't be so bad, as the aggregation of the data is not time-critical and only occurs once a month. Still, I wouldn't want the program to be unnecessarily inefficient. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
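The performance question raised above is easy to settle empirically. This rough, self-contained sketch builds a list of the size mentioned (tens of thousands of elements, with every tenth entry a None standing in for a failed Popen call), checks that both approaches agree, and times them with timeit; at this size either variant completes in milliseconds, so readability can win:

```python
import timeit

# Every tenth entry is None, standing in for a failed external call.
data = [{"id": i} if i % 10 else None for i in range(50_000)]

filtered = list(filter(None, data))   # filter() approach

looped = []                           # boring old "for" loop approach
for d in data:
    if d:
        looped.append(d)

assert filtered == looped
print(len(filtered))  # 45000

# Concrete numbers for the comparison; both are typically milliseconds.
t_filter = timeit.timeit(lambda: list(filter(None, data)), number=10)
t_loop = timeit.timeit(lambda: [d for d in data if d], number=10)
```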
2 options both mutually exclusive with respect to 3rd option
Hi,

I want to plot some data which are collected over time. I either want to specify start and/or end times for the plot, or specify the last week, month or year. So the usage would look like:

    plot_data [[--start START --end END] | --last LAST]

I know about argparse's mutually exclusive group, but this creates a group of options which are all mutually exclusive amongst each other. I want both --start and --end to be mutually exclusive with respect to --last. Can I achieve this directly with argparse, or do I just have to check the options by hand?

Cheers,

Loris

-- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
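Checking by hand is the usual answer here: argparse's add_mutually_exclusive_group() makes every member exclusive with every other member, so it cannot express "(--start and/or --end) XOR --last". A sketch of the post-parse validation, reporting violations through parser.error() so the user still gets the normal usage message:

```python
import argparse

parser = argparse.ArgumentParser(prog="plot_data")
parser.add_argument("--start")
parser.add_argument("--end")
parser.add_argument("--last")

def parse(argv):
    args = parser.parse_args(argv)
    # --last excludes both --start and --end, but --start and --end
    # may be combined freely with each other.
    if args.last is not None and (args.start is not None
                                  or args.end is not None):
        parser.error("--last cannot be combined with --start/--end")
    return args

args = parse(["--start", "2022-01-01", "--end", "2022-02-01"])
print(args.start, args.end, args.last)  # 2022-01-01 2022-02-01 None
```

parser.error() prints the usage text and exits with status 2, matching the behaviour users expect from argparse's built-in constraint violations.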
Re: poetry script fails to find module
r...@zedat.fu-berlin.de (Stefan Ram) writes: > "Loris Bennett" writes: >>However, in the development environment, if I run >> python stoat/main.py hpc user --help >>then is >> stoat/hpc/main.py >>being found via >> import hpc.main >>because Python looks in >> stoat >>as the parent directory of >> stoat/main.py >>rather than the current working directory? > > When you run "stoat/main.py", the directory of that file > "main.py" is automatically added at the front of sys.path. > (This also would happen after "cd stoat; python main.py".) OK, that explains it. I initially found that a bit odd, but thinking about it I see that "the directory containing the file being run" is a more sensible reference point than the current working directory, which is totally arbitrary. > Then, when "import hpc.main" is being executed, the system > will search for "hpc/main.py" in every entry of sys.path > and will use the first entry wherein it finds "hpc/main.py". > So it will use the directory of the file "stoat/main.py", > i.e., the directory "stoat". It finds "stoat/hpc/main.py". > > You can call "python" with "-v", and it will show some lines > with information about the imports executed, including the > directories used (albeit hidden in a lot of other lines). That's useful, although in the production case I would have to tweak the script generated by poetry. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
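The sys.path behaviour discussed in this thread can be demonstrated in a few lines: rebuild the stoat layout in a temporary directory and run its main.py from a *different* working directory. The import of hpc.main still succeeds, because the script's own directory, not the CWD, is what gets prepended to sys.path:

```python
import os
import subprocess
import sys
import tempfile

# Minimal reconstruction of the layout from this thread:
#   stoat/main.py      (does "import hpc.main")
#   stoat/hpc/main.py
with tempfile.TemporaryDirectory() as tmp:
    pkg = os.path.join(tmp, "stoat")
    os.makedirs(os.path.join(pkg, "hpc"))
    open(os.path.join(pkg, "hpc", "__init__.py"), "w").close()
    open(os.path.join(pkg, "hpc", "main.py"), "w").close()
    with open(os.path.join(pkg, "main.py"), "w") as f:
        f.write("import hpc.main\nprint('ok')\n")
    # Run from the parent directory: 'hpc' is not under the CWD, yet
    # the import works because sys.path[0] is the script's directory.
    result = subprocess.run(
        [sys.executable, os.path.join(pkg, "main.py")],
        capture_output=True, text=True, cwd=tmp)

print(result.stdout.strip())  # ok
```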
Re: poetry script fails to find module
Hi Stefan, r...@zedat.fu-berlin.de (Stefan Ram) writes: > "Loris Bennett" writes: >>Why is the module 'hpc' not found by the poetry script? > > I have tried to execute the following sequence of shell > commands to understand your problem. Here they all worked > without error messages. Warning: Some of these commands might > alter your directories or data, so only execute them if you > are aware of their meaning/consequences and agree with them! I do know what the commands do. However, just to be on the safe side, but mainly in honour of a classic dad-joke, I changed 'stoat' to 'weasel'. > mkdir stoat > mkdir stoat/hpc > echo import hpc.main >stoat/main.py > echo >stoat/hpc/main.py > python3 stoat/main.py > cd stoat > python3 main.py I guess with the production version, the package stoat is visible in sys.path and thus the subpackages have to be referred to via the main package, e.g. import stoat.hpc.main However, in the development environment, if I run python stoat/main.py hpc user --help then is stoat/hpc/main.py being found via import hpc.main because Python looks in stoat as the parent directory of stoat/main.py rather than the current working directory? That doesn't seem likely to me, but I am already confused. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: poetry script fails to find module
r...@zedat.fu-berlin.de (Stefan Ram) writes: > "Loris Bennett" writes: >>However, this raises the question of why it worked in the first place >>in the poetry shell. > > It might have had a different or extended sys.path. In the poetry shell sys.path has this additional path /home/loris/gitlab/stoat The module not being found was /home/gitlab/stoat/stoat/hpc/main.py But if I run [~/gitlab/stoat] $ python stoat/main.py hpc user --help wouldn't import hpc.main still fail? Or is it because I am calling stoat/main.py and so Python looks for 'hpc' relative to the second 'stoat' directory? Confused, loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: poetry script fails to find module
"Loris Bennett" writes: > Hi, > > The following is a little bit involved, but I hope can make the problem clear. > > Using poetry I have written a dummy application which just uses to typer > to illustrate a possible interface design. The directory structure is a > follows: > > $ tree -P *.py > . > |-- dist > |-- stoat > | |-- hpc > | | |-- database.py > | | |-- group.py > | | |-- __init__.py > | | |-- main.py > | | |-- owner.py > | | `-- user.py > | |-- __init__.py > | |-- main.py > | `-- storage > | |-- database.py > | |-- group.py > | |-- __init__.py > | |-- main.py > | |-- owner.py > | |-- share.py > | `-- user.py > `-- tests > |-- __init__.py > `-- test_stoat.py > > With in the poetry shell I can run the application successfully: > > $ python stoat/main.py hpc user --help > > Usage: main.py hpc user [OPTIONS] COMMAND [ARGS]... > > manage HPC users > > Options: > --help Show this message and exit. > > Commands: > add add a user > remove remove a user > > I then install this in a non-standard path (because the OS Python3 is > 3.6.8) and can run the installed version successfully: > > $ PYTHONPATH=/trinity/shared/zedat/lib/python3.9/site-packages python > /trinity/shared/zedat/lib/python3.9/site-packages/stoat/main.py hpc user > --help > Usage: main.py hpc user [OPTIONS] COMMAND [ARGS]... > > manage HPC users > > Options: > --help Show this message and exit. 
> > Commands: > add add a user > remove remove a user > > However, poetry creates a script 'stoat' from the entry > > [tool.poetry.scripts] > stoat = "stoat.main:main" > > in pyproject.toml, which looks like > > > #!/trinity/shared/easybuild/software/Python/3.9.6-GCCcore-11.2.0/bin/python3.9 > # -*- coding: utf-8 -*- > import re > import sys > from stoat.main import main > if __name__ == '__main__': > sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) > sys.exit(main()) > > If I run that I get > > $ PYTHONPATH=/trinity/shared/zedat/lib/python3.9/site-packages stoat hpc > user --help > Traceback (most recent call last): > File "/trinity/shared/zedat/bin/stoat", line 5, in <module> > from stoat.main import main > File "/trinity/shared/zedat/lib/python3.9/site-packages/stoat/main.py", > line 3, in <module> > import hpc.main > ModuleNotFoundError: No module named 'hpc' > > Why is the module 'hpc' not found by the poetry script? Never mind, I worked it out. I had to replace import hpc.main with import stoat.hpc.main. However, this raises the question of why it worked in the first place in the poetry shell. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
poetry script fails to find module
Hi,

The following is a little bit involved, but I hope I can make the problem clear.

Using poetry I have written a dummy application which just uses typer to illustrate a possible interface design. The directory structure is as follows:

    $ tree -P *.py
    .
    |-- dist
    |-- stoat
    |   |-- hpc
    |   |   |-- database.py
    |   |   |-- group.py
    |   |   |-- __init__.py
    |   |   |-- main.py
    |   |   |-- owner.py
    |   |   `-- user.py
    |   |-- __init__.py
    |   |-- main.py
    |   `-- storage
    |       |-- database.py
    |       |-- group.py
    |       |-- __init__.py
    |       |-- main.py
    |       |-- owner.py
    |       |-- share.py
    |       `-- user.py
    `-- tests
        |-- __init__.py
        `-- test_stoat.py

Within the poetry shell I can run the application successfully:

    $ python stoat/main.py hpc user --help

    Usage: main.py hpc user [OPTIONS] COMMAND [ARGS]...

      manage HPC users

    Options:
      --help  Show this message and exit.

    Commands:
      add     add a user
      remove  remove a user

I then install this in a non-standard path (because the OS Python3 is 3.6.8) and can run the installed version successfully:

    $ PYTHONPATH=/trinity/shared/zedat/lib/python3.9/site-packages python /trinity/shared/zedat/lib/python3.9/site-packages/stoat/main.py hpc user --help
    Usage: main.py hpc user [OPTIONS] COMMAND [ARGS]...

      manage HPC users

    Options:
      --help  Show this message and exit. 
    Commands:
      add     add a user
      remove  remove a user

However, poetry creates a script 'stoat' from the entry

    [tool.poetry.scripts]
    stoat = "stoat.main:main"

in pyproject.toml, which looks like

    #!/trinity/shared/easybuild/software/Python/3.9.6-GCCcore-11.2.0/bin/python3.9
    # -*- coding: utf-8 -*-
    import re
    import sys
    from stoat.main import main
    if __name__ == '__main__':
        sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
        sys.exit(main())

If I run that I get

    $ PYTHONPATH=/trinity/shared/zedat/lib/python3.9/site-packages stoat hpc user --help
    Traceback (most recent call last):
      File "/trinity/shared/zedat/bin/stoat", line 5, in <module>
        from stoat.main import main
      File "/trinity/shared/zedat/lib/python3.9/site-packages/stoat/main.py", line 3, in <module>
        import hpc.main
    ModuleNotFoundError: No module named 'hpc'

Why is the module 'hpc' not found by the poetry script?

Cheers,

Loris

-- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Python & nmap
Lars Liedtke writes: > Ansible has got a shell module, so you could run custom commands on all > hosts. But it gets more difficult in parsing the output afterwards. If you just want to copy files, pdsh[1] or clush[2] might be enough. Cheers, Loris Footnotes: [1] https://github.com/chaos/pdsh [2] https://clustershell.readthedocs.io/en/latest/tools/clush.html -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Request for assistance (hopefully not OT)
Chris Angelico writes: > On Wed, 18 May 2022 at 04:05, Loris Bennett > wrote: >> >> [snip (26 lines)] >> >> > I think you had a problem before that. Debian testing is not an >> > operating system you should be using if you have a fairly good >> > understanding of how Debian (or Linux in general) works. >> >> Should be >> >> I think you had a problem before that. Debian testing is not an >> operating system you should be using *unless* you have a fairly good >> understanding of how Debian (or Linux in general) works. >> >> [snip (62 lines)] >> > > Oh! My bad, didn't see this correction, sorry. With this adjustment, > the comment is a bit more reasonable, although I'd still say it's > generally fine to run Debian Testing on a personal desktop machine; > there are a number of distros that base themselves on Testing. > > But yes, "unless" makes much more sense there. It's lucky I never got "if" and "unless" mixed up when I used to program in Perl ;-) Yes, there are a number of distros based on Debian Testing, but those tend to be aimed more at sysadmins (e.g. Kali and Grml) than people just starting out with Linux. However, with plain old Debian Testing you need to be able to deal with things occasionally not working properly. As the Debian people say about Testing: "If it doesn't work for you, then there's a good chance it's broken." And that's even before you delete part of the OS with 'rm'. Cheers, Loris -- https://mail.python.org/mailman/listinfo/python-list
Re: Request for assistance (hopefully not OT)
o1bigtenor writes: > Greetings > > I was having space issues in my /usr directory so I deleted some > programs thinking that the space taken was more an issue than having > older versions of the program. > > So one of the programs I deleted (using rm -r) was python3.9. Deleting anything from /usr via 'rm -r' which was installed via the package manager is an extremely bad idea. If you want to remove stuff, use the package manager. > Python3.10 was already installed so I thought (naively!!!) that things > should continue working. > (Python 3.6, 3.7 and 3.8 were also part of this cleanup.) Python 3.10 may be installed, but a significant number of packages depend on Python 3.9. That's why you should use the package manager - it knows all about the dependencies. > So now I have problems. I think you had a problem before that. Debian testing is not an operating system you should be using if you have a fairly good understanding of how Debian (or Linux in general) works. > Following is the system barf that I get when I run '# apt upgrade'. > > What can I do to correct this self-inflicted problem? > > (running on debian testing 5.17 I think you mean just 'Debian testing', which is what will become the next version of Debian, i.e. Debian 12. The '5.17' is just the kernel version, not a version of Debian. > Setting up python2.7-minimal (2.7.18-13.1) ... > Could not find platform independent libraries > Could not find platform dependent libraries > Consider setting $PYTHONHOME to [:] > /usr/bin/python2.7: can't open file > '/usr/lib/python2.7/py_compile.py': [Errno 2] No such file or > directory > dpkg: error processing package python2.7-minimal (--configure): > installed python2.7-minimal package post-installation script > subprocess returned error exit status 2 > Setting up python3.9-minimal (3.9.12-1) ... 
> update-binfmts: warning: /usr/share/binfmts/python3.9: no executable > /usr/bin/python3.9 found, but continuing anyway as you request > /var/lib/dpkg/info/python3.9-minimal.postinst: 51: /usr/bin/python3.9: not > found > dpkg: error processing package python3.9-minimal (--configure): > installed python3.9-minimal package post-installation script > subprocess returned error exit status 127 > dpkg: dependency problems prevent configuration of python3.9: > python3.9 depends on python3.9-minimal (= 3.9.12-1); however: > Package python3.9-minimal is not configured yet. > > dpkg: error processing package python3.9 (--configure): > dependency problems - leaving unconfigured > dpkg: dependency problems prevent configuration of python2.7: > python2.7 depends on python2.7-minimal (= 2.7.18-13.1); however: > Package python2.7-minimal is not configured yet. > > dpkg: error processing package python2.7 (--configure): > dependency problems - leaving unconfigured > dpkg: dependency problems prevent configuration of python3.9-dev: > python3.9-dev depends on python3.9 (= 3.9.12-1); however: > Package python3.9 is not configured yet. > > dpkg: error processing package python3.9-dev (--configure): > dependency problems - leaving unconfigured > . . . > Errors were encountered while processing: > python2.7-minimal > python3.9-minimal > python3.9 > python2.7 > python3.9-dev It might be possible to fix the system. It will probably be fairly difficult, but you would probably learn a lot doing it. However, if I were you, I would just install Debian stable over your borked system and then learn a bit more about package management. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Request for assistance (hopefully not OT)
[snip (26 lines)] > I think you had a problem before that. Debian testing is not an > operating system you should be using if you have a fairly good > understanding of how Debian (or Linux in general) works. Should be I think you had a problem before that. Debian testing is not an operating system you should be using *unless* you have a fairly good understanding of how Debian (or Linux in general) works. [snip (62 lines)] -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Instatiating module / Reusing module of command-line tool
r...@zedat.fu-berlin.de (Stefan Ram) writes: > "Loris Bennett" writes: >>My question: What is the analogue to initialising an object via the >>constructor for a module? > > If you need a class, you can write a class. > > When one imports a module, the module actually gets executed. > That's why people write "if __name__ == '__main__':" often. > So, everything one wants to be done at import time can be > written directly into the body of one's module. So if I have a module which relies on having internal data being set from outside, then, even though the program only ever has one instance of the module, different runs, say test and production, would require different internal data and thus different instances. Therefore a class seems more appropriate and it is more obvious to me how to initialise the objects (e.g. by having some main function which can read command-line arguments and then just pass the arguments to the constructor). I suppose that the decisive aspect is that my module needs initialisation and thus should be a class. Your examples in the other posting of the modules 'math' and 'string' are different, because they just contain functions and no data. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
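The class-based design arrived at above can be sketched as follows. The names are illustrative only, not taken from the real my_mailer package: the connection settings move from module-level state into the constructor, so test and production runs each build their own configured instance of the same code:

```python
class SalutationClient:
    """Client for the salutation server, configured per instance."""

    def __init__(self, host, port, user, secret):
        self.host = host
        self.port = port
        self.user = user
        self.secret = secret

    def get_salutation(self, uid, lang="en"):
        # The real code would query the salutation server here; this
        # stub only shows the shape of the call.
        return f"Salutation for {uid} ({lang}) via {self.host}:{self.port}"

# Different runs, different configuration, same code:
prod = SalutationClient("salutations.example.org", 4711, "mailer", "prod-secret")
test = SalutationClient("localhost", 4711, "mailer", "test-secret")
print(prod.get_salutation("alice"))
```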
Re: Match.groupdict: Meaning of default argument?
Julio Di Egidio writes: > On Friday, 29 April 2022 at 09:50:08 UTC+2, Loris Bennett wrote: >> Hi, >> >> If I do >> >> import re >> pattern = >> re.compile(r'(?P<days>\d*)(-?)(?P<hours>\d\d):(?P<minutes>\d\d):(?P<seconds>\d\d)') >> s = '104-02:47:06' >> match = pattern.search(s) >> match_dict = match.groupdict('0') >> >> I get >> >> match_dict >> {'days': '104', 'hours': '02', 'minutes': '47', 'seconds': '06'} >> >> However, if the string has no initial part (corresponding to the number of >> days), e.g. >> >> s = '02:47:06' >> match = pattern.search(s) >> match_dict = match.groupdict('0') >> >> I get >> >> match_dict >> {'days': '', 'hours': '02', 'minutes': '47', 'seconds': '06'} >> >> I thought that 'days' would default to '0'. >> >> What am I doing wrong? > > You tell, but it's quite obvious that you (just) run a regex on a string and captures are going to be strings: indeed, '02' is not a number either... > > Julio I am not sure what you are trying to tell me. I wasn't expecting anything other than strings. The problem was, as Stefan helped me to understand, that I misunderstood what 'participating in the match' means. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Match.groupdict: Meaning of default argument?
"Loris Bennett" writes: > r...@zedat.fu-berlin.de (Stefan Ram) writes: > >> "Loris Bennett" writes: >>>I thought that 'days' would default to '0'. >> >> It will get the value '0' if (?P<days>\d*) does >> /not/ participate in the match. >> >> In your case, it /does/ participate in the match, >> \d* matching the empty string. >> >> Try (?P<days>\d+)?. > > Ah, thanks. I was misunderstanding the meaning of 'participate'. What I actually need is ((?P<days>\d+)(-?))?(?P<hours>\d\d):(?P<minutes>\d\d):(?P<seconds>\d\d) so that I can match both 99-11:22:33 and 11:22:33 and have 'days' be '0' in the latter case. Thanks for pointing me in the right direction. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
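A runnable check of the corrected pattern from this thread: because the whole days-plus-dash prefix is one optional group, the 'days' group does not participate in the match when the prefix is absent, and groupdict('0') supplies the default:

```python
import re

# Corrected pattern: the days prefix is optional as a whole.
pattern = re.compile(
    r'((?P<days>\d+)(-?))?'
    r'(?P<hours>\d\d):(?P<minutes>\d\d):(?P<seconds>\d\d)')

with_days = pattern.search('104-02:47:06').groupdict('0')
without_days = pattern.search('02:47:06').groupdict('0')
print(with_days)     # 'days' captured as '104'
print(without_days)  # 'days' absent, so the default '0' is used
```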
Re: Match.groupdict: Meaning of default argument?
r...@zedat.fu-berlin.de (Stefan Ram) writes: > "Loris Bennett" writes: >>I thought that 'days' would default to '0'. > > It will get the value '0' if (?P<days>\d*) does > /not/ participate in the match. > > In your case, it /does/ participate in the match, > \d* matching the empty string. > > Try (?P<days>\d+)?. Ah, thanks. I was misunderstanding the meaning of 'participate'. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Match.groupdict: Meaning of default argument?
Hi,

If I do

    import re
    pattern = re.compile(r'(?P<days>\d*)(-?)(?P<hours>\d\d):(?P<minutes>\d\d):(?P<seconds>\d\d)')
    s = '104-02:47:06'
    match = pattern.search(s)
    match_dict = match.groupdict('0')

I get

    match_dict
    {'days': '104', 'hours': '02', 'minutes': '47', 'seconds': '06'}

However, if the string has no initial part (corresponding to the number of days), e.g.

    s = '02:47:06'
    match = pattern.search(s)
    match_dict = match.groupdict('0')

I get

    match_dict
    {'days': '', 'hours': '02', 'minutes': '47', 'seconds': '06'}

I thought that 'days' would default to '0'. What am I doing wrong?

Cheers,

Loris

-- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Instatiating module / Reusing module of command-line tool
Hi, I have a command-line script in Python to get the correct salutation for a user name in either English or German from a 'salutation server': $ get_salutation alice Dear Professor Müller $ get_salutation alice -l de Sehr geehrte Frau Professorin Müller The hostname, port, user and password for the 'salutation server' can be given as options on the command-line, but if omitted are read from a configuration file. The program is implemented in two modules without any classes: main.py: ... parse command-line options, read config file ... salutation = my_mailer.salutations.get_salutation(args.user, args.lang, args.host, args.port, args.user, args.secret) salutations.py def get_salutation(uid, lang, host, port, user, secret): ... I have another program that is intended to run as a cron job and send an email to certain users. This is implemented as a number of modules without any classes and will need to use the 'get_salutation' function from the first module. My question: What is the analogue to initialising an object via the constructor for a module? My understanding is that a module is a singleton of the class 'module'. So do I just write a method which reads the config file, or is there some more standardised way which corresponds to instantiating an object via my_object = MyClass() in the case of a class. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: matplotlib: scatterplot and histogram with same colour scale
"Loris Bennett" writes: > Hi, > > I am using matplotlib to create scatterplot where the colour depends on > the y-value. Additionally I want to project the y-values onto a rotated > histogram along the right side of the scatterplot. > > My problem is that with my current code, the colours used for the > histogram are normalised to the range of actual y-values. These don't > match the colours in the scatterplot, which are based on the absolute > y-value in the range 0-100. > > Can anyone tell me what I am doing wrong (code below)? Now I have written down the problem with MWE I can see that the mapping of the colours to the patches of the histogram is wrong. The following values illustrate the problem much better efficiencies = [51, 52, 53, 54, 55, 56, 57, 58, 59] core_hours = [ 5, 10, 15, 20, 25, 30, 35, 40, 45] The histogram is just being constructed over the actual data, which is reasonable. However, I want the histogram over the entire range 0-100, so I just need to add the 'range' parameter: n, bins, patches = axis[1].hist(efficiencies, n_bins, range=(0, 100), histtype='bar', orientation='horizontal') D'oh! > Cheers, > > Loris > > > > import matplotlib.pyplot as plt > import numpy as np > > efficiencies = [69, 48, 21, 28, 28, 26, 28] > core_hours = [3, 8, 10, 13, 14, 18, 20] > > figure, axis = plt.subplots(ncols=2, nrows=1, sharey=True, > gridspec_kw={'width_ratios': [10, 1]}) > > cm = plt.cm.RdYlGn > > n_bins = 10 > colours = plt.get_cmap(cm)(np.linspace(0, 1, n_bins)) > > axis[0].scatter(core_hours, efficiencies, c=efficiencies, > cmap=cm, vmin=0, vmax=100) > axis[0].set_xlabel("core-hours") > axis[0].set_ylabel("CPU efficiency [%]") > axis[0].set_ylim(ymin=-5, ymax=105) > > n, bins, patches = axis[1].hist(efficiencies, n_bins, > histtype='bar', orientation='horizontal') > for patch, colour in zip(patches, colours): > patch.set_facecolor(colour) > axis[1].set_xlabel("jobs") > > plt.tight_layout() > plt.show() > plt.close() -- Dr. 
Loris Bennett (Herr/Mr) ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de -- https://mail.python.org/mailman/listinfo/python-list
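The fix found above can be demonstrated with numpy alone, without plotting: without the `range` argument the histogram bins span only the observed data, so a bin's position (and hence its colour) drifts with the data, whereas range=(0, 100) pins bin k to the same slice of the efficiency scale that the scatterplot's colour map uses:

```python
import numpy as np

efficiencies = [51, 52, 53, 54, 55, 56, 57, 58, 59]

counts_auto, edges_auto = np.histogram(efficiencies, bins=10)
counts_full, edges_full = np.histogram(efficiencies, bins=10, range=(0, 100))

print(edges_auto[0], edges_auto[-1])   # 51.0 59.0 -- only the data range
print(edges_full[0], edges_full[-1])   # 0.0 100.0 -- the whole scale
print(counts_full[5])                  # 9 -- all values land in the 50-60 bin
```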
matplotlib: scatterplot and histogram with same colour scale
Hi, I am using matplotlib to create scatterplot where the colour depends on the y-value. Additionally I want to project the y-values onto a rotated histogram along the right side of the scatterplot. My problem is that with my current code, the colours used for the histogram are normalised to the range of actual y-values. These don't match the colours in the scatterplot, which are based on the absolute y-value in the range 0-100. Can anyone tell me what I am doing wrong (code below)? Cheers, Loris import matplotlib.pyplot as plt import numpy as np efficiencies = [69, 48, 21, 28, 28, 26, 28] core_hours = [3, 8, 10, 13, 14, 18, 20] figure, axis = plt.subplots(ncols=2, nrows=1, sharey=True, gridspec_kw={'width_ratios': [10, 1]}) cm = plt.cm.RdYlGn n_bins = 10 colours = plt.get_cmap(cm)(np.linspace(0, 1, n_bins)) axis[0].scatter(core_hours, efficiencies, c=efficiencies, cmap=cm, vmin=0, vmax=100) axis[0].set_xlabel("core-hours") axis[0].set_ylabel("CPU efficiency [%]") axis[0].set_ylim(ymin=-5, ymax=105) n, bins, patches = axis[1].hist(efficiencies, n_bins, histtype='bar', orientation='horizontal') for patch, colour in zip(patches, colours): patch.set_facecolor(colour) axis[1].set_xlabel("jobs") plt.tight_layout() plt.show() plt.close() -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Why does datetime.timedelta only have the attributes 'days' and 'seconds'?
Dennis Lee Bieber writes:

> On Tue, 19 Apr 2022 15:51:09 +0200, "Loris Bennett" declaimed the following:
>
>> If I am merely trying to represent part of a very large number of seconds
>> as a number of years, 365 days per year does not seem that controversial
>
> The Explanatory Supplement to the Astronomical Almanac (table 15.3)
> defines the /day/ as 24hrs->1440mins->86400secs BUT defines the Julian
> /year/ as 365.25 days.

That is interesting. However, I am not claiming that the definition of a year as 365 24-hour days is in any way correct, merely that it is a possible definition and one that is potentially handy if one wants to represent large numbers of seconds in a more readable way.

> It goes on to also give (for my copy -- length of year at 1990):
>
>   Tropical (equinox to equinox)           365.2421897 days *
>   Sidereal (fixed star to fixed star)     365.25636 days
>   Anomalistic (perihelion to perihelion)  365.25964 days
>   Eclipse (moon's node to moon's node)    346.26005
>   Gaussian (Kepler's law /a/=1)           365.25690
>   Julian                                  365.25
>
> Length of the month (I interpret this as lunar month):
>
>   Synodic (new moon to new moon)          29.53059 days
>   Tropical (as with year)                 27.32158
>   Sidereal (as with year)                 27.32166
>   Anomalistic (perigee to perigee)        27.55455
>   Draconic (node to node)                 27.21222
>
> * I /think/ this is the year used for leap-day calculations, and why some
> leap centuries are skipped as it is really less than a quarter day per
> year, so eventually one gets to over-correcting by a day. 
> > Of course, this book also has a footnote giving the speed of light as > 1.80261750E12 Furlongs/Fortnight And of course I should have been asking why timedelta doesn't provide an easy way to format the period as a number of fortnights :-) > However, as soon you incorporate units that are not SI seconds you have > to differentiate between pure duration (based on SI seconds) and civil time > (which may jump when leap seconds are added/subtracted, time zones are > crossed, or daylight savings time goes into [or out of] effect). > > For the most part, Python's datetime module seems to account for civil > time concepts, not monotonic durations. That indeed seems to be the case and the lack of trivial formatting options for monotonic durations is what surprises me. > The Ada standard separates "duration" (as a measure of elapsed time, in > fixed point seconds) from "time" (a private type in Ada.Calendar -- which > does NOT define hours/minutes... It has Year 1901..2399, Month 1..12, Day > 1..31, Day_duration 0.0..86400.0). There are functions to create a Time > from components, split a Time into components, compare two times, add a > Duration to a Time, subtract a Duration from a Time, and subtract a Time > from a Time (getting a Duration). Oh, and a function to get Time from > system clock. Per the standard, the Time Zone used is implementation > defined (so one needs to read the implementation manual to find out what a > Time really notates). Of note: > """ > 26.a/1 > To be honest: {8652/0106} {AI95-00160-01} By "proper date" above we mean > that the given year has a month with the given day. For example, February > 29th is a proper date only for a leap year. We do not mean to include the > Seconds in this notion; in particular, we do not mean to require > implementations to check for the “missing hour” that occurs when Daylight > Savings Time starts in the spring. 
> """
> """
> 43
> type Duration is delta implementation-defined range
>   implementation-defined;
> """
>
> GNAT provides an extension package GNAT.Calendar that adds
> hour/minute/day-of-week and some other utilities... BUT
>
> """
> procedure Split_At_Locale
>   (Date       : Ada.Calendar.Time;
>    Year       : out Ada.Calendar.Year_Number;
>    Month      : out Ada.Calendar.Month_Number;
>    Day        : out Ada.Calendar.Day_Number;
>    Hour       : out Hour_Number;
>    Minute     : out Minute_Number;
>    Second     : out Second_Number;
>    Sub_Second : out Second_Duration);
> -- Split a standard Ada.Calendar.Time value in date data (Year, Month, Day)
> -- and Time data (Hour, Minute, Second, Sub_Second). This version of Split
> -- utilizes the time zone and DST bias of the locale (equivalent to Clock).
> -- Due to this simplified behavior, the implementation does not require
> -- expensive system calls on targets such as Windows.
> -- WARNING: Split_At_Locale is no longer aware of historic events and may
> -- produce inaccurate results over DST changes which occurred in the past.
> """

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Why does datetime.timedelta only have the attributes 'days' and 'seconds'?
Random832 writes: > On Tue, Apr 19, 2022, at 07:11, Loris Bennett wrote: >> I now realise that timedelta is not really what I need. I am >> interested solely in pure periods, i.e. numbers of seconds, that I >> can convert back and forth from a format such as > > A timedelta *is* a pure period. A timedelta of one day is 86400 > seconds. > > The thing you *think* timedelta does [making a day act as 23 or 25 > hours across daylight saving boundaries etc], that you *don't* want it > to do, is something it *does not actually do*. I don't know how this > can be made more clear to you. I have now understood this. > timedelta is what you need. if you think it's not, it's because you're > using datetime incorrectly. It is what I need. It just doesn't do the trivial format conversion I (apparently incorrectly) expected. However, I can implement the format conversion myself. [snip (35 lines)] -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
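For anyone finding this thread in the archive: the format conversion in question really is only a few lines of divmod. A sketch (the function names are made up for illustration; days are treated as fixed 86,400-second periods):

```python
from datetime import timedelta

def format_timedelta(td):
    """Render a timedelta as 'D-HH:MM:SS', with a day fixed at 86,400 s."""
    total = int(td.total_seconds())
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"

def parse_timedelta(s):
    """Parse 'D-HH:MM:SS' back into a timedelta."""
    days, clock = s.split("-")
    hours, minutes, seconds = map(int, clock.split(":"))
    return timedelta(days=int(days), hours=hours,
                     minutes=minutes, seconds=seconds)
```

For example, `format_timedelta(timedelta(days=1, seconds=61))` gives `"1-00:01:01"`, and `parse_timedelta` inverts it.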
Re: Why does datetime.timedelta only have the attributes 'days' and 'seconds'?
Jon Ribbens writes:

> On 2022-04-19, Loris Bennett wrote:
>> If I am merely trying to represent part of a very large number of seconds
>> as a number of years, 365 days per year does not seem that controversial
>> to me. Obviously there are issues if you expect all periods of an
>> integer number of years which start on a given date to all end on the
>> same date.
>>
>> In my little niche, I just need a very simple period and am anyway not
>> bothered about years, since in my case the number of days is usually
>> capped at 14 and only in extremely exceptional circumstances could it
>> get up to anywhere near 100.
>>
>> However, surely there are plenty of people measuring durations of a few
>> hours or less who don't want to have to deal with seconds all the time
>> (I am in fact also in this other group when I record my working hours).
>
> Well, that's my point. Everyone's all in their own slightly-different
> little niches. There isn't one straightforward standard that makes all
> or even most of them happy.

I'm sure you're right. It just strikes me as a little odd that so much
effort has gone into datetime to make things work (almost) properly for
(almost) everyone, whereas timedelta has remained rather rudimentary, at
least in terms of formatting. It seems to me that periods on the order
of hours would have quite generic applications, but maybe that's just
the view from my niche.

Cheers,

Loris

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Why does datetime.timedelta only have the attributes 'days' and 'seconds'?
Jon Ribbens writes:

> On 2022-04-19, Loris Bennett wrote:
>> I now realise that timedelta is not really what I need. I am interested
>> solely in pure periods, i.e. numbers of seconds,
>
> That's exactly what timedelta is.
>
>> that I can convert back and forth from a format such as
>>
>>   11-22::44:55
>
> I don't recognise that format and can't work out what it means.
> It should be trivial to write functions to parse whatever format
> you wanted and convert between it and timedelta objects though.

days-hours:minutes:seconds

>> It is obviously fairly easy to rustle up something to do this, but I am
>> surprised that this is not baked into Python (such a class also seems to
>> be missing from R).
>
> I would be very surprised if any language supported the arbitrary format
> above you happen to be interested in!

But most languages support fairly arbitrary formatting of timedate-style
objects. It doesn't seem unreasonable to me that such formatting might
be available for simple periods.

>> I would have thought that periods crop up all over
>> the place and therefore formatting as strings and parsing of string
>> would be supported natively by most modern languages. Apparently not.
>
> I think most languages think that a simple number suffices to represent
> a fixed time period (commonly seconds or milliseconds). And if you want
> more dynamic intervals (e.g. x months y days) then there is insufficient
> consensus as to what that actually means.

Maybe. It just seems to me that once you get up to more than a few
hundred seconds, the ability to convert to and from a more readable
format becomes very useful. The length of a month may be unclear, but
the definitions for year, week, day, hours, and minute are all trivial.

Cheers,

Loris

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Why does datetime.timedelta only have the attributes 'days' and 'seconds'?
Jon Ribbens writes: > On 2022-04-19, Loris Bennett wrote: >> Jon Ribbens writes: >>> On 2022-04-19, Loris Bennett wrote: >>>> I now realise that timedelta is not really what I need. I am interested >>>> solely in pure periods, i.e. numbers of seconds, >>> >>> That's exactly what timedelta is. >>> >>>> that I can convert back and forth from a format such as >>>> >>>> 11-22::44:55 >>> >>> I don't recognise that format and can't work out what it means. >>> It should be trivial to write functions to parse whatever format >>> you wanted and convert between it and timedelta objects though. >> >> days-hours:minutes:seconds > > If by 'days' it means '86,400 seconds' then that's very easily > convertible to and from timedelta. > >>> I would be very surprised if any language supported the arbitrary format >>> above you happen to be interested in! >> >> But most languages support fairly arbitrary formatting of timedate-style >> objects. It doesn't seem unreasonable to me that such formatting might >> be available for simple periods. >> >>>> I would have thought that periods crop up all over >>>> the place and therefore formatting as strings and parsing of string >>>> would be supported natively by most modern languages. Apparently not. >>> >>> I think most languages think that a simple number suffices to represent >>> a fixed time period (commonly seconds or milliseconds). And if you want >>> more dynamic intervals (e.g. x months y days) then there is insufficient >>> consensus as to what that actually means. >> >> Maybe. It just seems to me that once you get up to more than a few >> hundred seconds, the ability to convert and from a more readable format >> becomes very useful. The length of a month may be unclear, but the >> definitions for year, week, day, hours, and minute are all trivial. > > Eh? The definitions for "year, week, day" are not in the slightest bit > trivial (unless you define 'day' as '86,400 seconds', in which case > 'year' is still not remotely trivial). 
Yes, I do mean just the trivial definitions from

  https://docs.python.org/3/library/datetime.html

i.e.

  A millisecond is converted to 1000 microseconds.
  A minute is converted to 60 seconds.
  An hour is converted to 3600 seconds.
  A week is converted to 7 days.

plus a 24-hour day and a 365-day year.

> I think the issue is simply lack of consensus. Even though ISO 8601,
> which is extremely common (possibly even ubiquitous, for anything
> modern) for the format of date/times, also defines a format for
> durations (e.g. 'P4Y3M' for '4 years 3 months'), I don't think
> I have ever seen it used in practice - not least because apparently
> it doesn't define what it actually means. So there isn't one simple
> standard agreed by everyone that is an obvious candidate for inclusion
> in language standard libraries.

If I am merely trying to represent part of a very large number of
seconds as a number of years, 365 days per year does not seem that
controversial to me. Obviously there are issues if you expect all
periods of an integer number of years which start on a given date to all
end on the same date.

In my little niche, I just need a very simple period and am anyway not
bothered about years, since in my case the number of days is usually
capped at 14 and only in extremely exceptional circumstances could it
get up to anywhere near 100.

However, surely there are plenty of people measuring durations of a few
hours or less who don't want to have to deal with seconds all the time
(I am in fact also in this other group when I record my working hours).

Cheers,

Loris

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list
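With only those trivial fixed-length definitions (60 s minutes, 3600 s hours, 86,400 s days, 7-day weeks, 365-day years), the readable form is just a chain of divmods. A sketch, with a made-up function name:

```python
def humanize_seconds(total):
    """Break a number of seconds into (years, weeks, days, hours,
    minutes, seconds) using fixed-length units only: no calendars,
    time zones, or leap seconds involved."""
    years, rest = divmod(total, 365 * 86400)
    weeks, rest = divmod(rest, 7 * 86400)
    days, rest = divmod(rest, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return years, weeks, days, hours, minutes, seconds
```

So 90,061 seconds comes out as 1 day, 1 hour, 1 minute, 1 second.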
Re: Why does datetime.timedelta only have the attributes 'days' and 'seconds'?
"Peter J. Holzer" writes:

> On 2022-04-16 20:35:22 -, Jon Ribbens via Python-list wrote:
>> On 2022-04-16, Peter J. Holzer wrote:
>> > On 2022-04-16 14:22:04 -, Jon Ribbens via Python-list wrote:
>> >> ... although now having looked into the new 'zoneinfo' module slightly,
>> >> it really should have a giant red flashing notice at the top of it
>> >> saying "BEWARE, TIMEZONES IN PYTHON ARE UTTERLY BROKEN, NEVER USE THEM".
>> >>
>> >> Suppose we do this:
>> >>
>> >> >>> import datetime, zoneinfo
>> >> >>> LOS_ANGELES = zoneinfo.ZoneInfo('America/Los_Angeles')
>> >> >>> UTC = zoneinfo.ZoneInfo('UTC')
>> >> >>> d = datetime.datetime(2020, 10, 31, 12, tzinfo=LOS_ANGELES)
>> >> >>> print(d)
>> >> 2020-10-31 12:00:00-07:00
>> >> >>> d1 = d + datetime.timedelta(days=1)
>> >> >>> print(d1)
>> >> 2020-11-01 12:00:00-08:00
>> >>
>> >> d1 is *wrong*.
>> >
>> > No, this is correct. That's the result you want.
>>
>> I can categorically guarantee you it is not. But let's put it a
>> different way, if you like, if I want to add 24 hours, i.e. 86,400
>> seconds (or indeed any other fixed time period), to a timezone-aware
>> datetime in Python, how do I do it?
>
> What you *should* be able to do is use datetime.timedelta(hours=24).
>
> Unfortunately, you can't, because somebody decided to add a
> normalization rule to timedelta which turns this into timedelta(days=1,
> hours=0).
>
>> It would appear that, without converting to UTC before doing the
>> calculation, you can't.
>
> When doing calculations of this kind I frankly prefer converting to
> "seconds since the epoch" and doing simple arithmetic. (Yes, leap
> seconds, I know .. I just ignore those)
>
>> > So why didn't this work for me (I also used Python 3.9)? My guess is
>> > that astimezone() doesn't pick the correct time zone.
>>
>> astimezone() doesn't pick a time zone at all. It works out the current
>> local offset from UTC.
>
> The timezone object it returns also includes a timezone string ("CET" in
> my example).
So it's not *just* the offset. The result is misleading, > though. You get something which looks like it's a timezone object for > Central European Time, but isn't. > >> It doesn't know anything about when or if that >> offset ever changes. > > astimezone() doesn't have to. It just has to pick the correct timezone > object. That object then knows about offset changes. > > >> >> timedelta(days=1) is 24 hours (as you can check by >> >> calling timedelta(days=1).total_seconds() ), >> > >> > It shouldn't be. 1 Day is not 24 hours in the real world. >> >> Nevertheless, timedelta is a fixed time period so that is the only >> definition possible. > > Yeah, you keep repeating that. I think we are talking at cross-purposes > here. You are talking about how timedelta is implemented while I'm > talking what semantics it *should* have. > > >> >> It appears that with Python it's not so much a guideline as an >> >> absolute concrete rule, and not because programmers will introduce >> >> bugs, but because you need to avoid bugs in the standard library! >> > >> > As a programmer you must always adapt to the problem. Saying "I must do >> > it the wrong way because my library is buggy" is just lazy. >> >> I didn't say any of that. I said you must do it the conservative way, >> and it's not "my library" that's buggy, it's the language's built-in >> *standard library* that's buggy. > > With "your library" I meant "the library you have" not "the library you > wrote". And while having a buggy (or just badly designed) standard > library is especially annoying, you still aren't forced to use it if if > doesn't fit your needs. > > hp I now realise that timedelta is not really what I need. I am interested solely in pure periods, i.e. numbers of seconds, that I can convert back and forth from a format such as 11-22::44:55 (These are the lengths of time a job has run on an HPC system - leap seconds and time zones are of no relevance). 
It is obviously fairly easy to rustle up something to do this, but I am surprised that this is not baked into Python (such a class also seems to be missing from R). I would have thought that periods crop up all over the place and therefore formatting as strings and parsing of string would be supported natively by most modern languages. Apparently not. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Why does datetime.timedelta only have the attributes 'days' and 'seconds'?
"Loris Bennett" writes:

> Hi,
>
> With Python 3.9.2 I get
>
> $ import datetime
> $ s = "1-00:01:01"
> $ t = datetime.datetime.strptime(s, "%d-%H:%M:%S")
> $ d = datetime.timedelta(days=t.day, hours=t.hour, minutes=t.minute,
>                          seconds=t.second)
> $ d.days
> 1
> $ d.seconds
> 61
> $ d.minutes
> AttributeError: 'datetime.timedelta' object has no attribute 'minutes'
>
> Is there a particular reason why there are no attributes 'minutes' and
> 'hours and the attribute 'seconds' encompasses is the entire fractional
> day?

That should read:

Is there a particular reason why there are no attributes 'minutes' and
'hours' and the attribute 'seconds' encompasses the entire fractional
day?

> Cheers,
>
> Loris

-- 
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin         Email loris.benn...@fu-berlin.de
-- 
https://mail.python.org/mailman/listinfo/python-list
Why does datetime.timedelta only have the attributes 'days' and 'seconds'?
Hi,

With Python 3.9.2 I get

$ import datetime
$ s = "1-00:01:01"
$ t = datetime.datetime.strptime(s, "%d-%H:%M:%S")
$ d = datetime.timedelta(days=t.day, hours=t.hour, minutes=t.minute,
                         seconds=t.second)
$ d.days
1
$ d.seconds
61
$ d.minutes
AttributeError: 'datetime.timedelta' object has no attribute 'minutes'

Is there a particular reason why there are no attributes 'minutes' and
'hours and the attribute 'seconds' encompasses is the entire fractional
day?

Cheers,

Loris

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Suggestion for Linux Distro (from PSA: Linux vulnerability)
Marco Sulla writes: > On Fri, 11 Mar 2022 at 19:10, Michael Torrie wrote: >> Both Debian stable and Ubuntu LTS state they have a five year support >> life cycle. > > Yes, but it seems that official security support in Debian ends after > three years: > > "Debian LTS is not handled by the Debian security team, but by a > separate group of volunteers and companies interested in making it a > success" > https://wiki.debian.org/LTS > > This is the only problem for me. I am not sure how different the two situations are. Ubuntu is presumably relying on the Debian security team as well as other volunteers and at least one company, namely Canonical. The sysadmins I know who are interested in long-term stability and avoiding unnecessary OS updates use Debian rather than Ubuntu, but that's maybe just my bubble. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: How to test input via subprocess.Popen with data from file
Dieter Maurer writes:

> Loris Bennett wrote at 2022-3-10 13:16 +0100:
>> I have a command which produces output like the following:
>>
>>   Job ID: 9431211
>>   Cluster: curta
>>   User/Group: build/staff
>>   State: COMPLETED (exit code 0)
>>   Nodes: 1
>>   Cores per node: 8
>>   CPU Utilized: 01:30:53
>>   CPU Efficiency: 83.63% of 01:48:40 core-walltime
>>   Job Wall-clock time: 00:13:35
>>   Memory Utilized: 6.45 GB
>>   Memory Efficiency: 80.68% of 8.00 GB
>>
>> I want to parse this and am using subprocess.Popen and accessing the
>> contents via Popen.stdout. However, for testing purposes I want to save
>> various possible outputs of the command as text files and use those as
>> inputs.
>
> What do you want to test? the parsing? the "popen" interaction?
> You can separately test both tasks (I, at your place, would do this).

I just want to test the parsing.

> For the parsing test, it is not relevant that the actual text
> comes from an external process. You can directly read it from a file
> or have it in your text.

As mentioned in the original post, for the tests I indeed want to read
the input from files.

> In my view, you do not need a test for the `Popen` interaction:
> if it works once, it will work always.

Sorry if I was unclear but my question is: Given that the return value
from Popen is a Popen object and given that the return value from
reading a file is a single string or maybe a list of strings, what
should the common format for the argument which is passed to the actual
parsing function be?

Cheers,

Loris

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list
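One conventional answer to that question (a sketch; the function name and the sample fields follow the output shown above, but are otherwise made up): make the parser take a plain string, and have each caller reduce its own source to a string first.

```python
def parse_job_summary(text):
    """Parse 'Key: value' lines from the command's output into a dict.

    partition() splits on the first colon only, so values such as
    '01:30:53' survive intact.
    """
    summary = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            summary[key.strip()] = value.strip()
    return summary

# In production (e.g. with subprocess.run(..., capture_output=True,
# text=True)):     text = completed.stdout
# In tests:        text = pathlib.Path("test_input_01.txt").read_text()
```

Both call sites then exercise exactly the same parsing code, and the tests never touch subprocess at all.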
Re: Suggestion for Linux Distro (from PSA: Linux vulnerability)
Marco Sulla writes: > On Thu, 10 Mar 2022 at 04:50, Michael Torrie wrote: >> >> On 3/9/22 13:05, Marco Sulla wrote: >> > So my laziness pays. I use only LTS distros, and I update only when >> > there are security updates. >> > PS: any suggestions for a new LTS distro? My Lubuntu is reaching its >> > end-of-life. I prefer lightweight debian-like distros. >> >> Maybe Debian itself? > > I tried Debian on a VM, but I found it too much basical. A little > example: it does not have the shortcut ctrl+alt+t to open a terminal > that Ubuntu has. I'm quite sure it's simple to add, but I'm starting > to be old and lazy... The shortcuts are properties of the desktop environment. You could just install LXDE/LXQt on Debian if that's what you're used to from Lubuntu. Of course, if you're too old and lazy to set up a shortcut, you might also be too old and lazy to install a different desktop environment ;-) Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
How to test input via subprocess.Popen with data from file
Hi,

I have a command which produces output like the following:

  Job ID: 9431211
  Cluster: curta
  User/Group: build/staff
  State: COMPLETED (exit code 0)
  Nodes: 1
  Cores per node: 8
  CPU Utilized: 01:30:53
  CPU Efficiency: 83.63% of 01:48:40 core-walltime
  Job Wall-clock time: 00:13:35
  Memory Utilized: 6.45 GB
  Memory Efficiency: 80.68% of 8.00 GB

I want to parse this and am using subprocess.Popen and accessing the
contents via Popen.stdout. However, for testing purposes I want to save
various possible outputs of the command as text files and use those as
inputs.

What format should I use to pass data to the actual parsing function?

I could in both production and test convert the entire input to a string
and pass the string to the parsing method. However, I could use
something like

  test_input_01 = subprocess.Popen(
      ["cat", "test_input_01.txt"],
      stdout=subprocess.PIPE,
  )

for the test input and then pass a Popen object to the parsing function.

Any comments on these alternatives or suggestions for doing something
completely different?

Cheers,

Loris

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data
Dennis Lee Bieber writes:

> On Tue, 01 Mar 2022 08:35:05 +0100, Loris Bennett
> declaimed the following:
>
>> Thanks for the various suggestions. The data I need to store is just a
>> dict with maybe 3 or 4 keys and short string values probably of less
>> than 32 characters each per event. The traffic on the DB is going to be
>> very low, creating maybe a dozen events a day, mainly triggered via a
>> command-line interface, although I will probably set up one or two cron
>> jobs, each of which might generate another 0 to maybe 5 records a day.
>>
>> I could go for JSON (or rather LONGSTRING, as JSON is just an alias for
>> LONGSTRING, but JSON is not available on the version of MariaDB I am
>> using). However, that seems like overkill, since I am never going to
>> have to store anything near 4 GB in the field. So I should probably in
>> fact just use say VARCHAR(255).
>>
>> WDYT?
>
> Having taken a few on-line short courses on database normalization and
> SQL during my first lay-off, my view would be to normalize everything
> first... Which, in your description, means putting that dictionary into a
> separate table of the form (I also tend to define an autoincrement primary
> key for all tables):
>
>   DICTDATA(*ID*, _eventID_, dictKey, dictValue)
>
> where * delimits primary key, _ delimits foreign key to parent (event?)
> record.

Ah, yes, you are right. That would indeed be the correct way to do it.
I'll look into that. Up to now I was thinking I would only ever want to
read out the dict in its entirety, but that's probably not correct.

> Caveat: While I have a book on SQLAlchemy, I confess it makes no sense to
> me -- I can code SQL joins faster than figuring out how to represent the
> same join in SQLAlchemy.

I currently can't code SQL joins fast anyway, so although doing it in
SQLAlchemy might be relatively slower, absolutely there's maybe not
going to be much difference :-)

Cheers,

Loris

-- 
This signature is currently under construction.
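The normalized layout suggested above can be illustrated with the stdlib sqlite3 rather than SQLAlchemy (a sketch; the table and column names are adapted from the thread, the sample data is made up): one row per key/value pair instead of a serialised dict column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_events (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        date TEXT NOT NULL,
        uid  TEXT NOT NULL
    );
    -- One row per key/value pair, instead of a serialised dict column.
    CREATE TABLE dict_data (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        event_id   INTEGER NOT NULL REFERENCES user_events(id),
        dict_key   TEXT NOT NULL,
        dict_value TEXT NOT NULL
    );
""")

event_id = conn.execute(
    "INSERT INTO user_events (date, uid) VALUES (?, ?)",
    ("2022-03-01", "ada")).lastrowid

info = {"reason": "quota exceeded", "lang": "en"}
conn.executemany(
    "INSERT INTO dict_data (event_id, dict_key, dict_value) "
    "VALUES (?, ?, ?)",
    [(event_id, k, v) for k, v in info.items()])

# Reassemble the dict for one event:
restored = dict(conn.execute(
    "SELECT dict_key, dict_value FROM dict_data WHERE event_id = ?",
    (event_id,)))
```

This keeps individual keys queryable, at the cost of a join whenever the whole dict is needed.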
-- https://mail.python.org/mailman/listinfo/python-list
Re: SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data
Robert Latest writes: > Loris Bennett wrote: >> Thanks for the various suggestions. The data I need to store is just a >> dict with maybe 3 or 4 keys and short string values probably of less >> than 32 characters each per event. The traffic on the DB is going to be >> very low, creating maybe a dozen events a day, mainly triggered via a >> command-line interface, although I will probably set up one or two cron >> jobs, each of which might generate another 0 to maybe 5 records a day. >> >> I could go for JSON (or rather LONGSTRING, as JSON is just an alias for >> LONGSTRING, but JSON is not available on the version of MariaDB I am >> using). However, that seems like overkill, since I am never going to >> have to store anything near 4 GB in the field. So I should probably in >> fact just use say VARCHAR(255). >> >> WDYT? > > Using TypeDecorator to transparently convert between a dict and its JSON > string > representation and MutableDict to track changes, you will get a completely > transparent attribute that works just like a dict. Make sure to check that the > generated JSON fits into your column width. I once got bitten by the fact that > VARCHAR(x) can hold only x/4 characters in utf8mb4 character set. Thanks for pointing out TypeDecorator - I wasn't aware of that. I won't need to track changes in the JSON data, because the events I am recording form an audit trail and so are written and read, but never modified. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data
Cameron Simpson writes:

> On 28Feb2022 10:11, Loris Bennett wrote:
>> I have an SQLAlchemy class for an event:
>>
>>   class UserEvent(Base):
>>       __tablename__ = "user_events"
>>
>>       id = Column('id', Integer, primary_key=True)
>>       date = Column('date', Date, nullable=False)
>>       uid = Column('gid', String(64), ForeignKey('users.uid'), nullable=False)
>>       info = ??
>>
>> The event may have arbitrary, but dict-like data associated with it,
>> which I want to add in the field 'info'. This data never needs to be
>> modified, once the event has been inserted into the DB.
>>
>> What type should the info field have? JSON, PickleType, String, or
>> something else?
>
> I would use JSON, it expresses dicts well provided the dicts contain
> only basic types (strings, numbers, other dicts/lists of basic types
> recursively).
>
> I have personal problems with pickle because non-Python code can't read
> it.
>
> Cheers,
> Cameron Simpson

Thanks for the various suggestions. The data I need to store is just a
dict with maybe 3 or 4 keys and short string values probably of less
than 32 characters each per event. The traffic on the DB is going to be
very low, creating maybe a dozen events a day, mainly triggered via a
command-line interface, although I will probably set up one or two cron
jobs, each of which might generate another 0 to maybe 5 records a day.

I could go for JSON (or rather LONGSTRING, as JSON is just an alias for
LONGSTRING, but JSON is not available on the version of MariaDB I am
using). However, that seems like overkill, since I am never going to
have to store anything near 4 GB in the field. So I should probably in
fact just use say VARCHAR(255).

WDYT?

Cheers,

Loris

-- 
https://mail.python.org/mailman/listinfo/python-list
SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data
Hi,

I have an SQLAlchemy class for an event:

  class UserEvent(Base):
      __tablename__ = "user_events"

      id = Column('id', Integer, primary_key=True)
      date = Column('date', Date, nullable=False)
      uid = Column('gid', String(64), ForeignKey('users.uid'), nullable=False)
      info = ??

The event may have arbitrary, but dict-like data associated with it,
which I want to add in the field 'info'. This data never needs to be
modified, once the event has been inserted into the DB.

What type should the info field have? JSON, PickleType, String, or
something else? I couldn't find any really reliable-sounding
information about the relative pros and cons, apart from a Reddit thread
claiming that pickled dicts are larger than dicts converted to JSON or
String.

Cheers,

Loris

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: When to use SQLAlchemy listen events
"Loris Bennett" writes:

> Hi,
>
> I am wondering whether SQLAlchemy listen events are appropriate for the
> following situation:
>
> I have a table containing users and a table for events related to users
>
>   class User(Base):
>       __tablename__ = "users"
>
>       uid = Column('uid', String(64), primary_key=True)
>       gid = Column('gid', String(64), ForeignKey('groups.gid'),
>                    nullable=False)
>       lang = Column('lang', String(2))
>
>   class UserEvent(Base):
>       __tablename__ = "user_events"
>
>       id = Column('id', Integer, primary_key=True)
>       date = Column('date', Date, nullable=False)
>       uid = Column('gid', String(64), ForeignKey('users.uid'), nullable=False)
>       comment = Column('comment', String(256))
>
> (There are also analogous tables for groups and group events).
>
> The functions provided by the interface are things like the following
>
>   add_user(user, group, lang)
>   move_user(user, group)
>   delete_user(user)
>   warn_user(user, reason)
>
> Whereas 'add/move/delete_user' result in changes to the table 'users',
> 'warn_user' does not. All should produce entries in the table
> 'user_events'.
>
> There could be more functions similar to 'warn_user' that only create an
> entry in 'user_events'. Potentially there could be a lot more of these
> than the 'user'-table-changing type.
>
> It seems like for the first three functions, capturing the resulting
> database changes in the table 'user_events' would be a standard use-case
> for a listen event. However, the 'warn_user' function is different.
>
> So can/should I shoehorn the 'warn_user' function to being like the
> other three and use listen events, or should I just come up with my own
> mechanism which will allow any function just to add an entry to the
> events table?

So I just ended up writing my own decorator. That seems more
appropriate and flexible in this instance.

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list
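A minimal sketch of such a decorator (the in-memory EVENTS list and the comment template are stand-ins for the real SQLAlchemy insert into 'user_events'):

```python
import functools
from datetime import date

EVENTS = []  # stand-in for the user_events table

def records_event(comment_template):
    """Append an audit entry after the wrapped function succeeds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(user, *args, **kwargs):
            result = func(user, *args, **kwargs)
            EVENTS.append({"date": date.today(), "uid": user,
                           "comment": comment_template.format(user=user)})
            return result
        return wrapper
    return decorator

@records_event("warned user {user}")
def warn_user(user, reason):
    pass  # send the warning; no change to the 'users' table

@records_event("added user {user}")
def add_user(user, group, lang):
    pass  # insert into the 'users' table
```

The same decorator then covers both the 'users'-table-changing functions and the 'warn_user'-style ones, which is exactly what listen events cannot do for functions that touch no table.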
When to use SQLAlchemy listen events
Hi,

I am wondering whether SQLAlchemy listen events are appropriate for the
following situation:

I have a table containing users and a table for events related to users

  class User(Base):
      __tablename__ = "users"

      uid = Column('uid', String(64), primary_key=True)
      gid = Column('gid', String(64), ForeignKey('groups.gid'),
                   nullable=False)
      lang = Column('lang', String(2))

  class UserEvent(Base):
      __tablename__ = "user_events"

      id = Column('id', Integer, primary_key=True)
      date = Column('date', Date, nullable=False)
      uid = Column('gid', String(64), ForeignKey('users.uid'), nullable=False)
      comment = Column('comment', String(256))

(There are also analogous tables for groups and group events).

The functions provided by the interface are things like the following

  add_user(user, group, lang)
  move_user(user, group)
  delete_user(user)
  warn_user(user, reason)

Whereas 'add/move/delete_user' result in changes to the table 'users',
'warn_user' does not. All should produce entries in the table
'user_events'.

There could be more functions similar to 'warn_user' that only create an
entry in 'user_events'. Potentially there could be a lot more of these
than the 'user'-table-changing type.

It seems like for the first three functions, capturing the resulting
database changes in the table 'user_events' would be a standard use-case
for a listen event. However, the 'warn_user' function is different.

So can/should I shoehorn the 'warn_user' function to being like the
other three and use listen events, or should I just come up with my own
mechanism which will allow any function just to add an entry to the
events table?

Cheers,

Loris

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Abstraction level at which to create SQLAlchemy ORM object
Hi Cameron, Cameron Simpson writes: > On 10Feb2022 14:14, Loris Bennett wrote: >>I am writing a command line program which will modify entries in a >>database and am trying out SQLAlchemy. >> >>A typical command might look like >> >> um --operation add --uid ada --gid coders --lang en >> >>Parsing the arguments I get, ignoring the operation, a dict >> >> {uid: "ada", gid: "coders", lang: "en"} >> >>At some point this needs to be converted into an object of the class User: >> >> class User(Base): >> __tablename__ = "users" >> >> uid = Column('uid', String, primary_key=True) >> gid = Column('gid', String) >> lang = Column('lang', String) >> >>In a way it seems it would be economical to do the conversion as early >>as possible, so I can just pass around User objects. However, this >>would mean that the entry point for the program would already be tightly >>coupled to the specifics of the database interaction. >> >>On the other hand, delaying the conversion would mean probably having to >>define my own User class, which seems like unnecessary overhead. It >>would have the advantage that I could ditch SQLAlchemy more easily if I >>find it too mind-bending. > > If the entire persistent state of the user lives in the db I'd just > define the User ORM type and give it whatever methods you need. So > exactly what you've got above. > > It is close to the db, but if you only interact via the methods and the > core attributes/columns that should be mostly irrelevant to you. > > If you're concerned about switching backends, maybe define an > AbstractUser abstract class with the required methods. Then you can at > least ensure method coverage if you make another backend: > > class AbstractUser(ABC): > @abstractmethod > def some_user_method(self,...): > > > class SQLAUser(Base, AbstractUser): > ... your SQLA ORM User class above ... > > User = SQLAUser > > ... everything else just talks about user ... 
> > But you can do all of that _later_, only needed if you decide to change > backends in a controlled manner. Thanks for reminding me about abstract classes, but more importantly that I can do this kind of stuff later. Cheers, Loris -- https://mail.python.org/mailman/listinfo/python-list
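Cameron's AbstractUser suggestion can be sketched with the stdlib abc module. DictUser here is a made-up stand-in for the SQLAlchemy-backed class, just to show the contract being enforced:

```python
from abc import ABC, abstractmethod

class AbstractUser(ABC):
    """Backend-independent interface; concrete backends must implement it."""
    @abstractmethod
    def greeting(self):
        ...

# Hypothetical concrete backend standing in for the SQLAlchemy ORM class.
class DictUser(AbstractUser):
    def __init__(self, uid):
        self.uid = uid

    def greeting(self):
        return f"Hello, {self.uid}"

user = DictUser("ada")
```

Instantiating AbstractUser directly raises TypeError, which is exactly the "ensure method coverage" guarantee mentioned above.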
Abstraction level at which to create SQLAlchemy ORM object
Hi, I am writing a command line program which will modify entries in a database and am trying out SQLAlchemy. A typical command might look like um --operation add --uid ada --gid coders --lang en Parsing the arguments I get, ignoring the operation, a dict {uid: "ada", gid: "coders", lang: "en"} At some point this needs to be converted into an object of the class User: class User(Base): __tablename__ = "users" uid = Column('uid', String, primary_key=True) gid = Column('gid', String) lang = Column('lang', String) In a way it seems it would be economical to do the conversion as early as possible, so I can just pass around User objects. However, this would mean that the entry point for the program would already be tightly coupled to the specifics of the database interaction. On the other hand, delaying the conversion would probably mean having to define my own User class, which seems like unnecessary overhead. It would have the advantage that I could ditch SQLAlchemy more easily if I find it too mind-bending. WDYT? Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
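A hedged sketch of the "early conversion" option, using a plain dataclass instead of the ORM class so the entry point stays decoupled from SQLAlchemy; the dict is the one argparse would produce:

```python
from dataclasses import dataclass

# Plain stand-in for the ORM User class (illustrative, not SQLAlchemy).
@dataclass
class User:
    uid: str
    gid: str
    lang: str

# The dict produced from the parsed command-line arguments
# maps straight onto the class via keyword expansion.
parsed = {"uid": "ada", "gid": "coders", "lang": "en"}
user = User(**parsed)
```

Since the ORM class takes the same keyword arguments, switching later from the dataclass to the SQLAlchemy class would leave this call site unchanged.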
Re: SQLAlchemy: When to initialise a session
"Loris Bennett" writes: > Hi, > > I am writing a fairly simple command-line application which will just > add or delete an entry in a database and then generate a corresponding > email. > > I am using SQLAlchemy to wrap a class around a database and have > > class DatebaseWrapper(): > """Encapsulation of the database""" > > def __init__(self, url): > self.engine = create_engine(url) > > Should I extend the initialisation to > > def __init__(self, url): > self.engine = create_engine(url) > self.session = sessionmaker(self.engine) > > since each there will be only one session per call of the program? > > Or, since I am writing the database wrapper as its own module for > possible reuse, should the program using the wrapper class > initialise the session itself? Turns out this is all explained here: https://docs.sqlalchemy.org/en/14/orm/session_basics.html#session-frequently-asked-questions Sorry for the noise. -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
SQLAlchemy: When to initialise a session
Hi, I am writing a fairly simple command-line application which will just add or delete an entry in a database and then generate a corresponding email. I am using SQLAlchemy to wrap a class around a database and have class DatabaseWrapper(): """Encapsulation of the database""" def __init__(self, url): self.engine = create_engine(url) Should I extend the initialisation to def __init__(self, url): self.engine = create_engine(url) self.session = sessionmaker(self.engine) since there will be only one session per call of the program? Or, since I am writing the database wrapper as its own module for possible reuse, should the program using the wrapper class initialise the session itself? Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
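Not SQLAlchemy, but the one-connection-per-run shape of the wrapper can be illustrated with the stdlib sqlite3 module standing in for engine plus session. The class name follows the post; the table and method are invented for the example:

```python
import sqlite3

class DatabaseWrapper:
    """Stdlib stand-in for the SQLAlchemy wrapper: one connection
    (the analogue of one session) created per program run."""
    def __init__(self, url):
        self.connection = sqlite3.connect(url)

    def add_entry(self, uid):
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS users (uid TEXT PRIMARY KEY)")
        self.connection.execute("INSERT INTO users VALUES (?)", (uid,))
        self.connection.commit()

db = DatabaseWrapper(":memory:")
db.add_entry("ada")
rows = db.connection.execute("SELECT uid FROM users").fetchall()
```

Whether the wrapper or the caller owns the connection is the same design question the SQLAlchemy session FAQ linked above addresses.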
Re: How to package a Python command line app?
Hi Manfred, Manfred Lotz writes: > Hi Loris, > > On Wed, 08 Dec 2021 15:38:48 +0100 > "Loris Bennett" wrote: > >> Hi Manfred, >> >> Manfred Lotz writes: >> >> > There are many possibilities to package a Python app, and I have to >> > admit I am pretty confused. >> > >> > Here is what I have: >> > >> > A Python command line app which requires some packages which are >> > not in the standard library. >> > >> > I am on Linux and would like to have an executable (perhaps a zip file >> > with a shebang; whatever) which runs on different Linux systems. >> > >> > Different means >> > - perhaps different glibc versions >> > - perhaps different Python versions >> > >> > In my specific case this is: >> > - RedHat 8.4 with Python 3.6.8 >> > - Ubuntu 20.04 LTS with Python 3.8.10 >> > - and finally Fedora 33 with Python 3.9.9 >> > >> > >> > Is this possible to do? If yes, which tool would I use for this? >> >> I use poetry[1] on CentOS 7 to handle all the dependencies and create >> a wheel which I then install to a custom directory with pip3. >> >> You would check out the repository with your code on the target system, >> start a poetry shell using the Python version required, and then build >> the wheel. From outside the poetry shell you can set PYTHONUSERBASE >> and then install with pip3. You then just need to set PYTHONPATH >> appropriately wherever you want to use your software. >> > > In my case it could happen that I do not have access to the target > system but want to hand over the Python app to somebody else. This > person just wants to run it. For whatever reason, there does not seem to be much focus on this kind of deployment for Python. Similar to the way things are with Perl, the assumption seems to be that you have a working environment and install any dependencies needed to get the program you have been given working. 
I have never used it, but you might want to look at something like pyinstaller: https://pyinstaller.readthedocs.io However, it looks as if it is aimed towards bundling a single script. I don't know how it would work if you have a more complex program consisting of a number of modules. >> Different Python versions shouldn't be a problem. If some module >> depends on a specific glibc version, then you might end up in standard >> dependency-hell territory, but you can pin module versions of >> dependencies in poetry, and you could also possibly use different >> branches within your repository to handle that. >> > > I try to avoid using modules which depend on a specific glibc. So would I, but you mentioned it above in your definition of 'different'. > Although it seems that it doesn't really help for my use case, I will > play with poetry to get a better understanding of its capabilities. You're right, poetry doesn't seem to address your main problem. Nevertheless, it might be useful for developing your program before you get to the question of how to distribute it. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: How to package a Python command line app?
Hi Manfred, Manfred Lotz writes: > There are many possibilities to package a Python app, and I have to admit > I am pretty confused. > > Here is what I have: > > A Python command line app which requires some packages which are not in > the standard library. > > I am on Linux and would like to have an executable (perhaps a zip file with a > shebang; whatever) which runs on different Linux systems. > > Different means > - perhaps different glibc versions > - perhaps different Python versions > > In my specific case this is: > - RedHat 8.4 with Python 3.6.8 > - Ubuntu 20.04 LTS with Python 3.8.10 > - and finally Fedora 33 with Python 3.9.9 > > > Is this possible to do? If yes, which tool would I use for this? I use poetry[1] on CentOS 7 to handle all the dependencies and create a wheel which I then install to a custom directory with pip3. You would check out the repository with your code on the target system, start a poetry shell using the Python version required, and then build the wheel. From outside the poetry shell you can set PYTHONUSERBASE and then install with pip3. You then just need to set PYTHONPATH appropriately wherever you want to use your software. Different Python versions shouldn't be a problem. If some module depends on a specific glibc version, then you might end up in standard dependency-hell territory, but you can pin module versions of dependencies in poetry, and you could also possibly use different branches within your repository to handle that. HTH Loris Footnotes: [1] https://python-poetry.org -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Alternatives to Jupyter Notebook
Martin Schöön writes: > On 2021-10-20, Shaozhong SHI wrote: >> >> My Jupyter notebook becomes unresponsive in browsers. >> > Odd, I never had any problems like that. I use Firefox on Linux. > >> Are there alternatives to read, edit and run Jupyter Notebook? >> > I know some people use emacs orgmode. I have never tried it > myself and do not know how well it works. I don't know Jupyter well enough to know whether Org mode is a real alternative. However, with Org mode you can combine text and code in multiple languages, run the code within Emacs and then export the results in various formats such as PDF and HTML. I use it to collate monthly statistics by running shell scripts on a remote server to generate data in tables within the Org file, then running R code to generate plots from the data. I finally export the whole report to a PDF. This is all done within a single local Emacs instance. YMMV. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
argparse - 3rd arg only valid for one of two mutually exclusive args?
Hi, With argparse's add_mutually_exclusive_group() I can add mutually exclusive args, but how do I deal with a 3rd arg which only makes sense for one of the mutually exclusive args? More generally, I suppose I am interested in having something like [ --foo (--foobar) | --bar (--barfoo) ] if that makes it any clearer. I have seen subcommands suggested as a way of doing this, but that doesn't seem like a very good fit for my use-case. Is there a more argument-oriented alternative? Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
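argparse has no built-in way to express "only valid with", but a common workaround is to validate the combination after parsing and report violations via parser.error(). A sketch using the option names from the post (--barfoo omitted for brevity):

```python
import argparse

parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument("--foo", action="store_true")
group.add_argument("--bar", action="store_true")
parser.add_argument("--foobar", action="store_true")

def parse(argv):
    args = parser.parse_args(argv)
    # Check the dependent option after parsing and reuse the
    # parser's own error reporting (usage message plus exit).
    if args.foobar and not args.foo:
        parser.error("--foobar only makes sense together with --foo")
    return args

args = parse(["--foo", "--foobar"])
```

Calling parse(["--bar", "--foobar"]) then fails with the same usage-and-exit behaviour as any other argparse error, which keeps the interface argument-oriented rather than forcing subcommands.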
Re: Making command-line args available to deeply-nested functions
George Fischhof writes: > George Fischhof wrote (on 2021 Aug 29, Sun, > 21:27): > >> >> >> Loris Bennett wrote (on 2021 Aug >> 26, Thu, 16:02): >> >>> George Fischhof writes: >>> >>> [snip (79 lines)] >>> >>> >> > Hi, >>> >> > >>> >> > Also, you can give the click and / or typer packages a try. >>> >> > Putting args into environment variables can be a solution too >>> >> > All of these depend on several things: personal preferences, >>> colleagues >>> >> / >>> >> > firm standards, the program, readability, variable accessibility (IDE >>> >> > support, auto completion) (env vars not supported by IDEs as they >>> are >>> >> not >>> >> > part of code) >>> >> >>> >> Thanks for the pointers, although I have only just got my head around >>> >> argparse/configargparse, so click is something I might have a look at >>> >> for a future project. >>> >> >>> >> However, the question of how to parse the arguments is somewhat >>> separate >>> >> from that of how to pass (or not pass) the arguments around within a >>> >> program. >>> >>> [snip (16 lines)] >>> > >>> > Hi, >>> > I thought not just parsing, but the usage method: you add a decorator to >>> > the function where you want to use the parameters. This way you do not >>> have >>> > to pass the value through the calling hierarchy. >>> > >>> > Note: typer is a newer package; it contains click and leverages command >>> > line parsing even more. >>> >>> Do you have an example of how this is done? From a cursory reading of >>> the documentation, it didn't seem obvious to me how to do this, but then >>> I don't have much understanding of how decorators work. >>> >>> Cheers, >>> >>> Loris >>> >>> >>> -- >>> This signature is currently under construction. 
>>> -- >>> https://mail.python.org/mailman/listinfo/python-list >> >> >> Hi, >> >> I will create some sample code on Monday - Tuesday >> >> BR, >> George >> > > > Hi, > > here is the program ;-) (see below) > typer does not use decorators; to solve this problem they advise using > click's decorators, mixing typer and click. > In practice I prefer not to mix them; also, the easiest way to do > this is only available in the latest click, which is not supported in typer. > > So I created all the stuff in click; 8.x should be used > > BR, > George > > > import click > > > # read command line parameters > @click.command() > @click.option('--level_1', help='Level 1') > @click.option('--level_2', help='Level 2') > def main(level_1, level_2): > # put command line parameters into global context > ctx = click.get_current_context() > ctx.meta['level_1'] = level_1 > ctx.meta['level_2'] = level_2 > > level_1_function() > > > # pass / inject level_1 parameter to this function > @click.decorators.pass_meta_key('level_1') > def level_1_function(level_1): > print(f'level 1 variable: {level_1}') > level_2_function() > > > # pass / inject level_2 parameter to this function > @click.decorators.pass_meta_key('level_2') > def level_2_function(level_2): > print(f'level 2 variable: {level_2}') > > > if __name__ == "__main__": > main() Thanks for the example - that's very interesting. However, after a bit of reflection I think I am going to stick to explicit argument passing, so that I can have more generic modules that can be used by other programs. I'll then just encapsulate the argument parsing in a single function corresponding to the command line tool. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
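For comparison, the explicit-passing style Loris settles on might look like this stdlib-only sketch, with a small options object handed down the call chain instead of click's context (all names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class Options:
    """Container for the parsed command-line parameters."""
    level_1: str
    level_2: str

def level_2_function(opts):
    return f"level 2 variable: {opts.level_2}"

def level_1_function(opts):
    # The options object is handed down explicitly at each level,
    # so the functions stay usable from any program, not just this CLI.
    return [f"level 1 variable: {opts.level_1}", level_2_function(opts)]

messages = level_1_function(Options(level_1="a", level_2="b"))
```

The cost is one extra parameter per function; the gain is that nothing in these modules depends on a particular argument-parsing library.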
Re: Trouble propagating logging configuration
"Dieter Maurer" writes: > Loris Bennett wrote at 2021-9-1 13:48 +0200: >> ... >>Yes, but to quote from >>https://docs.python.org/3.6/howto/logging.html#logging-basic-tutorial: >> >> A good convention to use when naming loggers is to use a module-level >> logger, in each module which uses logging, named as follows: >> >>logger = logging.getLogger(__name__) >> >> This means that logger names track the package/module hierarchy, and >> it’s intuitively obvious where events are logged just from the logger >> name. >> >>so in this case the source layout is relevant, isn't it? > > Relevant in this case is the package/module hierarchy. > > Often the package/module hierarchy follows the source layout > **BUT** this is not necessarily the case. > > In particular, the "start" module of a script is called `__main__` > indepently of its location. > Furthermore, so called "namespace packages" consist of > decentralized (i.e. located at different places in the file system) > subpackages. > > Thus, in general, the connection between pachage/module hierarchy > and the source layout is loose. > > >>> Furthermore, the place of the configuration (and where in the >>> code it is activated) is completely irrelevant for the "inheritance". >> >>OK, so one issue is that I was getting confused by the *order* in which >>modules are being called. If I have two modules, 'foo' and 'bar', in >>the same directory, configure the logging just in 'foo' and then call >> >> >> foo.some_method() >> bar.some_method() >> >>then both methods will be logged. If I do >> >> bar.some_method() >> foo.some_method() >> >>then only the method in 'foo' will be logged. > > Usually, log configuration is considered a (global) application > (not a (local) package/module) concern: > The components (modules) decide what to log at what level > and global configuration decides what to do with those messages. 
> > Thus, typically, you configure the complete logging in > your main application module and then start to import modules > and call their functions. > >> ... >>If I have >> >> [loggers] >> keys=root,main,log_test >> >>in my logging configuration and initialise the logging in run.py ... > > This logging configuration is obviously not complete (thus, I cannot > check what happens in your case). > You may have made errors at other places in your configuration > which may explain your observations. > > In your place, I would look at the `logging` source code > to find out where you can see (in an interactive Python session) > which loggers have been created by your configuration and > how they are configured. For the last part you use "vars(logger)". Thanks Peter and Dieter for all the help. I have finally figured out what my problem was. If in a module 'mylibs.mylib' I have import logging logger = logging.getLogger(__name__) and in my main script have import logging import logging.config import mylibs.mylib logging.config.fileConfig("/home/loris/config") logger = logging.getLogger(__name__) then 'logger' in 'mylibs.mylib' is defined before 'logging.config.fileConfig' is called and the logger in the module is not configured. If I change the order and write import logging import logging.config logging.config.fileConfig("/home/loris/config") logger = logging.getLogger(__name__) import mylibs.mylib then the 'logger' in 'mylibs.mylib' does get configured properly. I'm still thinking about what implications this has if I want to load a configuration file via a command-line option. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
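A likely contributing factor, not stated in the thread and so offered here as an assumption: fileConfig defaults to disable_existing_loggers=True, which disables any logger created before the configuration call. A small sketch with dictConfig, which takes the same flag:

```python
import logging
import logging.config

# Logger created at import time of a module, *before* configuration
# runs -- the situation described in the post.
early = logging.getLogger("mylibs.mylib")

logging.config.dictConfig({
    "version": 1,
    # With the default of True (fileConfig behaves the same way),
    # 'early' would be disabled because it predates this call.
    "disable_existing_loggers": False,
    "root": {"level": "INFO"},
})
```

With this flag set to False, the import order no longer matters, which would also make it easier to load the configuration file from a command-line option after the imports have run.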
Re: Trouble propagating logging configuration
"Dieter Maurer" writes: > Loris Bennett wrote at 2021-8-31 15:25 +0200: >>I am having difficulty getting the my logging configuration passed on >>to imported modules. >> >>My initial structure was as follows: >> >> $ tree blorp/ >> blorp/ >> |-- blorp >> | |-- __init__.py >> | |-- bar.py >> | |-- foo.py >> | `-- main.py >> `-- pyproject.toml >> >>whereby the logging configuration is done in main.py. >> >>After thinking about it, I decided maybe the inheritance wasn't working >>because main.py is in the same directory as the other files. > > Should you speak about Python's `logging` module, then > the "inheritance" does not depend on the source layout. > Instead, it is based on the hierarchy of dotted names. > It is completely up to you which dotted names you are using > in your `getLogger` calls. Yes, but to quote from https://docs.python.org/3.6/howto/logging.html#logging-basic-tutorial: A good convention to use when naming loggers is to use a module-level logger, in each module which uses logging, named as follows: logger = logging.getLogger(__name__) This means that logger names track the package/module hierarchy, and it’s intuitively obvious where events are logged just from the logger name. so in this case the source layout is relevant, isn't it? > Furthermore, the place of the configuration (and where in the > code it is activated) is completely irrelevant for the "inheritance". OK, so one issue is that I was getting confused by the *order* in which modules are being called. If I have two modules, 'foo' and 'bar', in the same directory, configure the logging just in 'foo' and then call foo.some_method() bar.some_method() then both methods will be logged. If I do bar.some_method() foo.some_method() then only the method in 'foo' will be logged. However, I still have the following problem. With the structure $ tree . . 
|-- log_test | |-- __init__.py | |-- first.py | `-- second.py |-- pyproject.toml |-- README.rst |-- run.py `-- tests |-- __init__.py |-- config `-- test_log_test.py I have __name__ variables as follows: __file__: /home/loris/log_test/log_test/first.py, __name__: log_test.first __file__: /home/loris/log_test/log_test/second.py, __name__: log_test.second __file__: ./run.py, __name__: __main__ If I have [loggers] keys=root,main,log_test in my logging configuration and initialise the logging in run.py with logging.config.fileConfig("/home/loris/log_test/tests/config") logger = logging.getLogger() or logging.config.fileConfig("/home/loris/log_test/tests/config") logger = logging.getLogger("log_test") then only calls in 'run.py' are logged. I can obviously initialise the logging within the subordinate package, i.e. in 'log_test/__init__.py', but that seems wrong to me. So what is the correct way to initialise logging from a top-level script such that logging is activated in all modules requested in the logging configuration? > For details, read the Python documentation for the `logging` module. If they were sufficient, I wouldn't need the newsgroup :-) Thanks for the help, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
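If the goal is for module loggers like 'log_test.first' to be covered by configuring the parent 'log_test' logger, the dotted-name hierarchy does that on its own. A minimal sketch using dictConfig; the fileConfig-style setup in the post should behave analogously, assuming the 'log_test' entry maps to a qualname of 'log_test':

```python
import logging
import logging.config

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    # Configure only the package-level logger...
    "loggers": {"log_test": {"level": "DEBUG", "handlers": ["console"]}},
})

# ...and module-level loggers pick it up via the dotted name.
child = logging.getLogger("log_test.first")
```

No entry for 'log_test.first' is needed anywhere: the child has no level of its own, so it inherits DEBUG from 'log_test' and propagates records to its handler.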
Re: Trouble propagating logging configuration
"Loris Bennett" writes: > Hi, > > I am having difficulty getting the my logging configuration passed on > to imported modules. > > My initial structure was as follows: > > $ tree blorp/ > blorp/ > |-- blorp > | |-- __init__.py > | |-- bar.py > | |-- foo.py > | `-- main.py > `-- pyproject.toml > > whereby the logging configuration is done in main.py. > > After thinking about it, I decided maybe the inheritance wasn't working > because main.py is in the same directory as the other files. So I > changed the structure to > > $ tree blorp/ > blorp/ > |-- blorp > | |-- __init__.py > | |-- bar.py > | `-- foo.py > |-- main.py > `-- pyproject.toml > > but the logging configuration still is not propagated. > > Can anyone at least confirm that moving main.py to the directory above > the other files is the correct thing to do and thus the problem is being > caused by something else? I should mention that I am using poetry and thus the program is called via an entry in the pyproject.toml file such as [tool.poetry.scripts] blorp_out = "main:blorp_out" I have a suspicion that this way of calling the program somehow interferes with the inheritance mechanism used by logging. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Trouble propagating logging configuration
Hi, I am having difficulty getting my logging configuration passed on to imported modules. My initial structure was as follows: $ tree blorp/ blorp/ |-- blorp | |-- __init__.py | |-- bar.py | |-- foo.py | `-- main.py `-- pyproject.toml whereby the logging configuration is done in main.py. After thinking about it, I decided maybe the inheritance wasn't working because main.py is in the same directory as the other files. So I changed the structure to $ tree blorp/ blorp/ |-- blorp | |-- __init__.py | |-- bar.py | `-- foo.py |-- main.py `-- pyproject.toml but the logging configuration still is not propagated. Can anyone at least confirm that moving main.py to the directory above the other files is the correct thing to do and thus the problem is being caused by something else? Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: configargparse - reading option from config only
Richard Damon writes: > On 8/27/21 3:37 AM, Loris Bennett wrote: >> Richard Damon writes: >> >>> On 8/26/21 6:01 AM, Loris Bennett wrote: >>>> Hi, >>>> >>>> When using configargparse, it seems that if a value is to be read from a >>>> config file, it also has to be defined as a command-line argument in >>>> order to turn up as an attribute in the parser namespace. >>>> >>>> I can sort of see why this is the case, but there are also some options >>>> I would like to read just from the config file and not have them >>>> available as command-line options. This would be, say, to prevent the >>>> number of options on the command-line from becoming bloated by >>>> little-used settings. >>>> >>>> Is there an elegant way to do this? >>>> >>>> Cheers, >>>> >>>> Loris >>>> >>> Look at the read() member function to supply the file name to read. Then >>> in the config object there will be sections for each section in the >>> config file. No need for any of these to be 'options'. >> Do you have a link for this? As far as I can see, the config files are >> given in the following manner: >> >> p = >> configargparse.ArgParser(default_config_files=['/etc/app/conf.d/*.conf', >> '~/.my_settings']) >> >> I can obviously just read the config file with configparser, but the >> idea of configargparse is that an option can be specified as an option, >> in a config file, or as an environment variable. >> >> As far as I can tell, configargparse only loads entries from the config >> file into the appropriate namespace if they have also been defined as >> long options (i.e. with '--'). I was hoping to access *all* the config >> file entries, regardless of whether they are also options, since the >> config is obviously being read. >> >> Cheers, >> >> Loris >> > I misread your question; I thought you were talking about configparser. > > The question is: if configargparse doesn't do what you want, then it isn't > the right tool. 
> > It looks like configargparse is SPECIFICALLY designed to allow the user > to use a file as a shorthand to present command line arguments. The > whole parsing structure is based on an enumerated set of options; if > that isn't what you have, it is the wrong tool. I am not sure what you mean by using a file as a shorthand to present command line arguments, since the command-line arguments are defined by calling the 'add_argument' method of the configargparse object. However, I agree with your analysis that configargparse is the wrong tool for what I want to do. I like the idea that a variable can be defined as a command-line option, an entry in a config file, or as an environment variable. However, when I think about it, it seems that command-line options are essentially different from parameters in a configuration file, not least because the former need some sort of description for the output of '--help', whereas the latter do not. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
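For settings that should only ever live in the config file, the plain stdlib route looks like this; the section and key names here are made up for illustration:

```python
import configparser

# Config-only settings: never exposed as command-line options,
# so they add nothing to the '--help' output.
config = configparser.ConfigParser()
config.read_string("""
[mail]
from = mailbot@example.org

[limits]
max_users = 10
""")

sender = config["mail"]["from"]
max_users = config.getint("limits", "max_users")
```

In a real program config.read("/etc/app/conf.d/app.conf") would replace read_string; every entry in every section is then accessible, whether or not it corresponds to an option.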
Re: configargparse - reading option from config only
tuxifreund writes: > Hello, > > you could use the argparse module[1] to parse command line arguments and > configparser[2] to parse configuration files following the ini file > structure. Is that what you are expecting? I have used the combination of argparse and configparser before. However, I was hoping to just use configargparse instead. Cheers, Loris > > Cheers > > > [1]: https://docs.python.org/3/library/argparse.html > [2]: https://docs.python.org/3/library/configparser.html -- Dr. Loris Bennett (Hr./Mr.) ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de -- https://mail.python.org/mailman/listinfo/python-list
Re: configargparse - reading option from config only
Richard Damon writes: > On 8/26/21 6:01 AM, Loris Bennett wrote: >> Hi, >> >> When using configargparse, it seems that if a value is to be read from a >> config file, it also has to be defined as a command-line argument in >> order to turn up as an attribute in the parser namespace. >> >> I can sort of see why this is the case, but there are also some options >> I would like to read just from the config file and not have them >> available as command-line options. This would be, say, to prevent the >> number of options on the command-line from becoming bloated by >> little-used settings. >> >> Is there an elegant way to do this? >> >> Cheers, >> >> Loris >> > Look at the read() member function to supply the file name to read. Then > in the config object there will be sections for each section in the > config file. No need for any of these to be 'options'. Do you have a link for this? As far as I can see, the config files are given in the following manner: p = configargparse.ArgParser(default_config_files=['/etc/app/conf.d/*.conf', '~/.my_settings']) I can obviously just read the config file with configparser, but the idea of configargparse is that an option can be specified as an option, in a config file, or as an environment variable. As far as I can tell, configargparse only loads entries from the config file into the appropriate namespace if they have also been defined as long options (i.e. with '--'). I was hoping to access *all* the config file entries, regardless of whether they are also options, since the config is obviously being read. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
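One hedged way to get both behaviours with stdlib modules only: let configparser supply defaults for the few real command-line options via set_defaults(), while everything else stays config-only. The section names here are illustrative, not part of any library convention:

```python
import argparse
import configparser

config = configparser.ConfigParser()
config.read_string("""
[defaults]
lang = en

[internal]
retries = 3
""")

parser = argparse.ArgumentParser()
parser.add_argument("--lang")
# The config file supplies the default; the command line can override it.
parser.set_defaults(**config["defaults"])
args = parser.parse_args([])

# Entries in other sections never become options, but remain accessible.
retries = config.getint("internal", "retries")
```

Running with --lang de would override the config-supplied default, while 'retries' never bloats the option list.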
configargparse - reading option from config only
Hi, When using configargparse, it seems that if a value is to be read from a config file, it also has to be defined as a command-line argument in order to turn up as an attribute in the parser namespace. I can sort of see why this is the case, but there are also some options I would like to read just from the config file and not have them available as command-line options. This would be, say, to prevent the number of options on the command-line from becoming bloated by little-used settings. Is there an elegant way to do this? Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
Re: Making command-line args available to deeply-nested functions
George Fischhof writes: [snip (79 lines)] >> > Hi, >> > >> > Also, you can give the click and / or typer packages a try. >> > Putting args into environment variables can be a solution too >> > All of these depend on several things: personal preferences, colleagues >> / >> > firm standards, the program, readability, variable accessibility (IDE >> > support, auto completion) (env vars not supported by IDEs as they are >> not >> > part of code) >> >> Thanks for the pointers, although I have only just got my head around >> argparse/configargparse, so click is something I might have a look at >> for a future project. >> >> However, the question of how to parse the arguments is somewhat separate >> from that of how to pass (or not pass) the arguments around within a >> program. [snip (16 lines)] > > Hi, > I thought not just parsing, but the usage method: you add a decorator to > the function where you want to use the parameters. This way you do not have > to pass the value through the calling hierarchy. > > Note: typer is a newer package; it contains click and leverages command > line parsing even more. Do you have an example of how this is done? From a cursory reading of the documentation, it didn't seem obvious to me how to do this, but then I don't have much understanding of how decorators work. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list
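Not the click mechanism itself, but a toy illustration of the decorator idea George describes: a decorator that injects a value, so callers never pass it down the hierarchy explicitly. All names here are invented for the example:

```python
import functools

# Stand-in for parsed command-line arguments stored in one place.
SETTINGS = {"lang": "en"}

def inject(key):
    """Pass SETTINGS[key] as the first argument of the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(SETTINGS[key], *args, **kwargs)
        return wrapper
    return decorator

@inject("lang")
def greet(lang, name):
    return f"[{lang}] Hello, {name}"

# Callers supply only 'name'; 'lang' is injected by the decorator.
message = greet("Loris")
```

click's pass_meta_key works on the same principle, except that the stored values live in the click context's meta mapping rather than in a module-level dict.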
Re: Decoding of EmailMessage text
jak writes: > On 23/08/2021 13:12, Loris Bennett wrote: >> Jon Ribbens writes: >> >>> On 2021-08-23, Loris Bennett wrote: >>>> If instead of >>>> >>>>mail.set_content(body) >>>> >>>> I do >>>> >>>>mail.set_content(body, cte="quoted-printable") >>> >>> Try print(mail.get_content()) rather than print(mail.as_string()) >> >> That did the trick - thanks! >> >> Cheers, >> >> Loris >> > > > If you also want to know about the text, then that is probably utf8 > encoded and converted to base64: > > from base64 import b64decode > > coded=(b'RGVhciBEci4gQmVubmV0dCwKCkJsb3JwISBCbGVlcCEKCgotLQpNYWlsYm90IEl' >b'uYy4KMDEwMTAxIEJvdCBCb3VsZXZhcmQKQm90aGFtIENpdHkKQsO2dGxhbmQK') > > uncoded = b64decode(coded).decode('utf8') > print(uncoded) > > output: > > Dear Dr. Bennett, > > Blorp! Bleep! > > > -- > Mailbot Inc. > 010101 Bot Boulevard > Botham City > Bötland Thanks! I don't need that right now, but it's good to know which decoding hoop I would have to jump through if I did. Cheers, Loris -- This signature is currently under construction. -- https://mail.python.org/mailman/listinfo/python-list