Re: libvirt-python: API change List → (Named)Tuple?

2020-07-27 Thread Daniel P . Berrangé
On Mon, Jul 27, 2020 at 11:19:46AM +0200, Erik Skultety wrote:
> On Mon, Jul 27, 2020 at 08:32:29AM +0200, Philipp Hahn wrote:
> > Hello,
> > 
> > Am 25.07.20 um 23:45 schrieb Philipp Hahn:
> > > Am 27.04.20 um 15:44 schrieb Philipp Hahn:
> > >> I'm working on adding PEP 484 type hints
> > >>  to the Python binding of
> > >> libvirt.
> > ...
> > > I just opened a merge request
> > >  for my
> > > code at .
> > 
> > While working on that I stumbled over some annoyances in the current
> > Python API: There are many methods which return a List[int], for example:
> >   virDomainGetInfo
> >   virDomainGetState
> >   virDomainGetControlInfo
> >   virDomainGetBlockInfo
> >   virDomainGetJobInfo
> >   virStoragePoolGetInfo
> >   virStorageVolGetInfo
> >   virStorageVolGetInfoFlags
> > 
> > There are more function returning similar information as plain List:
> >   virNodeGetSecurityModel
> >   virNodeGetSecurityModel
> >   virDomainGetSecurityLabelList
> >   virDomainBlockStats
> >   virDomainInterfaceStats
> >   virDomainGetVcpus
> >   virNodeGetCPUMap
> > 
> > The worst offender is `virNodeGetInfo`, which returns `Tuple[str,
> > int*7]`: The problem for type annotation is that `List` is unlimited in
> > the number of elements, that is you cannot specify the number of entries
> > the list must or will contain.
> > Also all elements of a list must have the same (super-)type; for "str"
> > and "int" of "virNodeGetInfo()" that would be "Any", which is not very
> > useful here.
> > 
> > A better type for those `List`s would be `Tuple`, which has a fixed
> > length and can have different types for each position.
> > 
> > But that would be an API change: In most cases users of those functions
> > should not notice the difference as indexing tuples or lists is the same.
> > It would break code where someone would do something like this:
> >   ret = virFunction_returning_list()
> >   ret += [1, 2, 3]
> 
> I would argue that ^this was never the intended way of using the
> returned object and would lean towards using a named tuple, since like
> you've pointed out it would be a transparent change in the vast majority
> of cases.  However, since we don't document anywhere how the return
> value should be treated, from that perspective it's still valid to
> change the returned list for whatever purposes and therefore we are
> breaking the API contract, which we can't do even though the change
> itself is reasonable.

I think the above example is just plain crazy. While it may be technically
possible, I don't think that example is a compelling usage to justify not
doing the tuple change. Even if it does break some app (I very much doubt
it will), it'll still be a net win for all the other sane apps to change
to using a named tuple.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: libvirt-python: API change List → (Named)Tuple?

2020-07-27 Thread Erik Skultety
On Mon, Jul 27, 2020 at 08:32:29AM +0200, Philipp Hahn wrote:
> Hello,
> 
> Am 25.07.20 um 23:45 schrieb Philipp Hahn:
> > Am 27.04.20 um 15:44 schrieb Philipp Hahn:
> >> I'm working on adding PEP 484 type hints
> >>  to the Python binding of
> >> libvirt.
> ...
> > I just opened a merge request
> >  for my
> > code at .
> 
> While working on that I stumbled over some annoyances in the current
> Python API: There are many methods which return a List[int], for example:
>   virDomainGetInfo
>   virDomainGetState
>   virDomainGetControlInfo
>   virDomainGetBlockInfo
>   virDomainGetJobInfo
>   virStoragePoolGetInfo
>   virStorageVolGetInfo
>   virStorageVolGetInfoFlags
> 
> There are more function returning similar information as plain List:
>   virNodeGetSecurityModel
>   virNodeGetSecurityModel
>   virDomainGetSecurityLabelList
>   virDomainBlockStats
>   virDomainInterfaceStats
>   virDomainGetVcpus
>   virNodeGetCPUMap
> 
> The worst offender is `virNodeGetInfo`, which returns `Tuple[str,
> int*7]`: The problem for type annotation is that `List` is unlimited in
> the number of elements, that is you cannot specify the number of entries
> the list must or will contain.
> Also all elements of a list must have the same (super-)type; for "str"
> and "int" of "virNodeGetInfo()" that would be "Any", which is not very
> useful here.
> 
> A better type for those `List`s would be `Tuple`, which has a fixed
> length and can have different types for each position.
> 
> But that would be an API change: In most cases users of those functions
> should not notice the difference as indexing tuples or lists is the same.
> It would break code where someone would do something like this:
>   ret = virFunction_returning_list()
>   ret += [1, 2, 3]

I would argue that ^this was never the intended way of using the
returned object and would lean towards using a named tuple, since like
you've pointed out it would be a transparent change in the vast majority
of cases.  However, since we don't document anywhere how the return
value should be treated, from that perspective it's still valid to
change the returned list for whatever purposes and therefore we are
breaking the API contract, which we can't do even though the change
itself is reasonable.

Regards,
Erik



libvirt-python: API change List → (Named)Tuple?

2020-07-26 Thread Philipp Hahn
Hello,

Am 25.07.20 um 23:45 schrieb Philipp Hahn:
> Am 27.04.20 um 15:44 schrieb Philipp Hahn:
>> I'm working on adding PEP 484 type hints
>>  to the Python binding of
>> libvirt.
...
> I just opened a merge request
>  for my
> code at .

While working on that I stumbled over some annoyances in the current
Python API: There are many methods which return a List[int], for example:
  virDomainGetInfo
  virDomainGetState
  virDomainGetControlInfo
  virDomainGetBlockInfo
  virDomainGetJobInfo
  virStoragePoolGetInfo
  virStorageVolGetInfo
  virStorageVolGetInfoFlags

There are more function returning similar information as plain List:
  virNodeGetSecurityModel
  virNodeGetSecurityModel
  virDomainGetSecurityLabelList
  virDomainBlockStats
  virDomainInterfaceStats
  virDomainGetVcpus
  virNodeGetCPUMap

The worst offender is `virNodeGetInfo`, which returns `Tuple[str,
int*7]`: The problem for type annotation is that `List` is unlimited in
the number of elements, that is you cannot specify the number of entries
the list must or will contain.
Also all elements of a list must have the same (super-)type; for "str"
and "int" of "virNodeGetInfo()" that would be "Any", which is not very
useful here.

A better type for those `List`s would be `Tuple`, which has a fixed
length and can have different types for each position.

But that would be an API change: In most cases users of those functions
should not notice the difference as indexing tuples or lists is the same.
It would break code where someone would do something like this:
  ret = virFunction_returning_list()
  ret += [1, 2, 3]
which breaks if that function would return a `Tuple` in the future:

> $ python -c '() + []'
> Traceback (most recent call last):
>   File "", line 1, in 
> TypeError: can only concatenate tuple (not "list") to tuple

Even better then plain `Tuple`s would be
,
which would allow us to use the return value of `virNodeGetInfo()` as this:

> from collections import namedtuple
> import  libvirt
> virNodeInfo = namedtuple("virNodeInfo", ["model", "memory", "cpus", "mhz", 
> "nodes", "sockets", "cores", "threads"])
> c = libvirt.open('test:///default')
> info = virNodeInfo(*c.getInfo())
> print(info.model)
> print(info)
> # virNodeInfo(model='i686', memory=3072, cpus=16, mhz=1400, nodes=2, 
> sockets=2, cores=2, threads=2)



This could be improved even more by using
,
which allows to add type information:

> from typing import NamedTuple
> virNodeInfo = NamedTuple("virNodeInfo", [("model", str), ("memory", int), 
> ("cpus", int), ("mhz", int), ("nodes", int), ("sockets", int), ("cores", 
> int), ("threads", int)])

or with Python 3.6 the more readable form

> class virNodeInfo(NamedTuple):
>   model: str
>   memory: int
>   cpus: int
>   mhz: int
>   nodes: int
>   sockets: int
>   cores: int
>   threads: int

IMHO that would be a real usability improvement as I have to lookup that
information every time I have to use those functions myself.
Indexing with `namedtuple` and `NamedTuple` still works, so you can
still use `info[0]` above to get the model.

What do you think of that?
When would be a good time to make such a change?

Philipp