On 19/10/2017 16:42, Stefan Ram wrote:
Robin Becker <ro...@reportlab.com> writes:
Presumably the information in any attribute is highest
if the number of distinct occurrences is the same as the list length and
pairs of attributes are more likely to be unique, but is there some proper way
to go about determining what tests to use?
When there is a list
|>>> list = [ 'b', 'b', 'c', 'd', 'c', 'b' ]
|>>> l = len( list )
, the length of its set can be obtained:
|>>> s = len( set( list ))
. The entries are unique if the length of the set is the
length of the list
|>>> l == s
|False
And the ratio between the length of the set and the length
of the list can be used to quantify the amount of repetition.
|>>> s / l
|0.5
.......
this sort of makes sense for single attributes, but ignores the possibility of
combining the attributes to make the checks more discerning.
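
One way the set-length ratio might be extended to combinations of attributes
is sketched below. It is only an illustration under some assumptions: the
records are dicts with hashable values, and the names distinct_ratio,
rank_attribute_sets and the sample rows are made up for this example rather
than taken from any real data or library.

```python
from itertools import combinations


def distinct_ratio(rows, attrs):
    """Return (number of distinct value tuples) / (number of rows)."""
    if not rows:
        return 0.0
    # Values are assumed hashable so the tuples can go into a set.
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) / len(rows)


def rank_attribute_sets(rows, names, max_size=2):
    """Score every combination of up to max_size attributes."""
    results = []
    for size in range(1, max_size + 1):
        for attrs in combinations(names, size):
            results.append((distinct_ratio(rows, attrs), attrs))
    # Highest ratio first; on ties prefer the smaller combination.
    results.sort(key=lambda r: (-r[0], len(r[1])))
    return results


# Hypothetical sample data, invented for the example.
rows = [
    {"name": "a", "size": 1, "colour": "red"},
    {"name": "a", "size": 2, "colour": "red"},
    {"name": "b", "size": 1, "colour": "blue"},
    {"name": "b", "size": 2, "colour": "red"},
]

for ratio, attrs in rank_attribute_sets(rows, ["name", "size", "colour"]):
    print(f"{ratio:.2f}", attrs)
```

A ratio of 1.0 means the combination identifies every record uniquely. Note
that the number of combinations grows quickly with max_size, so this brute
force approach is only practical for a small number of attributes.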
--
Robin Becker