Re: efficient way to get a sufficient set of identifying attributes

Robin Becker Thu, 19 Oct 2017 09:11:54 -0700

On 19/10/2017 16:42, Stefan Ram wrote:

Robin Becker <ro...@reportlab.com> writes:

                        Presumably the information in any attribute is highest
if the number of distinct occurrences is the same as the list length and
pairs of attributes are more likely to be unique, but is there some proper way
to go about determining what tests to use?


   When there is a list

|>>> list = [ 'b', 'b', 'c', 'd', 'c', 'b' ]
|>>> l = len( list )

   , the length of its set can be obtained:

|>>> s = len( set( list ))

   . The entries are unique if the length of the set is the
   length of the list

|>>> l == s
|False

   And the ratio between the length of the set and the length
   of the list can be used to quantify the amount of repetition.

|>>> s / l
|0.5

.......

This sort of makes sense for single attributes, but it ignores the possibility of combining attributes to make the checks more discerning.
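
A minimal sketch of the combined-attribute check, assuming each object can be represented as a dict of hashable values. The function name identifying_sets and the sample records are hypothetical, made up for illustration. It applies the same len(set(...)) test to tuples of attribute values, trying combinations of increasing size and stopping at the first size that yields a fully unique combination:

```python
from itertools import combinations


def identifying_sets(records, attrs):
    """Return the smallest attribute combinations whose value tuples
    are unique across all records (empty list if none exists)."""
    n = len(records)
    for size in range(1, len(attrs) + 1):
        found = [
            combo
            for combo in combinations(attrs, size)
            # Same test as for a single attribute, applied to tuples:
            # unique when the set has as many entries as the list.
            if len({tuple(rec[a] for a in combo) for rec in records}) == n
        ]
        if found:
            return found
    return []


# Hypothetical sample data: no single attribute is unique here.
records = [
    {"name": "a", "colour": "red", "size": 1},
    {"name": "a", "colour": "blue", "size": 1},
    {"name": "b", "colour": "red", "size": 2},
    {"name": "b", "colour": "blue", "size": 2},
]

print(identifying_sets(records, ["name", "colour", "size"]))
```

This is an exhaustive search, so the number of combinations grows quickly with the number of attributes; it is only meant to show the idea, not to be the efficient answer.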

--
Robin Becker

--
https://mail.python.org/mailman/listinfo/python-list
