Hi David,
On my machine, on a simple, pathological example, checking for duplicates
is significantly faster (1.34 ms vs 773 µs, a bit under a factor of 2).
However, it depends on whether you expect many duplicates. In the second
example, with only one duplicated item, it is faster not to check.
The winner in both cases, however, is to use dict() directly.
Cheers,
Josh
IPython session:
In [1]: l = [('a', 1)] * 10000
In [2]: def f(l):
   ...:     d = {}
   ...:     for k, v in l:
   ...:         d[k] = v
   ...:     return d
   ...:
In [3]: def g(l):
   ...:     d = {}
   ...:     for k, v in l:
   ...:         if k not in d:
   ...:             d[k] = v
   ...:     return d
   ...:
In [4]: %timeit f(l)
1000 loops, best of 3: 1.34 ms per loop
In [5]: %timeit g(l)
100 loops, best of 3: 773 µs per loop
In [6]: %timeit dict(l)
1000 loops, best of 3: 371 µs per loop
In [7]: m = [(str(x), x) for x in range(10000)] + [('0', 0)]
In [8]: %timeit f(m)
1000 loops, best of 3: 1.99 ms per loop
In [9]: %timeit g(m)
100 loops, best of 3: 2.48 ms per loop
In [10]: %timeit dict(m)
100 loops, best of 3: 943 µs per loop
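
To reproduce the comparison outside IPython, here is a minimal sketch using
the standard-library timeit module. The function names (build_overwrite,
build_checked, best_time) and the loop counts are my own choices for
illustration, not part of the session above, and absolute timings will
differ by machine and Python version.

```python
import timeit


def build_overwrite(pairs):
    """Assign every pair, overwriting duplicate keys (f in the session)."""
    d = {}
    for k, v in pairs:
        d[k] = v
    return d


def build_checked(pairs):
    """Assign a pair only if its key is not yet present (g in the session)."""
    d = {}
    for k, v in pairs:
        if k not in d:
            d[k] = v
    return d


# Same two inputs as the session: all duplicates, then a single duplicate.
many_dups = [('a', 1)] * 10000
one_dup = [(str(x), x) for x in range(10000)] + [('0', 0)]


def best_time(func, data, number=20, repeat=3):
    """Return the best per-call time in seconds over `repeat` runs."""
    runs = timeit.repeat(lambda: func(data), number=number, repeat=repeat)
    return min(runs) / number


if __name__ == "__main__":
    for label, data in (("many duplicates", many_dups),
                        ("one duplicate", one_dup)):
        for func in (build_overwrite, build_checked, dict):
            micros = best_time(func, data) * 1e6
            print("%-16s %-16s %10.1f us" % (label, func.__name__, micros))
```

One caveat: the three approaches only agree when duplicate keys carry the
same value, as they do here. If the values differ, the checked loop keeps
the first value seen, while the plain loop and dict() keep the last.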
On 21 February 2014 09:31, David Crisp <[email protected]> wrote:
> The following question is more along the lines of "good practice" rather
> than "how do you do it".
>
> If I have a name:value list of values that I want to read into a dict for
> ease of lookup, let's define them as:
>
> name | value
> ==========
> ItemOne : 10
> ItemOne : 10
> ItemOne : 10
> ItemOne : 10
> ItemTwo : 20
> ItemTwo : 20
> ItemTwo : 20
> ItemTwo : 20
> ItemThree : 30
> ItemThree : 30
> ItemThree : 30
> ItemThree : 30
>
> Now, obviously there are duplicates in that list. If I use a simple
> loop such as the following:
> for name, value in list:
>     item[name] = value
>
> Then I will get a dict with three pairs in it:
> ItemOne : 10
> ItemTwo : 20
> ItemThree : 30
>
> Which is what I want.
>
> Now, MY question is, is there any harm in creating the dict that way
> and looping through all those values multiple times and re-defining the
> values constantly (to the same thing). OR, should I put a check in there
> such as:
>
> if name not in item:
>     # add name and its pair value to dict
>
>
> In my case I HAVE added checking in but I was wondering if it was really
> needed. Given no matter what, in either case, the resulting dict would be
> the same.
>
> Regards,
> David
>
> _______________________________________________
> melbourne-pug mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/melbourne-pug