On Sat, Jan 04, 2014 at 07:19:43AM -0500, Keith Winston wrote:

> I understand that Python doesn't have composite objects, but neither
> does it disallow my list of lists of ints and lists... which is, I
> imagine, very space efficient.

I'm afraid I have no idea what you mean by Python not having 
"composite objects". What's a composite object?

> I think what you are in essence saying is that it's a mistake for me
> to worry about space at the expense of clarity...

What Denis and I are trying to say is that when you are using a 
high-level language like Python, you should aim to optimize programmer 
time and effort, not computer time and effort.

There are applications where you have to optimize what the computer 
does, where every byte counts and microseconds matter. Python is not 
the right programming language for those applications. Python is a 
language for when you have the luxury of tens of megabytes of memory, 
not kilobytes, and when you don't care whether something takes a 
millisecond to calculate instead of a tenth of a millisecond.

(Actually, there are versions of Python for embedded devices, such as 
Android, where memory does matter -- at least, where every kilobyte 
counts.)

Everything is a tradeoff. In programming, one common set of tradeoffs 
is space (memory) versus time: use more memory to run faster, or save 
memory but run slower. Python generally trades memory for speed. There 
is very little point in trying to save bytes, because behind the 
scenes Python is using and re-using memory like it's going out of 
fashion, just to ensure that it can be as fast as possible.

py> import sys
py> sys.getsizeof(42)
14

Fourteen bytes just for a tiny little integer like that??? How 
wasteful! I remember when numbers like 42 would only require *two* 
bytes. Of course, back then the biggest number you could deal with was 
65535, and a computer with 64K of memory was considered unbelievably 
luxurious. Meanwhile, Python lets me handle numbers with thousands of 
digits with ease:

py> n = 2**10000
py> len(str(n))
3011
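By the way, if you ever do measure that list of lists with 
sys.getsizeof, be aware that it only counts the outer object, not the 
items inside it. Totalling up a nested structure needs something 
recursive along these lines (a rough sketch only -- total_size is my 
own helper, not anything from the standard library):

import sys

def total_size(obj):
    # Size of the object itself...
    size = sys.getsizeof(obj)
    # ...plus, for lists and tuples, the size of each item inside.
    # A sketch only: it ignores dicts and sets, and double-counts
    # objects that appear more than once.
    if isinstance(obj, (list, tuple)):
        size += sum(total_size(item) for item in obj)
    return size

But as I said above, this is exactly the sort of byte-counting you 
will almost never need to do.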
Another tradeoff is between programmer effort (which equals time, and 
cost) versus speed. Programs written in C are typically between 10 and 
10000 times faster than the same program written in Python, but they 
typically take 20 to 200 times longer, and more effort, to write. You 
should ask "is my program *fast enough*?" rather than "is it fast?". 
Often, Python is fast enough. When it's not, there are ways to make it 
faster.

So don't sweat the small stuff. If you ever have to write an operating 
system kernel or a graphics card driver, then you need to care about 
optimizing every little thing. Until then, write the most natural code 
you can, and only if it actually is too slow should you worry about 
it.

> I'm not meaning to argue, but to understand. Especially in lieu of
> an upcoming project with, perhaps, larger and more complex
> structures. I am increasingly curious about whether namedtuples are
> a good strategy for some of this: they store their field names, as I
> understand it, and I can live with an immutable type in all these
> cases: I wonder if they are as efficient in named-field (key) lookup
> as dictionaries?

Pretty close to it. Not quite, but within a factor of about 3. Let's 
do some micro-benchmarks!

First, let's create some data objects with three fields, using three 
different techniques: a dict (x), an instance of a regular class with 
named fields (y), and a namedtuple (z):

py> x = {'a': 23, 'b': 42, 'c': 57}
py> class Record:
...     def __init__(self, a, b, c):
...         self.a = a
...         self.b = b
...         self.c = c
...
py> y = Record(23, 42, 57)
py> from collections import namedtuple
py> recordnt = namedtuple('recordnt', 'a b c')
py> z = recordnt(23, 42, 57)

Now let's set up some timing code, where we extract all three fields 
in reverse order:

py> from timeit import Timer
py> setup = "from __main__ import x, y, z"
py> t1 = Timer("x['c'], x['b'], x['a']", setup)
py> t2 = Timer("y.c, y.b, y.a", setup)
py> t3 = Timer("z.c, z.b, z.a", setup)

And now let's time them:

py> min(t1.repeat())
0.2941127344965935
py> min(t2.repeat())
0.34186235070228577
py> min(t3.repeat())
0.7729006875306368

That's not too shabby. Times shown are seconds for one million repeats 
of the test code, which works out to microseconds per single run. 
There are three field lookups per test, so even the slowest of the 
three (test t3, the one using the namedtuple) takes only 0.77/3, or 
about 0.26, microseconds per field lookup. If you ever find a piece of 
code where the difference between 0.1 and 0.26 microseconds per field 
lookup is meaningful, I'd like to see it.
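If you're wondering where that factor of three comes from: the named 
fields of a namedtuple are implemented as properties wrapping 
operator.itemgetter, so each access like z.c goes through a descriptor 
call and then an indexed tuple lookup. Roughly speaking -- this is a 
simplified sketch, not the exact code the collections module generates 
-- our recordnt above is equivalent to:

from operator import itemgetter

class recordnt(tuple):
    # Sketch of the class namedtuple('recordnt', 'a b c') builds.
    __slots__ = ()  # no per-instance dict; the data lives in the tuple

    def __new__(cls, a, b, c):
        return tuple.__new__(cls, (a, b, c))

    # Each named field is a property that indexes into the tuple.
    a = property(itemgetter(0), doc='Alias for field number 0')
    b = property(itemgetter(1), doc='Alias for field number 1')
    c = property(itemgetter(2), doc='Alias for field number 2')

That extra indirection is why z.c is a little slower than x['c'] -- 
and the missing per-instance __dict__ is why namedtuples are so frugal 
with memory.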
> I'm also a bit confused here: obviously tuples are immutable, but
> one can use lists in them... I think that makes those lists'
> contents immutable?

Nope. The tuple is immutable in the sense that its *direct* contents 
cannot be changed -- you cannot replace one item with another -- but 
any mutable object inside the tuple can still be mutated:

py> t = (1, 2, [])
py> t[2] = ["hello"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
py> t[2].append("hello")
py> t
(1, 2, ['hello'])

> And could one define a namedtuple that included lists that were
> different lengths for different instantiations (like my game
> results, for example)?

Naturally. The namedtuple doesn't care what you put inside it.
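For instance (a quick sketch -- the field names here are invented, not 
taken from your game):

py> from collections import namedtuple
py> Game = namedtuple('Game', 'player scores')
py> g1 = Game('alice', [1, 2, 3, 4])
py> g2 = Game('bob', [5, 6])
py> g2.scores
[5, 6]

Each instance simply holds a reference to whatever list you gave it, 
of whatever length.

-- 
Steven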