On Wed, 03 Oct 2012 18:30:24 -0400, Benjamin Jessup wrote:

> I have a group of objects identified by unique (x,y) pairs and I want
> to find out an object's "neighbors" in a matrix of size 2400 x 2400.
[...]
> There is either a neighbor, or a null value. I always know the (x,y)
> pair to check the neighbors of, so is doing,
>
>> obj = grid[x][y] #lists, doesn't scale with num of objects
>
> or,
>
>> obj = grid.get((x,y),None) #dictionary, scales with num of objects
>
> the fastest? I can't seem to find a conclusion by testing each alone,
> then in the full environment. Is it that, depending on the number of
> objects, each has an advantage?
Almost certainly not. Since you don't show how you tested each, I have
no idea why you can't reach a conclusion.

Let's start off with the question you don't ask: which uses less memory?
The answer is: the dict solution, by far. Here is a test:

# populate a sparse matrix using both a dict and a list of lists
adict = {}
alist = [[None]*2400 for i in range(2400)]

from random import randrange
for i in range(1000):
    x = randrange(2400)
    y = randrange(2400)
    adict[(x, y)] = "something"
    alist[x][y] = "something"

import sys
print(sys.getsizeof(adict))
print(sys.getsizeof(alist) + sum(sys.getsizeof(L) for L in alist))

The actual sizes printed will depend on how sparse the matrices are, but
for the sample above (1000 entries scattered over a 2400 x 2400 grid),
using Python 2.7, the figures I get are:

adict:    24712
alist: 23127324

Now let's test the speed:

# randomly select matrix coordinates to look up
test_coords = []
for i in range(1000):
    x = randrange(2400)
    y = randrange(2400)
    test_coords.append((x, y))

# build some test code
from timeit import Timer
setup = "from __main__ import adict, alist, test_coords"
t1 = Timer("for p in test_coords: obj = adict.get(p)", setup)
t2 = Timer("for p in test_coords: obj = alist[p[0]][p[1]]", setup)

# run the test code
print(min(t1.repeat(number=10000, repeat=7)))
print(min(t2.repeat(number=10000, repeat=7)))

Again, on my system using Python 2.7, I get:

3.13823986053
2.97539305687

which shows that the two are very close, but the list solution is about
6% faster. So in this situation, a list of lists uses nearly 1000 times
more memory than a dict, but look-ups are about 6% faster. I would be
very surprised if the timing results depended on the number of objects
in the matrices.

In case you are not familiar with timeit, let me explain what I have
done:

* I pre-select 1000 random coordinates.

* I write some test code inside a Timer object that looks up each of
  those coordinates, plus some setup code.

* timeit runs the setup code once.

* Then it runs my test code 10000 times as a single trial, timing it as
  accurately as possible.

* It does seven trials, and I report the best of the seven.

The time printed is measured in seconds. In this case, I get about 3
seconds per trial; since each trial performs 10000 runs of 1000 lookups
each, that works out to 3e-7 seconds = 0.3 microseconds per lookup.

> I know the fastest way to retrieve them would be to have them store
> pointers to their neighbors, then use those for retrieval.

How do you know that? No offence, but if you can't even work out whether
lookups in a dict or a list are faster, I can't imagine why you think
you can intuit what the fastest way to retrieve the nearest neighbours
would be.

-- 
Steven
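P.S. If you want to actually test the "store pointers to their
neighbours" idea rather than intuit it, here is a rough sketch of one
way to wire it up. The Cell class and the neighbours attribute are names
I have invented for illustration; substitute your own objects:

from random import randrange

class Cell(object):
    """Toy stand-in for your objects."""
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.neighbours = []  # direct references, filled in below

# build a sparse population of cells, keyed by (x, y) coordinate
cells = {}
for i in range(1000):
    x = randrange(2400)
    y = randrange(2400)
    cells[(x, y)] = Cell(x, y)

# link each cell directly to whichever of its eight neighbours exist
offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
           if (dx, dy) != (0, 0)]
for (x, y), cell in cells.items():
    cell.neighbours = [cells[(x + dx, y + dy)] for dx, dy in offsets
                       if (x + dx, y + dy) in cells]

Once the links are built, fetching the neighbours is a single attribute
access, cell.neighbours, which you can time with timeit exactly as above
and compare against doing eight dict look-ups per cell. Measure it;
don't guess.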