Thanks a lot for the clarification. Actually, my problem is this: given two raster datasets in GeoTIFF format, find the unique pair combinations, count the number of observations of each unique value in rast1, count the number of observations of each unique value in rast2, count the number of observation
I tried different solutions, and this one seems to me the fastest:

    Rast00 = dsRast00.GetRasterBand(1).ReadAsArray()
    Rast10 = dsRast10.GetRasterBand(1).ReadAsArray()

    # Keep only the cells that are non-zero in both rasters.
    # (Maybe this masking operation can be included in the for loop.)
    mask = (Rast00 != 0) & (Rast10 != 0)
    Rast00_mask = Rast00[mask]
    Rast10_mask = Rast10[mask]

    unique_u = dict()   # (value in Rast00, value in Rast10) -> count
    unique_k1 = dict()  # value in Rast00 -> count
    unique_k2 = dict()  # value in Rast10 -> count

    # zip() is iterated directly: np.array(zip(...)) fails on Python 3,
    # where zip() returns an iterator rather than a list.
    for key1, key2 in zip(Rast00_mask, Rast10_mask):
        row = (key1, key2)
        unique_u[row] = unique_u.get(row, 0) + 1
        unique_k1[key1] = unique_k1.get(key1, 0) + 1
        unique_k2[key2] = unique_k2.get(key2, 0) + 1

    # Write one "value(s) count" line per entry, without brackets or commas.
    with open(dst_file_rast0010, "w") as output:
        for (a, b), c in unique_u.items():
            print(a, b, c, file=output)

    with open(dst_file_rast00, "w") as output:
        for a, b in unique_k1.items():
            print(a, b, file=output)

    with open(dst_file_rast10, "w") as output:
        for a, b in unique_k2.items():
            print(a, b, file=output)

What do you think? Is there a way to speed up the process?

Thanks
Giuseppe

On 9 August 2012 16:34, Roman Vashkevich <vashkevic...@gmail.com> wrote:
> Actually, they are different.
> Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred
> thousand entries, and you will feel the difference.
> Dict uses hashing to get a value from the dict and this is why it's O(1).
>
> On 10.08.2012, at 1:21, Tim Chase wrote:
>
>> On 08/09/12 15:41, Roman Vashkevich wrote:
>>> On 10.08.2012, at 0:35, Tim Chase wrote:
>>>> On 08/09/12 15:22, Roman Vashkevich wrote:
>>>>>> {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
>>>>>> and i want to print to a file without the brackets, commas and semicolon
>>>>>> in order to obtain something like this?
>>>>>> 4 5 1
>>>>>> 5 4 1
>>>>>> 4 4 2
>>>>>> 2 3 1
>>>>>> 4 3 2
>>>>>
>>>>> for key in dict:
>>>>>     print key[0], key[1], dict[key]
>>>>
>>>> This might read more cleanly with tuple unpacking:
>>>>
>>>>   for (edge1, edge2), cost in d.iteritems(): # or .items()
>>>>       print edge1, edge2, cost
>>>>
>>>> (I'm making the assumption that this is an edge/cost graph...use
>>>> appropriate names according to what they actually mean)
>>>
>>> dict.items() is a list - linear access time whereas with 'for
>>> key in dict:' access time is constant:
>>> http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#use-in-where-possible-1
>>
>> That link doesn't actually discuss dict.{iter}items()
>>
>> Both are O(N) because you have to touch each item in the dict--you
>> can't iterate over N entries in less than O(N) time. For small
>> data-sets, building the list and then iterating over it may be
>> faster; for larger data-sets, the cost of building the list
>> overshadows the (minor) overhead of a generator. Either way, the
>> iterate-and-fetch-the-associated-value of .items() & .iteritems()
>> can (should?) be optimized in Python's internals to the point I
>> wouldn't think twice about using the more readable version.
>>
>> -tkc
>>
>>
>

--
Giuseppe Amatulli
Web: www.spatial-ecology.net
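
One possible way to shorten the counting loop from the top of this message, offered only as a sketch: collections.Counter from the standard library does the per-key tallying, and the tuple-unpacking loop suggested in the quoted thread formats the output lines. The names count_pairs and format_pair_lines and the nodata parameter are hypothetical, the input lists are made-up values rather than real raster data, and no timing has been measured on real rasters.

```python
from collections import Counter


def count_pairs(values_a, values_b, nodata=0):
    """Count (a, b) pairs and single values, skipping cells where either is nodata."""
    # Hypothetical helper: keeps only cells that are valid in both inputs,
    # which mirrors the (Rast00 != 0) & (Rast10 != 0) mask in the message.
    pairs = [(a, b) for a, b in zip(values_a, values_b)
             if a != nodata and b != nodata]
    pair_counts = Counter(pairs)             # (a, b) -> count
    a_counts = Counter(a for a, _ in pairs)  # a -> count
    b_counts = Counter(b for _, b in pairs)  # b -> count
    return pair_counts, a_counts, b_counts


def format_pair_lines(pair_counts):
    """Build 'a b count' lines with tuple unpacking, sorted by key for stable output."""
    return ["%s %s %s" % (a, b, n) for (a, b), n in sorted(pair_counts.items())]


# Made-up flattened cell values (assumption: not real raster data).
rast_a = [4, 5, 4, 2, 4, 0, 4, 4]
rast_b = [5, 4, 4, 3, 3, 7, 4, 3]

pair_counts, a_counts, b_counts = count_pairs(rast_a, rast_b)
lines = format_pair_lines(pair_counts)
print("\n".join(lines))
```

With real rasters, the two masked 1-D arrays could presumably be passed in place of the lists (for example after calling .tolist() on each), and the returned lines written to the destination file with a single "\n".join(lines).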