Hey Zbyszek,

2010/8/17, Zbyszek Szmek <[email protected]>:
> Hi,
> this is a problem which came up when trying to replace a hand-written
> array concatenation with a call to numpy.vstack:
> for some array sizes,
>
>     numpy.vstack(data)
>
> runs > 20% longer than a loop like
>
>     alldata = numpy.empty((tlen, dim))
>     for x in data:
>         step = x.shape[0]
>         alldata[pos:pos+step] = x
>         pos += step
>
> (example script attached)
[clip]
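(For reference, the comparison boils down to something like the sketch below; the sizes are made up here rather than taken from the attached script, and the timings will of course vary by machine:)

    import time
    import numpy

    # Made-up problem size, just to illustrate the two approaches.
    dim, rows, nchunks = 1000, 1000, 10
    data = [numpy.random.rand(rows, dim) for _ in range(nchunks)]
    tlen = sum(x.shape[0] for x in data)

    # Approach 1: let numpy do the copy (vstack ends up calling concatenate).
    t0 = time.time()
    result = numpy.vstack(data)
    print("vstack:            %.3fs" % (time.time() - t0))

    # Approach 2: preallocate and fill slice by slice (ndarray.__setitem__).
    t0 = time.time()
    alldata = numpy.empty((tlen, dim))
    pos = 0
    for x in data:
        step = x.shape[0]
        alldata[pos:pos+step] = x
        pos += step
    print("preallocated loop: %.3fs" % (time.time() - t0))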
I was curious about what is happening here, so after some profiling with
cachegrind I've come to the conclusion that `numpy.concatenate` uses the
`memcpy` C library call to copy data from the sources to the recipient.
On the other hand, your `concat` function makes use of the `__setitem__`
method of ndarray, which does not use `memcpy` (probably because it has
to deal with strides).

Now, it turns out that `memcpy` may not be optimal on every platform, and
a direct fetch-and-assign approach can sometimes be faster.  My guess is
that this is what is happening in your case.  On my machine, running the
latest Ubuntu Linux, I'm not seeing this difference though:

fal...@ubuntu:~/carray$ python bench/concat.py numpy 1000 1000 10 3
problem size: (1000x1000) x 10 = 10^7
0.247s

fal...@ubuntu:~/carray$ python bench/concat.py concat 1000 1000 10 3
problem size: (1000x1000) x 10 = 10^7
0.246s

and neither when running Windows (XP):

C:\tmp>python del_cum3.py numpy 10000 1000 1 10
problem size: (10000x1000) x 1 = 10^7
0.227s

C:\tmp>python del_cum3.py concat 10000 1000 1 10
problem size: (10000x1000) x 1 = 10^7
0.223s

Coincidentally, I've lately been working on a proof of concept for an
array that can hold its data in memory in compressed state (using the
high-performance Blosc compressor under the hood).  This object (I'm
calling it ``carray`` for the time being) can also be `append`-ed with
additional data, so it can be used in this concatenation use case.

So, I've set up a new benchmark based on your script (I called it
concat.py) and tried it out on your problem.  Here are the results for
my netbook with a humble Intel Atom processor.  First, the figures for
the initial `numpy.concatenate` and `concat` styles:

fal...@ubuntu:~/carray$ PYTHONPATH=. python bench/concat.py numpy 1000000 10 3
problem size: (1000000) x 10 = 10^7
time for concat: 0.228s
size of the final container: 76.294 MB

fal...@ubuntu:~/carray$ PYTHONPATH=. python bench/concat.py concat 1000000 10 3
problem size: (1000000) x 10 = 10^7
time for concat: 0.230s
size of the final container: 76.294 MB

Now the new method (carray) with compression level 1 (note the new
parameter at the end of the command line):

fal...@ubuntu:~/carray$ PYTHONPATH=. python bench/concat.py carray 1000000 10 3 1
problem size: (1000000) x 10 = 10^7
time for concat: 0.186s
size of the final container: 5.076 MB

which is more than 20% faster than `numpy.concatenate` or your `concat`
method, while the space taken in memory is significantly lower (5.1 MB
vs 76.3 MB; of course, I've chosen a very compressible dataset for this
example ;-)

Even if you tell Blosc not to use compression (compression level 0), I
can still see a win here:

fal...@ubuntu:~/carray$ PYTHONPATH=. python bench/concat.py carray 1000000 10 3 0
problem size: (1000000) x 10 = 10^7
time for concat: 0.200s
size of the final container: 77.001 MB

which is 15% faster than the initial cases.  However, note how the space
grows from the original 76.3 MB to 77.0 MB.  This is because carray has
to keep an internal buffer to accelerate the appending of small arrays;
that buffer is the main cause of the space overhead.
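By the way, the appending pattern that carray enables looks roughly like
this (just a sketch: the constructor signature and the keyword name below
are guesses at the still pre-alpha API, so don't take them literally):

    import numpy
    import carray   # pre-alpha package; see the tarball linked below

    # A very compressible dataset, split into chunks to be concatenated.
    chunks = [numpy.arange(1000000, dtype='f8') for _ in range(10)]

    # Build the compressed container from the first chunk, then append
    # the rest.  'clevel=1' is meant as compression level 1; the keyword
    # name is a guess.
    arr = carray.carray(chunks[0], clevel=1)
    for chunk in chunks[1:]:
        arr.append(chunk)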
Finally, it is interesting to see the effect of forcing the use of a
single thread instead of two (the Atom supports hyper-threading):

fal...@ubuntu:~/carray$ PYTHONPATH=. python bench/concat.py carray 1000000 10 3 0
problem size: (1000000) x 10 = 10^7
time for concat: 0.210s
size of the final container: 77.001 MB

which is still 10% faster than plain `numpy.concatenate` (remember, based
on `memcpy`).  Why carray/Blosc is faster in this case is rather a
mystery to me, but the effect is there.

I have not yet released carray publicly, but in case you want to play
with it, I've uploaded my current git repository to:

http://www.pytables.org/download/preliminary/carray-0.1.dev.tar.gz

Of course, carray is still pre-alpha: it does not support
multidimensional arrays, and you cannot modify its contents other than
by appending new data.  But still, it can be a lot of fun.

Cheers!

-- Francesc Alted
