On Wed, Dec 28, 2011 at 05:21:39PM +0100, Alexandre Gramfort wrote:
> thanks Gael for the christmas present :)
I just couldn't help playing more. I have pushed a new update that lets
you control the compression level and, in general, achieve better
trade-offs between speed and compression. Here are benchmarks on my
computer (a 3.5-year-old Dell laptop, Intel Core 2 Duo with 2 GB RAM):
Olivetti        old code  , write   2.01s, read   0.197s, disk    3M
                compress 0, write   0.26s, read   0.024s, disk   12M
                compress 1, write   0.74s, read   0.176s, disk    4M
                compress 3, write   1.03s, read   0.164s, disk    3M
                compress 6, write   2.03s, read   0.156s, disk    3M
                compress 9, write   2.16s, read   0.158s, disk    3M
                mmap      , write   0.89s, read   0.003s, disk   12M

20news          old code  , write   4.23s, read   0.435s, disk    9M
                compress 0, write   0.59s, read   0.118s, disk   23M
                compress 1, write   1.80s, read   0.415s, disk   10M
                compress 3, write   1.83s, read   0.401s, disk    9M
                compress 6, write   2.91s, read   0.397s, disk    8M
                compress 9, write   3.92s, read   0.402s, disk    8M
                mmap      , write   0.57s, read   0.112s, disk   23M

LFW pairs       old code  , write  12.84s, read   0.799s, disk   18M
                compress 0, write   2.24s, read   0.080s, disk   48M
                compress 1, write   3.11s, read   0.790s, disk   21M
                compress 3, write   4.80s, read   0.687s, disk   18M
                compress 6, write  10.71s, read   0.725s, disk   18M
                compress 9, write  55.39s, read   0.666s, disk   17M
                mmap      , write   2.14s, read   0.003s, disk   48M

Species         old code  , write   7.57s, read   0.986s, disk    6M
                compress 0, write   4.31s, read   0.167s, disk  103M
                compress 1, write   1.61s, read   0.468s, disk    4M
                compress 3, write   2.19s, read   0.457s, disk    3M
                compress 6, write   2.13s, read   0.444s, disk    3M
                compress 9, write   4.65s, read   0.443s, disk    2M
                mmap      , write   4.99s, read   0.007s, disk  103M

LFW people      old code  , write  40.93s, read   2.490s, disk   60M
                compress 0, write   6.39s, read   0.231s, disk  147M
                compress 1, write   9.87s, read   2.629s, disk   66M
                compress 3, write  16.86s, read   2.380s, disk   59M
                compress 6, write  35.20s, read   2.483s, disk   60M
                compress 9, write 188.15s, read   2.300s, disk   56M
                mmap      , write   6.35s, read   0.003s, disk  147M

Big LFW people  old code not available
                compress 0, write  22.86s, read   0.819s, disk  441M
                compress 1, write  39.20s, read  20.898s, disk  199M
                compress 3, write  53.81s, read  15.821s, disk  179M
                compress 6, write 110.72s, read  13.421s, disk  179M
                compress 9, write 526.09s, read  11.922s, disk  170M
                mmap      , write  21.54s, read   0.040s, disk  441M
As with any benchmarks, caveat emptor!
The take-home message seems to be that, in general, compress=3 gives a
reasonable trade-off between dump/load time and disk space. Not
compressing is always faster, even for loading, and on non-compressed
data, memmapping kicks ass.
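In practice this boils down to the calls below (a minimal sketch using
the same joblib.dump/joblib.load arguments as in the attached benchmark;
the array and file names are just examples):

import numpy as np
import joblib

data = np.random.random((1000, 1000))  # any picklable object works

# Compressed dump/load: compress=3 was the speed/size sweet spot above
joblib.dump(data, 'data.pkl', compress=3)
data_compressed = joblib.load('data.pkl')

# Uncompressed dump + memory-mapped load: larger files, near-instant reads
joblib.dump(data, 'data_raw.pkl')
data_mmapped = joblib.load('data_raw.pkl', mmap_mode='r')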
I used the scikit's datasets for benchmarking because the performance
depends a lot on the entropy of the data, and thus I needed real-world
use cases. Obviously the fine-tuning that I did is not needed for the
scikit's storage of the datasets, but in general fast dump/load of
Python objects is useful for scientific computing and big data (think
caching, or message-passing parallel computing).
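To illustrate the caching use case, here is a hedged sketch built on
joblib.Memory (the existing caching layer, which persists inputs and
outputs with the same dump/load machinery benchmarked above); the cache
directory and the toy function are made up for the example:

from joblib import Memory

mem = Memory(cachedir='/tmp/joblib_cache')  # hypothetical cache directory

@mem.cache
def costly_transform(x):
    # Stand-in for an expensive computation; the result is recomputed
    # only when the input changes, otherwise it is loaded from disk.
    return x ** 2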
Some notes on the datasets:
* 20news is mostly not arrays; it is a useful benchmark of fairly
general Python objects.
* Big LFW people is LFW people with the data tripled (three copies
concatenated, as in the attached script). I created it because it
takes ~450M, and with the various memory duplications (one due to the
benchmarking code, another due to compression) it is pretty much the
upper limit of what I can compress on my computer. It gives a good
indication of RAM-limited performance.
I am attaching the benchmarking code I used. It plays ugly tricks to
try to flush the disk cache. Performance trade-offs will depend on the
relative speed of the CPU and the disk. I'd love it if other people
could try it on their computer (with the github/0.5.X version of
joblib). Warning: it will take a while to run! Once I have more
insight, I'll make a 0.6 joblib release and a blog post with pretty
graphs. Give me input for this :)
Gael
PS: a general lesson that I relearned during this process is that any
optimization work is tricky and full of surprises, and that doing good
benchmarks takes a while, but is always worth the effort.
import os
import time
import shutil

import numpy as np

# scikit-learn datasets: real-world data with realistic entropy
from sklearn import datasets

import joblib
from joblib.disk import disk_used


def kill_disk_cache():
    # Write ~150MB of random data to disk to (try to) flush the disk cache
    np.random.random(int(2e7)).tofile('tmp')


def timeit(func, *args, **kwargs):
    # Time 5 runs, drop the best and the worst, and average the rest
    times = list()
    for _ in range(5):
        kill_disk_cache()
        t0 = time.time()
        func(*args, **kwargs)
        times.append(time.time() - t0)
    times.sort()
    return np.mean(times[1:-1])


def bench_dump(dataset, name='', compress_levels=(1, 0, 3, 6, 9)):
    time_write = list()
    time_read = list()
    du = list()
    for compress in compress_levels:
        if os.path.exists('out'):
            shutil.rmtree('out')
        os.mkdir('out')
        time_write.append(
            timeit(joblib.dump, dataset, 'out/test.pkl', compress=compress))
        du.append(disk_used('out') / 1024.)
        time_read.append(
            timeit(joblib.load, 'out/test.pkl'))
        print '% 15s, compress %i, write % 6.2fs, read % 7.3fs, disk % 5.1fM' % (
            name, compress, time_write[-1], time_read[-1], du[-1])
    # Non-compressed dump, memory-mapped load
    if os.path.exists('out'):
        shutil.rmtree('out')
    os.mkdir('out')
    time_write.append(
        timeit(joblib.dump, dataset, 'out/test.pkl'))
    time_read.append(
        timeit(joblib.load, 'out/test.pkl', mmap_mode='r'))
    du.append(disk_used('out') / 1024.)
    print '% 15s, mmap      , write % 6.2fs, read % 7.3fs, disk % 5.1fM' % (
        name, time_write[-1], time_read[-1], du[-1])
    return '% 10s | %s' % (name, ' | '.join('% 6.2fs/% 7.3fs, % 5.1fM' %
                                            (t_w, t_r, d)
                                            for t_w, t_r, d
                                            in zip(time_write, time_read, du)))


#d = datasets.fetch_olivetti_faces()
#bench_dump(d, 'Olivetti')
#print 80 * '-'
#d = datasets.fetch_20newsgroups()
#bench_dump(d, '20news')
#print 80 * '-'
#d = datasets.fetch_lfw_pairs()
#bench_dump(d, 'lfw_pairs')
#print 80 * '-'
#d = datasets.fetch_species_distributions()
#bench_dump(d, 'Species')

d = datasets.fetch_lfw_people()
print 80 * '-'
#bench_dump(d, 'big people')
# Triple the data to build the 'Big LFW people' case
d.data = np.r_[d.data, d.data, d.data]
print 80 * '-'
bench_dump(d, 'big people')