Matthew,

If you are talking about rotational media, the more readers you add, the
worse your aggregate bandwidth is going to be... Since LMDB stores the
data as a btree, concurrent readers turn into random access, which means
a lot of seeking. The seek time gets amortized as a higher average time
to read each block / page, and your aggregate bandwidth disappears.
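To put rough numbers on it (the figures below are assumptions for
illustration, not measurements): every switch between streams costs a
seek, so effective per-stream bandwidth is about
io_size / (seek_time + io_size / disk_bw). A quick model in C:

    /* Back-of-envelope seek amortization model. Assumed numbers:
     * 180MB/s sequential bandwidth, 10ms average seek + rotation. */
    #include <stdio.h>

    int main(void) {
        double disk_bw = 180e6;  /* bytes/s, assumed sequential rate */
        double seek    = 10e-3;  /* seconds per seek, assumed */
        double io[] = { 4096.0, 128.0 * 1024, 4.0 * 1024 * 1024 };
        for (int i = 0; i < 3; i++) {
            double t = seek + io[i] / disk_bw;  /* time per I/O */
            printf("io=%9.0fB  effective bw=%6.1f MB/s\n",
                   io[i], io[i] / t / 1e6);
        }
        return 0;
    }

Interleaved readers push each stream toward the small-I/O end of that
curve; larger readahead grows the effective io_size per seek and pushes
it back toward the sequential rate.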
If you have enough memory to store most of the data, or your working
set is only a small subset of that data, this won't be as visible.

Best,
- Milosz

On Thu, Feb 26, 2015 at 5:50 PM, Matthew Moskewicz
<moske...@alumni.princeton.edu> wrote:
>
> warnings: new to list, first post, lmdb noob.
>
> i'm a caffe user:
> https://github.com/BVLC/caffe
>
> in one use case, caffe sequentially streams through >100GB lmdbs at a
> rate of ~30MB/s in blocks of about 40MB. however, if multiple caffe
> processes are reading the same lmdb (opened with MDB_RDONLY), read
> performance becomes limiting (i.e. the processes become IO bound), even
> though the disk has sufficient read bandwidth (say ~180MB/s). some of
> the relevant caffe lmdb code is here:
>
> https://github.com/BVLC/caffe/blob/master/src/caffe/util/db.cpp
>
> however, if i *both*
> 1) run blockdev --setra 65536 --setfra 65536 /dev/sdwhatever
> 2) modify lmdb to call posix_madvise(env->me_map, env->me_mapsize,
> POSIX_MADV_SEQUENTIAL);
>
> then i can get >1 reader to run without being IO limited.
>
> for (2), see https://github.com/moskewcz/scratch/tree/lmdb_seq_read_opt
>
> similarly, using a sequential read microbenchmark designed to model the
> caffe reads from here:
> https://github.com/moskewcz/boda/blob/master/src/lmdbif.cc
>
> if i run one reader, i get 180MB/s bandwidth.
> with two readers, but neither (1) nor (2) above, each gets ~30MB/s
> bandwidth.
> with (1) and (2) enabled, and two readers, each gets ~90MB/s bandwidth.
>
> any advice?
>
> mwm
>
> PS: backstory (skippable):
> caffe originally used LevelDB to get better read performance for
> sequentially loading sets of ~1M 227x227x3 raw images (~200GB data).
> typical processing time is ~2 hours for this data set size, yielding a
> read BW need of 30MB/s or so. it's not really clear if/why LevelDB was
> used aside from the fact that the caffe author was a google intern at
> the time he wrote it, but anecdotally i think the claim is that reading
> the raw .jpgs had perf. issues, although it's unclear exactly what or
> why. i guess it was the usual story about not getting sequential reads
> without using LevelDB. they switched to lmdb a while back.

--
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: mil...@adfin.com
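A minimal standalone sketch of the hint from item (2) above, applied to
an arbitrary read-only mapping. This is an illustration of the same
posix_madvise call, not the actual patch in the linked lmdb_seq_read_opt
branch (which applies it to LMDB's own env->me_map / env->me_mapsize):

    /* Map a file read-only and hint sequential access so the kernel
     * does aggressive readahead on the mapping. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv) {
        if (argc != 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            return 1;
        }
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED) { perror("mmap"); return 1; }

        /* The key call: advise sequential access over the whole map. */
        int rc = posix_madvise(map, st.st_size, POSIX_MADV_SEQUENTIAL);
        if (rc != 0)
            fprintf(stderr, "posix_madvise failed: %d\n", rc);

        /* ... sequential reads through `map` go here ... */

        munmap(map, st.st_size);
        close(fd);
        return 0;
    }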