Re: Durable writes and parallel reads

Erik Søe Sørensen Tue, 01 Nov 2011 07:53:52 -0700

On 31-10-2011 17:38, Erik Søe Sørensen wrote:
[snip]

Parallel Reads.
---------------
Within a vnode, bitcask read operations happen in serial.
Is there any reason for reads not happening in parallel?

[snip]

I've made a small test of this - just to check that my intuition isn't off track.


In the test, I
- create a 2GB file
- clear the disk caches
- from Erlang, read 1000 randomly placed 1KB blocks from the file.
The last two steps are repeated for different read strategies.
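
For readers who want to try something similar, here is a rough sketch of the procedure. It is written in Python rather than Erlang and is not the original test code; os.pread stands in for file:pread/3, and all function names and the seed parameter are made up for illustration. It does not clear the disk caches; on Linux that is usually done as root with "sync; echo 3 > /proc/sys/vm/drop_caches", which is an assumption about the setup, not something stated above.

```python
import os
import random
import time

BLOCK_SIZE = 1024  # 1KB blocks, as in the test described above


def random_offsets(file_size, count, block_size=BLOCK_SIZE, seed=None):
    """Pick `count` random offsets such that a full block fits in the file."""
    rng = random.Random(seed)
    return [rng.randrange(0, file_size - block_size + 1) for _ in range(count)]


def read_blocks(path, offsets, block_size=BLOCK_SIZE):
    """Read one block per offset, serially, in the order given."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return [os.pread(fd, block_size, off) for off in offsets]
    finally:
        os.close(fd)


def time_per_block_ms(path, offsets):
    """Average wall-clock time per block, in milliseconds."""
    start = time.perf_counter()
    read_blocks(path, offsets)
    return (time.perf_counter() - start) * 1000.0 / len(offsets)
```

The two single-process strategies then differ only in the order of the offsets: time_per_block_ms(path, offsets) versus time_per_block_ms(path, sorted(offsets)), with the caches cleared before each run.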

On my setup (Ubuntu laptop), I get the following read timings (per block):
- Calling file:pread/3 in one process:  8.2ms
- Same, but sort the reads by position: 5.7ms

- Calling file:pread/3 from separate processes (limited to 20 simultaneous outstanding reads): 5.8ms
- Calling file:pread/3 from separate processes (limited to 50 simultaneous outstanding reads): 5.4ms

(NB: This only works if a separate file descriptor is used for each read; otherwise no improvement is observed.)
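
The parallel strategy can be sketched roughly as follows. Again this is Python, not the original Erlang code: a thread pool stands in for the separate Erlang processes, max_outstanding plays the role of the limit on simultaneous outstanding reads, and the function names are hypothetical. Each read opens its own file descriptor to mirror the note above.

```python
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 1024  # 1KB blocks, as in the test described above


def read_one(path, offset, block_size=BLOCK_SIZE):
    """Open a private file descriptor, read one block, close it again."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, block_size, offset)
    finally:
        os.close(fd)


def read_blocks_parallel(path, offsets, max_outstanding=20):
    """Read all blocks with at most `max_outstanding` reads in flight.

    Results come back in the same order as `offsets`.
    """
    with ThreadPoolExecutor(max_workers=max_outstanding) as pool:
        return list(pool.map(lambda off: read_one(path, off), offsets))
```

With many reads outstanding at once, the I/O scheduler gets the chance to reorder them by position, which is presumably why the timings approach those of the sorted case.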

This means that read ordering really does matter - and that the potential performance gains may be as much as 50% (i.e. significant).
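
As a quick check of the arithmetic, the per-block timings above can be turned into percentages in two ways. The small Python sketch below (helper names are made up) suggests that the 50% figure corresponds to the gain in throughput rather than the reduction in time per block.

```python
def time_reduction(baseline_ms, improved_ms):
    """Fraction of the per-block time that is saved."""
    return 1.0 - improved_ms / baseline_ms


def throughput_gain(baseline_ms, improved_ms):
    """Fractional increase in blocks read per unit of time."""
    return baseline_ms / improved_ms - 1.0


# Unsorted single process (8.2ms) versus 50 outstanding parallel reads (5.4ms)
print(f"{time_reduction(8.2, 5.4):.0%}")   # 34%
print(f"{throughput_gain(8.2, 5.4):.0%}")  # 52%
```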

As to whether this also holds in a Riak context, I've tried starting multiple simultaneous instances of these strategies, each working on different files (simulating multiple vnodes working from the same disk), and observed similar improvements (30-45% for three instances).

(For completeness, I must add that this may be highly I/O system dependent. The above numbers are from the 'anticipatory' I/O scheduler strategy for Linux; switching to the 'CFQ' strategy reduces the benefits a lot - and also makes the absolute numbers worse.)


Regards,
Erik Søe Sørensen
Trifork A/S

[Code is available on request.]

