Alex,

Your question is outside my knowledge space, so I had to ask around. This is the message that came back:

You can grab _ALL_ stats for a node over HTTP at HOST:PORT/stats. This gets you one big JSON blob of all the stats. The stat in question has the JSON key 'leveldb_read_block_error'; the value will be either "undefined" (if there is no leveldb backend) or an integer.
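If you want to automate the check, here is a rough, untested sketch; the host, port, and alerting logic are just illustrative assumptions:

# Rough sketch (untested): poll the node's /stats endpoint and check the
# leveldb read-error counter. The host, port, and alerting logic here are
# illustrative assumptions, not part of Riak itself.
import json
import urllib.request

STATS_URL = "http://127.0.0.1:8098/stats"  # substitute your HOST:PORT

def leveldb_read_block_errors(url=STATS_URL):
    """Return the leveldb_read_block_error count, or None when the stat
    is the string "undefined" (i.e. the node has no leveldb backend)."""
    with urllib.request.urlopen(url) as resp:
        stats = json.load(resp)
    value = stats.get("leveldb_read_block_error")
    return None if value == "undefined" else value

errors = leveldb_read_block_errors()
if errors:  # None and 0 both mean nothing to report
    print("leveldb reported %d read/corruption errors" % errors)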
Matthew

On Nov 27, 2012, at 1:37 PM, Alex Babkin <[email protected]> wrote:

> Thank you for the quick response, Matt.
>
> So you are saying that I will have facilities in Riak 1.3 to handle these
> errors in the application layer, automatically by Riak?
>
> Alex
>
>
> On Mon, Nov 26, 2012 at 2:09 PM, Matthew Von-Maszewski <[email protected]>
> wrote:
> Alex,
>
> The eleveldb backend creates a CRC for every item placed on the disk. You
> can activate the test of the CRC on every read by adding:
>
> {verify_checksums, true},
>
> to the "{eleveldb" portion of app.config. With Riak 1.2, you must manually
> monitor each vnode directory for the lost/BLOCKS.bad file changing size. It
> only grows when a read operation detects a CRC and/or compression
> corruption error.
>
> Manually monitoring the BLOCKS.bad file is tacky (my apologies). The
> upcoming 1.3 release will add a counter of the errors seen to riak-admin,
> but that code is still weeks from release.
>
> Matthew
>
> On Nov 26, 2012, at 1:25 PM, Alex Babkin <[email protected]> wrote:
>
> > Hi all
> >
> > First post here, so please be kind :)
> >
> > I plan to build an experimental Riak cluster out of cheap ARM computing
> > parts and consumer-grade SSDs, to measure performance and assess
> > production viability. I plan to use LevelDB as the backend.
> >
> > One thing to be concerned about, in light of various SSD failure
> > stories, is of course a scenario of SSD failure and, in particular, the
> > way it fails (some parts of the SSD just aren't writable anymore, but
> > are still readable, i.e. stuck at some constant value). This could
> > result in a scenario where a replicated record on two nodes, one with a
> > working SSD and one with a faulty one, ends up with different data. Will
> > Riak try to account for this scenario?
> >
> > I'm trying to think of ways to mitigate the risk of nodes failing due to
> > these SSD faults, or at least to get an early indication of a failure
> > (however insignificant it may be).
> > I guess my first question should be: does Riak provide any form of
> > checksum on the data it reads/writes, or does it blindly trust that the
> > backend/filesystem reads/writes data correctly?
> >
> > If not, are there any other tricks people use to trigger alarm bells
> > that an SSD is 'going'?
> >
> > Thanks
> > Alex
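P.S. For anyone on Riak 1.2 wanting to automate the BLOCKS.bad workaround described above, here is a rough, untested sketch. The data_root path and the per-vnode directory layout are assumptions; match them to your own app.config.

# Rough sketch (untested) of the Riak 1.2 workaround described above: flag
# growth of each vnode's lost/BLOCKS.bad file. DATA_ROOT and the directory
# layout are assumptions -- match them to the {eleveldb, [{data_root, ...}]}
# setting in your app.config. A real monitor would persist sizes between runs.
import glob
import os

DATA_ROOT = "/var/lib/riak/leveldb"  # assumed eleveldb data_root
last_sizes = {}

def grown_blocks_bad(data_root=DATA_ROOT):
    """Return (path, old_size, new_size) for BLOCKS.bad files that grew
    since the previous call; growth means reads hit CRC/compression errors."""
    grown = []
    for path in glob.glob(os.path.join(data_root, "*", "lost", "BLOCKS.bad")):
        size = os.path.getsize(path)
        old = last_sizes.get(path, 0)
        if size > old:
            grown.append((path, old, size))
        last_sizes[path] = size
    return grown

for path, old, new in grown_blocks_bad():
    print("possible SSD trouble: %s grew %d -> %d bytes" % (path, old, new))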
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
