Another thing 2i lacks is the ability to set the R-value. Just as
when GETting a specific key, there is a tradeoff between consistency
and performance in choosing the set of vnodes to query.
The 2i query implementation in Riak chooses performance over consistency
by querying only the minimum set of vnodes (1/n), which is the equivalent
of R=1. In other words, 2i queries have no entropy tolerance. If an
object is missing from a given vnode (not yet fixed by AAE), you will get
inconsistent results in approximately 1/n of your 2i queries for it.
Riak chooses the 1/n vnode set somewhat randomly, so running the same
2i query multiple times can return different sets of keys.
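
To make the 1/n estimate concrete, here is a toy Python simulation. It is not Riak code: `simulate_2i_query`, `miss_fraction` and the replica sets are made up for illustration, and a real 2i query reads from a covering set of vnodes rather than from one whole replica. It only shows that reading from one randomly chosen replica out of n misses a key about 1/n of the time when one replica lacks it.

```python
import random
from fractions import Fraction


def simulate_2i_query(replicas, rng):
    """Toy model of an R=1 read: return the keys held by one randomly chosen replica."""
    return set(rng.choice(replicas))


def miss_fraction(replicas, key, runs, rng):
    """Fraction of simulated queries whose result does not contain `key`."""
    misses = sum(key not in simulate_2i_query(replicas, rng) for _ in range(runs))
    return Fraction(misses, runs)


# n = 3 replicas of the same index; the third has not yet received "k3".
replicas = [{"k1", "k2", "k3"}, {"k1", "k2", "k3"}, {"k1", "k2"}]
rng = random.Random(42)  # fixed seed so the run is repeatable
observed = miss_fraction(replicas, "k3", 10_000, rng)
expected = Fraction(1, len(replicas))
print(f"observed miss rate {float(observed):.3f}, expected about {float(expected):.3f}")
```

Repeating the simulated query also reproduces the symptom described above: some runs return three keys and some return two.
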
There is currently no easy way to change that for high-consistency use
cases. As an experiment, I have tried modifying Riak's source code to
make it query all vnodes, which seemingly works fine. However, Riak will
then return (up to) n copies of each key, so these need to be deduped by
the client. Obviously, this alternative approach favours consistency
over performance, and will be about n times as expensive (~3 times for
the default n_val of 3).
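
The client-side dedupe step could look roughly like the Python sketch below. The function name `dedupe_keys` and the sample key list are hypothetical; the sketch assumes the modified query hands back a flat list of keys in which each key may appear up to n times.

```python
def dedupe_keys(keys):
    """Drop repeated keys while keeping first-seen order."""
    return list(dict.fromkeys(keys))


# Hypothetical result of querying all three replicas: "b" is missing from one.
raw = ["a", "b", "c", "a", "c", "a", "b", "c"]
print(dedupe_keys(raw))
```

Because the result is the union of what all replicas returned, a key that is missing from one replica still shows up as long as any other replica has it.
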
Also remember that AAE will fix entropy eventually, so the windows for
inconsistent 2i queries will be closed by Riak itself after some time.
You can look at the AAE logs to deduce the level of entropy you are
running with and guesstimate its impact on the consistency of
your 2i queries.
- Rune, Trifork
On 07-10-2013 23:46, Jon Meredith wrote:
Hi Brady,
The 2I indices are written in the same store as the main objects
whenever the main object is updated. If a primary node is down, the
indices will be written to a fallback node. When the fallback sees
the primary come back online and stops receiving requests for that
partition it will send the main object back to the primary and that
will re-index it.
The docs could benefit from a little clarification. Secondary indices
do benefit from read repair; that is, if the main object is spotted as
being out of date or missing during a get, it is rewritten with the
up-to-date information on all nodes. The anti-entropy mechanism that we
are currently missing is spotting corruption within leveldb itself.
For example, if part of the leveldb database storing a vnode is
corrupted so that the .sst files containing the index entries are
destroyed, there is no mechanism to spot and repair that. We are
intending to add that for the next major release.
Jon
On Mon, Oct 7, 2013 at 2:39 PM, Brady Wetherington wrote:
What happens to your 2i indexes if you do a write and one of the
nodes you're trying to write to is down?
http://docs.basho.com/riak/latest/dev/using/2i/ says:
* When you want or need anti-entropy. Since 2i is just metadata
on the KV object and the indexes reside on the same node, 2i
piggybacks off of read-repair.
But
http://docs.basho.com/riak/latest/ops/running/recovery/repairing-indexes/
says:
Riak Secondary indexes (2i) currently have no form of anti-entropy
(such as read-repair). Furthermore, for performance and load
balancing reasons, 2i reads from 1 random node. This means that
when a replica loss has occurred, inconsistent results may be
returned.
I am building a solution around 2i - so I just wanted to know if
there was any way to clarify these points - how resilient are
these indexes? Under what circumstances will they stop working (or
return inconsistent results)?
-B.
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
--
Jon Meredith
VP, Engineering
Basho Technologies, Inc.
[email protected]
--
Best regards / Venlig hilsen
*Rune Skou Larsen*
Trifork Public A/S
Dyssen 1, 8200 Århus N, Denmark
Phone: +45 3160 2497 Skype: runeskoularsen twitter: @RuneSkouLarsen