On Wed, Dec 18, 2013 at 8:32 AM, Erik Søe Sørensen <[email protected]> wrote:

> It really is not a good idea to use siblings to represent 1-to-many
> relations. That's not what it's intended for, nor what it's optimized for...
>
Ok, understood.


> Can you tell us exactly why you need Bitcask rather than LevelDB? 2i would
> probably do it.
>
1) According to
http://docs.basho.com/riak/latest/ops/running/backups/#LevelDB-Backups ,
backups are a real pain to implement with LevelDB.
2) According to
http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/ , reads
may be slower than with Bitcask, and read latency is critical for us.

> Otherwise, storing a list of items under each key could be a solution,
> depending of course on the number of items per key. (But do perform
> conflict resolution.)
>
Why is any conflict resolution required? As far as I understand, with
allow_mult=true Riak should just collect all the values written to a key
without any additional work. What design decision leads to an exponential
slowdown and crashes when multiple values are allowed for a single key? So,
what is the REAL purpose of allow_mult=true if it is a bad idea to use it
for an unlimited number of values per key?
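
To make the question concrete, here is a minimal sketch of what I
understand "storing a list under each key and performing conflict
resolution" to mean. It is a toy in-memory model, not the Riak client API:
ToyBucket, blind_write, resolved_write and resolve are names invented for
illustration, and it ignores vector clocks and deletes entirely.

```python
from typing import Dict, FrozenSet, Iterable, List

Sibling = FrozenSet[str]  # one stored version of the item list for a key


def resolve(siblings: List[Sibling]) -> Sibling:
    """Merge all siblings into a single value by set union."""
    merged = set()
    for sibling in siblings:
        merged |= sibling
    return frozenset(merged)


class ToyBucket:
    """Hypothetical in-memory stand-in for an allow_mult=true bucket."""

    def __init__(self) -> None:
        self.siblings: Dict[str, List[Sibling]] = {}

    def blind_write(self, key: str, value: Iterable[str]) -> None:
        # Write without reading first: the store cannot tell which existing
        # version this replaces, so it keeps the value as one more sibling.
        self.siblings.setdefault(key, []).append(frozenset(value))

    def resolved_write(self, key: str, item: str) -> None:
        # Read every sibling, merge them, add the new item, and write back
        # a single value, so the sibling count drops to one.
        current = resolve(self.siblings.get(key, []))
        self.siblings[key] = [current | {item}]
```

In this model, five blind writes to one key leave five siblings behind,
while a single resolved_write afterwards collapses them into one value
holding all six items.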

OK, the documentation contains the following paragraph:

> Sibling explosion occurs when an object rapidly collects siblings without
> being reconciled. This can lead to a myriad of issues. Having an enormous
> object in your node can cause reads of that object to crash the entire
> node. Other issues are increased cluster latency as the object is
> replicated and out of memory errors.

But it does not say whether this applies to allow_mult=false or to both
settings.
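
If that paragraph is the explanation, a rough cost model might look like
the sketch below. The assumption (mine, not confirmed by the docs) is that
the k-th write to a key has to carry every sibling already stored for it,
so the work per write grows with the number of unreconciled values.
values_handled and resolve_every are hypothetical names.

```python
def values_handled(writes_per_key: int, resolve_every: int = 0) -> int:
    """Total sibling values touched across a run of writes to one key.

    Assumes each write must carry every sibling already stored.
    resolve_every=0 means the siblings are never merged.
    """
    total = 0
    siblings = 0
    for n in range(1, writes_per_key + 1):
        siblings += 1          # the new value joins the existing siblings
        total += siblings      # this write handles all of them
        if resolve_every and n % resolve_every == 0:
            siblings = 1       # merged back into a single value
    return total
```

Under this assumption, 20, 40 and 60 unresolved writes per key handle 210,
820 and 1830 values respectively, which is quadratic rather than
exponential growth, while resolving on every write keeps 60 writes at 119.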

So, is the only solution LevelDB + 2i?


> -------- Original message --------
> From: Viable Nisei <[email protected]>
> Date:
> To: [email protected]
> Subject: May allow_mult cause DoS?
>
>
> Hi.
>
> Recently we discovered that something is behaving unexpectedly. We are
> using Riak 1.4.2 with some buckets set to allow_mult=true.
> We tested our app under load and found that concurrent writes into a
> bucket with allow_mult=true turn Riak into an unresponsive slowpoke and
> can even crash it.
>
> A Core i3 with 4 GB RAM performs only 20 writes/sec with 5 client threads
> writing 20 short strings into 20 keys in a bucket with allow_mult=true,
> search=false. With 40 values per 40 keys it performs only 6 writes/sec.
> 60x60 causes Riak to crash.
> Throughput drops drastically. OK, we have not changed the concurrency
> factor (5) and have increased our data set 4x, but why does throughput
> drop?
> OK, we increase our data set linearly: 20 strings * 20 keys, 40 strings *
> 20 keys, 60 strings * 20 keys... The results are the same: an exponential
> throughput drop with a crash at the end.
>
> A cluster of five Amazon EC2 cc2.8xlarge nodes becomes unresponsive, with
> throughput of 1-5 writes/sec, at only 80-100 values per 1-10 keys.
>
> So, we think it is very strange.
>
> Here you can check our code sample (in Java) that reproduces this behavior:
> https://bitbucket.org/vsnisei/riak-allow_mult_wtf
>
> So, we asked Basho about this, but they said that we "think SQLish"
> and asked us for $5k for a 2-day consultation to resolve our problem.
> So, I decided to ask here whether we are really so stupid that we cannot
> understand some simple things, or whether Basho did not understand us
> correctly.
>
> Anyway, it looks like a DoS/DDoS attack exploiting this behavior could
> be devised. An attacker would only need to know that some
> service/application/website is using Riak with allow_mult buckets and
> then provoke concurrent writes into them...
>
> Actually, our question to Basho was broader. Our application needs to
> implement 1-to-many bindings. According to the documentation, Riak allows
> the following approaches to simulate such bindings:
>
>  1.  Riak Search. But we found that it is VERY slow (a 20x performance
> drop when search is enabled, even for simple objects like {source_id: xxx,
> target_id: yyy}). We also found that search is not really scalable:
> adding new nodes to the cluster does not increase throughput and even
> slows the cluster down...
>  2.  Secondary indexes. But, according to the docs, they work only on
> LevelDB, and we need Bitcask.
>  3.  Link walking. But, according to the docs, it is a "rest only
> operation", and in the Java driver it is implemented as a hack.
>  4.  allow_mult. But we found that it is just a nightmare. We told
> Basho about this and gave them a link to our example, but they did not
> give us any feedback.
>  5.  Bucket key enumeration. But, according to the docs, this operation
> causes a full key scan on each node and must not be used in production.
>  6.  MapReduce queries. OK, we have not tried them yet; maybe they are a
> silver bullet, really. But according to the docs (and common sense),
> MapReduce causes a full scan (of the bucket at least, or of all keys?)
> and is an operation with unpredictable latency.
>
> So, where are we wrong? Is the behavior I have described expected? Have
> we misunderstood Riak completely, and should we pay $5k for some
> mind-expansion, or is there no hidden mystical knowledge, and they will
> not tell us anything except the approaches listed above?
>
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com