Hi.

Recently we've discovered some unexpected behavior. We are using
Riak 1.4.2 with some buckets that have allow_mult=true.
We tested our app under load and found that concurrent writes into a
bucket with allow_mult=true turn Riak into an unresponsive slowpoke and
can even crash it.

A Core i3 with 4GB RAM performs only 20 writes/sec with 5 client threads
writing 20 short strings into 20 keys in a bucket with allow_mult=true,
search=false. With 40 values per 40 keys it performs only 6 writes/sec.
At 60x60, Riak crashes.
Throughput drops drastically. OK, we did not change the concurrency factor
(5) and increased our data set 4x, but why does *throughput* drop?
OK, so we increase our dataset linearly instead: 20 strings * 20 keys,
40 strings * 20 keys, 60 strings * 20 keys... The result is the same: an
exponential throughput drop with a crash at the end.
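
To make the numbers above easier to reason about, here is a toy model of
that workload. It is only a sketch under a loud assumption: that each write
reaches Riak without a vector clock, so an allow_mult=true bucket keeps
every earlier value as a sibling. It does not use the Riak Java client at
all; SiblingGrowthSketch, blindWrite and totalSiblingsHandled are
hypothetical names invented for this illustration, and the model says
nothing about what Riak does internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of an allow_mult=true bucket. NOT the Riak client API.
class SiblingGrowthSketch {

    // key -> every sibling value currently kept for that key
    private final Map<String, List<String>> store = new HashMap<>();

    // Assumed "blind" write: no causal context is supplied, so the store
    // cannot tell which earlier values are superseded and keeps them all.
    // Returns how many sibling values the object carries after the write.
    int blindWrite(String key, String value) {
        List<String> siblings = store.computeIfAbsent(key, k -> new ArrayList<>());
        siblings.add(value);
        return siblings.size();
    }

    // Sums the object size (in sibling values) seen by every write of a run
    // that writes valuesPerKey values into each of `keys` keys.
    static long totalSiblingsHandled(int keys, int valuesPerKey) {
        SiblingGrowthSketch bucket = new SiblingGrowthSketch();
        long total = 0;
        for (int k = 0; k < keys; k++) {
            for (int v = 0; v < valuesPerKey; v++) {
                total += bucket.blindWrite("key-" + k, "value-" + v);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (int n : new int[] {20, 40, 60}) {
            System.out.println(n + " values x 20 keys: " + (20 * n) + " writes, "
                    + totalSiblingsHandled(20, n) + " sibling values handled");
        }
    }
}
```

In this model the number of writes grows linearly (400, 800, 1200) while
the sibling values handled grow quadratically (4200, 16400, 36600). If the
assumption holds, that would be one possible reason why throughput falls
even though the concurrency factor stays at 5.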

A cluster of five Amazon EC2 cc2.8xlarge nodes becomes unresponsive, with
throughput of 1-5 writes/sec, with only 80-100 values per 1-10 keys.

So, we think it is very strange.

Here you can check our code sample (in Java) that reproduces this behavior:
https://bitbucket.org/vsnisei/riak-allow_mult_wtf

We asked Basho about this, but they said that we "think SQLish" and asked
us for $5k for a 2-day consultation to resolve our problem.
So I've decided to ask here: are we really so stupid that we can't
understand some simple things, or did Basho not understand us correctly?


*Anyway, it looks like a DoS/DDoS attack exploiting this behavior could be
devised. An attacker would only need to know that some
service/application/website uses Riak with allow_mult buckets and then
provoke concurrent writes into them...*

Actually, our question to Basho was broader. Our application needs to
implement one-to-many bindings. According to the documentation, Riak allows
the following approaches to simulate such bindings:

   1. Riak Search. But we've found that it's VERY slow (a 20x performance
   drop when search is enabled, even for simple objects like {source_id: xxx,
   target_id: yyy}). We've also found that search is not really scalable:
   adding new nodes to the cluster does not increase throughput and even
   slows the cluster down.
   2. Secondary indexes. But, according to the docs, they work only on
   LevelDB, and we need Bitcask.
   3. Link walking. But, according to the docs, it's a "rest only operation",
   and in the Java driver it's implemented as a hack.
   4. allow_mult. But we've found that it's just a nightmare. *So we told
   Basho about this and gave them a link to our example, but they didn't
   give us any feedback.*
   5. Bucket key enumeration. But, according to the docs, this operation
   causes a full key scan on each node and must not be used in production.
   6. MapReduce queries. OK, we haven't tried them yet; maybe they really
   are a silver bullet. But according to the docs (and common sense),
   MapReduce causes a full scan (of the bucket at least, or of all keys?)
   and is an operation with unpredictable latency.
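
For approach 4, the way we understand a one-to-many binding on top of
allow_mult is roughly this: each object holds the set of target ids for
one source id, and when a read returns several siblings the client merges
them by set union before writing the result back. The sketch below shows
only that merge step, as plain Java with no Riak client calls.
BindingSetMerge and resolve are hypothetical names, and a plain union is
an assumption that cannot express removals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Client-side sibling merge for a one-to-many binding. NOT the Riak client API.
class BindingSetMerge {

    // Each sibling is one version of the target-id set for a single source id.
    // The merged value is the union of all versions, so no concurrent add is lost.
    static Set<String> resolve(List<Set<String>> siblings) {
        Set<String> merged = new TreeSet<>();
        for (Set<String> sibling : siblings) {
            merged.addAll(sibling);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Set<String>> siblings = new ArrayList<>();
        siblings.add(new TreeSet<>(Arrays.asList("t1", "t2")));
        siblings.add(new TreeSet<>(Arrays.asList("t2", "t3")));
        System.out.println(resolve(siblings));
    }
}
```

Whether writing the merged value back actually collapses the siblings
depends on sending it with the context obtained from the read, which this
sketch does not model.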

So, where are we wrong? Is the behavior I've described normal? Have we
misunderstood Riak completely and should pay $5k for some mind expansion,
or is there no hidden mystical knowledge, and they would tell us nothing
beyond the approaches listed above?
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com