Elias,

> I could see that being something some folks want. From my point of view, I
> find that the existing design of one core per bucket may be more useful, so
> long as I can search across cores with similar schemas (I created an
> issue <https://github.com/basho/yokozuna/issues/87> to track that feature), as
> it allows me to easily drop the index for a
> bucket. In a multi-tenant environment, where you may have an index per
> customer, this is rather useful. A lot less painful than trying to delete
> the index (and data) by performing a key listing and delete operations.
>
Well, you still can't avoid the key-listing/delete for Riak itself. For Solr this would be a delete-by-query, which isn't nearly as expensive.

>
> As I've expressed before, I wish buckets behaved the same way, segregating
> their data into distinct backends, but I understand that this results in
> lower resource usage, as things like LevelDB caches would then not be
> shared and you'd need additional file descriptors. At the very least, it
> would be great if backend instances could be
> created programmatically through the HTTP or PB API, rather than having to
> modify app.config and perform a rolling restart. That's not very
> operationally friendly.
>

Yes, there are benefits to be had both ways. Segregating the actual backend instances allows for an efficient drop of an entire bucket, but adds strain in terms of file descriptors and I/O contention. Multi-backend sorta helps but is static in nature, as you mention.

>
> As for large number of cores, I could see some folks creating many of
> them. Buckets are relatively cheap, since by default they are all stored
> in the default backend instance. Their only cost being
> the additional network traffic for gossiping non-default bucket properties.
> So folks create them freely. Once Yokozuna is better documented, it should
> be pointed out that the same is not true of a bucket's index, since they
> create one core per bucket. So an indexed bucket has quite a bit more
> static overhead than a non-indexed one.
>

Good point.

>
> If you use Riak and have 300 customers, you can easily create a bucket per
> customer, even if you only have 64 partitions and are using Riak Search on
> all of them, as Search stores all the data in the same merge index backend.
> You may want to think twice before upgrading such a cluster to Yokozuna.
>

Well, Riak Search will have issues as well. First, each bucket will require a pre-commit hook to be installed, which means custom bucket properties must be copied into the ring.
There is a known drawback with Riak where many bucket properties greatly reduce ring gossip throughput and can cause issues. I believe Joseph Blomstedt may have some patches going into the next release that will improve this, but ultimately we need to get bucket properties out of the ring.

Even if that is solved, Riak Search will have other tradeoffs, such as substantially reduced feature support compared to Yokozuna as well as reduced performance for many types of queries. But I do agree many indexes (thus cores) could pose a problem for Yokozuna.

-Z
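
P.S. To make the delete-by-query point above concrete, here is a minimal sketch of what the Solr side of dropping one customer's index could look like. It only builds the request and does not send it. The base URL (port 8093), the core name "customer_42", and the helper name are assumptions for illustration, not something confirmed in this thread. The Riak objects themselves would still need the key listing and deletes discussed above.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_delete_by_query(base_url, core, query="*:*"):
    """Build (but do not send) a Solr delete-by-query request for one core.

    base_url and core are hypothetical values supplied by the caller.
    """
    # Solr's JSON update handler accepts {"delete": {"query": ...}}.
    url = f"{base_url.rstrip('/')}/{quote(core, safe='')}/update?commit=true"
    body = json.dumps({"delete": {"query": query}}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed Solr endpoint and core name, for illustration only.
req = build_delete_by_query("http://localhost:8093/solr", "customer_42")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the resulting object to urllib.request.urlopen would issue a single request per core, which is the contrast with walking every key in a bucket.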
