ElasticSearch's clustering is much better than Solr's as well. If you're looking for two products that both work well in a clustered, shared-nothing environment, ElasticSearch + Riak makes more sense to me than Solr + Riak.
- Alex

----- Original Message -----
From: "Guido Medina" <[email protected]>
To: "riak-users" <[email protected]>
Sent: Thursday, November 1, 2012 5:34:25 AM
Subject: Re: Web doc buglet

You will realize that one specific solution doesn't solve everything, so you will probably have to use a piece from each, based on its strengths. Riak = Key-Value. HAProxy (not related) is a good TCP balancer; we use it and it works like a charm. Here is a sample configuration: https://gist.github.com/1507077

And finally, I meant that ElasticSearch could be a Solr replacement. The strength of both is text index searching, and both have geospatial search tools, including polygons, convex hull algorithms, etc. I know Solr needs to be heavily configured to get the advanced geo stuff working, which in ElasticSearch is already in place or way easier to configure. Both are based on Lucene, so either will be as fast. ElasticSearch has had near-real-time indexing enabled for a while, whereas Solr only enabled it in version 4, which was released a month ago and for which documentation is hard to find.

Hope that helps,

Guido.

On 31/10/12 22:50, Tin Le wrote:

For some reason, I did not get the original reply from Mark. Only saw it when it was included in Guido's email.

HAProxy + Riak + ElasticSearch are your friends. Solr lacks documentation (what exists is way outdated), and it is hard to find samples or how to get stuff done, so if you have your cluster well set up and you mean to do only key-value retrieval with the assistance of text index search using ElasticSearch, you are good.

*Note:* We have Solr for geospatial functionality and it is amazingly fast, but there isn't much we can do with it; if you need complex polygon features it gets complicated in Solr. Except for some incubation projects that will be brought into Solr, it is kind of hard to do anything.

We are not using Solr at the moment, and I don't want to add yet another piece into our infrastructure unless I really, really have to. If it is the best option for our needs, I'll use it.
Hope that helps,

Guido.

On 31/10/12 17:15, Mark Phillips wrote:

One of the things we've found missing and really need is the geospatial indexing mongodb has. We've just pushed our updated app to both iTunes and the Google Play Store, and it uses this as an intrinsic part of the app. We decided to stay with mongo as there was no time to code up an equivalent for riak.

So Riak isn't really a great fit for geospatial right now. You might be able to fake it to some extent using secondary indexes and doing range queries on lat/longs (stored as ints), but it might not be too performant. The other thing worth noting is that Ryan Zezeski has been hacking on a revised implementation of Riak Search that ties Riak and Solr together quite nicely (and thus supports geospatial). It's called Yokozuna [1], and it's still alpha, but it's worth looking at and testing (this code is getting more stable pretty quickly).

What are the specifics of the use case?

Our app is a popular music recognition app for iOS, Android and other mobile OSes (100+ million users). We just pushed out a feature that allows users to discover songs being "identified" by nearby users on a map. It works globally. You can see what songs are being ID'ed around you. As users ID a song, the song and user location are added to mongo. This info can be queried and displayed on a proximity map.

Since we were more familiar with mongo, and were under time pressure to get this out, recoding would not work for us. We will see how mongo scales for us over time for this particular feature. No rush to go to riak now.

I have new HW on order for intended production usage. It will be a 5-node cluster, each node with 16 cores, 64GB RAM (upgradeable to 512GB), and 2TB in a RAID10 config. If riak tests out, we will go with this HW. Otherwise, wipe and put mongo on it.

The test bed originally used 64 partitions with 3 nodes. But as I was adding 2 more nodes, I read somewhere in the docs that I should use 256 or 512 with more than 3 nodes.
So I wiped and upped it to 256 partitions on the 5-node test bed.

Regarding the docs: I would like to see a Best Practices section with sizing HOWTOs and other configuration recommendations. Perhaps a list of usage scenarios and HW/SW configuration recs. An explanation of why to pick a certain config and what needs to be adjusted is helpful for Operations/SysAdmins. Current docs are fine for getting your feet wet, just not enough for long-term production usage.

I have more experience with mongo in a production environment than I have with riak. So anything that helps make it easier for me to get up to speed is definitely good. Right now, I am the sole person championing riak at $WORK$. So it's tough, heh :)

Tin

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
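
For anyone reading this thread later: Mark's suggestion of faking proximity lookups with secondary indexes and range queries on integer lat/longs could look roughly like the sketch below. It is plain Python with no Riak client calls, so every name in it (`SCALE`, `to_int`, `bounding_box`, `in_box`) is hypothetical, and the microdegree precision and the 111.32 km-per-degree figure are assumptions. The idea is to store each coordinate as an integer, compute a bounding box around the query point, run a range query on one coordinate's index, and filter the other coordinate client-side, assuming a range query covers a single index at a time.

```python
import math

# Assumed precision: integers hold microdegrees (1e-6 degree).
SCALE = 1_000_000
# Approximate kilometres per degree of latitude (an assumption, good
# enough for a coarse bounding box).
KM_PER_DEGREE = 111.32


def to_int(degrees: float) -> int:
    """Convert a coordinate in degrees to an integer index value."""
    return round(degrees * SCALE)


def bounding_box(lat: float, lon: float, radius_km: float):
    """Return ((lat_lo, lat_hi), (lon_lo, lon_hi)) as integer ranges.

    This ignores the poles and the antimeridian, so it is only a
    sketch for mid-latitude queries.
    """
    dlat = radius_km / KM_PER_DEGREE
    dlon = radius_km / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    return (
        (to_int(lat - dlat), to_int(lat + dlat)),
        (to_int(lon - dlon), to_int(lon + dlon)),
    )


def in_box(point, box) -> bool:
    """Client-side filter for results of a single-index range query."""
    (lat_lo, lat_hi), (lon_lo, lon_hi) = box
    lat_i, lon_i = point
    return lat_lo <= lat_i <= lat_hi and lon_lo <= lon_i <= lon_hi


if __name__ == "__main__":
    box = bounding_box(37.7749, -122.4194, 5.0)
    lat_range, lon_range = box
    # lat_range would be the min/max of a range query on the latitude index;
    # each candidate that comes back is then checked with in_box().
    print(lat_range, lon_range)
    print(in_box((to_int(37.78), to_int(-122.41)), box))
```

The box is only a coarse pre-filter: a real proximity query would still need an exact distance check on the candidates, and, as Mark says, this may not perform well on large data sets.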
