Roger, Riak Search has a hardcoded max result set size of 100K items. It enforces this to prevent blowing out memory and causing other issues. Riak Search definitely has some issues when it comes to handling a use case like yours.
That said, our new Search solution in 2.0 (code named Yokozuna) should do a lot better. Not only does it lack the hardcoded 100K limit, it should also execute queries faster, in some cases by 1-3 orders of magnitude (10-1000x). At that point you're more likely to be slowed down by the map-reduce. You might even be able to remove that stage by using stored fields, but I'd need to know more about your use case.

I agree that current Riak (pre 2.0) is not a general search solution. Riak Search can work very well but it requires some hand holding and careful vigilance over how you index and query the data. I feel that the new Search (Yokozuna) fixes this in many ways. In general, it has more robust search support and lower, more consistent latency.

Yokozuna would also have no issues dealing with 1 million objects. The micro benchmark I run uses 1-10 million objects. Granted, they are small plain-text objects, but I'm fairly confident it would work with your 1 million objects.

I realize that Riak 2.0, and thus the new search functionality, is not out yet. We have an early release, Riak 2.0.0pre5 [1], that you can try. I also do monthly releases of the new search functionality [2]. So if you want to kick the tires I can point you in the right direction.

-Z

[1]: http://docs.basho.com/riak/2.0.0pre5/downloads/
[2]: https://github.com/basho/yokozuna/blob/develop/docs/INSTALL.md

On Wed, Nov 20, 2013 at 11:45 AM, Roger Diller <[email protected]> wrote:

> I could dig up all our nitty gritty Riak details but I don't think that
> will help really.
>
> The point I think is this: Using search map reduce is not a viable way to
> do real time search queries. Especially ones that may have 2000+
> results each. Couple that with search requests coming in every few seconds
> from 300+ customer app instances and you literally bring Riak to its
> knees.
>
> Not that Riak is the problem really, it's just we are using it in a way it
> was not designed for.
> In essence, we are using Riak as a search engine for
> our application data. Correct me if I'm wrong but Riak is more for storing
> large amounts of KV data, but not really for finding that data in a search
> sense.
>
> Am I missing something here? Is there a viable way for doing real time
> search queries on a bucket with 1 million keys?
>
>
> On Mon, Nov 18, 2013 at 5:29 PM, Alexander Sicular <[email protected]> wrote:
>
>> More info please...
>>
>> Version
>> Current config
>> Hardware
>> Data size
>> Search Schema
>> Etc.
>>
>> But I would probably say that your search is returning too many keys to
>> your mr. More inline.
>>
>> @siculars
>> http://siculars.posthaven.com
>>
>> Sent from my iRotaryPhone
>>
>> On Nov 18, 2013, at 13:59, Roger Diller <[email protected]>
>> wrote:
>>
>> Using the Riak Java client, I am executing a search map reduce like this:
>>
>> MapReduceResult result = riakClient.mapReduce(SEARCH_BUCKET,
>> search).execute();
>>
>>
>> ^is this part a typo. Cause otherwise it looks like you do a s>mr, set
>> the search and then another s>mr.
>>
>>
>> String search = "systemId:" + systemName + " AND indexId:" + indexId;
>>
>> MapReduceResult result = riakClient.mapReduce(SEARCH_BUCKET,
>> search).execute();
>>
>> This worked fine when the bucket contained a few thousand keys. Now that
>> we have far more data stored in the bucket (at least 250K keys), it's
>> throwing this generic error:
>>
>> com.basho.riak.client.RiakException: java.io.IOException:
>> {"error":"map_reduce_error"}
>>
>> We've also noticed that storing new key/values in the bucket has slowed
>> WAY down.
>>
>> Any idea what's going on?
>>
>>
>> Your data set is incorrectly sized to your production config.
>>
>> Are there limitations to Search Map Reduce?
>>
>>
>> Certainly
>>
>> Are there configuration options that need changed?
>>
>>
>> Possibly
>>
>> Any help would be greatly appreciated.
>>
>>
>> --
>> Roger Diller
>> Flex Rental Solutions, LLC
>> Email: [email protected]
>> Skype: rogerdiller
>> Time Zone: Eastern Time
>>
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>
>
> --
> Roger Diller
> Flex Rental Solutions, LLC
> Email: [email protected]
> Skype: rogerdiller
> Time Zone: Eastern Time
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
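For reference, the query string from the original post ("systemId:" + systemName + " AND indexId:" + indexId) can be assembled by a small helper before it is handed to the map-reduce call. The following is only a minimal sketch: the class name SearchQueryBuilder and its methods are hypothetical, and it assumes the Lucene-style query syntax in which special characters inside a value are escaped with a backslash. Whether a given Riak Search version honors every escape is an assumption to verify against your cluster. It does not address the 100K result limit discussed above; it only keeps the query text well-formed.

```java
// Hypothetical helper (not part of the Riak Java client): builds the
// "systemId:... AND indexId:..." query string from the original post.
final class SearchQueryBuilder {
    // Characters treated as special in Lucene-style query syntax (assumption:
    // the search backend follows that syntax and accepts backslash escapes).
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\/";

    private SearchQueryBuilder() {}

    // Prefix every special character in the value with a backslash.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Build the query first, so it exists before any mapReduce call uses it.
    static String build(String systemName, String indexId) {
        return "systemId:" + escape(systemName) + " AND indexId:" + escape(indexId);
    }

    public static void main(String[] args) {
        System.out.println(build("acme:west", "42"));
    }
}
```

The resulting string would then be passed where the original post passes `search`, i.e. declared before the `riakClient.mapReduce(SEARCH_BUCKET, search).execute()` call rather than after it.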
