Hi, comments inline

On Wed, Dec 5, 2012 at 8:10 AM, Abhinav Singh <[email protected]> wrote:

> We are facing an issue where search queries works fine on my local dev box
> (which have riak-1.2.1rc2 installed).
> However same queries timeout on our production boxes (which have
> riak-1.2.1 installed):
>
> 2012-12-05 14:49:59.777 [error] <0.1035.0>@mi_server:handle_info:524
> lookup/range failure:
> {{badfun,#Fun<riak_search_client.9.56347389>},[{mi_server,iterate,6},{mi_server,lookup,8}]}
>
Did you recently upgrade your production boxes? The 'badfun' error
indicates that you currently have a "mixed" cluster. The error occurs
when two or more machines are involved in a query and they are not all
running the same version. This is a bug in Riak Search.

> This query does succeed sometimes (1-5%), but fails most of the times.
> I want to know if the above logs indicate towards a particular error with
> our riak cluster?
>

Yes. In the 1-5% of cases that succeed, the nodes involved in the query
all happen to be running the same version. The reason this is
non-deterministic is that Riak Search uses some randomness at query
time to help spread load around.

> Since this query has never failed on my local development box,
> I suspect either it has to do with something that changed between 1.2.1rc2
> and 1.2.1-stable release or something that is related to our production
> riak cluster.
>

As I said above, I strongly suspect a mixed cluster scenario. That's the
only time I've seen an error like the one above. The second email also
strongly indicates a mixed cluster scenario, given the behavior you are
seeing.

-Z
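
To illustrate why a mixed cluster can make the same query succeed only occasionally, here is a minimal sketch in Python. It is a toy model, not Riak Search's actual node-selection logic: it assumes each query touches a fixed number of distinct nodes picked uniformly at random, and that the query succeeds only when all of those nodes run the same version. The function names and the "old"/"new" version labels are hypothetical.

```python
from math import comb


def is_mixed(version_counts):
    """Return True if more than one version has at least one node."""
    return sum(1 for count in version_counts.values() if count > 0) > 1


def prob_same_version(version_counts, nodes_per_query):
    """Probability that nodes_per_query distinct nodes, chosen uniformly
    at random, all run the same version.

    version_counts maps a version label to the number of nodes running it.
    Uniform random selection is an assumption of this sketch only.
    """
    total = sum(version_counts.values())
    if not 0 < nodes_per_query <= total:
        raise ValueError("nodes_per_query must be between 1 and the node count")
    # Count the ways to draw every node from a single version group;
    # comb(c, k) is 0 when a group has fewer than k nodes.
    same = sum(comb(count, nodes_per_query) for count in version_counts.values())
    return same / comb(total, nodes_per_query)


if __name__ == "__main__":
    cluster = {"old": 3, "new": 3}  # hypothetical half-upgraded cluster
    print("mixed cluster:", is_mixed(cluster))
    for k in (1, 2, 3):
        print(k, "nodes per query:", prob_same_version(cluster, k))
```

Under this model the success rate falls quickly as more nodes take part in each query, and it returns to 1.0 as soon as every node reports the same version.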
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
