Users and customers have reported the following issues with the MapReduce subsystem in Riak 1.1.0:

1) MapReduce fails in clusters that contain both 1.1.0 and older nodes
2) Javascript MapReduce jobs leak Javascript VMs under some failure conditions
3) Keylisting continues after bucket MapReduce is cancelled/timeout

We are actively investigating these issues and intend to resolve them in an upcoming point release. We will update the individual issue trackers as we make progress.

= Issue: MapReduce fails in clusters that contain both 1.1.0 and older nodes =

Description:
The 1.1.0 release enhances the way requests are routed between nodes. It includes a legacy mode for use while clusters are being upgraded, which falls back to the original routing used in the 1.0 series and earlier. The code required to support legacy routing for MapReduce requests was omitted.

What version of Riak is affected?
Open Source and Enterprise versions of Riak 1.1.0 are affected.

Are all users affected?
This issue only arises during a rolling upgrade. Users will be unable to issue MapReduce jobs while any 1.0.x or 0.14.x nodes remain in the cluster. Users who are not using MapReduce, or whose clusters contain only 1.1.0 nodes, are unaffected.

Can I safely upgrade to 1.1.0 if I am not using MapReduce?
Yes.

Issue Tracker: https://github.com/basho/riak_core/issues/144

= Issue: Javascript MapReduce jobs leak Javascript VMs under some failure conditions =

Description:
Javascript MapReduce uses a pool of Javascript virtual machines inside Riak. In some edge cases where a MapReduce job is cancelled (due to a timeout or a dropped connection), its VMs are not returned to the pool. Eventually the pool can be exhausted, so that no further Javascript MapReduce jobs can run until the node is restarted.

What version of Riak is affected?
Open Source and Enterprise versions of Riak 1.0.x and 1.1.0 are affected.

Are all users affected?
Only users of Javascript MapReduce are affected.

Can I safely upgrade to 1.1.0 if I am not using Javascript MapReduce?
Yes.

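To make the failure mode concrete, here is a minimal Python sketch of a fixed-size pool whose resources are lost when a job is cancelled. It is only an illustration of the general pattern described above, not Riak's actual (Erlang) implementation; `VMPool`, `JobCancelled`, `run_job_leaky`, `run_job_safe`, and `demo` are all hypothetical names invented for this example.

```python
import queue


class VMPool:
    """Toy stand-in for a fixed-size pool of Javascript VMs (hypothetical)."""

    def __init__(self, size):
        self._free = queue.Queue()
        for vm_id in range(size):
            self._free.put(vm_id)

    def checkout(self):
        # Raises queue.Empty once every VM is checked out (pool exhausted).
        return self._free.get_nowait()

    def checkin(self, vm_id):
        self._free.put(vm_id)

    def available(self):
        return self._free.qsize()


class JobCancelled(Exception):
    """Models a timeout or a dropped client connection."""


def run_job_leaky(pool, cancelled):
    vm = pool.checkout()
    if cancelled:
        raise JobCancelled()  # the VM is never returned on this path
    pool.checkin(vm)


def run_job_safe(pool, cancelled):
    vm = pool.checkout()
    try:
        if cancelled:
            raise JobCancelled()
    finally:
        pool.checkin(vm)  # runs on success and on cancellation


def demo(size=2):
    """Cancel `size` jobs against each pool; return (leaky, safe) free counts."""
    leaky, safe = VMPool(size), VMPool(size)
    for _ in range(size):
        for run, pool in ((run_job_leaky, leaky), (run_job_safe, safe)):
            try:
                run(pool, cancelled=True)
            except JobCancelled:
                pass
    return leaky.available(), safe.available()
```

In this sketch, each cancelled job on the leaky path permanently removes one VM from the pool, so after enough cancellations `checkout` can no longer succeed. That mirrors the exhaustion described above, where only a node restart restores the pool.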
Issue Tracker: https://github.com/basho/riak_kv/issues/287

= Issue: Keylisting continues after bucket MapReduce is cancelled/timeout =

Description:
If a MapReduce job against a bucket is cancelled (by a timeout or a dropped client connection), the listkeys operation feeding objects from the bucket continues to run and generates a large number of error messages:

"Pipe worker startup failed:fitting was gone before startup"

Additionally, these running listkeys jobs continue to tie up resources that could otherwise run other MapReduce queries. If enough listkeys jobs end up in this state, subsequent MapReduce queries will be unable to complete and will time out. In the worst case, every MapReduce query submitted will time out until the running listkeys jobs eventually terminate and the cluster recovers.

What version of Riak is affected?
Riak 1.0.x and 1.1.0 are both affected. The listkeys back-pressure mechanism added in 1.1.0 has increased the time it takes to recover from this issue.

Are all users affected?
This affects users who are seeing frequent timeouts with bucket MapReduce.

Can I safely upgrade to 1.1.0 if I am not using MapReduce?
Yes.

Issue Tracker: https://github.com/basho/riak_kv/issues/293

--
Jon Meredith
Platform Engineering Manager
Basho Technologies, Inc.
[email protected]
