GitHub user dkatten opened a pull request:
https://github.com/apache/storm/pull/462
With hash key option for RedisMapState, only get the values for keys in the
requested batch
This commit fixes a bug in multiGet. When the state updater is constructed
with a hash key (i.e., the state is stored as a field in a Redis hash rather
than as a key in the top-level Redis keyspace), each call to multiGet
requested the entire hash and then iterated over it to extract only the
values relevant to the batch.
This can cause an inordinate amount of network traffic (and actually caused
our interfaces to fall over) for states with either moderately high
cardinality or large values. Instead, the call to Redis should be an hmget
(hash multiget) that takes the hash key as its first argument and an array of
strings naming the fields to fetch from that hash, thereby retrieving only
the requested values.
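The difference between the two access patterns can be sketched as follows.
This is a minimal illustration, not the code from the patch: the class
HashFetchSketch and its methods are hypothetical, and a plain in-memory map
stands in for the Redis hash so the example runs without a server. The two
helpers only mimic the documented behavior of the Redis HGETALL and HMGET
commands (HMGET returns one value per requested field, in request order, with
nil for missing fields).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: an in-memory map stands in for one Redis hash.
class HashFetchSketch {

    // Mimics HGETALL: every field/value pair in the hash is returned.
    static Map<String, String> hgetAll(Map<String, String> hash) {
        return new HashMap<>(hash);
    }

    // Mimics HMGET: one value per requested field, in request order,
    // with null standing in for Redis nil when a field is absent.
    static List<String> hmget(Map<String, String> hash, String... fields) {
        List<String> values = new ArrayList<>(fields.length);
        for (String field : fields) {
            values.add(hash.get(field));
        }
        return values;
    }

    // Old pattern: pull the whole hash, then pick out the batch locally.
    static List<String> multiGetViaHgetAll(Map<String, String> hash, List<String> keys) {
        Map<String, String> everything = hgetAll(hash);
        List<String> values = new ArrayList<>(keys.size());
        for (String key : keys) {
            values.add(everything.get(key));
        }
        return values;
    }

    // New pattern: ask only for the fields named in the batch.
    static List<String> multiGetViaHmget(Map<String, String> hash, List<String> keys) {
        return hmget(hash, keys.toArray(new String[0]));
    }

    public static void main(String[] args) {
        Map<String, String> hash = new HashMap<>();
        for (int i = 0; i < 100000; i++) {
            hash.put("key" + i, "value" + i);
        }
        List<String> batch = Arrays.asList("key1", "key42", "missing");

        // Both patterns yield the same values for the batch...
        System.out.println(multiGetViaHgetAll(hash, batch));
        System.out.println(multiGetViaHmget(hash, batch));

        // ...but the number of entries returned by the store differs greatly.
        System.out.println("hgetAll entries: " + hgetAll(hash).size());
        System.out.println("hmget entries:   " + multiGetViaHmget(hash, batch).size());
    }
}
```

Both patterns produce the same result for the batch; the saving comes from
the store returning only as many values as the batch has keys, instead of the
full contents of the hash on every multiGet call.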
The change also removes buildValuesFromMap.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dkatten/storm redis-hgetall-fix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/462.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #462
----
commit 26bab1595c56f6f40ea2392401e829f3ccb0cff0
Author: David Katten <[email protected]>
Date: 2015-03-10T20:52:40Z
When using a hash key as an option for RedisMapState, only get the values
for keys in the requested batch.
----