I've run into a similar issue. There's been talk in the past about fixing this, but it hasn't been fixed yet. As a workaround, you can use reflection to get hold of the private "values" field in TupleImpl and call clear() on it.
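Here is a rough sketch of what I mean, not tested against a real topology. FakeTuple is a hypothetical stand-in so the snippet runs without Storm on the classpath, and TupleValuesClearer and clearValues are names I made up. The only thing taken from Storm is the assumption that the tuple class has a private List field named "values"; check that against the TupleImpl source for your version.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for backtype.storm.tuple.TupleImpl, used only so this
// sketch runs without Storm. It assumes a private List field named "values".
class FakeTuple {
    private List<Object> values;

    FakeTuple(List<Object> values) {
        this.values = values;
    }

    int size() {
        return values.size();
    }
}

class TupleValuesClearer {
    // Looks up the private "values" field declared on the object's own class
    // and clears the list it refers to, dropping references to the payload.
    static void clearValues(Object tuple) {
        try {
            Field field = tuple.getClass().getDeclaredField("values");
            field.setAccessible(true);
            List<?> values = (List<?>) field.get(tuple);
            if (values != null) {
                values.clear();
            }
        } catch (ReflectiveOperationException e) {
            // Field missing or inaccessible: the class layout is not what we assumed.
            throw new IllegalStateException("could not clear tuple values", e);
        }
    }

    public static void main(String[] args) {
        List<Object> payload = new ArrayList<Object>(Arrays.asList("raw payload", 42));
        FakeTuple tuple = new FakeTuple(payload);
        clearValues(tuple);
        System.out.println("values left: " + tuple.size());
    }
}
```

Two caveats: clear() will throw UnsupportedOperationException if the underlying list is immutable, and anything that reads the tuple's values afterwards will see an empty list, so only do this once you are finished with the payload.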
https://github.com/apache/incubator-storm/blob/master/storm-core/src/jvm/backtype/storm/tuple/TupleImpl.java

On Thu, Apr 17, 2014 at 11:00 AM, Mike Heffner <[email protected]> wrote:

> We have a topology that we're trying to push the throughput on as much as
> possible. While profiling the topology we found that we are holding onto a
> lot of memory in our list of tuples prior to acking them. It appears that
> most of this memory is coming from holding onto the original message
> payload in its raw format (char[] in our case). Our topology is performing
> online aggregation, so our internal tracking memory is typically quite
> small as we aggregate 1,000's of messages into a single bucket. However,
> maintaining the list of all raw tuple payloads that went into the
> aggregation bucket for the duration of our checkpointing frequency can chew
> up a significant footprint of memory.
>
> Is there a way to clear the tuple Values() after it has been processed,
> but before acking it? Our alternative solution is to try a different
> serialization format that requires a smaller payload. While this would
> potentially reduce our footprint by a good factor, it would still have
> limits. Ideally we could strip the tuple list down to only the required
> message IDs bits required for proper storm message acking.
>
> Any ideas? We are on version 0.9.0.1.
>
> Thanks,
>
> Mike
>
> --
>
> Mike Heffner <[email protected]>
> Librato, Inc.
