I've run into a similar issue. There has been talk in the past about fixing
this, but it hasn't been fixed. As a workaround, you can use reflection to
get hold of the private "values" field in TupleImpl and call clear() on it.

https://github.com/apache/incubator-storm/blob/master/storm-core/src/jvm/backtype/storm/tuple/TupleImpl.java
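
A rough sketch of the reflection workaround, in case it helps. FakeTuple is a
hypothetical stand-in for TupleImpl so the example compiles without Storm on
the classpath, and clearValues is an illustrative helper name, not a Storm
API. It assumes the field is still named "values" and holds a mutable List;
if the list is immutable, clear() throws UnsupportedOperationException.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class TupleValueClearer {

    // Hypothetical stand-in for Storm's TupleImpl (not the real class).
    static class FakeTuple {
        private List<Object> values;

        FakeTuple(List<Object> values) {
            this.values = values;
        }

        int size() {
            return values.size();
        }
    }

    // Finds the private field named "values" declared on the object's class
    // and clears the list it refers to. Assumes the field exists on that
    // exact class (not a superclass) and that the list is mutable.
    static void clearValues(Object tuple) throws ReflectiveOperationException {
        Field field = tuple.getClass().getDeclaredField("values");
        field.setAccessible(true); // bypass the private modifier
        List<?> values = (List<?>) field.get(tuple);
        if (values != null) {
            values.clear();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Object> payload = new ArrayList<>();
        payload.add("raw message payload");
        FakeTuple tuple = new FakeTuple(payload);

        clearValues(tuple);
        System.out.println(tuple.size()); // prints 0
    }
}
```

With a real tuple you would pass the Tuple you are holding for acking. Since
the field name is an implementation detail, it is worth rechecking after any
Storm upgrade.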


On Thu, Apr 17, 2014 at 11:00 AM, Mike Heffner <[email protected]> wrote:

> We have a topology that we're trying to push the throughput on as much as
> possible. While profiling the topology we found that we are holding onto a
> lot of memory in our list of tuples prior to acking them. It appears that
> most of this memory is coming from holding onto the original message
> payload in its raw format (char[] in our case). Our topology is performing
> online aggregation, so our internal tracking memory is typically quite
> small as we aggregate 1,000s of messages into a single bucket. However,
> maintaining the list of all raw tuple payloads that went into the
> aggregation bucket for the duration of our checkpointing frequency can chew
> up a significant footprint of memory.
>
> Is there a way to clear the tuple Values() after it has been processed,
> but before acking it? Our alternative solution is to try a different
> serialization format that requires a smaller payload. While this would
> potentially reduce our footprint by a good factor, it would still have
> limits. Ideally we could strip the tuple list down to only the message ID
> bits required for proper Storm message acking.
>
> Any ideas? We are on version 0.9.0.1.
>
> Thanks,
>
> Mike
>
> --
>
>   Mike Heffner <[email protected]>
>   Librato, Inc.
>
>
