Re: Clearing tuple payloads after processing

Mike Heffner Fri, 18 Apr 2014 06:52:33 -0700

Jon,

We actually took that exact approach in our testing:


tuple.getValues().clear()

Good to hear that others have recognized this as a pain point and that
there's room for improvement.
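
For anyone reading this thread later, here is a minimal sketch of the reflection workaround described in the quoted message below. It is not Storm code: FakeTuple is a hypothetical stand-in for a tuple class that keeps its payload in a private "values" list, and TuplePayloadClearer and clearValues are names made up for illustration. Whether a real tuple's list is mutable, and whether anything downstream still reads the values after they are cleared, are assumptions to verify against the Storm version in use.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TuplePayloadClearer {

    /** Hypothetical stand-in for a tuple class with a private "values" list. */
    public static class FakeTuple {
        private final List<Object> values;

        public FakeTuple(List<Object> values) {
            // Copy into a mutable list so clear() is supported.
            this.values = new ArrayList<>(values);
        }

        public List<Object> getValues() {
            return values;
        }
    }

    /**
     * Looks up the private field named "values" on the given object and,
     * if it holds a List, empties it in place. The object itself (and any
     * other fields it carries) is left untouched.
     */
    public static void clearValues(Object tuple) {
        try {
            Field field = tuple.getClass().getDeclaredField("values");
            field.setAccessible(true); // bypass the private modifier
            Object held = field.get(tuple);
            if (held instanceof List) {
                ((List<?>) held).clear();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not clear tuple values", e);
        }
    }

    public static void main(String[] args) {
        FakeTuple tuple = new FakeTuple(Arrays.asList("raw payload", 42));
        clearValues(tuple);
        System.out.println(tuple.getValues().size()); // prints 0
    }
}
```

When the tuple exposes its list through a public getter, as in the one-liner above, reflection is unnecessary; the reflective version only matters if the getter returns a copy or an unmodifiable view.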

Cheers,

Mike


On Thu, Apr 17, 2014 at 8:16 PM, Jon Logan <[email protected]> wrote:

> I've run into a similar issue. There's been talk in the past about fixing
> this, but it hasn't happened yet. As a workaround, you can use reflection
> to get hold of the private "values" variable and just call clear() on it.
>
>
> https://github.com/apache/incubator-storm/blob/master/storm-core/src/jvm/backtype/storm/tuple/TupleImpl.java
>
>
> On Thu, Apr 17, 2014 at 11:00 AM, Mike Heffner <[email protected]> wrote:
>
>> We have a topology that we're trying to push the throughput on as much as
>> possible. While profiling the topology we found that we are holding onto a
>> lot of memory in our list of tuples prior to acking them. It appears that
>> most of this memory is coming from holding onto the original message
>> payload in its raw format (char[] in our case). Our topology is performing
>> online aggregation, so our internal tracking memory is typically quite
>> small as we aggregate thousands of messages into a single bucket. However,
>> maintaining the list of all raw tuple payloads that went into the
>> aggregation bucket for the duration of our checkpointing frequency can chew
>> up a significant footprint of memory.
>>
>> Is there a way to clear the tuple Values() after it has been processed,
>> but before acking it? Our alternative solution is to try a different
>> serialization format that requires a smaller payload. While this would
>> potentially reduce our footprint by a good factor, it would still have
>> limits. Ideally we could strip the tuple list down to only the message
>> ID bits required for proper Storm message acking.
>>
>> Any ideas? We are on version 0.9.0.1.
>>
>> Thanks,
>>
>> Mike
>>
>> --
>>
>>   Mike Heffner <[email protected]>
>>   Librato, Inc.
>>
>>
>


-- 

  Mike Heffner <[email protected]>
  Librato, Inc.
