Hi TD,
Yay! Thanks for the help. That solved our issue of ever increasing
processing time. I added filter functions to all our reduceByKeyAndWindow()
operations and now its been stable for over 2 days already! :-).
One small feedback about the API though. The one that accepts the filter
function
Responses inline.
On Thu, Jul 16, 2015 at 9:27 PM, N B nb.nos...@gmail.com wrote:
Hi TD,
Yes, we do have the invertible function provided. However, I am not sure I
understood how to use the filterFunction. Is there an example somewhere
showing its usage?
The header comment on the function
Hi TD,
Thanks for the response. I do believe I understand the concept and the need
for the filterfunction now. I made the requisite code changes and keeping
it running overnight to see the effect of it. Hopefully this should fix our
issue.
However, there was one place where I encountered a
MAke sure you provide the filterFunction with the invertible
reduceByKeyAndWindow. Otherwise none of the keys will get removed, and the
key space will continue increase. This is what is leading to the lag. So
use the filtering function to filter out the keys that are not needed any
more.
On Thu,
Thanks Akhil. For doing reduceByKeyAndWindow, one has to have checkpointing
enabled. So, yes we do have it enabled. But not Write Ahead Log because we
don't have a need for recovery and we do not recover the process state on
restart.
I don't know if IO Wait fully explains the increasing
Hi TD,
Yes, we do have the invertible function provided. However, I am not sure I
understood how to use the filterFunction. Is there an example somewhere
showing its usage?
The header comment on the function says :
* @param filterFunc function to filter expired key-value pairs;
*