Hi, how are you generating your watermarks? Could it be that they advance faster when the job is processing more data?
Cheers,
Aljoscha

On Fri, 16 Dec 2016 at 21:01 Seth Wiesman <swies...@mediamath.com> wrote:

> Hi,
>
> I’ve noticed something peculiar about the relationship between state size
> and cluster size and was wondering if anyone here knows the reason. I am
> running a job with 1 hour tumbling event time windows which have an allowed
> lateness of 7 days. When I run on a 20-node cluster with FsState I can
> process approximately 1.5 days’ worth of data in an hour, with the most
> recent checkpoint being ~20 GB. Now if I run the same job with the same
> configuration on a 40-node cluster I can process 2 days’ worth of data in
> 20 min (expected), but the state size is only ~8 GB. Because allowed lateness
> is 7 days, no windows should be purged yet, and I would expect the larger
> cluster, which has processed more data, to have a larger state. Is there some
> reason why a slower running job or a smaller cluster would require more state?
>
> This is more of a curiosity than an issue. Thanks in advance for any
> insights you may have.
>
> Seth Wiesman
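
For reference, the purge rule behind the watermark question can be sketched in a few lines. This is a simplified model in plain Python, not Flink code: it assumes a window's state is dropped once the watermark reaches the window's last timestamp plus the allowed lateness, and the names `window_start`, `is_purged` and `live_windows` are hypothetical helpers made up for illustration. Under that assumption, a watermark that runs ahead of the data would clear old windows earlier than the 7-day lateness suggests.

```python
HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS

WINDOW_SIZE_MS = HOUR_MS          # 1 hour tumbling windows, as in the job described
ALLOWED_LATENESS_MS = 7 * DAY_MS  # allowed lateness of 7 days


def window_start(event_ts_ms: int) -> int:
    """Start of the tumbling window that contains the given event timestamp."""
    return event_ts_ms - (event_ts_ms % WINDOW_SIZE_MS)


def is_purged(event_ts_ms: int, watermark_ms: int) -> bool:
    """True if the window holding this event would already be cleaned up.

    Assumption: cleanup happens once the watermark reaches the window's
    last timestamp (end - 1) plus the allowed lateness.
    """
    last_ts = window_start(event_ts_ms) + WINDOW_SIZE_MS - 1
    return watermark_ms >= last_ts + ALLOWED_LATENESS_MS


def live_windows(event_timestamps_ms, watermark_ms: int) -> int:
    """Number of distinct windows that still hold state at this watermark."""
    starts = {window_start(ts) for ts in event_timestamps_ms}
    return sum(1 for start in starts if not is_purged(start, watermark_ms))
```

Feeding the same two days of hourly events with different watermark positions shows how the number of retained windows, and so the checkpoint size, depends on the watermark and not only on how much data was processed.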