> On Sept. 2, 2014, 4:43 p.m., Bill Farner wrote:
> > I'm concerned that the problem we're solving is underspecified.  An 
> > immediate issue i have with this patch is that it introduces a memory leak 
> > (in practice, this is hidden with default settings due to failover failover 
> > hides this).
> > 
> > I'll reply on the ticket to try to reach consensus on what the problem is 
> > and what it means to fix it.
> 
> Maxim Khutornenko wrote:
>     I don't think we are trying to fix a Mesos problem here. Regardless of 
> the underlying Mesos resolution (MESOS-1646), I do think Aurora should be a 
> good Mesos citizen here and avoid abnormal offer activity spikes driven by 
> failovers. 
>     
>     In order for that memory leak to expose itself in any reasonable manner, 
> there must be an intensive and constant churn of hosts in the cluster. I 
> don't think it's a likely scenario but if you are concerned it should be easy 
> to converge on a hybrid solution with expiring cache where the 
> EXECUTOR_GC_RESTART_INTERVAL kicks in any time an item is not found in the 
> cache.
> 
> Bill Farner wrote:
>     > I do think Aurora should be a good Mesos citizen here and avoid 
> abnormal offer activity spikes driven by failovers
>     
>     Can you offer more detail here - why is this bad behavior?

We spread GC activity randomly to avoid hourly spikes 
(https://reviews.apache.org/r/16247/). Failovers, however, break that pattern 
and manifest in "unusual" GC activity spikes. All Mesos matters aside, I don't 
think scheduler interruptions should result in what appears externally as a 
"catch-up" behavior.


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24915/#review52037
-----------------------------------------------------------


On Aug. 20, 2014, 11:35 p.m., Maxim Khutornenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24915/
> -----------------------------------------------------------
> 
> (Updated Aug. 20, 2014, 11:35 p.m.)
> 
> 
> Review request for Aurora and Bill Farner.
> 
> 
> Bugs: AURORA-608
>     https://issues.apache.org/jira/browse/AURORA-608
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Adding initial GC task delay on scheduler restart.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java 
> 5a38a1f69ac5dbe68af3bfe175899ddee392880b 
>   src/main/java/org/apache/aurora/scheduler/async/GcExecutorLauncher.java 
> 65f404915bc60ffe11a7a57d9861ac5b37fa646a 
>   src/test/java/org/apache/aurora/scheduler/async/GcExecutorLauncherTest.java 
> 04137072891b2a1f0ad663182629dd469b09324f 
> 
> Diff: https://reviews.apache.org/r/24915/diff/
> 
> 
> Testing
> -------
> 
> gradle -Pq build
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>

Reply via email to