[
https://issues.apache.org/jira/browse/MESOS-7085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858431#comment-15858431
]
Steven Schlansker commented on MESOS-7085:
------------------------------------------
More evidence of confusion over this in the ecosystem:
https://github.com/mesosphere/marathon/issues/1917
> Consider reducing processing of DECLINE calls log from info to debug
> --------------------------------------------------------------------
>
> Key: MESOS-7085
> URL: https://issues.apache.org/jira/browse/MESOS-7085
> Project: Mesos
> Issue Type: Improvement
> Components: master
> Affects Versions: 1.0.1
> Reporter: Steven Schlansker
>
> The Mesos master gets resource decline messages as a normal matter of course.
> It logs every offer that a scheduler declines. This is critical diagnostic
> information, but it is usually uninteresting unless your scheduler is broken
> or buggy.
> In our production environment this ended up being a significant fraction of
> all logging. One of our operators got paged:
> > Checking to see what I can delete.
> > 90% of the 1.6GB mesos log file is taken up by these ( + we are also
> > outputting this to syslog ) :
> > I0208 15:54:41.032714 10833 master.cpp:3951] Processing DECLINE call for
> > offers: [ 68809dc9-6d79-467c-a20b-b3b7d50dc415-O12488245 ] for framework
> > Singularity (Singularity) at
> > [email protected]:38844
> > I0208 15:54:41.032871 10833 master.cpp:3951] Processing DECLINE call for
> > offers: [ 68809dc9-6d79-467c-a20b-b3b7d50dc415-O12488246 ] for framework
> > Singularity (Singularity) at
> > [email protected]:38844
> > I0208 15:54:41.033025 10833 master.cpp:3951] Processing DECLINE call for
> > offers: [ 68809dc9-6d79-467c-a20b-b3b7d50dc415-O12488247 ] for framework
> > Singularity (Singularity) at
> > [email protected]:38844
> ➢ wc -l mesos-master.mesos3-prod-sc.invalid-user.log.INFO.20170130-014425.10812
> 6796024 mesos-master.mesos3-prod-sc.invalid-user.log.INFO.20170130-014425.10812
> ➢ grep -c DECLINE mesos-master.mesos3-prod-sc.invalid-user.log.INFO.20170130-014425.10812
> 5846770
> It seems that this line looks scary ("DECLINE" is a scary word to an
> operator), is a huge percentage of log output, and is part of normal
> operation.
> Should it be reduced to DEBUG? Or could Mesos print a time-based summary
> instead? ("654 offers declined in last 1 minute")
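
To make the time-based idea concrete, here is a minimal sketch of what such a summary could look like. It is written in Python for brevity rather than in the master's C++, and the class name `DeclineSummarizer`, its parameters, and the message format are all hypothetical; none of this is existing Mesos code.

```python
import time
from typing import Callable, Optional


class DeclineSummarizer:
    """Counts declined offers and yields one summary line per time window.

    Hypothetical helper for illustration only; not part of Mesos.
    """

    def __init__(self, window_secs: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_secs = window_secs
        self.clock = clock            # injectable so the window can be tested
        self.window_start = clock()
        self.count = 0

    def record(self, offers: int = 1) -> Optional[str]:
        """Record declined offers.

        Returns a summary line for the previous window if that window has
        elapsed, otherwise None.
        """
        now = self.clock()
        summary = None
        if now - self.window_start >= self.window_secs:
            if self.count > 0:
                summary = (f"{self.count} offers declined in last "
                           f"{self.window_secs:g} seconds")
            # Start a new window; the current call counts toward it.
            self.window_start = now
            self.count = 0
        self.count += offers
        return summary
```

A caller would log the returned line at INFO whenever it is not None, and could keep the per-offer line at a debug level. Note that this sketch only flushes when the next decline arrives; a real implementation would more likely use a timer so that the last window is always reported.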