[ 
https://issues.apache.org/jira/browse/MESOS-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-920:
----------------------------------
    Description: 
We've observed issues where the masters are slow to respond. Two perf traces 
were collected while the masters were in this state:

{noformat}
 25.84%  [kernel]                [k] default_send_IPI_mask_sequence_phys
 20.44%  [kernel]                [k] native_write_msr_safe
  4.54%  [kernel]                [k] _raw_spin_lock
  2.95%  libc-2.5.so             [.] _int_malloc
  1.82%  libc-2.5.so             [.] malloc
  1.55%  [kernel]                [k] apic_timer_interrupt
  1.36%  libc-2.5.so             [.] _int_free
{noformat}

{noformat}
 29.03%  [kernel]                [k] default_send_IPI_mask_sequence_phys
  9.64%  [kernel]                [k] _raw_spin_lock
  7.38%  [kernel]                [k] native_write_msr_safe
  2.43%  libc-2.5.so             [.] _int_malloc
  2.05%  libc-2.5.so             [.] _int_free
  1.67%  [kernel]                [k] apic_timer_interrupt
  1.58%  libc-2.5.so             [.] malloc
{noformat}

These slowdowns have been attributed to the posix_fadvise calls made by glog. 
We can disable these calls via the environment:

{noformat}
GLOG_DEFINE_bool(drop_log_memory, true, "Drop in-memory buffers of log contents. "
                 "Logs can grow very quickly and they are rarely read before they "
                 "need to be evicted from memory. Instead, drop them from memory "
                 "as soon as they are flushed to disk.");
{noformat}

{code}
    if (FLAGS_drop_log_memory) {
      if (file_length_ >= logging::kPageSize) {
        // don't evict the most recent page
        uint32 len = file_length_ & ~(logging::kPageSize - 1);
        posix_fadvise(fileno(file_), 0, len, POSIX_FADV_DONTNEED);
      }
    }
{code}
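
The mask in the excerpt above rounds the file length down to a page boundary, so every complete page is handed to posix_fadvise while the partial trailing page stays cached. A minimal sketch of that arithmetic follows; `kAssumedPageSize` and `roundDownToPage` are hypothetical names for illustration, and the 4096-byte page size is an assumption (glog takes its value from logging::kPageSize).

```cpp
#include <cstdint>

// Assumed page size, for illustration only. It must be a power of two
// for the mask below to work.
constexpr std::uint32_t kAssumedPageSize = 4096;

// Rounds `length` down to the nearest multiple of `page_size`, using the
// same expression as the glog excerpt: length & ~(page_size - 1).
constexpr std::uint32_t roundDownToPage(
    std::uint32_t length,
    std::uint32_t page_size = kAssumedPageSize) {
  return length & ~(page_size - 1);
}
```

With these assumptions, a 10000-byte log file yields a length of 8192, so the first two pages are dropped and the remaining 1808 bytes are left alone.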

We should set GLOG_drop_log_memory=false prior to making our call to 
google::InitGoogleLogging, to avoid others running into this issue.

  was:
We've observed performance scaling issues attributed to the posix_fadvise calls 
made by glog. This can currently only be disabled via the environment:

GLOG_DEFINE_bool(drop_log_memory, true, "Drop in-memory buffers of log contents. "
                 "Logs can grow very quickly and they are rarely read before they "
                 "need to be evicted from memory. Instead, drop them from memory "
                 "as soon as they are flushed to disk.");


    if (FLAGS_drop_log_memory) {
      if (file_length_ >= logging::kPageSize) {
        // don't evict the most recent page
        uint32 len = file_length_ & ~(logging::kPageSize - 1);
        posix_fadvise(fileno(file_), 0, len, POSIX_FADV_DONTNEED);
      }
    }

We should set GLOG_drop_log_memory=false prior to making our call to 
google::InitGoogleLogging.


> Set GLOG_drop_log_memory=false in environment prior to logging initialization.
> ------------------------------------------------------------------------------
>
>                 Key: MESOS-920
>                 URL: https://issues.apache.org/jira/browse/MESOS-920
>             Project: Mesos
>          Issue Type: Improvement
>          Components: technical debt
>    Affects Versions: 0.15.0, 0.16.0
>            Reporter: Benjamin Mahler



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
