[ https://issues.apache.org/jira/browse/TS-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Peach updated TS-2302:
----------------------------

    Fix Version/s: 4.1.0
         Assignee: James Peach

> log collation causes error logging to stop
> ------------------------------------------
>
>                 Key: TS-2302
>                 URL: https://issues.apache.org/jira/browse/TS-2302
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Logging, Quality
>            Reporter: James Peach
>            Assignee: James Peach
>             Fix For: 4.1.0
>
>
> When we started using log collation, ATS stopped writing to the error.log 
> file. Traffic Server thinks that the configured log space is exhausted:
> {code}
> [Sep 18 23:00:40.817] Server {0x2b1bdcc51700} WARNING: Access logging to local log directory suspended - configured space allocation exhausted.
> [Sep 18 23:01:45.008] Server {0x2b05c4e93700} WARNING: Access logging to local log directory suspended - configured space allocation exhausted.
> [Sep 19 20:57:05.766] Server {0x2b05c4c91700} NOTE: logging space exhausted, failed to write orphan file, drop(1832) bytes
> [Sep 19 20:57:15.743] Server {0x2b05c4c91700} NOTE: logging space exhausted, failed to write orphan file, drop(4360) bytes
> {code}
> {{LogConfig::update_space_used()}} emits the exhaustion message when 
> {{LogConfig::space_to_write(headroom)}} fails:
> {code}
> $ sudo /opt/bazinga/bin/traffic_line -r proxy.config.log.max_space_mb_headroom
> 1000
> $ sudo gdb -p 28375
> ...
> (gdb) p (bool)(Log::config->space_to_write((int64_t)(1000*1024)))
> $1 = false
> (gdb) p *(Log::config)
> $1 = {initialized = true, reconfiguration_needed = false, logging_space_exhausted = true,
>   m_space_used = 28313954, m_partition_space_left = 46870138880, roll_log_files_now = false,
>   log_object_manager = {_objects = 0x2b1b68758d00, _numObjects = 1, _maxObjects = 8,
>     _APIobjects = 0x2b1b68758d50, _numAPIobjects = 0, _maxAPIobjects = 8, _APImutex = 0x2b1b6874ee40},
>   global_filter_list = {m_filter_list = {<DLL<LogFilter, LogFilter::Link_link>> = {head = 0x0}, tail = 0x0},
>     m_does_conjunction = true},
>   global_format_list = {m_format_list = {<DLL<LogFormat, LogFormat::Link_link>> = {head = 0x2b1b68764810}, tail = 0x2b1b68767e10}},
>   log_buffer_size = 9216, max_secs_per_buffer = 5, max_space_mb_for_logs = 102400,
>   max_space_mb_for_orphan_logs = 25, max_space_mb_headroom = 1000,
>   logfile_perm = 420, squid_log_enabled = false, xuid_logging_enabled = true, squid_log_is_ascii = false,
>   common_log_enabled = false, common_log_is_ascii = true,
>   extended_log_enabled = false, extended_log_is_ascii = false,
>   extended2_log_enabled = false, extended2_log_is_ascii = true,
>   separate_icp_logs = false, separate_host_logs = false,
>   collation_mode = 2, collation_port = 8085, collation_host_tagged = false,
>   collation_preproc_threads = 1, collation_retry_sec = 5, collation_max_send_buffers = 16,
>   rolling_enabled = 2, rolling_interval_sec = 86400, rolling_offset_hr = 0, rolling_size_mb = 10240,
>   auto_delete_rolled_files = true, custom_logs_enabled = true,
>   search_log_enabled = false, search_rolling_interval_sec = 86400, search_server_ip_addr = 0,
>   search_server_port = 8080, search_top_sites = 100, search_log_filters = 0x0, search_url_filter = 0x0,
>   search_log_file_one = 0x2b1b68758e00 "search_log1", search_log_file_two = 0x2b1b68758e20 "search_log2",
>   sampling_frequency = 1, file_stat_frequency = 32, space_used_frequency = 2,
>   ascii_buffer_size = 36864, max_line_size = 9216,
>   hostname = 0x2b1b68758f20 "localhost", logfile_dir = 0x2b1b6823fcd0 "/var/log/trafficserver",
>   squid_log_name = 0x2b1b68240960 "squid", squid_log_header = 0x0,
>   common_log_name = 0x2b1b68225a90 "common", common_log_header = 0x0,
>   extended_log_name = 0x2b1b682419f0 "extended", extended_log_header = 0x0,
>   extended2_log_name = 0x2b1b6874f7b0 "extended2", extended2_log_header = 0x0,
>   collation_host = 0x2b1b6874fe50 "none", collation_secret = 0x2b1b6874fa20 "Aapl_Baz1ng4!",
>   xml_config_file = 0x2b1b68758da0 "logs_xml.config", hosts_config_file = 0x2b1b68758dc0 "log_hosts.config",
>   use_orphan_log_space_value = true, m_log_collation_accept = 0x0,
>   m_dir_entry = 0x2b1b68769e30, m_pDir = 0x0,
>   m_disk_full = true, m_disk_low = false, m_partition_full = false, m_partition_low = false,
>   m_log_directory_inaccessible = false}
> {code}
> Note that {{use_orphan_log_space_value}} is true. That flag selects the 
> space limit in proxy/logging/LogConfig.h:
> {code}
>   int get_max_space_mb() {
>     return (use_orphan_log_space_value ? max_space_mb_for_orphan_logs : max_space_mb_for_logs);
>   }
> {code}
> So the orphan log space value is being used as the maximum log space 
> allowed. Whether to use it is decided in {{LogConfig::init()}}:
> {code}
>   // determine if we should use the orphan log space value or not
>   // we use it if all objects are collation clients, or if some are and
>   // the specified space for collation is larger than that for local files
>   //
>   size_t num_collation_clients = log_object_manager.get_num_collation_clients();
>   use_orphan_log_space_value =
>     (num_collation_clients == 0 ? false :
>      (log_object_manager.get_num_objects() == num_collation_clients ? true :
>       max_space_mb_for_orphan_logs > max_space_mb_for_logs));
> {code}
> Our configuration has one log object, and it is a collation client, so 
> {{log_object_manager.get_num_objects() == 
> log_object_manager.get_num_collation_clients()}} and we end up using the 
> orphan log space limit. With {{max_space_mb_for_orphan_logs = 25}} and 
> {{m_space_used = 28313954}} (about 27 MB, assuming the counter is in bytes), 
> that limit is already exceeded, so logging is suspended.
> The fix is to treat error.log like a standard log object and manage it 
> like all the others, so that the assumptions behind the logging space 
> thresholds hold.
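
The selection logic quoted above can be reproduced in isolation. The following is only a minimal sketch, not the Traffic Server code: SpaceConfig and over_limit() are hypothetical names introduced for illustration, and the comparison assumes that the space counter is in bytes and that 1 MB is 1024 * 1024 bytes. The real LogConfig::space_to_write() also takes headroom and partition space into account, which this sketch leaves out.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the handful of LogConfig fields involved.
struct SpaceConfig {
  int max_space_mb_for_logs;
  int max_space_mb_for_orphan_logs;
  std::size_t num_objects;            // what get_num_objects() would return
  std::size_t num_collation_clients;  // what get_num_collation_clients() would return
};

// Same decision as the snippet quoted from LogConfig::init().
bool use_orphan_log_space_value(const SpaceConfig &c) {
  if (c.num_collation_clients == 0) {
    return false;  // no collation clients: use the normal log limit
  }
  if (c.num_objects == c.num_collation_clients) {
    return true;   // every object is a collation client: use the orphan limit
  }
  // Mixed case: use the orphan limit only if it is the larger of the two.
  return c.max_space_mb_for_orphan_logs > c.max_space_mb_for_logs;
}

// Same choice as get_max_space_mb() in LogConfig.h.
int get_max_space_mb(const SpaceConfig &c) {
  return use_orphan_log_space_value(c) ? c.max_space_mb_for_orphan_logs
                                       : c.max_space_mb_for_logs;
}

// Assumption: a simplified limit check that ignores headroom and
// partition space, comparing bytes used against the selected MB limit.
bool over_limit(const SpaceConfig &c, std::int64_t space_used_bytes) {
  const std::int64_t limit_bytes =
      static_cast<std::int64_t>(get_max_space_mb(c)) * 1024 * 1024;
  return space_used_bytes > limit_bytes;
}
```

With the values from the gdb dump (max_space_mb_for_logs = 102400, max_space_mb_for_orphan_logs = 25, one log object, one collation client, m_space_used = 28313954), this sketch selects the 25 MB limit and reports it as exceeded. With a second, non-collation object it would fall back to the 102400 MB limit.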



--
This message was sent by Atlassian JIRA
(v6.1#6144)
