James Peach created TS-2302:
-------------------------------
Summary: log collation causes error logging to stop
Key: TS-2302
URL: https://issues.apache.org/jira/browse/TS-2302
Project: Traffic Server
Issue Type: Bug
Components: Logging, Quality
Reporter: James Peach
When we started using the log collation ATS stopped writing to the errors.log
file. Traffic Server thinks that the configured log space is exhausted:
{code}
[Sep 18 23:00:40.817] Server {0x2b1bdcc51700} WARNING: Access logging to local
log directory suspended - configured space allocation exhausted.
[Sep 18 23:01:45.008] Server {0x2b05c4e93700} WARNING: Access logging to local
log directory suspended - configured space allocation exhausted.
[Sep 19 20:57:05.766] Server {0x2b05c4c91700} NOTE: logging space exhausted,
failed to write orphan file, drop(1832) bytes
[Sep 19 20:57:15.743] Server {0x2b05c4c91700} NOTE: logging space exhausted,
failed to write orphan file, drop(4360) bytes
{code}
{{LogConfig::update_space_used()}} emits the exhaustion message when
{{LogConfig::space_to_write(headroom)}} fails:
{code}
$ sudo /opt/bazinga/bin/traffic_line -r proxy.config.log.max_space_mb_headroom
1000
$ sudo gdb -p 28375
...
(gdb) p (bool)(Log::config->space_to_write((int64_t)(1000*1024)))
$1 = false
(gdb) p *(Log::config)
$1 = {initialized = true, reconfiguration_needed = false,
logging_space_exhausted = true, m_space_used = 28313954, m_partition_space_left
= 46870138880,
roll_log_files_now = false, log_object_manager = {_objects = 0x2b1b68758d00,
_numObjects = 1, _maxObjects = 8, _APIobjects = 0x2b1b68758d50, _numAPIobjects
= 0,
_maxAPIobjects = 8, _APImutex = 0x2b1b6874ee40}, global_filter_list =
{m_filter_list = {<DLL<LogFilter, LogFilter::Link_link>> = {head = 0x0}, tail =
0x0},
m_does_conjunction = true}, global_format_list = {m_format_list =
{<DLL<LogFormat, LogFormat::Link_link>> = {head = 0x2b1b68764810}, tail =
0x2b1b68767e10}},
log_buffer_size = 9216, max_secs_per_buffer = 5, max_space_mb_for_logs =
102400, max_space_mb_for_orphan_logs = 25, max_space_mb_headroom = 1000,
logfile_perm = 420, squid_log_enabled = false, xuid_logging_enabled = true,
squid_log_is_ascii = false, common_log_enabled = false, common_log_is_ascii =
true,
extended_log_enabled = false, extended_log_is_ascii = false,
extended2_log_enabled = false, extended2_log_is_ascii = true, separate_icp_logs
= false,
separate_host_logs = false, collation_mode = 2, collation_port = 8085,
collation_host_tagged = false, collation_preproc_threads = 1,
collation_retry_sec = 5,
collation_max_send_buffers = 16, rolling_enabled = 2, rolling_interval_sec =
86400, rolling_offset_hr = 0, rolling_size_mb = 10240,
auto_delete_rolled_files = true, custom_logs_enabled = true,
search_log_enabled = false, search_rolling_interval_sec = 86400,
search_server_ip_addr = 0,
search_server_port = 8080, search_top_sites = 100, search_log_filters = 0x0,
search_url_filter = 0x0, search_log_file_one = 0x2b1b68758e00 "search_log1",
search_log_file_two = 0x2b1b68758e20 "search_log2", sampling_frequency = 1,
file_stat_frequency = 32, space_used_frequency = 2, ascii_buffer_size = 36864,
max_line_size = 9216, hostname = 0x2b1b68758f20 "localhost", logfile_dir =
0x2b1b6823fcd0 "/var/log/trafficserver", squid_log_name = 0x2b1b68240960
"squid",
squid_log_header = 0x0, common_log_name = 0x2b1b68225a90 "common",
common_log_header = 0x0, extended_log_name = 0x2b1b682419f0 "extended",
extended_log_header = 0x0, extended2_log_name = 0x2b1b6874f7b0 "extended2",
extended2_log_header = 0x0, collation_host = 0x2b1b6874fe50 "none",
collation_secret = 0x2b1b6874fa20 "Aapl_Baz1ng4!", xml_config_file =
0x2b1b68758da0 "logs_xml.config", hosts_config_file = 0x2b1b68758dc0
"log_hosts.config",
use_orphan_log_space_value = true, m_log_collation_accept = 0x0, m_dir_entry
= 0x2b1b68769e30, m_pDir = 0x0, m_disk_full = true, m_disk_low = false,
m_partition_full = false, m_partition_low = false,
m_log_directory_inaccessible = false}
{code}
Note that use_orphan_log_space_value = true.
In proxy/logging/LogConfig.h:
{code}
169
170 int get_max_space_mb() {
171 return (use_orphan_log_space_value ? max_space_mb_for_orphan_logs :
max_space_mb_for_logs);
172 }
173
{code}
So the orphan log space is being used to determine the maximum log space
allowed. This is determined in LogConfig::init():
{code}
742 // determine if we should use the orphan log space value or not
743 // we use it if all objects are collation clients, or if some are and
744 // the specified space for collation is larger than that for local
files
745 //
746 size_t num_collation_clients =
log_object_manager.get_num_collation_clients();
747 use_orphan_log_space_value =
748 (num_collation_clients == 0 ? false :
749 (log_object_manager.get_num_objects() == num_collation_clients ?
true :
750 max_space_mb_for_orphan_logs > max_space_mb_for_logs));
751
{code}
Our configuration has 1 log object that is a collation client, so
{{log_object_manager.get_num_objects() ==
log_object_manager.get_num_collation_clients()}}, and we end up using the
orphan log space.
The fix is to treat the error.log like a standard log object and manage it like
all the others. This makes the assumptions around logging thresholds turn out
to be true.
--
This message was sent by Atlassian JIRA
(v6.1#6144)