[ 
https://issues.apache.org/jira/browse/TS-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794681#comment-13794681
 ] 

Sean Cosgrave commented on TS-1411:
-----------------------------------

Based on the stack traces, it looks like the issue is caused more by the 
unmapped URL added to your custom logging config than by the user agent being 
logged. We saw a similar issue in an older version of Traffic Server and solved 
it by calling url_clear_string_ref(this) in URLImpl::move_strings() and adding 
the following check to LogAccessHttp::marshal_client_req_url() before the call 
to marshal_mem():

if (buf && m_url->valid() && m_url->m_url_impl->clean && 
m_url->m_url_impl->m_ptr_printed_string)

I suspected that m_ptr_printed_string was pointing to a buffer that the URL 
object no longer owned, so I made sure to NULL the pointer in the URL whenever 
the URL was moved to a different buffer, and then to check for NULL before 
attempting to marshal. This case seems a little different, though, since 
m_client_req_url_canon_str points to memory owned by LogAccessHttp, not by 
the client request URL object.
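
To make the idea concrete, here is a minimal, self-contained sketch of the 
"clear the cached pointer on move, check it before marshaling" pattern. It is 
not the actual Traffic Server code: SketchUrl, cache_printed(), and 
marshal_url() are hypothetical names invented for illustration, and the real 
URLImpl and LogAccessHttp classes are considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a URL object that caches a pointer to its
// printed form inside a buffer it may later stop owning.
// This is NOT the Traffic Server URLImpl type.
struct SketchUrl {
  const char *printed;     // cached pointer into the owning buffer
  std::size_t printed_len; // length of the cached string

  SketchUrl() : printed(nullptr), printed_len(0) {}

  // Cache a printed string that lives in `owner`.
  void cache_printed(const std::string &owner) {
    printed = owner.data();
    printed_len = owner.size();
  }

  // Called when the URL's strings move to another buffer. The cached
  // pointer would dangle afterwards, so drop it rather than carry it along.
  void move_strings() {
    printed = nullptr;
    printed_len = 0;
  }
};

// Copy the cached string into `buf` only if every precondition holds.
// Returns the number of bytes written, or 0 if the field was skipped.
std::size_t marshal_url(const SketchUrl &url, char *buf, std::size_t cap) {
  if (buf == nullptr || url.printed == nullptr || url.printed_len > cap) {
    return 0;
  }
  std::memcpy(buf, url.printed, url.printed_len);
  return url.printed_len;
}
```

In this sketch, calling marshal_url() after move_strings() skips the field 
instead of reading through a stale pointer. That only helps when the stale 
pointer lives in the URL object itself, which is why it may not cover the 
m_client_req_url_canon_str case.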

I'm not an expert in this part of the code, so if anyone has advice on where 
to start looking for the cause, I would appreciate it. I suspect there is 
some issue with LogAccessHttp's m_arena member, but I don't know what the 
lifetime of that object is or how it gets used over time, only that it 
appears to allocate the memory that m_client_req_url_canon_str points to. 
Any help would be appreciated. Thanks!
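
One way to test a suspicion like this is to make stale arena references 
detectable rather than silently dangling. The sketch below is a toy arena 
that stamps every allocation with a generation number. It is purely 
illustrative: SketchArena and its methods are hypothetical, they are not 
Traffic Server's Arena API, and nothing here establishes that m_arena is 
actually the cause of the crash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical toy arena, not Traffic Server's Arena class. Every reset()
// bumps a generation counter so that handles handed out earlier can be
// recognized as stale instead of being dereferenced.
class SketchArena {
public:
  struct Handle {
    std::size_t offset;  // start of the stored bytes
    std::size_t len;     // number of stored bytes
    unsigned generation; // arena generation at allocation time
  };

  SketchArena() : generation_(0) {}

  // Copy `len` bytes of `s` into the arena and return a handle to the copy.
  Handle store(const char *s, std::size_t len) {
    Handle h = {storage_.size(), len, generation_};
    storage_.insert(storage_.end(), s, s + len);
    return h;
  }

  // Release everything; handles from earlier generations become stale.
  void reset() {
    storage_.clear();
    ++generation_;
  }

  // Return the stored bytes, or nullptr if the handle predates a reset().
  const char *resolve(const Handle &h) const {
    if (h.generation != generation_ || h.offset + h.len > storage_.size()) {
      return nullptr;
    }
    return storage_.data() + h.offset;
  }

private:
  std::vector<char> storage_;
  unsigned generation_;
};
```

With a check like resolve() in front of the marshal step, a string that was 
allocated before the arena was reset would show up as a NULL result that can 
be logged, rather than as a segfault inside the copy.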


> Seg fault related to logging
> ----------------------------
>
>                 Key: TS-1411
>                 URL: https://issues.apache.org/jira/browse/TS-1411
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Logging
>    Affects Versions: 3.2.0
>         Environment: RHEL 6.2 x86_64
>            Reporter: David Carlin
>            Assignee: Yunkai Zhang
>            Priority: Critical
>             Fix For: 4.2.0
>
>         Attachments: Log rotation segaults.txt, TS-1411 backtraces.txt
>
>
> I've been experiencing some segfaults during log rotation.  The sequence of 
> events is this: log rotation occurs, then I get hundreds of "Dropping log 
> buffer" warnings, then the segfault.
> This started occurring when I lengthened the default log format to include 
> the unmapped URL and the user agent string:
> %<cqtq> %<ttms> %<chi> %<crc>/%<pssc> %<psql> %<cqhm> %<cquc> %<caun> 
> %<phr>/%<pqsn> %<psct> %<xid> %<cquuc> \"%<{User-Agent}cqh>\"
> In terms of frequency, we have a number of boxes and I probably see one of 
> these crashes per day since the above change.  Logs are rotated every 2 hours.
> I've had other log related segfaults, reported in TS-1330 - these new ones 
> seem to have a different cause.
> [Aug 14 21:07:20.002] Server {0x2ae3a8887700} STATUS: The rolled logfile, 
> /home/y/logs/trafficserver/error.log_l30.ycs.a4e.yahoo.com.20120814.17h59m50s-20120814.20h00m00s.old,
>  was auto-deleted; 3148252 bytes were reclaimed.
> [Aug 14 21:07:42.859] Server {0x2ae3a8887700} STATUS: The rolled logfile, 
> /home/y/logs/trafficserver/squid.blog_l30.ycs.a4e.yahoo.com.20120814.18h00m00s-20120814.20h00m00s.old,
>  was auto-deleted; 14735520048 bytes were reclaimed.
> [Aug 14 21:07:42.865] Server {0x2ae3a8887700} WARNING: Dropping log buffer, 
> can't keep up.
> [Aug 14 21:07:42.865] Server {0x2ae3a8887700} WARNING: Dropping log buffer, 
> can't keep up.
> [Aug 14 21:07:42.865] Server {0x2ae3a8887700} WARNING: Dropping log buffer, 
> can't keep up.
> [Aug 14 21:07:42.865] Server {0x2ae3a8887700} WARNING: Dropping log buffer, 
> can't keep up.
> [Aug 14 21:07:42.865] Server {0x2ae3a8887700} WARNING: Dropping log buffer, 
> can't keep up.
> [...]
> [Aug 14 21:07:42.876] Server {0x2ae3a8887700} WARNING: Dropping log buffer, 
> can't keep up.
> [Aug 14 21:07:42.876] Server {0x2ae3a8887700} WARNING: Dropping log buffer, 
> can't keep up.
> [Aug 14 21:07:42.876] Server {0x2ae3a8887700} WARNING: Dropping log buffer, 
> can't keep up.
> [Aug 14 21:07:42.876] Server {0x2ae3a8887700} WARNING: Dropping log buffer, 
> can't keep up.
> NOTE: Traffic Server received Sig 11: Segmentation fault
> /home/y/bin/traffic_server - STACK TRACE: 
> /lib64/libpthread.so.0[0x383f00f500]
> /home/y/bin/traffic_server(_ZN9LogAccess11marshal_memEPcPKcii+0x48)[0x58a118]
> /home/y/bin/traffic_server(_ZN13LogAccessHttp28marshal_client_req_url_canonEPc+0x20)[0x58c3f0]
> /home/y/bin/traffic_server(_ZN12LogFieldList7marshalEP9LogAccessPc+0x32)[0x59d5a2]
> /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x399)[0x5a7ed9]
> /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
> /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
> /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
> /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
> /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
> /home/y/bin/traffic_server[0x673871]
> /home/y/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x847)[0x6756e7]
> /home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x286)[0x66e076]
> /home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0xb4)[0x696ce4]
> /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4c3)[0x697673]
> /home/y/bin/traffic_server[0x695cb2]
> /lib64/libpthread.so.0[0x383f007851]



--
This message was sent by Atlassian JIRA
(v6.1#6144)
