[ 
https://issues.apache.org/jira/browse/TS-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susan Hinrichs updated TS-4915:
-------------------------------
    Attachment: ts-4915.diff

Updating my patch.  Got another assert where entry->index >= _v.length() in 
erase().  To erase a value, erase() pulls the last value and puts it in the 
erase_index target.  But it does not update the moved entry's index value, so 
that index will soon be invalid (i.e. no longer less than the vector length).  
So when we later erase that moved value, the assertion will fail.  If we used 
that index to pull the value out of the vector, it would be accessing unused 
data.
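
To make the stale-index problem concrete, here is a minimal sketch of a 
swap-with-last erase.  Entry, SketchQueue and the fix_index flag are 
hypothetical names for illustration only; this is not the code in 
lib/ts/PriorityQueue.h, and heap reordering is left out so that only the index 
bookkeeping is visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a queue entry that remembers its own slot.
struct Entry {
  int value;
  std::size_t index; // position of this entry in SketchQueue::slots
};

// Hypothetical, simplified queue: no ordering, only index bookkeeping.
class SketchQueue
{
public:
  void
  push(Entry *e)
  {
    e->index = slots.size();
    slots.push_back(e);
  }

  // Erase by moving the last entry into the erased slot, then shrinking.
  // fix_index == false mimics the behaviour described in the comment above.
  void
  erase(Entry *e, bool fix_index)
  {
    assert(e->index < slots.size());
    std::size_t target = e->index;
    Entry *last        = slots.back();
    slots[target]      = last;
    if (fix_index) {
      last->index = target; // the update the comment says was missing
    }
    slots.pop_back();
  }

  // True when the entry's recorded index still points at the entry itself.
  bool
  index_is_valid(const Entry *e) const
  {
    return e->index < slots.size() && slots[e->index] == e;
  }

private:
  std::vector<Entry *> slots;
};

// Push three entries, erase the first, and report whether the entry that was
// moved into its slot still has a usable index.
bool
moved_entry_valid_after_erase(bool fix_index)
{
  Entry a{1, 0}, b{2, 0}, c{3, 0};
  SketchQueue q;
  q.push(&a);
  q.push(&b);
  q.push(&c);
  q.erase(&a, fix_index); // c is moved into slot 0
  return q.index_is_valid(&c);
}
```

In this sketch, calling moved_entry_valid_after_erase(false) leaves c with 
index 2 in a vector of length 2, which is the condition the assert in erase() 
would trip on the next time c is erased.  With fix_index set, c records slot 0 
and stays valid.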

Added assert(false) to PriorityQueue<>::pop() because I don't think it is 
called, and its index-maintaining logic doesn't make sense.  I want to see 
whether it is in fact used.

> Crash from hostdb in PriorityQueueLess
> --------------------------------------
>
>                 Key: TS-4915
>                 URL: https://issues.apache.org/jira/browse/TS-4915
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: HostDB
>            Reporter: Susan Hinrichs
>            Priority: Blocker
>             Fix For: 7.1.0
>
>         Attachments: ts-4915.diff, ts-4915.diff
>
>
> Saw this while testing fix for TS-4813 with debug enabled.
> {code}
> (gdb) bt full
> #0  0x0000000000547bfe in RefCountCacheHashEntry::operator< (this=0x1cc0880, 
> v2=...) at ../iocore/hostdb/P_RefCountCache.h:94
> No locals.
> #1  0x000000000054988d in 
> PriorityQueueLess<RefCountCacheHashEntry*>::operator() (this=0x2b78a9a2587b, 
> a=@0x2b78f402af68, b=@0x2b78f402aa28)
>     at ../lib/ts/PriorityQueue.h:41
> No locals.
> #2  0x0000000000549785 in PriorityQueue<RefCountCacheHashEntry*, 
> PriorityQueueLess<RefCountCacheHashEntry*> >::_bubble_up (this=0x1cb2990, 
>     index=2) at ../lib/ts/PriorityQueue.h:191
>         comp = {<No data fields>}
>         parent = 0
> #3  0x00000000006ecfcc in PriorityQueue<RefCountCacheHashEntry*, 
> PriorityQueueLess<RefCountCacheHashEntry*> >::push (this=0x1cb2990, 
>     entry=0x2b78f402af60) at ../../lib/ts/PriorityQueue.h:91
>         len = 2
> #4  0x00000000006ec206 in RefCountCachePartition<HostDBInfo>::put 
> (this=0x1cb2900, key=6912554662447498853, item=0x2b78aee04f00, size=96, 
>     expire_time=1475202356) at ./P_RefCountCache.h:210
>         expiry_entry = 0x2b78f402af60
>         __func__ = "put"
>         val = 0x1cc0880
> #5  0x00000000006eb3de in RefCountCache<HostDBInfo>::put (this=0x18051e0, 
> key=6912554662447498853, item=0x2b78aee04f00, size=16, 
>     expiry_time=1475202356) at ./P_RefCountCache.h:462
> No locals.
> #6  0x00000000006e2d8e in HostDBContinuation::dnsEvent (this=0x2b7938020f00, 
> event=600, e=0x2b78ac009440) at HostDB.cc:1422
>         is_rr = false
>         old_rr_data = 0x0
>         first_record = 0x2b78ac0094f8
>         m = 0x1
>         failed = false
>         old_r = {m_ptr = 0x0}
>         af = 2 '\002'
>         s_size = 16
>         rrsize = 0
>         allocSize = 16
>         r = 0x2b78aee04f00
>         old_info = {<RefCountObj> = {<ForceVFPTToTop> = {_vptr.ForceVFPTToTop 
> = 0x7f3630}, m_refcount = 0}, iobuffer_index = 0, 
>           key = 47797242059264, app = {allotment = {application1 = 5326300, 
> application2 = 0}, http_data = {http_version = 4, 
>               pipeline_max = 59, keepalive_timeout = 17, fail_count = 81, 
> unused1 = 0, last_failure = 0}, rr = {offset = 5326300}}, data = {
>             ip = {sa = {sa_family = 54488, sa_data = 
> "^\000\000\000\000\000\020\034$\274x+\000"}, sin = {sin_family = 54488, 
> sin_port = 94, 
>                 sin_addr = {s_addr = 0}, sin_zero = "\020\034$\274x+\000"}, 
> sin6 = {sin6_family = 54488, sin6_port = 94, sin6_flowinfo = 0, 
>                 sin6_addr = {__in6_u = {__u6_addr8 = 
> "\020\034$\274x+\000\000\030\036$\274\375\b\000", __u6_addr16 = {7184, 48164, 
> 11128, 
>                       0, 7704, 48164, 2301, 0}, __u6_addr32 = {3156483088, 
> 11128, 3156483608, 2301}}}, sin6_scope_id = 3156478176}}, 
>             hostname_offset = 6214872, srv = {srv_offset = 54488, srv_weight 
> = 94, srv_priority = 0, srv_port = 0, key = 3156483088}}, 
>           hostname_offset = 11128, ip_timestamp = 2845989456, 
> ip_timeout_interval = 11128, is_srv = 0, reverse_dns = 0, round_robin = 1, 
>           round_robin_elt = 0}
>         valid_records = 0
>         tip = {_family = 2, _addr = {_ip4 = 540420056, _ip6 = {__in6_u = 
> {__u6_addr8 = "\330'6 x+\000\000\360L\020\250x+\000", 
>                 __u6_addr16 = {10200, 8246, 11128, 0, 19696, 43024, 11128, 
> 0}, __u6_addr32 = {540420056, 11128, 2819640560, 11128}}}, 
>             _byte = "\330'6 x+\000\000\360L\020\250x+\000", _u32 = 
> {540420056, 11128, 2819640560, 11128}, _u64 = {47794936489944, 
>               47797215710448}}}
>         ttl_seconds = 132
>         aname = 0x2b7938021000 "fbmm1.zenfs.com"
>         offset = 96
>         thread = 0x2b78a8101010
>         __func__ = "dnsEvent"
> #7  0x00000000005145dc in Continuation::handleEvent (this=0x2b7938020f00, 
> event=600, data=0x2b78ac009440)
>     at ../iocore/eventsystem/I_Continuation.h:153
> No locals.
> #8  0x00000000006f681e in DNSEntry::postEvent (this=0x2b78f4028600) at 
> DNS.cc:1269
>         __func__ = "postEvent"
> #9  0x00000000005145dc in Continuation::handleEvent (this=0x2b78f4028600, 
> event=1, data=0x2aac954db040)
>     at ../iocore/eventsystem/I_Continuation.h:153
> No locals.
> #10 0x00000000007bc9be in EThread::process_event (this=0x2b78a8101010, 
> e=0x2aac954db040, calling_code=1) at UnixEThread.cc:143
>         c_temp = 0x2b78f4028600
>         lock = {m = {m_ptr = 0x17dea10}, lock_acquired = true}
>         __func__ = "process_event"
> #11 0x00000000007bcc2d in EThread::execute (this=0x2b78a8101010) at 
> UnixEThread.cc:197
>         done_one = false
>         e = 0x2aac954db040
>         NegativeQueue = {<DLL<Event, Event::Link_link>> = {head = 0x18ce400}, 
> tail = 0x18ce400}
>         next_time = 1475191803711988905
>         __func__ = "execute"
> #12 0x00000000007bbfd2 in spawn_thread_internal (a=0x17fb9a0) at Thread.cc:84
>         p = 0x17fb9a0
> #13 0x00002b78a2555aa1 in start_thread () from /lib64/libpthread.so.0
> No symbol table info available.
> #14 0x00000032310e893d in clone () from /lib64/libc.so.6
> No symbol table info available.
> core == ET_NET 13 and core == ET_NET 20
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
