[
https://issues.apache.org/jira/browse/TS-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944635#comment-13944635
]
Alan M. Carroll commented on TS-2594:
-------------------------------------
The suspected change for this problem is adding a string to the hdrtoken_strs
array. This array is used to initialize
the token (WKS) indices of the hard-coded keys. Adding a string before the last
of the special keys will cause the WKS
index for those keys to change. This can cause a chain of errors leading to the
behavior observed in TS-2564 and
TS-2594.
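
To illustrate the index shift, here is a minimal sketch. The two tables are
hypothetical, shortened stand-ins and not the real contents of hdrtoken_strs,
and find_wks_index() is a made-up helper; only the names "Vary" and
"User Content" come from this issue.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical token table as it looked when the object was cached.
static const char *const kOldTokens[] = {"Accept", "User Content", "Vary"};

// Hypothetical token table after a string was inserted before the last keys.
static const char *const kNewTokens[] = {"Accept", "New-Field", "User Content", "Vary"};

// Return the position of `name` in `table`, or -1 if it is not present.
// The position plays the role of the WKS index in this sketch.
template <std::size_t N>
int find_wks_index(const char *const (&table)[N], const char *name) {
  for (std::size_t i = 0; i < N; ++i) {
    if (std::strcmp(table[i], name) == 0) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

In this sketch "Vary" moves from index 2 to index 3, and index 2 now names
"User Content", which mirrors the 65 to 66 shift described for TS-2594.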
The failure scenario is that first an object is cached on disk. During a later
read the object is revalidated with a
fetch from the origin server. In this case fields can be added and removed from
the MIME data. This will cause the
presence bits and slot accelerators to be updated. The problem is that the
update logic uses the WKS index values stored
in the MIME field to look up the presence bit and slot accelerator values. If
the WKS index has changed then the wrong
data will be found, in turn causing the wrong bits or slot accelerator to be updated. This
can leave the MIME instance in a bad
state, with an accelerator pointing at a deleted slot or a presence bit set
that should have been unset.
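
The presence bit half of this can be sketched as follows. This is a simplified
model under assumed names (kPresenceMask, MimeState, note_field_removed), not
the actual Traffic Server code: it only shows how an index from an older table
selects the wrong mask.

```cpp
#include <cstdint>

// Hypothetical presence masks, indexed by the WKS index of the CURRENT build.
static const std::uint64_t kPresenceMask[] = {
  std::uint64_t(1) << 0, // 0: "Accept"
  0,                     // 1: newly inserted string, no presence bit here
  std::uint64_t(1) << 1, // 2: "User Content"
  std::uint64_t(1) << 2, // 3: "Vary"
};

// Hypothetical, minimal MIME state: just the presence bits.
struct MimeState {
  std::uint64_t presence;
};

// Clear the presence bit selected by the WKS index stored in the field.
// If that index was written by an older build, the wrong mask is chosen.
void note_field_removed(MimeState &state, int stored_wks_index) {
  state.presence &= ~kPresenceMask[stored_wks_index];
}
```

Removing a cached "Vary" field that still carries the old index 2 clears the
"User Content" bit in this model and leaves the "Vary" bit set.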
In the particular case for TS-2594 what happened was the "Vary" field was
updated. The WKS index stored in the MIME
field was 65, the value for "Vary" when the object was stored in the cache. In
the current version the WKS index for
"Vary" is 66. 65 is the WKS index for "User Content". As a result when the slot
accelerator was updated because of the
change to "Vary" the value 65 was used to find the key metadata giving the
metadata for "User Content". The accelerator
slot for "User Content" was updated but as a consequence the "Vary" slot
accelerator was left pointing at a deleted
field containing an invalid pointer.
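
A rough model of the slot accelerator case is sketched below. All types and
names here (Slot, KeyMeta, Field, Header, delete_field and so on) are
assumptions made for illustration, and small indices 2 and 3 stand in for 65
and 66.

```cpp
#include <vector>

// Hypothetical accelerator slots.
enum Slot { SLOT_USER_CONTENT = 0, SLOT_VARY = 1, SLOT_COUNT = 2 };

// Hypothetical key metadata, indexed by the WKS index of the CURRENT build.
struct KeyMeta {
  const char *name;
  int slot; // accelerator slot for this key, or -1 if it has none
};

static const KeyMeta kCurrentMeta[] = {
  {"Accept", -1},
  {"New-Field", -1},
  {"User Content", SLOT_USER_CONTENT},
  {"Vary", SLOT_VARY},
};

struct Field {
  int wks_index; // index as written when the object was cached
  bool deleted;
};

struct Header {
  std::vector<Field> fields;
  int accel[SLOT_COUNT]; // per slot: position in `fields`, or -1
};

// A header as an older build would have written it: one "Vary" field whose
// stored index is 2 (the old "Vary" index), with the "Vary" accelerator set.
Header make_cached_header() {
  Header h;
  h.fields.push_back(Field{2, false});
  h.accel[SLOT_USER_CONTENT] = -1;
  h.accel[SLOT_VARY] = 0;
  return h;
}

// Delete a field and clear the accelerator chosen via its stored WKS index.
void delete_field(Header &h, int pos) {
  h.fields[pos].deleted = true;
  int slot = kCurrentMeta[h.fields[pos].wks_index].slot;
  if (slot >= 0) {
    h.accel[slot] = -1;
  }
}

// True if the accelerator for `slot` points at a deleted field.
bool accel_is_dangling(const Header &h, int slot) {
  int pos = h.accel[slot];
  return pos >= 0 && h.fields[pos].deleted;
}
```

Deleting the cached "Vary" field in this model clears the "User Content"
accelerator, so the "Vary" accelerator keeps pointing at the deleted field.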
For TS-2564 I was not able to fully examine the core file but I did see enough
to verify the presence bit for "Vary" was
set but no "Vary" field was found. The most likely explanation is that the
"Vary" slot accelerator was not correctly updated to
reflect the new "Vary" field location, leaving the field marked as
present in the presence bit but not found
when searched because the search used the slot accelerator data. Alternatively, this
fell through to the call to
mime_hdr_field_list_search_by_wks() which checks the WKS value for the passed
in key against the stored WKS values which
will not match and so the search will fail even if the key is actually there.
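
The second explanation can be sketched like this. search_by_wks() below is a
hypothetical simplification written for this comment, not the real
mime_hdr_field_list_search_by_wks(); it only compares stored indices, which is
the behavior described above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stored field: the name plus the WKS index written at cache time.
struct StoredField {
  const char *name;
  int wks_index;
};

// Search by WKS index only, never by name. Returns nullptr when no stored
// index equals `wanted`, even if a field with the right name exists.
const StoredField *search_by_wks(const std::vector<StoredField> &fields, int wanted) {
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (fields[i].wks_index == wanted) {
      return &fields[i];
    }
  }
  return nullptr;
}
```

With a "Vary" field stored under an old index of 2, a lookup using a current
"Vary" index of 3 finds nothing in this model, although the field is present.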
It is my view this change explains all of the observed problems, including the
cache upgrade required to cause the failure and that reverting did not fix the
problem (as there would be new objects stored with the updated WKS data causing
failure for the same reason just the other way).
> Segmentation fault for 4.2.0-rc0
> --------------------------------
>
> Key: TS-2594
> URL: https://issues.apache.org/jira/browse/TS-2594
> Project: Traffic Server
> Issue Type: Bug
> Affects Versions: 4.2.0
> Reporter: bettydramit
> Labels: crash
> Fix For: 4.2.0, 5.0.0
>
>
> {code}
> [E. Mgmt] log ==> [TrafficManager] using root directory '/usr'
> [TrafficServer] using root directory '/usr'
> NOTE: Traffic Server received Sig 11: Segmentation fault
> /usr/bin/traffic_server - STACK TRACE:
> /lib64/libpthread.so.0(+0xf70f)[0x2b4e95ab670f]
> /lib64/libc.so.6(memcpy+0xe)[0x2b4e96a4595e]
> /usr/lib64/trafficserver/libtsutil.so.4(StrList::_new_cell(char const*,
> int)+0xef)[0x2b4e939ac88f]
> /usr/bin/traffic_server(StrList::new_cell(char const*, int)+0x9f)[0x5ff919]
> /usr/bin/traffic_server(StrList::append_string(char const*,
> int)+0x28)[0x5ff9ce]
> /usr/bin/traffic_server(mime_field_value_get_comma_list(MIMEField*,
> StrList*)+0x51)[0x602a39]
> /usr/bin/traffic_server(MIMEField::value_get_comma_list(StrList*)+0x22)[0x551052]
> /usr/bin/traffic_server(MIMEHdr::value_get_comma_list(char const*, int,
> StrList*)+0x4a)[0x5510a0]
> /usr/bin/traffic_server(HttpTransactCache::CalcVariability(CacheLookupHttpConfig*,
> HTTPHdr*, HTTPHdr*, HTTPHdr*)+0xab)[0x5a3399]
> /usr/bin/traffic_server(HttpTransactCache::calculate_quality_of_match(CacheLookupHttpConfig*,
> HTTPHdr*, HTTPHdr*, HTTPHdr*)+0xc68)[0x5a19b0]
> /usr/bin/traffic_server(HttpTransactCache::SelectFromAlternates(CacheHTTPInfoVector*,
> HTTPHdr*, CacheLookupHttpConfig*)+0x28c)[0x5a0a7e]
> /usr/bin/traffic_server(CacheVC::openReadStartHead(int,
> Event*)+0x5f2)[0x68e590]
> /usr/bin/traffic_server(Continuation::handleEvent(int, void*)+0x6b)[0x4e7137]
> /usr/bin/traffic_server(CacheVC::handleReadDone(int, Event*)+0x1127)[0x66bfa9]
> /usr/bin/traffic_server(Continuation::handleEvent(int, void*)+0x6b)[0x4e7137]
> /usr/bin/traffic_server(AIOCallbackInternal::io_complete(int,
> void*)+0x3b)[0x671371]
> /usr/bin/traffic_server(Continuation::handleEvent(int, void*)+0x6b)[0x4e7137]
> /usr/bin/traffic_server(EThread::process_event(Event*, int)+0xc7)[0x6d9165]
> /usr/bin/traffic_server(EThread::execute()+0x9f)[0x6d9333]
> /usr/bin/traffic_server(main+0x122f)[0x50e634]
> /lib64/libc.so.6(__libc_start_main+0xfc)[0x2b4e969dad1c]
> /usr/bin/traffic_server[0x4ca098]
> {code}