[ https://issues.apache.org/jira/browse/TS-949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
B Wyatt updated TS-949:
-----------------------
Attachment: explicit-pair.patch
I made a quick patch which converts the implied pairing of elements in the
rtable array into an explicit pair (it applies on top of TS-949-jp2.patch).
It is a non-functional change; however, I thought it might make future
review/modification a little easier.
Feel free to toss it in the circular file; it won't hurt my feelings.
> key->volume hash table is not consistent when a disk is marked as bad or
> removed due to failure
> -----------------------------------------------------------------------------------------------
>
> Key: TS-949
> URL: https://issues.apache.org/jira/browse/TS-949
> Project: Traffic Server
> Issue Type: Bug
> Components: Cache
> Affects Versions: 3.1.0
> Environment: Multi-volume cache with apparently faulty drives
> Reporter: B Wyatt
> Assignee: John Plevyak
> Fix For: 3.1.2
>
> Attachments: TS-949-jp-1.patch, TS-949-jp2.patch, TS949-BW-p1.patch,
> explicit-pair.patch
>
>
> The method for resolving collisions when distributing hash-table space to
> volumes for the object_key->volume hash table creates inconsistency when a
> disk is determined to be bad, or when a failed disk is removed from the
> volume.config.
> Background:
> The hash space is distributed by a round-robin draft in which each volume
> "drafts" a random index in the hash table until the hash space is exhausted.
> The random order in which a given volume drafts hash table slots is
> consistent across reboot/crash/disk-failure. However, when a volume attempts
> to draft a slot that is already occupied, it skips to its next random pick
> and keeps trying until it finds an open slot. This ensures that the hash is
> partitioned evenly between volumes.
> The issue:
> Resolving slot contention breaks the consistency because the outcome depends
> on the order in which the volumes draft. When rebuilding the hash after disk
> failure or reboot with fewer drives, a volume may secure an index that was
> previously occupied by the dead disk. In the old hash, the surviving volume
> would have selected another random index due to contention. If that other
> index is taken by the next draft round, it will represent an inconsistent
> key->volume result. The effects of one inconsistency then cascade, because
> whichever volume occupies that index after removing a dead disk is now
> behind on its draft sequence as well.
> An Example:
> ||Disk||Draft Sequence||
> |A|1,4,7,5|
> |B|4,2,8,1|
> |C|3,7,5,2|
> Pre-failure Hash Table after 2 rounds of draft:
> |A|B|C|B|C|?|A|?|
> Post-failure of drive B Hash Table after 3 rounds of draft:
> |A|C|C|A|{color:red}A{color}|?|{color:red}C{color}|?|
> Two slots have become inconsistent and more will probably follow. Each
> inconsistent slot leaves objects stored in a volume but lost to the
> top-level cache for open/lookup.
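
The example in the description can be replayed with a small simulation. This is only a sketch of the draft as the description words it, not the Traffic Server code: the names `draft`, `render` and `SEQUENCES` are made up for illustration, slots are numbered from 1 as in the example, and the draft sequences are the fixed ones from the table rather than real random picks.

```python
def draft(sequences, table_size, rounds):
    """Round-robin draft: each volume takes its next pick that is still free."""
    table = [None] * table_size
    cursors = {name: 0 for name in sequences}  # position in each draft sequence
    for _ in range(rounds):
        for name, picks in sequences.items():
            i = cursors[name]
            # Skip picks whose slot is already occupied (slot contention).
            while i < len(picks) and table[picks[i] - 1] is not None:
                i += 1
            if i < len(picks):
                table[picks[i] - 1] = name
                i += 1
            cursors[name] = i
    return table


def render(table):
    """Format a table the same way as the rows in the description."""
    return "|" + "|".join(slot or "?" for slot in table) + "|"


# Draft sequences copied from the example table (1-based slot numbers).
SEQUENCES = {
    "A": [1, 4, 7, 5],
    "B": [4, 2, 8, 1],
    "C": [3, 7, 5, 2],
}

before = draft(SEQUENCES, table_size=8, rounds=2)

# Drive B fails: rebuild with the surviving volumes only.
survivors = {name: picks for name, picks in SEQUENCES.items() if name != "B"}
after = draft(survivors, table_size=8, rounds=3)

# A slot is inconsistent if a surviving volume owned it before the failure
# and a different volume owns it afterwards.
inconsistent = [
    slot + 1
    for slot, (old, new) in enumerate(zip(before, after))
    if old not in (None, "B") and new is not None and new != old
]

print(render(before))
print(render(after))
print(inconsistent)
```

Slots that belonged to B are expected to change owner; the slots reported in `inconsistent` are the ones where a surviving volume lost a slot it held before the failure, which is the case the bug describes.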