#18910: distributing descriptors across CollecTor instances
-------------------------------+-----------------------------------
 Reporter:  iwakeh             |          Owner:  iwakeh
     Type:  enhancement        |         Status:  needs_information
 Priority:  High               |      Milestone:  CollecTor 1.1.0
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:  ctip               |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+-----------------------------------
Changes (by iwakeh):

 * priority:  Medium => High

Comment:

 The following is a summary of the discussion above and elsewhere, and
 should give an overview of the functionality of the first sync version.

 == Functionality and design of descriptor distribution in CollecTor 1.1.0

 === Configuration

 1. General settings
  Add a SyncManager configuration in the Scheduler section of the
  properties file. The property `SyncFolder` contains the path for storing
  the downloaded descriptors.
 1. Choice of sync-sources
  Add a configuration property `SyncSources` containing an array of
  strings, each specifying a source name and source URL for a CollecTor
  instance to retrieve descriptors from. This setup is similar to the
  current torperf configuration.
 1. Choice of descriptors
  Add a configuration property `SyncDescriptorLists` containing
  space-separated entries. Each entry is a comma-separated list that
  starts with a source name defined in `SyncSources`, followed by the
  descriptor designations to fetch from that source.
 1. Backup of replaced local files
  If `KeepReplaceBackup` is set to true, keep a copy of the old local
  descriptors in `BackupFolder`.

 === SyncManager

 The SyncManager module will be started by the Scheduler according to the
 configuration defined above. Each SyncManager run will perform the
 following steps:

 a. Retrieve descriptors from the CollecTor instances defined in
  `SyncSources`. These descriptors are stored in `SyncFolder` under the
  host part of the instance's URL, e.g.
  {{{my-sync-folder/collector.torproject.org/recent/exit-lists}}} for exit
  lists from the main instance.
 b. Following retrieval, the fetched descriptors are examined:
   i. Discard descriptor files that do not contain what they should (see
    comment:11) and log a warning with sync-source info and reason (see
    criteria).
   i. Move valid descriptors (see criteria) without a pre-existing local
    copy to the local store.
   i. If there is a local copy already, decide which copy to keep (see
    criteria):
     I. The local copy is kept: log a debug message with source and
      reason and delete the fetched descriptor.
     I. Local and fetched copies are identical: log a debug message with
      source and reason and delete the fetched descriptor.
     I. The fetched copy should replace the local descriptor: if
      `KeepReplaceBackup` is true, move the local copy to `BackupFolder`
      and move the fetched copy to main storage. If `KeepReplaceBackup`
      is false, replace the local copy with the fetched one. In all cases
      log a debug message with source and reason.

 === Replacement criteria

 The replacement criteria are not fully defined yet, and it is very likely
 that more criteria will be added in the future. A modular/pluggable
 approach therefore seems useful, i.e.:

 1. Define `KeepCriterium` and `ReplaceCriterium` interfaces.
 1. Register implementing classes with the SyncManager, which will apply
  them in the selection steps described above.

 == Open Questions

 A. Which `KeepCriterium` and `ReplaceCriterium` classes should be
  implemented initially? Currently there are:
   1. a `ReplaceCriterium`: keep the consensus with more signatures, and
   1. a `KeepCriterium`: only keep descriptors that contain what they
    claim to be.
   1. Are there more criteria that should be implemented with release
    1.1.0?
 A. Should the applied criteria be configurable? E.g., this could be done
  by listing the classes in collector.properties, but we already have more
  than fifty config settings, which is a lot.
 A. The data combination mentioned in comment:11 part two is not yet
  considered, but the design will be open to adding this later. Some
  questions anyway: What kind of data enhancement could there be? What
  about descriptor signatures?

 -----

 Set to high in order to solve the open questions quickly.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18910#comment:13>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs
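
To make the selection steps in the SyncManager section of this comment more concrete, here is a minimal Java sketch of the per-descriptor decision with pluggable criteria. Only the interface names `KeepCriterium` and `ReplaceCriterium` come from the ticket. Everything else is an assumption made for illustration: the class `SyncSketch`, the enum `Decision`, the method signatures `isValid` and `shouldReplace`, the helper `countSignatures`, and the rule that a single matching `ReplaceCriterium` is enough to replace the local copy. File moves, `BackupFolder` handling, and logging are left out, and counting lines that start with `directory-signature` is a simplified stand-in for "the consensus with more signatures".

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of the decision taken for one fetched descriptor. */
final class SyncSketch {

  /** Possible outcomes, mirroring steps b.i to b.iii of the comment. */
  enum Decision { DISCARD_INVALID, STORE_NEW, KEEP_LOCAL, IDENTICAL, REPLACE_LOCAL }

  /** Interface name from the ticket; the method signature is assumed. */
  interface KeepCriterium {
    boolean isValid(String descriptorType, String content);
  }

  /** Interface name from the ticket; the method signature is assumed. */
  interface ReplaceCriterium {
    boolean shouldReplace(String local, String fetched);
  }

  private final List<KeepCriterium> keepCriteria;
  private final List<ReplaceCriterium> replaceCriteria;

  SyncSketch(List<KeepCriterium> keepCriteria, List<ReplaceCriterium> replaceCriteria) {
    this.keepCriteria = keepCriteria;
    this.replaceCriteria = replaceCriteria;
  }

  /** Applies the registered criteria in the order described in the comment. */
  Decision decide(String descriptorType, String fetched, Optional<String> local) {
    for (KeepCriterium criterium : keepCriteria) {
      if (!criterium.isValid(descriptorType, fetched)) {
        return Decision.DISCARD_INVALID;   // step b.i
      }
    }
    if (!local.isPresent()) {
      return Decision.STORE_NEW;           // step b.ii
    }
    if (local.get().equals(fetched)) {
      return Decision.IDENTICAL;           // step b.iii, identical copies
    }
    for (ReplaceCriterium criterium : replaceCriteria) {
      if (criterium.shouldReplace(local.get(), fetched)) {
        return Decision.REPLACE_LOCAL;     // step b.iii, fetched copy wins
      }
    }
    return Decision.KEEP_LOCAL;            // step b.iii, local copy wins
  }

  /** Simplified signature count: lines starting with "directory-signature ". */
  static long countSignatures(String consensus) {
    long count = 0;
    for (String line : consensus.split("\n")) {
      if (line.startsWith("directory-signature ")) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    // Assumed validity check: content starts with the expected @type annotation.
    KeepCriterium typed = (type, content) -> content.startsWith("@type " + type);
    ReplaceCriterium moreSignatures =
        (local, fetched) -> countSignatures(fetched) > countSignatures(local);
    SyncSketch sync = new SyncSketch(
        Collections.singletonList(typed), Collections.singletonList(moreSignatures));

    String local = "@type network-status-consensus-3 1.0\ndirectory-signature a b\n";
    String fetched = local + "directory-signature c d\n";
    System.out.println(
        sync.decide("network-status-consensus-3", fetched, Optional.of(local)));
  }
}
```

Keeping the decision separate from the file operations means a new criterion can be registered without touching the code that moves or backs up descriptors. Whether the list of registered classes should be read from collector.properties is the second open question above and is not settled by this sketch.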