#18910: distributing descriptors across CollecTor instances
-------------------------------+-----------------------------------
 Reporter:  iwakeh             |          Owner:  iwakeh
     Type:  enhancement        |         Status:  needs_information
 Priority:  High               |      Milestone:  CollecTor 1.1.0
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:  ctip               |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+-----------------------------------

Comment (by karsten):

 Hmm, the suggested config options would imply that there's only one new
 sync manager module that syncs all descriptors from the various sources
 and that runs, say, once per hour?  I wonder how to schedule that in a way
 that it does not interfere with the other modules.  So far, modules were
 pretty much independent, but this new module would create a dependency
 between modules.

 Alternative suggestion: we add four (sets of) configurations, one for each
 module, that internally re-use the same code for syncing descriptors and
 for importing them.  For example, `SyncRelayDescriptors`,
 `SyncBridgeDescriptors`, `SyncExitLists`, and `SyncTorperfFiles`.  We
 could then provide a remote path where to find descriptor files (like
 `/recent/relay-descriptors/`) and could implicitly only consider
 descriptor types that the respective module understands (like
 `RelayServerDescriptor`, `RelayExtraInfoDescriptor`, etc., but not
 `BridgeServerDescriptor`).

 (If we're worried that there are too many config options already, I'm more
 than happy to make a list of options that can go away!  But this shouldn't
 mean we should hold back useful new options.)

 Here's a potential policy we could apply to decide whether to keep a local
 or remote descriptor: while syncing, if we find out that a remotely
 obtained descriptor would be stored under a file name that already exists
 locally, we always discard the remote copy; and while processing
 descriptors locally, if we find that we already have a file locally with
 different content, which we likely received while syncing, we always
 overwrite that file.  This means that syncing only ever adds data but
 never replaces data.

 Regarding deleting synced descriptors, we should never do that ourselves,
 but we should rather let `DescriptorCollector` clean up the local
 directory when it finds that a local file does not exist anymore remotely.

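 The keep-local-or-remote policy above can be sketched roughly as follows.
 This is only an illustration under stated assumptions: it is written in
 Python for brevity rather than as CollecTor code, and the function names
 `store_synced` and `store_processed` as well as the directory layout are
 hypothetical.

```python
from pathlib import Path


def store_synced(local_dir: Path, rel_name: str, content: bytes) -> bool:
    """Syncing step (hypothetical helper): keep a remotely obtained
    descriptor only if no local file with that name exists yet."""
    target = local_dir / rel_name
    if target.exists():
        # A local file already has this name: discard the remote copy.
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(content)
    return True


def store_processed(local_dir: Path, rel_name: str, content: bytes) -> bool:
    """Local processing step (hypothetical helper): a locally processed
    descriptor overwrites an existing file with different content, which
    was most likely received while syncing."""
    target = local_dir / rel_name
    if target.exists() and target.read_bytes() == content:
        # Identical content is already stored; nothing to write.
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(content)
    return True
```

 Both helpers return whether they wrote a file, which a caller could use to
 decide what else needs updating.  Deletion is deliberately left out of the
 sketch, since cleanup is meant to be left to `DescriptorCollector`.
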
 Here's something else to watch out for while writing this code: whenever
 we learn descriptors from syncing, we'll have to include them in our
 `/recent/` directory, too.  This wasn't entirely clear to me from the
 description above, so if this was already the plan, never mind.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18910#comment:14>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs