#19934: CollecTor should use new metrics-lib json classes
-------------------------------+---------------------------------
 Reporter:  iwakeh             |          Owner:  iwakeh
     Type:  enhancement        |         Status:  needs_review
 Priority:  Medium             |      Milestone:  CollecTor 1.1.0
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:                     |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+---------------------------------

Comment (by karsten):

 Replying to [comment:4 iwakeh]:
 > Some thoughts:
 >
 > 1. The implementation of #18910 requires CollecTor to have a Java representation of index.json to choose the documents to download from the partner-synch-Collector instance(s), especially with the pick-and-choose requirements from comment:11 in #18910.

 So, this is about using metrics-lib `*Node` classes for obtaining
 descriptors, not for providing `index.json*` files, right?

 I don't recall the exact requirements we discussed for #18910, and I think
 we discussed quite a few variations there. But what we can already do is
 specify an array of directories to synchronize. The local CollecTor
 instance would then decide locally, from looking at the synchronized
 files, which ones to keep and copy over and which ones to ignore.

 What we ''could'' do is pass a list of excluded paths to
 `DescriptorCollector`, which would contain paths of consensuses and votes,
 possibly even with last modified times, that we already have and that we
 don't want to synchronize. However, this feels a bit like premature
 optimization. There's no big harm in downloading the entire `recent/`
 folder from a remote CollecTor instance and deciding locally what to do
 with the data. We're moving around larger chunks of bytes than that.

 > 2. Shouldn't a CollecTor instance have more fine grained control when creating index.json? More than just specifying a directory. Currently it includes already `recent` and `archive`.

 Well, we could accept an array of directories instead of just one, if that
 helps. Or a base directory and an array of contained subdirectories to
 include. Whatever is most intuitive and does the job.

 > 3. The package was designated as '''alpha''' to prevent too early reliance on the new API to have more flexibility when implementing #18910. It is well tested in metrics-lib.

 Oh, I'm not worried that it might not be tested well enough. I'm worried
 about making the API bigger and having to maintain these parts in the
 future.
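
 To make the excluded-paths idea from point 1 a bit more concrete, here is a
 rough sketch of the local decision step. It assumes both sides can be
 reduced to a map from path to last modified time in milliseconds. The class
 `SyncFilter`, the method `selectToFetch`, and the file names are all made up
 for illustration; nothing here is existing metrics-lib or CollecTor API, and
 it does not touch `DescriptorCollector` at all.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration; not part of metrics-lib. */
public class SyncFilter {

  /**
   * Returns the remote paths worth fetching: those we lack locally, or
   * whose remote last modified time is newer than that of our local copy.
   * Both maps go from path to last modified time in milliseconds.
   */
  public static List<String> selectToFetch(Map<String, Long> remote,
      Map<String, Long> alreadyHave) {
    List<String> toFetch = new ArrayList<>();
    // Copy into a TreeMap so the result comes out in sorted path order.
    for (Map.Entry<String, Long> e : new TreeMap<>(remote).entrySet()) {
      Long localMillis = alreadyHave.get(e.getKey());
      if (localMillis == null || localMillis < e.getValue()) {
        toFetch.add(e.getKey());
      }
    }
    return toFetch;
  }

  public static void main(String[] args) {
    // Made-up paths and timestamps, only to show the call.
    Map<String, Long> remote = new HashMap<>();
    remote.put("recent/example/consensus-1", 1000L);
    remote.put("recent/example/consensus-2", 2000L);
    remote.put("recent/example/vote-1", 3000L);

    Map<String, Long> alreadyHave = new HashMap<>();
    alreadyHave.put("recent/example/consensus-1", 1000L); // up to date
    alreadyHave.put("recent/example/consensus-2", 1500L); // stale copy

    System.out.println(selectToFetch(remote, alreadyHave));
  }
}
```

 The same comparison works whether it runs before downloading (as an
 exclusion list) or afterwards on the synchronized files, which is the
 simpler variant preferred above.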

 This whole `index.json*` stuff is something that library users ideally
 shouldn't have to worry about. That's why I'm still trying hard to hide it
 away as best as I can. If this turns out to be impossible or
 impracticable, so be it. But I'm not there yet. :)

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/19934#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs