#20380: Expand INSTALL.md to a more complete operator's guide
-------------------------------+---------------------------------
 Reporter:  karsten            |          Owner:
     Type:  enhancement        |         Status:  needs_review
 Priority:  Medium             |      Milestone:  CollecTor 1.1.0
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:                     |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+---------------------------------

Comment (by karsten):

Thanks for the detailed feedback! Please take a look at [https://gitweb.torproject.org/karsten/metrics-db.git/log/?h=task-20380 my updated task-20380 branch] for changes discussed below.

Replying to [comment:10 iwakeh]:
> Replying to [comment:9 karsten]:
> > ...
> > A few thoughts:
> >
> > - When you say that closer monitoring will be needed when disk space drops below a given number, do you mean 200G or 20G or a different number?
>
> I was referring to the disk space available when starting, i.e. very close to 150G and logging to the same disk requires more attention than a terabyte setup. Hmm, but if that is confusing just discard it.

Ah, now I understand. Hmm, I think I'd rather pick a different number than 150G than going into more detail there. After all, a CollecTor instance that doesn't download and serve the full tarball archive will need a lot less than 150G, and an instance that does serve tarballs might run out of disk space in a year or two even with 150G. Let's just change it to 200G to have some more room to breathe.

> > - We shouldn't add new section headers easily. The chosen section headers and even paragraphs in this document (will) have equivalents in the other operator's guides for other metrics tools. If we want to add new sections, we'll also have to add those sections to the other manuals. The current sections are:
> >
> > {{{
> > $ grep "^#" INSTALL.md
> > # CollecTor Operator's Guide
> > ## Setting up the host
> > ## Setting up the service
> > ## Maintaining the service
> > }}}
>
> It's important to have a consistent structure, but it would be helpful for readers to have sub-headings, which are application dependent. Scrolling through a document with only generic headings when looking for particular information takes longer (of course, there is a search).
> So, maybe keep the top level consistent and allow for application dependent headlines below?

I admit that there could be more sections, though I have not yet given up on keeping them independent of the application. I added a few more section headers.

> > - (continued) What other sections or even subsections should we include, and what instructions would go into those vs. the existing sections?
>
> I see two more sections.
> * 'Planning the Service' contrasts those sections giving a to-do list. People running instances will have different needs that can be better covered this way.
> * and even more important, a section 'Bootstrapping' or similar. What data to download before a first run etc. Again this is not a to-do list as it depends what data should be processed.

Added the first but not yet the second. Let me know if anything is still missing.

> > > The idea behind my changes is that I think the service shouldn't be run from the unpacked tar folder. The tar contains a development environment, so the jar would disappear after 'ant clean' or changed etc.
> > > The runtime directory should only contain files that are really necessary for the application or which were created by the application.
> > > Hope this doesn't make the description too complicated.
> >
> > Yes, makes sense, let's change that. There are still a few paths left where we refer to files in `collector-<version>/` and where we should tell the user to copy those files to the working directory and run them from there. I can update those places.
> >
> > > I also would like have even less description of tools from the OS, because such things should be decided by the operator.
> >
> > Which parts would that include? The crontab, `@reboot`, `screen`, etc.? Can you make a list?
>
> When we avoid mentioning any such tools and methods, we avoid getting out of date and stay platform independent. People operating servers have their favorite tools for and know what to do when told
>
> * run this script every three days
> * provide an http server for serving data and files in folders X, Y, Z.
> * for continuous operation ensure start-up on reboot and
> * monitoring of logs as well as running service is important
> etc.
>
> CollecTor does not depend on apache or crontab only the services provided by them. Even the suggested install of openjdk could be left out. Also apt-get. Attempt of a list:
>
> * apt-get
> * apache2
> * crontab
> * gpg
> * openjdk, only Java 7
> * screen
> * ...

Agreed with almost all changes mentioned here, except for Apache. I believe that CollecTor depends on Apache to put together its `header.html`, `footer.html`, and to create directory listings. I haven't tried out other HTTP servers, but unless somebody has, I don't want to recommend any HTTP server if what we really need is an Apache. (Note that this is different for Metrics, Onionoo, and ExoneraTor which can all work with any HTTP server that can forward requests to Tomcat/Jetty.)

> Another thing would be to use `<OutPath>/recent` and similar instead of the default choices provided. So, it is clear which option is referred to.

Good idea.

> The backup recommendation I would also leave out. It depends on the setup and the kind of data collected. Or, move it to 'Planning the service'?

I'm not sure. This seems like a question that new operators might have, though maybe not during the setup process when they're not yet certain that they will succeed. That recommendation would probably benefit from a section header, so that people who don't care can skip it more easily. Changed.

Please take another look. Thanks!

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/20380#comment:11>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs