On Fri, Dec 13, 2019 at 5:46 PM Aishwarya Thangappa <aishwarya.thanga...@microsoft.com.invalid> wrote:
>
> Hello everyone,
>
> I was not subscribed to the dev mailing list earlier and missed some of
> your questions. I will address them here.
>
> @Christopher
> Most of the changes, except the actual installation of the systemd units,
> could possibly go into Accumulo. These would be the systemd units for the
> various Accumulo services and the modifications to the cluster-wide scripts
> in Accumulo to use systemd instead of directly starting/stopping the
> processes. We would be happy to accommodate any suggestions or answer any
> follow-up questions you may have.
>
> Attached the accumulo_cluster and accumulo_service scripts with systemd
> changes.
>
>
> @Keith Turner
> Once we determine where the different pieces should land, I can post PRs
> accordingly. In our current setup, I have added a `use_systemd` flag to the
> muchos.properties file which, when set to true, will overwrite the Accumulo
> cluster-wide scripts on the nodes with the attached ones. These files
> currently reside at ansible/roles/accumulo/files. If we determine that these
> scripts and the systemd unit files will instead go to the Accumulo project,
> I will have to make changes accordingly.
Reading over this, I realized these systemd scripts may enable a different
kind of cluster agitation: instead of killing processes, restart VMs (via AWS
or Azure APIs) and rely on the systemd scripts to restart all of the
processes. There may already be tools for this.

>
> @Michael Wall
> Systemd units internally call the same scripts that accumulo_cluster commands
> currently use. The change is that accumulo_cluster commands would call
> systemd start/stop, which in turn would call accumulo_service commands.
> Attached a sample systemd_unit template. Can you please elaborate if this is
> still an issue?
>
> ________________________________
> From: Aishwarya Thangappa
> Sent: Thursday, December 12, 2019 11:25 AM
> To: dev@fluo.apache.org <dev@fluo.apache.org>
> Cc: Arvind Shyamsundar <arvin...@microsoft.com>; Billie Rinaldi
> <billie.rina...@microsoft.com>
> Subject: Run Accumulo and Hadoop services under systemd
>
> Hi everyone,
>
> While using fluo-muchos to deploy an Accumulo cluster, we recognized the need
> for the various Accumulo and Hadoop services to be run under a service
> manager like systemd, which will ensure that all these services are brought
> up correctly in the event of VM / OS reboots / cold starts. We have made the
> required changes for this and would like to contribute them back to the
> community if there is any interest.
>
> Summarizing what we have done:
>
> Crafted separate systemd unit files for Accumulo (master, monitor, gc,
> tracer, tserver), Hadoop (journalnode, namenode, datanode, resourcemanager,
> nodemanager, zkfc) and Zookeeper services.
> All of these unit files will be copied to the respective nodes'
> /etc/systemd/system directory; the services will then be started and enabled
> by the ansible systemd module.
> In case of num_tservers > 1, multiple tserver systemd units will be copied to
> the node and each will be independently managed.
> Also made necessary changes to the existing cluster-wide scripts, including
> accumulo_cluster, accumulo_service, start_dfs, start_yarn etc., to have them
> work seamlessly with systemd.
>
> Is there an appetite to look at the details? If so, we can post a PR, or if
> there is any feedback or other considerations, please let us know and we
> can discuss them.
>
> Thanks,
> Aishwarya
>
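
To make the agitation idea above a bit more concrete, here is a minimal
sketch of the check that could run after a VM comes back from a restart:
ask systemd which of the expected units are not active. It only relies on
`systemctl is-active --quiet`, which exits 0 when a unit is active. The unit
names are hypothetical placeholders (the real names depend on how the unit
files end up being installed), and the VM restart itself (AWS or Azure API
call) is not shown.

```python
import subprocess
from typing import Callable, Iterable, List

# Hypothetical unit names, for illustration only; the actual names depend on
# the unit files that get installed on each node.
EXPECTED_UNITS = [
    "accumulo-tserver.service",
    "hadoop-datanode.service",
    "zookeeper.service",
]


def systemctl_is_active(unit: str) -> bool:
    """Return True if `systemctl is-active --quiet <unit>` exits with 0."""
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit])
    return result.returncode == 0


def inactive_units(
    units: Iterable[str],
    is_active: Callable[[str], bool] = systemctl_is_active,
) -> List[str]:
    """Return the units that are not active, preserving input order.

    `is_active` is injectable so the logic can be exercised without systemd.
    """
    return [unit for unit in units if not is_active(unit)]
```

After a restart, calling `inactive_units(EXPECTED_UNITS)` on the node would
return an empty list if systemd brought every expected service back up, and
the names of the stragglers otherwise.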