I couldn't make the call today, but I'm curious whether anyone has previously brought up creating a FileSystem API for Accumulo so that we could use implementations other than Hadoop. I realize that Hadoop provides implementations for things other than HDFS, but that doesn't necessarily mean that all filesystem implementations are covered.
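To make the question concrete, here is a minimal sketch of what such an abstraction might look like. Everything in it is an assumption for illustration: the names `VolumeFileSystem` and `InMemoryFileSystem` are hypothetical and do not exist in Accumulo or Hadoop, and a real API would also need things this sketch leaves out, such as directory listing and the flush/sync durability guarantees that write-ahead logs depend on.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface (not part of Accumulo): the small set of file
// operations a storage backend would have to provide.
interface VolumeFileSystem {
  OutputStream create(String path) throws IOException;

  InputStream open(String path) throws IOException;

  boolean exists(String path) throws IOException;

  boolean delete(String path) throws IOException;

  boolean rename(String src, String dst) throws IOException;
}

// Toy in-memory backend, only to show that code written against the
// interface does not need to know which implementation it is using.
final class InMemoryFileSystem implements VolumeFileSystem {
  private final Map<String, byte[]> files = new ConcurrentHashMap<>();

  @Override
  public OutputStream create(String path) {
    // The file becomes visible when the stream is closed.
    return new ByteArrayOutputStream() {
      @Override
      public void close() throws IOException {
        super.close();
        files.put(path, toByteArray());
      }
    };
  }

  @Override
  public InputStream open(String path) throws IOException {
    byte[] data = files.get(path);
    if (data == null) {
      throw new FileNotFoundException(path);
    }
    return new ByteArrayInputStream(data);
  }

  @Override
  public boolean exists(String path) {
    return files.containsKey(path);
  }

  @Override
  public boolean delete(String path) {
    return files.remove(path) != null;
  }

  @Override
  public boolean rename(String src, String dst) {
    byte[] data = files.remove(src);
    if (data == null) {
      return false;
    }
    files.put(dst, data);
    return true;
  }
}
```

A Hadoop-backed implementation of the same interface would presumably delegate each method to the existing Hadoop FileSystem calls, so current deployments would not need to change.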
-----Original Message-----
From: Christopher <ctubb...@apache.org>
Sent: Wednesday, March 25, 2020 1:45 PM
To: accumulo-dev <email@example.com>
Subject: Slack call notes

Several committers/contributors in the community joined a call in Slack on Wednesday, at 1130-1230, New York (Eastern) time. Here are my notes of the call. Please feel free to add to them.

I shared the overall philosophy and backstory to some of the script improvements in 2.x to help guide current/future work on the scripts.

* bin/accumulo is inspired by old jpackage.org standards which are still in use in RPM macros for Java packaging in Fedora/RHEL/etc. The key idea is that scripts are simple: set up the environment (class path, etc.), locate java, and exec a single process with the provided args.
* bin/accumulo-service is inspired by old SysVInit scripts for start/stop/restart/status of a single service
* behavior of bin/accumulo and bin/accumulo-service can be manipulated through the launch environment
* bin/accumulo-cluster uses bin/accumulo-service, and is provided as a simple, out-of-the-box cluster management tool
* bin/accumulo-cluster and bin/accumulo-service are replaceable; they are useful out-of-the-box, but one would expect them to be unnecessary if using systemd, or a vendor-provided cluster management system
* we discussed possibly moving bin/accumulo-cluster and bin/accumulo-service to contrib/ in the tarball, or some subdir of bin/, but it was suggested to not make too many disruptive changes there
* we discussed the possibility of adding a config file for bin/accumulo-cluster (also mentioned on https://github.com/apache/accumulo/pull/1568)
* we discussed the need to document the intent/purpose/scope of the scripts in comments inside the scripts themselves
* Ed Coleman asked if it'd be good to document a systemd example; I suggested it might make for a good blog post (perhaps by the person who wrote the systemd unit files for Fluo Muchos)

Keith Turner discussed his development efforts with regard to enabling more controls over compactions.

* one main idea was to keep configuration/API for data separate from that for execution
* data is of concern to application owners, whereas execution involves system admins (resource contention, etc.)
* he will submit a PR for review when ready
* he also suggested another call to go over the PR

Billie Rinaldi discussed better support for Azure Data Lake Storage Gen2 (ADLSv2).

* maintaining a fork for experimenting, and working on reliably testing issues involving WALs
* did not recommend using ADLSv2 with WALs, but said that we should still support it
* might need to implement a custom log closer to better support it

Mike Miller brought up the idea of eliminating more static internal state.

* ServerConfigurationFactory might be improved in this regard, with some additional ZK cleanup
* Other ZK cleanup might help elsewhere (such as ZooCache)
* I suggested the tablet location cache might also benefit from being bound to an AccumuloClient lifecycle (or a dedicated opaque object that could be shared across AccumuloClient instances with its own user-managed lifecycle)

Please add anything I might have missed (or got wrong) in response to this post.
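
As an illustration of the last suggestion in the notes (a cache object shared across client instances with its own user-managed lifecycle, instead of static state), here is a rough sketch. The classes `SharedLocationCache` and `ToyClient` are hypothetical stand-ins invented for this example; they are not Accumulo's AccumuloClient or its actual tablet location cache, and the lookup is faked with a string.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;

// Hypothetical cache owned by the user rather than held in static state.
final class SharedLocationCache implements AutoCloseable {
  private final Map<String, String> locations = new ConcurrentHashMap<>();
  private final AtomicBoolean closed = new AtomicBoolean(false);

  // Returns the cached location, calling lookup only on a cache miss.
  String locate(String tablet, Function<String, String> lookup) {
    if (closed.get()) {
      throw new IllegalStateException("cache is closed");
    }
    return locations.computeIfAbsent(tablet, lookup);
  }

  @Override
  public void close() {
    // Only the owner of the cache decides when its contents go away.
    if (closed.compareAndSet(false, true)) {
      locations.clear();
    }
  }
}

// Hypothetical stand-in for a client; it borrows the cache and never closes it.
final class ToyClient implements AutoCloseable {
  private final SharedLocationCache cache;
  int lookups = 0; // counts cache misses served by this client

  ToyClient(SharedLocationCache cache) {
    this.cache = cache;
  }

  String locate(String tablet) {
    return cache.locate(tablet, t -> {
      lookups++;
      return "tserver-for-" + t; // placeholder for a real metadata lookup
    });
  }

  @Override
  public void close() {
    // Intentionally leaves the shared cache open for other clients.
  }
}
```

With this shape, two clients built on the same cache share lookups, closing one client does not invalidate the other, and nothing lingers in static fields once the user closes the cache.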