Expanding on the testing aspect, I am going to attempt to summarize some of the things Dave and I have discussed. There are at least three priorities for cluster testing, listed in order of priority below.

 1. Ensure the feature does not corrupt or lose data, or crash or destabilize Accumulo.
 2. Ensure that external compactors can be scaled independently of tservers as needed.
 3. Measure the impact on queries of moving compactions outside of the tserver.

Could achieve #1 with the following test.

 a. Spin up a cluster w/ Muchos
 b. Manually start the coordinator on the Manager VM
 c. Manually start compactors on the tserver VMs
 d. Configure the continuous ingest table to use external compaction
 e. Run continuous ingest for an extended period
 f. Run continuous verify

Could achieve test goal #2 by running compactors in K8s (instead of on tserver VMs) and using metrics emitted by Accumulo to automatically scale compactors up and down in response to the number of compactions queued. This test is important because it will help identify feature gaps in the ability of a user to independently scale external compactors. If we find problems attempting to do this, it may lead to changes in the feature. Could possibly use the accumulo-docker repo as a base for running compactors in K8s. May need to write custom code to scale compactors based on metrics from Accumulo.

Goal #3 could be achieved by running continuous ingest+query on two different systems and comparing the query timings. One system would be configured to use external compactions and the other not.

Seems like #1 and #2 must be tested. I am thinking the results of #3 would be positive based on past experience, so it would be nice to test but maybe not required.

On Tue, May 11, 2021 at 10:34 AM Dave Marion <[email protected]> wrote:
>
> Keith and I have been working on a solution for issue #1451 - being able to
> run major compactions outside the tablet server. This would enable
> compactions to run when tables are offline, tablet servers die, and tablets
> are balancing. We have created two pull requests, one for the code[1]
> changes and another for the documentation[2] changes.
>
> This change introduces two new optional components in the architecture.
> The CompactionCoordinator is much like the Manager in that it is a
> singleton in the system and it manages the state of external compactions
> across the system. The CompactionCoordinator is started with the command:
>
> bin/accumulo compaction-coordinator
>
> The Compactor is the other optional component. There can be many
> Compactors running in the system and each Compactor runs one compaction at
> a time. It communicates with the CompactionCoordinator to get information
> about the next compaction that it needs to complete and to relay the status
> of the compaction. The Compactor is started with the command:
>
> bin/accumulo compactor -q <queueName>
>
> The queueName parameter should match the name of the external queue set in
> the compaction service options. This allows an administrator to define
> different compaction services for tables, each with its own queue, and to
> scale the number of Compactors differently. For example, we can define a
> compaction service named cs1, then create a table and configure it to use
> the compaction service:
>
> config -s tserver.compaction.major.service.cs1.planner=org.apache.accumulo.core.spi.compaction.DefaultCompactionPlanner
> config -s 'tserver.compaction.major.service.cs1.planner.opts.executors=[{"name":"all","externalQueue":"Q1"}]'
> createtable test
> config -t test -s table.compaction.dispatcher=org.apache.accumulo.core.spi.compaction.SimpleCompactionDispatcher
> config -t test -s table.compaction.dispatcher.opts.service=cs1
>
> Compactions on table "test" will occur externally by starting the
> CompactionCoordinator and Compactor with queueName "Q1".
>
> With regard to testing, we have unit and integration tests.
> ExternalCompactionIT has pretty decent coverage. We have also tested
> locally with multiple Compactors using uno. We are hoping to perform a
> cluster test soon, potentially deploying the Compactors using k8s and its
> horizontal pod scaler for a follow-on blog post.
> Please let us know if you are interested in helping out with testing.
>
>
> [1] https://github.com/apache/accumulo/pull/2096
> [2] https://github.com/apache/accumulo-website/pull/282
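
To make the "custom code to scale compactors" idea for goal #2 a bit more concrete, here is a rough sketch of just the sizing decision. It is not taken from either pull request. The function name, its parameters, and the default limits are all hypothetical, and it assumes something else has already read the queued and running compaction counts for one external queue from Accumulo's metrics. The only fact it relies on from the thread is that each Compactor runs one compaction at a time.

```python
import math


def desired_compactors(queued, running, min_compactors=1, max_compactors=20,
                       queued_per_compactor=1.0):
    """Return how many Compactors to run for one external queue.

    Hypothetical helper: 'queued' and 'running' are assumed to come from
    metrics emitted by Accumulo; the limits are illustrative defaults.
    """
    if queued < 0 or running < 0:
        raise ValueError("queued and running must be non-negative")
    if queued_per_compactor <= 0:
        raise ValueError("queued_per_compactor must be positive")

    # Each Compactor runs one compaction at a time, so keep one Compactor
    # per running compaction and add enough to drain the queued backlog.
    wanted = running + math.ceil(queued / queued_per_compactor)

    # Clamp to the configured bounds so we never scale to zero or without limit.
    return max(min_compactors, min(max_compactors, wanted))


if __name__ == "__main__":
    # 5 queued and 3 running compactions on queue "Q1"
    print(desired_compactors(queued=5, running=3))
```

The result could then be fed to whatever actually resizes the Compactor deployment in K8s. Raising queued_per_compactor above 1 trades slower draining of the queue for fewer pods.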
