keith-turner commented on PR #3801:
URL: https://github.com/apache/accumulo/pull/3801#issuecomment-1749904934
@dlmarion, if you have some time to chat, I would like to discuss some
questions I have. I am wondering about the following.
Where do things run? Maybe instead of having a task runner executable, the
task runner is more of an internal code library that user-facing executable
components instantiate. For example, thinking through what the user-facing
accumulo commands could look like and what they could do:
```
accumulo compactor -- runs compactions, instantiates a task runner in its
impl to make this happen
accumulo tserver -- hosts tablets and does log sorts, it runs a task
runner inside of a tablet server process to do log sorts
accumulo manager -- runs fate, assigns tablets, does split
calculations... non-primary manager processes can run a task runner process to
do split calculations
```
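To make the "library, not executable" idea concrete, here is a rough Java sketch. Every name in it (`TaskKind`, `TaskRunnerLib`, `Component`) is hypothetical and is not an existing Accumulo class. It also assumes the primary manager runs no task runner, which the list above does not actually say.

```java
import java.util.Optional;

// Hypothetical task categories, one per component in the list above.
enum TaskKind { COMPACTION, LOG_SORT, SPLIT_CALCULATION }

// Hypothetical library class: the task runner is plain code that a
// server process instantiates, not a process with its own command.
final class TaskRunnerLib {
  private final TaskKind kind;

  TaskRunnerLib(TaskKind kind) { this.kind = kind; }

  TaskKind kind() { return kind; }
}

// Hypothetical stand-in for a user-facing server process.
final class Component {
  final String command;
  final Optional<TaskRunnerLib> runner;

  private Component(String command, Optional<TaskRunnerLib> runner) {
    this.command = command;
    this.runner = runner;
  }

  // "accumulo compactor" embeds a runner for compactions.
  static Component compactor() {
    return new Component("accumulo compactor",
        Optional.of(new TaskRunnerLib(TaskKind.COMPACTION)));
  }

  // "accumulo tserver" embeds a runner for log sorts.
  static Component tserver() {
    return new Component("accumulo tserver",
        Optional.of(new TaskRunnerLib(TaskKind.LOG_SORT)));
  }

  // Assumption: only non-primary managers embed a runner for split
  // calculations; the primary manager runs none.
  static Component manager(boolean primary) {
    return new Component("accumulo manager",
        primary ? Optional.empty()
                : Optional.of(new TaskRunnerLib(TaskKind.SPLIT_CALCULATION)));
  }
}

final class ComponentDemo {
  public static void main(String[] args) {
    for (Component c : new Component[] {
        Component.compactor(), Component.tserver(),
        Component.manager(true), Component.manager(false)}) {
      System.out.println(c.command + " -> "
          + c.runner.map(r -> r.kind().toString()).orElse("no task runner"));
    }
  }
}
```

With this shape, users would only ever start `compactor`, `tserver`, or `manager`; the task runner never shows up as its own command.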
What do tasks return? Maybe nothing. Could we structure all tasks such that
no data is returned to the manager? A task runner gets a task from the
manager, runs it, and when done gets another task. It never reports completion
or status to the manager.
- compaction tasks run the compaction and commit it to the metadata table
as part of the task
- compaction tasks do not report stats back to the manager, only to the
metrics system. Could the monitor get the data it needs? Maybe the monitor
contacts task runners directly if it wants info?
- log sort tasks sort the logs and create the appropriate dirs in HDFS when
done
- split tasks could update the metadata table with the needed information
instead of reporting back
If tasks do not return anything, that simplifies the thrift API and possibly
the manager. We would not need to worry about keeping info in memory in the
manager, keeping that info consistent, and avoiding using too much memory.
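A minimal Java sketch of the no-return loop described above, under some loud assumptions: `SelfCommittingTask`, `TaskSource`, and `NoReturnRunner` are hypothetical names, a plain `Map` stands in for the metadata table, and an in-memory queue stands in for the manager's thrift endpoint. The point is only the shape of the API: there is a call to get a task and no call to report a result.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical task type: it commits its own result (here, into a Map
// standing in for the metadata table) and returns nothing.
interface SelfCommittingTask {
  void runAndCommit(Map<String, String> metadata);
}

// Hypothetical stand-in for the manager. It only hands out tasks; there
// is deliberately no method for reporting completion or status.
final class TaskSource {
  private final Queue<SelfCommittingTask> pending = new ConcurrentLinkedQueue<>();

  void offer(SelfCommittingTask task) { pending.add(task); }

  // Returns the next task, or null when nothing is pending.
  SelfCommittingTask next() { return pending.poll(); }
}

final class NoReturnRunner {
  // Get a task, run it, get another. Returns how many tasks ran, which
  // is only for local bookkeeping and is never sent to the source.
  static int drain(TaskSource source, Map<String, String> metadata) {
    int ran = 0;
    SelfCommittingTask task;
    while ((task = source.next()) != null) {
      task.runAndCommit(metadata);
      ran++;
    }
    return ran;
  }

  public static void main(String[] args) {
    TaskSource source = new TaskSource();
    Map<String, String> metadata = new ConcurrentHashMap<>();
    // A "compaction" commits its output file; a "split" records its split point.
    source.offer(m -> m.put("tablet1:file", "compacted-file"));
    source.offer(m -> m.put("tablet2:split", "row_m"));
    drain(source, metadata);
    System.out.println(metadata);
  }
}
```

Because the source never receives results, it holds no per-task state after handing a task out, which is the memory and consistency simplification being argued for. A real design would still need an answer for tasks that die midway, which this sketch does not attempt.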