Dear all,

After a while it is finally getting more concrete. We would like to add "Mechanism 1" to Apache Crail. I have created the following "New Feature" in JIRA to track the progress:
https://issues.apache.org/jira/browse/CRAIL-111

We plan to borrow most of the mechanism we have in Pocket, but will need to adapt it to the current Apache Crail version, which is much newer and uses slightly different data structures.

This implementation step is only about the mechanism. As already described, it won't be invoked automatically, so performance and behaviour are not affected in any way when running Apache Crail the usual way. In addition, we would like to add a trigger to invoke the mechanism for testing purposes only.

When running Apache Crail as an elastic storage service, a separate policy engine should decide when and how to scale out or scale in the running Crail instance. It will call into this mechanism to actually apply the decision. We plan to present a design idea for the policy engine later for discussion among the Apache Crail community.

Please feel free to comment on this first step.

Thanks a lot
Adrian

From: "Adrian Schüpbach" <adrian.schuepb...@gribex.net>
To: dev@crail.apache.org
Date: 05/27/2020 11:32 AM
Subject: [EXTERNAL] Mechanism that allow datanode to leave

Dear all,

Crail supports dynamically adding new datanodes while the Crail cluster is running. To build an elastic storage service, it would be nice to have mechanisms and protocol extensions that allow a datanode to gracefully leave the cluster. Especially in serverless environments, it would be great if the Crail cluster could also dynamically grow and shrink according to current storage capacity needs.

For a while I have been experimenting with an older version, which is being used in Pocket. I changed and extended it and gained some more experience of what I believe would be good to have. I also believe that adding such functionality natively to Apache Crail (instead of "around" Crail as in Pocket) would help make Crail a storage service choice in serverless environments. Furthermore, having these mechanisms natively in Crail does not harm running Crail the classical way.

More concretely, I suggest adding the following:

- Add mechanisms that let datanodes leave gracefully (with the namenode's help)
  - Mechanism 1: Datanode leaves when no more blocks are allocated (as in Pocket)
  - Mechanism 2: Namenode helps to move blocks from the leaving datanode to a remaining datanode. "Helps" does not mean that the namenode has to perform the actual data copying, but only that it finds new blocks and updates the file block lists.
- Allow datanodes to express the wish to leave (ask the namenode to initiate the process) by sending a message to the namenode.

I would also volunteer to add this functionality. I would like to emphasize that I would add the mechanisms in a way that they do not get invoked automatically, and that performance and current functionality will not be affected in any way when Crail is used the usual way. Instead, adding the mechanisms allows building a dynamically scaling system based on policy code, which can also run outside of the namenode and datanodes.

Thanks
Adrian
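
To illustrate Mechanism 1 from the list above, here is a minimal, self-contained Java sketch. It is a toy model and not Crail code: the class `DrainSketch`, the states `ACTIVE`/`DRAINING`/`LEFT`, and the methods `register`, `allocateBlock`, `freeBlock`, `requestLeave` and `stateOf` are all hypothetical names chosen for illustration. It assumes that a node asked to leave stops receiving new block allocations and departs once its last allocated block has been freed, and that leaving is only ever started by an explicit call, never automatically.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of "Mechanism 1". All names here are hypothetical, not Crail's API.
class DrainSketch {
    enum State { ACTIVE, DRAINING, LEFT }

    static final class Datanode {
        State state = State.ACTIVE;
        final int capacity;   // total number of blocks this node offers
        int allocated = 0;    // blocks currently backing files
        Datanode(int capacity) { this.capacity = capacity; }
    }

    // Insertion-ordered so that allocation order is deterministic.
    private final Map<String, Datanode> nodes = new LinkedHashMap<>();

    void register(String id, int capacity) {
        nodes.put(id, new Datanode(capacity));
    }

    // Allocates one block on the first ACTIVE node with free capacity.
    // Returns the node id, or null if no such node exists.
    String allocateBlock() {
        for (Map.Entry<String, Datanode> e : nodes.entrySet()) {
            Datanode dn = e.getValue();
            if (dn.state == State.ACTIVE && dn.allocated < dn.capacity) {
                dn.allocated++;
                return e.getKey();
            }
        }
        return null;
    }

    // Frees one block on the given node, e.g. because a file was deleted.
    void freeBlock(String id) {
        Datanode dn = nodes.get(id);
        if (dn == null || dn.allocated == 0) {
            throw new IllegalStateException("no allocated block on " + id);
        }
        dn.allocated--;
        maybeLeave(dn);
    }

    // Explicit trigger: nothing in this model starts a leave on its own.
    void requestLeave(String id) {
        Datanode dn = nodes.get(id);
        if (dn == null) {
            throw new IllegalArgumentException("unknown datanode " + id);
        }
        if (dn.state == State.ACTIVE) {
            dn.state = State.DRAINING;
        }
        maybeLeave(dn);
    }

    // A draining node leaves as soon as it holds no allocated blocks.
    private void maybeLeave(Datanode dn) {
        if (dn.state == State.DRAINING && dn.allocated == 0) {
            dn.state = State.LEFT;
        }
    }

    State stateOf(String id) {
        return nodes.get(id).state;
    }

    public static void main(String[] args) {
        DrainSketch nn = new DrainSketch();
        nn.register("dn1", 2);
        nn.register("dn2", 2);
        nn.allocateBlock();       // lands on dn1
        nn.requestLeave("dn1");   // dn1 still holds one block
        System.out.println(nn.stateOf("dn1"));
        nn.allocateBlock();       // dn1 is skipped, lands on dn2
        nn.freeBlock("dn1");      // last block gone, dn1 leaves
        System.out.println(nn.stateOf("dn1"));
    }
}
```

In this model a policy engine, or a test trigger, would only ever call `requestLeave`. Mechanism 2 would differ in that blocks are actively moved to a remaining datanode instead of waiting for them to be freed.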