dlmarion commented on PR #52:
URL:
https://github.com/apache/accumulo-classloaders/pull/52#issuecomment-3785157997
> * I think this was done to provide a mechanism to pause to support a
>   cleanup.
Yes, the earlier getReferencedFiles JMX endpoint, and this new pause
endpoint, are building blocks to help users create a mechanism to do cleanup of
the locally cached files. There are likely other ways to do the cleanup, but I
think the only other way to do it while the processes are not creating new
classloaders is to shut down the processes that are using the same local cache
directory.
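To make the "building blocks" idea concrete, here is a minimal sketch of what a user-written cleanup step might look like. It assumes the list of referenced files has already been obtained somehow (for example from the getReferencedFiles JMX endpoint) and that classloader creation is paused while it runs; neither of those calls is shown. The function name `remove_unreferenced` and its parameters are hypothetical and not part of this project.

```python
from pathlib import Path
from typing import Iterable, List


def remove_unreferenced(cache_dir: Path, referenced: Iterable[str],
                        dry_run: bool = True) -> List[Path]:
    """List (and, if dry_run is False, delete) cached files not in `referenced`.

    `referenced` is assumed to hold the paths of files still in use,
    gathered by the caller from every process sharing `cache_dir`.
    """
    keep = {Path(p).resolve() for p in referenced}
    candidates = []
    for path in sorted(cache_dir.rglob("*")):
        # Only regular files are considered; directories are left in place.
        if path.is_file() and path.resolve() not in keep:
            if not dry_run:
                path.unlink()
            candidates.append(path)
    return candidates
```

Defaulting to `dry_run=True` lets the user inspect what would be deleted before anything is removed, which seems prudent given that the referenced set has to be collected from every process sharing the directory.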
> * I think it's a bit overkill. I don't think cleaning up the local cache
>   is going to be needed very often, as it's unlikely to cause any serious storage
>   issues (it likely contains orders of magnitude less than a user's
>   `.m2/repository/` storage, which most people don't really need to clean up very
>   often).
> * If people do want to clean up, I don't think this really helps all that
>   much, because proper cleanup really depends on the user's knowledge of the
>   totality of their system's state, and there are better options for cleaning up
>   if it becomes critical for the user to do so: stop java processes and clear the
>   directory. If you're actually running out of storage space and need to clean
>   up, then stopping everything on the system in order to do a clean isn't really
>   that dramatic, since a full disk will be much worse. And, if you're not
>   reaching full disk, it's probably not critical to clean up the local cache.
I think the frequency with which the local cache needs to be cleaned up is
going to depend on how the processes are deployed. There are two situations
where I can see cleaning up the local cache directory not being straightforward.
1. When the Accumulo process is run in a Docker container and the local
cache directory is not mapped to a host volume. In this case the local cache
directory will be inside the container and not shared. If this is done on
purpose for some reason, then the user could start a cleanup process in the
entrypoint script when starting the Accumulo process.
2. A Kubernetes deployment is a similar situation to the previous point.
It may be simpler in this model to have the local cache directory inside the
Pod and start a sidecar container to clean up the directory. Another approach
would be to have each Pod use a PersistentVolume that is mapped to a directory
on the host. But I'm not sure how you would do the cleanup in that case - maybe
you can have a cleanup-type Pod that runs on each host, but I'm not sure how it
would coordinate with the other Pods on the same host.
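For the container case in point 1, a cleanup step run from the entrypoint before the Accumulo process starts could be as simple as the sketch below. It assumes nothing else is using the directory at that moment and that file modification time is an acceptable stand-in for "no longer needed", which is an assumption on my part and not something the cache guarantees. The name `remove_older_than` and its parameters are hypothetical.

```python
import time
from pathlib import Path
from typing import List, Optional


def remove_older_than(cache_dir: Path, max_age_seconds: float,
                      now: Optional[float] = None) -> List[Path]:
    """Delete regular files under cache_dir last modified more than
    max_age_seconds ago, and return the paths that were deleted."""
    now = time.time() if now is None else now
    removed = []
    for path in sorted(cache_dir.rglob("*")):
        # Age is measured from the file's mtime (an assumed proxy for use).
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

This does not address the shared-host case in point 2, where some form of coordination between Pods would still be needed.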