Hi Jan Willem,

Thanks for taking the time to review and respond.

Regarding the ACE - Chef integration, this idea came about as a result of brainstorming around some of our provisioning pain points. Specifically, we are concerned about using ACE to perform updates to key infrastructure, such as the JRE, the OS and the agent itself. We are also concerned about how we would hook into the deployment lifecycle on the target to do things such as aborting a deployment if the target is “in use”.

Using a deployment manager that runs outside of the JRE would help solve some of those problems. Deployment managers have a well-defined lifecycle and are generally easy to extend. However, the standard “update mechanism” with these tools is to download (to the target) and install a complete “application” with each update. This works against the OSGi way (it invalidates the bundle-cache) and is wasteful in terms of downloading bundles that haven’t been updated.

ACE’s core features are around the ability to perform incremental/partial updates in the container; however, in order to apply this capability, ACE has to provide a complete deployment management tool. I envision an integration with Chef (et al.) that uses ACE to perform OSGi updates, while Chef manages the deployment lifecycle, logging, scheduling, alerting, etc. If this were a thing, we’d probably use it to deploy to our embedded environments in the field as well as to our cloud infrastructure.

>> We need a well defined/documented way to hook into the deployment lifecycle
>> in the agent to support (at least) the following.
>> - Ability to execute code when deployment starts and/or completes (with
>> success or failure).
>> - Programmatically abort a deployment attempt.
>> - Define deployment windows (for example, to configure a deployment via the
>> UI, but know that the deployment won't start until the target is within the
>> deployment window)
>
> While this is currently possible with the Agent, it is not clearly
> documented. To be honest, I do have a todo on my list that should describe
> most of your wishes. I’ve created
> https://issues.apache.org/jira/browse/ACE-537 as a “gentle” reminder.

My understanding is that the Agent code is basically kept completely separate from the client and server parts of ACE because the agent can’t be updated. In fact, any bundles that ACE doesn’t provision can’t be updated. So while it’s possible to hook into the agent, without being able to provision/update that code, the feature is not very useful to us. Perhaps my understanding is incorrect.

>> Managing all the various config files has been problematic. Often the same
>> values are replicated in multiple config files.
>
> What do you exactly mean by this? The way that “tags” are used in the UI, or
> …?

I was referring to the ACE config files and how, for example, if you want to enable HTTPS, you basically have to change settings in multiple config files (e.g. connection factory). I realize that this is driven by the bundle structure, but there are better ways to manage common settings.

>> We have to clean our internal Dev/QA ACE instance every few weeks because
>> the server becomes unresponsive. Restarting ACE does not resolve the issue.
>> - We're using CDS with around 150 artifacts in each distribution and between
>> 10 and 20 targets at any given time, so I'm sure there's a lot of metadata,
>> but it's frustrating to deal with these performance issues.
>> - We're very concerned that this issue will bite us in production. We will
>> need to keep at least two or three releases in ACE and other than completely
>> blowing ACE (OBR and bundle-cache) away, we don't have a good mechanism to
>> clean up old distributions.
>
> Hm, that is seriously bad. We’ve put some effort in tuning and dealing with
> large repositories (up to 100 targets with 1000+ artifacts) which yielded
> several improvements. We might want to dive deeper into this. Are you willing
> to create an issue on JIRA with some details on this so we are able to
> replicate this issue?

Yes, absolutely willing to create a ticket. Reproducing may be a bit more tricky, as I would need to figure out how to do so without using our custom bundles. As an aside, each time we encounter this issue and I have to “clean” the system, I make a copy of the bundle-cache and store directories. Of course, these contain proprietary code, so we’d have to be careful about who inspects them and how, but would this help triage the issue?

>> Rolling, file-based logging out of the box
>> - Log target activity (deployment started/completed, results, etc)
>
> Logging at the target or at the server? You can currently configure the
> number of entries you want to retain in ACE; this would provide a rolling
> file-based logging.

My understanding is that the Felix Log Service implementation keeps a circular queue of log events that can be displayed using Gogo; however, in order to actually log to a file, we are using Pax Logging (w/ log4j). I’m assuming that something like this would be required to have ACE emit logs to a file. While we’re capable of doing this, it gets into the question of “how does one customize ACE?” This is a question for which we don’t currently have a clear answer. We’ve basically taken the route of repackaging in order to include custom bundles. While this makes sense for application-specific custom code, it seems like overkill for something as basic as logging to a file.

Another issue we’ve run into has to do with custom resource processors. We have defined custom resource processors (and helpers, etc.) that we use to provision non-OSGi artifacts. However, we have to provision the underlying services before we can provision a custom artifact. This means that our initial “install” can only contain OSGi resources, and we have to perform a two-step process to provision a custom artifact on a new target.
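
To make the lifecycle request above a bit more concrete, here is a rough sketch in plain Java of the behaviour we have in mind: start/complete callbacks, a programmatic abort when the target is “in use”, and a deployment window. To be clear, DeploymentGate, Listener and Result are names I made up for illustration; none of this is existing ACE agent API, and the window and “in use” checks are only assumptions about how such a hook might be shaped.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: DeploymentGate, Listener and Result are made-up names,
// not part of the ACE agent API.
final class DeploymentGate {

    enum Result { ABORTED, SUCCEEDED, FAILED }

    /** Callbacks fired when a deployment starts and when it completes. */
    interface Listener {
        void started(String version);
        void finished(String version, boolean success);
    }

    private final LocalTime windowStart; // inclusive
    private final LocalTime windowEnd;   // exclusive
    private final BooleanSupplier targetInUse;
    private final List<Listener> listeners = new ArrayList<>();

    DeploymentGate(LocalTime windowStart, LocalTime windowEnd, BooleanSupplier targetInUse) {
        this.windowStart = windowStart;
        this.windowEnd = windowEnd;
        this.targetInUse = targetInUse;
    }

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    /** True if 'now' is inside the window; supports windows that wrap past midnight. */
    boolean inWindow(LocalTime now) {
        if (!windowStart.isAfter(windowEnd)) {
            return !now.isBefore(windowStart) && now.isBefore(windowEnd);
        }
        return !now.isBefore(windowStart) || now.isBefore(windowEnd);
    }

    /** Aborts outside the window or while the target is in use; otherwise runs the install. */
    Result deploy(String version, LocalTime now, Runnable install) {
        if (!inWindow(now) || targetInUse.getAsBoolean()) {
            return Result.ABORTED; // nothing was touched, no listener is notified
        }
        for (Listener l : listeners) {
            l.started(version);
        }
        boolean success = true;
        try {
            install.run();
        } catch (RuntimeException e) {
            success = false; // report the failure to listeners instead of propagating it
        }
        for (Listener l : listeners) {
            l.finished(version, success);
        }
        return success ? Result.SUCCEEDED : Result.FAILED;
    }

    public static void main(String[] args) {
        // Window 22:00-04:00; the target is never "in use" in this demo.
        DeploymentGate gate = new DeploymentGate(LocalTime.of(22, 0), LocalTime.of(4, 0), () -> false);
        gate.addListener(new Listener() {
            public void started(String v) { System.out.println("started " + v); }
            public void finished(String v, boolean ok) { System.out.println("finished " + v + " ok=" + ok); }
        });
        System.out.println(gate.deploy("1.0.1", LocalTime.of(12, 0), () -> { }));
        System.out.println(gate.deploy("1.0.1", LocalTime.of(23, 30), () -> { }));
    }
}
```

The point is less the code itself than where it lives: for this to be useful to us, a hook like this would itself have to be something ACE can provision and update on the target, which ties back to my question about the agent above.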
Thanks again,
Bree

> On Feb 26, 2016, at 7:13 AM, Jan Willem Janssen
> <[email protected]> wrote:
>
> Hi Bree,
>
> Thanks for your elaborate feedback! This reply was lingering in my drafts for
> unknown reasons, sorry for the delayed response.
>
>> On 03 Feb 2016, at 19:28, Bree Van Oss <[email protected]> wrote:
>>
>> Following is a list of issues we have encountered with ACE (2.0.1) and a
>> few suggestions.
>>
>> Before getting into that, I’m curious if you guys have considered providing
>> plugins for one or more Deployment/Configuration Management platforms (e.g.
>> Chef, Ansible, Salt)? As I see it, ACE's primary value add is the ability to
>> perform incremental, live updates to an OSGi container. Would be awesome to
>> leverage the larger communities that have grown up around these toolsets.
>
> Interesting thought; I’ve only limited experience with these other toolkits,
> so might I ask how you would envision this? Let ACE interact with these
> platforms, or the other way around?
>
>> Issues
>>
>> UI's lack of support for concurrent workspace edits leads to conflicts in
>> a high-use environment like our internal ACE server used to provision Testers'
>> environments.
>>
>> Current UI is clunky and unpredictable.
>
> Totally agree. The UI is something we definitely should overhaul and improve
> upon.
>
>> We've had difficulty using GoSH as a scripting language. Its syntax is
>> non-obvious and does not seem to follow any "standard" syntax metaphors.
>
> Agreed. Our idea was to redesign the whole “client” API to more up-to-date
> standards. This also would involve the way you interact with the repository
> models in ACE. Using Gogo script is not on the roadmap for that, I can assure
> you that ;)
>
>> We need a well defined/documented way to hook into the deployment lifecycle
>> in the agent to support (at least) the following.
>> - Ability to execute code when deployment starts and/or completes (with
>> success or failure).
>> - Programmatically abort a deployment attempt.
>> - Define deployment windows (for example, to configure a deployment via the
>> UI, but know that the deployment won't start until the target is within the
>> deployment window)
>
> While this is currently possible with the Agent, it is not clearly
> documented. To be honest, I do have a todo on my list that should describe
> most of your wishes. I’ve created
> https://issues.apache.org/jira/browse/ACE-537 as a “gentle” reminder.
>
>>
>> Managing all the various config files has been problematic. Often the same
>> values are replicated in multiple config files.
>
> What do you exactly mean by this? The way that “tags” are used in the UI, or
> …?
>
>>
>> We had to extend the ConnectionManager and ConnectionFactory (server and
>> agent) to provide support for HTTPS with PKCS#11
>
> We do not support PKCS#11 out of the box. This might be a good thing to do.
> Also, the documentation on using HTTPS is lacking at the moment. I’ve created
> https://issues.apache.org/jira/browse/ACE-538 for this.
>
>> Certificate validation should be implemented using PKIX trustmanager via
>> Jetty. No custom code should be required to do basic validation,
>> revocation-checking, etc.
>
> You are referring to the code in connection factory & the authentication?
> What custom code are you referring to?
>
>>
>> We have to clean our internal Dev/QA ACE instance every few weeks because
>> the server becomes unresponsive. Restarting ACE does not resolve the issue.
>> - We're using CDS with around 150 artifacts in each distribution and between
>> 10 and 20 targets at any given time, so I'm sure there's a lot of metadata,
>> but it's frustrating to deal with these performance issues.
>> - We're very concerned that this issue will bite us in production. We will
>> need to keep at least two or three releases in ACE and other than completely
>> blowing ACE (OBR and bundle-cache) away, we don't have a good mechanism to
>> clean up old distributions.
>
> Hm, that is seriously bad. We’ve put some effort in tuning and dealing with
> large repositories (up to 100 targets with 1000+ artifacts) which yielded
> several improvements. We might want to dive deeper into this. Are you willing
> to create an issue on JIRA with some details on this so we are able to
> replicate this issue?
>
>> Suggestions
>>
>> UI/REST-API revamp
>> - support multiple users performing concurrent updates
>
> Actually, that is supported, the only nasty thing is that you have to resolve
> conflicts by hand. This is something we should improve on.
>
>> Persistence revamp
>> - support multi-user scenarios
>> - support for Nexus as OBR
>
> We’ve recently upgraded the default OBR version to Felix OBR, which supports
> the latest OBR specification. I have to check if this would imply that you can
> support Nexus out of the box.
>
>> Rolling, file-based logging out of the box
>> - Log target activity (deployment started/completed, results, etc)
>
> Logging at the target or at the server? You can currently configure the
> number of entries you want to retain in ACE; this would provide a rolling
> file-based logging.
>
>> Deployment history
>> - Via the UI
>
> +1, definitely.
>
>> Better support for config files when using CDS style deployment scenario
>> - When using CDS, config files that have not changed are _always_ uploaded
>> - Config files uploaded using CDS are not grouped like multiple versions of
>> JARs are
>
> Agreed. We should support the grouping of artifacts on all artifacts, not on
> bundles only. The uploading of unchanged configuration files is something we
> need to look into, it sounds like a bug to me.
>
>> Upgrade Jetty from 7 to latest (9.x)
>> Upgrade Felix Dependency Manager
>
> Both of these are part of the newly released Apache ACE 2.1.0.
>
>> Provide a secure (HTTPS) configuration out of the box.
>
> Fair point; we could include “snake oil” certificates for testing purposes
> out of the box with instructions on how to use your own certificates. At
> least we should make it as easy as possible to switch to HTTPS. I’ve created
> a JIRA issue (see above) for just that.
>
>> Provide a way to "cascade delete" a distribution via the UI. Delete should
>> remove the distribution, related features and related artifacts, if they
>> aren't being referenced by another distribution.
>
> Fair point.
>
>> Migrate to Git from SVN to facilitate more community involvement
>
> We’ve recently made some changes to ensure that you can use the Git mirror
> (see [1]) if you’d like. Merging PRs is quite easy and straightforward to do.
>
>> Consider developing ACE-based plugins for DevOps tools like Chef and Ansible.
>
> As I wrote above, it is interesting to discuss this further, are you willing
> to share your thoughts on this?
>
>
> Thanks for your feedback,
>
> --
> Met vriendelijke groeten | Kind regards
>
> Jan Willem Janssen | Software Architect
> +31 631 765 814
>
>
> My world is something with Amdatu and Apache
>
> Luminis Technologies
> Churchillplein 1
> 7314 BZ Apeldoorn
> +31 88 586 46 00
>
> https://www.luminis.eu
>
> KvK (CoC) 09 16 28 93
> BTW (VAT) NL8170.94.441.B.01
