bdoyle0182 opened a new pull request, #5355: URL: https://github.com/apache/openwhisk/pull/5355
## Description

This is a critical bug that has gone unsolved for several years. Attachments are treated as secondary files to the main action document, so if the attachment is lost for whatever reason (upload timeout, concurrent upload, bad database replication, etc.), the action document is left corrupted in a stuck state. This happens because the document factory is set up to do a `get` of the document first when performing a `putEntity` or `deleteEntity`, and the `WhiskAction` type overrides `get` to also fetch the attachment. Once the root document is retrieved, the attachment read returns not found, so `get` returns a `DocumentNotFoundException` to `putEntity` or `deleteEntity`.

As a result, once an action is in this state you can never delete or update it through the OpenWhisk API; an OpenWhisk admin must delete the document directly from the artifact db.

Worse, updating such an action incorrectly returns a 409 to the user even though no conflict is actually happening. The `get` of the `WhiskAction` correctly returns a `DocumentNotFoundException` when the attachment is not found. However, `putEntity` catches this and, assuming the document is brand new to the artifact db, tries to put it without a revision. Because the root document does in fact exist, the db answers a put without a revision with a 409, which is then fed back to the user on every update attempt.

I have implemented a simple fix that resolves this issue for updating and deleting an action in the missing-attachment state, without impacting any behavior of the system outside of the update and delete action API. In the `get` of the `WhiskAction` type, we now recover when fetching the attachment fails with not found, but only if an ignore boolean is set.
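
The failure mode and the fix described above can be modeled with a small toy store. This is only an illustrative sketch in Python, not OpenWhisk's Scala code: `ToyStore`, `get_action`, `put_entity`, `delete_entity`, `attempt`, and the `ignore_missing_attachment` flag are hypothetical names chosen for this example, and the revision handling is a simplified stand-in for what a CouchDB-style artifact db does.

```python
class DocumentNotFound(Exception):
    """Raised when a document or its attachment cannot be read."""


class Conflict(Exception):
    """Stands in for the HTTP 409 returned by the artifact db."""


class ToyStore:
    """Hypothetical in-memory store with revisions and separate attachments."""

    def __init__(self):
        self.docs = {}         # doc id -> (revision, body)
        self.attachments = {}  # doc id -> code bytes

    def get_action(self, doc_id, ignore_missing_attachment=False):
        # Read the root document first, then its attachment.
        if doc_id not in self.docs:
            raise DocumentNotFound(doc_id)
        rev, body = self.docs[doc_id]
        code = self.attachments.get(doc_id)
        if code is None and not ignore_missing_attachment:
            # Root document exists, but the caller only sees "not found".
            raise DocumentNotFound("attachment of " + doc_id)
        return {"id": doc_id, "rev": rev, "body": body, "code": code}

    def put(self, doc_id, body, code, rev=None):
        current = self.docs.get(doc_id)
        # Writing over an existing document requires its current revision.
        if current is not None and current[0] != rev:
            raise Conflict(doc_id)
        new_rev = 1 if current is None else current[0] + 1
        self.docs[doc_id] = (new_rev, body)
        self.attachments[doc_id] = code
        return new_rev

    def delete(self, doc_id, rev):
        current = self.docs.get(doc_id)
        if current is None:
            raise DocumentNotFound(doc_id)
        if current[0] != rev:
            raise Conflict(doc_id)
        del self.docs[doc_id]
        self.attachments.pop(doc_id, None)


def put_entity(store, doc_id, body, code, ignore_missing_attachment=False):
    # Get first; a "not found" is taken to mean the document is brand new.
    try:
        rev = store.get_action(doc_id, ignore_missing_attachment)["rev"]
    except DocumentNotFound:
        rev = None
    return store.put(doc_id, body, code, rev)


def delete_entity(store, doc_id, ignore_missing_attachment=False):
    # Get first to learn the revision, then delete.
    rev = store.get_action(doc_id, ignore_missing_attachment)["rev"]
    store.delete(doc_id, rev)


def attempt(fn, *args, **kwargs):
    """Return the call's result, or the name of the exception it raised."""
    try:
        return fn(*args, **kwargs)
    except (DocumentNotFound, Conflict) as exc:
        return type(exc).__name__


if __name__ == "__main__":
    store = ToyStore()
    put_entity(store, "ns/hello", {"kind": "nodejs"}, b"v1")
    del store.attachments["ns/hello"]  # simulate the lost attachment

    # Without the flag the action is stuck.
    print(attempt(put_entity, store, "ns/hello", {"kind": "nodejs"}, b"v2"))
    print(attempt(delete_entity, store, "ns/hello"))

    # With the flag, the update goes through and restores the attachment.
    print(attempt(put_entity, store, "ns/hello", {"kind": "nodejs"}, b"v2",
                  ignore_missing_attachment=True))
```

In this model the flag defaults to `False`, so plain reads still report the missing attachment, and only the update and delete paths opt in to ignoring it.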
This ignore boolean is only set when `get` is called within `putEntity` and `deleteEntity`, and only has an effect on the `WhiskAction` type. I've reproduced the issue and verified in my environment that this change resolves it.

## Related issue and scope
- [X] I opened an issue to propose and discuss this change (#651)

## My changes affect the following components
- [ ] API
- [X] Controller
- [ ] Message Bus (e.g., Kafka)
- [ ] Loadbalancer
- [ ] Scheduler
- [ ] Invoker
- [ ] Intrinsic actions (e.g., sequences, conductors)
- [ ] Data stores (e.g., CouchDB)
- [ ] Tests
- [ ] Deployment
- [ ] CLI
- [ ] General tooling
- [ ] Documentation

## Types of changes
- [X] Bug fix (generally a non-breaking change which closes an issue).
- [ ] Enhancement or new feature (adds new functionality).
- [ ] Breaking change (a bug fix or enhancement which changes existing behavior).

## Checklist:
- [X] I signed an [Apache CLA](https://github.com/apache/openwhisk/blob/master/CONTRIBUTING.md).
- [X] I reviewed the [style guides](https://github.com/apache/openwhisk/blob/master/CONTRIBUTING.md#coding-standards) and followed the recommendations (Travis CI will check :).
- [ ] I added tests to cover my changes.
- [ ] My changes require further changes to the documentation.
- [ ] I updated the documentation where necessary.
