Hi Folks,

please find below a first idea on how to extend the current crypto backend and the workflow engine to support asynchronous key operations when using an offline/offsite CA.

For each kind of such CA, we provide its own set of OpenXPKI::Crypto::API and related classes. For most applications it should be sufficient to subclass the present Backend::OpenSSL classes and override only the commands which differ from the default model.

If an async operation is started, the main workflow returns into a "waiting for async operation" state. To complete the workflow after the async operation is done, we need to poll the async process for completion and restart/continue the workflow, which is not natively possible with the current workflow model.

A possible solution based on the current workflow engine might be to fork off a polling process before the main workflow returns to the waiting state and reinject the workflow (or a superseding one) on success. For async operations with a long latency or on heavily loaded systems this will not scale well, as we need to keep one fork alive until the async request returns. Besides, a restart of the daemon will terminate all polling processes, resulting in stale workflows that need manual care.

Therefore I suggest a modification of the workflow engine to introduce pause/auto-pickup of workflows.

Basic idea: create a workflow status table which records the current status of all unfinished workflows. Possible states are:

* running - regularly executed by a running process
* poll - workflow is in a state that needs regular polling
* crashed - no handling process found but not in a regular state

Each workflow registers itself in the table with a running state and removes itself again when it finishes, or marks itself for polling. An external entity (a thread inside the daemon) will check and trigger the requested polls.
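
To make the status table idea a bit more concrete, here is a rough sketch of how registration, marking for polling and a single poller pass could fit together. It is written in Python with an in-memory SQLite table purely for illustration; it is not OpenXPKI code. The table name, the column names, the retry interval and the callbacks is_complete/resume are all assumptions made for this example.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE workflow_status (
    workflow_id INTEGER PRIMARY KEY,
    state       TEXT NOT NULL CHECK (state IN ('running', 'poll', 'crashed')),
    pid         INTEGER,  -- handling process, NULL while waiting for a poll
    next_poll   REAL      -- epoch seconds of the next poll, NULL unless polling
)
"""


def open_table():
    """Create an in-memory status table (a real daemon would use its own DB)."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db


def register_running(db, workflow_id, pid):
    """A workflow registers itself as being executed by process `pid`."""
    db.execute(
        "INSERT OR REPLACE INTO workflow_status VALUES (?, 'running', ?, NULL)",
        (workflow_id, pid),
    )


def mark_for_polling(db, workflow_id, next_poll):
    """A workflow gives up its process and asks to be polled at `next_poll`."""
    db.execute(
        "UPDATE workflow_status SET state = 'poll', pid = NULL, next_poll = ? "
        "WHERE workflow_id = ?",
        (next_poll, workflow_id),
    )


def deregister(db, workflow_id):
    """A finished workflow removes itself from the table."""
    db.execute("DELETE FROM workflow_status WHERE workflow_id = ?", (workflow_id,))


def due_polls(db, now):
    """Return the ids of all workflows whose poll time has been reached."""
    rows = db.execute(
        "SELECT workflow_id FROM workflow_status "
        "WHERE state = 'poll' AND next_poll <= ? ORDER BY workflow_id",
        (now,),
    )
    return [row[0] for row in rows]


def run_poller_once(db, now, is_complete, resume, retry_interval=60.0, pid=0):
    """One pass of the external poller; returns the ids it resumed.

    is_complete(workflow_id) asks the async backend whether the result is
    available; resume(workflow_id) continues the workflow. Both are
    placeholders supplied by the caller.
    """
    resumed = []
    for workflow_id in due_polls(db, now):
        if is_complete(workflow_id):
            register_running(db, workflow_id, pid)
            resume(workflow_id)
            resumed.append(workflow_id)
        else:
            # Not done yet: schedule the next poll instead of blocking a fork.
            db.execute(
                "UPDATE workflow_status SET next_poll = ? WHERE workflow_id = ?",
                (now + retry_interval, workflow_id),
            )
    return resumed
```

In a daemon, something like run_poller_once(db, time.time(), ...) would be called periodically from the polling thread. The point of the sketch is that a waiting workflow only occupies a table row, not a forked process, so pending polls survive a daemon restart.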

The "crashed" state can cover several error situations, each one demanding its own logic:

1) If the daemon is restarted, all unfinished processes can be considered dead.
2) Regularly check whether the process is still alive (requires recording the PID and might be a bit tricky when using forks).
3) In distributed environments (t.b.d.!) the unavailability of a node is equivalent to 1) - requires recording the working node.

The approach might also deliver the groundwork for load balancing or special-purpose nodes in a distributed environment. For example, a resource-intensive task will register itself as "paused" and discontinue if the system load is too high or if the current node is not suited for the requested operation. Obviously, this requires another decision system to delegate paused/unsuited jobs to another node.

Comments welcome ;)

Oliver

--
Protect your environment - close windows and adopt a penguin!
PGP-Key: 3B2C 8095 A7DF 8BB5 2CFF 8168 CAB7 B0DD 3985 1721
