Hi, I would like to suggest a scale-out of UIMA by enabling it to run in a P2P environment.
From my understanding, the CPE is the first-generation scale-out. It can run a distributed pipeline using Vinci or SOAP, but the machines involved in the pipeline are predefined in the UIMA descriptors.

The second-generation scale-out is UIMA-AS (AS = Asynchronous Scaleout), which is based on Java and web standards such as JMS (Java Message Service). It is now officially released by Apache UIMA. It lets users choose which parts of their pipeline run in this mode, so individual parts of the pipeline can be scaled out as needed. Again, there is no dynamic discovery of resources after startup.

I would like to suggest a third-generation scale-out using a fully decentralized P2P network. Assume that each peer can publish its capabilities (namely, which annotators it can run) and its current availability. We could then extend the UIMA/UIMA-AS pipeline to discover an available and capable peer for running each annotator, and thus achieve better load balancing and therefore better performance than the previous generations.

What do people on the list think about this?

Thanks,
Yosi
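
P.S. To make the discovery idea a bit more concrete, here is a rough sketch of the lookup step only. Everything in it is hypothetical: PeerAdvert, PeerDirectory, freeSlots and findPeer are names made up for illustration and are not part of UIMA or UIMA-AS. The directory here is just an in-memory list; in a real P2P network it would be backed by whatever publish/lookup mechanism the overlay provides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical advertisement that a peer publishes about itself.
final class PeerAdvert {
    final String peerId;
    final int freeSlots;            // current availability; 0 means busy
    final Set<String> annotators;   // names of the annotators this peer can run

    PeerAdvert(String peerId, int freeSlots, String... annotators) {
        this.peerId = peerId;
        this.freeSlots = freeSlots;
        this.annotators = new HashSet<>(Arrays.asList(annotators));
    }
}

// Hypothetical local view of the published adverts (in-memory stand-in
// for a decentralized lookup service).
final class PeerDirectory {
    private final List<PeerAdvert> adverts = new ArrayList<>();

    // A newer advert from the same peer replaces the older one.
    void publish(PeerAdvert advert) {
        adverts.removeIf(a -> a.peerId.equals(advert.peerId));
        adverts.add(advert);
    }

    // Pick a capable peer with the most free slots; empty if none is available.
    Optional<PeerAdvert> findPeer(String annotator) {
        return adverts.stream()
                .filter(a -> a.freeSlots > 0 && a.annotators.contains(annotator))
                .max((a, b) -> Integer.compare(a.freeSlots, b.freeSlots));
    }
}

class DiscoveryDemo {
    public static void main(String[] args) {
        PeerDirectory directory = new PeerDirectory();
        directory.publish(new PeerAdvert("peerA", 1, "tokenizer"));
        directory.publish(new PeerAdvert("peerB", 3, "tokenizer", "ner"));
        directory.publish(new PeerAdvert("peerC", 0, "ner"));

        // Ask for a peer that can run the (hypothetical) "ner" annotator.
        String chosen = directory.findPeer("ner")
                .map(p -> p.peerId)
                .orElse("none available");
        System.out.println("ner -> " + chosen);
    }
}
```

The pipeline would call something like findPeer before dispatching each annotator, instead of reading a fixed host from a descriptor. Choosing the peer with the most free slots is only one possible load-balancing policy; I picked it because it is the simplest to show.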
