Thanks Richard and Nicolas,

Nicolas - have you looked at SUIM (https://github.com/oaqa/suim)?
It's also doing UIMA on Spark - I'm wondering if you are aware of it and how it differs from your own project?

Thanks for any info,
-John

________________________________________
From: Richard Eckart de Castilho [r...@apache.org]
Sent: Friday, September 15, 2017 5:29 AM
To: user@uima.apache.org
Subject: Re: UIMA analysis from a database

On 15.09.2017, at 09:28, Nicolas Paris <nipari...@gmail.com> wrote:
>
> - UIMA-AS is another way to program UIMA

Here you probably meant uimaFIT.

> - UIMA-FIT is complicated
> - UIMA-FIT only work with UIMA

... and I suppose you mean UIMA-AS here.

> - UIMA only focuses on text Annotation

Yep. Although it has also been used for other media, e.g. video and audio. But the core UIMA framework doesn't specifically consider these media. People who apply UIMA in the context of other media do so with custom type systems.

> - UIMA is not good at:
>     - text transformation

It is not straightforward, but it is possible. E.g. the text normalizers in DKPro Core either use different views for different states of normalization, or drop the original text and forward the normalized text within a pipeline by means of a CAS multiplier.

>     - read data from source in parallel
>     - write data to folder in parallel

Not sure if these two are limitations of the framework rather than of the way that you use readers and writers in the particular scale-out mode you are working with.

>     - machine learning interface

UIMA doesn't offer ML as part of the core framework because that is simply not within the scope of what the UIMA framework aims to achieve. There are various people who have built ML around UIMA, e.g.
ClearTK (http://cleartk.github.io/cleartk/) or DKPro TC (https://dkpro.github.io/dkpro-tc/) - and as you did, it can be combined in various ways with frameworks that specialize in ML.

Cheers,

-- Richard