Hi Zakir, I would strongly advise against using MIST/Sharp. It is a fairly bespoke toolset for de-identifying data in a specific use case. See "ii) Skip de-identification." in the Readme.
Also, see the steps near the bottom to get a sense of how to run this in a reasonably generic way:
https://github.com/tmills/ctakes-docker#running-with-custom-dictionaries

The documentation (and examples) aren't perfect. They should generally point you in the right direction, but PRs and notes (on your experience) are appreciated to improve them!

Thanks,
*Matthew Vita*
Healthcare Software Engineer
https://www.patreon.com/matthewvita

On Tue, May 22, 2018 at 5:10 AM Zakir Saifi <zakir.sa...@raxa.com> wrote:

> Hi Matthew, I have looked into the ctakes docker. I have followed the
> instructions on the readme file and I am running it on my machine by
> skipping de-identification and run the CVD point to the
> desc/nodeidPipeline.xml. It is giving me the following error
>
> Unexpected error while initializing
> "org.apache.uima.aae.jms_adapter.JmsAnalysisEngineServiceAdapter" from
> descriptor file:/Users/zakirsaifi/ctakes-docker/desc/docker-mist.xml .
>
> docker-mist.xml file
>
> <?xml version="1.0" encoding="UTF-8" ?>
> <customResourceSpecifier xmlns="http://uima.apache.org/resourceSpecifier">
>   <resourceClassName>org.apache.uima.aae.jms_adapter.JmsAnalysisEngineServiceAdapter</resourceClassName>
>   <parameters>
>     <parameter name="brokerURL" value="tcp://localhost:61616"/>
>     <parameter name="endpoint" value="mistQueue"/>
>     <parameter name="binary_serialization" value="false"/>
>     <parameter name="ignore_process_errors" value="false"/>
>   </parameters>
> </customResourceSpecifier>
>
> On Tue, May 22, 2018 at 10:20 AM, Zakir Saifi <zakir.sa...@raxa.com> wrote:
>
> > Thanks, Matthew and Gandhi. I look into the docker solution.
> >
> > On Mon, May 21, 2018 at 9:59 PM, Matthew Vita <matthewvit...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> It's not using the REST service, but this docker solution allows for
> >> distributed queuing: https://github.com/tmills/ctakes-docker
> >>
> >> This may help with your large text volumes.
> >>
> >> On Mon, May 21, 2018, 9:02 AM Gandhi Rajan Natarajan <gandhi.natara...@arisglobal.com> wrote:
> >>
> >> > Even if you send it in batch, the processing will be sequential I guess.
> >> > You may have to run multiple instances of REST service to process huge
> >> > volume of records.
> >> >
> >> > Regards,
> >> > Gandhi
> >> >
> >> > -----Original Message-----
> >> > From: Zakir Saifi [mailto:zakir.sa...@raxa.com]
> >> > Sent: Monday, May 21, 2018 4:23 PM
> >> > To: dev@ctakes.apache.org
> >> > Subject: Batching Queries in Ctakes Web Rest
> >> >
> >> > I am using ctakesRestService to process unstructured clinical text. I have
> >> > a long list of records which I want to be structured. On average Ctakes
> >> > service for me is taking 3.6 seconds to process a record. I want to *batch
> >> > this process* in order to reduce time. Is there any way in which I can sent
> >> > number of queries to the ctakes web rest service in batch and get the
> >> > appropriate result from it. My Ctakes version is 4.0.1. I have also changed
> >> > the default piper file and added other annotators for extracting more
> >> > information like BackwardsTimeAnnotator, DocTimeRelAnnotator etc.
> >> >
> >> > This email and any files transmitted with it are confidential and intended
> >> > solely for the use of the individual or entity to whom they are addressed.
> >> > If you are not the named addressee you should not disseminate, distribute
> >> > or copy this e-mail. Please notify the sender or system manager by email
> >> > immediately if you have received this e-mail by mistake and delete this
> >> > e-mail from your system. If you are not the intended recipient you are
> >> > notified that disclosing, copying, distributing or taking any action in
> >> > reliance on the contents of this information is strictly prohibited and
> >> > against the law.
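
To illustrate Gandhi's suggestion of running multiple REST instances, here is a minimal Python sketch that spreads records round-robin over several endpoints and sends them concurrently. The function names (post_text, process_records), the plain-text POST body, and the endpoint URLs in the comment are assumptions for illustration, not taken from the cTAKES documentation; adjust them to whatever your ctakesRestService deployment actually exposes.

```python
import itertools
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def post_text(endpoint, text, timeout=60.0):
    """POST one record as plain text and return the response body.

    Assumption: the service accepts the raw note text as the request body.
    """
    request = urllib.request.Request(
        endpoint,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def process_records(records, endpoints, send=post_text, max_workers=None):
    """Send records to the endpoints round-robin; results keep input order."""
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    targets = itertools.cycle(endpoints)
    # Pair every record with the instance that will handle it.
    assignments = [(next(targets), text) for text in records]
    # Default to one worker per instance, since each instance is assumed
    # to process a single record at a time.
    workers = max_workers or len(endpoints)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda pair: send(*pair), assignments))


# Example call (hypothetical URLs, one per running REST instance):
# results = process_records(
#     notes,
#     ["http://localhost:8080/ctakes-web-rest/service/analyze",
#      "http://localhost:8081/ctakes-web-rest/service/analyze"],
# )
```

Each instance still handles one record at a time, so at roughly 3.6 seconds per record the best case is about 3.6/N seconds of wall-clock time per record with N instances, assuming the instances do not compete for CPU or memory.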