Question about cTAKES consumers - Database

2018-12-09 Thread Manuel Lamy
Hello,

I would like to persist the non-negated clinical findings produced by
cTAKES, together with their CUIs, in a relational database right after
processing. Do you recommend using a JDBC consumer directly? If so, what
configuration is needed to get it working, and which consumer exactly do
you recommend?

Thank you.

Best regards,

Manuel
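
For reference, the storage side of this question can be sketched
independently of any cTAKES consumer. The Python below is only an
illustration under stated assumptions: the table name "finding", its
columns, and the sample rows are hypothetical, an in-memory SQLite
connection stands in for the real relational database, and a polarity of
-1 is assumed to mark a negated mention.

```python
import sqlite3

# Hypothetical findings as (document id, CUI, covered text, polarity).
# Assumption: polarity -1 marks a negated mention; anything else is asserted.
FINDINGS = [
    ("note_001", "C0011849", "diabetes mellitus", 1),
    ("note_001", "C0020538", "hypertension", -1),
    ("note_002", "C0004096", "asthma", 1),
]


def persist_asserted(conn, findings):
    """Insert only the non-negated findings; return how many rows were written."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS finding ("
        "doc_id TEXT NOT NULL, cui TEXT NOT NULL, covered_text TEXT NOT NULL)"
    )
    rows = [(doc, cui, text) for doc, cui, text, polarity in findings if polarity != -1]
    with conn:  # commit the whole batch as one transaction
        conn.executemany(
            "INSERT INTO finding (doc_id, cui, covered_text) VALUES (?, ?, ?)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the real database connection
written = persist_asserted(conn, FINDINGS)
stored_cuis = [row[0] for row in conn.execute("SELECT cui FROM finding ORDER BY cui")]
```

Batching the inserts in a single transaction is what keeps this fast with
thousands of records. With another database driver the placeholder style
may differ (for example "%s" instead of "?").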


Re: Output formats - CPE - cTAKES - Persist in database

2018-03-06 Thread Manuel Lamy
Hello Gandhi,

What I'm actually looking for is to persist the diseases, medications,
anatomical regions, procedures and signs/symptoms found by cTAKES in a
database.

I have thousands of clinical records to process, and performance is
already a concern for me, so I'm studying what my options are.

For the first experimental stage of my research, I simply wrote the
results out in XMI format (I didn't know any better) and created a script
that uses regexes to pull out all the findings. Needless to say, even with
a well-written script and plenty of processing capacity, this is not
feasible with thousands of records, since it takes too long to run.
Another mistake I made was to use a SQLite database. From now on I will
need something clearly more powerful and scalable, like MySQL.

My problem now is deciding which path to take. I've tried all the outputs
listed by Sean (all the different writers) and none of them seems easier to
process than the XMI. I would just like to have something more basic, like
a txt file for each record processed, where that txt file would just have
a row with the medications discovered, another row with the diseases
discovered, another row with the procedures, etc. Something
straightforward. Another solution would be to work with JdbcWriter, but I
can't find any good documentation to get started with it.

Maybe you can give me some suggestions about which path to take? Thanks a
lot!

Best regards,

Manuel Lamy
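
One way to get the "one row per category" output described above from the
XMI files that already exist is to use a streaming XML parser instead of
regexes. The Python below is a rough sketch, not a cTAKES feature. It
assumes that mentions appear as elements whose local names are the ones
listed in CATEGORIES, that each carries "begin" and "end" offsets, and
that the document text is stored in the "sofaString" attribute of the
Sofa element. The sample document is made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: these local element names match the mention types in your XMI.
CATEGORIES = {
    "DiseaseDisorderMention": "Diseases",
    "MedicationMention": "Medications",
    "AnatomicalSiteMention": "Anatomical sites",
    "ProcedureMention": "Procedures",
    "SignSymptomMention": "Signs/symptoms",
}


def findings_by_category(xmi_source):
    """Return {category: [covered text, ...]} for one XMI document."""
    text = ""
    spans = []  # (category, begin, end), resolved after the whole file is read
    for _, elem in ET.iterparse(xmi_source, events=("end",)):
        local = elem.tag.rsplit("}", 1)[-1]  # drop the "{namespace}" prefix
        if local == "Sofa":
            text = elem.get("sofaString", "")
        elif local in CATEGORIES:
            spans.append((CATEGORIES[local], int(elem.get("begin")), int(elem.get("end"))))
        elem.clear()  # release the element's contents once it has been handled
    result = {name: [] for name in CATEGORIES.values()}
    for category, begin, end in spans:
        result[category].append(text[begin:end])
    return result


def to_lines(result):
    """Render one 'Category -> a, b' line per non-empty category."""
    return [f"{name} -> {', '.join(items)}" for name, items in result.items() if items]


# Made-up miniature document, only to exercise the functions above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
         xmlns:cas="http:///uima/cas.ecore"
         xmlns:textsem="http:///org/apache/ctakes/typesystem/type/textsem.ecore">
  <cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" mimeType="text"
            sofaString="Patient has asthma and takes aspirin."/>
  <textsem:DiseaseDisorderMention xmi:id="2" sofa="1" begin="12" end="18"/>
  <textsem:MedicationMention xmi:id="3" sofa="1" begin="29" end="36"/>
</xmi:XMI>"""

lines = to_lines(findings_by_category(io.BytesIO(SAMPLE)))
```

Matching on the local element name avoids hard-coding namespace URIs, and
a single pass over each file is usually much cheaper than running many
regexes over the same text.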

2018-03-05 3:51 GMT+00:00 Gandhi Rajan Natarajan <
gandhi.natara...@arisglobal.com>:

> Hi Manuel,
>
> As far as I know cTAKES supports Pretty print and HTML format too. For
> more info on this, you may have to look at the cTAKES demo webapp code
> under https://github.com/healthnlp/examples/blob/master/ctakes-web-client/src/main/java/org/apache/ctakes/web/client/servlet/DemoServlet.java
>
> Also if you are looking for help on parsing XML output, have a look at the
> beta version of cTAKES REST service XML parsing code under
> https://github.com/GoTeamEpsilon/ctakes-rest-service/blob/master/ctakes-web-rest/src/main/java/org/apache/ctakes/rest/util/XMLParser.java
>
> Regards,
> Gandhi
>
>
>


Re: Output formats - CPE - cTAKES - Persist in database [EXTERNAL]

2018-03-06 Thread Manuel Lamy
Hello Sean,

Thanks for the quick response, as always. I've tried several of those
writers and none of them gives me what I need in order to conduct my
research successfully.

What I'm aiming for is an output that is easy to process (the opposite of
the XMI I get now), so that I can then persist it in a database with ease.

What I want to persist in a database is only the diseases, medications,
anatomical regions, clinical procedures and signs/symptoms associated with
each clinical record passed to cTAKES. So clearly I just want the most
standard findings made by cTAKES, nothing exotic.

Now I have three options that I can think of in order to accomplish this
objective:


   1. Try to work with the JdbcWriterTemplate. Judging by its name, this
   would fit my needs. But from what I've seen, people usually have
   problems getting it to work properly, since the configuration is not
   straightforward. So I guess this would be a rough path to take. What do
   you think? Read my other two options and maybe you'll understand my
   doubts.
   2. The second option would be to have an output so straightforward that
   I could build a script and simply regex it to obtain the clinical
   entities that I want (listed above). I'm thinking of a txt file that
   would just have something like: "Diseases -> disease a, disease b   \n
   Medications -> medication a, medication b, etc". This way I could just
   run a script and grab all the clinical entities. The processing
   performance would be much better than with XMI, since the file would
   contain just a few lines with what I want. Of the formats that I tried
   and got working, none seems easy to process.
   3. This one would probably be rough, but maybe "write my own writer"
   that would produce output like that described in point 2.
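
The flat format from option 2 is simple enough that it needs no regex at
all. The Python below is a minimal sketch of reading it back; the
"Category -> item, item" layout is taken from the example above, while
the function name and the sample text are hypothetical.

```python
def parse_findings(text):
    """Parse 'Category -> item a, item b' lines into {category: [items]}."""
    findings = {}
    for line in text.splitlines():
        if "->" not in line:
            continue  # skip blank or unrelated lines
        category, _, items = line.partition("->")
        findings[category.strip()] = [i.strip() for i in items.split(",") if i.strip()]
    return findings


SAMPLE = "Diseases -> disease a, disease b\nMedications -> medication a, medication b\n"
parsed = parse_findings(SAMPLE)
```

One caveat with this layout: a term that itself contains a comma would be
split in two, so a delimiter that cannot occur in the terms (a tab or a
pipe, for example) may be a safer choice.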


So Sean, I'm again in doubt about which path to take. I have thousands of
records coming my way soon and I'll have to make decisions. I hope that, as
always, you can help me take the most efficient path to get the job done.

If I'm overestimating the difficulty of getting JdbcWriterTemplate to work,
please tell me. I've had the Dev version of cTAKES for several months now,
so I'm already fairly conversant with the system.

Thanks again!

Best regards,

Manuel


2018-03-05 15:35 GMT+00:00 Finan, Sean <sean.fi...@childrens.harvard.edu>:

> Hi Manuel,
>
> The default clinical pipeline runs a piper file located in ctakes-core-res
> [1].  If you are running using a ctakes binary build, which is how it
> looks, you can find the file in:
> Resources/org/apache/ctakes/core/pipeline/DefaultFastPipeline.piper
>
> You can edit this file and add a different writer at the end / bottom.
> There are a lot of file writers available, more than I have time to fully
> describe, but below is a partial list.
>
> pretty.html.HtmlTextWriter
> pretty.plaintext.PrettyTextWriterFit
> property.plaintext.PropertyTextWriterFit
> CuiCountFileWriter
> CuiListFileWriter
> CuiLookupLister
> HtmlTableCasConsumer
> SentenceTokensPrinter
> TextSpanWriter
> TokenFreqCasConsumer
> TokenOffsetsCasConsumer
>
> As you have seen, xmi output contains everything under the sun.  The first
> three writers in the list create output with information that is most
> commonly desired (cuis, negation, uncertainty, etc.).  The rest are more
> focused in their output.  You can add the whole list to the end of the
> piper file mentioned above, prefixing each with the "add " command, or just
> add them individually.  Then make sure that you specify "-o" with an
> output directory in your command line.  Some of the older writers may not
> accept -o as a valid parameter value specifier, in which case you may need
> to do something different.  Ending with "CasConsumer" is a good giveaway
> that the writer is one of the older types.
>
> There is a JdbcWriterTemplate that was built to write to a database, but
> it requires a fair amount of configuration.
>
> Sean
>
> [1]  https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
>
>
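
The edit Sean describes (appending writers to the end of the piper file,
each prefixed with "add ") can also be scripted when it has to be repeated
across installations. The Python below is only a sketch. The function name
is hypothetical, the ORIGINAL string is a placeholder rather than the real
contents of DefaultFastPipeline.piper, and the writer names are copied
from the list above.

```python
def with_writers(piper_text, writers):
    """Return the piper text with one 'add <writer>' line appended per writer."""
    existing = {line.strip() for line in piper_text.splitlines()}
    extra = [f"add {w}" for w in writers if f"add {w}" not in existing]
    if not extra:
        return piper_text  # nothing to add, leave the text untouched
    return piper_text.rstrip("\n") + "\n" + "\n".join(extra) + "\n"


# Placeholder text standing in for the existing piper file contents.
ORIGINAL = "// ... existing pipeline commands ...\n"
WRITERS = ["pretty.plaintext.PrettyTextWriterFit", "CuiListFileWriter"]
updated = with_writers(ORIGINAL, WRITERS)
```

Because writers that are already present are skipped, running the helper
twice leaves the file unchanged. Working on a copy of the piper file
rather than the original is probably the safer habit.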

Output formats - CPE - cTAKES - Persist in database

2018-03-04 Thread Manuel Lamy
Hello everyone,

I'm using cTAKES clinical pipeline in order to process a lot of documents
in a row.

I'm using this command in the command line:  runClinicalPipeline.bat  -i
input --xmiOut output  --user username  --pass password

This works, adapted to my credentials and my paths of course. My problem is
that I can only output in XMI format.

My questions are the following:

-Is it possible to output a different kind of format than XMI? If yes, what
should I change in this command and what are the available formats?

-I would like to persist the structured clinical information extracted by
cTAKES directly in a database. Is there a format that is more suitable for
that task? At the moment, I can only output in XMI format. I built a
parser in Perl, with a lot of regexes, to process all the information in
the XMI file and persist it in a database. However, the XMI file has a
complex structure and the script, despite working well, takes longer than
it should to run and persist.

If someone could give me some advice about what my options are, I would
appreciate it.

Best regards,

Manuel


Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

2018-01-25 Thread Manuel Lamy
Yes, MeSH surely has some support for the Portuguese language.

However, I don't think MeSH can help me reach my goal, since it is focused
on metadata about biomedical articles and is not a dictionary of terms the
way SNOMED-CT is.

Or maybe I don't fully understand the purpose of MeSH.

I will certainly have to investigate OpenEMR.

A first look at the spreadsheet you sent me shows 8691 entries. To put
things in proportion, SNOMED-CT has at least 300,000 clinical terms. And
from what I can see, a lot of the Portuguese terms in that spreadsheet are
missing and/or incorrect.

I'm not trying to be condescending at all; it just shows how bad the
Portuguese situation in this domain is right now, which makes my goal a
little bit harder :)

I'll see what I can do with OpenEMR. My last resort will be the direct
translation of my EMRs to English, but I'm afraid the performance of the
system will be compromised too much.

Thanks a lot for the references Sean!

Best regards,

Manuel

2018-01-25 23:15 GMT+00:00 Finan, Sean <sean.fi...@childrens.harvard.edu>:

> Yeah, the translation is going to require a bit of effort.
>
> I didn't know that there is no Portuguese in the snomed international.
> However, there should be other parts of the umls with Portuguese.  It looks
> like MeSH has at least a bit of Portuguese:
> https://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/MSHPOR/
>
> You can probably find other sources.  OpenEMR is a cool project with
> available medical term translations.  Some info here:
> http://www.open-emr.org/wiki/index.php/OpenEMR_Internationalization_Configuration
> A spreadsheet with terms:
> https://docs.google.com/spreadsheets/d/1i2_WsjBX9cwa9mx0gIv3psMzQ28VsUZ-MqlAyZcmbX0/edit?hl=en=en
>
>
> Cheers,
> Sean
>
>

Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

2018-01-25 Thread Manuel Lamy
Hello Sean,

Thanks for the awesome inputs as always!

*SNOMED*

AFAIK SNOMED doesn't exist in the Portuguese language yet. As per my
research and this reference[1], SNOMED-CT has only been translated into
Australian English, Danish, Dutch, Spanish, Swedish, and USA/UK English.
Have you heard about SNOMED-CT being translated to Portuguese somewhere?

I may have to come up with a solution for this. I could build a mechanism
that tries to translate all of SNOMED-CT from English to Portuguese, or
from Spanish, since it is a language close to Portuguese. I can use many
sources, such as ICD-9/10, DBpedia and/or a direct translation tool.
However, this path will not be easy to take on my own. It's a possibility,
though. I have to think about it.


*Bat files in Development Version*

I was trying to run the bat files that were inside the module
ctakes-distribution of my Dev Version. I guess that was my problem after
all.


*Translation*

Yes, I know cTAKES won't translate for me. I was thinking of using an
offline translator and adding it to my pipeline. I have yet to find a
translator that is half as good as Google Translate, though. I don't
want to rely on an online translator.


*Wiki Documentation*

Thanks for your compliment. Sure, I would love to help. I like to express
myself as clearly as possible.

However, my knowledge about the system is still limited. I only started
using cTAKES a couple of months ago.

But if I can help with something just ask me, I would be glad to help.


Thanks a lot Sean.

Best regards,

Manuel Lamy


[1] -
https://www.snomed.org/snomed-ct/snomed-ct-worldwide/translations-of-snomed-ct


2018-01-25 20:43 GMT+00:00 Finan, Sean <sean.fi...@childrens.harvard.edu>:

> Hi Manuel,
>
> Thank you for the information.  I have a couple of response lines …
>
>
> > I need to do it because cTAKES seems to not work with the Portuguese
> language at all
> - Yes and no … You can create a dictionary of terms in the
> Portuguese language.  This would allow ctakes to at least recognize these
> terms and save them for posterity.  However, the more advanced processing
> available for English (negation, uncertainty detection, etc.) will not be
> available.  If you can find other nlp projects that work with Portuguese it
> may be possible to insert them into a ctakes pipeline.  The instructions
> for creating a custom dictionary are here (language selection is not
> documented but it is on the gui, download the umls with Portuguese snomed if
> you can):
> https://cwiki.apache.org/confluence/display/CTAKES/Dictionary+Creator+GUI
>
> > What I have in mind is to create a pipeline system that first translates
> the texts from Portuguese to English
> - Probably a good way to go if you have a decent
> translation tool.
>
> > From my research, I couldn't find anything relevant in this topic.
> - We definitely could use more documentation.
>
> > Well, since this is the user version, I don't have the
> runPiperSubmitter.bat available
> - Correct.  It is a tool that was created after the 4.0
> release.
>
> > When I try to run the bat files inside the bin of the Dev Version, I
> have the results shown in the image attached to this e-mail.
> -  Your attachments were scrubbed so I can’t see them.
> However, I have a guess: did you run a “maven package”, unzip the created
> installation file and run from the bin/ directory there?  Or are you
> running with the bin/ inside your development sandbox?  The second method
> won’t work and will give you the “class not found” errors that you are
> seeing.  If you want to run using Intellij, turn on the profile
> “runPiperGui” and compile.  Maven should launch the gui after compilation.
>
> > Well, first of all, my objective is to share my experiences with cTAKES,
> in order to share with the community what I'm going through. This way I can
> contribute to the community and probably help others who are going through
> the same as me.
> -  Excellent.  Would you be willing to write documentation
> for the ctakes wiki?  Your emails are clear and extremely well formatted!
>
>
>   1.  Is this feasible? Am I aiming for something that I simply can't rely
> in cTAKES only to do, because I have to translate the texts first?
>
> -  Ctakes won’t translate for you, but if you can find a tool that
> will then processing with ctakes should be possible.
>
>   1.  Why don't I have a TypeSystem.xml file to feed CVD first, in the
> Development Version? I can only find it in the User Version, under
> /resources.
>
> -  The typesystem.xml file is in the ctakes-type-system project
> until you “maven package” and create an “installation”.  If you just run
> from your develo

Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

2018-01-25 Thread Manuel Lamy
[...]
   1. Is this feasible? Am I aiming for something that I simply can't rely
   on cTAKES alone to do, because I have to translate the texts first?
   2. Why don't I have a TypeSystem.xml file to feed CVD first, in the
   Development Version? I can only find it in the User Version, under
   /resources.
   3. Why do we have options in CVD for other languages, but it clearly
   only works for the English language?
   4. Any other hint you can give me, concerning the big picture of what
   I'm trying to build here?

Any additional information you need from my side, just tell me.

Thanks one more time for the quick answers and support Sean.

Best regards,

Manuel


2018-01-25 15:35 GMT+00:00 Finan, Sean <sean.fi...@childrens.harvard.edu>:

> Hi Manuel,
>
> My first comment is that you are running ctakes in a somewhat “ancient”
> manner, or better put, the xml descriptor workflow has been pretty much
> deprecated.
>
> You should try to run ctakes 4.0.  If you are software savvy then I advise
> that you try the development version that is in trunk.  You’ve probably
> been on the ctakes download page, but just a reminder :
> http://ctakes.apache.org/
>
> The ctakes wiki has some useful information, and the 4.0 entry is here:
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0
>
> To start playing with ctakes I suggest that you try to run the default
> clinical pipeline, following the instructions here:
> https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline
>
> Those instructions will start the default clinical pipeline from a command
> line.  If you have the development version from trunk then there is a gui
> available to run pipelines:
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI
>
> There are also many other pipeline configurations available in trunk to
> run more advanced / involved pipelines.  They are not in the 4.0 release.
> The pipelines (including 4.0 default) are all defined using the replacement
> for those xml descriptor files.  The replacements are called “piper files”.
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
>
> I hope that you find the pipers easier to understand and use than the old
> xml descriptors.
>
> Anyway, if you run the ctakes 4.0 default clinical pipeline as outlined in
> the wiki page it will use the new FileTreeReader and FileTreeXmiWriter
> combination.
>
> Give it a whirl and let me know how things go.
>
> Sean
>
>

Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]

2018-01-25 Thread Manuel Lamy
Hello Sean,

First of all, thanks for your quick answer.

I'm probably getting confused here, so I have the following questions.


   1. A CAS Consumer is defined by an XML file. Are you implying that I
   should go to my consumer XML (__XmiWriterCasConsumer.xml) and change
   its <implementationName> tag to
   'org.apache.ctakes.core.cc.FileTreeXmiWriter' instead of
   'org.apache.ctakes.core.cc.XmiWriterCasConsumer'? Funnily enough, it
   gives me a ClassNotFoundException if I do this. I would like your
   confirmation that I'm doing the right thing, please. The class does
   exist at that path, though.
   2. Concerning the reader, I applied the same reasoning. Should I go to
   my descriptor and change its <implementationName> tag from
   'org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader' to
   'org.apache.ctakes.core.cr.FileTreeReader'?

I did these two things and the error is the same concerning the new
consumer 'FileTreeXmiWriter', as you can see in the first image attached to
this e-mail.

I would also like to ask you another question:


   3. Why does my class 'FileTreeXmiWriter' have a lot of unresolved
   classes? You can see it in the second image attached to this e-mail. I
   can't seem to import them correctly. I tried to import the extension of
   this class just to check the result, and look at how it resolved the
   import for me: 'apache' is not recognized. I'm just kind of baffled by
   the hierarchy defined for this project. If you could give me a little
   clarification on this topic and how to solve it, I would appreciate it.

Thanks for your attention! I'm really looking forward to getting this to
work. cTAKES seems awesome. It just needs these little tweaks.

Best regards,

Manuel





2018-01-24 22:26 GMT+00:00 Finan, Sean <sean.fi...@childrens.harvard.edu>:

> Hi Manuel,
>
> Your image got scrubbed by a server, but the problem may have been fixed
> in a recent xmi writer.  The latest xmi writer is in ctakes core and is
> named FileTreeXmiWriter.  One possible cause for a problem in the writer is
> if the document has some unexpected character or character combination.  A
> document reader should be massaging documents before they are processed and
> sent to the writer.  The most recent file reader is named FileTreeReader
> and is also in ctakes core.
>
> Sean
>
>
>
Problem using CPE and XMI Writer CAS Consumer

2018-01-24 Thread Manuel Lamy
Hello guys,

I'm having problems running the CPE using an XMI Writer CAS Consumer.
However, it works with other consumers.

*Problem*

In the figure below, you can see my setup and the error I'm obtaining:

[image: Inline image 2]

*Logs*

Concerning logs, I'm obtaining this from Intellij:

org.apache.uima.resource.ResourceInitializationException
    at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:81)
    at org.apache.uima.impl.UIMAFramework_impl._produceCollectionProcessingEngine(UIMAFramework_impl.java:438)
    at org.apache.uima.UIMAFramework.produceCollectionProcessingEngine(UIMAFramework.java:918)
    at org.apache.uima.tools.cpm.CpmPanel.startProcessing(CpmPanel.java:573)
    at org.apache.uima.tools.cpm.CpmPanel.access$000(CpmPanel.java:105)
    at org.apache.uima.tools.cpm.CpmPanel$1.run(CpmPanel.java:713)
Caused by: org.apache.uima.resource.ResourceConfigurationException
    at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1093)
    at org.apache.uima.collection.impl.cpm.container.CPEFactory.getCasProcessors(CPEFactory.java:547)
    at org.apache.uima.collection.impl.cpm.BaseCPMImpl.init(BaseCPMImpl.java:253)
    at org.apache.uima.collection.impl.cpm.BaseCPMImpl.<init>(BaseCPMImpl.java:127)
    at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:73)
    ... 5 more
Caused by: java.lang.Exception: The component XMI Writer CAS Consumer cannot be created. (Thread Name: Thread-5)
    ... 10 more

*Attempted Solutions*

I found only one other person with the same problem. The solution proposed
in that thread, by Sean Finan, was to change the XML of my consumer
(__XmiWriterCasConsumer.xml), specifically the content of the
<implementationName> tag, from

org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes

to

org.apache.uima.tools.components.XmiWriterCasConsumer


However, this didn't work. The error is exactly the same. I'm out of
ideas about what to do. I would like to have the CPE output in XMI,
in order to read it with CVD. You can see the thread here:

http://mail-archives.apache.org/mod_mbox/ctakes-dev/201701.mbox/%3c29cefd1fa1b44ce4a8dc92ec8b1cd...@chexmail1a.chboston.org%3E


*Result Expected*

Running the CPE process and have outputs as XMI files.


*Result Obtained*

Running the CPE results in an error, specifically for the consumer
__XMIWriterCasConsumer.


*Conclusion*

Have any of you had this problem before? Do you have a suggestion
about how it can be solved? Thanks a lot.


Best regards,

Manuel