Re: suggestion for default pipelines
It would be nice if uimaFIT provided a Maven plugin to automatically generate descriptors for aggregates. If we come up with a convention for factories (e.g. a class with static methods that take no parameters and return descriptors, or methods that bear a specific Java annotation such as @AutoGenerateDescriptor), it should be possible to implement such a Maven plugin.

Cheers,

-- Richard

On 16.04.2014, at 05:21, Steven Bethard steven.beth...@gmail.com wrote:

+1. And note that once you have a descriptor, you can generate the XML, so we should arrange to replace the current XML descriptors with ones generated automatically from the uimaFIT code. That should reduce the synchronization problems that arise when the Java code is changed but the XML descriptor is not.

Steve

On Tue, Apr 15, 2014 at 8:52 AM, Miller, Timothy timothy.mil...@childrens.harvard.edu wrote:

The discussion in the other thread with Abraham Tom gave me an idea I wanted to float to the list. We have been using some UIMAFit pipeline builders in the temporal project that maybe could be moved into clinical-pipeline. For example, look at this file:

http://svn.apache.org/viewvc/ctakes/trunk/ctakes-temporal/src/main/java/org/apache/ctakes/temporal/pipelines/TemporalExtractionPipeline_ImplBase.java?view=markup

with the static methods getPreprocessorAggregateBuilder() and getLightweightPreprocessorAggregateBuilder() [no umls].

So my idea would be to create a class in clinical-pipeline (CTakesPipelines) with static methods for some standard pipelines (returning AnalysisEngineDescriptions instead of AggregateBuilders?):

getStandardUMLSPipeline() -- builds the pipeline currently in AggregatePlaintextUMLSProcessor.xml
getFullPipeline() -- same as above but with SRL, constituency parsing, etc., every component in ctakes

We could then potentially merge our entry points -- I think Abraham's experience points out that this is currently confusing, as well as probably not implemented optimally. For example, either ClinicalPipelineWithUmls or BagOfCUIsGenerator would use that static method to run a uimafit-style pipeline. Maybe we can slowly deprecate our XML descriptors too, unless people feel strongly about keeping those around.

Another benefit is that the cTAKES API is then trivial -- if you import ctakes into your pom file, getting a UIMA pipeline is one UimaFit call:

builder.add(CTAKESPipelines.getStandardUMLSPipeline());

I think this would actually be pretty easy to implement, but I am hoping to get some feedback on whether this is a good direction.

Tim
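The factory convention Richard describes at the top of this thread (static methods with no parameters, marked by an annotation) could be discovered with plain reflection, roughly as sketched below. This is only an illustration under assumptions: the annotation is declared locally (uimaFIT does not ship one by this name), the ExamplePipelines class and its methods are hypothetical, and a String stands in for a real descriptor object so that the sketch needs nothing beyond the JDK.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

public class DescriptorFactoryScan {

    /** Hypothetical marker annotation; the name is taken from the suggestion, it is not a uimaFIT type. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface AutoGenerateDescriptor {}

    /** Hypothetical factory class. A String stands in for a real descriptor. */
    public static class ExamplePipelines {
        @AutoGenerateDescriptor
        public static String getStandardPipeline() { return "standard"; }

        @AutoGenerateDescriptor
        public static String getFullPipeline() { return "full"; }

        // Not annotated, so the scan skips it.
        public static String helper() { return "ignored"; }

        // Annotated but takes a parameter, so it does not follow the convention.
        @AutoGenerateDescriptor
        public static String custom(String name) { return name; }
    }

    /** Invokes every public static no-arg annotated method and collects the results by method name. */
    public static Map<String, Object> collect(Class<?> factory) {
        Map<String, Object> found = new TreeMap<>();
        for (Method m : factory.getMethods()) {
            boolean followsConvention = m.isAnnotationPresent(AutoGenerateDescriptor.class)
                    && Modifier.isStatic(m.getModifiers())
                    && m.getParameterCount() == 0;
            if (followsConvention) {
                try {
                    found.put(m.getName(), m.invoke(null));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Could not invoke " + m, e);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(collect(ExamplePipelines.class));
    }
}
```

A build plugin following this idea would presumably run such a scan over the project's compiled classes and serialize each returned descriptor to XML; that part is left out here because it depends on the real descriptor types.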
RE: ctakes-vm.apache.org
Hi Andy,

Let me know if you're able to ssh -l -v to and...@ctakes-vm.apache.org

I believe all you would need to do is run ssh-keygen and then copy your public key to id.apache.org.

James: opiekey is only required for sudo access. I see you're able to log in successfully already. Nevertheless, Jan reset opiekey so you can try again if needed.
https://issues.apache.org/jira/browse/INFRA-7451

--Pei

-----Original Message-----
From: andy mcmurry [mailto:mcmurry.a...@gmail.com]
Sent: Monday, April 07, 2014 5:43 PM
To: dev@ctakes.apache.org
Subject: Re: ctakes-vm.apache.org

On a Mac here, also having trouble logging in to the VM. Looking more into the keys situation. On Mac, there is this problem: "no support for PKCS#11", which I'm working to resolve.

On Sat, Apr 5, 2014 at 12:15 PM, John Green john.travis.gr...@gmail.com wrote:

Thanks Pei. Sure, that would be great, add me.

Jg
-- Sent from Mailbox for iPhone

On Fri, Apr 4, 2014 at 10:15 AM, Chen, Pei pei.c...@childrens.harvard.edu wrote:

John,

You should have committer rights now... I would suggest opening a Jira item just so that it can be tracked. But you should be able to create a subdir within https://svn.apache.org/repos/asf/ctakes/sandbox and do an svn commit.

As a side note: ctakes-vm.apache.org has been created now. John, let me know if you would like to be added to the list of maintainers. We can use that machine to host any of the demos. It requires passwordless ssh, so you'll need to run ssh-keygen and save the keys via http://id.apache.org.

--Pei

-----Original Message-----
From: John Green [mailto:john.travis.gr...@gmail.com]
Sent: Thursday, April 03, 2014 6:05 PM
To: dev@ctakes.apache.org
Cc: dev@ctakes.apache.org
Subject: RE: ctakes-vm.apache.org

Would love to! I've only submitted those example notes I did to a Jira ticket, though. How do I push to the sandbox dir? Any special permissions I need?

JG
-- Sent from Mailbox for iPhone

On Wed, Apr 2, 2014 at 10:51 PM, Chen, Pei pei.c...@childrens.harvard.edu wrote:

John,

If there are no other objections, you can also put it directly in sandbox:
https://svn.apache.org/repos/asf/ctakes/sandbox/

It may make it easier in the future if folks decide to integrate it into cTAKES... and possibly save any potential IP/license questions...

--Pei

From: John Green [john.travis.gr...@gmail.com]
Sent: Wednesday, April 02, 2014 6:24 PM
To: dev@ctakes.apache.org
Subject: Re: ctakes-vm.apache.org

Great! Let me clean it up this weekend and I'll throw it out onto my github. Will post link soon; nlt cob this weekend.

JG
-- Sent from Mailbox for iPhone

On Wed, Apr 2, 2014 at 1:53 PM, andy mcmurry mcmurry.a...@gmail.com wrote:

Yes! Impeccable timing. Where can we find the python source?

On Apr 2, 2014 8:33 AM, John Green john.travis.gr...@gmail.com wrote:

Andy: this is very interesting and exciting. I hacked out a script that makes a visually appealing representation of the aggregate pipeline in d3js that, at least for a clinician, is a nice overall summary of the metadata generated from the pipeline. It's really no more than a parser of the XML through the type system, spat out as JSON, but when I was talking to my informatics department, who didn't know much at all about ctakes, it was a great visual summary. It's in python.

I don't know if you'd want it, but it might be worth having the demo site spit out a visually appealing graphic like this automatically. If not in python, it might be worth adapting it to whatever you're using for a platform to spit out the JSON for the d3js graphic I'm using.

John
-- Sent from Mailbox for iPhone

On Thu, Mar 20, 2014 at 5:31 AM, andy mcmurry mcmurry.a...@gmail.com wrote:

Yes! I have been working full time on the apt-get install task specific to medical genetics: http://www.ncbi.nlm.nih.gov/medgen

Right now, millions of $$$ are invested in getting phenotype concepts -- indications, diseases, problem lists -- linked to patient test results including DNA / RNA / etc. In industry, most of the curation work is done manually because platforms like cTAKES are not yet immediately accessible.

I have written code to
A) start automating the installer tasks for cTAKES on Ubuntu 13
B) install UMLS NLP tools metamap, semrep, semmed
C) mirror NLM content that extends UMLS annotation

*SO THAT:* Mentions of disease relationships -- SNOMED-CT, HPO, OMIM, GTR, UMLS -- reference the same semantic relationships in UMLS Clinical Terms and Genetic Test Reference. This is powerful and all credit to the NLM for
Apache cTAKES Example Application?
We spent some time in the past making it easier for users to launch the CVD/CPE. But based on the questions and discussions, I think we are past this stage, and a very common use case is for developers to use cTAKES as a library, extend a class or two, and then embed it into their existing app.

I am proposing a ctakes-web-demo for the sandbox: a simple webapp (WAR with a Maven pom.xml) that uses cTAKES as a dependency. It would have a simple servlet that wires up a pipeline (uimaFIT style) and then dumps the CAS as an HTML table. We could even host it on: https://demo-ctakes.apache.org/

It will probably only be a few lines of code, but it may be a good starting point for developers who are more interested in using cTAKES as a library and not necessarily modifying cTAKES code.

What do folks think?

--Pei
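To give a feel for the "dump the CAS as an HTML table" step, here is a minimal sketch of the rendering part only. It assumes nothing about the real servlet or the UIMA API: the Row class is a hypothetical stand-in for one annotation (type name, offsets, covered text), and a real demo would fill such rows by iterating over the annotations in the CAS.

```java
import java.util.Arrays;
import java.util.List;

public class AnnotationTableSketch {

    /** Hypothetical stand-in for one annotation read from a CAS. */
    public static final class Row {
        final String type;
        final int begin;
        final int end;
        final String text;

        public Row(String type, int begin, int end, String text) {
            this.type = type;
            this.begin = begin;
            this.end = end;
            this.text = text;
        }
    }

    /** Escapes the characters that are special in HTML text and attribute values. */
    public static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Renders one table row per annotation, with a fixed header row. */
    public static String toHtmlTable(List<Row> rows) {
        StringBuilder html = new StringBuilder("<table>\n");
        html.append("<tr><th>Type</th><th>Begin</th><th>End</th><th>Text</th></tr>\n");
        for (Row r : rows) {
            html.append("<tr><td>").append(escape(r.type))
                .append("</td><td>").append(r.begin)
                .append("</td><td>").append(r.end)
                .append("</td><td>").append(escape(r.text))
                .append("</td></tr>\n");
        }
        html.append("</table>");
        return html.toString();
    }

    public static void main(String[] args) {
        // Made-up example rows, not output of a real pipeline run.
        List<Row> rows = Arrays.asList(
                new Row("DiseaseDisorderMention", 12, 20, "diabetes"),
                new Row("MedicationMention", 31, 40, "metformin"));
        System.out.println(toHtmlTable(rows));
    }
}
```

Escaping matters here because clinical notes routinely contain characters such as < and & that would otherwise break the page.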
RE: errors when run BagOfCUIsGenerator.java
Ying,

Are you behind a proxy or firewall? If you're trying to use the UMLS resources, it attempts to make a call to their UMLS service to validate your credentials.

--Pei

-----Original Message-----
From: Liu, Ying [mailto:l...@advisory.com]
Sent: Wednesday, April 16, 2014 1:13 PM
To: dev@ctakes.apache.org
Subject: errors when run BagOfCUIsGenerator.java

It failed when I ran BagOfCUIsGenerator.java. The following is the error information. Thanks for your help.

Ying

Exception in thread "main" org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class org.apache.ctakes.dictionary.lookup.ae.UmlsDictionaryLookupAnnotator failed. (Descriptor: file:/C:/Users/Ying/workspacectakes/ctakes/ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:252)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:156)
    at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
    at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185)
    at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
    at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:354)
    at org.uimafit.factory.AnalysisEngineFactory.createAnalysisEngineFromPath(AnalysisEngineFactory.java:147)
    at org.apache.ctakes.clinicalpipeline.runtime.BagOfAnnotationsGenerator.<init>(BagOfAnnotationsGenerator.java:42)
    at org.apache.ctakes.clinicalpipeline.runtime.BagOfAnnotationsGenerator.<init>(BagOfAnnotationsGenerator.java:36)
    at org.apache.ctakes.clinicalpipeline.runtime.BagOfCUIsGenerator.<init>(BagOfCUIsGenerator.java:16)
    at org.apache.ctakes.clinicalpipeline.runtime.BagOfCUIsGenerator.main(BagOfCUIsGenerator.java:49)
Caused by: org.apache.uima.resource.ResourceInitializationException
    at org.apache.ctakes.dictionary.lookup.ae.UmlsDictionaryLookupAnnotator.initialize(UmlsDictionaryLookupAnnotator.java:79)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:250)
    ... 18 more
Caused by: java.net.ConnectException: Connection timed out: connect
    at java.net.DualStackPlainSocketImpl.connect0(Native Method)
    at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at sun.security.ssl.SSLSocketImpl.connect(Unknown Source)
    at sun.security.ssl.BaseSSLSocketImpl.connect(Unknown Source)
    at sun.net.NetworkClient.doConnect(Unknown Source)
    at sun.net.www.http.HttpClient.openServer(Unknown Source)
    at sun.net.www.http.HttpClient.openServer(Unknown Source)
    at sun.net.www.protocol.https.HttpsClient.<init>(Unknown Source)
    at sun.net.www.protocol.https.HttpsClient.New(Unknown Source)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(Unknown Source)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(Unknown Source)
    at org.apache.ctakes.dictionary.lookup.ae.UmlsDictionaryLookupAnnotator.isValidUMLSUser(UmlsDictionaryLookupAnnotator.java:93)
    at
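If the machine does sit behind an HTTP proxy, one thing that may be worth trying is pointing the JVM at it through the standard https.proxyHost and https.proxyPort system properties, which java.net.HttpURLConnection honors. The sketch below only sets those properties; the proxy host and port are made-up placeholders, and whether the credential check then succeeds depends on the proxy and firewall rules.

```java
public class ProxySettingsSketch {

    /** Sets the standard JVM proxy properties used for HTTPS connections. */
    public static void useHttpsProxy(String host, int port) {
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", Integer.toString(port));
    }

    public static void main(String[] args) {
        // Hypothetical proxy address; replace it with the value for your network.
        useHttpsProxy("proxy.example.org", 8080);
        System.out.println(System.getProperty("https.proxyHost")
                + ":" + System.getProperty("https.proxyPort"));
    }
}
```

The properties need to be set before the first connection is opened. The same effect can be had without code changes by passing -Dhttps.proxyHost=... and -Dhttps.proxyPort=... on the java command line.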
RE: errors when run BagOfCUIsGenerator.java
Try to open https://uts-ws.nlm.nih.gov

If that works then try https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser and see if you get a message like "This XML file does not appear to have any style information associated with it. The document tree is shown below."

If that works and you are comfortable with the code, try with

umlsaddr : https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser
vendor : NLM-6515182895

    /**
     * @param umlsaddr -
     * @param vendor -
     * @param username -
     * @param password -
     * @return true if the server at umlsaddr approves of the vendor, user, password combination
     */
    public static boolean isValidUMLSUser( final String umlsaddr, final String vendor,
                                           final String username, final String password ) {
        String data;
        try {
            // Build the form-encoded POST body: licenseCode=...&user=...&password=...
            data = URLEncoder.encode( "licenseCode", "UTF-8" ) + "=" + URLEncoder.encode( vendor, "UTF-8" );
            data += "&" + URLEncoder.encode( "user", "UTF-8" ) + "=" + URLEncoder.encode( username, "UTF-8" );
            data += "&" + URLEncoder.encode( "password", "UTF-8" ) + "=" + URLEncoder.encode( password, "UTF-8" );
        } catch ( UnsupportedEncodingException unseE ) {
            LOGGER.error( "Could not encode URL for " + username + " with vendor license " + vendor );
            return false;
        }
        try {
            final URL url = new URL( umlsaddr );
            final URLConnection connection = url.openConnection();
            connection.setDoOutput( true );
            final OutputStreamWriter writer = new OutputStreamWriter( connection.getOutputStream() );
            writer.write( data );
            writer.flush();
            boolean result = false;
            final BufferedReader reader = new BufferedReader( new InputStreamReader( connection.getInputStream() ) );
            String line;
            // Read the response up to the first empty line; the last line read decides the result
            while ( (line = reader.readLine()) != null ) {
                final String trimline = line.trim();
                if ( trimline.isEmpty() ) {
                    break;
                }
                result = trimline.equalsIgnoreCase( "<Result>true</Result>" );
            }
            writer.close();
            reader.close();
            return result;
        } catch ( IOException ioE ) {
            LOGGER.error( ioE.getMessage() );
            return false;
        }
    }

-----Original Message-----
From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
Sent: Wednesday, April 16, 2014 1:25 PM
To: dev@ctakes.apache.org
Subject: RE: errors when run BagOfCUIsGenerator.java

Ying,

Are you behind a proxy or firewall? If you're trying to use the UMLS resources, it attempts to make a call to their UMLS service to validate your credentials.

--Pei

-----Original Message-----
From: Liu, Ying [mailto:l...@advisory.com]
Sent: Wednesday, April 16, 2014 1:13 PM
To: dev@ctakes.apache.org
Subject: errors when run BagOfCUIsGenerator.java

It failed when I ran BagOfCUIsGenerator.java. The following is the error information. Thanks for your help.

Ying

[stack trace snipped]
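The first checks in this message boil down to asking whether the machine can reach uts-ws.nlm.nih.gov at all, which is also what the "Connection timed out" in the trace points at. A small sketch of the same check from Java, using only the JDK, might look like this; the class and method names are made up for illustration, and port 443 is assumed because the address is an https URL.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ReachabilityCheck {

    /** Returns true if a TCP connection to host:port can be opened within timeoutMillis. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers timeouts, refused connections and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("reachable: " + canConnect("uts-ws.nlm.nih.gov", 443, 5000));
    }
}
```

A false result from the same machine that runs cTAKES suggests a network problem (firewall, proxy, VPN) rather than a problem with the UMLS username or password. Note that this opens a direct connection, so behind a mandatory proxy it can report false even though a browser works.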
RE: errors when run BagOfCUIsGenerator.java
Sorry to bother the email list. The problem was caused by my VPN connection. I was connected to the VPN and it didn't allow me to access any other website, so my UMLS username and password didn't get through.

Thanks,

Ying

From: Finan, Sean [sean.fi...@childrens.harvard.edu]
Sent: Wednesday, April 16, 2014 10:30 AM
To: dev@ctakes.apache.org
Subject: RE: errors when run BagOfCUIsGenerator.java

Try to open https://uts-ws.nlm.nih.gov

If that works then try https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser and see if you get a message like "This XML file does not appear to have any style information associated with it. The document tree is shown below."

[rest of quoted thread snipped]