Hi Folks,
I am using cTAKES 3.2.2 Maven dependencies.
I have some clinical pipeline code, along with the cTAKES dependencies and
some resources, packaged into an uber jar which I use from my Spark driver
code. When I submit this to the Spark cluster I get a nasty stack trace [0],
of which the following is the important part:
- Caused by: java.lang.IllegalArgumentException: URI is not hierarchical
- at java.io.File.<init>(File.java:418)
- at org.apache.ctakes.lvg.ae.LvgAnnotator.createAnnotatorDescription(LvgAnnotator.java:565)
- at it.cnr.iac.CTAKESClinicalPipelineFactory.getTokenProcessingPipeline(CTAKESClinicalPipelineFactory.java:146)
The problem is that the code at
LvgAnnotator.createAnnotatorDescription(LvgAnnotator.java:565) looks as
follows:
    ExternalResourceFactory.createExternalResourceDescription(
        LvgCmdApiResourceImpl.class,
        new File(LvgCmdApiResourceImpl.class.getResource(
            "/org/apache/ctakes/lvg/data/config/lvg.properties").toURI()))
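When lvg.properties is packaged inside a jar, getResource returns a jar: URL
whose URI is not hierarchical, so new File(URI) throws the exception above.
Here we should be using LvgCmdApiResourceImpl.class.getResourceAsStream
instead. Any conversion to a File should then be done, if required, within
ExternalResourceFactory.createExternalResourceDescription.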
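
To illustrate the failure and one possible workaround, here is a minimal
sketch. It is not the cTAKES or uimaFIT code: the class ResourceFiles and its
two methods are hypothetical names I made up for this example, and the jar
path in main is a placeholder. The first method shows that new File(URI)
rejects a jar: URI. The second reads the resource as a stream and copies it
to a temporary file, which a File-based API could then consume.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only; not part of cTAKES or uimaFIT.
final class ResourceFiles {

    // Returns true only if new File(uri) accepts the URI. The constructor
    // requires an absolute, hierarchical URI with the "file" scheme.
    static boolean isFileConvertible(URI uri) {
        try {
            new File(uri);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // Streams a classpath resource into a temporary file, so it is readable
    // as a File even when the resource lives inside a jar.
    static File copyToTempFile(Class<?> anchor, String resourcePath) {
        try (InputStream in = anchor.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found: " + resourcePath);
            }
            Path tmp = Files.createTempFile("resource-", ".tmp");
            tmp.toFile().deleteOnExit();
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder jar: URI of the kind getResource(...).toURI() yields
        // for a resource packaged inside a jar.
        URI inJar = URI.create("jar:file:/tmp/app.jar!/lvg.properties");
        System.out.println("File-convertible: " + isFileConvertible(inJar));

        // A resource that is always on the classpath stands in for lvg.properties.
        File copy = copyToTempFile(Object.class, "/java/lang/Object.class");
        System.out.println("Copied " + copy.length() + " bytes to " + copy);
    }
}
```

Note that this only covers the single properties file. I have not checked
whether lvg.properties refers to other files by relative path. If it does,
those would need the same treatment.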
This issue has been reported on a few occasions, and a fix of sorts was
proposed for a similar issue here [1][2].
I am going to open an issue in Jira and submit a patch for this, along with
a test.
Thanks
Lewis
[0] https://paste.apache.org/gDJa
[1] https://issues.apache.org/jira/browse/CTAKES-307
[2] https://issues.apache.org/jira/browse/CTAKES-89
--
*Lewis*