I think that in late April Sean Finan fixed a problem that was resulting in:

Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -7

Are you using cTAKES 4.0 (either from the convenience binary download or as a Maven dependency), or are you using cTAKES in some other way?

-- James

On Fri, Sep 1, 2017 at 3:13 PM, Michael Trepanier <[email protected]> wrote:

> Hi All,
>
> We've been attempting to scale our cTAKES Pipeline on top of Spark, so
> we've switched from using the "getDefaultPipeline" method to the
> "getFastPipeline" method to boost the processing speed. However, while the
> default pipeline works fine with Spark, the fast pipeline is throwing the
> below error (edited down to the cTAKES portion of the stack trace):
>
> Caused by: org.apache.uima.resource.ResourceInitializationException: MESSAGE LOCALIZATION FAILED: Can't find resource for bundle java.util.PropertyResourceBundle, key Could not construct org.apache.ctakes.dictionary.lookup2.dictionary.UmlsJdbcRareWordDictionary
>     at org.apache.ctakes.dictionary.lookup2.ae.AbstractJCasTermAnnotator.initialize(AbstractJCasTermAnnotator.java:131)
>     at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:266)
>     ... 44 more
> Caused by: org.apache.uima.analysis_engine.annotator.AnnotatorContextException: MESSAGE LOCALIZATION FAILED: Can't find resource for bundle java.util.PropertyResourceBundle, key Could not construct org.apache.ctakes.dictionary.lookup2.dictionary.UmlsJdbcRareWordDictionary
>     at org.apache.ctakes.dictionary.lookup2.dictionary.DictionaryDescriptorParser.parseDictionary(DictionaryDescriptorParser.java:199)
>     at org.apache.ctakes.dictionary.lookup2.dictionary.DictionaryDescriptorParser.parseDictionaries(DictionaryDescriptorParser.java:156)
>     at org.apache.ctakes.dictionary.lookup2.dictionary.DictionaryDescriptorParser.parseDescriptor(DictionaryDescriptorParser.java:128)
>     at org.apache.ctakes.dictionary.lookup2.ae.AbstractJCasTermAnnotator.initialize(AbstractJCasTermAnnotator.java:129)
>     ... 45 more
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.ctakes.dictionary.lookup2.dictionary.DictionaryDescriptorParser.parseDictionary(DictionaryDescriptorParser.java:196)
>     ... 48 more
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -7
>     at java.lang.String.substring(String.java:1967)
>     at org.apache.ctakes.dictionary.lookup2.util.JdbcConnectionFactory.getConnectionUrl(JdbcConnectionFactory.java:110)
>     at org.apache.ctakes.dictionary.lookup2.util.JdbcConnectionFactory.getConnection(JdbcConnectionFactory.java:63)
>     at org.apache.ctakes.dictionary.lookup2.dictionary.JdbcRareWordDictionary.<init>(JdbcRareWordDictionary.java:91)
>     at org.apache.ctakes.dictionary.lookup2.dictionary.JdbcRareWordDictionary.<init>(JdbcRareWordDictionary.java:72)
>     at org.apache.ctakes.dictionary.lookup2.dictionary.UmlsJdbcRareWordDictionary.<init>(UmlsJdbcRareWordDictionary.java:31)
>     ... 53 more
>
> So, looking in "getConnectionUrl," we have this method:
>
>     static private String getConnectionUrl( final String jdbcUrl ) throws SQLException {
>        final String urlDbPath = jdbcUrl.substring( HSQL_FILE_PREFIX.length() );
>        final String urlFilePath = urlDbPath + HSQL_DB_EXT;
>        try {
>           final URL url = FileLocator.getResource( urlFilePath );
>           final String urlString = url.toExternalForm();
>           return urlString.substring( 0, urlString.length() - HSQL_DB_EXT.length() ); // <---
>        } catch ( FileNotFoundException fnfE ) {
>           throw new SQLException( "No Hsql DB exists at Url", fnfE );
>        }
>     }
>
> The substring method indicated above appears to be what is causing the
> error - for some reason the "urlString" variable has a length of zero. This
> seems to indicate that there is something wrong with the cTAKES resources.
> However, that isn't making much sense to me as the default pipeline, which
> also relies on the resources package, is working fine. Has anyone
> encountered something like this before? Does the fast pipeline require some
> additional resources?
>
> As well, for the Spark implementation, we've put the cTAKES jars and
> resources on each executor at the same location, and are specifying this
> on the executor classpath.
>
> Thanks,
>
> Mike
> --
> Mike Trepanier | Big Data Engineer | MetiStream, Inc. |
> [email protected] | 845 - 270 - 3129 (m) |
> www.metistream.com
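
For reference, here is a minimal sketch of the arithmetic behind the marked substring call. It assumes HSQL_DB_EXT is the seven-character string ".script"; the quoted method shows only the constant's name, so that value is an assumption, as are the class name, the helper names, and the example path. Under that assumption, an empty urlString gives an end index of 0 - 7 = -7, which would match the -7 in the trace. The guarded variant is purely hypothetical and is not the fix that went into cTAKES.

```java
import java.util.Optional;

final class HsqlUrlSuffixDemo {

    // Assumed value: the quoted code only names HSQL_DB_EXT, it does not show its contents.
    static final String HSQL_DB_EXT = ".script";

    // Mirrors the marked line in the quoted getConnectionUrl: drop the trailing extension
    // without checking that the string is long enough to contain it.
    static String stripExtUnchecked(final String urlString) {
        return urlString.substring(0, urlString.length() - HSQL_DB_EXT.length());
    }

    // Hypothetical guarded variant: returns an empty Optional instead of throwing
    // when the string is null or does not end with the extension.
    static Optional<String> stripExtChecked(final String urlString) {
        if (urlString == null || !urlString.endsWith(HSQL_DB_EXT)) {
            return Optional.empty();
        }
        return Optional.of(stripExtUnchecked(urlString));
    }

    public static void main(final String[] args) {
        // Hypothetical path, for illustration only.
        System.out.println(stripExtUnchecked("file:/path/to/dictionary.script"));

        try {
            // length() is 0, so the end index is 0 - 7 = -7.
            stripExtUnchecked("");
        } catch (StringIndexOutOfBoundsException e) {
            // The exact message wording differs between Java versions.
            System.out.println("empty URL string: " + e.getMessage());
        }

        System.out.println("guarded result present: " + stripExtChecked("").isPresent());
    }
}
```

If this reading is right, the interesting question is why the resolved URL's external form would come back empty on the Spark executors, rather than anything in the substring call itself.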
