I think that in late April Sean Finan fixed a problem that was resulting in:

Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -7

Are you using cTAKES 4.0 (either from the convenience binary download or as
a maven dependency), or are you using cTAKES in some other way?
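
For what it's worth, the -7 in that message is consistent with the resolved
URL string being empty. Below is a minimal sketch of just the failing
arithmetic, not the actual cTAKES code. It assumes HSQL_DB_EXT is ".script"
(7 characters), which I have not checked against the source; the class name,
the stripExtension helper, and the example path are hypothetical.

```java
public class SubstringDemo {
    // Assumption: the HSQL extension constant is ".script" (7 characters).
    static final String HSQL_DB_EXT = ".script";

    // Mirrors the flagged line: drop the extension from the end of the URL string.
    static String stripExtension(final String urlString) {
        return urlString.substring(0, urlString.length() - HSQL_DB_EXT.length());
    }

    public static void main(final String[] args) {
        // Normal case: the extension is removed from a hypothetical resolved URL.
        System.out.println(stripExtension("file:/opt/ctakes/resources/sno_rx.script"));

        // Empty URL string: the end index becomes 0 - 7 = -7, so substring throws.
        try {
            stripExtension("");
        } catch (StringIndexOutOfBoundsException e) {
            // The exact message wording depends on the Java version.
            System.out.println("Failed: " + e.getMessage());
        }
    }
}
```

If the assumption about the extension holds, any URL string shorter than 7
characters would fail the same way, with a different negative index.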

-- James


On Fri, Sep 1, 2017 at 3:13 PM, Michael Trepanier <[email protected]>
wrote:

> Hi All,
>
> We've been attempting to scale our cTAKES Pipeline on top of Spark, so
> we've switched from using the "getDefaultPipeline" method to the
> "getFastPipeline" method to boost the processing speed. However, while the
> default pipeline works fine with Spark, the fast pipeline is throwing the
> below error (edited down to the cTAKES portion of the stack trace):
>
>
> Caused by: org.apache.uima.resource.ResourceInitializationException:
> MESSAGE LOCALIZATION FAILED: Can't find resource for bundle 
> java.util.PropertyResourceBundle,
> key Could not construct org.apache.ctakes.dictionary.lookup2.dictionary.
> UmlsJdbcRareWordDictionary
>         at org.apache.ctakes.dictionary.lookup2.ae.
> AbstractJCasTermAnnotator.initialize(AbstractJCasTermAnnotator.java:131)
>         at org.apache.uima.analysis_engine.impl.
> PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(
> PrimitiveAnalysisEngine_impl.java:266)
>         ... 44 more
> Caused by: 
> org.apache.uima.analysis_engine.annotator.AnnotatorContextException:
> MESSAGE LOCALIZATION FAILED: Can't find resource for bundle 
> java.util.PropertyResourceBundle,
> key Could not construct org.apache.ctakes.dictionary.lookup2.dictionary.
> UmlsJdbcRareWordDictionary
>         at org.apache.ctakes.dictionary.lookup2.dictionary.
> DictionaryDescriptorParser.parseDictionary(DictionaryDescriptorParser.
> java:199)
>         at org.apache.ctakes.dictionary.lookup2.dictionary.
> DictionaryDescriptorParser.parseDictionaries(DictionaryDescriptorParser.
> java:156)
>         at org.apache.ctakes.dictionary.lookup2.dictionary.
> DictionaryDescriptorParser.parseDescriptor(DictionaryDescriptorParser.
> java:128)
>         at org.apache.ctakes.dictionary.lookup2.ae.
> AbstractJCasTermAnnotator.initialize(AbstractJCasTermAnnotator.java:129)
>         ... 45 more
> Caused by: java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(
> NativeConstructorAccessorImpl.java:62)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
> DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>         at org.apache.ctakes.dictionary.lookup2.dictionary.
> DictionaryDescriptorParser.parseDictionary(DictionaryDescriptorParser.
> java:196)
>         ... 48 more
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of
> range: -7
>         at java.lang.String.substring(String.java:1967)
>         at org.apache.ctakes.dictionary.lookup2.util.
> JdbcConnectionFactory.getConnectionUrl(JdbcConnectionFactory.java:110)
>         at org.apache.ctakes.dictionary.lookup2.util.
> JdbcConnectionFactory.getConnection(JdbcConnectionFactory.java:63)
>         at org.apache.ctakes.dictionary.lookup2.dictionary.
> JdbcRareWordDictionary.<init>(JdbcRareWordDictionary.java:91)
>         at org.apache.ctakes.dictionary.lookup2.dictionary.
> JdbcRareWordDictionary.<init>(JdbcRareWordDictionary.java:72)
>         at org.apache.ctakes.dictionary.lookup2.dictionary.
> UmlsJdbcRareWordDictionary.<init>(UmlsJdbcRareWordDictionary.java:31)
>         ... 53 more
>
>
> So, looking in "getConnectionUrl," we have this method:
>
> static private String getConnectionUrl( final String jdbcUrl ) throws SQLException {
>       final String urlDbPath = jdbcUrl.substring( HSQL_FILE_PREFIX.length() );
>       final String urlFilePath = urlDbPath + HSQL_DB_EXT;
>       try {
>          final URL url = FileLocator.getResource( urlFilePath );
>          final String urlString = url.toExternalForm();
>          return urlString.substring( 0, urlString.length() - HSQL_DB_EXT.length() ); // <---
>       } catch ( FileNotFoundException fnfE ) {
>          throw new SQLException( "No Hsql DB exists at Url", fnfE );
>       }
> }
>
> The substring method indicated above appears to be what is causing the
> error - for some reason the "urlString" variable has a length of zero. This
> seems to indicate that there is something wrong with the cTAKES resources.
> However, that isn't making much sense to me as the default pipeline, which
> also relies on the resources package, is working fine. Has anyone
> encountered something like this before? Does the fast pipeline require some
> additional resources?
>
> As well, for the Spark implementation, we've put the cTAKES jars and
> resources on each executor at the same location, and are specifying this
> on the executor classpath.
>
> Thanks,
>
> Mike
> --
> Mike Trepanier| Big Data Engineer | MetiStream, Inc. |
> [email protected] | 845 - 270 - 3129 (m) |
> www.metistream.com
>
