[ 
https://issues.apache.org/jira/browse/TIKA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281129#comment-14281129
 ] 

Konstantin Gribov commented on TIKA-1516:
-----------------------------------------

[~lewismc], if I understood you correctly (English isn't my strong point), you 
asked whether this case is general or specific to some input. My point is that 
this issue has a general cause, but it can be reproduced only in a non-trivial 
classloading scheme. A short analysis follows.

Rome 1.0 uses {{Thread.currentThread().getContextClassLoader()}} to load 
both {{com/sun/syndication/rome.properties}} (the default Rome config) and 
{{rome.properties}} (the user config). If the classloader that loaded Rome is 
the current thread's context classloader (CCL) or one of its ancestors, this 
works fine. 
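
A minimal sketch of this lookup style and of how a missing resource turns into 
the NPE from the stack trace. The class and method names are hypothetical and 
the resource name is made up; this is not Rome's actual code, only an 
illustration of loading properties through the CCL without a null check.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

public class CclLookupSketch {
    // Resolves a resource through the thread's context classloader (CCL).
    // Returns null when the resource is not visible to the CCL or its ancestors.
    static InputStream openViaCcl(String resourceName) {
        ClassLoader ccl = Thread.currentThread().getContextClassLoader();
        return ccl.getResourceAsStream(resourceName);
    }

    // Loads properties without checking the stream for null (an assumption
    // about the failing code path). Returns true if that ends in an NPE.
    static boolean loadEndsInNpe(String resourceName) {
        Properties props = new Properties();
        try (InputStream in = openViaCcl(resourceName)) {
            props.load(in); // a null stream makes Properties.load throw NPE
            return false;
        } catch (NullPointerException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical resource name that should not exist on the classpath.
        System.out.println(loadEndsInNpe("no/such/dir/rome.properties"));
    }
}
```

The point of the sketch is that {{getResourceAsStream}} reports a missing 
resource by returning null rather than throwing, so the failure only shows up 
later inside {{Properties.load}}.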

Earlier Rome (0.9) used the classloader that loaded Rome's PluginManager class 
(which invokes PropertiesLoader). That approach always finds the default Rome 
config, since both are loaded from the same jar. But if user code is loaded by 
another classloader (e.g. for security reasons) in a different classloading 
branch, Rome can't find the user config. I don't know Hadoop's classloading 
scheme (and Nutch uses Hadoop), but such a case is easy to reproduce in a 
servlet container if Rome is loaded by the ext/common classloader and the app 
by the webapp classloader.
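
The servlet-container case can be imitated with two nested URLClassLoaders. 
This is only a sketch under assumptions: temp directories stand in for the 
jars, the class and helper names are hypothetical, and a lookup against the 
library's own loader stands in for what Rome 0.9 does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class OwnLoaderSketch {
    static final String DEFAULT_CONFIG = "com/sun/syndication/rome.properties";
    static final String USER_CONFIG = "rome.properties";

    // Builds a loader over a fresh temp directory holding one empty resource.
    static URLClassLoader loaderWith(String resourceName, ClassLoader parent) {
        try {
            Path dir = Files.createTempDirectory("branch");
            Path file = dir.resolve(resourceName);
            Files.createDirectories(file.getParent());
            Files.createFile(file);
            return new URLClassLoader(new URL[] { dir.toUri().toURL() }, parent);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if the loader (or one of its ancestors) can see the resource.
    static boolean visible(ClassLoader loader, String resourceName) {
        return loader.getResource(resourceName) != null;
    }

    public static void main(String[] args) throws IOException {
        // libLoader plays the ext/common loader, appLoader the webapp loader.
        try (URLClassLoader libLoader =
                     loaderWith(DEFAULT_CONFIG, ClassLoader.getSystemClassLoader());
             URLClassLoader appLoader = loaderWith(USER_CONFIG, libLoader)) {
            System.out.println("default config via library loader: "
                    + visible(libLoader, DEFAULT_CONFIG));
            System.out.println("user config via library loader: "
                    + visible(libLoader, USER_CONFIG));
            System.out.println("user config via app loader: "
                    + visible(appLoader, USER_CONFIG));
        }
    }
}
```

Resource lookup delegates only upwards, so the library's loader never sees 
what lives in the webapp loader below it.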

I think this was the reason to switch to the CCL, but it led to a new problem. 
If the CCL is set to the system classloader and Rome is loaded by a descendant 
classloader ({{PluginClassLoader}} in this case), Rome can't load its default 
config and fails with an NPE, as above.
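
A rough reproduction of that situation, assuming a plain URLClassLoader as a 
stand-in for {{PluginClassLoader}} and a temp directory in place of the Rome 
jar. All names in the sketch are hypothetical; it only checks resource 
visibility and does not involve Rome itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class DescendantLoaderSketch {
    static final String DEFAULT_CONFIG = "com/sun/syndication/rome.properties";

    // Stand-in for a plugin classloader: a child of the system loader whose
    // only extra content is a temp directory holding the default config.
    static URLClassLoader newPluginLoader() {
        try {
            Path dir = Files.createTempDirectory("plugin");
            Path file = dir.resolve(DEFAULT_CONFIG);
            Files.createDirectories(file.getParent());
            Files.createFile(file);
            return new URLClassLoader(new URL[] { dir.toUri().toURL() },
                    ClassLoader.getSystemClassLoader());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns { visible via CCL, visible via plugin loader } while the CCL
    // is set to the system classloader. The previous CCL is restored afterwards.
    static boolean[] probe() {
        Thread thread = Thread.currentThread();
        ClassLoader saved = thread.getContextClassLoader();
        try (URLClassLoader plugin = newPluginLoader()) {
            thread.setContextClassLoader(ClassLoader.getSystemClassLoader());
            boolean viaCcl =
                    thread.getContextClassLoader().getResource(DEFAULT_CONFIG) != null;
            boolean viaPlugin = plugin.getResource(DEFAULT_CONFIG) != null;
            return new boolean[] { viaCcl, viaPlugin };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            thread.setContextClassLoader(saved);
        }
    }

    public static void main(String[] args) {
        boolean[] result = probe();
        System.out.println("default config via CCL: " + result[0]);
        System.out.println("default config via plugin loader: " + result[1]);
    }
}
```

The sketch assumes the system classpath itself does not contain Rome; under 
that assumption only the descendant loader can see the default config.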

I'll try to create a test for this case soon.

> Downgrade Rome dependency to 0.9 to avoid nasty NPE
> ---------------------------------------------------
>
>                 Key: TIKA-1516
>                 URL: https://issues.apache.org/jira/browse/TIKA-1516
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.6
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 1.8
>
>         Attachments: TIKA-1516.patch
>
>
> As documented [in this 
> thread|http://www.mail-archive.com/dev%40nutch.apache.org/msg15755.html], 
> Nutch's 
> [parse-tika|https://github.com/apache/nutch/blob/trunk/src/plugin/parse-tika/plugin.xml#L56] 
> uses Rome 1.0, which is inherited directly from the Tika pom.xml for the 
> [same 
> dependency|https://github.com/apache/tika/blob/trunk/tika-parsers/pom.xml#L184].
> A downgrade is required.
> {code}
> java.lang.Exception: java.lang.ExceptionInInitializerError
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
> Caused by: java.lang.ExceptionInInitializerError
>         at com.sun.syndication.io.SyndFeedInput.build(SyndFeedInput.java:136)
>         at org.apache.tika.parser.feed.FeedParser.parse(FeedParser.java:70)
>         at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:105)
>         at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:95)
>         at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:101)
>         at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:44)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.NullPointerException
>         at java.util.Properties$LineReader.readLine(Properties.java:418)
>         at java.util.Properties.load0(Properties.java:337)
>         at java.util.Properties.load(Properties.java:325)
>         at com.sun.syndication.io.impl.PropertiesLoader.<init>(PropertiesLoader.java:74)
>         at com.sun.syndication.io.impl.PropertiesLoader.getPropertiesLoader(PropertiesLoader.java:46)
>         at com.sun.syndication.io.impl.PluginManager.<init>(PluginManager.java:54)
>         at com.sun.syndication.io.impl.PluginManager.<init>(PluginManager.java:46)
>         at com.sun.syndication.feed.synd.impl.Converters.<init>(Converters.java:40)
>         at com.sun.syndication.feed.synd.SyndFeedImpl.<clinit>(SyndFeedImpl.java:59)
>         ... 16 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
