Hi Jorge,

I can indeed reproduce your problem using your code.

After some debugging, the cause turns out to be a missing contentType
parameter. You have to add one to your parser implementation in plugin.xml:

<implementation id="ImageThumbnailParser"
                class="...ImageThumbnailParser">
  <parameter name="contentType" value="image/png"/>
</implementation>
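
In the context of the plugin.xml you sent, the parser extension would then
look roughly like the sketch below. Everything except the <parameter>
element is copied from your file; the comment about the stack trace is my
reading of it, not something I have checked line by line in the Nutch source.

```xml
<extension id="org.apache.nutch.parse.thumbnail.ImageThumbnailParser"
           name="Image thumbnailer parser"
           point="org.apache.nutch.parse.Parser">
  <implementation id="ImageThumbnailParser"
                  class="org.apache.nutch.parse.thumbnail.ImageThumbnailParser">
    <!-- New: declares the content type this parser handles. Without it, the
         lookup in ParserFactory.match (top of your stack trace) presumably
         gets null for this attribute and throws the NullPointerException. -->
    <parameter name="contentType" value="image/png"/>
  </implementation>
</extension>
```

The indexing filter extension should not need this parameter, since only the
Parser extension point goes through ParserFactory. If you later want to handle
other image types as well, compare with the contentType values used by the
parse plugins bundled with Nutch before assuming a particular syntax.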

Good luck!
Sent from my iPhone,
Mathijs Homminga

On Jun 28, 2012, at 0:12, Jorge Luis Betancourt Gonzalez <jlbetanco...@uci.cu> 
wrote:

> Of course, Mathijs. Thank you for your time and the replies. Here is my 
> parse-plugins.xml (as an attachment).
> 
> Greetings!
> 
> ----- Original message -----
> From: "Mathijs Homminga" <mathijs.hommi...@kalooga.com>
> To: user@nutch.apache.org
> Sent: Wednesday, June 27, 2012 17:44:43
> Subject: Re: Problema with NullPointerException on custom Parser
> 
> Hmmm looking at the ParserFactory code, there can actually be several causes 
> for a NullPointerException...
> Can you also send the parse-plugins.xml? 
> 
> Mathijs Homminga
> 
> On Jun 27, 2012, at 23:23, Jorge Luis Betancourt Gonzalez 
> <jlbetanco...@uci.cu> wrote:
> 
>> This is the content of my plugin.xml
>> 
>> <plugin
>>    id="image-thumbnail"
>>    name="Image thumbnailer for Orion"
>>    version="1.0.0"
>>    provider-name="nutch.org">
>> 
>>    <runtime>
>>       <library name="image-thumbnail.jar">
>>          <export name="*"/>
>>       </library>
>>    </runtime>
>> 
>>    <requires>
>>       <import plugin="nutch-extensionpoints"/>
>>    </requires>
>> 
>>    <extension id="org.apache.nutch.parse.thumbnail.ImageThumbnailParser"
>>               name="Image thumbnailer parser"
>>               point="org.apache.nutch.parse.Parser">
>>       <implementation id="ImageThumbnailParser"
>>                       class="org.apache.nutch.parse.thumbnail.ImageThumbnailParser"/>
>>    </extension>
>> 
>>    <extension id="org.apache.nutch.parse.thumbnail.ImageThumbnailIndexingFilter"
>>               name="Image thumbnail indexing filter"
>>               point="org.apache.nutch.indexer.IndexingFilter">
>>       <implementation id="ImageThumbnailIndexingFilter"
>>                       class="org.apache.nutch.parse.thumbnail.ImageThumbnailIndexingFilter"/>
>>    </extension>
>> 
>> </plugin>
>> 
>> 
>> ----- Original message -----
>> From: "Mathijs Homminga" <mathijs.hommi...@kalooga.com>
>> To: user@nutch.apache.org
>> Sent: Wednesday, June 27, 2012 17:17:12
>> Subject: Re: Problema with NullPointerException on custom Parser
>> 
>> No need for Tika. Can you send your plugin.xml?
>> 
>> Mathijs Homminga
>> 
>> On Jun 27, 2012, at 23:07, Jorge Luis Betancourt Gonzalez 
>> <jlbetanco...@uci.cu> wrote:
>> 
>>> Hi,
>>> 
>>> I agree with you, and relying on Tika to parse the files is a great idea. 
>>> But in this particular case, where all I want to do is encode the content 
>>> into base64, should I write a custom parser for Tika and rely on the 
>>> parser-tika plugin to do its magic?
>>> 
>>> Jorge
>>> 
>>> ----- Original message -----
>>> From: "Lewis John Mcgibbney" <lewis.mcgibb...@gmail.com>
>>> To: user@nutch.apache.org
>>> Sent: Wednesday, June 27, 2012 16:55:12
>>> Subject: Re: Problema with NullPointerException on custom Parser
>>> 
>>> Hi,
>>> 
>>> I think you are partly correct.
>>> 
>>> The core Nutch code itself doesn't do any parsing as such. All parsing
>>> is delegated to external parsing libraries.
>>> 
>>> Basically, we need to define a parser to do the parsing. Using Tika as
>>> a wrapper for MIME type detection and subsequent parsing saves us a bit
>>> of overhead.
>>> 
>>> Lewis
>>> 
>>> On Wed, Jun 27, 2012 at 9:44 PM, Jorge Luis Betancourt Gonzalez
>>> <jlbetanco...@uci.cu> wrote:
>>>> Hi Lewis, thank you for the reply. Is it mandatory to write a wrapper 
>>>> around Tika? I thought this was optional, since I don't really parse the 
>>>> content looking for anything: I only get the content, transform it into 
>>>> an Image object, resize it, and then encode it with base64 to store it in 
>>>> the Solr backend.
>>>> 
>>>> So I thought that all this processing could be done in the getParse method.
>>>> 
>>>> Is my assumption correct, or is it mandatory to write my desired logic 
>>>> using Tika?
>>>> 
>>>> ----- Original message -----
>>>> From: "Lewis John Mcgibbney" <lewis.mcgibb...@gmail.com>
>>>> To: user@nutch.apache.org
>>>> Sent: Wednesday, June 27, 2012 16:33:01
>>>> Subject: Re: Problema with NullPointerException on custom Parser
>>>> 
>>>> Hi Jorge,
>>>> 
>>>> It doesn't look like you're actually using Tika as a wrapper for your
>>>> custom parser at all...
>>>> 
>>>> You would need to specify the correct Tika config by calling
>>>> tikaConfig.getParser
>>>> 
>>>> hth
>>>> 
>>>> On Wed, Jun 27, 2012 at 7:46 PM, Jorge Luis Betancourt Gonzalez
>>>> <jlbetanco...@uci.cu> wrote:
>>>>> Hi all:
>>>>> 
>>>>> I'm working on a custom parser plugin to generate thumbnails from images 
>>>>> fetched with Nutch 1.4. I'm doing this because the thumbnails will be 
>>>>> converted into base64-encoded strings and stored in a Solr backend.
>>>>> 
>>>>> So I basically wrote a custom parser (to which I send all PNG images, for 
>>>>> example). I enabled the plugin (image-thumbnail) in nutch-site.xml and 
>>>>> set some custom properties to load the width and height of the thumbnail. 
>>>>> I also set the alias in parse-plugins.xml and, in the same file, set the 
>>>>> plugin to handle image/png files.
>>>>> 
>>>>> The plugin is being loaded, but every time I get a PNG image to parse I 
>>>>> get this:
>>>>> 
>>>>> Error parsing: http://localhost/sites/all/themes/octavitos/images/iconos/audiointernet.png: java.lang.NullPointerException
>>>>>      at org.apache.nutch.parse.ParserFactory.match(ParserFactory.java:388)
>>>>>      at org.apache.nutch.parse.ParserFactory.getExtension(ParserFactory.java:397)
>>>>>      at org.apache.nutch.parse.ParserFactory.matchExtensions(ParserFactory.java:296)
>>>>>      at org.apache.nutch.parse.ParserFactory.findExtensions(ParserFactory.java:262)
>>>>>      at org.apache.nutch.parse.ParserFactory.getExtensions(ParserFactory.java:234)
>>>>>      at org.apache.nutch.parse.ParserFactory.getParsers(ParserFactory.java:119)
>>>>>      at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:71)
>>>>>      at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:86)
>>>>>      at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:42)
>>>>>      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>>>>      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>>>>>      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>>>>      at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>>>> 
>>>>> The thing is that I have put some log messages inside the getParse() 
>>>>> method, but none of these messages are being logged in the hadoop.log 
>>>>> file, so as far as I can tell the method is not being executed.
>>>>> 
>>>>> Does anyone have any idea what I'm doing wrong?
>>>>> 
>>>>> P.S: I've attached the source of the ImageThumbnailParser.
>>>>> 
>>>>> Greetings!
>>>>> 
>>>>> 
>>>>> 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
>>>>> INFORMATICAS...
>>>>> CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION
>>>>> 
>>>>> http://www.uci.cu
>>>>> http://www.facebook.com/universidad.uci
>>>>> http://www.flickr.com/photos/universidad_uci
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Lewis
>>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Lewis
>>> 
>> 
> 
> 
> <parse-plugins.xml>