Hi Lewis, thank you for the reply. Is it mandatory to write a wrapper around 
Tika? I thought this was optional, since I don't really parse the content 
searching for anything: I only get the content, transform it into an Image 
object, resize it, and then encode it with base64 to store it on the Solr backend.

So I thought that all this processing could be done inside the getParse method.
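
To make the steps concrete, here is a minimal sketch of the processing I have in mind, outside of any Nutch wiring. The class name ThumbnailSketch and the method toBase64Thumbnail are hypothetical names used only for illustration, and java.util.Base64 assumes Java 8 or later (on an older JVM a library encoder would be needed instead). It is not the attached ImageThumbnailParser, only the decode, resize, and encode steps:

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of Nutch or Tika.
final class ThumbnailSketch {

    // Decodes raw image bytes, scales them to width x height,
    // and returns the resulting PNG as a base64 string.
    static String toBase64Thumbnail(byte[] content, int width, int height)
            throws IOException {
        BufferedImage source = ImageIO.read(new ByteArrayInputStream(content));
        if (source == null) {
            // ImageIO.read returns null when no registered reader understands the data.
            throw new IOException("Content is not a readable image");
        }

        BufferedImage thumb =
                new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = thumb.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            // Draw the source scaled into the thumbnail bounds.
            g.drawImage(source, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(thumb, "png", out);
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }
}
```

In the plugin, the byte array would come from the fetched content and the width and height from the custom properties, with the returned string being what gets stored in Solr.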

Is my assumption correct, or is it mandatory to write my desired logic using Tika?

----- Original Message -----
From: "Lewis John Mcgibbney" <[email protected]>
To: [email protected]
Sent: Wednesday, June 27, 2012 16:33:01
Subject: Re: Problem with NullPointerException on custom Parser

Hi Jorge,

It doesn't look like you're actually using Tika as a wrapper for your
custom parser at all...

You would need to specify the correct Tika config by calling
tikaConfig.getParser

hth

On Wed, Jun 27, 2012 at 7:46 PM, Jorge Luis Betancourt Gonzalez
<[email protected]> wrote:
> Hi all:
>
> I'm working on a custom parser plugin to generate thumbnails from images 
> fetched with Nutch 1.4. I'm doing this because the thumbnails will be 
> converted into a base64 encoded string and stored on a Solr backend.
>
> So I basically wrote a custom parser (to which I send all png images, for 
> example). I enabled the plugin (image-thumbnail) in nutch-site.xml and set 
> some custom properties to load the width and height of the thumbnail. I also 
> set the alias in parse-plugins.xml and, in the same file, set the plugin to 
> handle image/png files.
>
> The plugin is being loaded, but every time I get a png image to parse I get 
> this:
>
> Error parsing: http://localhost/sites/all/themes/octavitos/images/iconos/audiointernet.png: java.lang.NullPointerException
>        at org.apache.nutch.parse.ParserFactory.match(ParserFactory.java:388)
>        at org.apache.nutch.parse.ParserFactory.getExtension(ParserFactory.java:397)
>        at org.apache.nutch.parse.ParserFactory.matchExtensions(ParserFactory.java:296)
>        at org.apache.nutch.parse.ParserFactory.findExtensions(ParserFactory.java:262)
>        at org.apache.nutch.parse.ParserFactory.getExtensions(ParserFactory.java:234)
>        at org.apache.nutch.parse.ParserFactory.getParsers(ParserFactory.java:119)
>        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:71)
>        at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:86)
>        at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:42)
>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>
> The thing is that I have put some log messages inside the getParse() method, 
> but none of these messages are being logged in the hadoop.log file, so from 
> what I can tell the method is not being executed.
>
> Does anyone have any idea what I'm doing wrong?
>
> P.S: I've attached the source of the ImageThumbnailParser.
>
> Greetings!
>
>
> 10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS 
> INFORMATICAS...
> CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
>
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci



--
Lewis


